Learning

pyAgrum provides a complete framework for learning Bayesian networks from data. It includes various algorithms for structure learning, parameter learning, and model evaluation. The library supports both score-based and constraint-based approaches, allowing users to choose the method that best fits their needs.

pyAgrum brings together all Bayesian network learning processes in a single, easy-to-use class: pyagrum.BNLearner. This class provides direct access to complete learning algorithms and their parameters (such as priors, scores, constraints, etc.), and also offers low-level functions that facilitate the development of new learning algorithms (for example, computing chi² or conditional likelihood on the dataset).

BNLearner lets you choose:

  • the structure learning algorithm (MIIC, Greedy Hill Climbing, K2, etc.),

  • the parameter learning method (including EM),

  • the scoring function (BDeu, AIC, etc.) for score-based algorithms,

  • the prior (smoothing, Dirichlet, etc.),

  • the constraints (e.g., forbidding certain arcs, specifying a partial order among variables, etc.),

  • the correction method (NML, etc.) for the MIIC algorithm,

  • and many low-level functions, such as computing the chi², G² score, or the conditional likelihood on the dataset.

pyagrum.BNLearner can learn a Bayesian network from a database provided either as a pandas.DataFrame or as a CSV file.

Class BNLearner

Methods for performing learning:

fitParameters

latentVariables

learnBN

learnDAG

learnParameters

learnPDAG

Structure learning algorithms:

isConstraintBased

isScoreBased

useGreedyHillClimbing

useK2

useLocalSearchWithTabuList

useMIIC

Managing structure learning constraints:

addForbiddenArc

addMandatoryArc

addNoChildrenNode

addNoParentNode

addPossibleEdge

eraseForbiddenArc

eraseMandatoryArc

eraseNoChildrenNode

eraseNoParentNode

erasePossibleEdge

setForbiddenArcs

setMandatoryArcs

setMaxIndegree

setSliceOrder

Scores and priors (for structure learning):

useBDeuPrior

useDirichletPrior

useSmoothingPrior

useScoreAIC

useScoreBD

useScoreBDeu

useScoreBIC

useScoreK2

useScoreLog2Likelihood

useMDLCorrection

useNMLCorrection

useNoCorrection

EM parameter learning:

EMdisableEpsilon

EMdisableMaxIter

EMdisableMaxTime

EMdisableMinEpsilonRate

EMEpsilon

EMHistory

EMMaxIter

EMMaxTime

EMMinEpsilonRate

EMState

EMStateAsInt

EMVerbosity

EMenableEpsilon

EMenableMaxIter

EMenableMaxTime

EMenableMinEpsilonRate

forbidEM

EMisEnabledEpsilon

EMisEnabledMaxIter

EMisEnabledMaxTime

EMisEnabledMinEpsilonRate

isUsingEM

EMnbrIterations

EMsetEpsilon

EMsetMaxIter

EMsetMaxTime

EMsetMinEpsilonRate

EMsetVerbosity

useEM

useEMWithDiffCriterion

useEMWithRateCriterion

Database inspection / direct requesting:

chi2

correctedMutualInformation

databaseWeight

domainSize

G2

idFromName

nameFromId

names

nbCols

nbRows

hasMissingValues

logLikelihood

mutualInformation

pseudoCount

rawPseudoCount

recordWeight

score

setDatabaseWeight

setRecordWeight

Fine-tuning the behavior of the BNLearner:

copyState

getNumberOfThreads

isGumNumberOfThreadsOverriden

setNumberOfThreads

class pyagrum.BNLearner(*args)

This class provides functionality for learning Bayesian Networks from data.

BNLearner(source, missingSymbols=['?'], inducedTypes=True) -> BNLearner
Parameters:
  • source (str or pandas.DataFrame) – the data to learn from

  • missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])

  • inducedTypes (bool) – whether the BNLearner should try to automatically infer the type of each variable

BNLearner(source, src, missingSymbols=['?']) -> BNLearner
Parameters:
  • source (str or pandas.DataFrame) – the data to learn from

  • src (pyagrum.BayesNet) – the Bayesian network used to find the modalities of the variables

  • missingSymbols (List[str]) – the list of strings that will be interpreted as missing values (default: ['?'])

BNLearner(learner) -> BNLearner
Parameters:
  • learner (pyagrum.BNLearner) – the BNLearner to copy

EMEpsilon()

Returns a float corresponding to the minimal difference between two consecutive log-likelihoods under which the EM parameter learning algorithm stops.

Returns:

the minimal difference between two consecutive log-likelihoods under which EM stops.

Return type:

float

EMHistory()

Returns a list containing the log-likelihoods recorded after each expectation/maximization iteration of the EM parameter learning algorithm.

Returns:

A list of all the log-likelihoods recorded during EM’s execution

Return type:

List[float]

Warning

Recording log-likelihoods is enabled only when EM is executed in verbose mode. See method EMsetVerbosity().

EMMaxIter()

Returns an int containing the max number of iterations the EM parameter learning algorithm is allowed to perform when the max iterations stopping criterion is enabled.

Returns:

the max number of expectation/maximization iterations EM is allowed to perform

Return type:

int

EMMaxTime()

Returns a float indicating EM’s time limit when the max time stopping criterion is used by the EM parameter learning algorithm.

Returns:

the max time EM is allowed to execute its expectation/maximization iterations

Return type:

float

EMMinEpsilonRate()

Returns a float corresponding to the minimal log-likelihood’s evolution rate under which the EM parameter learning algorithm stops its iterations.

Returns:

the limit under which EM stops its expectation/maximization iterations

Return type:

float

EMPeriodSize()
Return type:

int

EMStateApproximationScheme()
Return type:

int

EMStateMessage()
Return type:

str

EMVerbosity()

Returns a Boolean indicating whether the EM parameter learning algorithm is in a verbose mode.

Note that EM verbosity is necessary for recording the history of the log-likelihoods computed at each expectation/maximization step.

Returns:

indicates whether EM’s verbose mode is active or not

Return type:

bool

EMdisableEpsilon()

Disables the minimal difference between two consecutive log-likelihoods as a stopping criterion for the EM parameter learning algorithm.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMdisableMaxIter()

Removes the limit on the number of expectation/maximization iterations EM may perform.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMdisableMaxTime()

Removes the time limit on EM parameter learning.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMdisableMinEpsilonRate()

Disables the minimal log-likelihood’s evolution rate as an EM parameter learning stopping criterion.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMenableEpsilon()

Enforces that the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Warning

Setting the minimal difference between two consecutive log-likelihoods as a stopping criterion disables the minimal log-likelihood evolution rate as a stopping criterion.

EMenableMaxIter()

Enables a limit on the number of iterations performed by EM. This number is equal to the last number specified with Method EMsetMaxIter(). See Method EMMaxIter() to get its current value.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMenableMaxTime()

Forbids EM from running longer than a given amount of time.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

EMenableMinEpsilonRate()

Enables the minimal log-likelihood’s evolution rate as an EM parameter learning stopping criterion.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Warning

Setting this stopping criterion disables the minimal log-likelihood difference criterion.

EMisEnabledEpsilon()

Returns a Boolean indicating whether the minimal difference between two consecutive log-likelihoods is a stopping criterion for the EM parameter learning algorithm.

Return type:

bool

EMisEnabledMaxIter()

Returns a Boolean indicating whether the max number of iterations is used by EM as a stopping criterion.

Return type:

bool

EMisEnabledMaxTime()

Returns a Boolean indicating whether the max time criterion is used as an EM stopping criterion.

Return type:

bool

EMisEnabledMinEpsilonRate()

Returns a Boolean indicating whether the minimal log-likelihood’s evolution rate is considered as a stopping criterion by the EM parameter learning algorithm.

Return type:

bool

EMnbrIterations()

Returns the number of iterations performed by the EM parameter learning algorithm.

Return type:

int

EMsetEpsilon(eps)

Enforces that the minimal difference between two consecutive log-likelihoods is chosen as a stopping criterion of the EM parameter learning algorithm and specifies the threshold on this criterion.

Parameters:

eps (float) – the log-likelihood difference below which EM stops its iterations

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – If eps <= 0.

EMsetMaxIter(max)

Enforces a limit on the number of expectation/maximization steps performed by EM.

Parameters:

max (int) – the maximal number of iterations that EM is allowed to perform

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – If max <= 1.

EMsetMaxTime(timeout)

Adds a constraint on the time that EM is allowed to run for learning parameters.

Parameters:

timeout (float) – the timeout in milliseconds

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – If timeout<=0.0

EMsetMinEpsilonRate(rate)

Enforces that the minimal log-likelihood’s evolution rate is considered by the EM parameter learning algorithm as a stopping criterion.

Parameters:

rate (float) – the log-likelihood evolution rate below which EM stops its iterations

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – If rate <= 0.

Warning

Setting this stopping criterion disables the minimal log-likelihood difference criterion (if it was enabled).

EMsetPeriodSize(p)
Parameters:

p (int)

Return type:

BNLearner

EMsetVerbosity(v)

Sets or unsets the verbosity of the EM parameter learning algorithm.

Verbosity is necessary for keeping track of the history of the learning. See Method EMHistory().

Parameters:

v (bool) – sets EM’s verbose mode if and only if v = True.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

G2(*args)

G2 computes the G2 statistic and p-value of two variables conditionally to a list of other variables.

The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set may be empty, in which case it need not be specified.

Usage:
  • G2(name1, name2, knowing=[])

Parameters:
  • name1 (str) – the name of a variable/column in the database

  • name2 (str) – the name/column of another variable

  • knowing (List[str]) – the list of the column names of the conditioning variables

Returns:

the G2 statistic and the corresponding p-value as a tuple

Return type:

Tuple[float,float]

addForbiddenArc(*args)

Forbids the arc passed as argument from being added during structure learning (methods learnDAG() or learnBN()).

Usage:
  1. addForbiddenArc(tail, head)

  2. addForbiddenArc(arc)

Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Return type:

BNLearner

addMandatoryArc(*args)

Requires the arc passed as argument to be present in the structure learned (methods learnDAG() or learnBN()).

Usage:
  1. addMandatoryArc(tail, head)

  2. addMandatoryArc(arc)

Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Raises:

pyagrum.InvalidDirectedCycle – If the added arc creates a directed cycle in the DAG

Return type:

BNLearner

addNoChildrenNode(*args)

Add to structure learning algorithms the constraint that this node cannot have any children.

Parameters:

node (int str) – a variable’s id or name

Return type:

BNLearner

addNoParentNode(*args)

Add the constraint that this node cannot have any parent.

Parameters:

node (int str) – a variable’s id or name

Return type:

BNLearner

addPossibleEdge(*args)

Assign a new possible edge.

Warning

By default, all edges are possible. However, once at least one possible edge has been defined, all the other edges not declared possible are considered impossible.

Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Return type:

BNLearner

chi2(*args)

chi2 computes the chi2 statistic and p-value of two variables conditionally to a list of other variables.

The variables correspond to columns in the database and are specified by the names of these columns. The conditioning set may be empty, in which case it need not be specified.

Usage:
  • chi2(name1, name2, knowing=[])

Parameters:
  • name1 (str) – the name of a variable/column in the database

  • name2 (str) – the name/column of another variable

  • knowing (List[str]) – the list of the column names of the conditioning variables

Returns:

the chi2 statistic and the associated p-value as a tuple

Return type:

Tuple[float,float]

copyState(learner)

Copies the state of the BNLearner passed as argument.

Parameters:

learner (pyagrum.BNLearner) – the learner whose state is copied.

Return type:

None

correctedMutualInformation(*args)

Computes the corrected (log2) mutual information between two columns, given a list of other columns.

Warning

This function takes into account correction and prior. If you want the ‘raw’ mutual information, use pyagrum.BNLearner.mutualInformation

Parameters:
  • name1 (str) – the name of the first column

  • name2 (str) – the name of the second column

  • knowing (List[str]) – the list of names of conditioning columns

Returns:

the corrected mutual information

Return type:

float

currentTime()
Returns:

the current running time in seconds

Return type:

float

databaseWeight()

Get the database weight which is given as an equivalent sample size.

Returns:

The weight of the database

Return type:

float

domainSize(*args)

Returns the domain size of the variable with the given name or id.

Parameters:

n (str | int) – the name or the id of the variable

Return type:

int

epsilon()
Returns:

the value of epsilon

Return type:

float

eraseForbiddenArc(*args)

Removes the constraint forbidding this arc, allowing it to be added again if necessary.

Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Return type:

BNLearner

eraseMandatoryArc(*args)

Removes the constraint requiring this arc to be present.
Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Return type:

BNLearner

eraseNoChildrenNode(*args)

Remove in structure learning algorithms the constraint that this node cannot have any children.

Parameters:

node (int str) – a variable’s id or name

Return type:

BNLearner

eraseNoParentNode(*args)

Remove the constraint that this node cannot have any parent.

Parameters:

node (int str) – a variable’s id or name

Return type:

BNLearner

erasePossibleEdge(*args)

Allow the two arcs corresponding to this edge to be added if necessary.

Parameters:
  • arc (pyagrum.Arc) – an arc

  • head (int str) – a variable’s id or name

  • tail (int str) – a variable’s id or name

Return type:

BNLearner

fitParameters(bn, take_into_account_score=True)

Fits the parameters (CPTs) of the Bayesian network bn from the BNLearner’s database, using bn’s graphical structure (see also learnParameters()).

forbidEM()

Forbids the use of EM for parameter learning.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

getNumberOfThreads()

Return the number of threads used by the BNLearner during structure and parameter learning.

Returns:

the number of threads used by the BNLearner during structure and parameter learning

Return type:

int

hasMissingValues()

Indicates whether there are missing values in the database.

Returns:

True if there are some missing values in the database.

Return type:

bool

history()
Returns:

the scheme history

Return type:

tuple

Raises:

pyagrum.OperationNotAllowed – If the scheme has not been performed or if verbosity is set to False

idFromName(var_name)
Parameters:

var_name (str) – a variable’s name

Returns:

the column id corresponding to a variable name

Return type:

int

Raises:

pyagrum.MissingVariableInDatabase – If a variable of the BN is not found in the database.

isConstraintBased()

Returns whether the current learning method is constraint-based.

Returns:

True if the current learning method is constraint-based.

Return type:

bool

isGumNumberOfThreadsOverriden()

Checks whether the number of threads used by the learner is the default one.

Returns:

True if the number of threads used by the BNLearner has been set.

Return type:

bool

isScoreBased()

Returns whether the current learning method is score-based.

Returns:

True if the current learning method is score-based.

Return type:

bool

isUsingEM()

returns a Boolean indicating whether EM is used for parameter learning when the database contains missing values.

Return type:

bool

latentVariables()

Warning

The learner must use the MIIC algorithm.

Returns:

the list of latent variables

Return type:

list

learnBN()

Learns a BayesNet (both parameters and structure) from the BNLearner’s database

Returns:

the learnt BayesNet

Return type:

pyagrum.BayesNet

learnDAG()

Learns a DAG structure from the BNLearner’s database.

Returns:

the learned DAG

Return type:

pyagrum.DAG

learnEssentialGraph()

Learns an essential graph from the BNLearner’s database.

Returns:

the learned essential graph

Return type:

pyagrum.EssentialGraph

learnPDAG()

Learns a partially directed acyclic graph (PDAG) from the BNLearner’s database.

Returns:

the learned PDAG

Return type:

pyagrum.PDAG

Warning

The learning method must be constraint-based (MIIC, etc.) and not score-based (K2, GreedyHillClimbing, etc.)

learnParameters(*args)

Creates a Bayes net whose structure corresponds to that passed in argument or to the last structure learnt by Method learnDAG(), and whose parameters are learnt from the BNLearner’s database.

usage:
  1. learnParameters(dag, take_into_account_score=True)

  2. learnParameters(bn, take_into_account_score=True)

  3. learnParameters(take_into_account_score=True)

When the first argument of Method learnParameters() is a DAG or a Bayes net (usages 1 and 2), it specifies the graphical structure of the returned Bayes net. Otherwise (usage 3), Method learnParameters() is implicitly called with the last DAG learnt by the BNLearner.

The difference between calling this method with a DAG (usages 1 and 3) or a Bayes net (usage 2) arises when the database contains missing values and EM is used to learn the parameters. EM needs to initialize the conditional probability tables (CPTs) before iterating its expectation/maximization steps. When a DAG is passed as argument, these initializations are performed using a specific estimator that does not take into account the missing values in the database; the resulting CPTs are then perturbed randomly (see the noise parameter in Method useEM()). When a Bayes net is passed as argument, its CPT for a node A is either filled exclusively with zeroes or not. In the first case, the initialization is performed as described above. In the second case, the value of A’s CPT is used as is, and a subsequent perturbation controlled by the noise level is applied.

Parameters:
  • dag (pyagrum.DAG) – specifies the graphical structure of the returned Bayes net.

  • bn (pyagrum.BayesNet) – specifies the graphical structure of the returned Bayes net and, when the database contains missing values and EM is used for learning, force EM to initialize the CPTs of the resulting Bayes net to the values of those passed in argument (when they are not fully filled with zeroes) before iterating over the expectation/maximization steps.

  • take_into_account_score (bool, default=True) – The graphical structure passed in argument may have been learnt from a structure learning. In this case, if the score used to learn the structure has an implicit prior (like K2 which has a 1-smoothing prior), it is important to also take into account this implicit prior for parameter learning. By default (take_into_account_score=True), we will learn parameters by taking into account the prior specified by methods usePriorXXX() + the implicit prior of the score (if any). If take_into_account_score=False, we just take into account the prior specified by usePriorXXX().

Returns:

the learnt BayesNet

Return type:

pyagrum.BayesNet

Raises:
  • pyagrum.MissingVariableInDatabase – If a variable of the Bayes net is not found in the database

  • pyagrum.MissingValueInDatabase – If the database contains some missing values and EM is not used for the learning

  • pyagrum.OperationNotAllowed – If EM is used but no stopping criterion has been selected

  • pyagrum.UnknownLabelInDatabase – If a label is found in the database that does not correspond to any label of the variable

Warning

When using a pyagrum.DAG as input parameter, the NodeIds in the dag and index of rows in the database must fit in order to coherently fix the structure of the BN. Generally, it is safer to use a pyagrum.BayesNet as input or even to use pyagrum.BNLearner.fitParameters.

logLikelihood(*args)

logLikelihood computes the log-likelihood for the columns in vars, given the columns in the optional list knowing.

Parameters:
  • vars (List[str]) – the name of the columns of interest

  • knowing (List[str]) – the (optional) list of names of conditioning columns

Returns:

the log-likelihood (base 2)

Return type:

float

maxIter()
Returns:

the criterion on number of iterations

Return type:

int

maxTime()
Returns:

the timeout(in seconds)

Return type:

float

messageApproximationScheme()
Returns:

the approximation scheme message

Return type:

str

minEpsilonRate()
Returns:

the value of the minimal epsilon rate

Return type:

float

mutualInformation(*args)

computes the (log2) mutual information between two columns, given a list of other columns.

Warning

This function gives the ‘raw’ mutual information. If you want a version taking into account correction and prior, use pyagrum.BNLearner.correctedMutualInformation

Parameters:
  • name1 (str) – the name of the first column

  • name2 (str) – the name of the second column

  • knowing (List[str]) – the list of names of conditioning columns

Returns:

the log2 mutual information

Return type:

float

nameFromId(id)
Parameters:

id (int) – a node id

Returns:

the variable’s name

Return type:

str

names()
Returns:

the names of the variables in the database

Return type:

Tuple[str]

nbCols()

Return the number of columns in the database

Returns:

the number of columns in the database

Return type:

int

nbRows()

Return the number of rows in the database

Returns:

the number of rows in the database

Return type:

int

nbrIterations()
Returns:

the number of iterations

Return type:

int

periodSize()
Returns:

the number of samples between two stopping tests

Return type:

int

Raises:

pyagrum.OutOfBounds – If p<1

pseudoCount(vars)

Access to the pseudo-counts (priors taken into account).

Parameters:

vars (List[str]) – the list of names of the variables

Return type:

a pyagrum.Tensor containing the pseudo-counts

rawPseudoCount(*args)

computes the pseudoCount (taking priors into account) of the list of variables as a list of floats.

Parameters:

vars (List[int|str]) – the list of variables

Returns:

the pseudo-count as a list of float

Return type:

List[float]

recordWeight(i)

Get the weight of the ith record

Parameters:

i (int) – the position of the record in the database

Raises:

pyagrum.OutOfBounds – if i is outside the set of indices of the records

Returns:

The weight of the ith record of the database

Return type:

float

score(*args)

Returns the value of the score currently used by the BNLearner for a variable, given a set of other variables

Parameters:
  • name1 (str) – the name of the variable at the LHS of the conditioning bar

  • knowing (List[str]) – the list of names of the conditioning variables

Returns:

the value of the score

Return type:

float

setDatabaseWeight(new_weight)

Set the database weight which is given as an equivalent sample size.

Warning

The same weight is assigned to all the rows of the learning database, so that the sum of their weights is equal to new_weight.

Parameters:

new_weight (float) – the database weight

Return type:

None

setEpsilon(eps)
Parameters:

eps (float) – the epsilon we want to use

Raises:

pyagrum.OutOfBounds – If eps<0

Return type:

None

setInitialDAG(dag)

Sets the initial structure (DAG) used by the structure learning algorithm.

Parameters:

dag (pyagrum.DAG) – an initial pyagrum.DAG structure

Return type:

BNLearner

setMaxIndegree(max_indegree)
Parameters:

max_indegree (int) – the limit number of parents

Return type:

BNLearner

setMaxIter(max)
Parameters:

max (int) – the maximum number of iterations

Raises:

pyagrum.OutOfBounds – If max <= 1

Return type:

None

setMaxTime(timeout)
Parameters:

timeout (float) – stopping criterion on timeout (in seconds)

Raises:

pyagrum.OutOfBounds – If timeout<=0.0

Return type:

None

setMinEpsilonRate(rate)
Parameters:

rate (float) – the minimal epsilon rate

Return type:

None

setNumberOfThreads(nb)

If the parameter nb passed as argument is different from 0, the BNLearner will use nb threads during learning, hence overriding pyAgrum’s default number of threads. If, on the contrary, nb is equal to 0, the BNLearner will comply with pyAgrum’s default number of threads.

Parameters:

nb (int) – the number of threads to be used by the BNLearner

Return type:

None

setPeriodSize(p)
Parameters:

p (int) – the number of samples between two stopping tests

Raises:

pyagrum.OutOfBounds – If p<1

Return type:

None

setPossibleEdges(*args)

Adds a constraint to the structure learning algorithm by fixing the set of possible edges.

Parameters:

edges (Set[Tuple[int]]) – a set of edges given as pairs of node ids.

Return type:

None

setPossibleSkeleton(skeleton)

Add a constraint by fixing the set of possible edges as a pyagrum.UndiGraph.

Parameters:

skeleton (pyagrum.UndiGraph) – the undirected graph fixing the set of possible edges
Return type:

BNLearner

setRecordWeight(i, weight)

Set the weight of the ith record

Parameters:
  • i (int) – the position of the record in the database

  • weight (float) – the weight assigned to this record

Raises:

pyagrum.OutOfBounds – if i is outside the set of indices of the records

Return type:

None

setSliceOrder(*args)

Set a partial order on the nodes.

Parameters:

l (list) – a list of sequences of variable ids or names

Return type:

BNLearner

setVerbosity(v)
Parameters:

v (bool) – verbosity

Return type:

None

state()

Returns a dictionary containing the current state of the BNLearner.

Returns:

a dictionary containing the current state of the BNLearner.

Return type:

Dict[str,Any]

useBDeuPrior(weight=1.0)

The BDeu prior adds weight to all the cells of the counting tables. In other words, it is equivalent to adding weight rows to the database in which all the values are equally probable.

Parameters:

weight (float) – the prior weight

Return type:

BNLearner

useDirichletPrior(*args)

Use the Dirichlet prior.

Parameters:
  • source (str|pyagrum.BayesNet) – the Dirichlet related source (filename of a database or a Bayesian network)

  • weight (float (optional)) – the weight of the prior (the ‘size’ of the corresponding ‘virtual database’)

Return type:

BNLearner

useEM(*args)

Sets whether we use EM for parameter learning or not, depending on the value of epsilon.

usage:
  • useEM(epsilon, noise=0.1)

When epsilon is equal to 0.0, EM is forbidden, else EM is used for parameter learning whenever the database contains missing values. In this case, its stopping criterion is a threshold on the log-likelihood evolution rate, i.e., if llc and llo refer to the log-likelihoods at the current and previous EM steps respectively, EM will stop when (llc - llo) / llc drops below epsilon. If you wish to be more specific on which stopping criterion to use, you may prefer exploiting methods useEMWithRateCriterion() or useEMWithDiffCriterion().

Parameters:
  • epsilon (float) –

    if epsilon>0 then EM is used and stops whenever the relative difference between two consecutive log-likelihoods (log-likelihood evolution rate) drops below epsilon.

    if epsilon=0.0 then EM is not used. But if you wish to forbid the use of EM, prefer executing Method forbidEM() rather than useEM(0.0), as it is more explicit.

  • noise (float, default=0.1) – During EM’s initialization, the CPTs are randomly perturbed using the following formula: new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to interval [0,1]. By default, noise is equal to 0.1.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – if epsilon is strictly negative or if noise does not belong to interval [0,1].

useEMWithDiffCriterion(*args)

Enforces that EM with the log-likelihood min difference criterion will be used for parameter learning whenever the dataset contains missing values.

Parameters:
  • epsilon (float) – epsilon sets the approximation stopping criterion: EM stops whenever the difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon should be strictly positive.

  • noise (float (optional, default = 0.1)) – During EM’s initialization, the CPTs are randomly perturbed using the following formula: new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to interval [0,1]. By default, noise is equal to 0.1.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to interval [0,1].

useEMWithRateCriterion(*args)

Enforces that EM with the log-likelihood min evolution rate stopping criterion will be used for parameter learning when the dataset contains missing values.

Parameters:
  • epsilon (float) – epsilon sets the approximation stopping criterion: EM stops whenever the absolute value of the relative difference between two consecutive log-likelihoods drops below epsilon. Note that epsilon should be strictly positive.

  • noise (float, default=0.1) – During EM’s initialization, the CPTs are randomly perturbed using the following formula: new_CPT = (1-noise) * CPT + noise * random_CPT. Parameter noise must belong to interval [0,1]. By default, noise is equal to 0.1.

Returns:

the BNLearner itself, so that we can chain useXXX() methods.

Return type:

pyagrum.BNLearner

Raises:

pyagrum.OutOfBounds – if epsilon is not strictly positive or if noise does not belong to interval [0,1].

useGreedyHillClimbing()

Indicate that we wish to use a greedy hill climbing algorithm.

Return type:

BNLearner

useK2(*args)

Indicate to use the K2 algorithm (which needs a total ordering of the variables).

Parameters:

order (list[int or str]) – the total order of the variables, given by ids or names

Return type:

BNLearner

useLocalSearchWithTabuList(tabu_size=100, nb_decrease=2)

Indicate that we wish to use a local search with tabu list

Parameters:
  • tabu_size (int) – The size of the tabu list

  • nb_decrease (int) – the maximum number of consecutive score-decreasing changes that may be applied

Return type:

BNLearner

useMDLCorrection()

Indicate that we wish to use the MDL correction for MIIC

Return type:

BNLearner

useMIIC()

Indicate that we wish to use MIIC.

Return type:

BNLearner

useNMLCorrection()

Indicate that we wish to use the NML correction for MIIC

Return type:

BNLearner

useNoCorrection()

Indicate that we wish to use the NoCorr correction for MIIC

Return type:

BNLearner

useNoPrior()

Use no prior.

Return type:

BNLearner

useScoreAIC()

Indicate that we wish to use an AIC score.

Return type:

BNLearner

useScoreBD()

Indicate that we wish to use a BD score.

Return type:

BNLearner

useScoreBDeu()

Indicate that we wish to use a BDeu score.

Return type:

BNLearner

useScoreBIC()

Indicate that we wish to use a BIC score.

Return type:

BNLearner

useScoreK2()

Indicate that we wish to use a K2 score.

Return type:

BNLearner

useScoreLog2Likelihood()

Indicate that we wish to use a Log2Likelihood score.

Return type:

BNLearner

useSmoothingPrior(weight=1)

Use the smoothing prior.

Parameters:

weight (float) – pass in argument a weight if you wish to assign a weight to the smoothing, otherwise the current weight of the learner will be used.

Return type:

BNLearner

verbosity()
Returns:

True if the verbosity is enabled

Return type:

bool