aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
gum::learning::ParamEstimatorML Class Reference

The class for estimating parameters of CPTs using Maximum Likelihood.

#include <agrum/BN/learning/paramUtils/paramEstimatorML.h>


Public Member Functions

Constructors / Destructors
 ParamEstimatorML (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
 default constructor
 ParamEstimatorML (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
 default constructor
 ParamEstimatorML (const ParamEstimatorML &from)
 copy constructor
 ParamEstimatorML (ParamEstimatorML &&from)
 move constructor
virtual ParamEstimatorML * clone () const
 virtual copy constructor
virtual ~ParamEstimatorML ()
 destructor
Operators
ParamEstimatorML & operator= (const ParamEstimatorML &from)
 copy operator
ParamEstimatorML & operator= (ParamEstimatorML &&from)
 move operator
Accessors / Modifiers
virtual std::vector< double > parameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes)
 returns the CPT's parameters corresponding to a given nodeset
virtual std::pair< std::vector< double >, double > parametersAndLogLikelihood (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes)
 returns the parameters of a CPT as well as its log-likelihood
std::vector< double > parameters (const NodeId target_node)
 returns the CPT's parameters corresponding to a given target node
Accessors / Modifiers
virtual void clear ()
 clears all the data structures from memory
virtual void setNumberOfThreads (Size nb)
 sets the maximum number of threads that can be used
virtual Size getNumberOfThreads () const
 returns the current max number of threads of the scheduler
virtual bool isGumNumberOfThreadsOverriden () const
 indicates whether the user explicitly set the number of threads
virtual void setMinNbRowsPerThread (const std::size_t nb) const
 changes the minimum number of rows a thread should process in a multithreading context
virtual std::size_t minNbRowsPerThread () const
 returns the minimum number of rows that each thread should process
void setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges)
 sets new ranges to perform the counts used by the parameter estimator
void clearRanges ()
 resets the ranges to the single range covering the whole database
const std::vector< std::pair< std::size_t, std::size_t > > & ranges () const
 returns the current ranges
std::pair< std::vector< double >, double > parametersAndLogLikelihood (const NodeId target_node)
 returns the parameters of a CPT as well as its log-likelihood
template<typename GUM_SCALAR>
double setParameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, Tensor< GUM_SCALAR > &pot, const bool compute_log_likelihood=false)
 sets a CPT's parameters and, possibly, returns its log-likelihood
const Bijection< NodeId, std::size_t > & nodeId2Columns () const
 returns the mapping from ids to column positions in the database
const DatabaseTable & database () const
 returns the database on which we perform the counts
template<typename GUM_SCALAR>
void setBayesNet (const BayesNet< GUM_SCALAR > &new_bn)
 assigns a new Bayes net to all the counter's generators depending on a BN

Protected Attributes

Prior * external_prior_ {nullptr}
 an external prior
Prior * score_internal_prior_ {nullptr}
 if a score was used for learning the structure of the PGM, this is the prior internal to the score
RecordCounter counter_
 the record counter used to parse the database
const std::vector< NodeId > empty_nodevect_
 an empty vector of nodes, used for empty conditioning

Private Member Functions

std::pair< std::vector< double >, double > _parametersAndLogLikelihood_ (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, const bool compute_log_likelihood)

Detailed Description

The class for estimating parameters of CPTs using Maximum Likelihood.

Definition at line 65 of file paramEstimatorML.h.

Constructor & Destructor Documentation

◆ ParamEstimatorML() [1/4]

gum::learning::ParamEstimatorML::ParamEstimatorML ( const DBRowGeneratorParser & parser,
const Prior & external_prior,
const Prior & _score_internal_prior,
const std::vector< std::pair< std::size_t, std::size_t > > & ranges,
const Bijection< NodeId, std::size_t > & nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters
parser: the parser used to parse the database
external_prior: a prior that we add to the computation of the score
score_internal_prior: the prior within the score used to learn the data structure (might be a NoPrior)
ranges: a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
nodeId2Columns: a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
Warning
If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.
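
The column lookup described above can be sketched as follows. This is a minimal, self-contained illustration rather than aGrUM code: `std::map` stands in for `gum::Bijection`, `std::out_of_range` stands in for the `NotFound` exception, and `columnOf` is a hypothetical helper introduced only for this example. It also assumes 0-based column indices, so that "the 2nd column" is column 1.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

// Stand-ins for aGrUM types (assumptions made for this sketch only).
using NodeId       = std::size_t;
using NodeToColumn = std::map<NodeId, std::size_t>;

// Hypothetical helper: returns the database column used for a node.
// An empty mapping is treated as the identity; otherwise, ids that are
// absent from the mapping are rejected.
std::size_t columnOf(const NodeToColumn& nodeId2columns, NodeId node) {
  if (nodeId2columns.empty()) return node;   // identity mapping
  const auto it = nodeId2columns.find(node);
  if (it == nodeId2columns.end())
    throw std::out_of_range("node id not in nodeId2columns");
  return it->second;
}
```

With the mapping `{5 -> 1}`, node 5 is read from column 1, whereas any other node id is rejected.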

References gum::learning::ParamEstimator::ranges().

Referenced by ParamEstimatorML(), ParamEstimatorML(), clone(), operator=(), and operator=().


◆ ParamEstimatorML() [2/4]

gum::learning::ParamEstimatorML::ParamEstimatorML ( const DBRowGeneratorParser & parser,
const Prior & external_prior,
const Prior & _score_internal_prior,
const Bijection< NodeId, std::size_t > & nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters
parser: the parser used to parse the database
external_prior: a prior that we add to the computation of the score
score_internal_prior: the prior within the score used to learn the data structure (might be a NoPrior)
nodeId2Columns: a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
Warning
If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.

◆ ParamEstimatorML() [3/4]

gum::learning::ParamEstimatorML::ParamEstimatorML ( const ParamEstimatorML & from)

copy constructor

References ParamEstimatorML().


◆ ParamEstimatorML() [4/4]

gum::learning::ParamEstimatorML::ParamEstimatorML ( ParamEstimatorML && from)

move constructor

References ParamEstimatorML().


◆ ~ParamEstimatorML()

virtual gum::learning::ParamEstimatorML::~ParamEstimatorML ( )
virtual

destructor

Member Function Documentation

◆ _parametersAndLogLikelihood_()

std::pair< std::vector< double >, double > gum::learning::ParamEstimatorML::_parametersAndLogLikelihood_ ( const NodeId target_node,
const std::vector< NodeId > & conditioning_nodes,
const bool compute_log_likelihood )
private

◆ clear()

virtual void gum::learning::ParamEstimator::clear ( )
virtual inherited

clears all the data structures from memory

Referenced by gum::learning::DAG2BNLearner::createBNwithEM(), and gum::learning::DAG2BNLearner::createBNwithEM().


◆ clearRanges()

void gum::learning::ParamEstimator::clearRanges ( )
inherited

resets the ranges to the single range covering the whole database

◆ clone()

virtual ParamEstimatorML * gum::learning::ParamEstimatorML::clone ( ) const
virtual

virtual copy constructor

Implements gum::learning::ParamEstimator.

References ParamEstimatorML().


◆ database()

const DatabaseTable & gum::learning::ParamEstimator::database ( ) const
inherited

returns the database on which we perform the counts

◆ getNumberOfThreads()

virtual Size gum::learning::ParamEstimator::getNumberOfThreads ( ) const
virtual inherited

returns the current max number of threads of the scheduler

Implements gum::IThreadNumberManager.

◆ isGumNumberOfThreadsOverriden()

virtual bool gum::learning::ParamEstimator::isGumNumberOfThreadsOverriden ( ) const
virtual inherited

indicates whether the user explicitly set the number of threads

Implements gum::IThreadNumberManager.

◆ minNbRowsPerThread()

virtual std::size_t gum::learning::ParamEstimator::minNbRowsPerThread ( ) const
virtual inherited

returns the minimum number of rows that each thread should process

◆ nodeId2Columns()

const Bijection< NodeId, std::size_t > & gum::learning::ParamEstimator::nodeId2Columns ( ) const
inherited

returns the mapping from ids to column positions in the database

Warning
An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

◆ operator=() [1/2]

ParamEstimatorML & gum::learning::ParamEstimatorML::operator= ( const ParamEstimatorML & from)

copy operator

References ParamEstimatorML().


◆ operator=() [2/2]

ParamEstimatorML & gum::learning::ParamEstimatorML::operator= ( ParamEstimatorML && from)

move operator

References ParamEstimatorML(), and gum::learning::ParamEstimator::parameters().


◆ parameters() [1/2]

std::vector< double > gum::learning::ParamEstimator::parameters ( const NodeId target_node)

returns the CPT's parameters corresponding to a given target node

◆ parameters() [2/2]

virtual std::vector< double > gum::learning::ParamEstimatorML::parameters ( const NodeId target_node,
const std::vector< NodeId > & conditioning_nodes )
virtual

returns the CPT's parameters corresponding to a given nodeset

The vector contains the parameters of an n-dimensional CPT. Its dimensions are ordered as follows: the target node comes first, followed by the conditioning nodes (in the order in which they were specified).

Exceptions
DatabaseError: raised if some values of the conditioning sets were not observed in the database.

Implements gum::learning::ParamEstimator.
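
To make the maximum likelihood computation and the vector layout concrete, here is a minimal sketch that normalizes joint counts into conditional probabilities. It is not the aGrUM implementation: `mlParameters` is a hypothetical helper, `std::runtime_error` stands in for `DatabaseError`, priors are ignored, and the flattening rule (the target value varies fastest) is an assumption, since the description above only states that the target dimension comes first.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of aGrUM: normalizes joint counts
// N(target, parents) into P(target | parents).
// Assumed layout: index = target_value + target_domain * parent_config,
// i.e. each block of `target_domain` consecutive entries holds one
// conditional distribution.
std::vector<double> mlParameters(const std::vector<double>& counts,
                                 std::size_t                target_domain) {
  if (target_domain == 0 || counts.size() % target_domain != 0)
    throw std::invalid_argument("counts size must be a multiple of target_domain");

  std::vector<double> params(counts.size());
  for (std::size_t start = 0; start < counts.size(); start += target_domain) {
    double total = 0.0;
    for (std::size_t k = 0; k < target_domain; ++k) total += counts[start + k];

    // Stand-in for DatabaseError: this parent configuration was never observed,
    // so the ratio count / total is undefined.
    if (total == 0.0)
      throw std::runtime_error("unobserved conditioning configuration");

    for (std::size_t k = 0; k < target_domain; ++k)
      params[start + k] = counts[start + k] / total;
  }
  return params;
}
```

For a binary target with one binary parent, the counts `{3, 1, 2, 2}` yield the parameters `{0.75, 0.25, 0.5, 0.5}` under this layout.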

◆ parametersAndLogLikelihood() [1/2]

std::pair< std::vector< double >, double > gum::learning::ParamEstimator::parametersAndLogLikelihood ( const NodeId target_node)
inherited

returns the parameters of a CPT as well as its log-likelihood

◆ parametersAndLogLikelihood() [2/2]

virtual std::pair< std::vector< double >, double > gum::learning::ParamEstimatorML::parametersAndLogLikelihood ( const NodeId target_node,
const std::vector< NodeId > & conditioning_nodes )
virtual

returns the parameters of a CPT as well as its log-likelihood

The vector contains the parameters of an n-dimensional CPT. Its dimensions are ordered as follows: the target node comes first, followed by the conditioning nodes (in the order in which they were specified).

Parameters
target_node: the node on the left side of the CPT's conditioning bar
conditioning_nodes: the nodes on the right side of the conditioning bar
Returns
a pair containing i) the vector of parameters and ii) the log-likelihood

Implements gum::learning::ParamEstimator.
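
This page does not give the formula used for the log-likelihood. As a hedged illustration, the sketch below assumes the usual definition for a CPT, namely the sum over all cells of count times the natural logarithm of the parameter, with empty cells contributing nothing. `logLikelihood` is a hypothetical helper, not an aGrUM function.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: log-likelihood of the counts under the given CPT
// parameters, assumed here to be sum_i counts[i] * ln(params[i]).
// Cells with a zero count are skipped (convention 0 * ln(0) = 0).
double logLikelihood(const std::vector<double>& counts,
                     const std::vector<double>& params) {
  if (counts.size() != params.size())
    throw std::invalid_argument("counts and params must have the same size");

  double ll = 0.0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    if (counts[i] > 0.0) ll += counts[i] * std::log(params[i]);
  }
  return ll;
}
```

Under this assumption, four records split evenly over two values with parameters `{0.5, 0.5}` give a log-likelihood of `4 * ln(0.5)`.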

◆ ranges()

const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::ParamEstimator::ranges ( ) const
inherited

returns the current ranges

Referenced by ParamEstimator(), and gum::learning::ParamEstimatorML::ParamEstimatorML().


◆ setBayesNet()

template<typename GUM_SCALAR>
void gum::learning::ParamEstimator::setBayesNet ( const BayesNet< GUM_SCALAR > & new_bn)
inherited

assign a new Bayes net to all the counter's generators depending on a BN

Typically, generators based on EM or K-means depend on a model to compute their outputs correctly. Method setBayesNet makes it possible to update their BN model.

◆ setMinNbRowsPerThread()

virtual void gum::learning::ParamEstimator::setMinNbRowsPerThread ( const std::size_t nb) const
virtual inherited

changes the minimum number of rows a thread should process in a multithreading context

When computing scores, record counters use several threads to perform counts on the rows of the database. The MinNbRowsPerThread method indicates the minimum number of rows each thread should process. This is used to compute the number of threads actually run: this number is equal to the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
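
The rule above can be written as a one-line computation. The following sketch uses a hypothetical helper, `threadsActuallyRun`, which is not part of aGrUM. The use of integer division, the lower bound of one thread, and the handling of `nb == 0` are assumptions, since the description does not specify them.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper mirroring the rule stated above:
// threads = min(max_threads, nb_records / min_rows_per_thread).
// Integer division and the lower bound of one thread are assumptions.
std::size_t threadsActuallyRun(std::size_t max_threads,
                               std::size_t nb_records,
                               std::size_t min_rows_per_thread) {
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;   // avoid dividing by zero
  const std::size_t by_rows = nb_records / min_rows_per_thread;
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

For example, with at most 8 threads and a minimum of 100 rows per thread, a database of 250 records would be processed by 2 threads under these assumptions.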

◆ setNumberOfThreads()

virtual void gum::learning::ParamEstimator::setNumberOfThreads ( Size nb)
virtual inherited

sets the maximum number of threads that can be used

Parameters
nb: the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's maximum number of threads

Implements gum::IThreadNumberManager.

Referenced by gum::learning::IBNLearner::createParamEstimator_().


◆ setParameters()

template<typename GUM_SCALAR>
double gum::learning::ParamEstimator::setParameters ( const NodeId target_node,
const std::vector< NodeId > & conditioning_nodes,
Tensor< GUM_SCALAR > & pot,
const bool compute_log_likelihood = false )
inherited

sets a CPT's parameters and, possibly, returns its log-likelihood

The tensor (CPT) is assumed to be a conditional probability, the first variable of its variablesSequence() being the target variable, the other ones being on the right side of the conditioning bar.

Parameters
target_node: the node on the left side of the CPT's conditioning bar
conditioning_nodes: the set of nodes on the right side of the conditioning bar
pot: the tensor (CPT) that is filled
compute_log_likelihood: a Boolean indicating whether we wish to compute the log-likelihood or not. Computing it is needed by the EM algorithm
Returns
a double which corresponds to the log-likelihood (w.r.t. the CPT) if compute_log_likelihood=true, else the method returns 0

Referenced by gum::learning::DAG2BNLearner::createBNwithEM().


◆ setRanges()

void gum::learning::ParamEstimator::setRanges ( const std::vector< std::pair< std::size_t, std::size_t > > & new_ranges)
inherited

sets new ranges to perform the counts used by the parameter estimator

Parameters
ranges: a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
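
For the cross-validation use case mentioned above, the ranges to count on are the rows outside the held-out fold. The sketch below builds such a vector of half-open pairs. `trainingRanges` is a hypothetical helper, not an aGrUM function, and it assumes `fold_begin <= fold_end <= nb_rows`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper: ranges covering every row of [0, nb_rows) except the
// held-out fold [fold_begin, fold_end). Each pair (X, Y) denotes the
// half-open interval [X, Y), as in the description above.
Ranges trainingRanges(std::size_t nb_rows,
                      std::size_t fold_begin,
                      std::size_t fold_end) {
  Ranges ranges;
  if (fold_begin > 0) ranges.emplace_back(std::size_t{0}, fold_begin);
  if (fold_end < nb_rows) ranges.emplace_back(fold_end, nb_rows);
  return ranges;
}
```

A vector built this way has the type expected by setRanges(). One caveat follows from the description above: if the fold spans the whole database, the result is empty, and an empty set of ranges is interpreted as the whole database rather than as no rows.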

Referenced by gum::learning::IBNLearner::createParamEstimator_().


Member Data Documentation

◆ counter_

RecordCounter gum::learning::ParamEstimator::counter_
protected inherited

the record counter used to parse the database

Definition at line 273 of file paramEstimator.h.

◆ empty_nodevect_

const std::vector< NodeId > gum::learning::ParamEstimator::empty_nodevect_
protected inherited

an empty vector of nodes, used for empty conditioning

Definition at line 276 of file paramEstimator.h.

◆ external_prior_

Prior* gum::learning::ParamEstimator::external_prior_ {nullptr}
protected inherited

an external prior

Definition at line 266 of file paramEstimator.h.


◆ score_internal_prior_

Prior* gum::learning::ParamEstimator::score_internal_prior_ {nullptr}
protected inherited

if a score was used for learning the structure of the PGM, this is the prior internal to the score

Definition at line 270 of file paramEstimator.h.


The documentation for this class was generated from the following file: