aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
The class for estimating parameters of CPTs using Maximum Likelihood.
#include <agrum/BN/learning/paramUtils/paramEstimatorML.h>
Public Member Functions | |
Constructors / Destructors | |
| ParamEstimatorML (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| ParamEstimatorML (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| ParamEstimatorML (const ParamEstimatorML &from) | |
| copy constructor | |
| ParamEstimatorML (ParamEstimatorML &&from) | |
| move constructor | |
| virtual ParamEstimatorML * | clone () const |
| virtual copy constructor | |
| virtual | ~ParamEstimatorML () |
| destructor | |
Operators | |
| ParamEstimatorML & | operator= (const ParamEstimatorML &from) |
| copy operator | |
| ParamEstimatorML & | operator= (ParamEstimatorML &&from) |
| move operator | |
Accessors / Modifiers | |
| virtual std::vector< double > | parameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes) |
| returns the CPT's parameters corresponding to a given nodeset | |
| virtual std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes) |
| returns the parameters of a CPT as well as its log-likelihood | |
| std::vector< double > | parameters (const NodeId target_node) |
| returns the CPT's parameters corresponding to a given target node | |
Accessors / Modifiers | |
| virtual void | clear () |
| clears all the data structures from memory | |
| virtual void | setNumberOfThreads (Size nb) |
| sets the maximum number of threads that can be used | |
| virtual Size | getNumberOfThreads () const |
| returns the current max number of threads of the scheduler | |
| virtual bool | isGumNumberOfThreadsOverriden () const |
| indicates whether the user explicitly set the number of threads | |
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const |
| changes the minimum number of rows a thread should process in a multithreading context | |
| virtual std::size_t | minNbRowsPerThread () const |
| returns the minimum number of rows that each thread should process | |
| void | setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges) |
| sets new ranges to perform the counts used by the parameter estimator | |
| void | clearRanges () |
| resets the ranges to the single range covering the whole database | |
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges () const |
| returns the current ranges | |
| std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node) |
| returns the parameters of a CPT as well as its log-likelihood | |
| template<typename GUM_SCALAR> | |
| double | setParameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, Tensor< GUM_SCALAR > &pot, const bool compute_log_likelihood=false) |
| sets a CPT's parameters and, optionally, returns its log-likelihood | |
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const |
| returns the mapping from ids to column positions in the database | |
| const DatabaseTable & | database () const |
| returns the database on which we perform the counts | |
| template<typename GUM_SCALAR> | |
| void | setBayesNet (const BayesNet< GUM_SCALAR > &new_bn) |
| assigns a new Bayes net to all the counter's generators that depend on a BN | |
Protected Attributes | |
| Prior * | external_prior_ {nullptr} |
| an external prior | |
| Prior * | score_internal_prior_ {nullptr} |
| if a score was used for learning the structure of the PGM, this is the prior internal to the score | |
| RecordCounter | counter_ |
| the record counter used to parse the database | |
| const std::vector< NodeId > | empty_nodevect_ |
| an empty vector of nodes, used for empty conditioning | |
Private Member Functions | |
| std::pair< std::vector< double >, double > | _parametersAndLogLikelihood_ (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, const bool compute_log_likelihood) |
The class for estimating parameters of CPTs using Maximum Likelihood.
Definition at line 65 of file paramEstimatorML.h.
| gum::learning::ParamEstimatorML::ParamEstimatorML | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | external_prior, | ||
| const Prior & | _score_internal_prior, | ||
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the score |
| score_internal_prior | The prior within the score used to learn the data structure (might be a NoPrior) |
| ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to a single interval [X,Y) covering the whole database. |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding columns in the DatabaseTable parsed by the parser. This makes it possible to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is the identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
References gum::learning::ParamEstimator::ranges().
Referenced by ParamEstimatorML(), ParamEstimatorML(), clone(), operator=(), and operator=().
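As an illustration of the `ranges` argument, the following sketch builds the training ranges for one fold of a k-fold cross validation, i.e., the half-open row intervals that remain once the test fold is left out. It is standalone C++ that does not call aGrUM; `Range`, `trainingRanges`, and the equal-size fold split are hypothetical choices made for this example only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical alias: one half-open interval [first, second) of row indices.
using Range = std::pair<std::size_t, std::size_t>;

// Hypothetical helper (not part of aGrUM): returns the row ranges used for
// training when fold number `fold` (0-based) is held out for testing.
std::vector<Range> trainingRanges(std::size_t nb_rows,
                                  std::size_t nb_folds,
                                  std::size_t fold) {
  if (nb_folds < 2 || fold >= nb_folds)
    throw std::invalid_argument("need at least 2 folds and fold < nb_folds");

  // Rows [test_begin, test_end) form the held-out fold.
  const std::size_t test_begin = fold * nb_rows / nb_folds;
  const std::size_t test_end   = (fold + 1) * nb_rows / nb_folds;

  std::vector<Range> ranges;
  if (test_begin > 0) ranges.emplace_back(std::size_t{0}, test_begin);
  if (test_end < nb_rows) ranges.emplace_back(test_end, nb_rows);

  // An empty set of ranges would mean "the whole database", which is the
  // opposite of what is wanted here, so it is rejected.
  if (ranges.empty())
    throw std::invalid_argument("no training rows left for this fold");
  return ranges;
}
```

The returned vector has the element type of the `ranges` parameter, `std::vector< std::pair< std::size_t, std::size_t > >`. For 100 rows and 5 folds, holding out fold 1 yields the two ranges [0,20) and [40,100).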
| gum::learning::ParamEstimatorML::ParamEstimatorML | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | external_prior, | ||
| const Prior & | _score_internal_prior, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the score |
| score_internal_prior | The prior within the score used to learn the data structure (might be a NoPrior) |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding columns in the DatabaseTable parsed by the parser. This makes it possible to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is the identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
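The identity convention for an empty `nodeId2Columns` can be sketched in a few lines. This is a standalone illustration, not aGrUM code: `NodeId` is a stand-in typedef, `std::unordered_map` stands in for `Bijection`, `columnOf` is a hypothetical helper, and column indices are assumed to be zero-based (so the 2nd column has index 1).

```cpp
#include <cstddef>
#include <unordered_map>

// Stand-in for gum::NodeId (assumption made for this sketch).
using NodeId = std::size_t;

// Hypothetical helper: returns the database column holding the data of `node`.
// An empty mapping is interpreted as the identity, as described above.
std::size_t columnOf(const std::unordered_map<NodeId, std::size_t>& nodeId2columns,
                     NodeId node) {
  if (nodeId2columns.empty()) return node;  // identity: column index == NodeId
  return nodeId2columns.at(node);           // throws std::out_of_range if absent
}
```

With the mapping {5 -> 1}, node 5 is read from the 2nd column; with an empty mapping, node 5 is read from the column of index 5.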
| gum::learning::ParamEstimatorML::ParamEstimatorML | ( | const ParamEstimatorML & | from | ) |
| gum::learning::ParamEstimatorML::ParamEstimatorML | ( | ParamEstimatorML && | from | ) |
| virtual | ~ParamEstimatorML () | virtual |
destructor
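| std::pair< std::vector< double >, double > | _parametersAndLogLikelihood_ (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, const bool compute_log_likelihood) | private |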
| virtual void | clear () | virtual, inherited |
clears all the data structures from memory
Referenced by gum::learning::DAG2BNLearner::createBNwithEM(), and gum::learning::DAG2BNLearner::createBNwithEM().
| void | clearRanges () | inherited |
resets the ranges to the single range covering the whole database
| virtual ParamEstimatorML * | clone () const | virtual |
virtual copy constructor
Implements gum::learning::ParamEstimator.
References ParamEstimatorML().
| const DatabaseTable & | database () const | inherited |
returns the database on which we perform the counts
| virtual Size | getNumberOfThreads () const | virtual, inherited |
returns the current max number of threads of the scheduler
Implements gum::IThreadNumberManager.
| virtual bool | isGumNumberOfThreadsOverriden () const | virtual, inherited |
indicates whether the user explicitly set the number of threads
Implements gum::IThreadNumberManager.
| virtual std::size_t | minNbRowsPerThread () const | virtual, inherited |
returns the minimum number of rows that each thread should process
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const | inherited |
returns the mapping from ids to column positions in the database
| ParamEstimatorML & gum::learning::ParamEstimatorML::operator= | ( | const ParamEstimatorML & | from | ) |
| ParamEstimatorML & gum::learning::ParamEstimatorML::operator= | ( | ParamEstimatorML && | from | ) |
move operator
References ParamEstimatorML(), and gum::learning::ParamEstimator::parameters().
| std::vector< double > gum::learning::ParamEstimator::parameters | ( | const NodeId | target_node | ) |
returns the CPT's parameters corresponding to a given target node
| virtual std::vector< double > | parameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes) | virtual |
returns the CPT's parameters corresponding to a given nodeset
The vector contains the parameters of an n-dimensional CPT. The dimensions of the CPT are ordered within the vector as follows: first the target node, then the conditioning nodes (in the order in which they were specified).
| DatabaseError | is raised if some values of the conditioning sets were not observed in the database. |
Implements gum::learning::ParamEstimator.
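The computation behind such a parameter vector can be sketched for one target and a single conditioning node. This is a minimal standalone illustration, not aGrUM's implementation: `mlParameters` is a hypothetical helper, no prior is added to the counts, a standard exception stands in for `DatabaseError`, and the flat layout (target index varying fastest) is an assumption that this page does not state explicitly.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of aGrUM): maximum-likelihood estimate of
// P(target | parent) from a flat vector of joint counts.
// Assumed layout: counts[t + target_size * p], i.e., the target dimension
// comes first and the conditioning dimension second.
std::vector<double> mlParameters(const std::vector<double>& counts,
                                 std::size_t target_size,
                                 std::size_t parent_size) {
  if (counts.size() != target_size * parent_size)
    throw std::invalid_argument("counts size does not match the domain sizes");

  std::vector<double> params(counts.size());
  for (std::size_t p = 0; p < parent_size; ++p) {
    // N(parent = p): sum of the joint counts over all target values.
    double total = 0.0;
    for (std::size_t t = 0; t < target_size; ++t)
      total += counts[t + target_size * p];

    // An unobserved conditioning value would give 0/0; this sketch reports it
    // with std::runtime_error, standing in for the documented DatabaseError.
    if (total == 0.0)
      throw std::runtime_error("conditioning value not observed in the data");

    // Normalize the counts so that each conditional distribution sums to 1.
    for (std::size_t t = 0; t < target_size; ++t)
      params[t + target_size * p] = counts[t + target_size * p] / total;
  }
  return params;
}
```

For counts {3, 1, 2, 2} with two target values and two parent values, the sketch returns {0.75, 0.25, 0.5, 0.5}: each consecutive pair is one conditional distribution of the target.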
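| std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node) | inherited |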
returns the parameters of a CPT as well as its log-likelihood
| virtual std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes) | virtual |
returns the parameters of a CPT as well as its log-likelihood
The vector contains the parameters of an n-dimensional CPT. The dimensions of the CPT are ordered within the vector as follows: first the target node, then the conditioning nodes (in the order in which they were specified).
| target_node | the node on the left side of the CPT's conditioning bar |
| conditioning_nodes | the nodes on the right side of the conditioning bar |
Implements gum::learning::ParamEstimator.
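For intuition about the second element of the returned pair, the sketch below computes the textbook log-likelihood of a CPT, the sum over all cells of count times log(parameter). It is an assumption that this matches what aGrUM returns (for instance, how priors enter the value is not stated on this page); `logLikelihood` is a hypothetical standalone helper.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of aGrUM): textbook log-likelihood of a CPT,
// given the joint counts and the parameters stored in the same flat layout.
double logLikelihood(const std::vector<double>& counts,
                     const std::vector<double>& params) {
  if (counts.size() != params.size())
    throw std::invalid_argument("counts and params must have the same size");

  double ll = 0.0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    // Cells that were never observed contribute nothing; skipping them also
    // avoids evaluating 0 * log(0).
    if (counts[i] > 0.0) ll += counts[i] * std::log(params[i]);
  }
  return ll;
}
```

With counts {2, 2} and parameters {0.5, 0.5}, the result is 4 * log(0.5).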
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges () const | inherited |
returns the current ranges
Referenced by ParamEstimator(), and gum::learning::ParamEstimatorML::ParamEstimatorML().
| template<typename GUM_SCALAR> |
| void | setBayesNet (const BayesNet< GUM_SCALAR > &new_bn) | inherited |
assigns a new Bayes net to all the counter's generators that depend on a BN
Typically, generators based on EM or K-means depend on a model to compute their outputs correctly. The setBayesNet method updates the BN model they use.
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const | virtual, inherited |
changes the minimum number of rows a thread should process in a multithreading context
When computing scores, record counters use several threads to perform counts on the rows of the database. The value set by this method indicates how many rows each thread should process at least. It is used to compute the number of threads actually run, which is the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
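The thread-count rule stated above can be written out as a small standalone function. This is a sketch, not aGrUM code: `nbThreadsUsed` is a hypothetical name, integer division is assumed, and clamping the result to at least one thread (and treating a zero `nb` as 1) is an assumption added so the function always returns a usable value.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of aGrUM): number of threads actually run,
// computed as min(max_threads, nb_records / min_rows_per_thread).
std::size_t nbThreadsUsed(std::size_t max_threads,
                          std::size_t nb_records,
                          std::size_t min_rows_per_thread) {
  // Assumption: guard against a division by zero.
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;

  const std::size_t by_rows = nb_records / min_rows_per_thread;

  // Assumption: always use at least one thread, even for tiny databases.
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

With at most 8 threads and a minimum of 100 rows per thread, a database of 1000 records would use 8 threads, while one of 250 records would use only 2.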
| virtual void | setNumberOfThreads (Size nb) | virtual, inherited |
sets the maximum number of threads that can be used
| nb | the maximum number of threads to be used. If this number is set to 0, it defaults to aGrUM's maximum number of threads |
Implements gum::IThreadNumberManager.
Referenced by gum::learning::IBNLearner::createParamEstimator_().
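| template<typename GUM_SCALAR> |
| double | setParameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, Tensor< GUM_SCALAR > &pot, const bool compute_log_likelihood=false) | inherited |
sets a CPT's parameters and, optionally, returns its log-likelihood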
The tensor (CPT) is assumed to be a conditional probability, the first variable of its variablesSequence() being the target variable, the other ones being on the right side of the conditioning bar.
| target_node | the node on the left side of the CPT's conditioning bar |
| conditioning_nodes | the set of nodes on the right side of the conditioning bar |
| pot | the tensor (CPT) that is filled |
| compute_log_likelihood | a Boolean indicating whether we wish to compute the log-likelihood or not. Computing it is needed by the EM algorithm |
Referenced by gum::learning::DAG2BNLearner::createBNwithEM().
| void | setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges) | inherited |
sets new ranges to perform the counts used by the parameter estimator
| ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to a single interval [X,Y) covering the whole database. |
Referenced by gum::learning::IBNLearner::createParamEstimator_().
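To make the range semantics concrete, the following standalone sketch counts how many database rows a set of ranges selects: the union of the half-open intervals, or every row when the set is empty. `Range` and `nbSelectedRows` are hypothetical names, and clipping intervals that extend past the end of the database is an assumption of the sketch rather than documented aGrUM behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical alias: one half-open interval [first, second) of row indices.
using Range = std::pair<std::size_t, std::size_t>;

// Hypothetical helper (not part of aGrUM): number of rows covered by the
// union of the ranges; an empty set of ranges selects the whole database.
std::size_t nbSelectedRows(const std::vector<Range>& ranges, std::size_t nb_rows) {
  if (ranges.empty()) return nb_rows;

  // Mark each selected row once, so overlapping ranges are not counted twice.
  std::vector<bool> selected(nb_rows, false);
  for (const Range& r : ranges) {
    const std::size_t end = std::min(r.second, nb_rows);  // assumption: clip
    for (std::size_t row = r.first; row < end; ++row) selected[row] = true;
  }
  return static_cast<std::size_t>(
      std::count(selected.begin(), selected.end(), true));
}
```

For a database of 100 rows, the ranges [0,20) and [40,100) select 80 rows, and the overlapping ranges [0,10) and [5,15) select 15.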
| RecordCounter | counter_ | protected, inherited |
the record counter used to parse the database
Definition at line 273 of file paramEstimator.h.
| const std::vector< NodeId > | empty_nodevect_ | protected, inherited |
an empty vector of nodes, used for empty conditioning
Definition at line 276 of file paramEstimator.h.
| Prior * | external_prior_ {nullptr} | protected, inherited |
| Prior * | score_internal_prior_ {nullptr} | protected, inherited |
if a score was used for learning the structure of the PGM, this is the prior internal to the score
Definition at line 270 of file paramEstimator.h.