aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
The base class for estimating parameters of CPTs.
#include <agrum/BN/learning/paramUtils/paramEstimator.h>
Public Member Functions | |
Constructors / Destructors | |
| ParamEstimator (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| ParamEstimator (const DBRowGeneratorParser &parser, const Prior &external_prior, const Prior &_score_internal_prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| ParamEstimator (const ParamEstimator &from) | |
| copy constructor | |
| ParamEstimator (ParamEstimator &&from) | |
| move constructor | |
| virtual ParamEstimator * | clone () const =0 |
| virtual copy constructor | |
| virtual | ~ParamEstimator () |
| destructor | |
Accessors / Modifiers | |
| virtual void | clear () |
| clears all the data structures from memory | |
| virtual void | setNumberOfThreads (Size nb) |
| sets the maximum number of threads that can be used | |
| virtual Size | getNumberOfThreads () const |
| returns the current max number of threads of the scheduler | |
| virtual bool | isGumNumberOfThreadsOverriden () const |
| indicates whether the user explicitly set the number of threads | |
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const |
| changes the minimum number of rows a thread should process in a multithreading context | |
| virtual std::size_t | minNbRowsPerThread () const |
| returns the minimum number of rows that each thread should process | |
| void | setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges) |
| sets new ranges to perform the counts used by the parameter estimator | |
| void | clearRanges () |
| resets the ranges to the single range covering the whole database | |
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges () const |
| returns the current ranges | |
| std::vector< double > | parameters (const NodeId target_node) |
| returns the CPT's parameters corresponding to a given target node | |
| std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node) |
| returns the parameters of a CPT as well as its log-likelihood | |
| virtual std::vector< double > | parameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes)=0 |
| returns the CPT's parameters corresponding to a given nodeset | |
| virtual std::pair< std::vector< double >, double > | parametersAndLogLikelihood (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes)=0 |
| returns the parameters of a CPT as well as its log-likelihood | |
| template<typename GUM_SCALAR> | |
| double | setParameters (const NodeId target_node, const std::vector< NodeId > &conditioning_nodes, Tensor< GUM_SCALAR > &pot, const bool compute_log_likelihood=false) |
| sets a CPT's parameters and, optionally, returns its log-likelihood | |
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const |
| returns the mapping from ids to column positions in the database | |
| const DatabaseTable & | database () const |
| returns the database on which we perform the counts | |
| template<typename GUM_SCALAR> | |
| void | setBayesNet (const BayesNet< GUM_SCALAR > &new_bn) |
| assigns a new Bayes net to all the counter's generators that depend on a BN | |
Protected Member Functions | |
| ParamEstimator & | operator= (const ParamEstimator &from) |
| copy operator | |
| ParamEstimator & | operator= (ParamEstimator &&from) |
| move operator | |
Protected Attributes | |
| Prior * | external_prior_ {nullptr} |
| an external prior | |
| Prior * | score_internal_prior_ {nullptr} |
| if a score was used for learning the structure of the PGM, this is the prior internal to the score | |
| RecordCounter | counter_ |
| the record counter used to parse the database | |
| const std::vector< NodeId > | empty_nodevect_ |
| an empty vector of nodes, used for empty conditioning | |
The base class for estimating parameters of CPTs.
Definition at line 67 of file paramEstimator.h.
| gum::learning::ParamEstimator::ParamEstimator | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | external_prior, | ||
| const Prior & | _score_internal_prior, | ||
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the score |
| score_internal_prior | The prior within the score used to learn the data structure (might be a NoPrior) |
| ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible, for example, to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
References ranges().
Referenced by ParamEstimator(), ParamEstimator(), clone(), operator=(), and operator=().
| gum::learning::ParamEstimator::ParamEstimator | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | external_prior, | ||
| const Prior & | _score_internal_prior, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the score |
| score_internal_prior | The prior within the score used to learn the data structure (might be a NoPrior) |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible, for example, to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
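The lookup rule described for nodeId2Columns can be sketched without aGrUM. This is only an illustration: `NodeToColumn` and `columnOf` are hypothetical names, `std::unordered_map` stands in for `Bijection< NodeId, std::size_t >`, and column indices are assumed to be 0-based (so the "2nd column" is index 1).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-ins for gum::NodeId and Bijection< NodeId, std::size_t >.
using NodeId = std::size_t;
using NodeToColumn = std::unordered_map<NodeId, std::size_t>;

// Returns the database column holding the data of the given node.
// An empty mapping is treated as the identity, as the documentation states.
std::size_t columnOf(const NodeToColumn& nodeId2columns, NodeId node) {
  if (nodeId2columns.empty()) return node;  // identity: NodeId == column index
  const auto it = nodeId2columns.find(node);
  assert(it != nodeId2columns.end());       // sketch: every node must be mapped
  return it->second;
}
```

With a mapping `{{5, 1}}`, `columnOf(mapping, 5)` yields column 1; with an empty mapping, node 3 is read from column 3.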
| gum::learning::ParamEstimator::ParamEstimator | ( | const ParamEstimator & | from | ) |
| gum::learning::ParamEstimator::ParamEstimator | ( | ParamEstimator && | from | ) |
| virtual gum::learning::ParamEstimator::~ParamEstimator | ( | ) |
virtual |
destructor
| virtual void gum::learning::ParamEstimator::clear | ( | ) |
virtual |
clears all the data structures from memory
Referenced by gum::learning::DAG2BNLearner::createBNwithEM(), and gum::learning::DAG2BNLearner::createBNwithEM().
| void gum::learning::ParamEstimator::clearRanges | ( | ) |
resets the ranges to the single range covering the whole database
| virtual ParamEstimator * gum::learning::ParamEstimator::clone | ( | ) | const |
pure virtual |
virtual copy constructor
Implemented in gum::learning::ParamEstimatorML.
References ParamEstimator().
| const DatabaseTable & gum::learning::ParamEstimator::database | ( | ) | const |
returns the database on which we perform the counts
| virtual Size gum::learning::ParamEstimator::getNumberOfThreads | ( | ) | const |
virtual |
returns the current max number of threads of the scheduler
Implements gum::IThreadNumberManager.
| virtual bool gum::learning::ParamEstimator::isGumNumberOfThreadsOverriden | ( | ) | const |
virtual |
indicates whether the user explicitly set the number of threads
Implements gum::IThreadNumberManager.
| virtual std::size_t gum::learning::ParamEstimator::minNbRowsPerThread | ( | ) | const |
virtual |
returns the minimum number of rows that each thread should process
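| const Bijection< NodeId, std::size_t > & gum::learning::ParamEstimator::nodeId2Columns | ( | ) | const |
returns the mapping from ids to column positions in the database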
|
protected |
|
protected |
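| std::vector< double > gum::learning::ParamEstimator::parameters | ( | const NodeId | target_node | ) |
returns the CPT's parameters corresponding to a given target node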
Referenced by gum::learning::ParamEstimatorML::operator=().
| virtual std::vector< double > gum::learning::ParamEstimator::parameters | ( | const NodeId | target_node, |
| const std::vector< NodeId > & | conditioning_nodes ) |
pure virtual |
returns the CPT's parameters corresponding to a given nodeset
The vector contains the parameters of an n-dimensional CPT. The dimensions of the CPT are ordered within the vector as follows: first the target node, then the conditioning nodes (in the order in which they were specified).
Implemented in gum::learning::ParamEstimatorML.
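To make the dimension ordering concrete, here is a minimal sketch of how a position in such a flat vector could be computed. The text above only fixes the order of the dimensions (target first, then conditioning nodes); that the first dimension varies fastest is an assumption of this sketch, and `flatIndex` is a hypothetical helper, not an aGrUM function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat position of one CPT entry, ASSUMING the first dimension (the target)
// varies fastest. domain_sizes[0] is the target's domain size, followed by the
// conditioning nodes' sizes in the order in which they were specified;
// values[d] is the value index chosen for dimension d.
std::size_t flatIndex(const std::vector<std::size_t>& domain_sizes,
                      const std::vector<std::size_t>& values) {
  assert(domain_sizes.size() == values.size());
  std::size_t index = 0;
  std::size_t stride = 1;  // number of entries spanned by one step in dimension d
  for (std::size_t d = 0; d < domain_sizes.size(); ++d) {
    assert(values[d] < domain_sizes[d]);
    index += values[d] * stride;
    stride *= domain_sizes[d];
  }
  return index;
}
```

Under that assumption, for a binary target with one ternary conditioning node, the entry for target value 1 and conditioning value 2 sits at position 1 + 2 * 2 = 5.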
| std::pair< std::vector< double >, double > gum::learning::ParamEstimator::parametersAndLogLikelihood | ( | const NodeId | target_node | ) |
returns the parameters of a CPT as well as its log-likelihood
| virtual std::pair< std::vector< double >, double > gum::learning::ParamEstimator::parametersAndLogLikelihood | ( | const NodeId | target_node, |
| const std::vector< NodeId > & | conditioning_nodes ) |
pure virtual |
returns the parameters of a CPT as well as its log-likelihood
The vector contains the parameters of an n-dimensional CPT. The dimensions of the CPT are ordered within the vector as follows: first the target node, then the conditioning nodes (in the order in which they were specified).
| target_node | the node on the left side of the CPT's conditioning bar |
| conditioning_nodes | the nodes on the right side of the conditioning bar |
Implemented in gum::learning::ParamEstimatorML.
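As an illustration of what a pair of parameters and log-likelihood can look like, the sketch below normalizes raw counts by maximum likelihood. It is not the aGrUM implementation: `mlParametersAndLogLikelihood` is a hypothetical helper, priors are ignored, the counts are assumed to be laid out with the target value varying fastest, and the uniform fallback for unobserved configurations is a choice made only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of aGrUM: turns raw counts into CPT parameters
// and accumulates the log-likelihood of the counted data.
// ASSUMPTION: each consecutive block of target_domain_size entries shares one
// configuration of the conditioning nodes (target value varies fastest).
std::pair<std::vector<double>, double>
mlParametersAndLogLikelihood(const std::vector<double>& counts,
                             std::size_t target_domain_size) {
  assert(target_domain_size > 0);
  assert(counts.size() % target_domain_size == 0);
  std::vector<double> params(counts.size());
  double log_likelihood = 0.0;
  for (std::size_t start = 0; start < counts.size(); start += target_domain_size) {
    double block_total = 0.0;
    for (std::size_t k = 0; k < target_domain_size; ++k) {
      block_total += counts[start + k];
    }
    for (std::size_t k = 0; k < target_domain_size; ++k) {
      const double n = counts[start + k];
      // ASSUMPTION: uniform distribution when the configuration was never observed.
      params[start + k] =
          block_total > 0.0 ? n / block_total : 1.0 / target_domain_size;
      if (n > 0.0) log_likelihood += n * std::log(params[start + k]);
    }
  }
  return {params, log_likelihood};
}
```

For counts `{3, 1, 2, 2}` and a binary target, the parameters are `{0.75, 0.25, 0.5, 0.5}` and the log-likelihood is 3 ln 0.75 + ln 0.25 + 4 ln 0.5.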
| const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::ParamEstimator::ranges | ( | ) | const |
returns the current ranges
Referenced by ParamEstimator(), and gum::learning::ParamEstimatorML::ParamEstimatorML().
| void gum::learning::ParamEstimator::setBayesNet | ( | const BayesNet< GUM_SCALAR > & | new_bn | ) |
assign a new Bayes net to all the counter's generators depending on a BN
Typically, generators based on EM or K-means depend on a model to compute their outputs correctly. The setBayesNet method updates their BN model.
| virtual void gum::learning::ParamEstimator::setMinNbRowsPerThread | ( | const std::size_t | nb | ) | const |
virtual |
changes the minimum number of rows a thread should process in a multithreading context
When computing a score, record counters use several threads to perform counts on the rows of the database. The MinNbRowsPerThread method indicates the minimum number of rows each thread should process. This is used to compute the number of threads actually run, which is the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
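The thread-count rule above can be written as a one-line computation. The sketch below is only illustrative: `threadsActuallyRun` is a hypothetical name, the division is assumed to be an integer division, and keeping at least one thread for very small databases is an assumption the text does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Number of threads used for counting: the minimum of the maximum number of
// threads allowed and (number of records / minimum rows per thread).
// ASSUMPTIONS: integer division, and never fewer than one thread.
std::size_t threadsActuallyRun(std::size_t max_threads,
                               std::size_t nb_records,
                               std::size_t min_rows_per_thread) {
  assert(max_threads > 0);
  assert(min_rows_per_thread > 0);
  const std::size_t by_rows = nb_records / min_rows_per_thread;
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

For example, with at most 8 threads and a minimum of 100 rows per thread, a database of 250 records would be processed by 2 threads, while one of 1000 records would use all 8.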
| virtual void gum::learning::ParamEstimator::setNumberOfThreads | ( | Size | nb | ) |
virtual |
sets the maximum number of threads that can be used
| nb | the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's maximum number of threads |
Implements gum::IThreadNumberManager.
Referenced by gum::learning::IBNLearner::createParamEstimator_().
| double gum::learning::ParamEstimator::setParameters | ( | const NodeId | target_node, |
| const std::vector< NodeId > & | conditioning_nodes, | ||
| Tensor< GUM_SCALAR > & | pot, | ||
| const bool | compute_log_likelihood = false ) |
sets a CPT's parameters and, optionally, returns its log-likelihood
The tensor (CPT) is assumed to be a conditional probability, the first variable of its variablesSequence() being the target variable, the other ones being on the right side of the conditioning bar.
| target_node | the node on the left side of the CPT's conditioning bar |
| conditioning_nodes | the set of nodes on the right side of the conditioning bar |
| pot | the tensor (CPT) that is filled |
| compute_log_likelihood | a Boolean indicating whether we wish to compute the log-likelihood or not. Computing it is needed by the EM algorithm |
Referenced by gum::learning::DAG2BNLearner::createBNwithEM().
| void gum::learning::ParamEstimator::setRanges | ( | const std::vector< std::pair< std::size_t, std::size_t > > & | new_ranges | ) |
sets new ranges to perform the counts used by the parameter estimator
| new_ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
Referenced by gum::learning::IBNLearner::createParamEstimator_().
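For the cross-validation use case mentioned above, the ranges to pass could be built as in the following sketch. `trainingRanges` and `nbRowsCovered` are hypothetical helpers (not aGrUM functions); only the pair-of-indices representation and the half-open [Xi,Yi) convention come from the documentation, and non-overlapping ranges are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Ranges covering every row except the held-out fold [fold_begin, fold_end).
Ranges trainingRanges(std::size_t nb_rows,
                      std::size_t fold_begin,
                      std::size_t fold_end) {
  assert(fold_begin <= fold_end && fold_end <= nb_rows);
  Ranges ranges;
  if (fold_begin > 0) ranges.emplace_back(std::size_t{0}, fold_begin);
  if (fold_end < nb_rows) ranges.emplace_back(fold_end, nb_rows);
  return ranges;
}

// Number of rows on which counts are performed, assuming the ranges do not
// overlap. An empty set of ranges stands for the whole database.
std::size_t nbRowsCovered(const Ranges& ranges, std::size_t nb_rows) {
  if (ranges.empty()) return nb_rows;
  std::size_t total = 0;
  for (const auto& range : ranges) {
    assert(range.first <= range.second && range.second <= nb_rows);
    total += range.second - range.first;
  }
  return total;
}
```

With 100 rows and a held-out fold [20,40), the training ranges are {(0,20),(40,100)} and cover 80 rows. Note one pitfall of the empty-set convention: if the held-out fold were the whole database, `trainingRanges` would return an empty set, which means "all rows" rather than "no rows".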
| RecordCounter gum::learning::ParamEstimator::counter_ |
protected |
the record counter used to parse the database
Definition at line 273 of file paramEstimator.h.
| const std::vector< NodeId > gum::learning::ParamEstimator::empty_nodevect_ |
protected |
an empty vector of nodes, used for empty conditioning
Definition at line 276 of file paramEstimator.h.
| Prior * gum::learning::ParamEstimator::external_prior_ {nullptr} |
protected |
| Prior * gum::learning::ParamEstimator::score_internal_prior_ {nullptr} |
protected |
if a score was used for learning the structure of the PGM, this is the prior internal to the score
Definition at line 270 of file paramEstimator.h.