aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
the class for computing the NML penalty used by MIIC
#include <kNML.h>
Public Member Functions | |
Constructors / Destructors | |
| KNML (const DBRowGeneratorParser &parser, const Prior &prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| KNML (const DBRowGeneratorParser &parser, const Prior &prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) | |
| default constructor | |
| KNML (const KNML &from) | |
| copy constructor | |
| KNML (KNML &&from) | |
| move constructor | |
| virtual KNML * | clone () const |
| virtual copy constructor | |
| virtual | ~KNML () |
| destructor | |
Operators | |
| KNML & | operator= (const KNML &from) |
| copy operator | |
| KNML & | operator= (KNML &&from) |
| move operator | |
Accessors / Modifiers | |
| virtual void | clear () |
| clears all the data structures from memory, including the C_n^r cache | |
| virtual void | clearCache () |
| clears the current C_n^r cache | |
| virtual void | useCache (const bool on_off) |
| turn on/off the use of the C_n^r cache | |
| virtual void | setNumberOfThreads (Size nb) |
| changes the max number of threads used to parse the database | |
| virtual Size | getNumberOfThreads () const |
| returns the number of threads used to parse the database | |
| virtual bool | isGumNumberOfThreadsOverriden () const |
| indicates whether the user explicitly set the number of threads | |
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const |
| changes the minimum number of rows a thread should process in a multithreading context | |
| virtual std::size_t | minNbRowsPerThread () const |
| returns the minimum number of rows that each thread should process | |
| void | setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges) |
| sets new ranges to perform the counts used by kNML | |
| void | clearRanges () |
| reset the ranges to the one range corresponding to the whole database | |
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges () const |
| returns the current ranges | |
| double | score (const NodeId var1, const NodeId var2) |
| the scores | |
| double | score (const NodeId var1, const NodeId var2, const std::vector< NodeId > &rhs_ids) |
| the scores | |
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const |
| return the mapping between the columns of the database and the node ids | |
| const DatabaseTable & | database () const |
| return the database used by the score | |
Protected Member Functions | |
| virtual double | score_ (const IdCondSet &idset) final |
| returns the score for a given IdCondSet | |
Private Member Functions | |
| std::vector< double > | marginalize_ (const std::size_t node_2_marginalize, const std::size_t X_size, const std::size_t Y_size, const std::size_t Z_size, const std::vector< double > &N_xyz) const |
| returns a counting vector where variables are marginalized from N_xyz | |
Private Attributes | |
| const double | one_log2_ {M_LOG2E} |
| 1 / log(2) | |
| Prior * | prior_ {nullptr} |
| the expert-knowledge prior we add to the contingency tables | |
| RecordCounter | counter_ |
| the record counter used for the counts over discrete variables | |
| ScoringCache | cache_ |
| the scoring cache | |
| bool | use_cache_ {true} |
| a Boolean indicating whether we wish to use the cache | |
| const std::vector< NodeId > | empty_ids_ |
| an empty vector | |
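The one_log2_ attribute stores 1 / log(2), i.e. log2(e). A minimal sketch of what such a constant is typically used for, converting natural logarithms into base-2 logarithms, is given below. The names one_log2, natsToBits and log2FromLn are hypothetical and not part of aGrUM; how KNML actually uses one_log2_ is not stated on this page.

```cpp
#include <cassert>
#include <cmath>

// log2(e) = 1 / ln(2). This is the value of the M_LOG2E macro, written out
// here because M_LOG2E is not guaranteed by the C++ standard.
constexpr double one_log2 = 1.4426950408889634;

// Converts a quantity expressed with natural logarithms (nats) into bits.
inline double natsToBits(double value_in_nats) {
  return value_in_nats * one_log2;
}

// Base-2 logarithm computed from the natural logarithm (hypothetical helper).
inline double log2FromLn(double x) {
  assert(x > 0.0);
  return natsToBits(std::log(x));
}
```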
| gum::learning::KNML::KNML | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | prior, | ||
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| prior | A prior that we add to the computation of the score (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables |
| ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database's rows indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating from a database in which variable A corresponds to the 2nd column the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
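The semantics of the ranges parameter (union of half-open intervals, empty set meaning the whole database) can be sketched as follows. This is an illustration only, not aGrUM code: the helper names selectedRows and nbSelectedRows are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<std::size_t, std::size_t>;

// Returns, for each row index in [0, nb_rows), whether the row is counted.
// Each range is the half-open interval [first, second); overlapping ranges
// are merged by the union. An empty set of ranges selects every row.
std::vector<bool> selectedRows(const std::vector<Range>& ranges,
                               std::size_t nb_rows) {
  if (ranges.empty()) return std::vector<bool>(nb_rows, true);

  std::vector<bool> selected(nb_rows, false);
  for (const Range& range : ranges) {
    assert(range.first <= range.second && range.second <= nb_rows);
    for (std::size_t row = range.first; row < range.second; ++row)
      selected[row] = true;
  }
  return selected;
}

// Number of rows on which the counts would be performed.
std::size_t nbSelectedRows(const std::vector<Range>& ranges,
                           std::size_t nb_rows) {
  std::size_t nb = 0;
  for (bool is_selected : selectedRows(ranges, nb_rows))
    if (is_selected) ++nb;
  return nb;
}
```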
References ranges().
Referenced by KNML(), KNML(), clone(), operator=(), and operator=().
| gum::learning::KNML::KNML | ( | const DBRowGeneratorParser & | parser, |
| const Prior & | prior, | ||
| const Bijection< NodeId, std::size_t > & | nodeId2columns = Bijection< NodeId, std::size_t >() ) |
default constructor
| parser | the parser used to parse the database |
| prior | A prior that we add to the computation of the score (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables |
| nodeId2Columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating from a database in which variable A corresponds to the 2nd column the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
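The identity convention for an empty nodeId2Columns mapping can be sketched as follows. This is a simplified illustration: std::map stands in for gum::Bijection, NodeId is assumed to be an unsigned integer type, column indices are assumed to be zero-based, and columnOf is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

using NodeId = std::size_t;  // assumption: node ids are unsigned integers

// Returns the database column associated with a node id.
// An empty mapping is interpreted as the identity: column == node id.
std::size_t columnOf(const std::map<NodeId, std::size_t>& nodeId2columns,
                     NodeId id) {
  if (nodeId2columns.empty()) return id;
  const auto it = nodeId2columns.find(id);
  assert(it != nodeId2columns.end());  // every node used must be mapped
  return it->second;
}
```

With the example from the parameter description, a variable A stored in the 2nd column (index 1 under the zero-based assumption) and having NodeId 5 would be described by the single pair 5 -> 1.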
| gum::learning::KNML::KNML | ( | const KNML & | from | ) |
copy constructor
| gum::learning::KNML::KNML | ( | KNML && | from | ) |
move constructor
| virtual | ~KNML () |
destructor
| virtual void | clear () |
clears all the data structures from memory, including the C_n^r cache
Reimplemented from gum::learning::IndependenceTest.
| virtual void | clearCache () |
clears the current C_n^r cache
Reimplemented from gum::learning::IndependenceTest.
| void | clearRanges () |
reset the ranges to the one range corresponding to the whole database
| virtual KNML * | clone () const |
virtual copy constructor
Implements gum::learning::IndependenceTest.
References KNML().
| const DatabaseTable & gum::learning::IndependenceTest::database | ( | ) | const |
return the database used by the score
| virtual Size | getNumberOfThreads () const |
returns the number of threads used to parse the database
Reimplemented from gum::learning::IndependenceTest.
| virtual bool | isGumNumberOfThreadsOverriden () const |
indicates whether the user explicitly set the number of threads
Reimplemented from gum::learning::IndependenceTest.
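| std::vector< double > | marginalize_ (const std::size_t node_2_marginalize, const std::size_t X_size, const std::size_t Y_size, const std::size_t Z_size, const std::vector< double > &N_xyz) const | protected, inherited |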
returns a counting vector where variables are marginalized from N_xyz
| node_2_marginalize | indicates which node(s) shall be marginalized:
|
| X_size | the domain size of variable X |
| Y_size | the domain size of variable Y |
| Z_size | the domain size of the set of conditioning variables Z |
| N_xyz | a counting vector of dimension X * Y * Z (in this order) |
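Marginalizing one variable out of a flat counting vector can be sketched as follows. This is not the aGrUM implementation: the memory layout (x varying fastest, then y, then z) is an assumption, and the Marginalized enum is a hypothetical replacement for the node_2_marginalize code, whose possible values are not listed on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical selector for the single variable to sum out.
enum class Marginalized { X, Y, Z };

// Sums N_xyz over one variable. Assumed layout of N_xyz:
//   index(x, y, z) = x + X_size * (y + Y_size * z)
// The result keeps the two remaining variables in the same relative order.
std::vector<double> marginalize(Marginalized which,
                                std::size_t X_size,
                                std::size_t Y_size,
                                std::size_t Z_size,
                                const std::vector<double>& N_xyz) {
  assert(N_xyz.size() == X_size * Y_size * Z_size);

  std::size_t out_size = X_size * Y_size;                      // Z summed out
  if (which == Marginalized::X) out_size = Y_size * Z_size;
  if (which == Marginalized::Y) out_size = X_size * Z_size;
  std::vector<double> out(out_size, 0.0);

  for (std::size_t z = 0; z < Z_size; ++z)
    for (std::size_t y = 0; y < Y_size; ++y)
      for (std::size_t x = 0; x < X_size; ++x) {
        const double n = N_xyz[x + X_size * (y + Y_size * z)];
        switch (which) {
          case Marginalized::X: out[y + Y_size * z] += n; break;  // N_yz
          case Marginalized::Y: out[x + X_size * z] += n; break;  // N_xz
          case Marginalized::Z: out[x + X_size * y] += n; break;  // N_xy
        }
      }
  return out;
}
```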
| virtual std::size_t | minNbRowsPerThread () const |
returns the minimum number of rows that each thread should process
Reimplemented from gum::learning::IndependenceTest.
| const Bijection< NodeId, std::size_t > & gum::learning::IndependenceTest::nodeId2Columns | ( | ) | const |
return the mapping between the columns of the database and the node ids
| KNML & | operator= (KNML &&from) |
move operator
References KNML(), gum::learning::IndependenceTest::clearRanges(), gum::learning::IndependenceTest::getNumberOfThreads(), gum::learning::IndependenceTest::isGumNumberOfThreadsOverriden(), gum::learning::IndependenceTest::minNbRowsPerThread(), gum::learning::IndependenceTest::ranges(), gum::learning::IndependenceTest::score(), gum::learning::IndependenceTest::setMinNbRowsPerThread(), gum::learning::IndependenceTest::setNumberOfThreads(), and gum::learning::IndependenceTest::setRanges().
| const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::IndependenceTest::ranges | ( | ) | const |
returns the current ranges
| double gum::learning::IndependenceTest::score | ( | const NodeId | var1, |
| const NodeId | var2 ) |
the scores
| double gum::learning::IndependenceTest::score | ( | const NodeId | var1, |
| const NodeId | var2, | ||
| const std::vector< NodeId > & | rhs_ids ) |
the scores
| virtual double | score_ (const IdCondSet &idset) final |
returns the score for a given IdCondSet
| OperationNotAllowed | is raised if the score does not support calling the score method on such an idset (due to too many/too few variables in the left hand side or the right hand side of the idset). |
Implements gum::learning::IndependenceTest.
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const |
changes the minimum number of rows a thread should process in a multithreading context
When computing a score, record counters use several threads to perform counts on the rows of the database. The minimum number of rows per thread set by this method indicates how many rows each thread should process at least, and it is used to compute the number of threads actually run: this number is the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
Reimplemented from gum::learning::IndependenceTest.
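The rule described above for the number of threads actually run can be sketched as follows. The function name nbThreadsUsed is hypothetical; the use of integer division and the lower bound of one thread are assumptions, since the page does not specify the rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// min(max number of threads allowed, number of records / nb), where nb is
// the minimum number of rows each thread should process.
// Assumptions: the division is an integer division and at least one
// thread is always run.
std::size_t nbThreadsUsed(std::size_t max_nb_threads,
                          std::size_t nb_records,
                          std::size_t min_nb_rows_per_thread) {
  assert(min_nb_rows_per_thread > 0);
  const std::size_t by_rows = nb_records / min_nb_rows_per_thread;
  const std::size_t nb = std::min(max_nb_threads, by_rows);
  return std::max<std::size_t>(nb, 1);
}
```

For instance, with at most 8 threads and a minimum of 100 rows per thread, a database of 250 records would be processed by 2 threads under these assumptions.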
| virtual void | setNumberOfThreads (Size nb) |
changes the max number of threads used to parse the database
Reimplemented from gum::learning::IndependenceTest.
| void gum::learning::IndependenceTest::setRanges | ( | const std::vector< std::pair< std::size_t, std::size_t > > & | new_ranges | ) |
sets new ranges to perform the counts used by kNML
| new_ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database's rows indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
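For the cross validation use case mentioned above, the ranges passed to setRanges could be built as in the following sketch. The helper trainingRanges and the way folds are delimited are assumptions made for illustration; they are not part of aGrUM.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<std::size_t, std::size_t>;

// Builds the half-open row ranges that cover every row except those of the
// given test fold. Fold k is assumed to cover rows
//   [k * nb_rows / nb_folds, (k + 1) * nb_rows / nb_folds).
// At least 2 folds are required: with a single fold the result would be
// empty, and an empty set of ranges means "the whole database".
std::vector<Range> trainingRanges(std::size_t nb_rows,
                                  std::size_t nb_folds,
                                  std::size_t fold) {
  assert(nb_folds >= 2 && fold < nb_folds);
  const std::size_t test_begin = fold * nb_rows / nb_folds;
  const std::size_t test_end   = (fold + 1) * nb_rows / nb_folds;

  std::vector<Range> ranges;
  if (test_begin > 0) ranges.push_back(Range(0, test_begin));
  if (test_end < nb_rows) ranges.push_back(Range(test_end, nb_rows));
  return ranges;
}
```

The resulting vector has the type expected by setRanges, so it could be passed to it directly; clearRanges would then restore counting over the whole database.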
| virtual void | useCache (const bool on_off) |
turn on/off the use of the C_n^r cache
Reimplemented from gum::learning::IndependenceTest.
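The interplay between useCache, clearCache and a cache of C_n^r values can be sketched as follows. This page does not say how aGrUM computes or stores C_n^r; the sketch assumes that C_n^r denotes the NML normalizing constant of a multinomial with r categories and n observations, and computes it with the standard recurrence C_n^r = C_n^(r-1) + n / (r-2) * C_n^(r-2). The class CnrCache and its members are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical memoizing calculator for C_n^r (not the aGrUM class).
class CnrCache {
 public:
  void useCache(bool on_off) { use_cache_ = on_off; }
  void clearCache() { cache_.clear(); }
  std::size_t size() const { return cache_.size(); }

  // C_n^r for n observations and r >= 1 categories.
  double value(std::size_t n, std::size_t r) {
    assert(r >= 1);
    const std::pair<std::size_t, std::size_t> key(n, r);
    if (use_cache_) {
      const auto it = cache_.find(key);
      if (it != cache_.end()) return it->second;
    }
    double result;
    if (r == 1 || n == 0) {
      result = 1.0;
    } else if (r == 2) {
      result = binaryCase(n);
    } else {
      // Without the cache this recursion recomputes sub-results repeatedly.
      result = value(n, r - 1)
             + static_cast<double>(n) / static_cast<double>(r - 2)
               * value(n, r - 2);
    }
    if (use_cache_) cache_[key] = result;
    return result;
  }

 private:
  // C_n^2 = sum_h binom(n,h) (h/n)^h ((n-h)/n)^(n-h), for n >= 1.
  static double binaryCase(std::size_t n) {
    const double dn = static_cast<double>(n);
    double sum = 0.0;
    double binom = 1.0;  // binom(n, 0)
    for (std::size_t h = 0; h <= n; ++h) {
      const double dh = static_cast<double>(h);
      // std::pow(0.0, 0.0) is 1.0, which is the convention needed here.
      sum += binom * std::pow(dh / dn, dh) * std::pow((dn - dh) / dn, dn - dh);
      binom = binom * (dn - dh) / (dh + 1.0);  // binom(n, h + 1)
    }
    return sum;
  }

  std::map<std::pair<std::size_t, std::size_t>, double> cache_;
  bool use_cache_ = true;
};
```

Turning the cache off only bypasses lookups and insertions; entries stored earlier stay in memory until clearCache is called, which is one possible design and not necessarily the one used by aGrUM.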
References gum::learning::IndependenceTest::database(), and gum::learning::IndependenceTest::nodeId2Columns().
| ScoringCache | cache_ | protected, inherited |
the scoring cache
Definition at line 222 of file independenceTest.h.
| RecordCounter | counter_ | protected, inherited |
the record counter used for the counts over discrete variables
Definition at line 219 of file independenceTest.h.
| const std::vector< NodeId > | empty_ids_ | protected, inherited |
an empty vector
Definition at line 228 of file independenceTest.h.
| Prior * | prior_ {nullptr} | protected, inherited |
the expert-knowledge prior we add to the contingency tables
Definition at line 216 of file independenceTest.h.
| bool | use_cache_ {true} | protected, inherited |
a Boolean indicating whether we wish to use the cache
Definition at line 225 of file independenceTest.h.