Class giving access to pseudo-counts, i.e., the counts observed in the database plus the prior.
#include <pseudoCount.h>
|
| | PseudoCount (const DBRowGeneratorParser &parser, const Prior &external_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) |
| | default constructor
|
| | PseudoCount (const DBRowGeneratorParser &parser, const Prior &external_prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >()) |
| | default constructor
|
| virtual | ~PseudoCount () |
| | destructor
|
| | PseudoCount (const PseudoCount &from) |
| | copy constructor
|
| | PseudoCount (PseudoCount &&from) |
| | move constructor
|
| PseudoCount & | operator= (const PseudoCount &from) |
| | copy operator
|
| PseudoCount & | operator= (PseudoCount &&from) |
| | move operator
|
| virtual void | setNumberOfThreads (Size nb) |
| | sets the maximum number of threads that can be used
|
| virtual Size | getNumberOfThreads () const |
| | returns the current max number of threads of the scheduler
|
| virtual bool | isGumNumberOfThreadsOverriden () const |
| | indicates whether the user explicitly set the number of threads
|
| virtual void | setMinNbRowsPerThread (const std::size_t nb) const |
| | changes the minimum number of rows a thread should process in a multithreading context
|
| virtual std::size_t | minNbRowsPerThread () const |
| | returns the minimum number of rows that each thread should process
|
| void | setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges) |
| | sets new ranges over which the counts used by the pseudo-count are performed
|
| void | clearRanges () |
| | resets the ranges to a single range covering the whole database
|
| const std::vector< std::pair< std::size_t, std::size_t > > & | ranges () const |
| | returns the current ranges
|
| std::vector< double > | get (const std::vector< NodeId > &ids) |
| | returns the pseudo-counts of the set of nodes given in ids
|
| virtual void | clear () |
| | clears all the data structures from memory, including the cache
|
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const |
| | returns the mapping between the columns of the database and the node ids
|
| const DatabaseTable & | database () const |
| | returns the database used by the pseudo-count
|
Definition at line 67 of file pseudoCount.h.
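As a rough illustration of "count in the database + prior", the following self-contained sketch adds a prior weight to each cell of a contingency table. It does not use aGrUM: the function `pseudoCounts` and the flat-vector layout of the table are hypothetical names and choices made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (NOT part of aGrUM): adds, cell by cell, the prior
// weights to the raw counts observed in the database.
std::vector<double> pseudoCounts(const std::vector<double>& db_counts,
                                 const std::vector<double>& prior) {
  // both tables must describe the same cells, in the same order
  assert(db_counts.size() == prior.size());

  std::vector<double> result(db_counts.size());
  for (std::size_t i = 0; i < db_counts.size(); ++i) {
    result[i] = db_counts[i] + prior[i];
  }
  return result;
}
```

With a uniform prior of 1 per cell, a cell never observed in the database still receives a pseudo-count of 1, which is the usual motivation for adding a prior to raw counts.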
◆ PseudoCount() [1/4]
default constructor
- Parameters
-
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the pseudo-count (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables |
| ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
| nodeId2columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible to estimate, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
- Warning
- If nodeId2columns is not empty, then only the pseudo-counts over the ids belonging to this bijection can be computed: calling get() with other ids raises a NotFound exception.
References ranges().
Referenced by PseudoCount(), PseudoCount(), operator=(), and operator=().
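The semantics of the ranges parameter (union of half-open intervals, with an empty set meaning the whole database) can be sketched as follows. This is a standalone model, not aGrUM code: `selectedRows`, the `Ranges` alias and the bounds check are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper (NOT part of aGrUM): marks the rows of a database of
// nb_rows rows that are covered by the union of the intervals [Xi, Yi).
std::vector<bool> selectedRows(const Ranges& ranges, std::size_t nb_rows) {
  // an empty set of ranges stands for the whole database
  if (ranges.empty()) return std::vector<bool>(nb_rows, true);

  std::vector<bool> selected(nb_rows, false);
  for (const auto& r : ranges) {
    // assumption of this sketch: ranges are well formed and within bounds
    assert(r.first <= r.second && r.second <= nb_rows);
    // Yi is excluded: the interval is half-open
    for (std::size_t row = r.first; row < r.second; ++row) {
      selected[row] = true;
    }
  }
  return selected;
}
```

For instance, with 6 rows and ranges {(0,2),(4,6)}, rows 0, 1, 4 and 5 are counted while rows 2 and 3 are ignored.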
◆ PseudoCount() [2/4]
default constructor
- Parameters
-
| parser | the parser used to parse the database |
| external_prior | A prior that we add to the computation of the pseudo-count (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables |
| nodeId2columns | a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible to estimate, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable. |
- Warning
- If nodeId2columns is not empty, then only the pseudo-counts over the ids belonging to this bijection can be computed: calling get() with other ids raises a NotFound exception.
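The lookup rule described for nodeId2columns (identity when empty, error for unknown ids otherwise) can be modelled with the standard library. In this sketch `std::map` stands in for `Bijection`, `std::out_of_range` stands in for aGrUM's NotFound exception, and `columnOf` is a hypothetical function: none of these reflect the actual implementation.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

using NodeId = std::size_t;  // stand-in for gum::NodeId in this sketch

// Hypothetical helper (NOT part of aGrUM): returns the database column
// associated with a node id.
std::size_t columnOf(const std::map<NodeId, std::size_t>& nodeId2columns,
                     NodeId id) {
  // empty mapping: identity, the NodeId is the column index
  if (nodeId2columns.empty()) return id;

  const auto it = nodeId2columns.find(id);
  if (it == nodeId2columns.end()) {
    // non-empty mapping: ids outside the mapping cannot be used
    throw std::out_of_range("node id does not belong to the mapping");
  }
  return it->second;
}
```

Using the example from the parameter description, a mapping {5 -> 1} sends NodeId 5 to the 2nd column (index 1, assuming 0-based column indices), and any other id is rejected.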
◆ ~PseudoCount()
| virtual gum::learning::PseudoCount::~PseudoCount () | virtual |
◆ PseudoCount() [3/4]
| gum::learning::PseudoCount::PseudoCount (const PseudoCount & from) |
◆ PseudoCount() [4/4]
| gum::learning::PseudoCount::PseudoCount (PseudoCount && from) |
◆ clear()
| virtual void gum::learning::PseudoCount::clear () | virtual |
clears all the data structures from memory, including the cache
◆ clearRanges()
| void gum::learning::PseudoCount::clearRanges () |
resets the ranges to a single range covering the whole database
◆ database()
| const DatabaseTable & gum::learning::PseudoCount::database () const |
returns the database used by the pseudo-count
◆ get()
| std::vector< double > gum::learning::PseudoCount::get (const std::vector< NodeId > & ids) |
returns the pseudo-counts of the set of nodes given in ids
- Parameters
-
| ids | the ids of the nodes over which the pseudo-counts are computed |
Referenced by gum::learning::IBNLearner::rawPseudoCount().
◆ getNumberOfThreads()
| virtual Size gum::learning::PseudoCount::getNumberOfThreads () const | virtual |
◆ isGumNumberOfThreadsOverriden()
| virtual bool gum::learning::PseudoCount::isGumNumberOfThreadsOverriden () const | virtual |
◆ minNbRowsPerThread()
| virtual std::size_t gum::learning::PseudoCount::minNbRowsPerThread () const | virtual |
returns the minimum number of rows that each thread should process
◆ nodeId2Columns()
| const Bijection< NodeId, std::size_t > & gum::learning::PseudoCount::nodeId2Columns () const |
returns the mapping between the columns of the database and the node ids
- Warning
- An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
◆ operator=() [1/2]
◆ operator=() [2/2]
◆ ranges()
| const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::PseudoCount::ranges () const |
◆ setMinNbRowsPerThread()
| virtual void gum::learning::PseudoCount::setMinNbRowsPerThread (const std::size_t nb) const | virtual |
changes the minimum number of rows a thread should process in a multithreading context
When computing pseudo-counts, record counters use several threads to perform counts on the rows of the database. This method sets the minimum number of rows, nb, that each thread should process. This value is used to compute the number of threads actually run, which is equal to the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
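The rule above can be written as a short computation. The sketch below is a standalone illustration, not the scheduler's actual code: `nbThreadsToRun` is a hypothetical function, the division is taken to be an integer division, and the clamp to at least one thread is an assumption of this example that the description does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (NOT part of aGrUM): number of threads actually run,
// i.e., min(max number of threads allowed, number of records / nb).
std::size_t nbThreadsToRun(std::size_t max_threads,
                           std::size_t nb_records,
                           std::size_t min_rows_per_thread) {
  assert(min_rows_per_thread > 0);  // avoids a division by zero

  // integer division: how many threads can each get at least nb rows
  const std::size_t by_rows = nb_records / min_rows_per_thread;

  // assumption of this sketch: always run at least one thread
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

With at most 8 threads and nb = 100, a database of 1000 records would use 8 threads, whereas a database of 250 records would use only 2.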
◆ setNumberOfThreads()
| virtual void gum::learning::PseudoCount::setNumberOfThreads (Size nb) | virtual |
sets the maximum number of threads that can be used
- Parameters
-
| nb | the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's max number of threads |
Implements gum::IThreadNumberManager.
◆ setRanges()
| void gum::learning::PseudoCount::setRanges (const std::vector< std::pair< std::size_t, std::size_t > > & new_ranges) |
sets new ranges over which the counts used by the pseudo-count are performed
- Parameters
-
| new_ranges | a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database. |
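For the cross-validation use case mentioned above, one possible way to build the ranges is to split the rows into k contiguous folds and keep every row except those of the held-out fold. This is only a sketch under that assumption: `trainingRanges` is a hypothetical helper, not an aGrUM function, and contiguous folds are just one of several ways to partition a database.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper (NOT part of aGrUM): rows [0, nb_rows) are split into
// k contiguous folds; returns the half-open ranges covering all the rows
// except those of fold number `fold`.
Ranges trainingRanges(std::size_t nb_rows, std::size_t k, std::size_t fold) {
  // k >= 2 is required: with k == 1 the result would be empty, and an empty
  // set of ranges is documented as meaning the whole database
  assert(k >= 2 && fold < k);

  const std::size_t begin = fold * nb_rows / k;        // first held-out row
  const std::size_t end   = (fold + 1) * nb_rows / k;  // one past the last

  Ranges ranges;
  if (begin > 0) ranges.emplace_back(std::size_t{0}, begin);
  if (end < nb_rows) ranges.emplace_back(end, nb_rows);
  return ranges;
}
```

The returned vector has the type expected by setRanges(). With 10 rows and 5 folds, holding out fold 1 yields {(0,2),(4,10)}, so rows 2 and 3 are ignored by the counts.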
◆ counter_
the record counter used for the counts over discrete variables
Definition at line 214 of file pseudoCount.h.
◆ empty_ids_
| const std::vector< NodeId > gum::learning::PseudoCount::empty_ids_ | protected |
◆ prior_
| Prior* gum::learning::PseudoCount::prior_ {nullptr} | protected |
the expert-knowledge prior that we add to the contingency tables
Definition at line 211 of file pseudoCount.h.
The documentation for this class was generated from the following file: