aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
gum::learning::IndepTestG2 Class Reference

The class for computing G2 independence test scores.

#include <agrum/BN/learning/scores_and_tests/indepTestG2.h>

Public Member Functions

Constructors / Destructors
 IndepTestG2 (const DBRowGeneratorParser &parser, const Prior &external_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
 default constructor
 IndepTestG2 (const DBRowGeneratorParser &parser, const Prior &prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
 default constructor
 IndepTestG2 (const IndepTestG2 &from)
 copy constructor
 IndepTestG2 (IndepTestG2 &&from)
 move constructor
virtual IndepTestG2 * clone () const
 virtual copy constructor
virtual ~IndepTestG2 ()
 destructor
Operators
IndepTestG2 & operator= (const IndepTestG2 &from)
 copy operator
IndepTestG2 & operator= (IndepTestG2 &&from)
 move operator
std::pair< double, double > statistics (NodeId var1, NodeId var2, const std::vector< NodeId > &rhs_ids={})
 gets the pair <G2 statistic, p-value> for the test that var1 is independent of var2 given rhs_ids
Accessors / Modifiers
virtual void setNumberOfThreads (Size nb)
 sets the maximum number of threads that can be used
virtual Size getNumberOfThreads () const
 returns the current max number of threads of the scheduler
virtual bool isGumNumberOfThreadsOverriden () const
 indicates whether the user explicitly set the number of threads
virtual void setMinNbRowsPerThread (const std::size_t nb) const
 changes the minimum number of rows a thread should process in a multithreading context
virtual std::size_t minNbRowsPerThread () const
 returns the minimum number of rows that each thread should process
void setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges)
 sets new ranges to perform the counts used by the independence test
void clearRanges ()
 reset the ranges to the one range corresponding to the whole database
const std::vector< std::pair< std::size_t, std::size_t > > & ranges () const
 returns the current ranges
double score (const NodeId var1, const NodeId var2)
 returns the score of a pair of nodes
double score (const NodeId var1, const NodeId var2, const std::vector< NodeId > &rhs_ids)
 returns the score of a pair of nodes given some other nodes
virtual void clear ()
 clears all the data structures from memory, including the cache
virtual void clearCache ()
 clears the current cache
virtual void useCache (const bool on_off)
 turns on/off the use of a cache of the previously computed scores
const Bijection< NodeId, std::size_t > & nodeId2Columns () const
 return the mapping between the columns of the database and the node ids
const DatabaseTable & database () const
 return the database used by the score

Protected Member Functions

virtual double score_ (const IdCondSet &idset) final
 returns the score for a given IdCondSet
std::pair< double, double > statistics_ (const IdCondSet &idset)
 computes the pair <G2 statistic, p-value>
std::vector< double > marginalize_ (const std::size_t node_2_marginalize, const std::size_t X_size, const std::size_t Y_size, const std::size_t Z_size, const std::vector< double > &N_xyz) const
 returns a counting vector where variables are marginalized from N_xyz

Protected Attributes

const double one_log2_ {M_LOG2E}
 1 / log(2)
Prior * prior_ {nullptr}
 the expert-knowledge prior that we add to the contingency tables
RecordCounter counter_
 the record counter used for the counts over discrete variables
ScoringCache cache_
 the scoring cache
bool use_cache_ {true}
 a Boolean indicating whether we wish to use the cache
const std::vector< NodeId > empty_ids_
 an empty vector

Detailed Description

the class for computing G2 independence test scores

Definition at line 67 of file indepTestG2.h.

Constructor & Destructor Documentation

◆ IndepTestG2() [1/4]

gum::learning::IndepTestG2::IndepTestG2 ( const DBRowGeneratorParser & parser,
const Prior & external_prior,
const std::vector< std::pair< std::size_t, std::size_t > > & ranges,
const Bijection< NodeId, std::size_t > & nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters
parser: the parser used to parse the database
external_prior: a prior that we add to the computation of the score (this should come from expert knowledge); it consists of adding numbers to the counts in the contingency tables
ranges: a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
nodeId2columns: a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
Warning
If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.
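
The identity-when-empty rule and the NotFound behaviour described above can be pictured with a small standalone sketch. It does not use aGrUM: `std::map` stands in for `Bijection`, `std::out_of_range` stands in for the NotFound exception, and `columnOf` is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

// Hypothetical stand-ins for the aGrUM types (not the library's definitions).
using NodeId     = std::size_t;
using IdToColumn = std::map<NodeId, std::size_t>;

// Returns the database column associated with a node id.
// An empty mapping is treated as the identity; otherwise, ids that are
// absent from the mapping are rejected (std::out_of_range plays the role
// of the NotFound exception mentioned in the warning).
std::size_t columnOf(const IdToColumn& nodeId2columns, NodeId id) {
  if (nodeId2columns.empty()) return id;
  const auto it = nodeId2columns.find(id);
  if (it == nodeId2columns.end())
    throw std::out_of_range("node id not in nodeId2columns");
  return it->second;
}
```

With the mapping `{5 -> 1}`, node 5 reads the 2nd column (index 1), which is the situation described for variable A, while any other id is rejected.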

References gum::learning::IndependenceTest::ranges().

Referenced by IndepTestG2(), IndepTestG2(), clone(), operator=(), and operator=().

◆ IndepTestG2() [2/4]

gum::learning::IndepTestG2::IndepTestG2 ( const DBRowGeneratorParser & parser,
const Prior & prior,
const Bijection< NodeId, std::size_t > & nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters
parser: the parser used to parse the database
prior: a prior that we add to the computation of the score
nodeId2columns: a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
Warning
If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.

◆ IndepTestG2() [3/4]

gum::learning::IndepTestG2::IndepTestG2 ( const IndepTestG2 & from)

copy constructor

References IndepTestG2().

◆ IndepTestG2() [4/4]

gum::learning::IndepTestG2::IndepTestG2 ( IndepTestG2 && from)

move constructor

References IndepTestG2().

◆ ~IndepTestG2()

virtual gum::learning::IndepTestG2::~IndepTestG2 ( )
virtual

destructor

Member Function Documentation

◆ clear()

virtual void gum::learning::IndependenceTest::clear ( )
virtual, inherited

clears all the data structures from memory, including the cache

Reimplemented in gum::learning::KNML.

◆ clearCache()

virtual void gum::learning::IndependenceTest::clearCache ( )
virtual, inherited

clears the current cache

Reimplemented in gum::learning::KNML.

◆ clearRanges()

void gum::learning::IndependenceTest::clearRanges ( )
inherited

reset the ranges to the one range corresponding to the whole database

Referenced by gum::learning::KNML::operator=().

◆ clone()

virtual IndepTestG2 * gum::learning::IndepTestG2::clone ( ) const
virtual

virtual copy constructor

Implements gum::learning::IndependenceTest.

References IndepTestG2().

◆ database()

const DatabaseTable & gum::learning::IndependenceTest::database ( ) const
inherited

return the database used by the score

Referenced by gum::learning::KNML::useCache().

◆ getNumberOfThreads()

virtual Size gum::learning::IndependenceTest::getNumberOfThreads ( ) const
virtual, inherited

returns the current max number of threads of the scheduler

Implements gum::IThreadNumberManager.

Reimplemented in gum::learning::KNML.

Referenced by gum::learning::KNML::operator=().

◆ isGumNumberOfThreadsOverriden()

virtual bool gum::learning::IndependenceTest::isGumNumberOfThreadsOverriden ( ) const
virtual, inherited

indicates whether the user explicitly set the number of threads

Implements gum::IThreadNumberManager.

Reimplemented in gum::learning::KNML.

Referenced by gum::learning::KNML::operator=().

◆ marginalize_()

std::vector< double > gum::learning::IndependenceTest::marginalize_ ( const std::size_t node_2_marginalize,
const std::size_t X_size,
const std::size_t Y_size,
const std::size_t Z_size,
const std::vector< double > & N_xyz ) const
protected, inherited

returns a counting vector where variables are marginalized from N_xyz

Parameters
node_2_marginalize: indicates which node(s) shall be marginalized:
  • 0 means that X should be marginalized
  • 1 means that Y should be marginalized
  • 2 means that Z should be marginalized
X_size: the domain size of variable X
Y_size: the domain size of variable Y
Z_size: the domain size of the set of conditioning variables Z
N_xyz: a counting vector of dimension X * Y * Z (in this order)
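
As an illustration of the parameters above, here is a minimal standalone sketch of summing one variable out of a flat count vector. It is not aGrUM's implementation: `marginalizeSketch` is a hypothetical name, and the memory layout (x varying fastest) is an assumption, since the reference only states that the vector has dimension X * Y * Z "in this order".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums out one variable of a flat X*Y*Z count table.
// ASSUMPTION (not confirmed by the reference): x varies fastest, i.e. the
// cell (x, y, z) is stored at index x + X_size * (y + Y_size * z), and the
// result keeps the same ordering for the two remaining variables.
std::vector<double> marginalizeSketch(std::size_t node_2_marginalize,
                                      std::size_t X_size,
                                      std::size_t Y_size,
                                      std::size_t Z_size,
                                      const std::vector<double>& N_xyz) {
  assert(node_2_marginalize <= 2);
  assert(N_xyz.size() == X_size * Y_size * Z_size);

  std::size_t out_size = 0;
  switch (node_2_marginalize) {
    case 0: out_size = Y_size * Z_size; break;   // result is N_yz
    case 1: out_size = X_size * Z_size; break;   // result is N_xz
    default: out_size = X_size * Y_size; break;  // result is N_xy
  }
  std::vector<double> out(out_size, 0.0);

  // idx walks N_xyz in storage order: x fastest, then y, then z.
  std::size_t idx = 0;
  for (std::size_t z = 0; z < Z_size; ++z)
    for (std::size_t y = 0; y < Y_size; ++y)
      for (std::size_t x = 0; x < X_size; ++x, ++idx) {
        std::size_t target = 0;
        switch (node_2_marginalize) {
          case 0: target = y + Y_size * z; break;
          case 1: target = x + X_size * z; break;
          default: target = x + X_size * y; break;
        }
        out[target] += N_xyz[idx];
      }
  return out;
}
```

Under the assumed layout, marginalizing X from a 2 x 2 x 1 table yields a vector of size Y_size * Z_size = 2 holding the per-y totals.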

◆ minNbRowsPerThread()

virtual std::size_t gum::learning::IndependenceTest::minNbRowsPerThread ( ) const
virtual, inherited

returns the minimum number of rows that each thread should process

Reimplemented in gum::learning::KNML.

Referenced by gum::learning::KNML::operator=().

◆ nodeId2Columns()

const Bijection< NodeId, std::size_t > & gum::learning::IndependenceTest::nodeId2Columns ( ) const
inherited

return the mapping between the columns of the database and the node ids

Warning
An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

Referenced by gum::learning::KNML::useCache().

◆ operator=() [1/2]

IndepTestG2 & gum::learning::IndepTestG2::operator= ( const IndepTestG2 & from)

copy operator

References IndepTestG2().

◆ operator=() [2/2]

IndepTestG2 & gum::learning::IndepTestG2::operator= ( IndepTestG2 && from)

move operator

References IndepTestG2().

◆ ranges()

const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::IndependenceTest::ranges ( ) const
inherited

returns the current ranges

Referenced by IndependenceTest(), gum::learning::IndepTestChi2::IndepTestChi2(), gum::learning::IndepTestG2::IndepTestG2(), and gum::learning::KNML::operator=().

◆ score() [1/2]

double gum::learning::IndependenceTest::score ( const NodeId var1,
const NodeId var2 )
inherited

returns the score of a pair of nodes

Referenced by gum::learning::KNML::operator=().

◆ score() [2/2]

double gum::learning::IndependenceTest::score ( const NodeId var1,
const NodeId var2,
const std::vector< NodeId > & rhs_ids )
inherited

returns the score of a pair of nodes given some other nodes

Parameters
var1: the first variable on the left side of the conditioning bar
var2: the second variable on the left side of the conditioning bar
rhs_ids: the set of variables on the right side of the conditioning bar

◆ score_()

virtual double gum::learning::IndepTestG2::score_ ( const IdCondSet & idset)
final, protected, virtual

returns the score for a given IdCondSet

Exceptions
OperationNotAllowed: raised if the score does not support calling method score with such an idset (due to too many/too few variables in the left-hand side or the right-hand side of the idset).

Implements gum::learning::IndependenceTest.

◆ setMinNbRowsPerThread()

virtual void gum::learning::IndependenceTest::setMinNbRowsPerThread ( const std::size_t nb) const
virtual, inherited

changes the minimum number of rows a thread should process in a multithreading context

When computing scores, record counters use several threads to perform counts on the rows of the database. The setMinNbRowsPerThread method indicates the minimum number of rows each thread should process. This is used to compute the number of threads actually run, which is equal to the minimum of the max number of threads allowed and the number of records in the database divided by nb.
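
The rule above can be written down as a one-line computation. The sketch below is standalone and hypothetical (`threadsActuallyRun` is not an aGrUM function); it assumes integer division and a floor of one thread, neither of which is spelled out in the reference.

```cpp
#include <algorithm>
#include <cstddef>

// Number of threads actually run, following the rule stated above:
// min(max allowed threads, number of records / nb).
// ASSUMPTIONS: integer division, and at least one thread is always used;
// neither detail is spelled out in the reference.
std::size_t threadsActuallyRun(std::size_t max_threads,
                               std::size_t nb_records,
                               std::size_t min_rows_per_thread) {
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;  // avoid dividing by 0
  const std::size_t by_rows = nb_records / min_rows_per_thread;
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

For instance, with 8 threads allowed and nb = 100, a database of 250 records would be processed by 2 threads under these assumptions, while a database of 1000 records would use all 8.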

Reimplemented in gum::learning::KNML.

Referenced by gum::learning::KNML::operator=().

◆ setNumberOfThreads()

virtual void gum::learning::IndependenceTest::setNumberOfThreads ( Size nb)
virtual, inherited

sets the maximum number of threads that can be used

Parameters
nb: the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's max number of threads

Implements gum::IThreadNumberManager.

Reimplemented in gum::learning::KNML.

Referenced by gum::learning::KNML::operator=().

◆ setRanges()

void gum::learning::IndependenceTest::setRanges ( const std::vector< std::pair< std::size_t, std::size_t > > & new_ranges)
inherited

sets new ranges to perform the counts used by the independence test

Parameters
new_ranges: a set of pairs {(X1,Y1),...,(Xn,Yn)} of database row indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
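
To make the half-open semantics concrete, the following standalone sketch marks which rows a set of ranges selects. It is not aGrUM code: `Range` and `selectedRows` are hypothetical names, and clipping at the end of the database is a simplification chosen for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<std::size_t, std::size_t>;  // half-open [first, second)

// Marks which database rows would be counted for a given set of ranges.
// An empty set of ranges selects every row, as stated above.
// Illustrative only: rows past nb_rows are ignored here, whereas the
// library may well reject such ranges.
std::vector<bool> selectedRows(std::size_t nb_rows,
                               const std::vector<Range>& ranges) {
  if (ranges.empty()) return std::vector<bool>(nb_rows, true);
  std::vector<bool> selected(nb_rows, false);
  for (const auto& [begin, end] : ranges)
    for (std::size_t row = begin; row < end && row < nb_rows; ++row)
      selected[row] = true;  // union: overlapping ranges mark a row once
  return selected;
}
```

In a cross-validation setting with 10 rows, holding out rows 4 and 5 corresponds to the ranges {(0,4), (6,10)}: rows 0 to 3 and 6 to 9 are counted, rows 4 and 5 are ignored.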

Referenced by gum::learning::KNML::operator=().

◆ statistics()

std::pair< double, double > gum::learning::IndepTestG2::statistics ( NodeId var1,
NodeId var2,
const std::vector< NodeId > & rhs_ids = {} )

gets the pair <G2 statistic, p-value> for the test that var1 is independent of var2 given rhs_ids
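
For readers unfamiliar with the statistic itself, here is a minimal standalone sketch of the textbook G2 (likelihood-ratio) computation on a two-way contingency table, i.e. the case where rhs_ids is empty. It is the generic formula, not a transcription of aGrUM's code: `g2Statistic` is a hypothetical name, the table is assumed rectangular, and the library's handling of priors and normalization may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Textbook G2 (likelihood-ratio) statistic for a two-way contingency table:
//   G2 = 2 * sum_{x,y} N_xy * ln( N_xy * N / (N_x * N_y) )
// Cells with a zero count contribute nothing to the sum.
// This is the generic formula, not a transcription of aGrUM's code.
double g2Statistic(const std::vector<std::vector<double>>& n_xy) {
  const std::size_t X = n_xy.size();
  const std::size_t Y = X == 0 ? 0 : n_xy[0].size();

  // Row totals N_x, column totals N_y and grand total N.
  std::vector<double> n_x(X, 0.0), n_y(Y, 0.0);
  double n = 0.0;
  for (std::size_t x = 0; x < X; ++x)
    for (std::size_t y = 0; y < Y; ++y) {
      n_x[x] += n_xy[x][y];
      n_y[y] += n_xy[x][y];
      n += n_xy[x][y];
    }

  double g2 = 0.0;
  for (std::size_t x = 0; x < X; ++x)
    for (std::size_t y = 0; y < Y; ++y)
      if (n_xy[x][y] > 0.0)
        g2 += n_xy[x][y] * std::log(n_xy[x][y] * n / (n_x[x] * n_y[y]));
  return 2.0 * g2;
}
```

In the usual asymptotic treatment, the p-value is then read from a chi-square distribution with (|X| - 1)(|Y| - 1) degrees of freedom, multiplied by the domain size of the conditioning set when there is one; that step is omitted here because it needs an incomplete gamma function.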

Referenced by gum::learning::IBNLearner::G2().

◆ statistics_()

std::pair< double, double > gum::learning::IndepTestG2::statistics_ ( const IdCondSet & idset)
protected

compute the pair <G2 statistic,pvalue>

◆ useCache()

virtual void gum::learning::IndependenceTest::useCache ( const bool on_off)
virtual, inherited

turns on/off the use of a cache of the previously computed scores

Reimplemented in gum::learning::KNML.

Member Data Documentation

◆ cache_

ScoringCache gum::learning::IndependenceTest::cache_
protected, inherited

the scoring cache

Definition at line 222 of file independenceTest.h.

◆ counter_

RecordCounter gum::learning::IndependenceTest::counter_
protected, inherited

the record counter used for the counts over discrete variables

Definition at line 219 of file independenceTest.h.

◆ empty_ids_

const std::vector< NodeId > gum::learning::IndependenceTest::empty_ids_
protected, inherited

an empty vector

Definition at line 228 of file independenceTest.h.

◆ one_log2_

const double gum::learning::IndependenceTest::one_log2_ {M_LOG2E}
protected, inherited

1 / log(2)

Definition at line 213 of file independenceTest.h.

M_LOG2E is defined at line 55 of file math_utils.h.
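
The constant 1 / log(2) is the factor that turns a natural logarithm into a base-2 logarithm. The sketch below only illustrates that identity; how the library uses one_log2_ internally is not stated here, and `kOneLog2` and `log2FromLn` are hypothetical names.

```cpp
#include <cmath>

// log2(e) == 1 / ln(2); same value as the POSIX M_LOG2E macro, spelled out
// here because M_LOG2E is not guaranteed by standard C++.
constexpr double kOneLog2 = 1.4426950408889634;

// Converts a natural logarithm into a base-2 logarithm with one multiply.
double log2FromLn(double x) { return std::log(x) * kOneLog2; }
```

Multiplying by a precomputed constant avoids a division by log(2) each time a base-2 logarithm is needed.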

◆ prior_

Prior* gum::learning::IndependenceTest::prior_ {nullptr}
protected, inherited

the expert-knowledge prior that we add to the contingency tables

Definition at line 216 of file independenceTest.h.

◆ use_cache_

bool gum::learning::IndependenceTest::use_cache_ {true}
protected, inherited

a Boolean indicating whether we wish to use the cache

Definition at line 225 of file independenceTest.h.

The documentation for this class was generated from the following file: