the class for computing K2 scores (actually their log2 value)

#include <agrum/BN/learning/scores_and_tests/scoreK2.h>

Public Member Functions
Constructors / Destructors
	ScoreK2 (const DBRowGeneratorParser &parser, const Prior &prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
	default constructor
	ScoreK2 (const DBRowGeneratorParser &parser, const Prior &prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
	default constructor
	ScoreK2 (const ScoreK2 &from)
	copy constructor
	ScoreK2 (ScoreK2 &&from)
	move constructor
virtual ScoreK2 *	clone () const
	virtual copy constructor
virtual	~ScoreK2 ()
	destructor
Operators
ScoreK2 &	operator= (const ScoreK2 &from)
	copy operator
ScoreK2 &	operator= (ScoreK2 &&from)
	move operator
Accessors / Modifiers
virtual std::string	isPriorCompatible () const final
	indicates whether the prior is compatible (meaningful) with the score
virtual const Prior &	internalPrior () const final
	returns the internal prior of the score
Accessors / Modifiers
virtual void	setNumberOfThreads (Size nb)
	sets the maximum number of threads that can be used
virtual Size	getNumberOfThreads () const
	returns the current max number of threads of the scheduler
virtual bool	isGumNumberOfThreadsOverriden () const
	indicates whether the user has explicitly set the number of threads
virtual void	setMinNbRowsPerThread (const std::size_t nb) const
	changes the minimum number of rows a thread should process in a multithreading context
virtual std::size_t	minNbRowsPerThread () const
	returns the minimum number of rows that each thread should process
void	setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges)
	sets new ranges to perform the counts used by the score
void	clearRanges ()
	resets the ranges to the single range covering the whole database
const std::vector< std::pair< std::size_t, std::size_t > > &	ranges () const
	returns the current ranges
double	score (const NodeId var)
	returns the score of a single node
double	score (const NodeId var, const std::vector< NodeId > &rhs_ids)
	returns the score of a single node given some other nodes
void	clear ()
	clears all the data structures from memory, including the cache
void	clearCache ()
	clears the current cache
void	useCache (const bool on_off)
	turns on/off the use of a cache of the previously computed scores
bool	isUsingCache () const
	indicates whether the score uses a cache
const Bijection< NodeId, std::size_t > &	nodeId2Columns () const
	return the mapping between the columns of the database and the node ids
const DatabaseTable &	database () const
	return the database used by the score

Static Public Member Functions
static std::string	isPriorCompatible (PriorType prior_type, double weight=1.0f)
	indicates whether the prior is compatible (meaningful) with the score
static std::string	isPriorCompatible (const Prior &prior)
	indicates whether the prior is compatible (meaningful) with the score

Protected Member Functions
virtual double	score_ (const IdCondSet &idset) final
	returns the score for a given IdCondSet
std::vector< double >	marginalize_ (const NodeId X_id, const std::vector< double > &N_xyz) const
	returns a counting vector where variables are marginalized from N_xyz

Protected Attributes
const double	one_log2_ {M_LOG2E}
	1 / log(2)
Prior *	prior_ {nullptr}
	the prior (expert knowledge) that we add to the score
RecordCounter	counter_
	the record counter used for the counts over discrete variables
ScoringCache	cache_
	the scoring cache
bool	use_cache_ {true}
	a Boolean indicating whether we wish to use the cache
const std::vector< NodeId >	empty_ids_
	an empty vector

Detailed Description

the class for computing K2 scores (actually their log2 value)

Warning: As the K2 score already includes an implicit Laplace prior on all the cells of contingency tables, the prior passed to the score should be a NoPrior. aGrUM will nevertheless let you use another (certainly incompatible) prior with the score. In this case, the additional prior is combined with the implicit Laplace prior in a BD fashion, i.e., the Bayesian Dirichlet (BD) formula is used to include the sum of the two priors into the score.
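
To make the implicit Laplace prior concrete, the following sketch computes the log2 of the K2 score of one node directly from a contingency table, using the classical closed form: the product over parent configurations j of (r - 1)! / (N_j + r - 1)! times the product over values k of N_jk!. This is not aGrUM's implementation: the function name `k2Log2Score` and the layout of `counts` (one inner vector per parent configuration, one cell per value of the node) are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of aGrUM): log2 of the K2 score of one node.
// counts[j][k] = number of records in which the parents take their j-th
// configuration and the node takes its k-th value.
double k2Log2Score(const std::vector<std::vector<double>>& counts) {
  double ln_score = 0.0;
  for (const auto& row : counts) {
    const double r = static_cast<double>(row.size());   // domain size of the node
    double n_j = 0.0;                                    // N_j = sum over k of N_jk
    for (const double n_jk : row) {
      ln_score += std::lgamma(n_jk + 1.0);               // ln(N_jk!)
      n_j += n_jk;
    }
    // ln((r - 1)!) - ln((N_j + r - 1)!)
    ln_score += std::lgamma(r) - std::lgamma(n_j + r);
  }
  // convert natural logarithms to base 2 (same role as one_log2_ = 1 / ln(2))
  return ln_score / std::log(2.0);
}
```

With a single parent configuration and counts (1, 1), the score is log2(1! * 1! * 1! / 3!) = log2(1/6). The "+ 1" and "+ r" terms are exactly where the one pseudo-count per cell of the Laplace prior enters the formula.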

Definition at line 80 of file scoreK2.h.

Constructor & Destructor Documentation

◆ ScoreK2() [1/4]

gum::learning::ScoreK2::ScoreK2	(	const DBRowGeneratorParser &	parser,
		const Prior &	prior,
		const std::vector< std::pair< std::size_t, std::size_t > > &	ranges,
		const Bijection< NodeId, std::size_t > &	nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters

parser	the parser used to parse the database
prior	a prior that we add to the computation of the score
ranges	a set of pairs {(X1,Y1),...,(Xn,Yn)} of database's rows indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
nodeId2columns	a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible, for example, to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

Warning: If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.
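
As an illustration of how the ranges parameter can serve cross validation, the sketch below builds the half-open row ranges [Xi,Yi) that cover every row except those of one test fold. The helper `trainingRanges` and the choice of contiguous folds are hypothetical and not part of aGrUM; only the vector-of-pairs type matches the signature above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper (not part of aGrUM): rows [0, nb_rows) are split into
// nb_folds contiguous folds; the result covers all the rows except those of
// fold number test_fold, as a union of half-open ranges [Xi, Yi).
Ranges trainingRanges(std::size_t nb_rows, std::size_t nb_folds, std::size_t test_fold) {
  const std::size_t begin = test_fold * nb_rows / nb_folds;        // first test row
  const std::size_t end   = (test_fold + 1) * nb_rows / nb_folds;  // one past last test row

  Ranges ranges;
  if (begin > 0) ranges.emplace_back(std::size_t(0), begin);  // rows before the test fold
  if (end < nb_rows) ranges.emplace_back(end, nb_rows);       // rows after the test fold
  return ranges;
}
```

For 100 rows and 5 folds, excluding fold 2 yields the ranges [0,40) and [60,100). The sketch assumes nb_folds is at least 2: with a single fold the result would be empty, and an empty set of ranges is interpreted as the whole database rather than as no row at all.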

References gum::learning::Score::ranges().

Referenced by ScoreK2(), ScoreK2(), clone(), operator=(), and operator=().

◆ ScoreK2() [2/4]

gum::learning::ScoreK2::ScoreK2	(	const DBRowGeneratorParser &	parser,
		const Prior &	prior,
		const Bijection< NodeId, std::size_t > &	nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters

parser	the parser used to parse the database
prior	a prior that we add to the computation of the score
nodeId2columns	a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This makes it possible, for example, to estimate the parameters of a BN in which variable A has a NodeId of 5 from a database in which variable A corresponds to the 2nd column. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

Warning: If nodeId2columns is not empty, then only the scores over the ids belonging to this bijection can be computed: applying method score() over other ids will raise exception NotFound.
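
The lookup rule described for nodeId2columns can be sketched as follows. This is a stand-in written with the standard library, not aGrUM code: `columnOf` is a hypothetical function, `std::unordered_map` replaces `gum::Bijection`, and `std::out_of_range` replaces aGrUM's NotFound exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <unordered_map>

using NodeId = std::size_t;   // stand-in for gum::NodeId in this sketch

// Hypothetical stand-in for the rule described above: an empty mapping is the
// identity; otherwise only the node ids present in the mapping are valid.
std::size_t columnOf(const std::unordered_map<NodeId, std::size_t>& nodeId2columns,
                     NodeId id) {
  if (nodeId2columns.empty()) return id;   // identity mapping
  const auto it = nodeId2columns.find(id);
  if (it == nodeId2columns.end()) {
    // aGrUM raises NotFound here; std::out_of_range is only a substitute
    throw std::out_of_range("node id does not belong to the mapping");
  }
  return it->second;
}
```

With the mapping {5 -> 1}, node 5 is read from the 2nd column (index 1, assuming 0-based column indices), whereas asking for node 3 fails because it does not belong to the mapping.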

◆ ScoreK2() [3/4]

gum::learning::ScoreK2::ScoreK2 ( const ScoreK2 & from )

copy constructor

References ScoreK2().

◆ ScoreK2() [4/4]

gum::learning::ScoreK2::ScoreK2 ( ScoreK2 && from )

move constructor

References ScoreK2().

◆ ~ScoreK2()

virtual gum::learning::ScoreK2::~ScoreK2 ( )

virtual

destructor

Member Function Documentation

◆ clear()

void gum::learning::Score::clear ( )

inherited

clears all the data structures from memory, including the cache

◆ clearCache()

void gum::learning::Score::clearCache ( )

inherited

clears the current cache

◆ clearRanges()

void gum::learning::Score::clearRanges ( )

inherited

resets the ranges to the single range covering the whole database

◆ clone()

virtual ScoreK2 * gum::learning::ScoreK2::clone ( ) const

virtual

virtual copy constructor

Implements gum::learning::Score.

References ScoreK2().

◆ database()

const DatabaseTable & gum::learning::Score::database ( ) const

inherited

return the database used by the score

◆ getNumberOfThreads()

virtual Size gum::learning::Score::getNumberOfThreads ( ) const

virtual inherited

returns the current max number of threads of the scheduler

Implements gum::IThreadNumberManager.

◆ internalPrior()

virtual const Prior & gum::learning::ScoreK2::internalPrior ( ) const

final virtual

returns the internal prior of the score

Some scores include a prior. For instance, the K2 score is a BD score with a Laplace prior (smoothing(1)), and BDeu is a BD score with a N'/(r_i * q_i) prior, where N' is an effective sample size, r_i is the domain size of the target variable and q_i is the domain size of the Cartesian product of its parents. The goal of the score's internal prior classes is to make it possible to account for these priors outside the score, e.g., when performing parameter estimation. It is important to note that, to be meaningful, combined structure and parameter learning requires that the same priors be taken into account during structure learning and parameter learning.

Implements gum::learning::Score.

References internalPrior().

Referenced by internalPrior().

◆ isGumNumberOfThreadsOverriden()

virtual bool gum::learning::Score::isGumNumberOfThreadsOverriden ( ) const

virtual inherited

indicates whether the user has explicitly set the number of threads

Implements gum::IThreadNumberManager.

◆ isPriorCompatible() [1/3]

virtual std::string gum::learning::ScoreK2::isPriorCompatible ( ) const

final virtual

indicates whether the prior is compatible (meaningful) with the score

The combination of some scores and priors can be meaningless. For instance, adding a Dirichlet prior to the K2 score is not very meaningful since K2 corresponds to a BD score with a 1-smoothing prior. aGrUM allows you to perform such combinations, but you can check with method isPriorCompatible() whether the result the score will give you is meaningful or not.

Returns: a non empty string if the prior is compatible with the score.

Implements gum::learning::Score.

Referenced by gum::learning::IBNLearner::checkScorePriorCompatibility(), isPriorCompatible(), and isPriorCompatible().

◆ isPriorCompatible() [2/3]

std::string gum::learning::ScoreK2::isPriorCompatible ( const Prior & prior )

static

indicates whether the prior is compatible (meaningful) with the score

Returns: a non empty string if the prior is compatible with the score.

References isPriorCompatible().

◆ isPriorCompatible() [3/3]

std::string gum::learning::ScoreK2::isPriorCompatible	(	PriorType	prior_type,
		double	weight = 1.0f )

static

indicates whether the prior is compatible (meaningful) with the score

Returns: a non empty string if the prior is compatible with the score.

References isPriorCompatible().

◆ isUsingCache()

bool gum::learning::Score::isUsingCache ( ) const

inherited

indicates whether the score uses a cache

◆ marginalize_()

std::vector< double > gum::learning::Score::marginalize_	(	const NodeId	X_id,
		const std::vector< double > &	N_xyz ) const

protected inherited

returns a counting vector where variables are marginalized from N_xyz

Parameters

X_id	the id of the variable to marginalize (this is the first variable in table N_xyz)
N_xyz	a counting vector of dimension X * cond_vars (in this order)
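
The marginalization can be pictured with the following sketch, which sums a counting vector over the values of its first variable. It is an illustrative re-implementation, not aGrUM's code: unlike marginalize_(), the hypothetical function `marginalizeX` takes the domain size of X directly instead of its node id, and it assumes that X is the fastest-varying dimension of N_xyz.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical re-implementation for illustration (not aGrUM's code).
// Assumes x_size > 0 and that N_xyz is laid out with X as the fastest-varying
// dimension: N_xyz[x + x_size * yz] holds the count for value x of X and
// configuration yz of the conditioning variables.
std::vector<double> marginalizeX(std::size_t x_size, const std::vector<double>& N_xyz) {
  const std::size_t yz_size = N_xyz.size() / x_size;
  std::vector<double> N_yz(yz_size, 0.0);
  for (std::size_t yz = 0; yz < yz_size; ++yz) {
    for (std::size_t x = 0; x < x_size; ++x) {
      N_yz[yz] += N_xyz[x + x_size * yz];   // sum over the values of X
    }
  }
  return N_yz;
}
```

For a binary X and three conditioning configurations, the vector (1, 2, 3, 4, 5, 6) is reduced to (3, 7, 11).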

◆ minNbRowsPerThread()

virtual std::size_t gum::learning::Score::minNbRowsPerThread ( ) const

virtual inherited

returns the minimum number of rows that each thread should process

◆ nodeId2Columns()

const Bijection< NodeId, std::size_t > & gum::learning::Score::nodeId2Columns ( ) const

inherited

return the mapping between the columns of the database and the node ids

Warning: An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

◆ operator=() [1/2]

ScoreK2 & gum::learning::ScoreK2::operator= ( const ScoreK2 & from )

copy operator

References ScoreK2().

◆ operator=() [2/2]

ScoreK2 & gum::learning::ScoreK2::operator= ( ScoreK2 && from )

move operator

References ScoreK2().

◆ ranges()

const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::Score::ranges ( ) const

inherited

returns the current ranges

Referenced by Score(), gum::learning::ScoreAIC::ScoreAIC(), gum::learning::ScoreBD::ScoreBD(), gum::learning::ScoreBDeu::ScoreBDeu(), gum::learning::ScoreBIC::ScoreBIC(), gum::learning::ScorefNML::ScorefNML(), gum::learning::ScoreK2::ScoreK2(), and gum::learning::ScoreLog2Likelihood::ScoreLog2Likelihood().

◆ score() [1/2]

double gum::learning::Score::score ( const NodeId var )

inherited

returns the score of a single node

◆ score() [2/2]

double gum::learning::Score::score	(	const NodeId	var,
		const std::vector< NodeId > &	rhs_ids )

inherited

returns the score of a single node given some other nodes

Parameters

var	the variable on the left side of the conditioning bar
rhs_ids	the set of variables on the right side of the conditioning bar

◆ score_()

virtual double gum::learning::ScoreK2::score_ ( const IdCondSet & idset )

final protected virtual

returns the score for a given IdCondSet

Exceptions

OperationNotAllowed is raised if the score does not support calling method score on such an idset (due to too many/too few variables in the left hand side or the right hand side of the idset).

Implements gum::learning::Score.

References score_().

Referenced by score_().

◆ setMinNbRowsPerThread()

virtual void gum::learning::Score::setMinNbRowsPerThread ( const std::size_t nb ) const

virtual inherited

changes the minimum number of rows a thread should process in a multithreading context

When computing a score, record counters use several threads to perform counts on the rows of the database. Method setMinNbRowsPerThread indicates the minimum number of rows that each thread should process. This is used to compute the number of threads actually run, which is equal to the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
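
The rule stated above can be written as a one-line computation. The sketch below is only an illustration of that arithmetic: `nbThreadsUsed` is a hypothetical function, and both the guard against a zero minimum and the clamping of the result to at least one thread are assumptions that the description above does not state.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of aGrUM): number of threads actually run,
// i.e., min(max number of threads allowed, number of records / nb).
std::size_t nbThreadsUsed(std::size_t max_threads,
                          std::size_t nb_records,
                          std::size_t min_rows_per_thread) {
  // assumption: treat a minimum of 0 rows as 1 to avoid a division by zero
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;
  const std::size_t by_rows = nb_records / min_rows_per_thread;
  // assumption: at least one thread is always used
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

With at most 8 threads and a minimum of 100 rows per thread, a database of 1000 records uses 8 threads, whereas a database of 250 records uses only 2.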

◆ setNumberOfThreads()

virtual void gum::learning::Score::setNumberOfThreads ( Size nb )

virtual inherited

sets the maximum number of threads that can be used

Parameters

nb	the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's maximum number of threads

Implements gum::IThreadNumberManager.

◆ setRanges()

void gum::learning::Score::setRanges ( const std::vector< std::pair< std::size_t, std::size_t > > & new_ranges )

inherited

sets new ranges to perform the counts used by the score

Parameters

new_ranges	a set of pairs {(X1,Y1),...,(Xn,Yn)} of database's rows indices. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.

◆ useCache()

void gum::learning::Score::useCache ( const bool on_off )

inherited

turns on/off the use of a cache of the previously computed scores

Member Data Documentation

◆ cache_

ScoringCache gum::learning::Score::cache_

protected inherited

the scoring cache

Definition at line 244 of file score.h.

◆ counter_

RecordCounter gum::learning::Score::counter_

protected inherited

the record counter used for the counts over discrete variables

Definition at line 241 of file score.h.

◆ empty_ids_

const std::vector< NodeId > gum::learning::Score::empty_ids_

protected inherited

an empty vector

Definition at line 250 of file score.h.

◆ one_log2_

const double gum::learning::Score::one_log2_ {M_LOG2E}

protected inherited

1 / log(2)

Definition at line 235 of file score.h.

◆ prior_

Prior* gum::learning::Score::prior_ {nullptr}

protected inherited

the prior (expert knowledge) that we add to the score

Definition at line 238 of file score.h.

◆ use_cache_

bool gum::learning::Score::use_cache_ {true}

protected inherited

a Boolean indicating whether we wish to use the cache

Definition at line 247 of file score.h.

The documentation for this class was generated from the following file:

agrum/BN/learning/scores_and_tests/scoreK2.h
