The class giving access to pseudo-counts: counts in the database + prior.

#include <pseudoCount.h>

Public Member Functions
Constructors / Destructors
	PseudoCount (const DBRowGeneratorParser &parser, const Prior &external_prior, const std::vector< std::pair< std::size_t, std::size_t > > &ranges, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
	default constructor
	PseudoCount (const DBRowGeneratorParser &parser, const Prior &external_prior, const Bijection< NodeId, std::size_t > &nodeId2columns=Bijection< NodeId, std::size_t >())
	default constructor
virtual	~PseudoCount ()
	destructor
	PseudoCount (const PseudoCount &from)
	copy constructor
	PseudoCount (PseudoCount &&from)
	move constructor
PseudoCount &	operator= (const PseudoCount &from)
	copy operator
PseudoCount &	operator= (PseudoCount &&from)
	move operator
Accessors / Modifiers
void	setNumberOfThreads (Size nb) override
	sets the maximum number of threads that can be used
Size	getNumberOfThreads () const override
	returns the current max number of threads of the scheduler
bool	isGumNumberOfThreadsOverriden () const override
	indicates whether the user explicitly set the number of threads
virtual void	setMinNbRowsPerThread (const std::size_t nb) const
	changes the minimum number of rows a thread should process in a multithreading context
virtual std::size_t	minNbRowsPerThread () const
	returns the minimum number of rows that each thread should process
void	setRanges (const std::vector< std::pair< std::size_t, std::size_t > > &new_ranges)
	sets new ranges to perform the counts used by the pseudo-count
void	clearRanges ()
	reset the ranges to the one range corresponding to the whole database
const std::vector< std::pair< std::size_t, std::size_t > > &	ranges () const
	returns the current ranges
std::vector< double >	get (const std::vector< NodeId > &ids)
	returns the pseudo-counts (database counts + prior) over the set of nodes given in ids
virtual void	clear ()
	clears all the data structures from memory, including the cache
const Bijection< NodeId, std::size_t > &	nodeId2Columns () const
	return the mapping between the columns of the database and the node ids
const DatabaseTable &	database () const
	return the database used by the pseudo-count

Protected Attributes
Prior *	prior_ {nullptr}
	the expert-knowledge prior that we add to the contingency tables
RecordCounter	counter_
	the record counter used for the counts over discrete variables
const std::vector< NodeId >	empty_ids_
	an empty vector

Detailed Description

The class giving access to pseudo-counts: counts in the database + prior.
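The "count in the database + prior" idea can be illustrated with a minimal standalone sketch. The helper `addPrior` below is hypothetical and not part of aGrUM; it only assumes that database counts and prior weights are stored cell by cell in vectors of the same size.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an aGrUM function): adds the prior weights to the
// raw database counts, cell by cell, to obtain pseudo-counts.
std::vector<double> addPrior(const std::vector<double>& db_counts,
                             const std::vector<double>& prior_weights) {
  if (db_counts.size() != prior_weights.size()) {
    throw std::invalid_argument("counts and prior must have the same size");
  }
  std::vector<double> pseudo(db_counts.size());
  for (std::size_t i = 0; i < db_counts.size(); ++i) {
    pseudo[i] = db_counts[i] + prior_weights[i];
  }
  return pseudo;
}
```

With a uniform prior weight of 1 per cell, a cell never observed in the database still receives a pseudo-count of 1.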

Definition at line 67 of file pseudoCount.h.

Constructor & Destructor Documentation

◆ PseudoCount() [1/4]

gum::learning::PseudoCount::PseudoCount	(	const DBRowGeneratorParser &	parser,
		const Prior &	external_prior,
		const std::vector< std::pair< std::size_t, std::size_t > > &	ranges,
		const Bijection< NodeId, std::size_t > &	nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters

parser	the parser used to parse the database
external_prior	A prior that we add to the computation of the pseudo-count (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables
ranges	a set of pairs {(X1,Y1),...,(Xn,Yn)} of row indices of the database. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
nodeId2columns	a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

Warning: If nodeId2columns is not empty, then only the pseudo-counts over the ids belonging to this bijection can be computed: applying method get() over other ids will raise exception NotFound.
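The semantics of the ranges parameter (a union of half-open intervals, with an empty set meaning the whole database) can be sketched as follows. This is a standalone illustration, not aGrUM code: the helper `rowIsCounted` and the alias `Ranges` are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper (not an aGrUM function): returns true if `row` belongs
// to the union of the half-open intervals [Xi, Yi). An empty set of ranges
// selects every row of a database containing `nb_rows` rows.
bool rowIsCounted(std::size_t row, const Ranges& ranges, std::size_t nb_rows) {
  if (row >= nb_rows) return false;   // outside the database
  if (ranges.empty()) return true;    // empty set = whole database
  for (const auto& r : ranges) {
    if (row >= r.first && row < r.second) return true;
  }
  return false;
}
```

Note that the upper bound Yi is excluded: with ranges {(0,2),(5,8)}, rows 0, 1, 5, 6 and 7 are counted, while rows 2 and 8 are not.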

References ranges().

Referenced by PseudoCount(), PseudoCount(), operator=(), and operator=().

◆ PseudoCount() [2/4]

gum::learning::PseudoCount::PseudoCount	(	const DBRowGeneratorParser &	parser,
		const Prior &	external_prior,
		const Bijection< NodeId, std::size_t > &	nodeId2columns = Bijection< NodeId, std::size_t >() )

default constructor

Parameters

parser	the parser used to parse the database
external_prior	A prior that we add to the computation of the pseudo-count (this should come from expert knowledge): this consists in adding numbers to counts in the contingency tables
nodeId2columns	a mapping from the ids of the nodes in the graphical model to the corresponding column in the DatabaseTable parsed by the parser. This enables estimating, from a database in which variable A corresponds to the 2nd column, the parameters of a BN in which variable A has a NodeId of 5. An empty nodeId2columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.

Warning: If nodeId2columns is not empty, then only the pseudo-counts over the ids belonging to this bijection can be computed: applying method get() over other ids will raise exception NotFound.

◆ ~PseudoCount()

virtual gum::learning::PseudoCount::~PseudoCount ( )

virtual

destructor

◆ PseudoCount() [3/4]

gum::learning::PseudoCount::PseudoCount ( const PseudoCount & from )

copy constructor

References PseudoCount().

◆ PseudoCount() [4/4]

gum::learning::PseudoCount::PseudoCount ( PseudoCount && from )

move constructor

References PseudoCount().

Member Function Documentation

◆ clear()

virtual void gum::learning::PseudoCount::clear ( )

virtual

clears all the data structures from memory, including the cache

◆ clearRanges()

void gum::learning::PseudoCount::clearRanges ( )

reset the ranges to the one range corresponding to the whole database

◆ database()

const DatabaseTable & gum::learning::PseudoCount::database ( ) const

return the database used by the pseudo-count

◆ get()

std::vector< double > gum::learning::PseudoCount::get ( const std::vector< NodeId > & ids )

returns the pseudo-counts (database counts + prior) over the set of nodes given in ids

Parameters

ids	the ids of the nodes over which the pseudo-counts are computed

Referenced by gum::learning::IBNLearner::rawPseudoCount().

◆ getNumberOfThreads()

Size gum::learning::PseudoCount::getNumberOfThreads ( ) const

override virtual

returns the current max number of threads of the scheduler

Implements gum::IThreadNumberManager.

◆ isGumNumberOfThreadsOverriden()

bool gum::learning::PseudoCount::isGumNumberOfThreadsOverriden ( ) const

override virtual

indicates whether the user explicitly set the number of threads

Implements gum::IThreadNumberManager.

◆ minNbRowsPerThread()

virtual std::size_t gum::learning::PseudoCount::minNbRowsPerThread ( ) const

virtual

returns the minimum number of rows that each thread should process

◆ nodeId2Columns()

const Bijection< NodeId, std::size_t > & gum::learning::PseudoCount::nodeId2Columns ( ) const

return the mapping between the columns of the database and the node ids

Warning: An empty nodeId2Columns bijection means that the mapping is an identity, i.e., the value of a NodeId is equal to the index of the column in the DatabaseTable.
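The identity convention for an empty mapping can be sketched in standalone C++. The code below is an illustration under assumptions: `std::map` stands in for aGrUM's Bijection, the `NodeId` alias is a local stand-in, `std::out_of_range` stands in for the NotFound exception, and `columnOf` is a hypothetical helper.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

using NodeId = std::size_t;  // local stand-in for aGrUM's NodeId type

// Hypothetical helper (not an aGrUM function): resolves the database column
// of a node. An empty mapping is treated as the identity.
std::size_t columnOf(NodeId id, const std::map<NodeId, std::size_t>& mapping) {
  if (mapping.empty()) return id;  // identity: column index == NodeId
  const auto it = mapping.find(id);
  if (it == mapping.end()) {
    // stand-in for the NotFound exception mentioned in the documentation
    throw std::out_of_range("node id not in the mapping");
  }
  return it->second;
}
```

For instance, with the mapping {5 -> 1}, the node of id 5 is read from the 2nd column (index 1), and asking for any other id fails.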

◆ operator=() [1/2]

PseudoCount & gum::learning::PseudoCount::operator= ( const PseudoCount & from )

copy operator

References PseudoCount().

◆ operator=() [2/2]

PseudoCount & gum::learning::PseudoCount::operator= ( PseudoCount && from )

move operator

References PseudoCount().

◆ ranges()

const std::vector< std::pair< std::size_t, std::size_t > > & gum::learning::PseudoCount::ranges ( ) const

returns the current ranges

Referenced by PseudoCount().

◆ setMinNbRowsPerThread()

virtual void gum::learning::PseudoCount::setMinNbRowsPerThread ( const std::size_t nb ) const

virtual

changes the minimum number of rows a thread should process in a multithreading context

When computing pseudo-counts, record counters use several threads to perform counts on the rows of the database. The minimum number of rows per thread set by this method indicates how many rows each thread should process at least. It is used to compute the number of threads actually run: this number is equal to the minimum of the maximum number of threads allowed and the number of records in the database divided by nb.
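The rule stated above can be written down as a small standalone function. This is only a sketch: `nbThreadsUsed` is a hypothetical name, and the use of integer division, the guard against a zero divisor, and the lower bound of one thread are assumptions that the documentation does not spell out.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not an aGrUM function): number of threads actually run,
// i.e. min(max number of threads allowed, number of records / nb).
std::size_t nbThreadsUsed(std::size_t max_threads,
                          std::size_t nb_records,
                          std::size_t min_rows_per_thread) {
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;  // assumption: avoid division by 0
  const std::size_t by_rows = nb_records / min_rows_per_thread;  // assumption: integer division
  // assumption: at least one thread is always used
  return std::max<std::size_t>(1, std::min(max_threads, by_rows));
}
```

With at most 8 threads allowed and nb = 100, a database of 300 records would use 3 threads, while a database of 1000 records would use all 8.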

◆ setNumberOfThreads()

void gum::learning::PseudoCount::setNumberOfThreads ( Size nb )

override virtual

sets the maximum number of threads that can be used

Parameters

nb	the maximum number of threads to be used. If this number is set to 0, then it defaults to aGrUM's max number of threads

Implements gum::IThreadNumberManager.
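The handling of nb = 0 described for setNumberOfThreads() amounts to the following rule, shown here as a standalone sketch. `resolveMaxThreads` is a hypothetical helper, and `library_max` is a stand-in parameter for aGrUM's max number of threads, whose actual source is not shown in this page.

```cpp
#include <cstddef>

// Hypothetical helper (not an aGrUM function): a value of 0 means "use the
// library-wide maximum", any other value is taken as is.
std::size_t resolveMaxThreads(std::size_t nb, std::size_t library_max) {
  return nb == 0 ? library_max : nb;
}
```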

◆ setRanges()

void gum::learning::PseudoCount::setRanges ( const std::vector< std::pair< std::size_t, std::size_t > > & new_ranges )

sets new ranges to perform the counts used by the pseudo-count

Parameters

new_ranges	a set of pairs {(X1,Y1),...,(Xn,Yn)} of row indices of the database. The counts are then performed only on the union of the rows [Xi,Yi), i in {1,...,n}. This is useful, e.g., when performing cross-validation tasks, in which part of the database should be ignored. An empty set of ranges is equivalent to an interval [X,Y) ranging over the whole database.
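For the cross-validation use case mentioned above, the ranges of the training rows can be built by leaving out one fold. The sketch below is a standalone illustration: `trainingRanges` is a hypothetical helper, and the way folds are cut (contiguous blocks of roughly nb_rows / k rows) is an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Ranges = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical helper (not an aGrUM function): ranges [Xi, Yi) covering all
// the rows except those of test fold `fold` in a k-fold split of `nb_rows` rows.
Ranges trainingRanges(std::size_t nb_rows, std::size_t k, std::size_t fold) {
  // k >= 2 is required: with k == 1 the result would be empty, and an empty
  // set of ranges means "the whole database", not "no row".
  if (k < 2 || fold >= k) {
    throw std::invalid_argument("need k >= 2 and fold < k");
  }
  const std::size_t test_begin = fold * nb_rows / k;
  const std::size_t test_end   = (fold + 1) * nb_rows / k;

  Ranges training;
  if (test_begin > 0) training.emplace_back(std::size_t{0}, test_begin);
  if (test_end < nb_rows) training.emplace_back(test_end, nb_rows);
  return training;
}
```

The returned vector has the type expected by setRanges(), so a call such as `pseudo_count.setRanges(trainingRanges(nb_rows, 5, 1))` on an existing PseudoCount object (here the hypothetical variable `pseudo_count`) would restrict the counts to the training rows.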

Member Data Documentation

◆ counter_

RecordCounter gum::learning::PseudoCount::counter_

protected

the record counter used for the counts over discrete variables

Definition at line 214 of file pseudoCount.h.

◆ empty_ids_

const std::vector< NodeId > gum::learning::PseudoCount::empty_ids_

protected

an empty vector

Definition at line 217 of file pseudoCount.h.

◆ prior_

Prior* gum::learning::PseudoCount::prior_ {nullptr}

protected

the expert-knowledge prior that we add to the contingency tables

Definition at line 211 of file pseudoCount.h.

The documentation for this class was generated from the following file:

agrum/base/stattests/pseudoCount.h
