aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
The class used to pack sets of generators.
#include <agrum/base/database/DBRowGeneratorSet.h>
Public Member Functions | |
Constructors / Destructors | |
| DBRowGeneratorSet () | |
| default constructor | |
| DBRowGeneratorSet (const DBRowGeneratorSet &from) | |
| copy constructor | |
| DBRowGeneratorSet (DBRowGeneratorSet &&from) | |
| move constructor | |
| virtual DBRowGeneratorSet * | clone () const |
| virtual copy constructor | |
| virtual | ~DBRowGeneratorSet () |
| destructor | |
Operators | |
| DBRowGeneratorSet & | operator= (const DBRowGeneratorSet &from) |
| copy operator | |
| DBRowGeneratorSet & | operator= (DBRowGeneratorSet &&from) |
| move operator | |
| DBRowGenerator & | operator[] (const std::size_t i) |
| returns the ith generator | |
| const DBRowGenerator & | operator[] (const std::size_t i) const |
| returns the ith generator | |
Accessors / Modifiers | |
| template<class Generator> | |
| void | insertGenerator (const Generator &generator) |
| inserts a new generator at the end of the set | |
| template<class Generator> | |
| void | insertGenerator (const Generator &generator, const std::size_t i) |
| inserts a new generator at the ith position of the set | |
| std::size_t | nbGenerators () const noexcept |
| returns the number of generators | |
| std::size_t | size () const noexcept |
| returns the number of generators (alias for nbGenerators) | |
| bool | hasRows () |
| returns true if there are still rows that can be output by the set of generators | |
| bool | setInputRow (const DBRow< DBTranslatedValue > &input_row) |
| sets the input row from which the generators will create new rows | |
| const DBRow< DBTranslatedValue > & | generate () |
| generates a new output row from the input row | |
| template<typename GUM_SCALAR> | |
| void | setBayesNet (const BayesNet< GUM_SCALAR > &new_bn) |
| assign a new Bayes net to all the generators that depend on a BN | |
| void | reset () |
| resets all the generators | |
| void | clear () |
| removes all the generators | |
| void | setColumnsOfInterest (const std::vector< std::size_t > &cols_of_interest) |
| sets the columns of interest: the output DBRow needs only contain correct values for these columns | |
| void | setColumnsOfInterest (std::vector< std::size_t > &&cols_of_interest) |
| sets the columns of interest: the output DBRow needs only contain correct values for these columns | |
| const std::vector< std::size_t > & | columnsOfInterest () const |
| returns the current set of columns of interest | |
The class used to pack sets of generators.
When learning Bayesian networks, the records of the training dataset are used to construct contingency tables that are exploited either in statistical conditional independence tests or in scores. To achieve this, all the values of the DatabaseTable's records must be observed, i.e., there should be no missing value. When this is not the case, we need to decide what to do with the records (actually the DBRows) that contain missing values. Should we discard them? Should we use an EM algorithm to substitute them with several fully-observed DBRows weighted by their probability of occurrence? Should we use a K-means algorithm to substitute them with a single DBRow of highest probability of occurrence?
DBRowGenerator classes are used to perform these substitutions. From one input DBRow, they can produce zero, one or several output DBRows. DBRowGenerator instances can be used in sequence: for example, a first DBRowGenerator can apply an EM algorithm to produce many output DBRows, and these DBRows can then feed another DBRowGenerator that keeps only those whose weight is higher than a given threshold.
The purpose of class DBRowGeneratorSet is to contain this sequence of DBRowGenerator instances. The key idea is that it makes iterating over the generated output DBRows easier. For instance, if we want to use a sequence of 2 generators that output 3 and 4 copies, respectively, of each DBRow they receive as input, we could use the following code:
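A minimal, self-contained sketch of this nested-loop pattern is given here. It assumes a toy generator, `ToyDuplicator`, and a `Row` alias, both hypothetical stand-ins that are not part of aGrUM. They only mimic the setInputRow() / hasRows() / generate() calling protocol described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a database row (not the aGrUM DBRow class).
using Row = std::vector<int>;

// Hypothetical stand-in for a DBRowGenerator: it outputs each input row
// `copies` times. It is NOT an aGrUM class.
class ToyDuplicator {
public:
  explicit ToyDuplicator(std::size_t copies) : copies_(copies) {}

  // Stores the input row; returns true if at least one output row is available.
  bool setInputRow(const Row& row) {
    row_       = row;
    remaining_ = copies_;
    return hasRows();
  }

  bool hasRows() const { return remaining_ > 0; }

  // Returns the next output row; must only be called while hasRows() is true.
  const Row& generate() {
    assert(hasRows());
    --remaining_;
    return row_;
  }

private:
  std::size_t copies_;
  std::size_t remaining_ = 0;
  Row         row_;
};

// Chains two generators by hand: one nested while loop per generator.
// Returns the total number of output rows produced for the whole database.
std::size_t countRowsNested(const std::vector<Row>& database) {
  ToyDuplicator generator3(3);
  ToyDuplicator generator4(4);
  std::size_t   nb_output = 0;

  for (const Row& dbrow : database) {
    generator3.setInputRow(dbrow);
    while (generator3.hasRows()) {
      const Row& output3 = generator3.generate();
      generator4.setInputRow(output3);
      while (generator4.hasRows()) {
        const Row& output4 = generator4.generate();
        (void)output4;   // a real application would process the row here
        ++nb_output;
      }
    }
  }
  return nb_output;
}
```

Each additional generator in the chain would require one more nested loop in this style.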
For each input DBRow of the DatabaseTable, these while loops output 3 x 4 = 12 identical DBRows. As can be seen, when several DBRowGenerator instances are to be used in sequence, the code is not very easy to write. The DBRowGeneratorSet simplifies the coding as follows:
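As a self-contained sketch of this idea, the toy classes below (`ToyDuplicator`, `ToyGeneratorSet` and the `Row` alias are hypothetical stand-ins, not the aGrUM classes) show how a set can hide the chaining behind a single setInputRow() / hasRows() / generate() interface. The internal `refill` helper is an assumption made for illustration and says nothing about how aGrUM implements the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;   // hypothetical stand-in for a DBRow

// Hypothetical toy generator (not aGrUM): outputs each input row `copies` times.
class ToyDuplicator {
public:
  explicit ToyDuplicator(std::size_t copies) : copies_(copies) {}
  bool setInputRow(const Row& row) {
    row_       = row;
    remaining_ = copies_;
    return hasRows();
  }
  bool hasRows() const { return remaining_ > 0; }
  const Row& generate() {
    assert(hasRows());
    --remaining_;
    return row_;
  }
  void reset() { remaining_ = 0; }   // drops any pending output row

private:
  std::size_t copies_;
  std::size_t remaining_ = 0;
  Row         row_;
};

// Hypothetical toy set (not aGrUM): generator i feeds generator i + 1.
class ToyGeneratorSet {
public:
  void insertGenerator(const ToyDuplicator& generator) {
    generators_.push_back(generator);
  }

  bool setInputRow(const Row& row) {
    if (generators_.empty()) return false;
    for (auto& generator : generators_) generator.reset();
    generators_[0].setInputRow(row);
    return hasRows();
  }

  // True if the last generator can still output a row, pulling new input
  // rows from the upstream generators when it is exhausted.
  bool hasRows() {
    return !generators_.empty() && refill(generators_.size() - 1);
  }

  const Row& generate() {
    const bool available = hasRows();
    assert(available);
    (void)available;
    return generators_.back().generate();
  }

private:
  // Makes generator i hold a pending row; false once the chain is exhausted.
  bool refill(std::size_t i) {
    while (!generators_[i].hasRows()) {
      if (i == 0) return false;           // the input row is used up
      if (!refill(i - 1)) return false;   // nothing left upstream
      generators_[i].setInputRow(generators_[i - 1].generate());
    }
    return true;
  }

  std::vector<ToyDuplicator> generators_;
};

// Same chain as with nested loops, but a single while loop suffices.
std::size_t countRowsWithSet(const std::vector<Row>& database) {
  ToyGeneratorSet genset;
  genset.insertGenerator(ToyDuplicator(3));
  genset.insertGenerator(ToyDuplicator(4));

  std::size_t nb_output = 0;
  for (const Row& dbrow : database) {
    genset.setInputRow(dbrow);
    while (genset.hasRows()) {
      const Row& output = genset.generate();
      (void)output;   // a real application would process the row here
      ++nb_output;
    }
  }
  return nb_output;
}
```

In this sketch, adding a third generator only requires one more insertGenerator() call; the loop in `countRowsWithSet` stays unchanged.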
As can be seen, whatever the number of DBRowGenerator instances packed into the DBRowGeneratorSet, only one while loop is needed to parse all the generated output DBRow instances.
Definition at line 129 of file DBRowGeneratorSet.h.
| gum::learning::DBRowGeneratorSet::DBRowGeneratorSet | ( | ) |
default constructor
Referenced by DBRowGeneratorSet(), DBRowGeneratorSet(), clone(), operator=(), and operator=().
| gum::learning::DBRowGeneratorSet::DBRowGeneratorSet | ( | const DBRowGeneratorSet & | from | ) |
| gum::learning::DBRowGeneratorSet::DBRowGeneratorSet | ( | DBRowGeneratorSet && | from | ) |
| gum::learning::DBRowGeneratorSet::~DBRowGeneratorSet | ( | ) | virtual |
destructor
| void gum::learning::DBRowGeneratorSet::clear | ( | ) |
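| DBRowGeneratorSet * gum::learning::DBRowGeneratorSet::clone | ( | ) | const | virtual |
virtual copy constructor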
| const std::vector< std::size_t > & gum::learning::DBRowGeneratorSet::columnsOfInterest | ( | ) | const |
returns the current set of columns of interest
References columnsOfInterest().
Referenced by columnsOfInterest().
| const DBRow< DBTranslatedValue > & gum::learning::DBRowGeneratorSet::generate | ( | ) |
generates a new output row from the input row
References generate().
Referenced by generate().
| bool gum::learning::DBRowGeneratorSet::hasRows | ( | ) |
| void gum::learning::DBRowGeneratorSet::insertGenerator | ( | const Generator & | generator | ) |
inserts a new generator at the end of the set
| OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
| void gum::learning::DBRowGeneratorSet::insertGenerator | ( | const Generator & | generator, |
| const std::size_t | i ) |
inserts a new generator at the ith position of the set
| OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
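The insertion rule above can be illustrated with a small guard. This is only a sketch under stated assumptions: `ToyGuardedSet` is a hypothetical class that stores generator names instead of generators, `std::logic_error` stands in for aGrUM's OperationNotAllowed, and clamping an out-of-range position to the end of the set is a choice made for this sketch, not documented aGrUM behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in (not the aGrUM class): it only tracks generator names
// and the number of output rows that remain to be generated.
class ToyGuardedSet {
public:
  // Appends a generator at the end of the set.
  void insertGenerator(const std::string& name) {
    insertGenerator(name, names_.size());
  }

  // Inserts a generator at position i. Refused while a generation is in
  // progress, i.e., while some output rows have not been generated yet.
  void insertGenerator(const std::string& name, std::size_t i) {
    if (pending_rows_ > 0) {
      // std::logic_error plays the role of OperationNotAllowed in this sketch
      throw std::logic_error("generation in progress: cannot insert a generator");
    }
    if (i > names_.size()) i = names_.size();   // assumption: clamp to the end
    names_.insert(names_.begin() + static_cast<std::ptrdiff_t>(i), name);
  }

  // Pretends that the input row yields `nb_rows` output rows.
  void setInputRow(std::size_t nb_rows) { pending_rows_ = nb_rows; }

  bool hasRows() const { return pending_rows_ > 0; }

  // Consumes one pending output row, if any.
  void generate() {
    if (pending_rows_ > 0) --pending_rows_;
  }

  const std::vector<std::string>& names() const { return names_; }

private:
  std::vector<std::string> names_;
  std::size_t              pending_rows_ = 0;
};
```

Once every pending row has been consumed through generate(), insertions are accepted again.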
| std::size_t gum::learning::DBRowGeneratorSet::nbGenerators | ( | ) | const | noexcept |
returns the number of generators
| DBRowGeneratorSet & gum::learning::DBRowGeneratorSet::operator= | ( | const DBRowGeneratorSet & | from | ) |
| DBRowGeneratorSet & gum::learning::DBRowGeneratorSet::operator= | ( | DBRowGeneratorSet && | from | ) |
| DBRowGenerator & gum::learning::DBRowGeneratorSet::operator[] | ( | const std::size_t | i | ) |
returns the ith generator
| const DBRowGenerator & gum::learning::DBRowGeneratorSet::operator[] | ( | const std::size_t | i | ) | const |
returns the ith generator
| void gum::learning::DBRowGeneratorSet::reset | ( | ) |
| void gum::learning::DBRowGeneratorSet::setBayesNet | ( | const BayesNet< GUM_SCALAR > & | new_bn | ) |
assign a new Bayes net to all the generators that depend on a BN
Typically, generators based on EM or K-means depend on a model to compute their outputs correctly. Method setBayesNet updates the BN model they use.
References setBayesNet().
Referenced by setBayesNet().
| void gum::learning::DBRowGeneratorSet::setColumnsOfInterest | ( | const std::vector< std::size_t > & | cols_of_interest | ) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator instances contained in the DBRowGeneratorSet still output DBRows with the same columns as the DatabaseTable, but only the columns passed as argument to setColumnsOfInterest() are meaningful. For instance, if a DatabaseTable contains 10 columns and setColumnsOfInterest() is called with vector { 0, 3, 4 }, then the DBRowGenerator instances contained in the DBRowGeneratorSet will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
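From the consumer's side, this means that only the columns of interest of an output row should be read. The helper below is a hypothetical sketch, not part of aGrUM: it uses a plain `std::vector<int>` as a stand-in for an output DBRow and extracts the values at the columns of interest, ignoring all the other columns.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an aGrUM function): returns the values of
// `output_row` located at the columns of interest, in the order given.
// Columns are indexed starting from 0.
std::vector<int> readColumnsOfInterest(
    const std::vector<int>&         output_row,
    const std::vector<std::size_t>& cols_of_interest) {
  std::vector<int> values;
  values.reserve(cols_of_interest.size());
  for (const std::size_t col : cols_of_interest) {
    // at() throws std::out_of_range if col is not a valid column index
    values.push_back(output_row.at(col));
  }
  return values;
}
```

With a 10-column row and columns of interest { 0, 3, 4 }, the helper returns three values and never looks at the seven remaining columns, whose content is not guaranteed.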
| OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
References setColumnsOfInterest().
Referenced by setColumnsOfInterest(), and setColumnsOfInterest().
| void gum::learning::DBRowGeneratorSet::setColumnsOfInterest | ( | std::vector< std::size_t > && | cols_of_interest | ) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator instances contained in the DBRowGeneratorSet still output DBRows with the same columns as the DatabaseTable, but only the columns passed as argument to setColumnsOfInterest() are meaningful. For instance, if a DatabaseTable contains 10 columns and setColumnsOfInterest() is called with vector { 0, 3, 4 }, then the DBRowGenerator instances contained in the DBRowGeneratorSet will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
| OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
References setColumnsOfInterest().
| bool gum::learning::DBRowGeneratorSet::setInputRow | ( | const DBRow< DBTranslatedValue > & | input_row | ) |
sets the input row from which the generators will create new rows
References setInputRow().
Referenced by setInputRow().
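| std::size_t gum::learning::DBRowGeneratorSet::size | ( | ) | const | noexcept |
returns the number of generators (alias for nbGenerators)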