aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
The base class for all DBRow generators.
#include <agrum/base/database/DBRowGenerator.h>
Public Member Functions | |
Constructors / Destructors | |
| DBRowGenerator (const std::vector< DBTranslatedValueType > &column_types, const DBRowGeneratorGoal goal) | |
| default constructor | |
| DBRowGenerator (const DBRowGenerator &from) | |
| copy constructor | |
| DBRowGenerator (DBRowGenerator &&from) | |
| move constructor | |
| virtual DBRowGenerator * | clone () const =0 |
| virtual copy constructor | |
| virtual | ~DBRowGenerator () |
| destructor | |
Accessors / Modifiers | |
| bool | hasRows () |
| returns true if there are still rows that can be output by the DBRowGenerator | |
| bool | setInputRow (const DBRow< DBTranslatedValue > &row) |
| sets the input row from which the generator will create its output rows | |
| virtual const DBRow< DBTranslatedValue > & | generate ()=0 |
| generate new rows from the input row | |
| void | decreaseRemainingRows () |
| decrease the number of remaining output rows | |
| virtual void | reset () |
| resets the generator: after this call, there are no more output rows to generate | |
| virtual void | setColumnsOfInterest (const std::vector< std::size_t > &cols_of_interest) |
| sets the columns of interest: the output DBRow only needs to contain correct values for these columns | |
| virtual void | setColumnsOfInterest (std::vector< std::size_t > &&cols_of_interest) |
| sets the columns of interest: the output DBRow only needs to contain correct values for these columns | |
| const std::vector< std::size_t > & | columnsOfInterest () const |
| returns the current set of columns of interest | |
| DBRowGeneratorGoal | goal () const |
| returns the goal of the DBRowGenerator | |
Protected Member Functions | |
| DBRowGenerator & | operator= (const DBRowGenerator &) |
| copy assignment operator | |
| DBRowGenerator & | operator= (DBRowGenerator &&) |
| move assignment operator | |
| virtual std::size_t | computeRows_ (const DBRow< DBTranslatedValue > &row)=0 |
| the method that computes the set of DBRow instances to output after method setInputRow has been called | |
Protected Attributes | |
| std::size_t | nb_remaining_output_rows_ {std::size_t(0)} |
| the number of output rows still to retrieve through the generate method | |
| std::vector< DBTranslatedValueType > | column_types_ |
| the types of the columns in the DatabaseTable | |
| std::vector< std::size_t > | columns_of_interest_ |
| the set of columns of interest | |
| DBRowGeneratorGoal | goal_ {DBRowGeneratorGoal::OTHER_THINGS_THAN_REMOVE_MISSING_VALUES} |
| the goal of the DBRowGenerator (just remove missing values or not) | |
The base class for all DBRow generators.
A DBRowGenerator instance takes as input a DBRow of DBTranslatedValue instances, provided either directly by a DatabaseTable or by another DBRowGenerator, and produces zero or more DBRow instances of DBTranslatedValue. This is mainly useful for dealing with missing values: during learning, when a DBRow contains some missing values, what should we do with it? Should we discard it? Should we use an EM algorithm to produce several DBRows weighted by their probability of occurrence? Should we use a K-means algorithm to produce only the single DBRow with the highest probability of occurrence? With the appropriate DBRowGenerator, you can apply any of these rules when your learning algorithm parses the DatabaseTable. You just need to indicate which DBRowGenerator to use; no line of code needs to be changed in your high-level learning algorithm.
As an example of how a DBRowGenerator works, an "Identity" DBRowGenerator takes a DBRow as input and returns it without any further processing, so it "produces" exactly one output DBRow. An EM DBRowGenerator takes as input a DBRow in which some cells may be missing. In this case, it produces all the possible combinations of values that the missing cells may take and assigns to each combination a weight proportional to its probability of occurrence according to a given model. As such, it often produces several output DBRows.
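The expansion performed by an EM-like generator can be illustrated with a minimal sketch. It is not aGrUM's implementation: `kMissing`, `WeightedRow` and `expandMissing` are hypothetical names, and, as a simplifying assumption, each missing column is weighted by an independent marginal distribution rather than by a full probabilistic model.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical marker for a missing cell (an assumption of this sketch).
constexpr std::size_t kMissing = std::numeric_limits<std::size_t>::max();

// Hypothetical output type: a completed row and its weight.
struct WeightedRow {
  std::vector<std::size_t> values;
  double weight;
};

// marginals[c][v] is the assumed probability that column c takes value v.
// Every missing cell is replaced by each of its possible values, and the
// weight of a completed row is the product of the chosen probabilities.
std::vector<WeightedRow> expandMissing(
    const std::vector<std::size_t>& row,
    const std::vector<std::vector<double>>& marginals) {
  assert(row.size() == marginals.size());
  std::vector<WeightedRow> out;
  out.push_back(WeightedRow{row, 1.0});
  for (std::size_t col = 0; col < row.size(); ++col) {
    if (row[col] != kMissing) continue;  // observed cell: nothing to expand
    std::vector<WeightedRow> next;
    for (const WeightedRow& partial : out) {
      for (std::size_t v = 0; v < marginals[col].size(); ++v) {
        WeightedRow filled = partial;
        filled.values[col] = v;
        filled.weight *= marginals[col][v];
        next.push_back(std::move(filled));
      }
    }
    out = std::move(next);
  }
  return out;
}
```

A row without missing cells comes back unchanged with weight 1, which corresponds to the behavior described for the "Identity" generator.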
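The standard usage of a DBRowGenerator is the following: for each DBRow of the DatabaseTable, call setInputRow() on the generator and then, as long as hasRows() returns true, call generate() to retrieve the next output DBRow.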
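The standard usage pattern can be sketched as a loop over the rows of a database. To stay compilable without aGrUM, the sketch relies on simplified stand-ins: `Row`, `IdentityGenerator` and `countOutputRows` are hypothetical names that only mimic the `setInputRow()` / `hasRows()` / `generate()` protocol and are not the library's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DBRow<DBTranslatedValue>: a plain row of values.
using Row = std::vector<std::size_t>;

// Hypothetical stand-in mimicking the DBRowGenerator calling protocol.
// It behaves like an identity generator: one output row per input row.
class IdentityGenerator {
public:
  // Stores the input row and records that one output row is available.
  bool setInputRow(const Row& row) {
    input_     = &row;
    remaining_ = 1;
    return hasRows();
  }

  // True while some output rows have not been retrieved yet.
  bool hasRows() const { return remaining_ != 0; }

  // Returns the next output row and decrements the remaining count.
  const Row& generate() {
    assert(remaining_ != 0 && input_ != nullptr);
    --remaining_;
    return *input_;
  }

private:
  const Row*  input_     = nullptr;
  std::size_t remaining_ = 0;
};

// Parses a whole "database" and counts the output rows it receives.
std::size_t countOutputRows(const std::vector<Row>& database) {
  IdentityGenerator generator;
  std::size_t nb_rows = 0;
  for (const Row& dbrow : database) {
    generator.setInputRow(dbrow);
    while (generator.hasRows()) {
      const Row& output = generator.generate();
      (void)output;  // a learning algorithm would process the row here
      ++nb_rows;
    }
  }
  return nb_rows;
}
```

Because the loop only depends on these three calls, swapping in another generator does not require changes to the parsing code.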
All DBRowGenerator classes should derive from this class. It takes care of the interaction with the RecordCounter / Score classes. A user who wishes to create a new DBRowGenerator, say one that outputs the input row k times, just has to define a derived class that implements the pure virtual methods clone(), generate() and computeRows_(). Not all the constructors/destructors are required; the important part of such a class is its "Accessors / Modifiers" section.
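A generator that outputs its input row k times could look like the sketch below. It does not use aGrUM itself: `GeneratorBase` is a hypothetical, much simplified stand-in for the base class (no column types, goal or columns of interest), and `Row` and `DuplicateGenerator` are illustrative names. Only the division of work between `setInputRow()`, `computeRows_()`, `generate()` and `decreaseRemainingRows()` follows the descriptions on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DBRow<DBTranslatedValue>.
using Row = std::vector<std::size_t>;

// Simplified stand-in for the base class: it keeps the counter of remaining
// output rows and delegates the row computation to computeRows_().
class GeneratorBase {
public:
  virtual ~GeneratorBase() = default;
  virtual GeneratorBase* clone() const = 0;

  bool hasRows() const { return nb_remaining_output_rows_ != 0; }

  bool setInputRow(const Row& row) {
    nb_remaining_output_rows_ = computeRows_(row);
    return hasRows();
  }

  virtual const Row& generate() = 0;

  void decreaseRemainingRows() {
    assert(nb_remaining_output_rows_ != 0);
    --nb_remaining_output_rows_;
  }

protected:
  // Returns the number of output rows available for the given input row.
  virtual std::size_t computeRows_(const Row& row) = 0;

  std::size_t nb_remaining_output_rows_ = 0;
};

// Outputs each input row k times.
class DuplicateGenerator : public GeneratorBase {
public:
  explicit DuplicateGenerator(std::size_t k) : k_(k) {}

  DuplicateGenerator* clone() const override {
    return new DuplicateGenerator(*this);
  }

  // Hands out the stored input row and consumes one remaining output row.
  const Row& generate() override {
    decreaseRemainingRows();
    return *input_row_;
  }

protected:
  // Remembers the input row and announces k output rows.
  std::size_t computeRows_(const Row& row) override {
    input_row_ = &row;
    return k_;
  }

private:
  const Row*  input_row_ = nullptr;
  std::size_t k_;
};
```

In this sketch the caller owns the pointer returned by `clone()`, and the input row must outlive the calls to `generate()` because only its address is stored.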
Definition at line 223 of file DBRowGenerator.h.
| gum::learning::DBRowGenerator::DBRowGenerator | ( | const std::vector< DBTranslatedValueType > & | column_types, |
| const DBRowGeneratorGoal | goal ) |
default constructor
| column_types | indicates for each column whether this is a continuous or a discrete one |
References goal().
Referenced by DBRowGenerator(), DBRowGenerator(), clone(), operator=(), and operator=().
| gum::learning::DBRowGenerator::DBRowGenerator | ( | const DBRowGenerator & | from | ) |
| gum::learning::DBRowGenerator::DBRowGenerator | ( | DBRowGenerator && | from | ) |
| gum::learning::DBRowGenerator::~DBRowGenerator | ( | ) | virtual |
destructor
| DBRowGenerator * gum::learning::DBRowGenerator::clone | ( | ) | const | pure virtual |
virtual copy constructor
Implemented in gum::learning::DBRowGenerator4CompleteRows, gum::learning::DBRowGeneratorEM< GUM_SCALAR >, and gum::learning::DBRowGeneratorIdentity.
References DBRowGenerator().
| const std::vector< std::size_t > & gum::learning::DBRowGenerator::columnsOfInterest | ( | ) | const |
returns the current set of columns of interest
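| std::size_t gum::learning::DBRowGenerator::computeRows_ | ( | const DBRow< DBTranslatedValue > & | row | ) | protected, pure virtual |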
the method that computes the set of DBRow instances to output after method setInputRow has been called
Implemented in gum::learning::DBRowGenerator4CompleteRows, gum::learning::DBRowGeneratorEM< GUM_SCALAR >, and gum::learning::DBRowGeneratorIdentity.
| void gum::learning::DBRowGenerator::decreaseRemainingRows | ( | ) |
decrease the number of remaining output rows
When method setInputRow is called, the DBRowGenerator knows how many output rows it will be able to generate. Each time method decreaseRemainingRows is called, this number is decremented. When it reaches 0, there are no more output rows to generate.
| const DBRow< DBTranslatedValue > & gum::learning::DBRowGenerator::generate | ( | ) | pure virtual |
generate new rows from the input row
Implemented in gum::learning::DBRowGenerator4CompleteRows, gum::learning::DBRowGeneratorEM< GUM_SCALAR >, and gum::learning::DBRowGeneratorIdentity.
| DBRowGeneratorGoal gum::learning::DBRowGenerator::goal | ( | ) | const |
returns the goal of the DBRowGenerator
Referenced by DBRowGenerator(), and gum::learning::DBRowGeneratorWithBN< GUM_SCALAR >::DBRowGeneratorWithBN().
| bool gum::learning::DBRowGenerator::hasRows | ( | ) |
returns true if there are still rows that can be output by the DBRowGenerator
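| DBRowGenerator & gum::learning::DBRowGenerator::operator= | ( | const DBRowGenerator & | ) | protected |
copy assignment operator
| DBRowGenerator & gum::learning::DBRowGenerator::operator= | ( | DBRowGenerator && | ) | protected |
move assignment operator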
| void gum::learning::DBRowGenerator::reset | ( | ) | virtual |
resets the generator: after this call, there are no more output rows to generate
| void gum::learning::DBRowGenerator::setColumnsOfInterest | ( | const std::vector< std::size_t > & | cols_of_interest | ) | virtual |
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator still outputs DBRows with the same columns as the DatabaseTable, but only the columns passed as argument to method setColumnsOfInterest are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is called with vector<> { 0, 3, 4 }, then the DBRowGenerator will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
| void gum::learning::DBRowGenerator::setColumnsOfInterest | ( | std::vector< std::size_t > && | cols_of_interest | ) | virtual |
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator still outputs DBRows with the same columns as the DatabaseTable, but only the columns passed as argument to method setColumnsOfInterest are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is called with vector<> { 0, 3, 4 }, then the DBRowGenerator will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
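One possible use of the columns of interest is to decide whether a row is complete enough to be output. The sketch below is an illustration under stated assumptions, not aGrUM code: `Row`, `kMissing` and `isCompleteOnColumns` are hypothetical names, and missing cells are assumed to be encoded by a reserved value.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a row of discrete values.
using Row = std::vector<std::size_t>;

// Hypothetical marker for a missing cell (an assumption of this sketch).
constexpr std::size_t kMissing = std::numeric_limits<std::size_t>::max();

// Returns true if every column of interest holds an observed value.
// Columns outside cols_of_interest are ignored, even when they are missing.
bool isCompleteOnColumns(const Row& row,
                         const std::vector<std::size_t>& cols_of_interest) {
  for (const std::size_t col : cols_of_interest) {
    assert(col < row.size());  // columns are indexed starting from 0
    if (row[col] == kMissing) return false;
  }
  return true;
}
```

With 10 columns and columns of interest { 0, 3, 4 }, a missing value in column 7 does not affect the result, whereas a missing value in column 3 does.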
| bool gum::learning::DBRowGenerator::setInputRow | ( | const DBRow< DBTranslatedValue > & | row | ) |
sets the input row from which the generator will create its output rows
| std::vector< DBTranslatedValueType > gum::learning::DBRowGenerator::column_types_ | protected |
the types of the columns in the DatabaseTable
This is useful to determine whether we need to use the .discr_val field or the .cont_val field in DBTranslatedValue instances.
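The role of the column types can be sketched as follows. The field names `discr_val` and `cont_val` come from the text above; everything else (`ValueType`, `Value`, `sumRow`) is a hypothetical stand-in written for this illustration and does not reproduce the library's definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DBTranslatedValueType.
enum class ValueType { DISCRETE, CONTINUOUS };

// Hypothetical stand-in for DBTranslatedValue: only one field is valid at a
// time, and the value itself does not record which one.
union Value {
  std::size_t discr_val;
  float       cont_val;
};

// Sums the cells of a row, reading for each column the field that matches
// the type recorded for that column.
double sumRow(const std::vector<Value>&     row,
              const std::vector<ValueType>& column_types) {
  assert(row.size() == column_types.size());
  double total = 0.0;
  for (std::size_t i = 0; i < row.size(); ++i) {
    total += (column_types[i] == ValueType::DISCRETE)
                 ? static_cast<double>(row[i].discr_val)
                 : static_cast<double>(row[i].cont_val);
  }
  return total;
}
```

Since a cell does not carry its own type, a vector of column types has to be available wherever the cells are read, which is why the generator stores it.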
Definition at line 330 of file DBRowGenerator.h.
| std::vector< std::size_t > gum::learning::DBRowGenerator::columns_of_interest_ | protected |
the set of columns of interest
Definition at line 333 of file DBRowGenerator.h.
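| DBRowGeneratorGoal gum::learning::DBRowGenerator::goal_ {DBRowGeneratorGoal::OTHER_THINGS_THAN_REMOVE_MISSING_VALUES} | protected |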
the goal of the DBRowGenerator (just remove missing values or not)
Definition at line 336 of file DBRowGenerator.h.
| std::size_t gum::learning::DBRowGenerator::nb_remaining_output_rows_ {std::size_t(0)} | protected |
the number of output rows still to retrieve through the generate method
Definition at line 325 of file DBRowGenerator.h.