aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
a helper to easily read databases
#include <IBNLearner.h>
Public Member Functions | |
| template<typename GUM_SCALAR> | |
| Database (const std::string &filename, const BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols) | |
Constructors / Destructors | |
| Database (const std::string &file, const std::vector< std::string > &missing_symbols, const bool induceTypes=false) | |
| constructor reading the database from a CSV file | |
| Database (const DatabaseTable &db) | |
| constructor from an already initialized database table | |
| Database (const std::string &filename, const Database &score_database, const std::vector< std::string > &missing_symbols) | |
| constructor for the priors | |
| template<typename GUM_SCALAR> | |
| Database (const std::string &filename, const gum::BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols) | |
| constructor with a BN providing the variables of interest | |
| Database (const Database &from) | |
| copy constructor | |
| Database (Database &&from) | |
| move constructor | |
| ~Database () | |
| destructor | |
Operators | |
| Database & | operator= (const Database &from) |
| copy operator | |
| Database & | operator= (Database &&from) |
| move operator | |
Accessors / Modifiers | |
| DBRowGeneratorParser & | parser () |
| returns the parser for the database | |
| const std::vector< std::size_t > & | domainSizes () const |
| returns the domain sizes of the variables | |
| const std::vector< std::string > & | names () const |
| returns the names of the variables in the database | |
| NodeId | idFromName (const std::string &var_name) const |
| returns the node id corresponding to a variable name | |
| const std::string & | nameFromId (NodeId id) const |
| returns the variable name corresponding to a given node id | |
| const DatabaseTable & | databaseTable () const |
| returns the internal database table | |
| void | setDatabaseWeight (const double new_weight) |
| assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight | |
| const Bijection< NodeId, std::size_t > & | nodeId2Columns () const |
| returns the mapping between node ids and their columns in the database | |
| const std::vector< std::string > & | missingSymbols () const |
| returns the set of missing symbols taken into account | |
| std::size_t | nbRows () const |
| returns the number of records in the database | |
| std::size_t | size () const |
| returns the number of records in the database | |
| void | setWeight (const std::size_t i, const double weight) |
| sets the weight of the ith record | |
| double | weight (const std::size_t i) const |
| returns the weight of the ith record | |
| double | weight () const |
| returns the weight of the whole database | |
Protected Attributes | |
| DatabaseTable | _database_ |
| the database itself | |
| DBRowGeneratorParser * | _parser_ {nullptr} |
| the parser used for reading the database | |
| std::vector< std::size_t > | _domain_sizes_ |
| the domain sizes of the variables (useful to speed-up computations) | |
| Bijection< NodeId, std::size_t > | _nodeId2cols_ |
| a bijection assigning to each NodeId its column in the database | |
| Size | _max_threads_number_ {gum::getNumberOfThreads()} |
| the max number of threads authorized | |
| Size | _min_nb_rows_per_thread_ {100} |
| the minimal number of rows to parse (on average) per thread | |
Private Member Functions | |
| template<typename GUM_SCALAR> | |
| BayesNet< GUM_SCALAR > | _BNVars_ () const |
a helper to easily read databases
Definition at line 123 of file IBNLearner.h.
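| gum::learning::IBNLearner::Database::Database | ( | const std::string & | file, |
| const std::vector< std::string > & | missing_symbols, | ||
| const bool | induceTypes = false ) |
explicit |
constructor reading the database from a CSV file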
| file | the name of the CSV file containing the data |
| missing_symbols | the set of symbols in the CSV file that correspond to missing data |
| induceTypes | By default, all the values in the dataset are interpreted as "labels", i.e., as categorical values. If some columns of the dataset contain only numerical values, it is usually better to tag them as integer, range or continuous variables. Setting induceTypes to true makes the BNLearner do precisely this. |
Definition at line 84 of file IBNLearner.cpp.
References Database(), gum::learning::IBNLearner::IBNLearner(), _database_, _domain_sizes_, and gum::learning::IBNLearner::readFile_().
Referenced by Database(), Database(), Database(), Database(), Database(), operator=(), and operator=().
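The induceTypes parameter described above depends on deciding whether a column holds only numerical values. The following is a minimal standalone sketch of such a test, not aGrUM's actual implementation: the helper names `isMissing`, `isNumeric` and `columnIsNumeric` are hypothetical, and the choice to skip cells matching a missing symbol is an assumption.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: true if the cell is one of the missing symbols.
bool isMissing(const std::string&              cell,
               const std::vector< std::string >& missing_symbols) {
  return std::find(missing_symbols.begin(), missing_symbols.end(), cell)
      != missing_symbols.end();
}

// Hypothetical helper: true if the whole cell parses as a number.
bool isNumeric(const std::string& cell) {
  if (cell.empty()) return false;
  char* end = nullptr;
  std::strtod(cell.c_str(), &end);
  // The cell is numeric only if strtod consumed every character.
  return end == cell.c_str() + cell.size();
}

// Assumption: a column is numerical if it has at least one observed value
// and every observed (non-missing) value is a number.
bool columnIsNumeric(const std::vector< std::string >& column,
                     const std::vector< std::string >& missing_symbols) {
  bool seen_value = false;
  for (const auto& cell: column) {
    if (isMissing(cell, missing_symbols)) continue;
    if (!isNumeric(cell)) return false;
    seen_value = true;
  }
  return seen_value;
}
```

With missing symbol "?", a column such as "1", "2.5", "?" would be treated as numerical under these assumptions, whereas a column containing "yes" would remain a set of labels.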
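| gum::learning::IBNLearner::Database::Database | ( | const DatabaseTable & | db | ) |
explicit |
constructor from an already initialized database table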
| db | an already initialized database table that is used to fill the Database |
Definition at line 70 of file IBNLearner.cpp.
References _database_, _domain_sizes_, _nodeId2cols_, and _parser_.
| gum::learning::IBNLearner::Database::Database | ( | const std::string & | filename, |
| const Database & | score_database, | ||
| const std::vector< std::string > & | missing_symbols ) |
constructor for the priors
We must ensure that the variables of the Database are identical to those of the score database (else the counts used by the scores might be erroneous). However, we allow the variables to be ordered differently in the two databases: variables with the same name in both databases are supposed to be the same.
| filename | the name of the CSV file containing the data |
| score_database | the main database used for the learning |
| missing_symbols | the set of symbols in the CSV file that correspond to missing data |
Definition at line 99 of file IBNLearner.cpp.
References Database(), _database_, _domain_sizes_, _nodeId2cols_, _parser_, databaseTable(), gum::learning::IDBInitializer::fillDatabase(), GUM_ERROR, gum::HashTable< Key, Val >::insert(), gum::learning::IBNLearner::isCSVFileName_(), gum::learning::IDatabaseTable< T_DATA >::nbVariables(), nodeId2Columns(), gum::learning::DatabaseTable::variable(), gum::learning::IDatabaseTable< T_DATA >::variableNames(), and gum::learning::IDBInitializer::variableNames().
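The name-based matching described for the priors constructor can be illustrated with a small standalone sketch. It is not the library's code: the function name `matchColumnsByName` is hypothetical, standard exceptions stand in for GUM_ERROR, and variable names are assumed to be unique within each database.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// For each column of the score database, returns the column of the prior
// database holding the variable with the same name. Throws if the two
// databases do not contain the same variables.
std::vector< std::size_t >
    matchColumnsByName(const std::vector< std::string >& score_names,
                       const std::vector< std::string >& prior_names) {
  if (score_names.size() != prior_names.size())
    throw std::invalid_argument("the databases have different numbers of variables");

  // Index the prior database's columns by variable name.
  std::unordered_map< std::string, std::size_t > prior_col;
  for (std::size_t j = 0; j < prior_names.size(); ++j)
    prior_col.emplace(prior_names[j], j);

  std::vector< std::size_t > mapping;
  mapping.reserve(score_names.size());
  for (const auto& name: score_names) {
    const auto it = prior_col.find(name);
    if (it == prior_col.end())
      throw std::invalid_argument("variable '" + name + "' is missing from the prior database");
    mapping.push_back(it->second);
  }
  return mapping;
}
```

The ordering of the columns may differ between the two inputs; only the names must agree.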
| gum::learning::IBNLearner::Database::Database | ( | const std::string & | filename, |
| const gum::BayesNet< GUM_SCALAR > & | bn, | ||
| const std::vector< std::string > & | missing_symbols ) |
constructor with a BN providing the variables of interest
| filename | the name of the CSV file containing the data |
| bn | a Bayesian network indicating which variables of the CSV file are used for learning |
| missing_symbols | the set of symbols in the CSV file that correspond to missing data |
References Database(), and weight().
| gum::learning::IBNLearner::Database::Database | ( | const Database & | from | ) |
copy constructor
Definition at line 155 of file IBNLearner.cpp.
References Database(), _database_, _domain_sizes_, _nodeId2cols_, and _parser_.
| gum::learning::IBNLearner::Database::Database | ( | Database && | from | ) |
move constructor
Definition at line 162 of file IBNLearner.cpp.
References Database(), _database_, _domain_sizes_, _nodeId2cols_, and _parser_.
| gum::learning::IBNLearner::Database::~Database | ( | ) |
| gum::learning::IBNLearner::Database::Database | ( | const std::string & | filename, |
| const BayesNet< GUM_SCALAR > & | bn, | ||
| const std::vector< std::string > & | missing_symbols ) |
Definition at line 50 of file IBNLearner_tpl.h.
References _database_, _domain_sizes_, _nodeId2cols_, _parser_, gum::learning::IDBInitializer::fillDatabase(), GUM_ERROR, gum::HashTable< Key, Val >::insert(), gum::learning::IBNLearner::isCSVFileName_(), gum::Variable::name(), and gum::learning::IDBInitializer::variableNames().
| BayesNet< GUM_SCALAR > gum::learning::IBNLearner::Database::_BNVars_ | ( | ) | const |
private |
Definition at line 91 of file IBNLearner_tpl.h.
References _database_.
| INLINE const DatabaseTable & gum::learning::IBNLearner::Database::databaseTable | ( | ) | const |
returns the internal database table
Definition at line 101 of file IBNLearner_inl.h.
References _database_.
Referenced by Database().
| INLINE const std::vector< std::size_t > & gum::learning::IBNLearner::Database::domainSizes | ( | ) | const |
returns the domain sizes of the variables
Definition at line 63 of file IBNLearner_inl.h.
References _domain_sizes_.
| INLINE NodeId gum::learning::IBNLearner::Database::idFromName | ( | const std::string & | var_name | ) | const |
returns the node id corresponding to a variable name
Definition at line 80 of file IBNLearner_inl.h.
References _database_, _nodeId2cols_, and GUM_ERROR.
| INLINE const std::vector< std::string > & gum::learning::IBNLearner::Database::missingSymbols | ( | ) | const |
returns the set of missing symbols taken into account
Definition at line 104 of file IBNLearner_inl.h.
References _database_.
| INLINE const std::string & gum::learning::IBNLearner::Database::nameFromId | ( | NodeId | id | ) | const |
returns the variable name corresponding to a given node id
Definition at line 91 of file IBNLearner_inl.h.
References _database_, _nodeId2cols_, and GUM_ERROR.
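To show how idFromName() and nameFromId() can be answered from a list of column names plus a node id/column bijection, here is a simplified standalone model. The struct `ColumnIndex`, its members, and the `NodeId` alias are stand-ins invented for this sketch, and std::out_of_range replaces the exceptions raised through GUM_ERROR.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

using NodeId = std::size_t;  // stand-in for gum::NodeId

// Simplified model: names are stored per column, and two maps play the
// role of the bijection between node ids and columns.
struct ColumnIndex {
  std::vector< std::string >                names;   // names[col]
  std::unordered_map< NodeId, std::size_t > id2col;  // node id -> column
  std::unordered_map< std::size_t, NodeId > col2id;  // column  -> node id

  // Appends a new column holding variable `name` with node id `id`.
  void add(NodeId id, const std::string& name) {
    const std::size_t col = names.size();
    names.push_back(name);
    id2col.emplace(id, col);
    col2id.emplace(col, id);
  }

  // Name -> column (linear search) -> node id.
  NodeId idFromName(const std::string& name) const {
    for (std::size_t col = 0; col < names.size(); ++col)
      if (names[col] == name) return col2id.at(col);
    throw std::out_of_range("unknown variable name: " + name);
  }

  // Node id -> column -> name.
  const std::string& nameFromId(NodeId id) const {
    const auto it = id2col.find(id);
    if (it == id2col.end()) throw std::out_of_range("unknown node id");
    return names[it->second];
  }
};
```

Node ids need not coincide with column indices, which is why the lookup goes through the bijection in both directions.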
| INLINE const std::vector< std::string > & gum::learning::IBNLearner::Database::names | ( | ) | const |
returns the names of the variables in the database
Definition at line 68 of file IBNLearner_inl.h.
References _database_.
| INLINE std::size_t gum::learning::IBNLearner::Database::nbRows | ( | ) | const |
returns the number of records in the database
Definition at line 114 of file IBNLearner_inl.h.
References _database_.
| INLINE const Bijection< NodeId, std::size_t > & gum::learning::IBNLearner::Database::nodeId2Columns | ( | ) | const |
returns the mapping between node ids and their columns in the database
Definition at line 109 of file IBNLearner_inl.h.
References _nodeId2cols_.
Referenced by Database().
| IBNLearner::Database & gum::learning::IBNLearner::Database::operator= | ( | const Database & | from | ) |
copy operator
Definition at line 171 of file IBNLearner.cpp.
References Database(), _database_, _domain_sizes_, _nodeId2cols_, and _parser_.
| IBNLearner::Database & gum::learning::IBNLearner::Database::operator= | ( | Database && | from | ) |
move operator
Definition at line 185 of file IBNLearner.cpp.
References Database(), _database_, _domain_sizes_, _nodeId2cols_, and _parser_.
| INLINE DBRowGeneratorParser & gum::learning::IBNLearner::Database::parser | ( | ) |
returns the parser for the database
Definition at line 60 of file IBNLearner_inl.h.
References _parser_.
| INLINE void gum::learning::IBNLearner::Database::setDatabaseWeight | ( | const double | new_weight | ) |
assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight
assign new weight to the rows of the learning database
Definition at line 73 of file IBNLearner_inl.h.
References _database_, and weight().
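One way to satisfy the contract above (the row weights sum to new_weight) is to give every row the same weight. The sketch below models the rows as a plain vector of weights; the uniform split and the free function are assumptions made for illustration, not a transcription of the library code.

```cpp
#include <algorithm>
#include <vector>

// Assigns the same weight to every row so that the weights sum to
// new_weight. An empty database is left untouched.
void setDatabaseWeight(std::vector< double >& row_weights, double new_weight) {
  if (row_weights.empty()) return;
  const double per_row = new_weight / static_cast< double >(row_weights.size());
  std::fill(row_weights.begin(), row_weights.end(), per_row);
}
```

For example, four rows and new_weight = 2.0 give each row a weight of 0.5.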
| INLINE void gum::learning::IBNLearner::Database::setWeight | ( | const std::size_t | i, |
| const double | weight ) |
sets the weight of the ith record
| OutOfBounds | if i is outside the set of indices of the records or if the weight is negative |
Definition at line 120 of file IBNLearner_inl.h.
References _database_, and weight().
| INLINE std::size_t gum::learning::IBNLearner::Database::size | ( | ) | const |
returns the number of records in the database
Definition at line 117 of file IBNLearner_inl.h.
References _database_.
| INLINE double gum::learning::IBNLearner::Database::weight | ( | ) | const |
returns the weight of the whole database
Definition at line 130 of file IBNLearner_inl.h.
References _database_.
| INLINE double gum::learning::IBNLearner::Database::weight | ( | const std::size_t | i | ) | const |
returns the weight of the ith record
| OutOfBounds | if i is outside the set of indices of the records |
Definition at line 125 of file IBNLearner_inl.h.
References _database_.
Referenced by Database(), setDatabaseWeight(), and setWeight().
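The per-record accessors setWeight(), weight(i) and weight() can be modelled as below. This is a standalone sketch: the class `WeightedRows` is hypothetical, std::out_of_range stands in for aGrUM's OutOfBounds, and the initial weight of 1.0 per record is an assumption.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

class WeightedRows {
 public:
  // Assumption: every record starts with a weight of 1.
  explicit WeightedRows(std::size_t nb_rows) : weights_(nb_rows, 1.0) {}

  // Sets the weight of the ith record; rejects bad indices and
  // negative weights, mirroring the documented exception.
  void setWeight(std::size_t i, double weight) {
    if (i >= weights_.size() || weight < 0.0)
      throw std::out_of_range("invalid record index or negative weight");
    weights_[i] = weight;
  }

  // Weight of the ith record.
  double weight(std::size_t i) const {
    if (i >= weights_.size()) throw std::out_of_range("invalid record index");
    return weights_[i];
  }

  // Weight of the whole database: the sum of the record weights.
  double weight() const {
    return std::accumulate(weights_.begin(), weights_.end(), 0.0);
  }

 private:
  std::vector< double > weights_;
};
```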
| DatabaseTable gum::learning::IBNLearner::Database::_database_ |
protected |
the database itself
Definition at line 259 of file IBNLearner.h.
Referenced by Database(), Database(), Database(), Database(), Database(), Database(), _BNVars_(), databaseTable(), idFromName(), missingSymbols(), nameFromId(), names(), nbRows(), operator=(), operator=(), setDatabaseWeight(), setWeight(), size(), weight(), and weight().
| std::vector< std::size_t > gum::learning::IBNLearner::Database::_domain_sizes_ |
protected |
the domain sizes of the variables (useful to speed-up computations)
Definition at line 265 of file IBNLearner.h.
Referenced by Database(), Database(), Database(), Database(), Database(), Database(), domainSizes(), operator=(), and operator=().
| Size gum::learning::IBNLearner::Database::_max_threads_number_ {gum::getNumberOfThreads()} |
protected |
the max number of threads authorized
Definition at line 271 of file IBNLearner.h.
| Size gum::learning::IBNLearner::Database::_min_nb_rows_per_thread_ {100} |
protected |
the minimal number of rows to parse (on average) per thread
Definition at line 274 of file IBNLearner.h.
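The two threading attributes (a maximum number of threads and a minimal average number of rows per thread) suggest how a thread count could be derived from the size of the database. The function below is only a plausible sketch of that combination; `nbThreadsFor` is a hypothetical name and the formula is an assumption, not taken from the aGrUM sources.

```cpp
#include <algorithm>
#include <cstddef>

// Returns a thread count that never exceeds max_threads and gives each
// thread at least min_rows_per_thread rows on average (at least 1 thread).
std::size_t nbThreadsFor(std::size_t nb_rows,
                         std::size_t max_threads,
                         std::size_t min_rows_per_thread) {
  if (max_threads == 0) max_threads = 1;                  // guard against 0
  if (min_rows_per_thread == 0) min_rows_per_thread = 1;  // avoid division by 0
  const std::size_t by_rows = nb_rows / min_rows_per_thread;
  return std::max< std::size_t >(1, std::min(max_threads, by_rows));
}
```

Under these assumptions, 250 rows with at most 8 threads and at least 100 rows per thread would be parsed by 2 threads.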
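| Bijection< NodeId, std::size_t > gum::learning::IBNLearner::Database::_nodeId2cols_ |
protected |
a bijection assigning to each NodeId its column in the database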
Definition at line 268 of file IBNLearner.h.
Referenced by Database(), Database(), Database(), Database(), Database(), idFromName(), nameFromId(), nodeId2Columns(), operator=(), and operator=().
| DBRowGeneratorParser * gum::learning::IBNLearner::Database::_parser_ {nullptr} |
protected |
the parser used for reading the database
Definition at line 262 of file IBNLearner.h.
Referenced by Database(), Database(), Database(), Database(), Database(), ~Database(), operator=(), operator=(), and parser().