aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
gum::AdaptiveRMaxPlaner Class Reference

<agrum/FMDP/planning/adaptiveRMaxPlaner.h>

#include <adaptiveRMaxPlaner.h>

Inheritance diagram for gum::AdaptiveRMaxPlaner:
Collaboration diagram for gum::AdaptiveRMaxPlaner:

Public Member Functions

Planning Methods
void initialize (const FMDP< double > *fmdp)
 Initializes the data structures needed for planning.
void makePlanning (Idx nbStep=1000000)
 Performs a value iteration.
Data structure access methods
INLINE const FMDP< double > * fmdp ()
 Returns a const pointer to the Factored Markov Decision Process on which we're planning.
INLINE const MultiDimFunctionGraph< double > * vFunction ()
 Returns a const pointer to the value function computed so far.
virtual Size vFunctionSize ()
 Returns the current size of the value function computed so far.
INLINE MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optimalPolicy ()
 Returns the best policy obtained so far.
virtual Size optimalPolicySize ()
 Returns the current size of the optimal policy computed so far.
std::string optimalPolicy2String ()
 Provides a better toDot for the optimal policy, where leaves show action names instead of ids.
Planning Methods
virtual void initialize (const FMDP< double > *fmdp)
 Initializes the data structures needed for planning.

Static Public Member Functions

static AdaptiveRMaxPlaner * ReducedAndOrderedInstance (const ILearningStrategy *learner, double discountFactor=0.9, double epsilon=0.00001, bool verbose=true)
static AdaptiveRMaxPlaner * TreeInstance (const ILearningStrategy *learner, double discountFactor=0.9, double epsilon=0.00001, bool verbose=true)
static StructuredPlaner< double > * spumddInstance (double discountFactor=0.9, double epsilon=0.00001, bool verbose=true)
static StructuredPlaner< double > * sviInstance (double discountFactor=0.9, double epsilon=0.00001, bool verbose=true)

Protected Member Functions

Value Iteration Methods
virtual void initVFunction_ ()
 Initializes the value function.
virtual MultiDimFunctionGraph< double > * valueIteration_ ()
 Performs a single step of value iteration.
Optimal policy extraction methods
virtual void evalPolicy_ ()
 Performs the required tasks to extract an optimal policy.
Value Iteration Methods
virtual MultiDimFunctionGraph< double > * evalQaction_ (const MultiDimFunctionGraph< double > *, Idx)
 Performs the P(s'|s,a).V^{t-1}(s') part of the value iteration.
virtual MultiDimFunctionGraph< double > * maximiseQactions_ (std::vector< MultiDimFunctionGraph< double > * > &)
 Performs max_a Q(s,a).
virtual MultiDimFunctionGraph< double > * minimiseFunctions_ (std::vector< MultiDimFunctionGraph< double > * > &)
 Performs min_i F_i.
virtual MultiDimFunctionGraph< double > * addReward_ (MultiDimFunctionGraph< double > *function, Idx actionId=0)
 Performs the R(s) + gamma * function operation.

Protected Attributes

const FMDP< double > * fmdp_
 The Factored Markov Decision Process describing our planning situation (NB: its transitions and reward functions must be given as function graphs).
MultiDimFunctionGraph< double > * vFunction_
 The Value Function computed iteratively.
MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optimalPolicy_
 The associated optimal policy.
gum::VariableSet elVarSeq_
 A Set used to eliminate primed variables.
double discountFactor_
 Discount Factor used for infinite horizon planning.
IOperatorStrategy< double > * operator_
bool verbose_
 Boolean used to indicate whether or not iteration information should be displayed in the terminal.

Private Member Functions

void _makeRMaxFunctionGraphs_ ()
std::pair< NodeId, NodeId > _visitLearner_ (const IVisitableGraphLearner *, NodeId currentNodeId, MultiDimFunctionGraph< double > *, MultiDimFunctionGraph< double > *)
void _clearTables_ ()

Private Attributes

HashTable< Idx, MultiDimFunctionGraph< double > * > _actionsRMaxTable_
HashTable< Idx, MultiDimFunctionGraph< double > * > _actionsBoolTable_
const ILearningStrategy * _fmdpLearner_
double _rThreshold_
double _rmax_
double _threshold_
 The threshold value: whenever |V^{n} - V^{n+1}| < threshold, we consider that V ~ V*.
bool _firstTime_

Incremental methods

HashTable< Idx, StatesCounter * > _counterTable_
HashTable< Idx, bool > _initializedTable_
bool _initialized_
void checkState (const Instantiation &newState, Idx actionId)

Constructor & destructor.

 AdaptiveRMaxPlaner (IOperatorStrategy< double > *opi, double discountFactor, double epsilon, const ILearningStrategy *learner, bool verbose)
 Default constructor.
 ~AdaptiveRMaxPlaner ()
 Default destructor.

Optimal policy extraction methods

NodeId _recurArgMaxCopy_ (NodeId, Idx, const MultiDimFunctionGraph< double > *, MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > *, HashTable< NodeId, NodeId > &)
 Recursion part for the createArgMaxCopy.
NodeId _recurExtractOptPol_ (NodeId, const MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > *, HashTable< NodeId, NodeId > &)
 Recursion part for extractOptimalPolicy_.
void _transferActionIds_ (const ArgMaxSet< double, Idx > &, ActionSet &)
 Extract from an ArgMaxSet the associated ActionSet.
MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * makeArgMax_ (const MultiDimFunctionGraph< double > *Qaction, Idx actionId)
 Creates a copy of the given Qaction that can be exploited by an argmax.
virtual MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * argmaximiseQactions_ (std::vector< MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * > &)
 Performs argmax_a Q(s,a).
void extractOptimalPolicy_ (const MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > *optimalValueFunction)
 From V*(s) = argmax_a Q*(s,a), this function extracts pi*(s), mainly by extracting from each ArgMaxSet present at the leaves the associated ActionSet.

Incremental methods

void setOptimalStrategy (MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > *optPol)
virtual ActionSet stateOptimalPolicy (const Instantiation &curState)
const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol_ {nullptr}
ActionSet allActions_

Detailed Description

<agrum/FMDP/planning/adaptiveRMaxPlaner.h>

A class to find an optimal policy for a given FMDP.

Performs RMax planning on the Factored Markov Decision Process given as parameter.

Definition at line 73 of file adaptiveRMaxPlaner.h.

Constructor & Destructor Documentation

◆ AdaptiveRMaxPlaner()

gum::AdaptiveRMaxPlaner::AdaptiveRMaxPlaner ( IOperatorStrategy< double > * opi,
double discountFactor,
double epsilon,
const ILearningStrategy * learner,
bool verbose )
private

Default constructor.

Definition at line 84 of file adaptiveRMaxPlaner.cpp.

  :
    StructuredPlaner(opi, discountFactor, epsilon, verbose), IDecisionStrategy(),
    _fmdpLearner_(learner), _initialized_(false) {
  GUM_CONSTRUCTOR(AdaptiveRMaxPlaner);
}

References AdaptiveRMaxPlaner(), gum::StructuredPlaner< double >::StructuredPlaner(), _fmdpLearner_, and _initialized_.

Referenced by AdaptiveRMaxPlaner(), ~AdaptiveRMaxPlaner(), ReducedAndOrderedInstance(), and TreeInstance().


◆ ~AdaptiveRMaxPlaner()

gum::AdaptiveRMaxPlaner::~AdaptiveRMaxPlaner ( )

Default destructor.

Definition at line 97 of file adaptiveRMaxPlaner.cpp.

{
  GUM_DESTRUCTOR(AdaptiveRMaxPlaner);

  for (HashTableIteratorSafe< Idx, StatesCounter* > scIter = _counterTable_.beginSafe();
       scIter != _counterTable_.endSafe();
       ++scIter)
    delete scIter.val();
}

References AdaptiveRMaxPlaner(), and _counterTable_.


Member Function Documentation

◆ _clearTables_()

void gum::AdaptiveRMaxPlaner::_clearTables_ ( )
private

Definition at line 342 of file adaptiveRMaxPlaner.cpp.

{
  for (auto actionIter = this->fmdp()->beginActions(); actionIter != this->fmdp()->endActions();
       ++actionIter) {
    delete _actionsBoolTable_[*actionIter];
    delete _actionsRMaxTable_[*actionIter];
  }
  _actionsRMaxTable_.clear();
  _actionsBoolTable_.clear();
}

References _actionsBoolTable_, _actionsRMaxTable_, gum::FMDP< GUM_SCALAR >::endActions(), and gum::StructuredPlaner< double >::fmdp().

Referenced by makePlanning().


◆ _makeRMaxFunctionGraphs_()

void gum::AdaptiveRMaxPlaner::_makeRMaxFunctionGraphs_ ( )
private

Definition at line 243 of file adaptiveRMaxPlaner.cpp.

{
  _rThreshold_ = _fmdpLearner_->modaMax() * 5 > 30 ? _fmdpLearner_->modaMax() * 5 : 30;
  _rmax_ = _fmdpLearner_->rMax() / (1.0 - this->discountFactor_);

  for (auto actionIter = this->fmdp()->beginActions(); actionIter != this->fmdp()->endActions();
       ++actionIter) {
    std::vector< MultiDimFunctionGraph< double >* > rmaxs;
    std::vector< MultiDimFunctionGraph< double >* > boolQs;

    for (auto varIter = this->fmdp()->beginVariables(); varIter != this->fmdp()->endVariables();
         ++varIter) {
      const IVisitableGraphLearner* visited = _counterTable_[*actionIter];

      MultiDimFunctionGraph< double >* varRMax  = this->operator_->getFunctionInstance();
      MultiDimFunctionGraph< double >* varBoolQ = this->operator_->getFunctionInstance();

      visited->insertSetOfVars(varRMax);
      visited->insertSetOfVars(varBoolQ);

      std::pair< NodeId, NodeId > rooty
          = _visitLearner_(visited, visited->root(), varRMax, varBoolQ);
      varRMax->manager()->setRootNode(rooty.first);
      varRMax->manager()->reduce();
      varRMax->manager()->clean();
      varBoolQ->manager()->setRootNode(rooty.second);
      varBoolQ->manager()->reduce();
      varBoolQ->manager()->clean();

      rmaxs.push_back(varRMax);
      boolQs.push_back(varBoolQ);

      // (commented-out debug output elided)
    }

    _actionsRMaxTable_.insert(*actionIter, this->maximiseQactions_(rmaxs));
    _actionsBoolTable_.insert(*actionIter, this->minimiseFunctions_(boolQs));
  }
}

References _actionsBoolTable_, _actionsRMaxTable_, _counterTable_, _fmdpLearner_, _rmax_, _rThreshold_, _visitLearner_(), gum::MultiDimFunctionGraphManager< GUM_SCALAR, TerminalNodePolicy >::clean(), gum::StructuredPlaner< double >::discountFactor_, gum::FMDP< GUM_SCALAR >::endActions(), gum::FMDP< GUM_SCALAR >::endVariables(), gum::StructuredPlaner< double >::fmdp(), gum::IVisitableGraphLearner::insertSetOfVars(), gum::MultiDimFunctionGraph< GUM_SCALAR, TerminalNodePolicy >::manager(), gum::StructuredPlaner< double >::maximiseQactions_(), gum::StructuredPlaner< double >::minimiseFunctions_(), gum::StructuredPlaner< double >::operator_, gum::MultiDimFunctionGraphManager< GUM_SCALAR, TerminalNodePolicy >::reduce(), gum::IVisitableGraphLearner::root(), and gum::MultiDimFunctionGraphManager< GUM_SCALAR, TerminalNodePolicy >::setRootNode().

Referenced by makePlanning().


◆ _recurArgMaxCopy_()

NodeId gum::StructuredPlaner< double >::_recurArgMaxCopy_ ( NodeId currentNodeId,
Idx actionId,
const MultiDimFunctionGraph< double > * src,
MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * argMaxCpy,
HashTable< NodeId, NodeId > & visitedNodes )
private inherited

Recursion part for the createArgMaxCopy.

Definition at line 291 of file structuredPlaner_tpl.h.

{
  NodeId nody;
  if (src->isTerminalNode(currentNodeId)) {
    // ...
    nody = argMaxCpy->manager()->addTerminalNode(leaf);
  } else {
    // ...
    NodeId* sonsMap = static_cast< NodeId* >(
        SOA_ALLOCATE(sizeof(NodeId) * currentNode->nodeVar()->domainSize()));
    for (Idx moda = 0; moda < currentNode->nodeVar()->domainSize(); ++moda)
      /* ... */;
    nody = argMaxCpy->manager()->addInternalNode(currentNode->nodeVar(), sonsMap);
  }
  return nody;
}

References vFunction_.

◆ _recurExtractOptPol_()

NodeId gum::StructuredPlaner< double >::_recurExtractOptPol_ ( NodeId currentNodeId,
const MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * argMaxOptVFunc,
HashTable< NodeId, NodeId > & visitedNodes )
private inherited

Recursion part for extractOptimalPolicy_.

Definition at line 321 of file structuredPlaner_tpl.h.

{
  NodeId nody;
  if (argMaxOptVFunc->isTerminalNode(currentNodeId)) {
    // ...
    nody = optimalPolicy_->manager()->addTerminalNode(leaf);
  } else {
    // ...
    NodeId* sonsMap = static_cast< NodeId* >(
        SOA_ALLOCATE(sizeof(NodeId) * currentNode->nodeVar()->domainSize()));
    for (Idx moda = 0; moda < currentNode->nodeVar()->domainSize(); ++moda)
      /* ... */;
    nody = optimalPolicy_->manager()->addInternalNode(currentNode->nodeVar(), sonsMap);
  }
  return nody;
}

◆ _transferActionIds_()

void gum::StructuredPlaner< double >::_transferActionIds_ ( const ArgMaxSet< double, Idx > & src,
ActionSet & dest )
private inherited

Extract from an ArgMaxSet the associated ActionSet.

Definition at line 329 of file structuredPlaner_tpl.h.

{
  for (auto idi = src.beginSafe(); idi != src.endSafe(); ++idi)
    dest += *idi;
}

References evalQaction_(), fmdp_, and vFunction_.


◆ _visitLearner_()

std::pair< NodeId, NodeId > gum::AdaptiveRMaxPlaner::_visitLearner_ ( const IVisitableGraphLearner * visited,
NodeId currentNodeId,
MultiDimFunctionGraph< double > * rmax,
MultiDimFunctionGraph< double > * boolQ )
private

Definition at line 309 of file adaptiveRMaxPlaner.cpp.

{
  std::pair< NodeId, NodeId > rep;
  if (visited->isTerminal(currentNodeId)) {
    rep.first = rmax->manager()->addTerminalNode(
        visited->nodeNbObservation(currentNodeId) < _rThreshold_ ? _rmax_ : 0.0);
    rep.second = boolQ->manager()->addTerminalNode(
        visited->nodeNbObservation(currentNodeId) < _rThreshold_ ? 0.0 : 1.0);
    return rep;
  }

  auto rmaxsons = static_cast< NodeId* >(
      SOA_ALLOCATE(sizeof(NodeId) * visited->nodeVar(currentNodeId)->domainSize()));
  auto bqsons = static_cast< NodeId* >(
      SOA_ALLOCATE(sizeof(NodeId) * visited->nodeVar(currentNodeId)->domainSize()));

  for (Idx moda = 0; moda < visited->nodeVar(currentNodeId)->domainSize(); ++moda) {
    std::pair< NodeId, NodeId > sonp
        = _visitLearner_(visited, visited->nodeSon(currentNodeId, moda), rmax, boolQ);
    rmaxsons[moda] = sonp.first;
    bqsons[moda]   = sonp.second;
  }

  rep.first  = rmax->manager()->addInternalNode(visited->nodeVar(currentNodeId), rmaxsons);
  rep.second = boolQ->manager()->addInternalNode(visited->nodeVar(currentNodeId), bqsons);
  return rep;
}

References _rmax_, _rThreshold_, _visitLearner_(), gum::MultiDimFunctionGraphManager< GUM_SCALAR, TerminalNodePolicy >::addInternalNode(), gum::MultiDimFunctionGraphManager< GUM_SCALAR, TerminalNodePolicy >::addTerminalNode(), gum::DiscreteVariable::domainSize(), gum::IVisitableGraphLearner::isTerminal(), gum::MultiDimFunctionGraph< GUM_SCALAR, TerminalNodePolicy >::manager(), gum::IVisitableGraphLearner::nodeNbObservation(), gum::IVisitableGraphLearner::nodeSon(), gum::IVisitableGraphLearner::nodeVar(), and SOA_ALLOCATE.

Referenced by _makeRMaxFunctionGraphs_(), and _visitLearner_().


◆ addReward_()

MultiDimFunctionGraph< double > * gum::StructuredPlaner< double >::addReward_ ( MultiDimFunctionGraph< double > * function,
Idx actionId = 0 )
protected virtual inherited

Perform the R(s) + gamma . function.

Warning
The input function is deleted; a new one is returned.

Definition at line 256 of file structuredPlaner_tpl.h.

{
  // *****************************************************************************************
  // ... we multiply the result by the discount factor, ...
  newVFunction->copyAndMultiplyByScalar(*Vold, this->discountFactor_);
  delete Vold;

  // *****************************************************************************************
  // ... and finally add the reward.
  // ...

  return newVFunction;
}

References _firstTime_.

Referenced by gum::AdaptiveRMaxPlaner::evalPolicy_(), and gum::AdaptiveRMaxPlaner::valueIteration_().


◆ argmaximiseQactions_()

MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * gum::StructuredPlaner< double >::argmaximiseQactions_ ( std::vector< MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * > & qActionsSet)
protected virtual inherited

Performs argmax_a Q(s,a).

Warning
Also performs the deallocation of the QActions.

Definition at line 304 of file structuredPlaner_tpl.h.

Referenced by gum::AdaptiveRMaxPlaner::evalPolicy_().


◆ checkState()

void gum::AdaptiveRMaxPlaner::checkState ( const Instantiation & newState,
Idx actionId )
inline virtual

Implements gum::IDecisionStrategy.

Definition at line 222 of file adaptiveRMaxPlaner.h.

{
  if (!_initializedTable_[actionId]) {
    _counterTable_[actionId]->reset(newState);
    _initializedTable_[actionId] = true;
  } else
    _counterTable_[actionId]->incState(newState);
}

References _counterTable_, and _initializedTable_.

◆ evalPolicy_()

void gum::AdaptiveRMaxPlaner::evalPolicy_ ( )
protected virtual

Performs the required tasks to extract an optimal policy.

Reimplemented from gum::StructuredPlaner< double >.

Definition at line 204 of file adaptiveRMaxPlaner.cpp.

{
  // *****************************************************************************************
  // Loop reset
  MultiDimFunctionGraph< double >* newVFunction = operator_->getFunctionInstance();
  newVFunction->copyAndReassign(*vFunction_, fmdp_->mapMainPrime());

  std::vector< MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy >* >
      argMaxQActionsSet;
  // *****************************************************************************************
  // For each action
  for (auto actionIter = fmdp_->beginActions(); actionIter != fmdp_->endActions(); ++actionIter) {
    MultiDimFunctionGraph< double >* qAction = this->evalQaction_(newVFunction, *actionIter);

    qAction = this->addReward_(qAction, *actionIter);

    qAction = this->operator_->maximize(
        _actionsRMaxTable_[*actionIter],
        this->operator_->multiply(qAction, _actionsBoolTable_[*actionIter], 1),
        2);

    argMaxQActionsSet.push_back(makeArgMax_(qAction, *actionIter));
  }
  delete newVFunction;

  // *****************************************************************************************
  // To evaluate the main value function, we take the maximum over all action
  // values, ...
  MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy >* argMaxVFunction
      = argmaximiseQactions_(argMaxQActionsSet);

  // *****************************************************************************************
  // ... and then extract the optimal policy.
  extractOptimalPolicy_(argMaxVFunction);
}

References _actionsBoolTable_, _actionsRMaxTable_, gum::StructuredPlaner< double >::addReward_(), gum::StructuredPlaner< double >::argmaximiseQactions_(), gum::MultiDimFunctionGraph< GUM_SCALAR, TerminalNodePolicy >::copyAndReassign(), gum::StructuredPlaner< double >::evalQaction_(), gum::StructuredPlaner< double >::extractOptimalPolicy_(), gum::StructuredPlaner< double >::fmdp_, gum::StructuredPlaner< double >::makeArgMax_(), gum::StructuredPlaner< double >::operator_, and gum::StructuredPlaner< double >::vFunction_.


◆ evalQaction_()

MultiDimFunctionGraph< double > * gum::StructuredPlaner< double >::evalQaction_ ( const MultiDimFunctionGraph< double > * Vold,
Idx actionId )
protected virtual inherited

Performs the P(s'|s,a).V^{t-1}(s') part of the value iteration.

Definition at line 235 of file structuredPlaner_tpl.h.

{
  // ******************************************************************************
  // Initialisation:
  // Creating a copy of the last V function to deduce the new Qaction from,
  // and finding the first var to eliminate (the one at the end)

  return operator_->regress(Vold, actionId, this->fmdp_, this->elVarSeq_);
}

Referenced by _transferActionIds_(), gum::AdaptiveRMaxPlaner::evalPolicy_(), and gum::AdaptiveRMaxPlaner::valueIteration_().


◆ extractOptimalPolicy_()

void gum::StructuredPlaner< double >::extractOptimalPolicy_ ( const MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * optimalValueFunction)
protected inherited

From V*(s) = argmax_a Q*(s,a), this function extracts pi*(s), mainly by extracting from each ArgMaxSet present at the leaves the associated ActionSet.

Warning
Deallocates the argmax optimal value function.

Definition at line 313 of file structuredPlaner_tpl.h.

{
  optimalPolicy_->clear();

  // Inserting the new variables
  for (auto varIter = argMaxOptimalValueFunction->variablesSequence().beginSafe();
       varIter != argMaxOptimalValueFunction->variablesSequence().endSafe();
       ++varIter)
    optimalPolicy_->add(**varIter);

  // ...
  optimalPolicy_->manager()->setRootNode(
      _recurExtractOptPol_(argMaxOptimalValueFunction->root(), /* ... */ src2dest));
  // ...
}

Referenced by gum::AdaptiveRMaxPlaner::evalPolicy_().


◆ fmdp()

INLINE const FMDP< double > * gum::StructuredPlaner< double >::fmdp ( )
inline inherited

Returns a const pointer to the Factored Markov Decision Process on which we're planning.

Definition at line 148 of file structuredPlaner.h.

{ return fmdp_; }

Referenced by gum::AdaptiveRMaxPlaner::_clearTables_(), gum::AdaptiveRMaxPlaner::_makeRMaxFunctionGraphs_(), and gum::AdaptiveRMaxPlaner::initialize().


◆ initialize() [1/2]

void gum::AdaptiveRMaxPlaner::initialize ( const FMDP< double > * fmdp)
virtual

Initializes the data structures needed for planning.

Warning
Not calling this method before the first makePlanning() will result in a crash.

Implements gum::IPlanningStrategy< double >.

Definition at line 117 of file adaptiveRMaxPlaner.cpp.

{
  if (!_initialized_) {
    // ...
    for (auto actionIter = fmdp->beginActions(); actionIter != fmdp->endActions(); ++actionIter) {
      _counterTable_.insert(*actionIter, new StatesCounter());
      _initializedTable_.insert(*actionIter, false);
    }
    _initialized_ = true;
  }
}

References _counterTable_, _initialized_, _initializedTable_, gum::StructuredPlaner< double >::fmdp(), gum::IDecisionStrategy::initialize(), and gum::StructuredPlaner< GUM_SCALAR >::initialize().


◆ initialize() [2/2]

void gum::StructuredPlaner< double >::initialize ( const FMDP< double > * fmdp)
virtual inherited

Initializes the data structures needed for planning.

Warning
Not calling this method before the first makePlanning() will result in a crash.

Definition at line 197 of file structuredPlaner_tpl.h.

{
  fmdp_ = fmdp;

  // Determination of the threshold value
  // ...

  // Establishment of the variable elimination sequence
  for (auto varIter = fmdp_->beginVariables(); varIter != fmdp_->endVariables(); ++varIter)
    elVarSeq_ << fmdp_->main2prime(*varIter);

  // Initialisation of the value function
  vFunction_ = operator_->getFunctionInstance();
  optimalPolicy_ = operator_->getAggregatorInstance();
  _firstTime_ = true;
}

References gum::HashTable< Key, Val >::exists(), gum::Set< Key >::exists(), gum::HashTable< Key, Val >::insert(), and gum::InternalNode::son().


◆ initVFunction_()

void gum::AdaptiveRMaxPlaner::initVFunction_ ( )
protected virtual

Initializes the value function.

Reimplemented from gum::StructuredPlaner< double >.

Definition at line 151 of file adaptiveRMaxPlaner.cpp.

{
  vFunction_->manager()->setRootNode(vFunction_->manager()->addTerminalNode(0.0));
  for (auto actionIter = fmdp_->beginActions(); actionIter != fmdp_->endActions(); ++actionIter)
    vFunction_ = this->operator_->add(vFunction_, RECASTED(this->fmdp_->reward(*actionIter)), 1);
}

References gum::StructuredPlaner< double >::fmdp_, gum::StructuredPlaner< double >::operator_, RECASTED, and gum::StructuredPlaner< double >::vFunction_.

◆ makeArgMax_()

MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy > * gum::StructuredPlaner< double >::makeArgMax_ ( const MultiDimFunctionGraph< double > * Qaction,
Idx actionId )
protected inherited

Creates a copy of given Qaction that can be exploit by a Argmax.

Hence, this step consists in replacing each leaf by an ArgMaxSet containing the value of the leaf and the actionId of the Qaction.

Parameters
Qaction: the function graph we want to transform
actionId: the action Id associated to that graph
Warning
delete the original Qaction, returns its conversion

Definition at line 285 of file structuredPlaner_tpl.h.

{
  MultiDimFunctionGraph< ArgMaxSet< double, Idx >, SetTerminalNodePolicy >* amcpy
      = operator_->getArgMaxFunctionInstance();

  // Inserting the new variables
  for (auto varIter = qAction->variablesSequence().beginSafe();
       varIter != qAction->variablesSequence().endSafe();
       ++varIter)
    amcpy->add(**varIter);

  // ...
  amcpy->manager()->setRootNode(/* ... */);

  delete qAction;
  return amcpy;
}

References _threshold_, and verbose_.

Referenced by gum::AdaptiveRMaxPlaner::evalPolicy_().


◆ makePlanning()

void gum::AdaptiveRMaxPlaner::makePlanning ( Idx nbStep = 1000000)
virtual

Performs a value iteration.

Parameters
nbStep: enables you to specify how many value iterations you wish to perform. makePlanning will then stop either when the optimal value function is reached or when nbStep iterations have been performed.

Reimplemented from gum::StructuredPlaner< double >.

Definition at line 132 of file adaptiveRMaxPlaner.cpp.

{
  _makeRMaxFunctionGraphs_();

  StructuredPlaner< double >::makePlanning(nbStep);

  _clearTables_();
}

References _clearTables_(), _makeRMaxFunctionGraphs_(), and gum::StructuredPlaner< GUM_SCALAR >::makePlanning().


◆ maximiseQactions_()

MultiDimFunctionGraph< double > * gum::StructuredPlaner< double >::maximiseQactions_ ( std::vector< MultiDimFunctionGraph< double > * > & qActionsSet)
protected virtual inherited

Performs max_a Q(s,a).

Warning
Also performs the deallocation of the QActions.

Definition at line 242 of file structuredPlaner_tpl.h.

{
  // ...
  qActionsSet.pop_back();

  while (!qActionsSet.empty()) {
    // ...
    qActionsSet.pop_back();
  }

  return newVFunction;
}

Referenced by gum::AdaptiveRMaxPlaner::_makeRMaxFunctionGraphs_(), and gum::AdaptiveRMaxPlaner::valueIteration_().


◆ minimiseFunctions_()

MultiDimFunctionGraph< double > * gum::StructuredPlaner< double >::minimiseFunctions_ ( std::vector< MultiDimFunctionGraph< double > * > & qActionsSet)
protected virtual inherited

Performs min_i F_i.

Warning
Also performs the deallocation of the F_i.

Definition at line 249 of file structuredPlaner_tpl.h.

{
  // ...
  qActionsSet.pop_back();

  while (!qActionsSet.empty()) {
    // ...
    qActionsSet.pop_back();
  }

  return newVFunction;
}

References elVarSeq_, fmdp_, operator_, optimalPolicy_, and vFunction_.

Referenced by gum::AdaptiveRMaxPlaner::_makeRMaxFunctionGraphs_().


◆ optimalPolicy()

INLINE MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * gum::StructuredPlaner< double >::optimalPolicy ( )
inline virtual inherited

Returns the best policy obtained so far.

Implements gum::IPlanningStrategy< double >.

Definition at line 163 of file structuredPlaner.h.

{
  return optimalPolicy_;
}

◆ optimalPolicy2String()

std::string gum::StructuredPlaner< double >::optimalPolicy2String ( )
virtualinherited

Provide a better toDot for the optimal policy where the leaves have the action name instead of its id.

Implements gum::IPlanningStrategy< double >.

Definition at line 179 of file structuredPlaner_tpl.h.

{
   // ************************************************************************
   // Discarding the case where no \pi* has been computed
   if (!optimalPolicy_ || optimalPolicy_->root() == 0) return "NO OPTIMAL POLICY CALCULATED YET";

   // ************************************************************************
   // Initialisation

   // Declaration of the needed string streams
   std::stringstream output;
   std::stringstream terminalStream;
   std::stringstream nonTerminalStream;
   std::stringstream arcstream;

   // First line for the toDot
   output << std::endl << "digraph \" OPTIMAL POLICY \" {" << std::endl;

   // First lines for the internal node stream and the terminal node stream
   terminalStream << "node [shape = box];" << std::endl;
   nonTerminalStream << "node [shape = ellipse];" << std::endl;

   // For some clarity in the final string
   std::string tab = "\t";

   // To know if we already checked a node or not
   Set< NodeId > visited;

   // FIFO of nodes to visit
   std::queue< NodeId > fifo;

   // Loading the FIFO
   fifo.push(optimalPolicy_->root());
   visited << optimalPolicy_->root();

   // ************************************************************************
   // Main loop
   while (!fifo.empty()) {
      // Node to visit
      NodeId currentNodeId = fifo.front();
      fifo.pop();

      // Checking if it is terminal
      if (optimalPolicy_->isTerminalNode(currentNodeId)) {
         // Getting back the associated ActionSet
         ActionSet ase = optimalPolicy_->nodeValue(currentNodeId);

         // Creating a line for this node
         terminalStream << tab << currentNodeId << ";" << tab << currentNodeId << " [label=\""
                        << currentNodeId << " - ";

         // Enumerating and adding to the line the associated optimal actions
         for (SequenceIteratorSafe< Idx > valIter = ase.beginSafe(); valIter != ase.endSafe();
              ++valIter)
            terminalStream << fmdp_->actionName(*valIter) << " ";

         // Terminating line
         terminalStream << "\"];" << std::endl;
         continue;
      }

      // Otherwise
      {
         // Getting back the associated internal node
         const InternalNode* currentNode = optimalPolicy_->node(currentNodeId);

         // Creating a line in the internal node stream for this node
         nonTerminalStream << tab << currentNodeId << ";" << tab << currentNodeId << " [label=\""
                           << currentNodeId << " - " << currentNode->nodeVar()->name() << "\"];"
                           << std::endl;

         // Going through the sons and aggregating them according to the sons' ids
         HashTable< NodeId, LinkedList< Idx >* > sonMap;
         for (Idx sonIter = 0; sonIter < currentNode->nbSons(); ++sonIter) {
            if (!visited.exists(currentNode->son(sonIter))) {
               fifo.push(currentNode->son(sonIter));
               visited << currentNode->son(sonIter);
            }
            if (!sonMap.exists(currentNode->son(sonIter)))
               sonMap.insert(currentNode->son(sonIter), new LinkedList< Idx >());
            sonMap[currentNode->son(sonIter)]->addLink(sonIter);
         }

         // Adding to the arc stream
         for (auto sonIter = sonMap.beginSafe(); sonIter != sonMap.endSafe(); ++sonIter) {
            arcstream << tab << currentNodeId << " -> " << sonIter.key() << " [label=\" ";
            Link< Idx >* modaIter = sonIter.val()->list();
            while (modaIter) {
               arcstream << currentNode->nodeVar()->label(modaIter->element());
               if (modaIter->nextLink()) arcstream << ", ";
               modaIter = modaIter->nextLink();
            }
            arcstream << "\",color=\"#00ff00\"];" << std::endl;
            delete sonIter.val();
         }
      }
   }

   // Terminating
   output << terminalStream.str() << std::endl
          << nonTerminalStream.str() << std::endl
          << arcstream.str() << std::endl
          << "}" << std::endl;

   return output.str();
}

◆ optimalPolicySize()

virtual Size gum::StructuredPlaner< double >::optimalPolicySize ( )
inlinevirtualinherited

Returns optimalPolicy computed so far current size.

Implements gum::IPlanningStrategy< double >.

Definition at line 170 of file structuredPlaner.h.

{ return optimalPolicy_ != nullptr ? optimalPolicy_->realSize() : 0; }

◆ ReducedAndOrderedInstance()

AdaptiveRMaxPlaner * gum::AdaptiveRMaxPlaner::ReducedAndOrderedInstance ( const ILearningStrategy * learner,
double discountFactor = 0.9,
double epsilon = 0.00001,
bool verbose = true )
inlinestatic

Definition at line 83 of file adaptiveRMaxPlaner.h.

{
   return new AdaptiveRMaxPlaner(new MDDOperatorStrategy< double >(),
                                 discountFactor,
                                 epsilon,
                                 learner,
                                 verbose);
}

References AdaptiveRMaxPlaner().

Referenced by gum::SDYNA::RMaxMDDInstance().


◆ setOptimalStrategy()

void gum::IDecisionStrategy::setOptimalStrategy ( MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol)
inlineinherited

Definition at line 111 of file IDecisionStrategy.h.

{ optPol_ = optPol; }

References optPol_.

◆ spumddInstance()

StructuredPlaner< double > * gum::StructuredPlaner< double >::spumddInstance ( double discountFactor = 0.9,
double epsilon = 0.00001,
bool verbose = true )
inlinestaticinherited

Definition at line 92 of file structuredPlaner.h.

◆ stateOptimalPolicy()

virtual ActionSet gum::IDecisionStrategy::stateOptimalPolicy ( const Instantiation & curState)
inlinevirtualinherited

Reimplemented in gum::E_GreedyDecider, and gum::RandomDecider.

Definition at line 115 of file IDecisionStrategy.h.

{ return (optPol_ && optPol_->realSize() != 0) ? optPol_->get(curState) : allActions_; }

References allActions_, and optPol_.

Referenced by gum::E_GreedyDecider::stateOptimalPolicy().


◆ sviInstance()

StructuredPlaner< double > * gum::StructuredPlaner< double >::sviInstance ( double discountFactor = 0.9,
double epsilon = 0.00001,
bool verbose = true )
inlinestaticinherited

Definition at line 104 of file structuredPlaner.h.

◆ TreeInstance()

AdaptiveRMaxPlaner * gum::AdaptiveRMaxPlaner::TreeInstance ( const ILearningStrategy * learner,
double discountFactor = 0.9,
double epsilon = 0.00001,
bool verbose = true )
inlinestatic

Definition at line 97 of file adaptiveRMaxPlaner.h.

{
   return new AdaptiveRMaxPlaner(new TreeOperatorStrategy< double >(),
                                 discountFactor,
                                 epsilon,
                                 learner,
                                 verbose);
}

References AdaptiveRMaxPlaner().

Referenced by gum::SDYNA::RMaxTreeInstance().


◆ valueIteration_()

MultiDimFunctionGraph< double > * gum::AdaptiveRMaxPlaner::valueIteration_ ( )
protectedvirtual

Performs a single step of value iteration.

Reimplemented from gum::StructuredPlaner< double >.

Definition at line 160 of file adaptiveRMaxPlaner.cpp.

{
   // *****************************************************************************************
   // Loop reset
   MultiDimFunctionGraph< double >* newVFunction = operator_->getFunctionInstance();
   newVFunction->copyAndReassign(*vFunction_, fmdp_->mapMainPrime());

   // *****************************************************************************************
   // For each action
   std::vector< MultiDimFunctionGraph< double >* > qActionsSet;
   for (auto actionIter = fmdp_->beginActions(); actionIter != fmdp_->endActions(); ++actionIter) {
      MultiDimFunctionGraph< double >* qAction = evalQaction_(newVFunction, *actionIter);

      // Next, we add the reward
      qAction = addReward_(qAction, *actionIter);

      qAction = this->operator_->maximize(
          _actionsRMaxTable_[*actionIter],
          this->operator_->multiply(qAction, _actionsBoolTable_[*actionIter], 1),
          2);

      qActionsSet.push_back(qAction);
   }
   delete newVFunction;

   // *****************************************************************************************
   // To evaluate the main value function, we maximise over all action values
   newVFunction = maximiseQactions_(qActionsSet);

   return newVFunction;
}

References _actionsBoolTable_, _actionsRMaxTable_, gum::StructuredPlaner< double >::addReward_(), gum::MultiDimFunctionGraph< GUM_SCALAR, TerminalNodePolicy >::copyAndReassign(), gum::StructuredPlaner< double >::evalQaction_(), gum::StructuredPlaner< double >::fmdp_, gum::StructuredPlaner< double >::maximiseQactions_(), gum::StructuredPlaner< double >::operator_, and gum::StructuredPlaner< double >::vFunction_.


◆ vFunction()

INLINE const MultiDimFunctionGraph< double > * gum::StructuredPlaner< double >::vFunction ( )
inlineinherited

Returns a const ptr on the value function computed so far.

Definition at line 153 of file structuredPlaner.h.

{ return vFunction_; }

◆ vFunctionSize()

virtual Size gum::StructuredPlaner< double >::vFunctionSize ( )
inlinevirtualinherited

Returns vFunction computed so far current size.

Implements gum::IPlanningStrategy< double >.

Definition at line 158 of file structuredPlaner.h.

{ return vFunction_ != nullptr ? vFunction_->realSize() : 0; }

Member Data Documentation

◆ _actionsBoolTable_

HashTable< Idx, MultiDimFunctionGraph< double >* > gum::AdaptiveRMaxPlaner::_actionsBoolTable_
private

◆ _actionsRMaxTable_

HashTable< Idx, MultiDimFunctionGraph< double >* > gum::AdaptiveRMaxPlaner::_actionsRMaxTable_
private

◆ _counterTable_

HashTable< Idx, StatesCounter* > gum::AdaptiveRMaxPlaner::_counterTable_
private

◆ _firstTime_

bool gum::StructuredPlaner< double >::_firstTime_
privateinherited

Definition at line 382 of file structuredPlaner.h.

Referenced by addReward_().

◆ _fmdpLearner_

const ILearningStrategy* gum::AdaptiveRMaxPlaner::_fmdpLearner_
private

Definition at line 210 of file adaptiveRMaxPlaner.h.

Referenced by AdaptiveRMaxPlaner(), and _makeRMaxFunctionGraphs_().

◆ _initialized_

bool gum::AdaptiveRMaxPlaner::_initialized_
private

Definition at line 233 of file adaptiveRMaxPlaner.h.

Referenced by AdaptiveRMaxPlaner(), and initialize().

◆ _initializedTable_

HashTable< Idx, bool > gum::AdaptiveRMaxPlaner::_initializedTable_
private

Definition at line 231 of file adaptiveRMaxPlaner.h.

Referenced by checkState(), and initialize().

◆ _rmax_

double gum::AdaptiveRMaxPlaner::_rmax_
private

Definition at line 213 of file adaptiveRMaxPlaner.h.

Referenced by _makeRMaxFunctionGraphs_(), and _visitLearner_().

◆ _rThreshold_

double gum::AdaptiveRMaxPlaner::_rThreshold_
private

Definition at line 212 of file adaptiveRMaxPlaner.h.

Referenced by _makeRMaxFunctionGraphs_(), and _visitLearner_().

◆ _threshold_

double gum::StructuredPlaner< double >::_threshold_
privateinherited

The threshold value. Whenever |V^{n} - V^{n+1}| < threshold, we consider that V ≈ V*.

Definition at line 381 of file structuredPlaner.h.

Referenced by evalPolicy_(), and makeArgMax_().

◆ allActions_

ActionSet gum::IDecisionStrategy::allActions_
protectedinherited

◆ discountFactor_

double gum::StructuredPlaner< double >::discountFactor_
protectedinherited

Discount Factor used for infinite horizon planning.

Definition at line 365 of file structuredPlaner.h.

Referenced by gum::AdaptiveRMaxPlaner::_makeRMaxFunctionGraphs_().

◆ elVarSeq_

gum::VariableSet gum::StructuredPlaner< double >::elVarSeq_
protectedinherited

A set used to eliminate primed variables.

Definition at line 360 of file structuredPlaner.h.

Referenced by minimiseFunctions_().

◆ fmdp_

const FMDP< double >* gum::StructuredPlaner< double >::fmdp_
protectedinherited

The Factored Markov Decision Process describing our planning situation (NB: it must have function graphs as transition and reward functions).

Definition at line 340 of file structuredPlaner.h.

Referenced by ~StructuredPlaner(), _transferActionIds_(), gum::AdaptiveRMaxPlaner::evalPolicy_(), gum::AdaptiveRMaxPlaner::initVFunction_(), minimiseFunctions_(), and gum::AdaptiveRMaxPlaner::valueIteration_().

◆ operator_

◆ optimalPolicy_

The associated optimal policy.

Warning
Leaves are ActionSets which contain the ids of the best actions. While this is sufficient for exploitation, some translation from the fmdp_ is required for a human to understand it; optimalPolicy2String does this job.

Definition at line 355 of file structuredPlaner.h.

Referenced by ~StructuredPlaner(), and minimiseFunctions_().

◆ optPol_

const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy >* gum::IDecisionStrategy::optPol_ {nullptr}
protectedinherited

Definition at line 121 of file IDecisionStrategy.h.


Referenced by initialize(), setOptimalStrategy(), and stateOptimalPolicy().

◆ verbose_

bool gum::StructuredPlaner< double >::verbose_
protectedinherited

Boolean indicating whether or not iteration information should be displayed in the terminal.

Definition at line 373 of file structuredPlaner.h.

Referenced by makeArgMax_().

◆ vFunction_


The documentation for this class was generated from the following files: