aGrUM 2.3.2
a C++ library for (probabilistic) graphical models
gum::E_GreedyDecider Class Reference

<agrum/FMDP/decision/E_GreedyDecider.h>

#include <E_GreedyDecider.h>

Public Member Functions

Constructor & destructor.
 E_GreedyDecider ()
 Constructor.
 ~E_GreedyDecider ()
 Destructor.
Initialization
void initialize (const FMDP< double > *fmdp)
 Initializes the learner.
Incremental methods
void checkState (const Instantiation &newState, Idx actionId)
ActionSet stateOptimalPolicy (const Instantiation &curState)

Private Attributes

StatesChecker _statecpt_
double _sss_

Incremental methods

void setOptimalStrategy (MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > *optPol)
const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol_ {nullptr}
ActionSet allActions_

Detailed Description

<agrum/FMDP/decision/E_GreedyDecider.h>

Class that makes decisions following an epsilon-greedy compromise between exploration and exploitation.

Definition at line 73 of file E_GreedyDecider.h.

Constructor & Destructor Documentation

◆ E_GreedyDecider()

gum::E_GreedyDecider::E_GreedyDecider ( )

Constructor.

Definition at line 69 of file E_GreedyDecider.cpp.

E_GreedyDecider::E_GreedyDecider() {
  GUM_CONSTRUCTOR(E_GreedyDecider);

  // State-space size starts at 1 and is multiplied up in initialize().
  _sss_ = 1.0;
}

References E_GreedyDecider(), and _sss_.

Referenced by E_GreedyDecider(), and ~E_GreedyDecider().

◆ ~E_GreedyDecider()

gum::E_GreedyDecider::~E_GreedyDecider ( )

Destructor.

Definition at line 80 of file E_GreedyDecider.cpp.

E_GreedyDecider::~E_GreedyDecider() {
  GUM_DESTRUCTOR(E_GreedyDecider);
}

References E_GreedyDecider().

Member Function Documentation

◆ checkState()

void gum::E_GreedyDecider::checkState ( const Instantiation & newState,
Idx actionId )
virtual

Implements gum::IDecisionStrategy.

Definition at line 111 of file E_GreedyDecider.cpp.

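// reachedState is the parameter listed as newState in the declaration above.
void E_GreedyDecider::checkState(const Instantiation& reachedState, Idx actionId) {
  // First visited state: reset the checker with it. Afterwards, add only unseen states.
  if (_statecpt_.nbVisitedStates() == 0) _statecpt_.reset(reachedState);
  else if (!_statecpt_.checkState(reachedState)) _statecpt_.addState(reachedState);
}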

References _statecpt_.
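
The bookkeeping done by checkState() can be illustrated without aGrUM. The sketch below is only an assumption-laden stand-in: `VisitedStates` and `State` are hypothetical names, and a std::set of value vectors replaces StatesChecker, whose actual implementation is not shown on this page.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a state: one integer value per variable.
using State = std::vector<int>;

// Hypothetical stand-in for StatesChecker: remembers each distinct state once.
class VisitedStates {
 public:
  // Records a state; returns true if it had not been seen before.
  bool visit(const State& s) { return seen_.insert(s).second; }

  // Number of distinct states recorded so far.
  std::size_t count() const { return seen_.size(); }

 private:
  std::set<State> seen_;
};
```

Visiting the same state twice leaves the count unchanged, which is the property the exploration threshold in stateOptimalPolicy() relies on.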

◆ initialize()

void gum::E_GreedyDecider::initialize ( const FMDP< double > * fmdp)
virtual

Initializes the learner.

Reimplemented from gum::IDecisionStrategy.

Definition at line 94 of file E_GreedyDecider.cpp.

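void E_GreedyDecider::initialize(const FMDP< double >* fmdp) {
  // Base-class initialization (listed under References below).
  IDecisionStrategy::initialize(fmdp);
  // Multiply the domain sizes of all variables to get the state-space size.
  for (auto varIter = fmdp->beginVariables(); varIter != fmdp->endVariables(); ++varIter)
    _sss_ *= (double)(*varIter)->domainSize();
}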

References _sss_, gum::FMDP< GUM_SCALAR >::beginVariables(), gum::FMDP< GUM_SCALAR >::endVariables(), and gum::IDecisionStrategy::initialize().
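
The quantity accumulated in _sss_ is the product of the variables' domain sizes, that is, the number of joint states. A minimal sketch of that computation follows, assuming a plain vector of domain sizes in place of the FMDP variable iterators. `stateSpaceSize` is a hypothetical helper, not part of aGrUM.

```cpp
#include <cstddef>
#include <vector>

// Product of the domain sizes: the number of joint states of the variables.
// The vector is a hypothetical stand-in for iterating over an FMDP's variables.
double stateSpaceSize(const std::vector<std::size_t>& domainSizes) {
  double size = 1.0;  // same starting value the constructor gives _sss_
  for (std::size_t d : domainSizes) size *= static_cast<double>(d);
  return size;
}
```

For example, three variables with 2, 3 and 4 values give 24 joint states. Note that the listing above multiplies into _sss_ without resetting it, so the value is only the state-space size if initialize() runs once per object.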

◆ setOptimalStrategy()

void gum::IDecisionStrategy::setOptimalStrategy ( MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol)
inline, inherited

Definition at line 111 of file IDecisionStrategy.h.

void setOptimalStrategy(MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy >* optPol) {
  optPol_ = optPol;
}

References optPol_.

◆ stateOptimalPolicy()

ActionSet gum::E_GreedyDecider::stateOptimalPolicy ( const Instantiation & curState)
virtual

Reimplemented from gum::IDecisionStrategy.

Definition at line 122 of file E_GreedyDecider.cpp.

ActionSet E_GreedyDecider::stateOptimalPolicy(const Instantiation& curState) {
  // Exploration threshold: cube of the fraction of unvisited states, floored at 0.1.
  double explo = randomProba();
  double temp = std::pow((_sss_ - (double)_statecpt_.nbVisitedStates()) / _sss_, 3.0);
  double exploThreshold = temp < 0.1 ? 0.1 : temp;

  // Exploit: return the optimal action set for the current state.
  ActionSet optimalSet = IDecisionStrategy::stateOptimalPolicy(curState);
  if (explo > exploThreshold) { return optimalSet; }

  // Explore: return the non-optimal actions when there are any.
  if (allActions_.size() > optimalSet.size()) {
    ActionSet ret(allActions_);
    ret -= optimalSet;
    return ret;
  }

  // Every action is optimal: return them all.
  return allActions_;
}

References _sss_, _statecpt_, gum::IDecisionStrategy::allActions_, gum::randomProba(), gum::ActionSet::size(), and gum::IDecisionStrategy::stateOptimalPolicy().
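
The decision rule in the listing can be sketched outside aGrUM as follows. This is an illustration under stated assumptions, not the library's code: `Actions`, `explorationThreshold` and `chooseActions` are hypothetical names, a std::set<int> stands in for ActionSet, and the random draw is passed in as a parameter instead of being taken from randomProba().

```cpp
#include <algorithm>
#include <cmath>
#include <iterator>
#include <set>

using Actions = std::set<int>;  // hypothetical stand-in for ActionSet

// Exploration threshold: cube of the unvisited fraction of states, floored at 0.1.
double explorationThreshold(double stateSpaceSize, double visitedStates) {
  double unvisited = (stateSpaceSize - visitedStates) / stateSpaceSize;
  return std::max(0.1, std::pow(unvisited, 3.0));
}

// draw: a number in [0, 1], supplied by the caller so the sketch is deterministic.
Actions chooseActions(const Actions& all, const Actions& optimal,
                      double draw, double threshold) {
  if (draw > threshold) return optimal;  // exploit
  if (all.size() > optimal.size()) {     // explore among the non-optimal actions
    Actions rest;
    std::set_difference(all.begin(), all.end(), optimal.begin(), optimal.end(),
                        std::inserter(rest, rest.end()));
    return rest;
  }
  return all;  // every action is optimal: nothing else to explore
}
```

With 100 states of which 50 have been visited, the threshold is 0.5^3 = 0.125. Once more than roughly 54% of the states have been visited, the cube drops below 0.1 and the floor applies, so exploration never stops entirely.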

Member Data Documentation

◆ _sss_

double gum::E_GreedyDecider::_sss_
private

Definition at line 120 of file E_GreedyDecider.h.

Referenced by E_GreedyDecider(), initialize(), and stateOptimalPolicy().

◆ _statecpt_

StatesChecker gum::E_GreedyDecider::_statecpt_
private

Definition at line 119 of file E_GreedyDecider.h.

Referenced by checkState(), and stateOptimalPolicy().

◆ allActions_

ActionSet gum::IDecisionStrategy::allActions_
protected, inherited

◆ optPol_

const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy >* gum::IDecisionStrategy::optPol_ {nullptr}
protected, inherited

Definition at line 121 of file IDecisionStrategy.h.

Referenced by initialize(), setOptimalStrategy(), and stateOptimalPolicy().


The documentation for this class was generated from the following files: