Strategy Base (Partial Observability)



POstrategybase

 POstrategybase (env, learning_rates, discount_factors,
                 choice_intensities=1, **kwargs)

Base class for deterministic, policy-average, independent (multi-agent), partially observable temporal-difference reinforcement learning in policy space.



POstrategybase.random_softmax_policy

 POstrategybase.random_softmax_policy ()

Softmax policy with random probabilities.
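
To illustrate what a softmax policy with random probabilities can look like, here is a minimal stand-alone sketch in plain Python. It does not call the library: the function below is a hypothetical helper with the same name, and the layout `X[agent][observation][action]`, the explicit size arguments, the `seed` parameter, and the use of standard-normal logits are all assumptions made for this example.

```python
import math
import random


def random_softmax_policy(n_agents, n_observations, n_actions, seed=None):
    """Return a nested list X[i][o][a] of action probabilities.

    Hypothetical stand-alone helper, not the library method.
    """
    rng = random.Random(seed)
    policy = []
    for _ in range(n_agents):
        agent_policy = []
        for _ in range(n_observations):
            # Draw one random logit per action (assumed: standard normal).
            logits = [rng.gauss(0.0, 1.0) for _ in range(n_actions)]
            # Softmax, with the maximum subtracted for numerical stability.
            peak = max(logits)
            weights = [math.exp(z - peak) for z in logits]
            total = sum(weights)
            agent_policy.append([w / total for w in weights])
        policy.append(agent_policy)
    return policy


X = random_softmax_policy(n_agents=2, n_observations=3, n_actions=2, seed=0)
```

Each innermost list is a probability distribution over actions, so it is strictly positive and sums to one for every agent and observation.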



POstrategybase.zero_intelligence_policy

 POstrategybase.zero_intelligence_policy ()

Policy that assigns equal probability to every action.
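
As a rough sketch of the idea, a policy with equal probabilities gives each of the available actions probability one over the number of actions. The helper below is hypothetical and independent of the library; the layout `X[agent][observation][action]` and the explicit size arguments are assumptions for illustration.

```python
def zero_intelligence_policy(n_agents, n_observations, n_actions):
    """Return a nested list X[i][o][a] with uniform action probabilities.

    Hypothetical stand-alone helper, not the library method.
    """
    p = 1.0 / n_actions  # every action is equally likely
    return [
        [[p] * n_actions for _ in range(n_observations)]
        for _ in range(n_agents)
    ]


X0 = zero_intelligence_policy(n_agents=2, n_observations=3, n_actions=4)
```

With four actions, every agent chooses each action with probability 0.25 under every observation.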