Strategy Base
Base class containing the core methods of CRLD agents in strategy space
strategybase
strategybase (env, learning_rates:Union[float,Iterable], discount_factors:Union[float,Iterable], choice_intensities:Union[float,Iterable]=1.0, use_prefactor=False, opteinsum=True, **kwargs)
Base class for deterministic strategy-average independent (multi-agent) temporal-difference reinforcement learning in strategy space.
| | Type | Default | Details |
|---|---|---|---|
| env | | | An environment object |
| learning_rates | Union[float, Iterable] | | agents’ learning rates |
| discount_factors | Union[float, Iterable] | | agents’ discount factors |
| choice_intensities | Union[float, Iterable] | 1.0 | agents’ choice intensities |
| use_prefactor | bool | False | use the 1-DiscountFactor prefactor |
| opteinsum | bool | True | optimize einsum functions |
| kwargs | | | |
Further optional parameters inherited from `abase`:
| | Type | Default | Details |
|---|---|---|---|
| use_prefactor | bool | False | use the 1-DiscountFactor prefactor |
| opteinsum | bool | True | optimize einsum functions |
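A minimal instantiation sketch is shown below; the import path is an assumption and `env` stands in for any environment object provided by the library.

```python
# Illustrative sketch only: the module path below is an assumption and may
# differ in your installation; `env` is any environment object of the library.
from pyCRLD.Agents.StrategyBase import strategybase  # assumed import path

agents = strategybase(env,                   # environment object
                      learning_rates=0.05,   # single float or one value per agent
                      discount_factors=0.9,
                      choice_intensities=1.0,
                      use_prefactor=False,
                      opteinsum=True)
```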
strategybase.step
strategybase.step (Xisa)
Performs a learning step along the reward-prediction/temporal-difference error in strategy space, given joint strategy Xisa.
| | Type | Details |
|---|---|---|
| Xisa | | Joint strategy |
| Returns | tuple | (Updated joint strategy, Prediction error) |
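A hedged sketch of a typical learning loop built on `step`; the convergence check via NumPy and the assumption that Xisa is an array are illustrations, not part of the class.

```python
import numpy as np

# Illustrative learning loop: iterate the strategy-space dynamics until the
# joint strategy stops changing (tolerance chosen arbitrarily here).
X = agents.random_softmax_strategy()       # random initial joint strategy
for _ in range(1000):
    X_next, TDe = agents.step(X)           # (updated joint strategy, prediction error)
    if np.allclose(X, X_next, atol=1e-8):  # assumed: Xisa is a NumPy array
        break
    X = X_next
```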
strategybase.reverse_step
strategybase.reverse_step (Xisa)
Performs a reverse learning step in strategy space, given joint strategy Xisa.
This is useful to compute the separatrix of a multistable regime.
| | Type | Details |
|---|---|---|
| Xisa | | Joint strategy |
| Returns | tuple | (Updated joint strategy, Prediction error) |
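A sketch of how `reverse_step` might be used to trace trajectories backwards against the learning dynamics, e.g. when approximating the separatrix of a multistable regime; the starting point and iteration count are arbitrary assumptions.

```python
# Illustrative backwards iteration: starting from some joint strategy,
# repeated reverse steps move against the forward learning dynamics.
X = agents.random_softmax_strategy()
for _ in range(500):
    X, TDe = agents.reverse_step(X)  # (updated joint strategy, prediction error)
```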
strategybase.zero_intelligence_strategy
strategybase.zero_intelligence_strategy ()
Returns strategy Xisa with equal action probabilities.
strategybase.random_softmax_strategy
strategybase.random_softmax_strategy ()
Returns softmax strategy Xisa with random action probabilities.
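The two helpers above give convenient starting points for a learning loop; a short usage sketch:

```python
# Two ways to obtain an initial joint strategy Xisa (usage as documented above).
X0_uniform = agents.zero_intelligence_strategy()  # equal action probabilities
X0_random  = agents.random_softmax_strategy()     # random softmax strategy
```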
strategybase.id
strategybase.id ()
Returns an identifier to handle simulation runs.
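One plausible use of the identifier is to label stored results of a simulation run; the filename scheme below is just an illustration.

```python
# Illustrative: use the identifier to label output files of a simulation run.
run_id = agents.id()
outfile = f"run_{run_id}.npz"  # e.g., a filename for stored trajectories
```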