Risk Reward Dilemma
This class models a two-state social dilemma where a single agent chooses between risky and cautious actions. The actions taken by the agent determine the probability of transitioning between the prosperous and degraded states. In each state, the agent receives different rewards, reflecting the consequences of its chosen action.
Implementation
RiskReward
RiskReward (pc:float, pr:float, rs:float, rr:float, rd:float)
An MDP model for decision-making under uncertainty with two states (prosperous and degraded) and two actions (cautious and risky).
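The parameter meanings are not stated explicitly, but judging from the tensors in the example below: pc is the probability that a risky action collapses the prosperous state, pr the probability that a cautious action lets the degraded state recover, rs the reward for acting cautiously in the prosperous state, rr the reward for acting riskily while the prosperous state persists, and rd the reward received in the degraded state.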
RiskReward.TransitionTensor
RiskReward.TransitionTensor ()
Define the Transition Tensor for the MDP.
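The exact implementation is not shown here, but the following minimal sketch reproduces the tensor in the example below, assuming the index order T[s, a, s'] with state 0 prosperous and action 0 cautious:

import numpy as np

def transition_tensor(pc, pr):
    # T[s, a, s']: s = 0 prosperous, s = 1 degraded; a = 0 cautious, a = 1 risky
    T = np.zeros((2, 2, 2))
    T[0, 0] = [1.0, 0.0]    # cautious while prosperous: the state is certain to persist
    T[0, 1] = [1 - pc, pc]  # risky while prosperous: collapses with probability pc
    T[1, 0] = [pr, 1 - pr]  # cautious while degraded: recovers with probability pr
    T[1, 1] = [0.0, 1.0]    # risky while degraded: the state stays degraded
    return T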
RiskReward.RewardTensor
RiskReward.RewardTensor ()
Define the Reward Tensor for the MDP.
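Again only a sketch rather than the library's code: a reward tensor consistent with the example output below, in the index order R[i, s, a, s'] for the single agent i = 0:

def reward_tensor(rs, rr, rd):
    # R[i, s, a, s']: the degraded reward rd applies everywhere by default
    R = np.full((1, 2, 2, 2), rd)
    R[0, 0, 0] = [rs, 0.0]  # cautious while prosperous: safe reward rs (collapse cannot occur)
    R[0, 0, 1] = [rr, rd]   # risky while prosperous: rr if the state persists, rd on collapse
    return R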
RiskReward.actions
RiskReward.actions ()
Define the actions available in the MDP.
RiskReward.states
RiskReward.states ()
Define the states of the MDP.
RiskReward.id
RiskReward.id ()
Provide an identifier for the environment.
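Judging from the example below, the identifier concatenates the class name and the parameter values; a sketch (the attribute names self.pc etc. are assumptions):

def id(self):
    # builds a filename-friendly identifier from the parameter values
    return f"RiskReward_pc{self.pc}_pr{self.pr}_rs{self.rs}_rr{self.rr}_rd{self.rd}"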
Example
env = RiskReward(pc=0.3, pr=0.1, rs=0.6, rr=0.8, rd=0.001)  # pc, pr, rs, rr, rd

env.id()
'RiskReward_pc0.3_pr0.1_rs0.6_rr0.8_rd0.001'
env.TransitionTensor()
array([[[1. , 0. ],
        [0.7, 0.3]],

       [[0.1, 0.9],
        [0. , 1. ]]])
env.RewardTensor()[0]
array([[[0.6  , 0.   ],
        [0.8  , 0.001]],

       [[0.001, 0.001],
        [0.001, 0.001]]])
env.actions()
[['cautious', 'risky']]
env.states()
['prosperous', 'degraded']
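The dilemma shows up in the long-run behaviour of the two pure policies. A short sketch using the tensors above (assuming the T[s, a, s'] index order inferred earlier and starting from the prosperous state):

import numpy as np

T = env.TransitionTensor()
for a, action in enumerate(env.actions()[0]):
    P = T[:, a, :]                                # state-to-state matrix under the pure policy
    longrun = np.linalg.matrix_power(P, 1000)[0]  # distribution after many steps from 'prosperous'
    print(action, longrun.round(3))

Acting cautiously forever preserves the prosperous state and its reward rs = 0.6; acting riskily earns the larger reward rr = 0.8 only until the system is absorbed into the degraded state, after which only rd = 0.001 remains.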