Risk Reward Dilemma
This class models a two-state social dilemma where a single agent chooses between risky and cautious actions. The actions taken by the agent determine the probability of transitioning between the prosperous and degraded states. In each state, the agent receives different rewards, reflecting the consequences of its chosen action.
Implementation
RiskReward
RiskReward (pc:float, pr:float, rs:float, rr:float, rd:float)
An MDP model for decision-making under uncertainty with two states (prosperous and degraded) and two actions (cautious and risky).
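The parameter meanings are not stated explicitly, but judging from the tensors in the example below: pc is the probability that a risky action collapses the prosperous state, pr the probability that a cautious action lets the degraded state recover, rs the reward for acting cautiously in the prosperous state, rr the reward for acting riskily while the prosperous state persists, and rd the reward received in the degraded state.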
RiskReward.TransitionTensor
RiskReward.TransitionTensor ()
Define the Transition Tensor for the MDP.
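The exact implementation is not shown here, but the following minimal sketch reproduces the tensor in the example below, assuming the index order T[s, a, s'] with state 0 prosperous and action 0 cautious:

import numpy as np

def transition_tensor(pc, pr):
    # T[s, a, s']: s = 0 prosperous, s = 1 degraded; a = 0 cautious, a = 1 risky
    T = np.zeros((2, 2, 2))
    T[0, 0] = [1.0, 0.0]    # cautious while prosperous: the state is certain to persist
    T[0, 1] = [1 - pc, pc]  # risky while prosperous: collapses with probability pc
    T[1, 0] = [pr, 1 - pr]  # cautious while degraded: recovers with probability pr
    T[1, 1] = [0.0, 1.0]    # risky while degraded: the state stays degraded
    return T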
RiskReward.RewardTensor
RiskReward.RewardTensor ()
Define the Reward Tensor for the MDP.
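Again only a sketch rather than the library's code: a reward tensor consistent with the example output below, in the index order R[i, s, a, s'] for the single agent i = 0:

def reward_tensor(rs, rr, rd):
    # R[i, s, a, s']: the degraded reward rd applies everywhere by default
    R = np.full((1, 2, 2, 2), rd)
    R[0, 0, 0] = [rs, 0.0]  # cautious while prosperous: safe reward rs (collapse cannot occur)
    R[0, 0, 1] = [rr, rd]   # risky while prosperous: rr if the state persists, rd on collapse
    return R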
RiskReward.actions
RiskReward.actions ()
Define the actions available in the MDP.
RiskReward.states
RiskReward.states ()
Define the states of the MDP.
RiskReward.id
RiskReward.id ()
Provide an identifier for the environment.
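Judging from the example below, the identifier concatenates the class name and the parameter values; a sketch (the attribute names self.pc etc. are assumptions):

def id(self):
    # builds a filename-friendly identifier from the parameter values
    return f"RiskReward_pc{self.pc}_pr{self.pr}_rs{self.rs}_rr{self.rr}_rd{self.rd}"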
Example
env = RiskReward(pc=0.3, pr=0.1, rs=0.6, rr=0.8, rd=0.001)  # pc, pr, rs, rr, rd

env.id()
'RiskReward_pc0.3_pr0.1_rs0.6_rr0.8_rd0.001'
env.TransitionTensor()
array([[[1. , 0. ],
        [0.7, 0.3]],

       [[0.1, 0.9],
        [0. , 1. ]]])
env.RewardTensor()[0]
array([[[0.6  , 0.   ],
        [0.8  , 0.001]],

       [[0.001, 0.001],
        [0.001, 0.001]]])
env.actions()
[['cautious', 'risky']]
env.states()
['prosperous', 'degraded']
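The dilemma shows up in the long-run behaviour of the two pure policies. A short sketch using the tensors above (assuming the T[s, a, s'] index order inferred earlier and starting from the prosperous state):

import numpy as np

T = env.TransitionTensor()
for a, action in enumerate(env.actions()[0]):
    P = T[:, a, :]                                # state-to-state matrix under the pure policy
    longrun = np.linalg.matrix_power(P, 1000)[0]  # distribution after many steps from 'prosperous'
    print(action, longrun.round(3))

Acting cautiously forever preserves the prosperous state and its reward rs = 0.6; acting riskily earns the larger reward rr = 0.8 only until the system is absorbed into the degraded state, after which only rd = 0.001 remains.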