RL Interaction Dynamics

[Figure: RL interaction dynamics (image.png)]

A Markov decision process (MDP) is a mathematical model of sequential decision-making under uncertainty.

The interaction dynamics above can be studied using an MDP. This is where the MDP comes into the picture.
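
For reference, one common way to write an MDP down is sketched here. The notation is the standard textbook one and is an assumption, not something taken from these notes: states, actions, transition probabilities, a reward function, and a discount factor.

```latex
\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
P(s' \mid s, a) = \Pr\left(S_{t+1} = s' \mid S_t = s,\ A_t = a\right)
\]
```

Here $\mathcal{S}$ is the set of states, $\mathcal{A}$ the set of actions, $P$ the transition probabilities, $R$ the reward function, and $\gamma \in [0, 1]$ the discount factor.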

Remark: Because we lack complete knowledge of how the environment behaves, we describe its behavior with probabilities. #Randomness

Remark: Once we fix the policy, the process above becomes automatic. No decisions are left to make, so it just keeps running, with the only remaining randomness coming from the environment.
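
To make the two remarks concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the two states ("cool", "hot"), the two actions ("slow", "fast"), the probabilities, the rewards, and the names `TRANSITIONS`, `POLICY`, `step`, and `rollout`. It shows probabilities standing in for what we do not know about the environment, and a fixed policy turning the interaction into a loop that simply runs.

```python
import random

# Hypothetical two-state MDP, invented for illustration only.
# TRANSITIONS[state][action] -> list of (next_state, probability, reward)
TRANSITIONS = {
    "cool": {
        "slow": [("cool", 1.0, 1.0)],
        "fast": [("cool", 0.5, 2.0), ("hot", 0.5, 2.0)],
    },
    "hot": {
        "slow": [("cool", 0.5, 1.0), ("hot", 0.5, 1.0)],
        "fast": [("hot", 1.0, -10.0)],
    },
}

# A fixed deterministic policy: exactly one action per state.
POLICY = {"cool": "fast", "hot": "slow"}


def step(state, action, rng):
    """Sample (next_state, reward) from the transition probabilities."""
    outcomes = TRANSITIONS[state][action]
    weights = [prob for _, prob, _ in outcomes]
    next_state, _, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def rollout(start_state, policy, num_steps, seed=0):
    """Run the agent-environment loop under a fixed policy."""
    rng = random.Random(seed)
    state = start_state
    trajectory = []
    for _ in range(num_steps):
        action = policy[state]  # no choice left: the policy decides
        next_state, reward = step(state, action, rng)  # environment is random
        trajectory.append((state, action, reward, next_state))
        state = next_state
    return trajectory


if __name__ == "__main__":
    for s, a, r, s_next in rollout("cool", POLICY, num_steps=5):
        print(f"{s:>4} --{a}--> {s_next:<4} reward={r}")
```

Changing `seed` gives a different trajectory under the same policy, which is the randomness the first remark refers to. Changing `POLICY` changes which transitions can occur at all.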