How to solve a Markov Decision Problem with State Transition Matrix and Reward Matrix

Question

I'm stuck on solving a simple dynamic probabilistic model. I have three states {Sunny, Cloudy, Rainy} and a transition probability matrix giving the probability of moving from one state to another (e.g. Sunny -> Cloudy or Sunny -> Sunny). The action space is {"Bring Umbrella", "Don't Bring Umbrella"}, and I have decided on a reward matrix. Now I want to solve this problem, that is, find the best policy. While looking at various models I was pointed towards Markov Decision Processes. How can I solve the problem with the information given above?

I have looked for Python and R packages to solve this and came across mdptoolbox. To solve the problem, the library requires the transition matrix with actions, i.e. for each action, the corresponding transition matrix. (I don't know how to find these.)

How should I proceed? The state transition matrix and the reward matrix are all the information I have.

score 3 · Accepted Answer · answered Oct 18 '20 at 07:22

The Transition Probability Matrix corresponds to the transition matrix with actions, since for every state you know the appropriate action, i.e. bring the umbrella when it rains, otherwise leave it at home.
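
To make this concrete, here is a minimal sketch in plain Python, assuming the weather does not depend on whether you carry the umbrella, so the single transition matrix can simply be reused for every action. All numbers in `WEATHER` and `R`, and the helper name `value_iteration`, are made up for illustration; substitute your own matrices. It runs standard value iteration, which is one common way to solve a small MDP, not necessarily what any particular library does internally.

```python
# Hypothetical example data; replace with your own matrices.
STATES = ["Sunny", "Cloudy", "Rainy"]
ACTIONS = ["Bring Umbrella", "Don't Bring Umbrella"]

# WEATHER[s][s2] = P(next state = s2 | current state = s); each row sums to 1.
WEATHER = [
    [0.7, 0.2, 0.1],  # from Sunny
    [0.3, 0.4, 0.3],  # from Cloudy
    [0.2, 0.3, 0.5],  # from Rainy
]

# Assumption: the action has no effect on the weather, so every action
# shares the same transition matrix. Layout: P[a][s][s2].
P = [WEATHER for _ in ACTIONS]

# R[s][a] = immediate reward for taking action a in state s (assumed values).
R = [
    [-1.0, 1.0],   # Sunny: carrying the umbrella is a nuisance
    [-0.5, 0.0],   # Cloudy
    [2.0, -5.0],   # Rainy: getting wet is costly
]


def value_iteration(P, R, discount=0.9, tol=1e-8, max_iter=10_000):
    """Return (state values, greedy policy as action indices)."""
    n_actions, n_states = len(P), len(R)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Q[s][a] = immediate reward + discounted expected value of next state.
        Q = [
            [
                R[s][a] + discount * sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        new_V = [max(row) for row in Q]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Pick the best action in each state from the last Q table.
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


V, policy = value_iteration(P, R)
for state, action_index in zip(STATES, policy):
    print(f"{state}: {ACTIONS[action_index]}")
```

Because the transitions are identical for both actions here, the best action in each state ends up being the one with the higher immediate reward. As far as I recall, mdptoolbox accepts transitions shaped (actions, states, states) and rewards shaped (states, actions), so the same `P` and `R` layout should carry over, but check the package documentation before relying on that.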

Tom Dörr
