I'm reading through the reinforcement learning literature; anything from 2016 or later makes heavy use of the OpenAI Gym library. The most visible tutorials and content are centered on robotics, Atari games, and other flashy applications of RL. I'm simply trying to use OpenAI Gym to apply RL to a Markov Decision Process. Is there a tutorial on how to implement an MDP in OpenAI Gym?

Some examples of the sort of MDPs I'll be working with: optimal per-channel marketing budget, the traveling salesman problem, etc.

jbuddy_13

1 Answer

You should refer to the documentation for specifics. Anyway, an MDP consists of states $S$, actions $A$, transition probabilities $P$, and rewards $R$. In software implementations, $P$ and $R$ are often both implemented in a single step function. The state and action spaces $S, A$ don't have to be implemented per se, but Gym does have some relevant datatypes that make specifying them easy.
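As a rough illustration of that structure, here is a minimal sketch of a small tabular MDP with a Gym-style interface. The class name `TabularMDPEnv`, the layout of the `P` and `R` tables, and the toy numbers are all made up for this example. To keep it dependency-free it does not import Gym; in practice you would subclass `gym.Env` and declare the spaces with `gym.spaces.Discrete`. The `step` method returns the `(observation, reward, done, info)` tuple of the classic Gym convention; newer releases may use a different return signature, so check the documentation for your version.

```python
import random


class TabularMDPEnv:
    """Gym-style environment for a small tabular MDP (hypothetical example).

    With Gym installed, this would subclass gym.Env and declare
    observation_space and action_space as gym.spaces.Discrete(...).
    """

    def __init__(self, P, R, start_state=0, terminal_states=(), seed=None):
        # Assumed layout: P[s][a] is a list of (probability, next_state) pairs,
        # and R[s][a] is the reward for taking action a in state s.
        self.P = P
        self.R = R
        self.start_state = start_state
        self.terminal_states = set(terminal_states)
        self.rng = random.Random(seed)
        self.state = start_state

    def reset(self):
        # Start a new episode and return the initial observation.
        self.state = self.start_state
        return self.state

    def step(self, action):
        # Both the reward R and the transition P are applied here.
        probs, next_states = zip(*self.P[self.state][action])
        reward = self.R[self.state][action]
        self.state = self.rng.choices(next_states, weights=probs, k=1)[0]
        done = self.state in self.terminal_states
        return self.state, reward, done, {}


# Toy two-state MDP with made-up numbers: state 1 is terminal.
P = {
    0: {0: [(1.0, 0)], 1: [(0.8, 1), (0.2, 0)]},
    1: {0: [(1.0, 1)], 1: [(1.0, 1)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 0.0},
}

env = TabularMDPEnv(P, R, start_state=0, terminal_states=[1], seed=0)
state = env.reset()
done = False
total_reward = 0.0
while not done:
    # Always take action 1 until the terminal state is reached.
    state, reward, done, _ = env.step(1)
    total_reward += reward
```

A business problem such as budget allocation would follow the same pattern: decide what the state and action are, then encode the dynamics and the reward inside `step`.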

Also, I want to point out that most of the "flashy applications" -- robotics, Atari, etc. -- are just particular instances of MDPs.

shimao
  • Could you link a business-specific MDP example? As you noted, I'm not interested in abstract examples like robotics. – jbuddy_13 Mar 01 '21 at 22:10
  • robotics is a pretty concrete example of an MDP. it sounds like you have some business problem in mind, and don't know how to formulate it as an MDP? if so, i recommend posting a separate question about this, with some details about your specific problem – shimao Mar 01 '21 at 23:30
  • @shimao- https://stats.stackexchange.com/questions/511922/open-ai-gym-for-tsp-problem – jbuddy_13 Mar 02 '21 at 15:58