
What is the difference between Reinforcement Learning (RL) and a Markov Decision Process (MDP)?

I believed I understood the principles of both, but now that I need to compare the two I feel lost. They seem to mean almost the same thing to me, yet surely they do not.

Links to other resources are also appreciated.

Pluviophile

1 Answer


From Reinforcement Learning: An Introduction (Sutton, Barto):

Reinforcement learning is learning what to do — how to map situations to actions—so as to maximize a numerical reward signal. The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them.

So, roughly speaking, RL is a field of machine learning whose methods aim to learn an optimal policy (i.e. a mapping from states to actions) for an agent acting in an environment.

A Markov Decision Process is a formalism (a process) that allows you to define such an environment. Specifically, an MDP describes a fully observable environment in RL, but in general the environment might be only partially observable (see Partially Observable Markov Decision Process (POMDP)).
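To make "formalism" concrete, here is the usual textbook way of writing an MDP down. The symbols are the standard ones from the literature, not something defined earlier in this answer, so treat this as a sketch of one common convention:

```latex
% An MDP is commonly given as a tuple
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma)
\]
% S: set of states, A: set of actions,
% P(s' | s, a): probability of moving to s' after taking a in s,
% R(s, a): expected immediate reward, gamma in [0, 1): discount factor.

% "Markov" means the next state depends only on the current state and action:
\[
  \Pr(S_{t+1} = s' \mid S_t, A_t, S_{t-1}, A_{t-1}, \dots)
  = \Pr(S_{t+1} = s' \mid S_t, A_t) = P(s' \mid S_t, A_t)
\]
```

Nothing in this definition says how to choose actions; it only describes how the environment responds to them.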

So RL is a set of methods that learn "how to (optimally) behave" in an environment, whereas an MDP is a formal representation of such an environment.
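The split can also be shown in code. The sketch below is purely illustrative: the two-state MDP, its probabilities and rewards, and the names `P`, `step` and `q_learning` are all made up for this example. The dictionary `P` plays the role of the MDP (the environment's description), while `q_learning` is one RL method among many (tabular Q-learning) that learns a policy only by sampling transitions, without ever reading `P` directly.

```python
import random

# --- The MDP: a formal description of the environment (hypothetical toy example) ---
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    "left":  {"stay": [(1.0, "left", 0.0)],
              "move": [(0.9, "right", 1.0), (0.1, "left", 0.0)]},
    "right": {"stay": [(1.0, "right", 2.0)],
              "move": [(1.0, "left", 0.0)]},
}
GAMMA = 0.9  # discount factor


def step(state, action, rng):
    """Sample one transition from the MDP; this is all the agent gets to see."""
    outcomes = P[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


# --- The RL method: learns how to behave by trial and error ---
def q_learning(n_steps=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}  # action-value estimates
    state = "left"
    for _ in range(n_steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(list(Q[state]))
        else:
            action = max(Q[state], key=Q[state].get)
        next_state, reward = step(state, action, rng)
        # Move the estimate toward the one-step bootstrapped target.
        target = reward + GAMMA * max(Q[next_state].values())
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state
    return Q


Q = q_learning()
# The learned policy: a mapping from states to actions.
policy = {s: max(Q[s], key=Q[s].get) for s in Q}
print(policy)
```

In this toy setup the learned policy should settle on moving from "left" to "right" and then staying there, since "right"/"stay" pays the highest reward. Swapping in a different `P` changes the environment without touching the learning code, which is the distinction the answer draws.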

Tomasz Bartkowiak
  • 1
    One should also add that there is a third player in that game: a Markov Decision Automaton (MDA). Given an initial distribution and a policy, an MDA gives rise to an MDP using a construction like the one here: https://mathoverflow.net/questions/292942/markov-processes-construction-of-the-state-variables. – Fabian Werner May 17 '20 at 10:09