What is the difference between Reinforcement Learning (RL) and Markov Decision Process (MDP)?

Question

What is the difference between reinforcement learning (RL) and a Markov decision process (MDP)?

I thought I understood the principles of both, but now that I need to compare the two I feel lost. They seem to mean almost the same thing to me, though surely they do not.

Links to other resources are also appreciated.

score 1 · Accepted Answer · answered May 17 '20 at 09:52

From Reinforcement Learning: An Introduction (Sutton, Barto):

Reinforcement learning is learning what to do — how to map situations to actions—so as to maximize a numerical reward signal. The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them.

So, roughly speaking, RL is a field of machine learning concerned with methods that learn an optimal policy (i.e. a mapping from states to actions) for an agent acting in an environment.

A Markov decision process is a formalism (a process) that lets you define such an environment. Specifically, an MDP describes a fully observable environment in RL; in general, the environment might be only partially observable (see partially observable Markov decision process (POMDP)).
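To make "formalism" concrete, here is a minimal sketch of an MDP written down as plain data: states, actions, transition probabilities with rewards, and a discount factor. The class name `MDP`, the `step` helper, and the two-state "cool"/"hot" example are hypothetical and chosen only for illustration; they do not come from the book or from any library.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MDP:
    """Illustrative container for the usual MDP ingredients (S, A, P, R, gamma)."""
    states: tuple
    actions: tuple
    # transitions[(state, action)] -> list of (probability, next_state, reward)
    transitions: dict
    gamma: float

    def step(self, state, action, rng):
        """Sample one transition; it depends only on the current state and action."""
        outcomes = self.transitions[(state, action)]
        weights = [p for p, _, _ in outcomes]
        _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
        return next_state, reward


# Hypothetical two-state environment: nothing here learns anything,
# it only specifies how the world reacts to actions.
toy = MDP(
    states=("cool", "hot"),
    actions=("slow", "fast"),
    transitions={
        ("cool", "slow"): [(1.0, "cool", 1.0)],
        ("cool", "fast"): [(0.5, "cool", 2.0), (0.5, "hot", 2.0)],
        ("hot", "slow"): [(0.5, "cool", 1.0), (0.5, "hot", 1.0)],
        ("hot", "fast"): [(1.0, "hot", -10.0)],
    },
    gamma=0.9,
)

rng = random.Random(0)
next_state, reward = toy.step("cool", "fast", rng)
```

Note that the object contains no policy and no learning rule. It is only the description of the environment.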

So RL is a set of methods that learn "how to (optimally) behave" in an environment, whereas an MDP is a formal representation of such an environment.
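The split between the two can be sketched in a few lines. Below, `env_step` plays the role of the MDP (a hypothetical four-state chain with a reward for reaching the right end), and `q_learning` is one RL method, tabular Q-learning, that learns a policy purely from sampled transitions. The environment, the function names, and the hyperparameters are all assumptions made for this example; Q-learning is picked as a familiar representative, not as the only RL method.

```python
import random

N_STATES = 4        # states 0..3; state 3 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def env_step(state, action):
    """The MDP side: deterministic chain dynamics plus the reward function."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """The RL side: learn action values from experience, without reading the dynamics."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = env_step(state, action)
            # Q-learning update: bootstrap from the best next action value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

The learner only ever calls `env_step` and observes what comes back. Swapping in a different MDP changes the environment, while the learning method stays the same.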

One should also add that there is a third player in that game: a Markov Decision Automaton (MDA). Given an initial distribution and a policy, an MDA gives rise to an MDP using a strategy like the one described here: https://mathoverflow.net/questions/292942/markov-processes-construction-of-the-state-variables. — Fabian Werner, May 17 '20 at 10:09
