
This is from the paper 'Algorithms for Inverse Reinforcement Learning' by Ng and Russell (2000).

We assume that we have the ability to simulate trajectories in the MDP (from the initial state $s_0$) under the optimal policy, or under any policy of our choice. For each policy $\pi$ that we will consider (including the optimal one), we will need a way of estimating $V^{\pi}(s_0)$ for any setting of the $\alpha_i$'s. To do this, we first execute $m$ $\underline{\text{Monte Carlo}}$ trajectories under $\pi$.

Sorry for the long quote. What is the meaning of 'Monte Carlo' in the last sentence?

My first thought was simply to run the simulation again and again, $m$ times. But on rethinking it, I might be very wrong.
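
To make my guess concrete, here is a minimal sketch of what I had in mind (in Python; the `simulate_trajectory` helper and the discount factor `gamma` are hypothetical placeholders of mine, not anything from the paper):

```python
def estimate_value(simulate_trajectory, m, gamma=0.9):
    """Rough guess: estimate V^pi(s0) by running the simulator m times
    under pi and averaging the discounted returns.

    `simulate_trajectory()` is assumed to return the list of rewards
    observed along one simulated trajectory starting from s0 under pi.
    """
    total = 0.0
    for _ in range(m):
        rewards = simulate_trajectory()
        total += sum(gamma**t * r for t, r in enumerate(rewards))
    return total / m  # Monte Carlo estimate of V^pi(s0)
```

Is that roughly what is meant, or does 'Monte Carlo' refer to something more specific here?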

Tim
cgo

2 Answers


What Ng and Russell seem to be saying is that for each policy $\pi$ they simulate $m$ "possible" outcomes for processes starting at the initial state $s_0$. By "trajectories" they mean the possible developments in time of the simulated processes -- different possible scenarios created by simulation. So you were correct: 'Monte Carlo' here stands for "simulation" (see also this thread).

Tim

Monte Carlo here simply means using sampling to estimate the values. In practice this means collecting a sequence of (state, action) pairs, i.e. a trajectory, under some policy of your choice; from such trajectories you can then compute the relevant quantities, such as $V^{\pi}(s_0)$. See the sketch below.
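
For illustration, here is a minimal sketch of that procedure (the `env.reset()` / `env.step(a)` interface, the `policy` function, and the discount factor are assumptions made for the example, not part of Ng and Russell's paper):

```python
def rollout(env, policy, horizon=100):
    """Collect one trajectory (a list of (state, action, reward) tuples)
    starting from the initial state s0 and following `policy`."""
    trajectory = []
    s = env.reset()                    # assumed interface: returns the initial state s0
    for _ in range(horizon):
        a = policy(s)
        s_next, r, done = env.step(a)  # assumed interface: next state, reward, terminal flag
        trajectory.append((s, a, r))
        s = s_next
        if done:
            break
    return trajectory


def mc_value_estimate(env, policy, m=1000, gamma=0.9):
    """Monte Carlo estimate of V^pi(s0): average the discounted return
    over m simulated trajectories under `policy`."""
    total = 0.0
    for _ in range(m):
        traj = rollout(env, policy)
        total += sum(gamma**t * r for t, (_, _, r) in enumerate(traj))
    return total / m
```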

makokal