I'm studying Reinforcement Learning, and have come across multi-armed bandits.

Why are these called bandits? And why are they armed?

Tom Hale

2 Answers

This is actually explained on the Wikipedia page:

This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma. The name comes from imagining a gambler at a row of slot machines (sometimes known as "one-armed bandits"), who has to decide which machines to play, how many times to play each machine and in which order to play them, and whether to continue with the current machine or try a different machine.

They even have a picture of a few of those:

enter image description here

As Henry noted in the comments, there is an even more fitting image on Wikipedia that shows the etymology:

enter image description here

Tim
  • It's probably worth noting that in the 'old days' mechanical slot machines had a long handle that the user pulled to play. That was the 'arm'. These were commonly included on machines even after they were no longer necessary for operation, but modern machines rarely have them, which is why none are visible in the image. – JimmyJames May 18 '20 at 21:42
  • And it's a one-armed *bandit* because it takes all of your money. – hobbs May 19 '20 at 00:37
  • See https://commons.wikimedia.org/wiki/File:Slot_machines_at_Wookey_Hole_Caves.JPG for a picture of four antique one-armed bandits and https://commons.wikimedia.org/wiki/File:One-Armed_Bandits_at_Stockmen%27s_Hotel,_Elko,_Nevada_(83581).jpg for some dressed up as bandits – Henry May 19 '20 at 08:54

In section 2.1 of Sutton and Barto's Reinforcement Learning: An Introduction, they say:

[...] the k-armed bandit problem, so named by analogy to a slot machine, or “one-armed bandit,” except that it has k levers instead of one. Each action selection is like a play of one of the slot machine’s levers, and the rewards are the payout for hitting the jackpot.
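To make the "k levers" picture concrete, here is a minimal sketch of a k-armed bandit in Python. It is only an illustration, not code from the book: the function name `run_bandit`, the Gaussian payouts with unit variance, and the epsilon-greedy rule for choosing a lever are all assumptions picked for the example.

```python
import random


def run_bandit(true_means, steps=1000, epsilon=0.1, seed=0):
    """Play a k-armed bandit with an epsilon-greedy gambler.

    true_means: hypothetical mean payout of each lever (unknown to the gambler).
    Returns the gambler's payout estimates and how often each lever was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # running average payout per lever
    counts = [0] * k       # number of pulls per lever

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pull a lever chosen uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pull the lever that currently looks best.
            arm = max(range(k), key=lambda a: estimates[a])

        # Assumed payout model: Gaussian noise around the lever's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the sample mean for the pulled lever.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 0.8])
    print("pulls per lever:", counts)
    print("estimated payouts:", [round(e, 2) for e in estimates])
```

Each loop iteration is one "play" of a lever, and the `epsilon` parameter controls how often the gambler tries a different machine instead of sticking with the one that currently looks best.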

Tom Hale