Samuel's checkers player: the program that coined machine learning

When Thomas J. Watson, Sr., who had led IBM and its predecessor company since 1914 and was not given to understatement, watched his engineers demonstrate a new program in the early 1950s, he predicted that it would raise IBM’s stock 15 points. According to company history, he was right. The program played checkers.

Arthur Samuel had been thinking about checkers since his time at the University of Illinois in the late 1940s. Born in 1901 in Emporia, Kansas, he had gone from MIT (master’s in electrical engineering, 1926) to Bell Labs to a faculty post at Illinois before IBM’s Poughkeepsie laboratory recruited him in 1949. There he got access to the IBM 701, one of the company’s first commercial computers, and the time — off-hours and nights — to run his experiment.

The first version of the program ran on the 701 starting around 1952, and it was demonstrated publicly on television on February 24, 1956. Samuel built it around two kinds of learning. The first he called rote learning: the program memorized every board position it had seen, along with whether that position had eventually led to a win or a loss, so it could avoid repeating its own mistakes. The second was more unusual: a scoring function, a polynomial that assigned numerical weights to strategic features such as piece advantage and center control. The weights were updated by self-play. Two versions of the program, Alpha and Beta, played each other; the loser’s weights shifted; and the cycle repeated over thousands of games run on borrowed overnight hours. No human told the program what good play looked like. It found out by losing.
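The two mechanisms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Samuel's code: the feature names, the weight values, the string position keys, and the `RoteMemory` class are all hypothetical, and the scoring function is simplified to a plain weighted sum rather than his full polynomial.

```python
from typing import Dict

Features = Dict[str, float]


def score(features: Features, weights: Dict[str, float]) -> float:
    """Simplified scoring function: a weighted sum of board features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


class RoteMemory:
    """Hypothetical rote-learning table: position -> outcome it led to."""

    def __init__(self) -> None:
        self._table: Dict[str, float] = {}

    def remember(self, position: str, outcome: float) -> None:
        # Store +1.0 for an eventual win, -1.0 for an eventual loss.
        self._table[position] = outcome

    def evaluate(self, position: str, features: Features,
                 weights: Dict[str, float]) -> float:
        # Prefer what was memorized; fall back to the scoring function
        # for positions that have never been seen.
        if position in self._table:
            return self._table[position]
        return score(features, weights)


# Illustrative values only; these weights are assumptions.
weights = {"piece_advantage": 1.0, "center_control": 0.5}
features = {"piece_advantage": 2.0, "center_control": 1.0}

memory = RoteMemory()
memory.remember("position-A", -1.0)  # this position once led to a loss

seen = memory.evaluate("position-A", features, weights)
unseen = memory.evaluate("position-B", features, weights)
```

Here `seen` comes straight from memory, while `unseen` is computed from the weighted features. Keeping the two paths separate mirrors the distinction the article draws between storing an answer and estimating one.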

What unsettled observers wasn’t that a computer could play checkers — a simpler program could manage that. It was that this one got better. It played too well to have memorized its way to the answer.

In 1962, IBM arranged a match against Robert Nealey, described in the company’s Research News as “a former Connecticut checkers champion, and one of the nation’s foremost players.” Samuel’s program won. The billing was embellished: Nealey did not become Connecticut state champion until 1966, four years after the match, so IBM had rounded his credentials up. Still, by all accounts the game itself was genuine.

Samuel published the full technical account in July 1959, in the IBM Journal of Research and Development, under the title “Some Studies in Machine Learning Using the Game of Checkers.” The paper gave the discipline he had been practicing its name. A definition widely attributed to Samuel describes machine learning as the field of study that gives computers the ability to learn without being explicitly programmed. The phrase was new. It stuck because it named something real: not a program that stored the answer, but one that found it.

The scoring-function idea, updating weights based on the difference between the value predicted for one position and the value predicted a few moves later, was later formalized by Richard Sutton and Andrew Barto as temporal-difference learning, which sits at the core of modern reinforcement learning. Every agent that has trained itself at chess, Go, or a video game by playing millions of rounds against itself is doing a version of what Samuel’s 701 did on borrowed nights in Poughkeepsie.
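For readers who want to see the shape of that update, here is a minimal sketch of one temporal-difference step with a linear value function. It follows the textbook TD(0) rule rather than Samuel's original procedure, and the function names, the step size, and the two-feature example are assumptions chosen for illustration.

```python
from typing import List


def value(weights: List[float], features: List[float]) -> float:
    """Linear value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def td_update(weights: List[float],
              features_now: List[float],
              features_next: List[float],
              reward: float = 0.0,
              step_size: float = 0.1,
              discount: float = 1.0) -> List[float]:
    """Return new weights after one TD(0) step.

    The error is the gap between the current prediction and a target
    built from the reward plus the prediction for the next position.
    """
    target = reward + discount * value(weights, features_next)
    td_error = target - value(weights, features_now)
    # Nudge each weight in proportion to its feature's contribution.
    return [w + step_size * td_error * f
            for w, f in zip(weights, features_now)]


# Illustrative example: the game ends right after this position with a win.
weights = [0.0, 0.0]
features_now = [1.0, 0.0]
terminal = [0.0, 0.0]  # a finished game is assumed to carry no future value

new_weights = td_update(weights, features_now, terminal, reward=1.0)
```

After the step, only the weight attached to the active feature has moved toward the winning outcome. Repeating this over many self-play games is what lets the estimates improve without a human labeling good moves.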

Sources

Arthur Samuel (computer scientist) — Wikipedia — Biography, timeline, learning mechanisms, TV demonstration February 24, 1956, coinage of “machine learning” in 1959.
Arthur Samuel — Computer History Museum Pioneers — Watson Sr. stock prediction, IBM 701 context, Connecticut championship match details.
A Strange Game — Ben Recht, arg min — Technical analysis of Samuel’s self-play and temporal differencing approach and its connection to modern reinforcement learning.
Samuel’s Checkers-Playing System — GM-RKB — Details of the 1962 Nealey match and the overstated credentials.
