All In: Artificial Intelligence Beats the World's Best Poker Players

The world's best artificial intelligence poker player seems to know exactly when to hold 'em and when to fold 'em.
An artificial-intelligence program known as Libratus has beaten the world's best human poker players in a 20-day No-Limit Texas Hold'em tournament, defeating four opponents by about $1.77 million in poker chips, according to Pittsburgh's Rivers Casino, where the "Brains vs. Artificial Intelligence" poker tournament was held.
At the end of each day, at least one of the human players was beating the AI program. But in the end, it was not enough.
"We appreciate their hard work, but unfortunately, the computer won," said Craig Clark, general manager of Rivers Casino.
Computer scientists can now add Texas Hold'em to a growing list of games (including chess, Go and "Jeopardy!") in which AI can beat the best human competitor in the world.

Artificial-intelligence strides

Since IBM's Deep Blue bested chess player Garry Kasparov in 1997, the robots have been gaining on humans. Last year, AI shocked the world by trouncing the world's best Go player in a set of matches in the strategy game involving black and white stones. The task was so difficult because Go has more possible board positions than there are atoms in the universe. To tackle that problem, the computer program, known as AlphaGo, used deep learning, a spookily powerful method that involves performing calculations at one layer and then feeding the results up to the next layer in the algorithm.
And yet, in many ways, Texas Hold'em is even harder, said Tuomas Sandholm, a computer scientist at Carnegie Mellon University who helped design Libratus and helped organize the tournament. (In Heads-Up Texas Hold'em, two players each hold two private cards and then have to make the best five-card hand from those two cards and the five cards that are eventually placed face-up on the table over several rounds of play. After each round of cards is turned, players can call, or match, another player's bet; raise the bet; or fold their cards, or give up.)
It turns out, cracking this type of play may be even trickier than mastering Go, where each player knows the other's position perfectly.
"In incomplete-information games like poker, it's much harder," Sandholm told Live Science.
For instance, imagine you're playing a hand against an opponent. You need to think not only about the pair of aces in your hand but also about what's on the table, what the other player could be holding, what his bet tells you about his cards and what he is trying to learn with his bets.
So Sandholm and his colleagues relied on a different concept to program Libratus. Known as Nash equilibrium, it is a set of strategies in which no player can do better by changing his own strategy alone. In a two-player game like heads-up poker, where one player's gain is the other's loss, playing such a strategy protects you against the best possible counter-strategy. In any one hand of poker, random chance means that the Nash equilibrium play may lose, but over the course of many hands, Nash equilibrium translates to the "unbeatable play" strategy, Sandholm said.
However, "the game has 10 to the power of 160 different situations," Sandholm said, far too many to work through one by one. As a result, the program can't calculate the perfect Nash equilibrium solution, but must instead approximate it.
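To make the idea of approximating a Nash equilibrium concrete, here is a minimal Python sketch on a toy two-player game in which one side's gain is the other's loss. It is not Libratus's code, and nothing here says which algorithm Libratus uses; regret matching is swapped in as one standard way to approach an equilibrium through repeated self-play. The payoff table `PAYOFF` and the function names are made up for illustration.

```python
# Toy zero-sum game (hypothetical payoffs, chosen for illustration only).
# PAYOFF[i][j] is what the row player wins (and the column player loses)
# when row picks action i and column picks action j.
PAYOFF = [
    [2.0, -1.0],
    [-1.0, 1.0],
]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no regret yet: mix uniformly


def approximate_equilibrium(payoff, iterations=20000):
    """Self-play with regret matching; returns both players' average strategies."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_total, col_total = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = strategy_from_regrets(row_regret)
        col = strategy_from_regrets(col_regret)

        # Expected payoff of every pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))

        # Regret = how much better an action would have done than the mix played.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_total[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_total[j] += col[j]

    # The average strategy, not the last one played, is what approaches equilibrium.
    return ([x / iterations for x in row_total],
            [x / iterations for x in col_total])


if __name__ == "__main__":
    row_avg, col_avg = approximate_equilibrium(PAYOFF)
    print("row player:", [round(p, 3) for p in row_avg])
    print("column player:", [round(p, 3) for p in col_avg])
```

Solving this toy game by hand gives each player a 2/5 chance of choosing the first action, so the printed averages should land close to 0.4 and 0.6. They will not be exact, which is the sense in which an equilibrium is approximated rather than calculated.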
In the past, that's been a stumbling block. Claudico, an earlier poker program from Sandholm's team, played in a similar tournament in 2015 and couldn't beat the humans, with the match ending in a statistical tie. However, Libratus, the program used in the recent tournament, has a better end-game solving strategy, Sandholm said.

The tournament

For the "Brains vs. Artificial Intelligence" tournament, four of the world's best poker players faced off one-on-one against Libratus in 120,000 hands of poker. At stake was a $200,000 pot, which the human players received even if they lost.
"They are professionals, so they were fighting to the bitter end, really hard," Sandholm said. "They were studying really hard every night on their computers, trying to find holes in the AI."
In the end, it was no contest: The AI prevailed.
As part of the program, bluffing naturally emerged as a mathematically sound strategy, Sandholm noted.
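One way to see why bluffing can be mathematically sound is a standard textbook toy model, sketched below in Python. The assumptions are ours, not Sandholm's: the bettor holds either an unbeatable hand or a hopeless one, the opponent can only call or fold, and there is a single bet. The function names are hypothetical, and this is not how Libratus computes its strategy.

```python
from fractions import Fraction


def bluff_fraction(pot, bet):
    """Share of the bettor's bets that can be bluffs so a call earns exactly zero."""
    return Fraction(bet, pot + 2 * bet)


def call_frequency(pot, bet):
    """How often the opponent must call so that a bluff earns exactly zero."""
    return Fraction(pot, pot + bet)


def caller_ev(pot, bet, bluff_share):
    """Caller wins pot + bet against a bluff and loses bet against a strong hand."""
    return bluff_share * (pot + bet) - (1 - bluff_share) * bet


def bluffer_ev(pot, bet, call_freq):
    """A bluff wins the pot when the opponent folds and loses the bet when called."""
    return (1 - call_freq) * pot - call_freq * bet


if __name__ == "__main__":
    pot = 100  # chips already in the pot (illustrative number)
    for bet in (50, 100, 300):  # half pot, full pot, three times the pot
        share = bluff_fraction(pot, bet)
        calls = call_frequency(pot, bet)
        print(f"bet {bet}: bluff share {share}, opponent calls {calls}, "
              f"caller EV {caller_ev(pot, bet, share)}, "
              f"bluff EV {bluffer_ev(pot, bet, calls)}")
```

In this model, a pot-sized bet supports one bluff for every two strong hands, and a bet three times the pot supports three bluffs for every four strong hands. A player who never bluffs can simply be folded against, which is why a strategy built to avoid being exploited ends up bluffing some of the time.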
Libratus's win also involved some surprising moves. For instance, the AI was more likely than humans to make huge overbets, meaning that it would bet three, five or even 20 times the amount of chips in the pot. Interestingly, those overbets sometimes made mathematical sense in two very different situations.
"With a very strong hand and with the weakest hands, you want to make those big overbets," Sandholm said.
Libratus was also more likely than the humans to underbet in certain surprising situations, Sandholm said. And every night, it went home and adapted its strategy based on the hands it had played.
"The adaptation was not to learn to exploit the opponent, but rather to determine what holes the opponent had found in the AI strategy and automatically patch those holes," Sandholm said.
Still, there's some hope for the mere mortals. In Heads-Up Texas Hold'em, two players compete. But Libratus would have no idea how to beat players in a poker game with five or six players. There, Nash equilibrium solutions don't work, Sandholm said.
"I would say the top humans in something like that would probably do better than the best AI," Sandholm said.
Originally published on Live Science.

Anything That Matters.

Monday, 6 February 2017