[ExI] AI poker

BillK pharos at gmail.com
Tue Jan 31 15:49:40 UTC 2017


On 29 January 2017 at 16:04, Dave Sill wrote:
> There is no "correct" decision for every hand because all of the players
> have incomplete knowledge. These competitions consist of playing thousands
> of hands, and the advantage goes to the players best able to calculate the
> probabilities of the cards and their opponents' behavior. Clearly, the AI
> has the edge in calculating the card probabilities. It also has the
> advantage of perfect memory of every hand played and every time an opponent
> bluffed.
>
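
As a quick illustration of the card-probability side of this, here is a
short Python sketch of the kind of calculation involved: the odds of
completing a flush draw by the river. The scenario is just a toy example,
not a hand from the match.

from math import comb

# Toy card-probability calculation: odds of completing a flush draw by the
# river. Standard 52-card deck; only our 2 hole cards and the 3 flop cards
# are known, so 47 cards are unseen and 9 of them complete the flush.
outs, unseen, to_come = 9, 47, 2
p_miss = comb(unseen - outs, to_come) / comb(unseen, to_come)
print(f"P(flush by the river) = {1 - p_miss:.1%}")   # about 35%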

Developed by Carnegie Mellon University, the AI won the “Brains Vs.
Artificial Intelligence” tournament against four poker pros by
$1,766,250 in chips over 120,000 hands (games). Researchers can now
say that the victory margin was large enough to count as a
statistically significant win, meaning that they could be at least
99.7 percent sure that the AI victory was not due to chance.

<http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/ai-learns-from-mistakes-to-defeat-human-poker-players>
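
For anyone wondering how a figure like "99.7 percent" falls out of match
data, a standard approach is a one-sided z-test on the per-hand win rate.
A rough Python sketch follows; the chip margin and hand count are from the
article, but the per-hand standard deviation is a made-up placeholder, and
the CMU team's actual variance-reduction method was no doubt more
sophisticated.

import math

# Rough significance check on the match result. Chip margin and hand count
# are from the article; the per-hand standard deviation is an ASSUMED
# placeholder, not a figure from the CMU team.
chip_margin = 1_766_250      # total chips won by Libratus
hands = 120_000              # hands played
stdev_per_hand = 2_000       # assumed chip swing per hand (placeholder)

win_rate = chip_margin / hands
z = win_rate / (stdev_per_hand / math.sqrt(hands))
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))   # one-sided normal p-value
print(f"{win_rate:.1f} chips/hand, z = {z:.2f}, "
      f"confidence = {100 * (1 - p_one_sided):.1f}%")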

Quotes:

Previous attempts to develop poker-playing AI that can exploit the
mistakes of opponents—whether AI or human—have generally not been
overly successful, says Tuomas Sandholm, a computer scientist at
Carnegie Mellon University. Libratus instead focuses on improving its
own play, which he describes as safer and more reliable than the
riskier approach of trying to exploit opponent mistakes.

Even more importantly, the victory demonstrates how AI has likely
surpassed the best humans at doing strategic reasoning in “imperfect
information” games such as poker. The no-limit Texas Hold’em version
of poker is a good example of an imperfect information game because
players must deal with the uncertainty of two hidden cards and
unrestricted bet sizes. An AI that performs well at no-limit Texas
Hold’em could also potentially tackle real-world problems with similar
levels of uncertainty.

“The algorithms we used are not poker specific,” Sandholm explains.
“They take as input the rules of the game and output strategy.”

In fact, Libratus played the same overall strategy against all the
players, based on three main components.

First, the AI’s algorithms computed a strategy before the tournament
by running for 15 million processor-core-hours on a new supercomputer
called Bridges.
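
The article doesn't name the algorithms, but published descriptions of
Libratus point to the counterfactual regret minimization (CFR) family for
this precomputed "blueprint" strategy. As a toy stand-in for "rules in,
strategy out", here is regret matching (the core CFR update) finding the
equilibrium of rock-paper-scissors in Python. Nothing like the scale of
no-limit hold'em, but the same flavor of self-play computation.

import random

# Toy regret matching, the building block of counterfactual regret
# minimization (CFR). Not Libratus's code; it just shows "rules in,
# strategy out" on rock-paper-scissors.
ACTIONS = 3                    # 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[ 0, -1,  1],
          [ 1,  0, -1],
          [-1,  1,  0]]        # PAYOFF[my_move][their_move], my payoff

def strategy_from(regret):
    positive = [max(r, 0.0) for r in regret]
    total = sum(positive)
    return ([p / total for p in positive] if total > 0
            else [1.0 / ACTIONS] * ACTIONS)

def train(iterations=200_000):
    regret = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sum = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from(r) for r in regret]
        moves = [random.choices(range(ACTIONS), weights=s)[0]
                 for s in strategies]
        for p in range(2):
            me, them = moves[p], moves[1 - p]
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than
                # the action actually played.
                regret[p][a] += PAYOFF[a][them] - PAYOFF[me][them]
                strategy_sum[p][a] += strategies[p][a]
    # The *average* strategy converges to the equilibrium (about 1/3 each).
    return [[s / iterations for s in strategy_sum[p]] for p in range(2)]

print(train())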

Second, the AI would perform “endgame solving” during each hand to
precisely calculate how much it could afford to risk in the third and
fourth betting rounds (the “turn” and “river” rounds in poker
parlance). Sandholm credits the endgame solver algorithms as
contributing the most to the AI victory. The poker pros noticed
Libratus taking longer to compute during these rounds and realized
that the AI was especially dangerous in the final rounds, but their
“bet big early” counter strategy was ineffective.
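
For a sense of what "endgame solving" means computationally: once the
remaining betting rounds are reduced to a zero-sum game, that game can be
solved exactly, for example with linear programming. Below is a toy Python
sketch with a made-up 2x3 payoff matrix standing in for a simplified river
decision; the matrix and the scipy approach are illustrative only, not
Libratus's actual solver, which handles vastly larger subgames and beliefs
over hidden cards.

import numpy as np
from scipy.optimize import linprog

# Solve a tiny zero-sum game exactly: maximize the game value v subject to
# the row player's mixed strategy x guaranteeing at least v against every
# column. The payoff matrix below is made up for illustration.
A = np.array([[ 3.0, -1.0,  0.5],
              [-2.0,  2.0, -0.5]])   # A[row_action][column_action]
rows, cols = A.shape

c = np.zeros(rows + 1)
c[-1] = -1.0                                  # linprog minimizes, so minimize -v
A_ub = np.hstack([-A.T, np.ones((cols, 1))])  # v - A[:, j] . x <= 0 for each j
b_ub = np.zeros(cols)
A_eq = np.array([[1.0] * rows + [0.0]])       # x sums to 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * rows + [(None, None)]     # probabilities in [0,1], v free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
print("row strategy:", res.x[:rows], "game value:", res.x[-1])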

Third, Libratus ran background computations during each night of the
tournament so that it could fix holes in its overall strategy. That
meant Libratus was steadily improving its overall level of play and
minimizing the ways that its human opponents could exploit its
mistakes. It even prioritized fixes based on whether or not its human
opponents had noticed and exploited those holes.

----------


I bet the military will be using AI in their wargames.


BillK



