[ExI] Superhuman Poker

Dave Sill sparge at gmail.com
Thu Jul 18 14:39:29 UTC 2019

On Thu, Jul 18, 2019 at 9:06 AM John Clark <johnkclark at gmail.com> wrote:

> On Thu, Jul 18, 2019 at 8:05 AM Dave Sill <sparge at gmail.com> wrote:
> *> Unlike the human genius, the poker program can't talk or understand
>> speech or converse in any language. It's like an idiot savant. If it could
>> talk, it'd just say "I just pick the statistically best play".*
> People were always asking human geniuses like Einstein and Feynman how
> they got their ideas, but they could never give satisfactory answers; if
> they could, we'd all be as smart as they were.

At least Einstein and Feynman could talk intelligently about abstract
concepts. But, no, I don't think that knowing how they got their ideas
would make one as smart as they were. They weren't special because of
their knowledge; they were special because of their intelligence.

*> Machine learning isn't self-modifying code. The code never changes. The
>> learning process builds huge tables of statistics recording the outcomes of
>> different plays.*
> I don't know what you mean by that; the changes that the program made in
> its own code are the very thing that made it extraordinary. If it only
> used the code that the humans had written it would play lousy Poker.

You're mistaken about how it works. From

*The key breakthrough was developing a method that allowed Pluribus to make
good choices after looking ahead only a few moves rather than to the end of
the game.

Pluribus teaches itself from scratch using a form of reinforcement learning
similar to that used by DeepMind’s Go AI, AlphaZero. It starts off playing
poker randomly and improves as it works out which actions win more money.
After each hand, it looks back at how it played and checks whether it would
have made more money with different actions, such as raising rather than
sticking to a bet. If the alternatives lead to better outcomes, it will be
more likely to choose them in future.

By playing trillions of hands of poker against itself, Pluribus created a
basic strategy that it draws on in matches. At each decision point, it
compares the state of the game with its blueprint and searches a few moves
ahead to see how the action played out. It then decides whether it can
improve on it. And because it taught itself to play without human input,
the AI settled on a few strategies that human players tend not to use.*
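The "searches a few moves ahead" idea from the quote can be sketched as a depth-limited search that falls back on the blueprint's value estimate when the depth budget runs out, instead of playing the hand to the end. The ToyGame below and its whole interface are invented purely for illustration; Pluribus's real search also reasons about hidden cards and multiple opponents, which this sketch ignores:

```python
# Hedged sketch of depth-limited lookahead with a blueprint fallback.
# The ToyGame is made up for illustration; it is not Pluribus's API.

class ToyGame:
    """A made-up one-player betting toy: the state is the pot size."""
    def is_over(self, state):
        return state >= 8          # the hand ends once the pot reaches 8
    def legal_actions(self, state):
        return [1, 2]              # small bet or big bet
    def play(self, state, action):
        return state + action
    def blueprint_value(self, state):
        return state               # crude precomputed estimate of value

def search(state, depth, game):
    """Best estimated value reachable from `state`, looking only
    `depth` moves ahead; past that, trust the blueprint's estimate."""
    if depth == 0 or game.is_over(state):
        return game.blueprint_value(state)
    return max(search(game.play(state, a), depth - 1, game)
               for a in game.legal_actions(state))

print(search(0, 3, ToyGame()))     # evaluates only 3 moves deep
```

The point of the fallback is exactly the "key breakthrough" the quote describes: the search never needs to reach the end of the game, because the blueprint stands in for everything beyond the horizon.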

Pluribus isn't modifying its own code. When I said it'd say "I just pick
the statistically best play", that was overly simplified. It's more like "I
pick the statistically best play, continually look back at my previous
play, try different things, and adjust the probabilities so I can do better
next time".
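That "look back and adjust the probabilities" loop maps onto regret matching, the family of self-play updates that CFR-style poker bots (Pluribus included) are built around: after each hand, credit every action with how much better it would have done than the action actually taken, then play future hands in proportion to accumulated positive regret. A minimal sketch with a made-up three-action payoff table (not Pluribus's actual training code):

```python
import random

# Hedged sketch of regret matching. The actions and payoffs below are
# invented for illustration; real poker payoffs depend on opponents.
ACTIONS = ["fold", "call", "raise"]

def regret_matching_strategy(regrets):
    """Turn accumulated regrets into a probability over actions."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        # No positive regret yet: play uniformly at random.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]

def train(payoff, iterations=10_000, seed=0):
    """Self-play loop: pick an action, look back at what each
    alternative would have earned, and shift probability toward
    the actions that would have done better."""
    rng = random.Random(seed)
    regrets = [0.0] * len(ACTIONS)
    for _ in range(iterations):
        strategy = regret_matching_strategy(regrets)
        chosen = rng.choices(range(len(ACTIONS)), weights=strategy)[0]
        earned = payoff[chosen]
        for a, p in enumerate(payoff):
            regrets[a] += p - earned  # "would I have made more money?"
    return regret_matching_strategy(regrets)

# Toy payoffs (purely illustrative): raising happens to be best here.
final = train(payoff=[-1.0, 0.5, 1.0])
print(dict(zip(ACTIONS, (round(p, 3) for p in final))))
```

Note that nothing in the loop rewrites the program itself; only the regret table (and hence the probabilities derived from it) changes, which is the distinction being drawn above.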

*> These AIs are learning very narrowly-defined games in very simple
>> domains and a tiny set of well-defined rules.*
> I think you're whistling through the graveyard. Every day the field of
> AI's expertise becomes less narrow, and the super impressive thing is we
> didn't teach them how to do it, they taught themselves.

I think you're anthropomorphizing.

*> What they do is impressive to us in the same way that a calculator is
>> impressive to us at doing arithmetic. *
> A calculator doesn't get better at arithmetic every day and it can't
> teach itself things.

I didn't say it was the same thing, just impressive in the same way.

*> Things become unstable and unpredictable when the tool becomes more
>> intelligent than the tool user. It's called a singularity.*

Intelligence isn't everything, John. You have to consider motivation,
drive, understanding, knowledge, abilities, etc.
