[ExI] alpha zero

Sat Dec 9 05:43:50 UTC 2017

-----Original Message-----
From: extropy-chat [mailto:extropy-chat-bounces at lists.extropy.org] On Behalf Of Alejandro Dubrovsky
Sent: Friday, December 08, 2017 9:06 PM
To: extropy-chat at lists.extropy.org
Subject: Re: [ExI] alpha zero

On 09/12/17 08:38, spike wrote:
> 
> f5 picks up a knight for the pawn, then once one of the black knights 
> is sacked back, the position no longer puts the ug in ugly.
> 
>>... DeepMind has some esplainin to do?

>...I set the position up on StockFish 8. My computer doesn't have 64 CPU threads, so I only gave it one but I let it mull on it for a few hours to compensate. It flips between the two but it does seem to prefer Kh8 after all...
>...d5 44. Kd2 b4 45. Bxa5 which does look pretty tricky. If I play that through, its evaluation at the end is a solid 0.00...
>...but this position is so asymmetric and strange that I don't know what to think or how to go about winning it for white. Spike?

I have a theory which exonerates DeepMind of any attempt at deception, a theory which I think is most likely the right one.

The DeepMind guys aren't really chess guys, they are programmers.  So they aren't necessarily trying to create the top chess program, but rather to demonstrate a paradigm where software can teach itself, given a clear end goal.

Top level computer chess ends in draws most of the time, and the engines seldom get into crazy, interesting, asymmetrical positions like the one in game 5.  So what I think the DeepMind guys did was to set Stockfish to always play away from boring, safe, drawish situations.  Set it so that it hates draws, and it will play risky or even inferior moves to steer away from drawish situations.  That makes for fun coffeehouse chess, but at the top level, it loses.
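
To make the draw-aversion idea concrete, here is a toy sketch in Python.  It is not Stockfish's actual code, and it says nothing about how the match was really configured.  It only assumes the usual meaning of a "contempt" setting: a drawn line is scored as minus the contempt value (in centipawns) instead of 0.  The function name, the candidate lines and their evaluations are all made up for illustration.

```python
# Toy model of draw aversion ("contempt"). Everything here is hypothetical:
# the move names and evaluations are invented, not taken from game 5.

def pick_move(candidates, contempt_cp=0):
    """Return the name of the best-scoring candidate line.

    candidates: list of (name, eval_cp, leads_to_draw) tuples, where eval_cp
    is the evaluation in centipawns from the engine's own point of view.
    contempt_cp: how much the engine dislikes a draw; a drawn line is scored
    as -contempt_cp instead of its neutral evaluation.
    """
    def score(candidate):
        _name, eval_cp, leads_to_draw = candidate
        return -contempt_cp if leads_to_draw else eval_cp

    return max(candidates, key=score)[0]


# Two invented options for the side that stands slightly worse.
CANDIDATES = [
    ("simplify", 0, True),       # trade down into a dead-drawn ending
    ("complicate", -30, False),  # keep the pieces on, but stand a bit worse
]

print(pick_move(CANDIDATES, contempt_cp=0))   # neutral about draws
print(pick_move(CANDIDATES, contempt_cp=50))  # hates draws
```

With no contempt the toy engine takes the draw; with a contempt of 50 centipawns it prefers the line where it stands 0.30 worse, which is exactly the coffeehouse behavior described above.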

My theory, after spending most of the evening over the board, is that f5 forces this crazy game down into a more sedate drawish situation, and that in this position it is probably better for Stockfish to simplify down into a draw rather than keep playing for a win.  The f5 shot picks up a piece for a pawn, but then requires black to give back the piece a few moves later, which allows black to get the queens off the board, which connects black's rooks and de-uglifies black's terrible position, leading to a likely draw.

If I play it out over the board with black always playing into draws, black always gets a draw.  If I play for a win with black, then black always loses in that position (game 5, move 20).  So... my theory is that DeepMind set Stockfish to play for wins, perhaps not even realizing how much that disadvantages Stockfish, since they are not specifically chess guys.

If that theory is correct, here is the prediction: AlphaZero will compete in the next TCEC, or even sooner than that, will play this season's TCEC champion (Houdini 6.02) under conditions where the Houdini people control the settings.  Prediction: Houdini will win the match.

Further prediction: a Houdini/Alpha match will be held in the next few weeks.

Another prediction: it will be great fun to watch, and will have my undivided attention.

Conclusion: AlphaZero has accomplished something crazy cool here.  But the press is mostly getting it wrong.

spike