[ExI] DeepMind’s Vibrant New Virtual World Trains Flexible AI With Endless Play

John Grigg possiblepaths2050 at gmail.com
Sat Aug 7 16:56:52 UTC 2021

A major step forward for A.I. development?

"Last year
DeepMind researchers wrote that future AI developers may spend less time
programming algorithms and more time generating rich virtual worlds in
which to train them.

In a new paper released this week <https://arxiv.org/pdf/2107.12808.pdf> on
the preprint server arXiv, it would seem they’re taking the latter part of
that prediction very seriously.

The paper’s authors said they’ve created an endlessly challenging virtual
playground for AI. The world, called XLand, is a vibrant video game managed
by an AI overlord and populated by algorithms that must learn the skills to
navigate it.

The game-managing AI keeps an eye on what the game-playing algorithms are
learning and automatically generates new worlds, games, and tasks to
continuously confront them with new experiences.
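
The loop described above can be sketched in miniature: an outer process
proposes candidate tasks and keeps only those near the edge of the agent's
current ability. This is a hypothetical illustration of the idea, not
DeepMind's actual system; every function and name below is a placeholder.

```python
import random

def propose_task(rng):
    """Randomly compose a world layout, a goal, and co-players into a task."""
    return {
        "world": rng.choice(["open_arena", "maze", "platforms"]),
        "goal": rng.choice(["capture_flag", "hide", "tag", "hold_object"]),
        "num_players": rng.randint(1, 3),
    }

def estimate_success(agent_skill, task):
    """Stand-in for actually playing the game and measuring reward."""
    difficulty = {"capture_flag": 0.8, "hide": 0.5, "tag": 0.4, "hold_object": 0.3}
    return max(0.0, min(1.0, agent_skill - difficulty[task["goal"]] + 0.5))

def curriculum_step(agent_skill, rng, low=0.1, high=0.9):
    """Keep only tasks the agent can neither trivially solve nor hopelessly fail."""
    for _ in range(100):
        task = propose_task(rng)
        if low < estimate_success(agent_skill, task) < high:
            return task
    return propose_task(rng)  # fall back to any task if none qualify

rng = random.Random(0)
task = curriculum_step(agent_skill=0.6, rng=rng)
```

In the real system the success estimate comes from actual play, and the
task generator searches a far richer space of worlds and game rules; the
shape of the loop, though, is the same.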

The team said some veteran algorithms faced 3.4 million unique tasks while
playing around 700,000 games in 4,000 XLand worlds. But most notably, they
developed a general skillset not related to any one game, but useful in all
of them.

These skills included experimentation, simple tool use, and cooperation
with other players. General skills in hand, the algorithms performed well
when confronted with new games, including more complex ones, such as
capture the flag, hide and seek, and tag.

This, the authors say, is a step towards solving a major challenge in deep
learning. Most algorithms trained to accomplish a specific task—like, in
DeepMind’s case, to win at games such as Go or StarCraft—are savants.
They’re superhuman at the one task they know and useless at the rest. They
can defeat world champions at Go or chess, but have to be retrained from
scratch to do anything else.

By presenting deep reinforcement learning algorithms with an open-ended,
always-shifting world to learn from, DeepMind says its algorithms are
beginning to demonstrate “zero-shot” learning on never-before-seen tasks.
That is, they can perform novel tasks at a decent level without any
retraining.
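
Zero-shot evaluation amounts to freezing a trained policy and scoring it on
held-out tasks with no gradient updates. A minimal sketch, assuming a toy
`Policy` class and hypothetical task names (nothing here is DeepMind's API):

```python
class Policy:
    """A frozen policy; in practice this would wrap a trained neural network."""
    def act(self, observation):
        # Deterministic stand-in for a learned action choice.
        return hash(observation) % 4

def run_episode(policy, task, steps=10):
    """Roll out the fixed policy on a task and return a toy success score."""
    score = 0
    for t in range(steps):
        action = policy.act((task, t))
        score += 1 if action == 0 else 0  # pretend action 0 is the useful one
    return score / steps

# Tasks the agent never saw during training; weights stay fixed throughout.
held_out_tasks = ["capture_the_flag", "hide_and_seek", "tag"]
policy = Policy()
scores = {task: run_episode(policy, task) for task in held_out_tasks}
```

The key point is what is absent: no optimizer, no parameter updates, only
rollouts of an already-trained policy on unfamiliar tasks.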

This is a step towards more generally capable algorithms that can interact,
navigate, and solve problems in the also-endlessly-novel real world."