[ExI] New article on AGI dangers

BillK pharos at gmail.com
Sat Jul 30 20:31:16 UTC 2022


AGI Ruin: A List of Lethalities
by Eliezer Yudkowsky 5th Jun 2022    643 comments

<https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities>

Quote:
We can gather all sorts of information beforehand from less powerful
systems that will not kill us if we screw up operating them; but once
we are running more powerful systems, we can no longer update on
sufficiently catastrophic errors.  This is where practically all of
the real lethality comes from, that we have to get things right on the
first sufficiently-critical try.  If we had unlimited retries - if
every time an AGI destroyed all the galaxies we got to go back in time
four years and try again - we would in a hundred years figure out
which bright ideas actually worked.  Human beings can figure out
pretty difficult things over time, when they get lots of tries; when a
failed guess kills literally everyone, that is harder.  That we have
to get a bunch of key stuff right on the first try is where most of
the lethality really and ultimately comes from; likewise the fact that
no authority is here to tell us a list of what exactly is 'key' and
will kill us if we get it wrong.
-----------------

In AI safety terminology, this is called "the alignment problem": how to
build a machine with above-human intelligence that will still want to
benefit humans rather than end up killing everyone.
Eliezer sees this as an almost impossibly difficult problem.

BillK
