[ExI] Self improvement

Keith Henson hkeithhenson at gmail.com
Sat Apr 23 16:47:39 UTC 2011

On Sat, Apr 23, 2011 at 6:25 AM, Anders Sandberg <anders at aleph.se> wrote:

> I prefer to approach the whole AI safety thing from a broader standpoint,
> checking the different possibilities.

That seems like it would be a reasonable approach.

> AI might emerge rapidly, or more
> slowly. In the first case it is more likely to be unitary since there will
> be just one system recursively selfimproving, and there are some arguments
> that for some forms of softer takeoffs there would be a 'merging' of the
> goals if they are compatible (Shulman, Armstrong). In the case of softer
> takeoffs power differentials are going to be less extreme, but there will
> also be many more motivations and architectures around. Systems can be
> designed in different ways for safety - by reducing their capabilities
> (might or might not preclude recursive self improvement), various
> motivational measures, or by constraints from other systems of roughly equal
> power.

One such constraint is the amount of processing power available.  Hard
takeoff seems to imply that the AI either has or can obtain all the
processing it needs.  We do have an example of a worm that doubled
every 8.5 seconds sucking in computers from the net.  Now there's a
scary scenario, an AI running on MS boxes.
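
A back-of-envelope sketch of what that doubling rate implies, assuming
unconstrained exponential growth from a single infected host. The target of
one million machines is a made-up figure for illustration, not a number from
the worm data, and real spread would saturate as vulnerable hosts and
bandwidth run out.

```python
import math

DOUBLING_TIME_S = 8.5  # doubling time quoted above


def infected(t_seconds, initial=1):
    """Hosts infected after t_seconds, assuming pure exponential growth."""
    return initial * 2 ** (t_seconds / DOUBLING_TIME_S)


def time_to_reach(target, initial=1):
    """Seconds needed to grow from `initial` hosts to `target` hosts."""
    return DOUBLING_TIME_S * math.log2(target / initial)


# Hypothetical target population: one million hosts (an assumption).
seconds = time_to_reach(1_000_000)
print(f"{seconds:.0f} s to reach one million hosts")
```

Under those assumptions the answer comes out at a bit under three minutes,
which is the sense in which processing power might not be much of a
constraint for something that can spread this way.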

Of course the effect of an AI running on the net would be to slow down
its clock rate to nearly human speeds because of communication delay
between processing nodes.
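
A rough sketch of the slowdown, assuming every serial step of the computation
has to wait for one network round trip between nodes. The one microsecond of
local compute per step and the 100 ms round trip are illustrative guesses, not
measurements.

```python
def effective_step_rate(compute_s, latency_s):
    """Serial steps per second when each step waits on one round trip."""
    return 1.0 / (compute_s + latency_s)


# Hypothetical figures: 1 microsecond of compute per step, 100 ms round trip.
local = effective_step_rate(compute_s=1e-6, latency_s=0.0)
distributed = effective_step_rate(compute_s=1e-6, latency_s=0.1)
slowdown = local / distributed

print(f"local: {local:.0f} steps/s, distributed: {distributed:.1f} steps/s")
print(f"slowdown factor: {slowdown:.0f}")
```

With these numbers the distributed system manages only about ten serial steps
per second, however fast the individual nodes are. Work that can be done in
parallel without cross-node traffic would not be limited this way.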

> So we need to analyse a number of cases:
> Hard takeoff (one agent): capability constraints ("AI in a box" for
> example), motivational constraints ("friendly" designs of various kinds,
> CEV, values copied from person, etc), ability to prevent accidental or
> deliberate takeoffs

Motivational constraints based on humans require deep insight into
human motivation, and especially into understanding how human
motivations appear to change depending on external conditions.

Because it has never been in the "interest" of genes for humans to
understand gene-derived motivations, I suspect we are biased against
this kind of understanding.

Humans go to war on the basis of a pending resource crisis.  It would
not do for an AI with a copied human motivational system to wake up,
figure out it faced a resource crisis and was motivated to go to war
with the people who made it.

Agree with the rest of this.


> Soft takeoff (multiple agents): as above (but with the complication that
> defection of some agents is likely due to accident, design or systemic
> properties), various mutual balancing schemes, issues of safeguarding
> against or using 'evolutionary attractors' or coordination issues (e.g. the
> large scale economic/evolutionary pressures discussed in some of Robin's and
> Nick's papers, singletons)
> This is not a small field. Even the subfields have pretty deep ramifications
> (friendliness being a good case in point). It is stupid to spend all one's
> effort on one possible subsection and claiming this is the only one that
> matters. Some doubtlessly matter more, but even if you think there is 99%
> chance of a soft takeoff safeguarding against that 1% chance of a bad hard
> takeoff can be very rational. We will need to specialize ourselves in
> researching them, but we shouldn't forget we need to cover the possibility
> space well enough that we can start formulating policies (including even the
> null policy of not doing anything and just hoping for the best).
> --
> Anders Sandberg,
> Future of Humanity Institute James Martin 21st Century School Philosophy
> Faculty Oxford University
