[ExI] Unfriendly AI is a mistaken idea.

Fri Jun 1 16:58:00 UTC 2007

> Stathis Papaioannou wrote:
> 
> It seems a sorry state of affairs if we can't copy the behaviour 
> of a few protein molecules, and yet are talking about super-human 
> AI taking over the world. 

I used to feel this way, but then a particular analogy popped into my
head that clarified things a bit:

Why would I save for retirement today if it's not going to happen for
another 35 years or so?  I don't know what situation I'll be in then, so
why worry about it today?

Well, luckily I can leverage the experience of those who *have*
successfully retired.  And most of those who have done so don't tell me
that they built and sold a private business for millions of dollars.
What they tell me is that they planned and executed a chain of events
spanning the prior 40 years (yes, even those who have built and sold
companies say this first).

And the first year they saved for retirement, 40 years ago?  That didn't
give them an extra $5000 saved, even though that's all they put away in
year one.  What it gained them was an extra year of compounding results
tacked onto the tail-end of a 39 year interval.  It got them roughly
$37,000 more.  Not bad for one extra year's advance planning and $5000.
(This is assuming about $100/wk deposit at 5% APR compounded monthly,
starting 1 year apart.)
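
As a rough check on that arithmetic, here is a minimal Python sketch.  It
assumes the $100/wk is folded into equal end-of-month deposits (52 * 100 / 12
dollars each) at a nominal 5% annual rate compounded monthly.  The function
name and the deposit-timing convention are my own assumptions for
illustration; a different convention shifts the figures somewhat.  Under
these assumptions the extra first year is worth several times the roughly
$5000 actually deposited in it.

```python
def future_value(monthly_deposit, annual_rate, years):
    """Future value of equal end-of-month deposits, compounded monthly."""
    r = annual_rate / 12.0          # periodic (monthly) rate
    n = years * 12                  # number of monthly deposits
    return monthly_deposit * ((1.0 + r) ** n - 1.0) / r


# Assumption: $100/wk is treated as 52 * 100 / 12 dollars per month.
MONTHLY = 100.0 * 52 / 12
RATE = 0.05                         # 5% nominal annual rate

fv_40 = future_value(MONTHLY, RATE, 40)   # started saving one year earlier
fv_39 = future_value(MONTHLY, RATE, 39)   # started one year later
gain = fv_40 - fv_39                      # value of that extra first year

print(f"40 years of saving: ${fv_40:,.0f}")
print(f"39 years of saving: ${fv_39:,.0f}")
print(f"Gain from starting one year earlier: ${gain:,.0f}")
```

The gain is just the first year's deposits left to compound for the
remaining 39 years, which is why it dwarfs the amount put in.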

With AGI we don't have the benefit of experience, but I think it's
prudent to analyze potential classes of outcomes thoroughly before
someone has committed to actualizing that risk.  The Los Alamos
scientists didn't think it was likely that a nuke would ignite the
atmosphere, but they still ran the best calculations they could come up
with beforehand, just in case.  And starting sooner, rather than later,
often results in achieving a deeper understanding of the nature of the
problems themselves, things we haven't even identified as potential
issues today.  

I believe that's the real reason to worry about it now: not because
we're in a position to solve the problem of FAI, but because without
further exploration we won't even be able to state the full scope of the
problem we're trying to solve.  The reality is that until you actively
discover which requirements are necessary to solve a particular problem,
you can't architect a design that has a very good chance of working at
all, let alone avoids the generation of multiple side-effects.  So you
can do what evolution does and iterate through many implementations, at
huge cost and with even larger potential losses (considering that *we*
share that implementation environment), or you can iterate in your
design process, gradually constraining things into a space where only
a few full implementations (or one) need to be built.  And it is
reflection on this design-side iteration loop that can help identify
new concerns requiring additional design criteria and associated
mechanisms to accommodate them.

I guess my main position is that if we can use our intelligence to avoid
making expensive mistakes down the road, doesn't it make sense to try?
We might not be able to avoid those unknown mistakes *today*, but if we
can discern some general categories and follow those insights where they
might lead, then our perceptual abilities will slowly start to ratchet
forward into new areas.  We'll have a larger set of tools with which to
probe reality, and just maybe at some point during this process the
solution will become obvious, or at least tractable.

I agree with you that this course isn't intuitively obvious to me,
but I think this is because my intuitions discount the future in
degenerate ways, based on the fact that the scope for these kinds of
issues was not a major factor in the EEA.  This is one of those topics
on which I try to look past my intuitions, because while they quite
often have some wisdom to offer, sometimes they're just plain wrong.

-Chris Healey