[ExI] Unfriendly AI is a mistaken idea.

Christopher Healey CHealey at unicom-inc.com
Fri Jun 1 15:06:32 UTC 2007

> Stathis Papaioannou wrote:
> We don't have human level AI, but we have lots of dumb AI. In 
> nature, dumb organisms are no less inclined to try to take over 
> than smarter organisms 

Yes, but motivation and competence are not the same thing.  Considering
two organisms that are equivalent in functional capability, varying only
in intelligence level, the smarter one succeeds more often.  However,
within a small range of intelligence variation, other factors contribute
to one's aggregate ability to execute those better plans.  So if I'm a
smart chimpanzee, but physically weak, following particular courses of
action that may be closer to optimal in general carries greater risk.
Adjusting for that risk may actually leave me with a smaller range of
options than if I were physically stronger and a bit less smart.  But
when the intelligence differential is large, those other factors become
very small indeed.  Humans don't worry about chimpanzee politics (no
jokes here, please :o) because our only salient competition is other
humans.  We worry about those entities whose intelligence is at least in
the same range as our own.

Smart chimpanzees are not going to take over our civilization anytime
soon, but a smarter and otherwise well-adapted chimp will probably be
inclined to lead its band of peers, and will probably succeed.

> (and no less capable of succeeding, as a 
> general rule, but leave that point for the sake of argument). 

I don't want to leave it, because this is a critical point.  As I
mentioned above, in nature you rarely see intelligence considered as an
isolated variable, and in evolution, intelligence is the product of a
Red Queen race.  By definition (of a Red Queen race), your intelligence
isn't going to be radically different from your direct competition's, or
the race would never have started or escalated.  So, confusingly, it
might not look like your chances of beating "the Whiz on the block" are
that disproportionate, but the context is so narrow that other factors
can overwhelm the effect of intelligence over that limited range.  In
some sense, our experiential day-to-day understanding of intelligence
(other humans) biases us to consider its effects over too narrow a range
of values.  As a general rule, I'd say humans have been very much more
successful at "taking over" than chimpanzees or salmon, and that this is
primarily due to our superior intelligence.

> Given that dumb AI doesn't try to take over, why should smart AI 
> be more inclined to do so? 

I don't think a smart AI would be more inclined to try to take over, a
priori.  But assuming it has *some* goal or goals, it's going to use all
of its available intelligence in support of those ends.  Since the
future is uncertain, and overly directed plans can unnecessarily limit
other courses of action that may turn out to be required, it seems
highly probable that an increasingly intelligent actor would
increasingly seek to preserve its autonomy by constraining that of
others in *some* way.

Looking at Friendly AI in a bit of a non-standard way (kind of flipped
around), I'd expect *any* superintelligent AGI to constrain our autonomy
in some ways, to preserve its own.  That's basic security, and we all do
it to others through one means or another.  Friendly AI is about *how*
the AGI seeks to constrain our autonomy.  Instead of looking at it from
humanity's perspective (how can we launch a recursively improving
process that maintains some abstract invariant in its goals? i.e. we
don't know where it's going, but we have a strong sense of where it
*won't* be going), we can look at FAI from the AGI's viewpoint: how do I
assert such abstract invariants on other agents?  Which of my priorities
do I choose to merely satisfy, and which do I optimize against?  As my
abilities grow, do I increasingly constrain you, maintain fixed limits,
or allow your autonomy to expand along with my own (maintaining a
reasonably constant assurance level for my autonomy)?

From this perspective, FAI is about the complementary engagement of
humanity's autonomy with the AGI's.  It's about ensuring that the AGI's
representation of reality can include such complex attributions to begin
with, and then making sure that it has a sane starting point.  As
mentioned by others here, it needs *some* starting point, and it would
be irresponsible to simply assign one at random.

> And why should that segment of smart 
> AI which might try to do so, whether spontaneously or by malicious
> design, be more successful than all the other AI, which maintains 
> its ancestral motivation to work and improve itself for humans 

The consideration that also needs to be addressed is that the AI may
maintain its "motivation to work and improve itself for humans" and,
due to this motivation, take over (in some sense at least).  In fact,
it has been argued by others here (and I tend to agree) that an AGI
*consistently* pursuing such benign directives must intercede where its
causal understanding of certain outcomes passes a minimum assurance
level (which would likely vary based on the probability and magnitude
of the outcome).

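One way to read that varying "minimum assurance level" (purely
illustrative; the post gives no formula, and every name below is
hypothetical) is as an expected-impact threshold: the agent intercedes
only when probability times magnitude of an outcome crosses some bound,
so a rare catastrophe can outweigh a likely nuisance.

```python
# Illustrative sketch only -- not from the original post.
# Reads the "minimum assurance level" as an expected-impact
# threshold that scales with both probability and magnitude.

def should_intercede(probability: float, magnitude: float,
                     threshold: float = 1.0) -> bool:
    """Intercede when expected impact (probability * magnitude)
    reaches the threshold.  All parameters are hypothetical."""
    return probability * magnitude >= threshold

# A likely but low-stakes outcome falls below the threshold,
# while a rare but catastrophic one exceeds it:
print(should_intercede(0.9, 0.5))    # 0.45 expected impact
print(should_intercede(0.01, 500))   # 5.0 expected impact
```

The point of the toy is only that "assurance level" need not be a
fixed probability cutoff; it can trade probability against magnitude.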
It's up to our activities on the input side of building a functional
AGI to determine not just what it tries to do, but what it actually
accomplishes; pursuing goals very often creates side-effects.  These
side-effects need to be iterated back through the model, and hopefully
the results converge.  If they don't, you need a better model that
subsumes those side-effects.  Can AGI X represent this model-management
process to begin with?  Will it generalize this process in actuality?
How many errors will accrue, or for how long will it stomp on reality
before it *does* generalize these concepts?  Can the degenerate
outcomes during this period be reversed after the fact, or are certain
losses (deaths?) permanent?

This picture is what FAI, by my understanding, is intended to address.
And I think there is a lot to be gained by considering its complement:
Given the eventual creation of superintelligent AGI, what is the maximum
volume of autonomy that we can carve out for humanity in the space of
all possible outcomes, while minimizing the possibility our destruction,
and how do we achieve that?  This last question and FAI seem to be
different sides of the same coin.

-Chris Healey
