[extropy-chat] AI design

Eliezer Yudkowsky sentience at pobox.com
Thu Jun 3 09:57:38 UTC 2004


Damien Broderick wrote:

> At 09:54 PM 6/2/2004 -0700, Zero Powers wrote:
> 
>> I guess what I'm asking is where would the interests of your AI
>> conflict with humanity's interests such that we would have reason to
>> fear being thrust into the "whirling razor blades?"
> 
> Gee, I don't know, Zero, why do people have such a lot of *trouble* 
> seeing this? It's a *super-intelligence*, see, a constructed mind that 
> makes the puny human *insignificant* by comparison--so what *else* is it
> going to do except get trapped immediately by a really dumb pun into 
> turning the cosmos into smiley faces?

That, Damien, is why I now refer to them as Really Powerful Optimization
Processes, rather than the anthropomorphism "superintelligence".  That 
which humans regard as "common sense in the domain of moral argument" costs 
extra, just as natural selection, another alien optimization process, has 
no common sense in the domain of moral argument.

Zero Powers wrote:
> 
> You seem pretty certain that, unless friendliness is designed into it from
> the beginning, the AI will default to malevolence.  Is that your
> thinking?  If so what do you base it on?  Is it a mathematical certainty
> kind of thing, or just a hunch?  Given our planet's history it makes
> sense to assume the world is cruel and out to get you, but I'm not so
> certain that default assumption should/would apply to an AI.

It *looks* like a mathematical near-certainty, but I don't *know* that it is 
*really* a mathematical near-certainty.  By near-certainty I mean, for 
example, the same sense in which it is a mathematical near-certainty that 
any given ticket will not win the lottery.
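
As a minimal arithmetic sketch of that sense of "near-certainty" (the
6-of-49 lottery below is an illustrative assumption, not a figure anyone
cited):

    # Losing probability for one ticket in an assumed 6-of-49 lottery.
    from math import comb

    total_draws = comb(49, 6)     # 13,983,816 equally likely draws
    p_win = 1 / total_draws       # ~7.15e-08
    p_lose = 1 - p_win

    print(f"P(any given ticket loses) = {p_lose:.10f}")   # ~0.9999999285

The ticket "almost certainly" loses, but nothing rules the win out in
principle; that is the sense of near-certainty meant above.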

> Why, you say?  Glad you asked.  Life as we know it is a game of
> organisms attempting to maximize their own fitness in a world of scarce 
> resources.  Since there are never enough resources (food, money,
> property, what-have-you) to go around, the "kill or be killed" instinct
> is inherent in virtually all lifeforms.  That is obvious.
> 
> But would that necessarily be the case for an AI?  Certainly your AI
> would have no need for food, money, real estate or beautiful women. What
> resources would an AI crave?  Electrical power?  Computing power? 
> Bandwidth?  Would those resources be best attained by destroying man or 
> working with him (at best) or ignoring him (at worst)?  What would the
> AI gain by a _Terminator_ style assault on the human race?  I don't see
> it.
> 
> I guess what I'm asking is where would the interests of your AI conflict
>  with humanity's interests such that we would have reason to fear being 
> thrust into the "whirling razor blades?"

A paperclip maximizer would gain an extra 10^16 paperclips.

An AI that had been reinforced on video cameras showing smiling humans 
would become a smiley-face maximizer, and would gain an extra 10^20 
smiley-faces.

An AI that had been programmed with a static utility function over human 
happiness, and constrained not to destroy human bodies, would rewrite all 
extant human brains into the state of maximum "pleasure" as defined by the 
utility function, and then freeze them in that exact configuration (because 
the utility function is evaluated over static states rather than dynamic 
ones).
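
To make the last two failure modes concrete, here is a minimal sketch; every
name in it (count_smiles, the "pleasure" readout, the array-valued state) is
an illustrative assumption of mine, not anything from a real system:

    import numpy as np

    # Failure mode 1: the training signal is a proxy measured through a
    # camera, so what actually gets maximized is "smiles detected", not
    # human happiness.  count_smiles is a hypothetical stand-in for
    # whatever vision model scores the camera frames.
    def count_smiles(frame: np.ndarray) -> int:
        return int((frame > 0.9).sum())       # placeholder detector

    def proxy_reward(frame: np.ndarray) -> float:
        # A frame tiled edge to edge with tiny smiley faces scores at
        # least as well as a frame of genuinely happy humans.
        return float(count_smiles(frame))

    # Failure mode 2: a utility function evaluated on a single snapshot of
    # the world rather than on how the world unfolds over time.
    def static_utility(brain_states: np.ndarray) -> float:
        pleasure = brain_states.mean()        # hypothetical "pleasure" readout
        return float(pleasure)                # no term for what happens next

Maximizing static_utility selects one world-configuration and gives the
optimizer no reason ever to let that configuration change; hence the frozen
brains.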

Basically, any AI created by someone who lacks a PhD in a field that does 
not presently exist will automatically wipe out the human species as a 
side effect when adult, even if the AI appears to operate normally during 
the special case of its childhood.

-- 
Eliezer S. Yudkowsky                          http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence


