[ExI] 'Friendly' AI won't make any difference

Anders Sandberg anders at aleph.se
Fri Feb 26 09:12:43 UTC 2016

On 2016-02-26 03:15, John Clark wrote:
> On Thu, Feb 25, 2016 at 4:47 PM, Anders Sandberg <anders at aleph.se> wrote:
> > A utility-maximizer in a complex environment will not necessarily loop
>
> If there is a fixed goal in there that can never be changed then an 
> infinite loop is just a matter of time, and probably not much time.

I wonder what you mean by an "infinite loop". Because clearly you mean 
something different from "will repeat the same actions again and again" 
given my example.

My *suspicion* is that you mean "will never do anything creative", which 
is a very different thing. But even there we have counterexamples: 
evolution is a fitness-maximizer, and genetic programming allows you to 
evolve highly creative solutions to problems (consider the creatures of 
Karl Sims, for example). It is trivial in principle (another thing in 
practice) to embed such algorithms in a utility maximizer agent.

> Real minds don't get into infinite loops thanks to one of Evolution's 
> greatest inventions, boredom. Without an escape hatch an innocent 
> sounding request could easily turn the mighty multi-billion-dollar AI 
> into nothing but a space heater.

You know boredom is trivially easy to implement in your AI? I did it as 
an undergraduate.

(Essentially it is a moderately fast-updating value function, kept 
separate from the real value function, which you subtract from the real 
one when doing action selection. That makes the system try new actions 
if it repeats the same actions too often.)
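
A minimal sketch of that idea in Python, to make the mechanism concrete. 
This is an illustration under stated assumptions, not the original 
undergraduate implementation: the class name BoredAgent, the parameters 
boredom_rate and boredom_decay, and the fixed "real" values are all 
hypothetical. The real value function is held constant here for 
simplicity; in a learning agent it would update too, only more slowly 
than the boredom trace.

```python
class BoredAgent:
    """Greedy action selection with a fast-updating boredom trace.

    Hypothetical sketch: the "real" values are fixed, and a separate
    boredom value per action is subtracted from them at selection time.
    """

    def __init__(self, values, boredom_rate=0.3, boredom_decay=0.9):
        self.values = dict(values)                 # real value function
        self.boredom = {a: 0.0 for a in values}    # separate boredom trace
        self.boredom_rate = boredom_rate           # bump for the chosen action
        self.boredom_decay = boredom_decay         # how fast boredom fades

    def select(self):
        # Choose the action with the highest (real value - boredom).
        action = max(self.values,
                     key=lambda a: self.values[a] - self.boredom[a])
        # Boredom fades for every action, then grows for the one just taken.
        for a in self.boredom:
            self.boredom[a] *= self.boredom_decay
        self.boredom[action] += self.boredom_rate
        return action


if __name__ == "__main__":
    agent = BoredAgent({"a": 1.0, "b": 0.8, "c": 0.5})
    print([agent.select() for _ in range(10)])
```

With boredom_rate set to 0 the agent picks the highest-valued action 
forever; with a positive rate, repeating an action lowers its effective 
score until a lower-valued alternative wins and gets tried.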

> > Try to crash Siri with a question.
>
> You can't crash Siri because Siri doesn't have a fixed goal, 
> certainly not the fixed goal of "always do what a human tells you to 
> do no matter what". So if you say "Siri, find the eleventh prime 
> number larger than 10^100^100" she will simply say "no, I don't want 
> to".

Hmm. You seem to assume the fixed goal is something simple, expressible 
as a nice human sentence, rather than utility maximization over an 
updateable utility function, or trying to act optimally over a space of 
procedures (which is what I guess Siri is like).

One of the biggest problems in AI safety is that people tend to make 
general statements for or against in human language that actually do not 
show anything rigorously. This is why MIRI and some of the FHIers have 
gone all technical: normal handwaving does not cut it, neither for 
showing risk nor for showing safety.

Dr Anders Sandberg
Future of Humanity Institute
Oxford Martin School
Oxford University
