[ExI] 'Friendly' AI won't make any difference
anders at aleph.se
Fri Feb 26 09:12:43 UTC 2016
On 2016-02-26 03:15, John Clark wrote:
> On Thu, Feb 25, 2016 at 4:47 PM, Anders Sandberg <anders at aleph.se> wrote:
> > A utility-maximizer in a complex environment will not necessarily loop
> If there is a fixed goal in there that can never be changed then an
> infinite loop is just a matter of time, and probably not much time.
I wonder what you mean by an "infinite loop". Because clearly you mean
something different from "will repeat the same actions again and again"
given my example.
My *suspicion* is that you mean "will never do anything creative", which
is a very different thing. But even there we have counterexamples:
evolution is a fitness-maximizer, and genetic programming allows you to
evolve highly creative solutions to problems (consider the creatures of
Karl Sims, for example). It is trivial in principle (another thing in
practice) to embed such algorithms in a utility maximizer agent.
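To make the point concrete, here is a minimal sketch in Python of a
fitness-maximizer with a fixed objective that still explores new
candidate solutions. It is a toy genetic algorithm over bit strings, not
Karl Sims's system and not genetic programming proper; the function name
evolve, its parameters, and the example fitness function (count of 1
bits) are all assumptions made for illustration.

```python
import random


def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy genetic algorithm: maximize a fixed fitness function over bit strings.

    Returns the best genome found and the best fitness seen at the start of
    each generation. All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Start from a random population of bit strings.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank by the fixed objective; the top half survives unchanged.
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]
        # Refill the population with mutated copies of surviving parents.
        children = []
        for _ in range(pop_size - len(parents)):
            parent = rng.choice(parents)
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in parent]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, history


if __name__ == "__main__":
    best, history = evolve(fitness=sum)
    print("best fitness per generation:", history)
    print("best genome:", best)
```

The goal (the fitness function) never changes, yet the search keeps
proposing candidates it has not tried before, because variation comes
from mutation rather than from changing the objective.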
> Real minds don't get into infinite loops thanks to one of Evolution's
> greatest inventions, boredom. Without an escape hatch an innocent
> sounding request could easily turn the mighty multi-billion-dollar AI
> into nothing but a space heater.
You know boredom is trivially easy to implement in your AI? I did it as
(Essentially it is a moderately fast-updating value function, separate
from the real value function, which you subtract from the real one when
doing action selection. That makes the system try new actions if it
repeats the same actions too often.)
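A minimal sketch of that mechanism in Python, under stated assumptions:
this is not the original implementation, and the class name BoredAgent
and the parameters boredom_rate and boredom_decay are hypothetical. The
real value function is held fixed here for simplicity; a second, faster
"boredom" value rises each time an action is chosen and decays
otherwise, and is subtracted from the real value at selection time.

```python
class BoredAgent:
    """Greedy action selection with a fast-updating 'boredom' penalty.

    Illustrative sketch only: names and default values are assumptions.
    """

    def __init__(self, values, boredom_rate=0.3, boredom_decay=0.9):
        self.values = dict(values)                # the "real" value function (fixed here)
        self.boredom = {a: 0.0 for a in values}   # separate, fast-updating penalty
        self.boredom_rate = boredom_rate          # how much one repetition adds
        self.boredom_decay = boredom_decay        # how quickly boredom fades

    def select(self):
        # Choose the action with the highest (real value - boredom).
        action = max(self.values,
                     key=lambda a: self.values[a] - self.boredom[a])
        # Boredom fades for every action, then rises for the one just taken.
        for a in self.boredom:
            self.boredom[a] *= self.boredom_decay
        self.boredom[action] += self.boredom_rate
        return action


if __name__ == "__main__":
    agent = BoredAgent({"a": 1.0, "b": 0.8, "c": 0.5})
    print([agent.select() for _ in range(10)])
```

With boredom_rate set to 0 the agent picks "a" forever; with the
defaults it takes "a" first and then switches to "b", since repeating
"a" has become less attractive than the next-best action.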
> Try to crash Siri with a question.
> You can't crash Siri because Siri doesn't have a fixed goal,
> certainly not the fixed goal of "always do what a human tells you to
> do no matter what". So if you say "Siri, find the eleventh prime
> number larger than 10^100^100" she will simply say "no, I don't want
Hmm. You seem to assume the fixed goal is something simple, expressible
as a nice human sentence, rather than utility maximization over an
updateable utility function, or trying to act optimally over a space of
procedures (which is what I guess Siri is like).
One of the biggest problems in AI safety is that people tend to make
general statements for or against in human language that actually do not
show anything rigorously. This is why MIRI and some of the FHIers have
gone all technical: normal handwaving does not cut it, neither for
showing risk nor for showing safety.
Dr Anders Sandberg
Future of Humanity Institute
Oxford Martin School