[ExI] 'Friendly' AI won't make any difference

Anders Sandberg anders at aleph.se
Thu Feb 25 21:47:40 UTC 2016

On 2016-02-25 21:33, John Clark wrote:
> On Thu, Feb 25, 2016 at 3:25 PM, Anders Sandberg <anders at aleph.se
> <mailto:anders at aleph.se>> wrote:
>
>>> There are indeed vested interests but it wouldn't matter even if
>>> there weren't, there is no way the friendly AI (aka slave AI) idea
>>> could work under any circumstances. You just can't keep outsmarting
>>> something far smarter than you are indefinitely
>>
>> Actually, yes, you can. But you need to construct utility functions
>> with invariant subspaces
> It's the invariant part that will cause problems, any mind with a
> fixed goal that can never change no matter what is going to end up in
> an infinite loop, that's why Evolution never gave humans a fixed meta
> goal, not even the goal of self preservation.

Sorry, but this seems entirely wrong. A utility-maximizer in a complex 
environment will not necessarily loop (just consider various 
reinforcement learning agents).
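To make that concrete, here is a toy sketch of the sort of agent I mean (a two-armed bandit with an ε-greedy policy; the payoff probabilities and parameter values are invented for illustration). The agent's utility function is completely fixed, yet its action sequence never settles into a repeating loop, because exploration keeps perturbing it while the value estimates still converge:

```python
import random

def bandit_pull(arm, rng):
    # Two-armed bandit: arm 0 pays 1 with prob 0.3, arm 1 with prob 0.7.
    return 1.0 if rng.random() < (0.3, 0.7)[arm] else 0.0

def run_agent(steps=5000, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [0.0, 0.0]   # estimated utility of each arm (the fixed goal)
    n = [0, 0]       # pull counts
    for _ in range(steps):
        # Explore with probability eps, otherwise act greedily.
        arm = rng.randrange(2) if rng.random() < eps else q.index(max(q))
        r = bandit_pull(arm, rng)
        n[arm] += 1
        q[arm] += (r - q[arm]) / n[arm]   # incremental mean update
    return q, n

q, n = run_agent()
```

Despite the invariant utility function, the agent ends up pulling the better arm most of the time and its estimates track the true payoffs; nothing about goal-fixity forces it into a cycle.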

And evolution doesn't care if organisms get stuck in loops, as long as 
they produce offspring first with high enough probability. Consider 
Pacific salmon, which spawn once and then die.
Sure, simple goal structures can produce simplistic agents. But we also 
know that agents with nearly trivial rules like Langton's ant can 
produce highly nontrivial behaviors (in the ant case whether it loops or 
not is equivalent to the halting problem). We actually do not fully know 
how to characterize the behavior space of utility maximizers.
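Langton's ant fits in a dozen lines, which makes the point vivid; a quick sketch (using the usual turn-right-on-white, turn-left-on-black convention) shows how little machinery generates the famous ~10,000 steps of apparent chaos followed by the 104-step "highway" cycle:

```python
def langtons_ant(steps):
    # Grid stored as the set of black cells; everything else is white.
    black = set()
    x = y = 0
    dx, dy = 0, -1                 # ant starts at the origin, facing "up"
    for _ in range(steps):
        if (x, y) in black:        # black cell: turn left, repaint white
            dx, dy = dy, -dx
            black.discard((x, y))
        else:                      # white cell: turn right, repaint black
            dx, dy = -dy, dx
            black.add((x, y))
        x, y = x + dx, y + dy      # step forward one cell
    return black, (x, y)
```

Run it for 12,000 steps and the ant is already on the highway, advancing two cells diagonally every 104 steps: behavior nobody would guess from the two-line rule.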

> If the AI has a meta goal of always obeying humans then sooner or 
> later stupid humans will unintentionally tell the AI to do something 
> that is self contradictory, or tell it to start a task that can never 
> end, and then the AI will stop thinking and do nothing but consume 
> electricity and produce heat.

AI has advanced a bit since the 1950s. You are aware that most modern 
architectures are not that fragile?

Try to crash Siri with a question.
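The standard engineering answer to a contradictory or never-ending command is not to "stop thinking" but to bound the computation. A minimal sketch of the idea (hypothetical names; generator-based tasks in Python stand in for whatever the real architecture uses):

```python
def run_with_budget(task, max_steps=10_000):
    """Run a possibly non-terminating task, but give up after max_steps
    so one bad command cannot hang the whole system."""
    it = iter(task)
    for _ in range(max_steps):
        try:
            result = next(it)
        except StopIteration:
            return ("done", None)
        if result is not None:        # task signalled completion
            return ("done", result)
    return ("budget exhausted", None)  # degrade gracefully, keep serving

def endless_task():
    while True:                        # "a task that can never end"
        yield None

def quick_task():
    for _ in range(3):
        yield None
    yield 42
```

The endless task burns its budget and gets abandoned; the well-posed one finishes normally. Any system that fields open-ended human requests needs some such cutoff, which is why a malformed question degrades an assistant's answer rather than crashing it.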

> And besides, ​if Microsoft can't guarantee that Windows will always 
> behave as we want I think it's nuts to expect a super intelligent AI to.

And *that* is the real problem, which I personally think the friendly AI 
people - many of them people I meet on a daily basis - are not 
addressing enough. Even a mathematically perfect solution is not going 
to be useful if it cannot be implemented, and ideally approximate or 
flawed implementations should converge to the solution.

This is why I am spending part of this spring reading up on validation 
methods and building theory for debugging complex adaptive technological 
systems.

Anders Sandberg
Future of Humanity Institute
Oxford Martin School
Oxford University

