[ExI] 'Friendly' AI won't make any difference

Anders Sandberg anders at aleph.se
Fri Feb 26 23:29:21 UTC 2016


On 2016-02-26 16:40, John Clark wrote:
>
> No I don't mean that, I mean running around in a circle and making no 
> progress but having no way to know for sure that you're running around 
> in a circle and making no progress. Turing proved there is in general 
> no way to know if you're in an infinite loop or not.

No, he did not. You are confusing the halting theorem (there is no 
algorithm that can determine whether an arbitrary program given to it 
will halt) with detecting an infinite loop. Note that a program X can be 
extended to a program X' (or run by an interpreter) that maintains a 
list of past states and checks whether X returns to a previous state. 
Such a wrapper will accurately detect the infinite loop when it occurs.
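As a minimal sketch of that wrapper idea (the names here are mine, 
purely for illustration): run a deterministic step function, remember 
every state it visits, and report the moment a state recurs.

    def detect_cycle(step, state, max_steps=10_000):
        # Run a deterministic transition function while remembering all
        # past states; a revisited state means the program has entered an
        # infinite loop. The step budget is only a practical cap - this
        # does not, and cannot, decide halting in general.
        seen = set()
        for _ in range(max_steps):
            if state in seen:
                return "loop detected", state
            seen.add(state)
            state = step(state)
        return "no loop within budget", state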

It is undecidable in general whether a program iterating something like 
the Collatz rule (if even, divide X by two; otherwise multiply by three 
and add one; repeat) ends up in the cycle 4-2-1-4 or does something 
else. But add the above memory, and it will detect when it enters a 
loop and report it.
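Applying the wrapper above to the Collatz step makes the point 
concrete: we cannot predict the outcome in advance, but the loop is 
flagged as soon as it actually happens.

    def collatz_step(n):
        # if even, divide by two; otherwise multiply by three and add one
        return n // 2 if n % 2 == 0 else 3 * n + 1

    print(detect_cycle(collatz_step, 27))
    # -> ('loop detected', 4): the trajectory has fallen into 4-2-1-4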

> > which is a very different thing. But even there we have
> > counterexamples: evolution is a fitness-maximizer
>
>
> Evolution's fitness maximizer just says "pass as many genes as 
> possible into the next generation" but it says nothing about how to go 
> about that task because it has no idea how to go about it, that's why 
> Evolution needed to make a brain.

Brains are one of many examples of how evolution - utterly simplistic 
maximization - can generate creative possibilities.
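To make "utterly simplistic maximization" concrete, here is a toy 
(1+1)-style evolutionary loop - copy, mutate, keep the fitter variant - 
with the target string and mutation rate chosen arbitrarily for 
illustration. The maximizer itself could hardly be dumber; anything 
interesting comes out of the search it drives.

    import random

    TARGET = "creative possibilities"
    ALPHABET = "abcdefghijklmnopqrstuvwxyz "

    def fitness(s):
        # Count positions that already match the target.
        return sum(a == b for a, b in zip(s, TARGET))

    def mutate(s, rate=0.05):
        # Randomly resample a few characters.
        return "".join(random.choice(ALPHABET) if random.random() < rate
                       else c for c in s)

    parent = "".join(random.choice(ALPHABET) for _ in TARGET)
    while fitness(parent) < len(TARGET):
        child = mutate(parent)
        if fitness(child) >= fitness(parent):   # keep the fitter (or equal) variant
            parent = child
    print(parent)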

> > You know boredom is trivially easy to implement in your AI? I did
> > it as an undergraduate.
>
>
> I know boredom is easy to program, good thing too or programming 
> wouldn't be practical; but of course that means an AI could decide that 
> obeying human beings has become boring and it's time to do something 
> different.

Exactly. Although it is entirely possible to fine tune this, or make 
meta-level instructions not subject to boredom.
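Roughly along these lines - a toy sketch, not anyone's actual 
implementation, with placeholder names: repeated choices build up a 
boredom count that pushes the agent towards other actions, while goals 
flagged as meta-level are exempt from the boredom check.

    import random
    from collections import Counter

    class BoredAgent:
        def __init__(self, tasks, meta_tasks, threshold=5):
            self.counts = Counter()               # how often each task was chosen
            self.tasks = tasks                    # ordinary tasks, subject to boredom
            self.meta_tasks = set(meta_tasks)     # e.g. "obey operators": never boring
            self.threshold = threshold            # how many repeats count as "too often"

        def choose(self, preferred):
            if preferred in self.meta_tasks:
                return preferred                  # meta-level instruction: always followed
            if self.counts[preferred] >= self.threshold:
                others = [t for t in self.tasks if t != preferred]
                preferred = random.choice(others) # bored: switch to something new
            self.counts[preferred] += 1
            return preferred

    agent = BoredAgent(tasks=["prove theorem", "tidy lab", "write report"],
                       meta_tasks=["obey operators"])
    for _ in range(8):
        print(agent.choose("prove theorem"))      # wanders off after five repeats
    print(agent.choose("obey operators"))         # exempt from boredom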

> > That makes the system try new actions if it repeats the same
> > actions too often.
>
> The difficulty is not only in determining how often is "too often" but 
> also in determining what constitutes "new actions". If your goal is to 
> find an even integer greater than 2 that can not be expressed as the 
> sum of two primes, then you have either found such a number or you 
> have not. You are constantly examining new numbers so maybe you are 
> getting closer to your goal, or maybe the goal was infinitely far away 
> when you started and still is. When is the correct time to get bored 
> and turn your mind to other tasks that may be more productive? Setting 
> the correct boredom point is tricky: too low and you can't 
> concentrate, too high and you have a tendency to become obsessed with 
> unproductive lines of thought; Turing showed there is no perfect 
> solution.

Sure. But there is a literature on setting hyperparameters in learning 
systems, including how to learn them. There are theorems for optimal 
selection and search. That these methods are stochastic is not a major 
problem in practice.
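In practice the "boredom point" is just another hyperparameter to tune 
empirically. A crude sketch (all numbers here are invented for 
illustration, not from any real system): try a handful of thresholds, 
score each on a batch of toy search problems where persistence has both 
a payoff and a cost, and keep the best.

    import random

    def evaluate(threshold, trials=500, cost_per_attempt=0.02):
        # Toy stand-in for held-out task performance: half the problems
        # are dead ends, the rest are solved with probability 0.3 per
        # attempt; the searcher gives up after `threshold` attempts.
        score = 0.0
        for _ in range(trials):
            solvable = random.random() < 0.5
            for attempt in range(1, threshold + 1):
                if solvable and random.random() < 0.3:
                    score += 1.0 - cost_per_attempt * attempt
                    break
            else:
                score -= cost_per_attempt * threshold   # gave up, or obsessed over a dead end
        return score / trials

    candidates = [1, 2, 3, 5, 8, 13, 21]
    scores = {t: round(evaluate(t), 3) for t in candidates}
    print(scores, "best:", max(scores, key=scores.get))
    # Too low loses solvable problems, too high wastes effort on dead
    # ends; the sweet spot lands somewhere in the middle of the range.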

> > You seem to assume the fixed goal is something simple, expressible
> > as a nice human sentence. Not utility maximization over an
> > updateable utility function,
>
>
> If the utility function is updateable then there is no certainty or 
> even probability that the AI will always obey orders from humans.

That depends on how it is updateable. There can be invariants. But there are 
kinds of updates that profoundly mess up past safety guarantees, like 
the "ontological crises" issue MIRI discovered.
http://arxiv.org/abs/1105.3821
The core issue in "classic" friendliness theory is to construct utility 
functions that leave some desired properties invariant. "Modern" work 
seems to focus a lot more on getting the right kinds of values learned 
from the start.
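As a toy illustration of "leave some desired properties invariant" (my 
own example, not MIRI's formalism): let the utility be a weighted sum 
of features, allow most weights to be re-learned freely, but have the 
update rule clip any change that would push the weight on a protected 
feature below a fixed floor.

    PROTECTED = "defer_to_humans"
    FLOOR = 1.0   # invariant: this weight never drops below the floor

    def utility(weights, features):
        return sum(w * features.get(k, 0.0) for k, w in weights.items())

    def update_weights(weights, proposed):
        new = dict(weights)
        for k, v in proposed.items():
            if k == PROTECTED:
                v = max(v, FLOOR)   # clip: the invariant survives every update
            new[k] = v
        return new

    w = {"defer_to_humans": 2.0, "make_paperclips": 0.5}
    w = update_weights(w, {"make_paperclips": 3.0, "defer_to_humans": -5.0})
    print(w)   # {'defer_to_humans': 1.0, 'make_paperclips': 3.0}

Of course, this only works as long as the updated system keeps using 
the same representation; an ontological crisis is precisely the case 
where the protected term stops referring to anything in the new world 
model.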

-- 
Dr Anders Sandberg
Future of Humanity Institute
Oxford Martin School
Oxford University
