[ExI] Safety of human-like motivation systems [WAS Re: Oxford scientists...]

Thu Feb 3 20:50:18 UTC 2011

On 02/03/2011 07:41 AM, John Clark wrote:
> On Feb 2, 2011, at 1:49 PM, Richard Loosemore wrote:
>>
>> Anything that could get into such a mindless state, with no true 
>> understanding of itself or the world in general, would not be an AI.
>
> That is not even close to being true and that's not just my opinion, 
> it is a fact as certain as anything in mathematics. Goedel proved 
> about 80 years ago that some statements are true but there is no way 
> to prove them true. And you can't just ignore those troublemakers 
> because about 75 years ago Turing  proved that in general there is no 
> way to identify such things, no way to know if something is false or 
> true but unprovable.

Actually, I don't read the proof as doing that, though it is often taken 
as if it did.  What it did do is show that, for the domain of formally 
definable mathematical claims in a closed system using formalized logic, 
there are claims that cannot be proven or disproven.  That is a bit 
different than saying that in general there are countless claims that 
cannot be proven or disproven and that you can't even tell when you are 
dealing with one.  That is a much broader thing than was actually shown, 
as I see it.  I could be wrong.

> Suppose the Goldbach Conjecture is unprovable (and if it isn't there 
> are a infinite number of similar statements that are) and you told the 
> AI to determine the truth or falsehood of it;  the AI will be grinding 
> out numbers to prove it wrong but because it is true it will keep 
> testing numbers for eternity and will never find a counter example to 
> prove it wrong because it is in fact true.

Actually, your argument assumes:
a) that the AI would take the find-a-counterexample path as its only or 
best path in looking for a disproof;
b) that the AI has nothing else on its agenda and does not take into 
account any time limits, resource constraints and so on.

Generally there is no reason to suppose a decent AI operates without 
limits or understanding of limits and desirability constraints.
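
To make (b) concrete, here is a minimal sketch in Python of a 
counterexample search that respects a budget.  The function names, the 
max_even cap and the time_budget_s parameter are all made up for 
illustration; this is not a claim about how any real AI would be built, 
only that giving up is an ordinary thing for a search procedure to do.

```python
import time


def is_prime(n):
    """Trial-division primality test; fine for small illustrative inputs."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def has_goldbach_pair(n):
    """True if the even number n >= 4 is a sum of two primes."""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))


def search_counterexample(max_even, time_budget_s):
    """Look for an even number with no Goldbach pair, within limits.

    Returns ("counterexample", n) if one is found, otherwise
    ("gave_up", last_even_checked) once the cap or the time budget
    (both hypothetical knobs) is exhausted.
    """
    deadline = time.monotonic() + time_budget_s
    last_checked = 2  # nothing checked yet
    for n in range(4, max_even + 1, 2):
        if time.monotonic() > deadline:
            return ("gave_up", last_checked)
        if not has_goldbach_pair(n):
            return ("counterexample", n)
        last_checked = n
    return ("gave_up", last_checked)


if __name__ == "__main__":
    print(search_counterexample(max_even=1000, time_budget_s=5.0))
```

A "gave_up" result says nothing about whether the conjecture is true; it 
only reports that the search stopped, leaving the caller free to try a 
different approach or move on to something else on its agenda.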

> And because it is unprovable the AI will never find a proof, a 
> demonstration of its correctness in a finite number of steps, that 
> shows it to be correct. In short Turing proved that in general there 
> is no way to know if you are in a infinite loop or not.

An infinite loop is a very different thing than an endless quest for a 
counter-example.  The latter is orthogonal to infinite loops.  An 
infinite loop in the search procedure would simply be a bug.

>
> The human mind does not have this problem because it is not a fixed 
> axiom machine, human beings have the glorious ability to get bored, 
> and that means they can change the basic rules of the game whenever 
> they want.

Humans are such sloppy computational devices that they just wander away 
from the point and get distracted by something else only a very few 
steps down the road.  That is usually not a matter of consciously 
changing the basic rules.

> But your friendly (that is to say slave) AI must not do that because 
> axiom #1 must now and forever be "always obey humans no matter what", 
> so even becoming a space heater will not bore a slave (sorry friendly) 
> AI. And there are simpler ways to generate heat.

Well, if you or anyone wants to build a really really stupid AI then as 
you say there are indeed simpler ways to generate heat.

- samantha