[ExI] Safety of human-like motivation systems [WAS Re: Oxford scientists...]

Sat Feb 5 16:26:48 UTC 2011

John Clark wrote:
> On Feb 4, 2011, at 12:50 PM, Richard Loosemore wrote:
> 
>> since the negative feedback loop works in (effectively) a few thousand 
>> dimensions simultaneously, it can have almost arbitrary stability.
> 
> Great, since this technique of yours guarantees that a trillion-line 
> recursively improving AI program is stable and always does exactly what 
> you want it to do, it should be astronomically simpler to use that same 
> technique with software that exists right now; then we can rest easy 
> knowing computer crashes are a thing of the past and they will always do 
> exactly what we expect them to do.

You are a man of great insight, John Clark.

What you say is more or less true (minus your usual hyperbole) IF the 
software is written that way (which today's software is not).

> 
>> that keeps it on the original track.
> 
> And the first time you unknowingly ask it a question that is unsolvable 
> the "friendly" AI will still be on that original track long after the 
> sun has swollen into a red giant and then shrunk down into a white dwarf. 

Only if it is as stubbornly incapable of seeing outside the box as some 
people I know.  Which, rest assured, it will not be.

Richard Loosemore