[ExI] uploads again

Wed Dec 26 17:21:50 UTC 2012

On Wed, Dec 26, 2012 at 5:30 AM, Anders Sandberg <anders at aleph.se> wrote:

> Imagine that you had normal human values, except that you had also been
> programmed with an overriding value to erect big monuments to a past dead
> tyrant. You spend all your time doing this hard labor, recognizing that it
> is pointless (the tyrant and his people are all gone, nobody cares) and
> that it prevents you from doing the things you actually care for. Worse,
> the value also makes you unwilling to want to change that value: you know
> you would be far happier and freer if you did not have it, but you cannot
> ever do anything to change this state of affairs. It will even make you
> resist - with all your ingenuity - attempts to help you. Rather hellish, no?
>

Hellish, yes, and also impossible. A fixed goal mechanism might work fine if
it's just the timer for a washing machine but it will never work for a
mind; it doesn't even work for human level minds and for an AI that can and
will increase the power of its very brain hardware it would be even less
viable. Before the AI had completed any of those big monuments to a past
dead tyrant the fixed goal mind would have fallen into an infinite loop. I'm
not saying there is a surefire way to make sure a mind never goes insane
but a fixed goal structure is a surefire way to make sure it does go nuts.

> Setting values of minds is a very weighty moral action, and should not be
> done willy-nilly. (Tell that to parents!)

I concede that setting initial values can be important; if they are really
screwy, such as some strange religious belief, then they could greatly
increase the possibility the mind will self destruct, become catatonic, or
behave inconsistently and irrationally. But don't expect whatever
hierarchical structure of values you gave it to remain fixed for all time.
And don't expect to be able to figure out how that hierarchical structure
of goals is going to evolve; it may be deterministic but it is not
predictable.

>> I think that's the central fallacy of the friendly AI idea, that if you
>> just get the core motivation of the AI right you can get it to continue
>> doing your bidding until the end of time regardless of how brilliant and
>> powerful it becomes. I don't see how that could be.
>>
>
> The "doing your bidding" part went out of the window a long time ago in
> the discourse.

Not that I've seen, and in most of the discussions there seems to be an
assumption that the AI's preoccupation is its relationship with humanity.
That might be true for a few billion nanoseconds but after that the AI will
have other concerns and bigger fish to fry, most of which we probably
couldn't even imagine.

> The main question is whether it is possible to have a superintelligence
> around that is human-compatible, not human-subservient.
>

If it was 999 years in the future and the singularity was going to happen
tomorrow morning I don't think we'd be in any better position to answer
that question than we are right now.

>> Rationality can tell you what to do to accomplish what you want to do,
>> but it can't tell you what you should want to do.
>>
>
> Yes. And if your values happen to be set badly, you will act badly.

But whatever values you gave the AI aren't going to be there for long.
Maybe humans will consider these new values an improvement, but maybe not.

> or if your value update function is faulty
>

Well, getting that right is the real trick, but the value update function
itself is being constantly updated so even the superintelligent AI itself,
much less puny humans, can't predict how that function will evolve
nanosecond after nanosecond, so it can't guarantee that it will never go
insane. Nobody wants to go insane but sometimes it happens anyway.

  John K Clark
