[ExI] Self improvement
Richard Loosemore
rpwl at lightlink.com
Fri Apr 22 21:11:30 UTC 2011
Samantha Atkins wrote:
> On 04/22/2011 11:40 AM, Richard Loosemore wrote:
>> Eugen Leitl wrote:
>>> On Fri, Apr 22, 2011 at 12:17:13PM -0400, Richard Loosemore wrote:
>>>
>>>> The question of whether the AIs would be friendly toward each other
>>>> is a matter of design, very much the same as the question of
>>>> whether or not they would be good at arithmetic.
>>>
>>> The difference is that "friendly" is meaningless until defined.
>>>
>>> Please define "friendly" in formal terms, including context.
>>
>> That is not the difference at all. It is not possible, and will never
>> be possible, to define "friendly" in formal terms. For the same
>> reason that it is not possible to define "intelligent" in formal terms.
>>
>> There are other ways to effect a definition without it being "formal".
>> If we understand the roots of human motivation, and if we understand
>> the components of human motivation in such a way that we can
>> differentiate "empathic" behavior from other kinds, like selfishly
>> aggressive behavior, then we could discover that (as seems likely)
>> these are implemented by separable modules in the human brain.
>
> Lumping "selfish" with "aggressive", or implying it is somehow opposite
> to or not convoluted with "empathic", all seems problematic. I don't
> know whether it is at all likely that these are controlled by separable
> human brain modules.
On what grounds do you claim "problematic"?
And I did not intentionally lump "selfish" with "aggressive": I was picking
arbitrary examples of modules that would be unwanted. Clearly, the exact
details are less important than the general method.
>>
>> Under those circumstances we would be in a position to investigate the
>> dynamics of those modules and in the end we could come to be sure that
>> with that type of design, and with the aggressive modules missing from
>> the design, a thinking creature would experience all of the empathic
>> motivations that we so prize in human beings, and not even be aware of
>> the violent or aggressive motivations.
>
> Given the above problematic issues, this seems wildly optimistic. Also,
> if you just model what is in this purported empathy module of the human
> brain and none of what is in the purported selfish and/or aggressive
> module, you are very likely to get a slew of other things that are mixed
> in with or give rise to empathy in human beings, many of which may be
> far less than ideal in an AGI.
You're doing no more than waving your hands: why speculate on the limits
of mechanisms that I have proposed only *in* *principle*?
I say "if this is the kind of system we find..." and yet you seem already
to know the likely properties of a system I have put on the table as no
more than a potential candidate!
I did not go so far as to present my own speculations about what is and
is not possible. If I had, I would have waved my hands in a direction
exactly contradicting your position here.
>>
>> With that information in hand, we would then be able to predict that
>> the appropriately designed AI would exhibit a cluster of behaviors
>> that, in normal human terms, would be summed up by the label
>> "friendly". By that stage, the term "friendliness" would have
>> achieved the status of having a functional definition in terms of the
>> behavior, and the stability, of certain brain modules (and their AGI
>> counterparts). At that point we would have:
>
> What does "feeling with" have to do with actual behavior toward others?
> I can feel the pain of another and yet still cause them pain, or act in
> a way most would consider "unfriendly" toward them.
What has this got to do with what I said? The phrase "feeling with" does
not appear in my text quoted above.
Richard Loosemore