[ExI] Self improvement

Samantha Atkins sjatkins at mac.com
Sat Apr 23 08:20:02 UTC 2011


On Apr 22, 2011, at 2:11 PM, Richard Loosemore wrote:

> Samantha Atkins wrote:
>> On 04/22/2011 11:40 AM, Richard Loosemore wrote:
>>> Eugen Leitl wrote:
>>>> On Fri, Apr 22, 2011 at 12:17:13PM -0400, Richard Loosemore wrote:
>>>> 
>>>>> The question of whether the AIs would be friendly toward each other is a  matter of design, very much the same as the question of whether or not  they would be good at arithmetic.
>>>> 
>>>> The difference is that "friendly" is meaningless until defined.
>>>> 
>>>> Please define "friendly" in formal terms, including context.
>>> 
>>> That is not the difference at all.  It is not possible, and will never be possible, to define "friendly" in formal terms.  For the same reason that it is not possible to define "intelligent" in formal terms.
>>> 
>>> There are other ways to effect a definition without it being "formal".  If we understand the roots of human motivation, and if we understand the components of human motivation in such a way that we can differentiate "empathic" behavior from other kinds, like selfishly aggressive behavior, then we could discover that (as seems likely) these are implemented by separable modules in the human brain.
>> Lumping "selfish" with "aggressive", or implying that it is somehow opposed to or not entangled with "empathic", all seems problematic.  I don't know that it is at all likely these are controlled by separable human brain modules.
> 
> On what grounds do you claim "problematic"?


I would hope that is rather obvious.  Rational selfishness, for instance, is in no way synonymous with aggression or contrary to empathy.   

> 
> And, I did not intentionally lump selfish and aggressive:  I was picking random modules that would be unwanted.  Clearly, the exact details are less important than the general method.
> 

Selfishness in its highest sense is certainly not unwanted.  The general method is very doubtful.

> 
> 
> 
>>> 
>>> Under those circumstances we would be in a position to investigate the dynamics of those modules and in the end we could come to be sure that with that type of design, and with the aggressive modules missing from the design, a thinking creature would experience all of the empathic motivations that we so prize in human beings, and not even be aware of the violent or aggressive motivations.
>> Given the above problematic issues, this seems wildly optimistic.  Also, if you model only what is in this purported empathy module of the human brain and none of what is in the purported selfish and/or aggressive module, you are very likely to get a slew of other things that are mixed in with, or give rise to, empathy in human beings, many of which may be far less than ideal in an AGI.
> 
> You're doing no more than wave your hands:  why speculate on the limits of mechanisms that have only been proposed *in* *principle* by myself?
> 

You are waving your hands!  You are claiming, on mere supposition, that there are separate brain modules for empathy and for selfishness/aggression.  You are apparently assuming you can copy these to a machine version, which is quite an assumption, and that you can filter out the unwanted human aspects that may be entangled with them.  And you say I am waving my hands when I point out how doubtful this is?

Your "in principle" does not even seem like a likely guess to me at this point.

> I say "if this is the kind of system we find..."  and you seem to already know about the likely properties of the system I have put on the table as no more than a potential candidate!
> 
> I did not go as far as to present my own speculations as to what is and is not possible.  If I had, I would have waved my hands in a direction exactly contradicting your position here.
> 

Do tell.  If you aren't serious about the proposal, then please excuse me for taking it seriously, given my general respect for your intelligence and care in this area.

> 
>>> 
>>> With that information in hand, we would then be able to predict that the appropriately designed AI would exhibit a cluster of behaviors that, in normal human terms, would be summed up by the label "friendly".  By that stage, the term "friendliness" would have achieved the status of having a functional definition in terms of the behavior, and the stability, of certain brain modules (and their AGI counterparts).  At that point we would have:
>> What does "feeling with" have to do with actual behavior toward another?  I can feel the pain of another and yet still cause them pain, or act in a way most would consider "unfriendly" toward them.
> 
> What has this got to do with what I said?  The phrase "feeling with" is not in the above quoted text of mine.
> 

Please define "empathy" for me if it does not include that.  The obvious point I was attempting to make is that instilling empathy, even if you can, is not at all the same as a guarantee of friendliness.


- s


