[ExI] Self improvement

Fri Apr 22 19:14:37 UTC 2011

On 04/22/2011 11:40 AM, Richard Loosemore wrote:
> Eugen Leitl wrote:
>> On Fri, Apr 22, 2011 at 12:17:13PM -0400, Richard Loosemore wrote:
>>
>>> The question of whether the AIs would be friendly toward each other 
>>> is a matter of design, very much the same as the question of 
>>> whether or not they would be good at arithmetic.
>>
>> The difference is that "friendly" is meaningless until defined.
>>
>> Please define "friendly" in formal terms, including context.
>
> That is not the difference at all.  It is not possible, and will never 
> be possible, to define "friendly" in formal terms.  For the same 
> reason that it is not possible to define "intelligent" in formal terms.
>
> There are other ways to effect a definition without it being "formal". 
> If we understand the roots of human motivation, and if we understand 
> the components of human motivation in such a way that we can 
> differentiate "empathic" behavior from other kinds, like selfishly 
> aggressive behavior, then we could discover that (as seems likely) 
> these are implemented by separable modules in the human brain.

Lumping "selfish" with "aggressive", or implying that either is somehow 
the opposite of, or not intertwined with, "empathic", seems problematic.  
I don't know that it is at all likely that these are controlled by 
separable modules in the human brain.

>
> Under those circumstances we would be in a position to investigate the 
> dynamics of those modules and in the end we could come to be sure that 
> with that type of design, and with the aggressive modules missing from 
> the design, a thinking creature would experience all of the empathic 
> motivations that we so prize in human beings, and not even be aware of 
> the violent or aggressive motivations.

Given the problems above, this seems wildly optimistic.  Also, if you 
model only what is in this purported empathy module of the human brain, 
and none of what is in the purported selfish and/or aggressive module, 
you are very likely to get a slew of other things that are mixed in 
with or give rise to empathy in human beings, many of which may be far 
less than ideal in an AGI.

>
> With that information in hand, we would then be able to predict that 
> the appropriately designed AI would exhibit a cluster of behaviors 
> that, in normal human terms, would be summed up by the label 
> "friendly".  By that stage, the term "friendliness" would have 
> achieved the status of having a functional definition in terms of the 
> behavior, and the stability, of certain brain modules (and their AGI 
> counterparts).  At that point we would have:

What does "feeling with" have to do with actual behavior toward?  I can 
feel the pain of another and yet still cause them pain, or act in a way 
most would consider "unfriendly" toward them.

- samantha