[ExI] Brain emulation, regions and AGI [WAS Re: Kelly's future]
rpwl at lightlink.com
Sat Jun 4 16:41:46 UTC 2011
Kelly Anderson wrote:
> I'm not talking about the presence or absence of a particular module
> in the original design. Nor of a module sneaking its way into
> existence. Rather, I am talking about the more mundane unintended
> consequences. Whenever I try to come up with an optimization algorithm
> for AGI goals, I keep running into the wall (i.e. human extinction)
> because I don't think we can come up with such a function that doesn't
> have the unintended consequence of making humanity rather irrelevant
> and unimportant.
That is (almost certainly) because you are thinking in terms of the
AGI's motivation system as "algorithm" driven (in the sense of it having
what I have called a "goal stack"). This was what the last para of my
message was all about. In terms of goal-stack motivation mechanisms,
yes, there is no way to make the system stable enough and safe enough to
guarantee friendliness. Indeed, I think it is much worse than that:
there is in fact no way to ensure that such a motivation mechanism will
make an AGI stably *intelligent*, never mind friendly.
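To make the contrast concrete, here is a minimal sketch (all names hypothetical, not from any actual AGI codebase) of what a "goal stack" motivation mechanism amounts to: every decision collapses to maximizing one explicit objective at the top of the stack, so any mis-specification of that objective propagates to every action the system takes.

```python
# Illustrative sketch of a "goal stack" motivation mechanism.
# All names here are hypothetical; this is a toy, not an AGI design.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class GoalStackAgent:
    # Each goal is a utility function over world states; the goal on
    # top of the stack dominates, and subgoals exist only to serve it.
    stack: List[Callable[[Dict[str, float]], float]] = field(default_factory=list)

    def push(self, goal: Callable[[Dict[str, float]], float]) -> None:
        self.stack.append(goal)

    def choose(self, candidate_states: List[Dict[str, float]]) -> Dict[str, float]:
        # The whole system reduces to argmax of the top goal -- there is
        # no independent check on what that maximization tramples.
        top = self.stack[-1]
        return max(candidate_states, key=top)

agent = GoalStackAgent()
agent.push(lambda state: state["output"])  # one brittle supergoal
best = agent.choose([{"output": 3.0}, {"output": 7.0}])
```

The point of the toy is the single `argmax`: everything else in the system is instrumental to it, which is why an error in that one function has unbounded consequences.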
>> So, in the case of humans what that means is that even someone who happens
>> to be a non-violent person, as a result of upbringing or conscious decision,
>> will almost certainly have that module in there, but it will be very weak.
>> When building an AGI, it is not my plan to include modules like that and
>> then try to ensure that they stayed weak when the system was growing up ....
> Yes, we should absolutely try. It might buy us a few years. ;-)
>> that is not my idea of trying to guarantee friendliness! Instead, we would
>> leave the module out entirely. As a result the system might be unable to
>> understand the concept of "losing it" (i.e. outburst of anger), but that
>> would not be much of a barrier to its understanding of us.
> Anger is a useful emotion in humans. It helps you know what's
> important to work against. If AGI doesn't have the feeling of anger, I
> don't know how it will really understand us. Again, these differences
> seem as dangerous as the similarities, just in different ways.
No, no, no, it is not! I mean, I respectfully but firmly disagree! :-)
It can be made in such a way that it feels some mild frustration in some
circumstances. From that it can understand what anger is, by analogy.
But there is no reason whatever to suppose that it needs to experience
real anger in order to empathize with us. If you think this is not
true, can you explain your reasoning?
>> Bear in mind that part of my reason for talking in terms of modules is that
>> I have in mind a specific way to implement motivation and drives in an AGI,
>> and that particular approach is radically different from the "Goal Stack"
>> approach that is assumed by most people to be the only way to do it. One
>> feature of that alternate approach is that it is relatively easy to have
>> such modules. (Although, having said that, it is still one of the most
>> difficult aspects to implement).
> You still have to have some mechanism for determining which module
> "wins"... call it what you will.
Yes, but that is not a problem. If the competition is between modules
that are all fairly innocuous, why would this present a difficulty?
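As an illustration of why arbitration among innocuous modules is not itself dangerous, here is a hedged sketch (all module names and weights invented for the example) of the simplest possible winner-take-all scheme: each drive module scores the candidate actions, and the action with the highest combined activation "wins".

```python
# Hypothetical sketch of module-based arbitration. The modules and
# their weights are made up for illustration; the point is only that
# picking a "winner" among mild, innocuous drives is mechanically easy.
from typing import Callable, Dict, List

Module = Callable[[str], float]  # maps an action name to an activation level

def arbitrate(modules: Dict[str, Module], actions: List[str]) -> str:
    # Sum each module's activation for an action; highest total wins.
    def combined(action: str) -> float:
        return sum(module(action) for module in modules.values())
    return max(actions, key=combined)

modules: Dict[str, Module] = {
    "curiosity": lambda a: 1.0 if a == "explore" else 0.2,
    "sociality": lambda a: 0.8 if a == "chat" else 0.3,
    "tidiness":  lambda a: 0.6 if a == "clean" else 0.1,
}
winner = arbitrate(modules, ["explore", "chat", "clean"])
```

Because every module's contribution is bounded and mild, no single winner can drive extreme behavior; the stakes of the competition are only as high as the strongest drive in the pool.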