[ExI] Safety of human-like motivation systems [WAS Re: Oxford scientists...]

Thu Feb 3 19:20:17 UTC 2011

Stefano Vaj wrote:
> On 2 February 2011 17:40, Richard Loosemore <rpwl at lightlink.com> wrote:
>> The problem with humans is that they have several modules in the motivation
>> system, some of them altruistic and empathic and some of them selfish or
>> aggressive.   The nastier ones were built by evolution because she needed to
>> develop a species that would fight its way to the top of the heap.  But an
>> AGI would not need those nastier motivation mechanisms.
> 
> Am I the only one finding all that a terribly naive projection?

I fail to understand.  I am talking about mechanisms.   What projections 
are you talking about?

> Either we deliberately program an AGI to emulate evolution-driven
> "motivations", and we end up with either an uploaded (or a
> patchwork/artificial) human or animal or vegetal individual - where it
> might make some metaphorical sense to speak of "altruism" or
> "selfishness" as we do with existing organisms in sociobiological
> terms -; 

Wait!  There is nothing metaphorical about this.  I am not a poet, I am 
a cognitive scientist ;-).  I am describing the mechanisms that are 
(probably) at the root of your cognitive system.  Mechanisms that may be 
the only way to drive a full-up intelligence in a stable manner.

I do not know why you parody this.  It is just science.

> or we do not do anything like that, and in that case our AGI
> is neither more nor less saint or evil than my PC or Wolfram's
> cellular automata, no matter what its intelligence may be.

Again, where on earth did you get that from?  If you wish you can try to 
build a control system for an AGI, and use a design that has nothing to 
do with the human design.  But the question of "evil" behavior is not 
ruled in or out by the underlying features of the design; it is 
determined by the CONTENT of the mechanism, after the design stage.

Thus, a human-like motivation system can be given aggression modules, 
and no empathy module.  Result: psychopath.  Or the AGI can have some 
other mechanism, and someone can try to design it to follow goals that are 
aggressive and non-empathic.  Same result.

And vice versa for both.

The difference is in the stability of the motivation mechanism.  I claim 
that you cannot make a stable system AT ALL if you extrapolate from the 
"goal stack" control mechanisms that most people now assume are the only 
way to drive an AGI.

> We need not detract anything. In principle I do not see why an AGI
> should be  any less absolutely "indifferent" to the results of its
> action than any other program in execution today...
> 
This is quite wrong.  I am at a loss to explain:  it seems too obvious 
to need explaining.

Richard Loosemore