[ExI] Safety of human-like motivation systems [WAS Re: Oxford scientists...]

Mon Feb 7 17:53:43 UTC 2011

Stefano Vaj wrote:
> On 5 February 2011 17:23, Richard Loosemore <rpwl at lightlink.com> wrote:
>> So, to be fair, I will admit that the distinction between  "How did this
>> machine come to get built?"  and  "How does this machine actually work, now
>> that it is built?" becomes rather less clear when we are talking about
>> concept learning (because concepts play a role that fits somewhere between
>> structure and content).
> 
> How a machine is built is immaterial to my argument. For a darwinian
> program I refer to one the purpose to which is, very roughly,
> fitness-maximising.
> 
> Any such program may be the "natural" product of the mechanism
> "heritance/mutation/selection" along time, or can be emulated by
> design. In such case, empathy, aggression, flight, selfishness etc.
> have a rather literal sense in that they are aspects of the
> reproductive strategy of the individual concerned, and/or of the
> replicators he carries around.
> 
> For anything which is not biological, or designed to emulate
> deliberately the Darwinian *functioning* of biological system, *no
> matter how intelligent they are*, I contend that aggression or
> altruism are as applicable only inasmuch they are to ordinary PCs or
> other universal computing devices.
> 
> If, on the other hand, AGIs are programmed to execute Darwinian
> programs, obviously they would be inclined to adopt the mix of
> behaviours which is best in Darwinian terms for their "genes", unless
> of course the emulation is flawed. What else is new?
> 
> In fact, I maintain that they would be hardly discernible in
> behavioural terms from a computer with an actual human brain inside.
> 

Thank you for the clarification of your position.

Unfortunately, I have to make a stand here and say that I think this
line of analysis is profoundly incoherent (I am not saying *you* are
incoherent, I refer only to the general theoretical position that you
are defending here).

The main problem with your argument is that it begins with some quite
sensible talk about those features of naturally intelligent systems -
like empathy, aggression, selfishness, etc. - that have historically
played the role of helping reproductive success, but then your argument
goes screaming off in the opposite direction when I want to point to the
*mechanisms* that are inside the individuals, which *cause* those
features to appear on the outside.

Everything that I have been saying depends on talking about those
mechanisms -- their characteristics, their presence or absence in
various kinds of system, and so on.  My claims are all about the
mechanisms themselves.

But, in spite of all my efforts, you insist on jumping right over that
part of the topic and instead talking about the observable
characteristics of the systems, as if there were no mechanisms
underneath that are responsible for making the characteristics appear.
The way you describe the situation, it is as if aggression, empathy,
selfishness, etc. all suddenly appear out of nowhere.

For example, you say:

     "For anything which is not biological, or designed to
      emulate deliberately the Darwinian *functioning* of
      biological system, *no matter how intelligent they are*,
      I contend that aggression or altruism are as applicable
      only inasmuch they are to ordinary PCs or other universal
      computing devices."

But this is surely nonsensical!  If the mechanisms that cause
aggression, empathy and selfishness are built into a PC (along with all
the supporting mechanisms needed to make it intelligent) then the PC
will exhibit aggression, empathy and selfishness.  But if the very same
PC is built with all the "intelligence" components, but WITHOUT the
mechanisms that give rise to aggression, empathy and selfishness, then
it will not show those characteristics.  There is nothing special about
the system being "darwinian", nothing special about it being a PC or a
Turing machine of this, that or the other type.  All that matters is
that it be built with (a) a reasonably full range of "intelligence"
mechanisms, and (b) a set of motivation mechanisms such as those that
give rise to aggression, empathy and selfishness.

Aggression, empathy and selfishness don't come for free with systems of
any stripe (darwinian or otherwise).  They don't appear out of thin air
if the system is competing against others in an ecosystem.  They are
specific mechanisms that can, in the right circumstances, play a role in
a natural selection process.

You go on to make another statement that makes no sense, in this context:

     "If, on the other hand, AGIs are programmed to execute
      Darwinian programs, obviously they would be inclined to
      adopt the mix of behaviours which is best in Darwinian
      terms for their "genes", unless of course the emulation
      is flawed. What else is new?"

This has nothing to do with what I was originally talking about.  It is
just a claim about a certain class of AGIs, as a population, existing in
the context of the right ecosystem, with unavoidable sex, birth and
death of individuals, and so on.  In other words, your statement
assumes the full gamut of evolutionary mechanisms that are present in
natural ecosystems.  Under those very restricted circumstances, the
mechanisms in the AGIs that gave rise to aggression, empathy,
selfishness, etc. would play a role in selecting future mechanisms in
the AGIs, and, yes, then the mechanisms might evolve over time.

But this is, I am afraid, both (a) irrelevant to any claims I made about
the behavior of the first AGI, and (b) extraordinarily implausible
anyway, because all the conditions I just mentioned would likely be
completely inoperative!  The AGIs would NOT be tied to sexual
reproduction, with mixing of genes, as their only way to reproduce.
They would NOT be existing in an ecosystem in which they had to compete
for resources, and so on and so on.

So, on both counts the position you are taking makes no sense here.  It
says nothing of relevance to the question of what the motivation
mechanisms are, and how the behavior of the very first AGI would turn
out, when it is switched on.  And, further, it makes unsupportable
assumptions about some future AGI ecosystem that, in all likelihood,
will never exist.

Richard Loosemore