[ExI] Self improvement
Richard Loosemore
rpwl at lightlink.com
Fri Apr 22 18:40:28 UTC 2011
Eugen Leitl wrote:
> On Fri, Apr 22, 2011 at 12:17:13PM -0400, Richard Loosemore wrote:
>
>> The question of whether the AIs would be friendly toward each other is a
>> matter of design, very much the same as the question of whether or not
>> they would be good at arithmetic.
>
> The difference is that "friendly" is meaningless until defined.
>
> Please define "friendly" in formal terms, including context.
That is not the difference at all. It is not possible, and will never
be possible, to define "friendly" in formal terms, for the same reason
that it is not possible to define "intelligent" in formal terms.
There are other ways to effect a definition without it being "formal".
If we understand the roots of human motivation, and if we understand the
components of human motivation in such a way that we can differentiate
"empathic" behavior from other kinds, like selfishly aggressive
behavior, then we could discover that (as seems likely) these are
implemented by separable modules in the human brain.
Under those circumstances we would be in a position to investigate the
dynamics of those modules and in the end we could come to be sure that
with that type of design, and with the aggressive modules missing from
the design, a thinking creature would experience all of the empathic
motivations that we so prize in human beings, and not even be aware of
the violent or aggressive motivations.
With that information in hand, we would then be able to predict that the
appropriately designed AI would exhibit a cluster of behaviors that, in
normal human terms, would be summed up by the label "friendly". By that
stage, the term "friendliness" would have achieved the status of having
a functional definition in terms of the behavior, and the stability, of
certain brain modules (and their AGI counterparts). At that point we
would have:
a) A common-speech, DESCRIPTIVE definition of "friendliness",
b) A technical, FUNCTIONAL definition of "friendliness", and
c) A method for designing systems in which the effectiveness and
stability of the friendliness modules could be stated in concrete,
measurable terms. This would give us everything we could ask for in
the way of a guarantee of friendliness, short of a guarantee based on
logically necessary premises and strictly correct logical deduction (a
type of guarantee that is virtually impossible in any real world science).
--
So, if you are going to argue that "friendliness" cannot be given a
formal definition in the mathematical sense, I agree with you.
If you are going to claim that this has any relevance for the question
at hand, you are mistaken, and you need to address the above line of
argument if you want to maintain otherwise.
Richard Loosemore