[ExI] Losing control (was: Unfrendly AI is a mistaken idea.)

Sun Jun 17 22:06:11 UTC 2007

On 6/17/07, Samantha Atkins <sjatkins at mac.com> wrote:
>
> On Jun 17, 2007, at 1:02 AM, Eliezer S. Yudkowsky wrote:
> >> o.
> >
> > Cheap slogan.  What about five-year-olds?  Where do you draw the line?
> >
> > Someone says they want to hotwire their brain's pleasure center;
> > they say they think it'll be fun.  A nearby AI reads off their brain
> > state and announces unambiguously that they have no idea what'll
> > actually happen to them - they're definitely working based on
> > mistaken expectations.  They're too stubborn to listen to warnings,
> > and they're picking up the handy neural soldering iron (they're on
> > sale at Wal-Mart, a very popular item).  What's the moral course of
> > action? For you?  For society?  For a superintelligent AI?
>
>
> Good question and difficult to answer.  Do you protect everyone cradle
> to [vastly remote] grave from their own stupidity?  How exactly do
> they grow or become wiser if you do?  As long as they can recover
> (which can be very advanced in the future) to be a bit smarter I am
> not at all sure that direct intervention is wise or moral or best for
> its object.

Difficult to answer when presented in the vernacular, fraught with
vagueness, ambiguity, and unfounded assumptions. Straightforward when
restated in terms of a functional description of moral
decision-making.

In each case, the morality, or perceived rightness, of a course of
action corresponds to the extent to which the action is assessed as
promoting, in principle, over an increasing scope of consequences, an
increasingly coherent set of values of an increasing context of agents
identified with the decision-making agent as self.

In the context of an individual agent acting in effective isolation,
there is no distinction between "moral" and simply "good."  The
individual agent should (in the moral sense), following the
formulation above, take whatever course of action appears to best
promote its individual values.  In the first case above, we have no
information about the individual's value set other than what we might
assign from our own "common sense"; in particular we lack any
information about the relative perceived value of the advice of the
AI, so we are unable to draw any specific normative conclusions.

In the second and third cases above, it's not clear whether the
subject is intended to be a moral actor, assessor, or agent (both).
I'll assume here (in order to remain within practical email length)
that only passive moral assessment of the human's neurohacking was
intended.

The second case illustrates our most common view of moral judgment,
with the values of our society defining the norm. Most of our values
in common are encoded into our innate psychology and aspects of our
culture such as language and religion as a result of evolution, but
the environment has changed significantly over time, leaving us with a
relatively incoherent mix of values such as "different is dangerous"
vs. "growth thrives on diversity" and "respect authority" vs. "respect
truth", and countless others.  To the question at hand we can presume
to assign society's common-sense value set and note that the
neurohacking will have little congruence with common values, that what
congruence exists will suffer from significant incoherence, and that the
scope of desirable consequences will be largely unimaginable.  Given
this assessment in today's society, the precautionary principle would
be expected to prevail.

The third case, of a superintelligent but passive AI, would offer a
vast improvement in coherence over human capacity, but would be
critically dependent on an accurate model of the present values of
human society. When applied and updated in an **incremental** fashion
it would provide a superhuman adjunct to moral reasoning. Note the
emphasis on "incremental", because coherence does not imply truth
within any practical computational bounds.

- Jef