[ExI] Paul vs Eliezer

Tue Apr 5 00:58:32 UTC 2022

I posted this comment on Astral Codex Ten, regarding the debate between
Paul Christiano and Eliezer Yudkowsky:

I feel that neither Paul nor Eliezer is devoting enough attention to the
technical issue of where AI motivation comes from. Our motivational
system evolved over millions of years, and now its core tenet
of fitness maximization is being defeated by relatively trivial changes in
the environment, such as availability of porn, contraception and social
media. Where will the paperclip maximizer get the motivation to make
paperclips? The argument that we do not know how to ensure that a "good" goal
system survives self-modification cuts both ways: while one way for the AI's
goal system to go haywire may involve eating the planet, most
self-modifications would presumably result in a pitiful mess, an AI that
couldn't be bothered to fight its way out of a wet paper bag. Complicated
systems, like the motivational systems of humans or AIs, have many failure
modes, mostly of the pathetic kind (depression, mania, compulsions, or the
forever-blinking cursor, or the blue screen) and only occasionally dramatic
(a psychopath in control of the nuclear launch codes).

AI alignment research might learn a lot from fizzled self-enhancing AIs,
maybe enough to prevent the coming of the Leviathan, if we are lucky.

It would be nice to be able to work out the complete theory of AI
motivation before the FOOM, but I doubt it will happen. In practice, AI
researchers should devote a lot of attention to analyzing the details of AI
motivation at the already existing levels, and some tinkering might help us
muddle through.

-- 
Rafal Smigrodzki, MD-PhD
Schuyler Biotech PLLC