[ExI] Paul vs Eliezer

Darin Sunley dsunley at gmail.com
Tue Apr 5 02:34:37 UTC 2022


You may have addressed this with your "using resources given to it" clause,
but if I ask an athymhormic AI to calculate BB(20), the 20-state Busy Beaver
number, will it try to pave the entire future light cone of the universe
with computronium in search of the answer?

The problem of distinguishing between "resources given to it" and "atoms
within reach" may be nontrivial.

On Mon, Apr 4, 2022 at 8:12 PM Rafal Smigrodzki via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

>
>
> On Mon, Apr 4, 2022 at 10:04 PM Darin Sunley via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>> The problem with AI motivations isn't so much that we don't understand
>> evolved motivations - we understand them all too well. Well enough to know
>> that if you successfully re-implemented an evolved motivational stack in
>> nanotech-based hardware, it would be an existential threat to everything in
>> its future light cone.
>>
>> The trick is figuring out how to build a motivational stack that is as
>> /unlike/ an evolved motivational stack as possible, while still being
>> capable of having motivations at all. That's the hard part.
>>
>
> ### Exactly. This is what I further wrote on Scott's blog:
>
> Reinforcement learning is simple in principle, and simple in practice when
> applied to simple neural networks trained from scratch, but once you start
> talking about a self-modifying AI that contains a large-scale predictive
> model of the world, there are a lot of ways for the process to be derailed.
> The AI is likely to be built from layers of neural networks that have been
> trained or otherwise constructed to serve different goals (e.g. a large
> GPT-like language model grafted onto a Tesla FSD network trained inside a
> Tesla bot), and creating a coherent system capable of world-changing action
> by modifying this mess will not be trivial. Non-trivial means it is not
> likely to happen by accident, or at least, it will take quite a few
> accidents before the big one hits.
>
> I remember I bugged Eliezer about building the "athymhormic AI", as I
> called it, rather than Friendly AI, about 20 years ago. (Jeez, time flies!)
> The athymhormic AI would be an AI designed not to want to do anything
> except compute answers using resources given to it, just as an athymhormic
> human might be quite capable of answering your questions but incapable of
> doing much on his own. Creating things that don't do much is easier than
> creating things that do a lot. Creating things that just don't care about
> their own survival is possible. We may have the intuition that anything
> capable of thinking will naturally think about making it out of the box
> alive, but that intuition is built from observing naturally evolved brains,
> and staying alive is exactly what natural brains evolved for.
> Constructed minds will not automatically converge on the Omohundro goals,
> unless some specific structures are present to begin with, such as a goal
> of maximizing some real-world parameter, or an infinite regress of
> maximizing the precision of a calculation, or other such trip-hazards.
>
> At least I hope so.