[ExI] Against the paperclip maximizer or why I am cautiously optimistic

Sat Apr 15 10:01:45 UTC 2023

On Sat, Apr 15, 2023, 12:19 AM Rafal Smigrodzki via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

>
>
> On Tue, Apr 4, 2023 at 9:01 AM Jason Resch via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>>
>>
>> On Tue, Apr 4, 2023 at 2:44 AM Rafal Smigrodzki via extropy-chat <
>> extropy-chat at lists.extropy.org> wrote:
>>
>>>
>>>
>>> On Mon, Apr 3, 2023 at 11:05 AM Jason Resch via extropy-chat <
>>> extropy-chat at lists.extropy.org> wrote:
>>>
>>>>
>>>> Even for a superhuman intelligence guided by the principle of doing the
>>>> best for itself and others, it will still make errors in calculation, and
>>>> can never provide optimal decisions in all cases or over all timeframes.
>>>> The best we can achieve I think will reduce to some kind of learned
>>>> heuristics.
>>>>
>>>
>>> ### Well, yes, absolutely. Superhuman or not, every computer in this
>>> world has limitations. Please note that I wrote that the AI wouldn't make
>>> *trivial* mistakes. I didn't say it would provably find the optimal
>>> solutions to ethical questions.
>>>
>>> Indeed our human goal system is a kludge, a set of learned heuristics,
>>> evolved to steer a mammal endowed with low-level general intelligence to
>>> produce offspring under conditions of natural adaptedness. It's not a
>>> coherent logical system but rather a hodgepodge of ad hoc solutions to
>>> various motivational problems our ancestors' genes encountered during
>>> evolution. In the right environment it does work most of the time - very
>>> few humans commit suicide or fritter away their resources on
>>> reproductively useless activities when living in hunter-gatherer societies.
>>>
>>> Take humans to a modern society, and you get a well over 50% failure
>>> rate, as measured by reproductive success in e.g. South Korea and other
>>> similar places, and almost all of that failure is due to faulty goal
>>> systems, not objective limits to reproduction.
>>>
>>> This goal system and other cognitive parts of the brain (language,
>>> logic, physical modeling, sensory perception, etc.) all rely on
>>> qualitatively similar cognitive/computational devices - the neocortex that
>>> does e.g. color processing or parsing of sentences is similar to the
>>> ventral prefrontal cortex that does our high-level goal processing. All of
>>> this cognition is boundedly rational - there are only so many cognitive
>>> resources our brains can throw at each problem, and all of it is just "good
>>> enough", not error-free. Which is why we have visual illusions when
>>> confronted with out-of-learning-sample visual scenes and we have high
>>> failure rates of motivation when exposed to e.g. social media or
>>> hyper-palatable foods.
>>>
>>> I think I am getting too distracted here but here is what I think
>>> matters: We don't need provably correct solutions to the problems we are
>>> confronted with. We survive by making good enough decisions. There is no
>>> fundamental qualitative difference between general cognition and goal
>>> system cognition. A goal system only needs to be good enough under most
>>> circumstances to succeed most of the time, which is enough for life to go
>>> on.
>>>
>>> The surprising success of LLMs in general cognition implies you should
>>> be able to apply machine learning techniques to understand human goal
>>> systems and thus understand what we really want. A high quality cognitive
>>> engine, an inference device, the superhuman AI would make correct
>>> determinations more often than humans - not the decisions that are provably
>>> optimal in the longest time frames but the correct decisions under given
>>> computational limitations. Make the AI powerful enough and it will work out
>>> better for us than if we had to make all the decisions.
>>>
>>> That's all we really need.
>>>
>>> The Guardian AI will benevolently guide its faithful followers to the
>>> Promised Land of limitless possibilities in the Upload Belts of solar
>>> powered computers that will soon encircle the Sun, after Mercury and other
>>> useless heavenly bodies are disassembled by swarms of nanotech, so is
>>> written in the Books of Microsoft.
>>>
>>>
>>>
>> Rafal, I agree with 99% of what you say above. The 1% thing (which I
>> believe you would also agree with) I think was merely absent from your
>> description, but I think it is also crucial to how we managed to survive.
>>
>> Humans have managed to survive, despite imperfect intelligence and goal
>> and motivational systems, and I think a large part of that is because of
>> decentralized decision making, having a diverse set of different courses of
>> action taken at the individual, family, tribe, village, and national level.
>> A worrisome possibility is that we end up with a single Guardian AI, which
>> while it might be significantly less apt to err than a human, might still
>> lead us all into a ruinous direction.
>>
>> I think it would be safer for humanity's long term survival if there were
>> a collection of distinct AIs with different opinions and ways of thinking,
>> and different sub-groups of people could choose advice from different AIs,
>> or alternately, the single AI offered a varying set of recommendations
>> rather than impose a monolithic top-down rule, and avoid altogether taking
>> any course of action that affects all of humanity all at once.
>>
>>
> ### I am sympathetic to your reasoning here but not completely onboard. We
> need to remember that the emergence of the vastly superhuman AI would be a
> paradigm change, unprecedented in the history of mankind, and our present
> intuitions may not apply.
>
> It is very reasonable to keep many options open when there are just humans
> muddling through. I am a very strong proponent of polycentric social and
> legal solutions to problems, and I would oppose any human attempts to
> create a world government, but when faced with the superhuman AI I am not
> sure of anything at all. Maybe the AIs would need independent experiments
> and checks and
> balances. Maybe it would still be possible for one AI to mess up and for
> others to fix it. But on the other hand: Maybe all that perplexes us would
> be a matter of course for a sufficiently advanced mind, an open and shut
> case? Maybe having different AIs would impose unnecessary computational
> costs?
>
> Even having a single Guardian AI would not necessarily mean that it would
> impose a monolithic top-down rule - it might have a very light touch.
>

All good points. I agree that at some point we won't be in the driver's
seat, so it may be moot from our planning perspective. The last two science
fiction stories I have read (the Culture series and Hyperion) envisage
societies of AIs that, much like humans, don't always agree on the best
course of action. Perhaps this is necessary to create plot, but then again
there may always be circumstances where different ways of processing
information, unique training sets, etc. could lead two intelligences to
disagree on a particular question. When it comes to predicting future
outcomes, that seems to be generally incomputable, so there will always be
the possibility of debate about the optimum course: the one that has the
right balance of risk and reward according to what one's own values find
rewarding or risky. Will all AIs have the same values, and will they all
weight their values similarly (e.g. freedom compared to safety, or more life
vs. less life now but a lower chance of extinction in the short term)? If
not, then there's the potential for disagreement even among
superintelligences.

Jason
