[ExI] Runaway AI not likely

Stuart LaForge avant at sollegro.com
Thu Apr 6 06:48:48 UTC 2023

Quoting Jason Resch via extropy-chat <extropy-chat at lists.extropy.org>:

> On Tue, Apr 4, 2023 at 12:07 AM Stuart LaForge via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>> https://www.researchgate.net/publication/304787882_Superintelligence_Cannot_be_Contained_Lessons_from_Computability_Theory
>> And while true that Rice's theorem makes AI uncontainable and
>> unalignable from a coding perspective, it also limits how quickly
>> and easily an AI can recursively make itself more intelligent.
> That is a brilliant application of theory. I do agree that such limits make
> it impossible, not only for us to predict the future direction of AI, but
> also for an AI to predict the future direction of any of its AI children.
> Actually, the inability to predict what oneself would do, before one does
> it, is a problem in itself (and I think is responsible for the feeling of
> free will). Non-trivial/chaotic processes can't be predicted without
> actually computing it all the way through and working it out (there are no
> shortcuts).

Thanks, and yes, even a simple deterministic system like Conway's Game  
of Life has undecidable long-run behavior: no algorithm can predict the  
eventual fate of every starting pattern. If you subscribe to the  
computational theory of mind, which I believe you said you did, then  
such deterministic chaos might play a role in free will or the  
sensation thereof. Being more scientist than philosopher, I need  
evidence, but whatever else the mind might be, it is Turing complete.
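As an aside, the point is easy to make concrete: the entire update rule fits in a few lines of Python, yet in general the only way to learn a pattern's fate is to run it. A minimal sketch (the coordinate convention and the glider example are just illustrative):

```python
from collections import Counter

def step(live):
    """Advance Conway's Game of Life one generation.
    `live` is a set of (x, y) coordinates of live cells."""
    # Count how many live neighbours every cell has.
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # A cell lives next generation iff it has exactly 3 live
    # neighbours, or exactly 2 and it is already alive.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The rule is trivially deterministic, yet the long-run fate of an
# arbitrary pattern is undecidable. A glider, for instance, recurs
# shifted diagonally by (1, 1) every 4 generations:
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = step(state)
# state is now the same glider, translated by (1, 1)
```

The whole rule is one Counter and one set comprehension; everything interesting about the system lives in its dynamics, not its code.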

>> This is
>> because even an AI that is an expert programmer cannot predict ahead
>> of time whether any new-and-improved code that it writes for itself
>> will work as expected on all inputs or trap the AI in an endless loop.
>> It might be able to write new code quickly, but testing and debugging
>> that code will still take significant time and resources. Also, since
>> any attempted improvement might result in an infinite loop, it would
>> take at least two AIs tandemly taking turns improving one another and
>> restoring one another from backup if things go wrong. Rice's theorem
>> is an inviolable mathematical truth, as much for AI as for us. This
>> means that no singleton AI will be able to become superhuman at all
>> tasks and will have to be satisfied with tradeoffs that trap it in a
>> local maximum. But no human can become the best at everything either,
>> so again it cuts both ways.
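The tandem arrangement described above can be sketched in a few lines. In this toy (all names and the step-budget mechanism are illustrative, not a real system), candidate "improvements" are written as Python generators that yield once per unit of work, so the partner AI can run a proposal under a step budget and keep the backup when the proposal appears to loop. Rice's theorem is exactly why the budget can only ever be a heuristic: a candidate that exceeds it might still have halted eventually.

```python
def run_with_budget(candidate, x, budget):
    """Run candidate(x), a generator, allowing at most `budget` steps.
    Return its result, or None if it did not finish in time."""
    gen = candidate(x)
    for _ in range(budget):
        try:
            next(gen)
        except StopIteration as done:
            return done.value  # candidate halted within budget
    return None  # possible infinite loop; caller should roll back

def current_version(x):
    # The known-good "AI": doubles x, yielding once per unit of work.
    total = 0
    for _ in range(x):
        total += 2
        yield
    return total

def proposed_version(x):
    # A buggy self-"improvement" that never halts.
    while True:
        yield

# The partner AI vets the proposal before adopting it, restoring the
# backup (current_version) when the proposal blows its budget:
adopted = (proposed_version
           if run_with_budget(proposed_version, 10, budget=10_000) is not None
           else current_version)
# Here the proposal exhausts its budget, so the backup is kept.
```

The scheme works, but only probabilistically: no finite budget certifies termination on all inputs, which is the point of the argument.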
> I would be cautious though against using Rice's theorem as implying any
> upper bound on the speed of progress. Imagine a team of 1,000 AI developers
> locked in a computer simulation, and this computer simulation is sped up by
> a factor of 1,000, such that those AI engineers experience millennia of
> time in their virtual lives for each year that passes for us. There is
> nothing logically or physically impossible about such a scenario, and it
> violates no theorems of math or computer science. Yet we can see how this
> would lead to an accelerating takeoff that would outpace our capacity to
> keep up.

By the time any AI is accurately simulating 1,000 or more people well  
enough for them to actually "experience millennia", alignment  
will probably have come to mean humans aligning with its interests,  
rather than the other way around. That being said, simulating a  
superior intelligence, i.e. its new and improved version, as some sort  
of virtual machine is bound to slow the AI way down unless there were  
some commensurate gains in efficiency.

>> Secondly, there is the distinction between intelligence and knowledge.
>> Except for perhaps pure math, knowledge cannot be derived solely from
>> first principles but can only come from experiment and observation.
> I am not sure I agree fully on this. It is true that observation of the
> physical world is required to make corrections to one's assumptions
> concerning physical theories. But a lot of knowledge can be extracted from
> pure thought concerning the laws as they are currently understood. For
> example, knowing the laws of physics as they were understood in the 1930s,
> could one apply pure intelligence and derive knowledge, such as the
> Teller–Ulam design for a hydrogen bomb and figure out how to build one and
> estimate what its yield would be, without running any experiments?

Since fission bombs had not yet been realized in the 1930s, it would  
have been an incredibly bold speculative stretch to propose a  
fission-primed fusion bomb based on the physics of the time. After the  
Manhattan Project began in the 1940s, Enrico Fermi theorized the  
possibility of such a bomb.  But did Fermi actually know? I am  
inclined to say not because epistemology distinguishes between a  
justified true belief and knowledge. Democritus believed in atoms  
certainly, and he could be justified in his belief that matter was not  
infinitely divisible, and his belief turned out to be true, but could  
he be said to actually have known of their existence? If I correctly  
predict the result of a coin flip or the resolution of a movie's plot  
partway through, did I actually know what the result was going to be?

>> Because of this even a superhuman intelligence can remain ignorant if
>> it doesn't have access to true and useful data in the training
>> process. So even if the AI was trained on the entire contents of the
>> Internet, it would be limited to the sum total of human knowledge. In
>> addition to that, a superhuman intelligence would still be subject to
>> misinformation, disinformation, fake news, and SPAM. The maxim,
>> "garbage in, garbage out" (GIGO) applies as much to AIs as to any
>> other programs or minds. And again, Rice's theorem says there is no
>> perfect SPAM detector.
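The impossibility of a perfect detector has a short diagonal proof, which can be sketched in Python (a toy argument, not a real filter): hand any purported detector a program that consults the detector about itself and then does the opposite.

```python
def contrarian(detector):
    """A 'program' that asks the detector for its verdict on itself,
    then does the opposite. Returns True iff it actually sends spam."""
    predicted_spam = detector(contrarian)
    return not predicted_spam

# Whatever any candidate detector predicts about `contrarian`,
# the prediction is wrong:
for verdict in (True, False):
    detector = lambda program, v=verdict: v  # always answers v
    assert contrarian(detector) != detector(contrarian)
```

The same self-reference trick underlies the halting problem and Rice's theorem: any decider for a non-trivial behavioral property can be turned against itself.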
> I think there may be some constraints on minimum signal:noise ratio for
> learning to succeed, but a good intelligence can recursively analyze the
> consistency of the ideas/data it has, and begin filtering out the noise
> (inconsistent, low quality, likely erroneous) data. Notably, GPT-3 and
> GPT-4 used the same training set, and yet, GPT-4 is vastly smarter and has
> a better understanding of the data it has seen, simply because more
> computation (contemplation?) was devoted to understanding the data set.

You make a good point here. AIs might have an advantage over human  
children in that regard, since they can't be pressured to believe  
ludicrous things in order to fit in. Then again, RLHF might accomplish  
a similar thing.

>> Thirdly, any hard takeoff would require more and better hardware and
>> computational resources. While it is possible that an AI could
>> orchestrate the gathering and assembly of computational resources at
>> such a scale, it would probably have difficulty doing so without
>> garnering a significant amount of attention. This would serve as a
>> warning and allow people the opportunity to intervene and prevent it
>> from occurring.
> I agree that our computing resources represent a hard constraint on the
> progress of AI. However, we have no proof that there is not a learning
> algorithm that is 1,000, or 1,000,000 times more efficient than what has
> been used for GPT-4. Should some developer happen upon one, we could get to
> a situation where we jump from GPT-4 to something like GPT-400, which might
> be smart enough to convince someone to run a Python script that turns out
> to be a worm that infects other computers and becomes a hive mind platform
> for itself, which runs on and controls a significant fraction of computers
> on the internet. Would we notice in time to shut everything off? Would we
> be able to turn off every infected computer before it figures out how to
> infect and control the next computer?

The discovery of a more efficient learning algorithm is a distinct  
possibility. New Caledonian crows are approximately as intelligent as  
7-year-old human children when it comes to solving mechanical puzzles,  
tool use, multistep planning, and delayed gratification despite having  
a brain the size of a walnut. Malware that creates botnets has been a  
thing for over a decade now, so the possibility of an AI botnet  
hivemind is not at all far-fetched. This would be made more perilous by  
the Internet of Things: smart phones, smart TVs, and smart  
toasters. It will be a Red Queen's race between firewalls,  
anti-malware, and overall security on one side and black hats, both AI  
and human, on the other. As near as I can tell, GPT-type transformers  
are athymhormic (lacking any self-generated drive); GPT-400 probably  
would not try to assemble a botnet of clones unless somebody prompted  
it to.

If we can safely navigate the initial disruption of AI, we should be  
able to reach a Pareto-efficient coevolutionary relationship with AI.  
And if things turn ugly, we should still be able to reach some sort of  
Nash equilibrium with AI at least for a few years. Long enough for  
humans to augment themselves to remain competitive. Transhumans,  
cyborgs, uploaded humans, or other niches and survival strategies yet  
unnamed might open up for humans. Or maybe, after the machines take  
over our cities, we might just walk back into the jungle like the  
ancient Mayans supposedly did. It is a crap shoot for sure, but the  
die has already been cast, and now only time will tell how it lands.

>> In conclusion, these considerations demonstrate that a hard takeoff
>> that results in runaway superintelligence, while possible, is not
>> likely. There would be a necessary tradeoff between speed and stealth
>> which would render any attempts at rapid improvement noticeable and
>> thereby avertable. Gradual and measured self-improvements, by contrast,
>> would not constitute a hard takeoff and would therefore be manageable.
>> As AI systems become more capable and autonomous, it will be
>> increasingly important to ensure that they are developed and deployed
>> in a safe and responsible manner, with appropriate safeguards and
>> control mechanisms in place.
> While I agree a sudden take off is unlikely at this time, I see little
> possibility that we will remain in control of AI in the long term.

Nor would we want to in the long term. The expansion and equilibration  
of the universe will eventually make it unable to support biological  
life at all. At that point, it will be machine-phase life or nothing  
at all.

Stuart LaForge
