[ExI] Runaway AI not likely

Stuart LaForge avant at sollegro.com
Tue Apr 4 05:06:43 UTC 2023

One of Yudkowsky's direst warnings is that we have to get AI alignment
perfectly right from the start, because we won't get a second chance.
It is based on a prediction he calls "Hard Takeoff" or "AI go FOOM,"
which refers to exponentially increasing recursive self-improvement of
AI in such a way that humans cannot mitigate it. However, I think that
with sufficient vigilance and caution, this scenario can be rendered
unlikely, for several reasons.

Firstly, Rice's theorem and Turing's halting problem cast exponential
recursive self-improvement into doubt. Rice's theorem is a fundamental
theorem in computer science which states that any non-trivial semantic
property of a Turing machine's language is undecidable.

In simpler terms, it means that it is impossible to determine if a  
Turing machine (or an AI) has a specific property just by looking at  
its code. Instead, it requires examining its output on a wide range of  
inputs. This is a point that has worried Nick Bostrom, Eliezer
Yudkowsky, and other researchers such as Alfonseca et al.
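To make the undecidability point concrete, here is a toy illustration
in Python (my own example, not from the post): no amount of inspecting
this short function tells you whether it ever returns True on an even
input, because that question is equivalent to the open Goldbach
conjecture. Rice's theorem generalizes the difficulty: every
non-trivial question about what a program computes is undecidable in
general, so behavior must be probed by running the code on particular
inputs.

```python
# Toy illustration (my own example): a few lines of code whose
# behavior over ALL inputs cannot be determined by reading the code.

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def goldbach_counterexample(n):
    """True iff the even number n >= 4 is NOT a sum of two primes.
    Whether this EVER returns True is the open Goldbach conjecture,
    so no code inspection can settle the question."""
    return not any(is_prime(p) and is_prime(n - p) for p in range(2, n))

# All we can do is examine its output on particular inputs:
print(goldbach_counterexample(4))    # False: 4 = 2 + 2
print(goldbach_counterexample(100))  # False: 100 = 3 + 97
```
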


And while it is true that Rice's theorem makes AI uncontainable and
unalignable from a coding perspective, it also limits how quickly and
easily an AI can recursively make itself more intelligent. This is
because even an AI that is an expert programmer cannot predict ahead  
of time whether any new-and-improved code that it writes for itself  
will work as expected on all inputs or trap the AI in an endless loop.  
It might be able to write new code quickly, but testing and debugging  
that code will still take significant time and resources. Also, since  
any attempted improvement might result in an infinite loop, it would  
take at least two AIs tandemly taking turns improving one another and  
restoring one another from backup if things go wrong. Rice's theorem  
is an inviolable mathematical truth, as much for AI as for us. This  
means that no singleton AI will be able to become superhuman at all  
tasks and will have to be satisfied with tradeoffs that trap it in a  
local maximum. But no human can become the best at everything either,  
so again it cuts both ways.
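The test-and-rollback loop described above can be sketched in a few
lines of Python (all names here are my own illustration, not any real
system): each candidate self-modification runs under a step budget
standing in for a wall-clock timeout, and the previous version is kept
as the backup whenever a candidate hangs or fails its tests.

```python
# Minimal sketch (illustrative only) of improvement-by-testing with
# rollback: a step budget stands in for a timeout, since (per the
# halting problem) no inspection can rule out an endless loop.

class BudgetExceeded(Exception):
    """Raised when a candidate uses up its step budget."""

def run_limited(candidate, x, max_steps=10_000):
    """Run candidate(x, tick); the candidate must call tick() in
    every loop iteration, so runaway loops hit the budget."""
    steps = 0
    def tick():
        nonlocal steps
        steps += 1
        if steps > max_steps:
            raise BudgetExceeded
    return candidate(x, tick)

def improve(current, candidates, tests):
    """Accept a candidate only if it passes every test within the
    budget; otherwise keep the current version (the 'backup')."""
    for cand in candidates:
        try:
            if all(run_limited(cand, x) == y for x, y in tests):
                current = cand          # improvement accepted
        except BudgetExceeded:
            pass                        # hung: restore from backup
    return current

# Usage: one candidate loops forever, one works.
def looper(x, tick):
    while True:
        tick()                          # never returns within budget

def doubler(x, tick):
    tick()
    return 2 * x

best = improve(doubler, [looper], tests=[(1, 2), (3, 6)])
print(best is doubler)                  # True: the hung candidate was rejected
```

The point of the sketch is the structural one made above: because the
looping candidate cannot be identified by inspection, the supervising
process needs both a kill switch (the budget) and a backup to restore,
which costs real time and resources on every attempted improvement.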

Secondly, there is the distinction between intelligence and knowledge.  
Except for perhaps pure math, knowledge cannot be derived solely from  
first principles but can only come from experiment and observation.  
Because of this, even a superhuman intelligence can remain ignorant if  
it doesn't have access to true and useful data in the training  
process. So even if the AI was trained on the entire contents of the  
Internet, it would be limited to the sum total of human knowledge. In  
addition to that, a superhuman intelligence would still be subject to  
misinformation, disinformation, fake news, and spam. The maxim
"garbage in, garbage out" (GIGO) applies as much to AIs as to any
other programs or minds. And again, Rice's theorem implies that there
is no perfect spam detector.

Thirdly, any hard takeoff would require more and better hardware and  
computational resources. While it is possible that an AI could  
orchestrate the gathering and assembly of computational resources at  
such a scale, it would probably have difficulty doing so without  
garnering a significant amount of attention. This would serve as a  
warning and allow people the opportunity to intervene and prevent it  
from occurring.

In conclusion, these considerations suggest that a hard takeoff that
results in runaway superintelligence, while possible, is not
likely. There would be a necessary tradeoff between speed and stealth  
which would render any attempts at rapid improvement noticeable and  
thereby avertable. Gradual and measured self-improvement, by
contrast, would not constitute a hard takeoff and would therefore be
manageable.
As AI systems become more capable and autonomous, it will be  
increasingly important to ensure that they are developed and deployed  
in a safe and responsible manner, with appropriate safeguards and  
control mechanisms in place.
