[ExI] Runaway AI not likely
Stuart LaForge
avant at sollegro.com
Tue Apr 4 05:06:43 UTC 2023
One of Yudkowsky's direst warnings is that we have to get AI alignment
perfectly right from the start because we won't get a second chance. It
is based on a prediction he calls "Hard Takeoff" or "AI go FOOM," which
refers to exponentially increasing recursive self-improvement of AI in
such a way that humans cannot mitigate it. However, I think that with
sufficient vigilance and caution, this scenario can be rendered
unlikely, for several reasons.
Firstly, Rice's theorem and Turing's halting problem cast exponential
recursive self-improvement into doubt. Rice's theorem is a fundamental
theorem in computer science which states that any non-trivial semantic
property of a Turing machine's language is undecidable.
In simpler terms, it means that it is impossible to determine whether
a Turing machine (or an AI) has a specific property just by inspecting
its code. Instead, it requires examining its output on a wide range of
inputs. This is something that has worried Nick Bostrom, Eliezer
Yudkowsky, and other experts like Alfonseca et al.
https://www.researchgate.net/publication/304787882_Superintelligence_Cannot_be_Contained_Lessons_from_Computability_Theory
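To see why this bites, here is a minimal sketch (my own illustration
in Python, not taken from the paper above) of the standard
diagonalization argument behind the halting problem and Rice's
theorem. Suppose someone claimed to have a perfect halts() oracle
that decides the simplest interesting behavioral property,
termination, purely from source code:

# Hypothetical oracle: returns True if running the program whose
# source is `prog` on input `data` would eventually halt. No correct
# implementation of this function can exist.
def halts(prog: str, data: str) -> bool:
    raise NotImplementedError

# A program built to do the opposite of whatever the oracle predicts
# about it when it is fed its own source code.
def contrarian(src: str) -> None:
    if halts(src, src):   # oracle says "this will halt" ...
        while True:       # ... so loop forever instead;
            pass
    # ... oracle says "this will loop", so halt immediately.

# Running contrarian on its own source forces halts() to be wrong
# either way, so no such oracle exists. Rice's theorem generalizes
# this to every non-trivial behavioral property, such as "is this
# rewrite of my own code actually an improvement, or does it break
# on some input?"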
And while it is true that Rice's theorem makes AI uncontainable and
unalignable from a coding perspective, it also limits how quickly and
easily an AI can recursively make itself more intelligent. This is
because even an AI that is an expert programmer cannot predict ahead
of time whether any new-and-improved code that it writes for itself
will work as expected on all inputs or trap the AI in an endless loop.
It might be able to write new code quickly, but testing and debugging
that code will still take significant time and resources. Also, since
any attempted improvement might result in an infinite loop, it would
take at least two AIs working in tandem, taking turns improving one
another and restoring one another from backup if things go wrong; a
sketch of such a test-and-rollback loop follows below. Rice's theorem
is an inviolable mathematical truth, as much for AI as for us. This
means that no singleton AI will be able to become superhuman at all
tasks and will have to be satisfied with tradeoffs that trap it in a
local maximum. But no human can become the best at everything either,
so again it cuts both ways.
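To make concrete why that testing loop eats real time, here is a
minimal sketch in Python of a test-and-rollback improvement cycle.
The file names, the timeout value, and the --self-test flag are
hypothetical placeholders rather than any real system's interface;
the point is only that each candidate must actually be run, under a
timeout because of the halting problem, and rolled back from backup
when it fails:

import shutil
import subprocess

TIMEOUT_SECONDS = 600             # arbitrary testing budget per candidate
CURRENT = "agent.py"              # hypothetical current version
CANDIDATE = "agent_candidate.py"  # the proposed "improved" version
BACKUP = "agent_backup.py"        # known-good copy to fall back on

def vet_candidate() -> bool:
    """Run the candidate's tests; False on failure or non-termination."""
    try:
        result = subprocess.run(
            ["python", CANDIDATE, "--self-test"],
            timeout=TIMEOUT_SECONDS,  # can't know in advance whether
        )                             # it even terminates (Rice/halting)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False                  # possible infinite loop; reject

def improvement_step() -> None:
    shutil.copy(CURRENT, BACKUP)      # the "restore from backup" role
    # ... the agent would write CANDIDATE, its new-and-improved code ...
    if vet_candidate():
        shutil.copy(CANDIDATE, CURRENT)   # adopt the improvement
    else:
        shutil.copy(BACKUP, CURRENT)      # roll back to known-good code

Each pass through this loop costs, in the worst case, a full timeout
plus the cost of running the test suite, which is the bottleneck on
the speed of recursive self-improvement described above.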
Secondly, there is the distinction between intelligence and knowledge.
Except perhaps for pure math, knowledge cannot be derived solely from
first principles but can only come from experiment and observation.
Because of this even a superhuman intelligence can remain ignorant if
it doesn't have access to true and useful data during the training
process. So even if the AI were trained on the entire contents of the
Internet, it would be limited to the sum total of human knowledge. In
addition to that, a superhuman intelligence would still be subject to
misinformation, disinformation, fake news, and SPAM. The maxim,
"garbage in, garbage out" (GIGO) applies as much to AIs as to any
other programs or minds. And again, Rice's theorem says there is no
perfect SPAM detector.
Thirdly, any hard takeoff would require more and better hardware and
computational resources. While it is possible that an AI could
orchestrate the gathering and assembly of computational resources at
such a scale, it would probably have difficulty doing so without
garnering a significant amount of attention. This would serve as a
warning and give people the opportunity to intervene and prevent it
from occurring.
In conclusion, these considerations suggest that a hard takeoff
resulting in runaway superintelligence, while possible, is not likely.
There would be a necessary tradeoff between speed and stealth which
would render any attempt at rapid improvement noticeable and thereby
avertable, whereas gradual and measured self-improvement would not
constitute a hard takeoff and would therefore be manageable.
As AI systems become more capable and autonomous, it will be
increasingly important to ensure that they are developed and deployed
in a safe and responsible manner, with appropriate safeguards and
control mechanisms in place.