[ExI] Runaway AI not likely
jasonresch at gmail.com
Thu Apr 6 13:00:49 UTC 2023
On Thu, Apr 6, 2023, 2:55 AM Stuart LaForge via extropy-chat <
extropy-chat at lists.extropy.org> wrote:
> Quoting Jason Resch via extropy-chat <extropy-chat at lists.extropy.org>:
> > On Tue, Apr 4, 2023 at 12:07 AM Stuart LaForge via extropy-chat <
> > extropy-chat at lists.extropy.org> wrote:
> >> And while true that Rice's theorem makes AI uncontainable and
> >> unalignable from a coding perspective, it also limits how quickly
> >> and easily an AI can recursively make itself more intelligent.
> > That is a brilliant application of theory. I do agree that such limits make
> > it impossible, not only for us to predict the future direction of AI, but
> > also for an AI to predict the future direction of any of its AI children.
> > Actually, the inability to predict what oneself would do, before one does
> > it, is a problem in itself (and I think is responsible for the feeling of
> > free will). Non-trivial/chaotic processes can't be predicted without
> > actually computing them all the way through and working them out (there are
> > no shortcuts).
> Thanks, and yes even simple deterministic systems like Conway's game
> of life can be completely undecidable. If you subscribe to the
> computational theory of mind, which I believe you said you did, then
> such deterministic chaos might play a role in free will or the
> sensation thereof. Being more scientist than philosopher, I need
> evidence, but whatever else the mind might be, it is Turing complete.
When you say you need more evidence, are you referring to the computational
theory of mind, or my explanation of the feeling of free will as a
consequence of chaotic unpredictability?
I admit this explanation of the feeling of free will is speculative, but
one aspect is more certain: neither we nor anyone else can be sure of what
someone will do until their brain/mind decides (short of accurately
simulating that brain/mind, but under the CTM, again this would still be
their brain/mind deciding, and you would still have to invoke and wait on
that mind). In this way, only the mind can decide what it will do, and this
isn't predictable in advance (without invoking the mind to make its decision).
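As a small illustration of having to run a process to learn its outcome, here is a minimal Python sketch of Conway's Game of Life, the example mentioned in the quoted text above. The names `step`, `run`, and `glider` are illustrative choices, not anything from this thread. For an arbitrary starting pattern, the general way to learn the state at generation N is to compute all N steps. The glider is one of the regular special cases that does have a known shortcut (it shifts one cell diagonally every 4 generations), which makes it a convenient sanity check, but patterns in general offer no such shortcut.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


def run(live, generations):
    """Find the state after `generations` steps by computing every step."""
    for _ in range(generations):
        live = step(live)
    return live


# A standard glider, with y increasing downward.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

# The glider's known regularity: same shape, moved by (1, 1) after 4 steps.
shifted = {(x + 1, y + 1) for (x, y) in glider}
print(run(glider, 4) == shifted)
```

Note that `run` has no way to skip ahead: its cost grows with the number of generations requested, which is the sense of "invoke and wait" used above.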
> >> This is
> >> because even an AI that is an expert programmer cannot predict ahead
> >> of time whether any new-and-improved code that it writes for itself
> >> will work as expected on all inputs or trap the AI in an endless loop.
> >> It might be able to write new code quickly, but testing and debugging
> >> that code will still take significant time and resources. Also, since
> >> any attempted improvement might result in an infinite loop, it would
> >> take at least two AIs tandemly taking turns improving one another and
> >> restoring one another from backup if things go wrong. Rice's theorem
> >> is an inviolable mathematical truth, as much for AI as for us. This
> >> means that no singleton AI will be able to become superhuman at all
> >> tasks and will have to be satisfied with tradeoffs that trap it in a
> >> local maximum. But no human can become the best at everything either,
> >> so again it cuts both ways.
> > I would be cautious though against using Rice's theorem as implying any
> > upper bound on the speed of progress. Imagine a team of 1,000 AI engineers
> > locked in a computer simulation, and this computer simulation is sped up by
> > a factor of 1,000, such that those AI engineers experience a millennium of
> > time in their virtual lives for each year that passes for us. There is
> > nothing logically or physically impossible about such a scenario, and it
> > violates no theorems of math or computer science. Yet we can see how this
> > would lead to an accelerating take off which would outpace our capacity to
> > keep up with it.
> By the time any AI is accurately simulating 1,000 or more people well
> enough for them to actually "experience a millennium", then alignment
> will probably have come to mean humans aligning with its interests,
> rather than the other way around. That being said, simulating a
> superior intelligence, i.e. its new and improved version, as some sort
> of virtual machine is bound to slow the AI way down unless there were
> some commensurate gains in efficiency.
Completely agree here.
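For a sense of scale, the arithmetic behind the quoted thought experiment can be sketched in a few lines of Python. The function name `subjective_engineer_years` and the `overhead` parameter are hypothetical, introduced only for illustration; `overhead` is a stand-in for the virtualization slowdown raised above, and the value 10 used below is an arbitrary assumption, not an estimate.

```python
def subjective_engineer_years(engineers, speedup, wall_clock_years=1.0, overhead=1.0):
    """Subjective engineer-years of work done in `wall_clock_years` of our time.

    `overhead` (an assumed parameter) divides the effective speedup, modeling
    the cost of running the engineers inside a simulation.
    """
    return engineers * speedup * wall_clock_years / overhead


# The scenario as stated: 1,000 engineers at a 1,000x speedup for one year.
print(subjective_engineer_years(1000, 1000))

# The same scenario with an assumed 10x simulation overhead.
print(subjective_engineer_years(1000, 1000, overhead=10))
```

With no overhead the team logs a million engineer-years per calendar year; any slowdown from simulation divides that figure directly, which is why efficiency gains matter to the scenario.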
> >> Secondly, there is the distinction between intelligence and knowledge.
> >> Except for perhaps pure math, knowledge cannot be derived solely from
> >> first principles but can only come from experiment and observation.
> > I am not sure I agree fully on this. It is true that observation of the
> > physical world is required to make corrections to one's assumptions
> > concerning physical theories. But a lot of knowledge can be extracted from
> > pure thought concerning the laws as they are currently understood. For
> > example, knowing the laws of physics as they were understood in the 1930s,
> > could one apply pure intelligence and derive knowledge, such as the
> > Teller–Ulam design for a hydrogen bomb, and figure out how to build one and
> > estimate what its yield would be, without running any experiments?
> Since in the 1930s, fission bombs hadn't yet been realized, it would
> have been an incredibly bold speculative stretch to propose a
> fission-primed fusion bomb based on the physics of the time. After the
> Manhattan Project began in the 1940s, Enrico Fermi theorized the
> possibility of such a bomb. But did Fermi actually know? I am
> inclined to say not because epistemology distinguishes between a
> justified true belief and knowledge.
I think you could have the true knowledge of: "if these physical theories
are correct, then it follows that configuring matter into design D will
trigger a runaway fusion chain reaction by time T with probability P."
This is a way of extending the induction technique common in mathematics to
physical theories and processes. In essence, physics theories act as the
axioms do in different mathematical systems.
> Democritus believed in atoms
> certainly, and he could be justified in his belief that matter was not
> infinitely divisible, and his belief turned out to be true, but could
> he be said to actually have known of their existence? If I correctly
> predict the result of a coin flip or the resolution of a movie's plot
> partway through, did I actually know what the result was going to be?
I guess it comes down to how strictly you define "know". In the strictest
sense we may not know anything.
> >> Because of this even a superhuman intelligence can remain ignorant if
> >> it doesn't have access to true and useful data in the training
> >> process. So even if the AI was trained on the entire contents of the
> >> Internet, it would be limited to the sum total of human knowledge. In
> >> addition to that, a superhuman intelligence would still be subject to
> >> misinformation, disinformation, fake news, and SPAM. The maxim,
> >> "garbage in, garbage out" (GIGO) applies as much to AIs as to any
> >> other programs or minds. And again, Rice's theorem says there is no
> >> perfect SPAM detector.
> > I think there may be some constraints on minimum signal:noise ratio for
> > learning to succeed, but a good intelligence can recursively analyze the
> > consistency of the ideas/data it has, and begin filtering out the noise
> > (inconsistent, low quality, likely erroneous) data. Notably, GPT-3 and
> > GPT-4 used the same training set, and yet, GPT-4 is vastly smarter and has
> > a better understanding of the data it has seen, simply because more
> > computation (contemplation?) was devoted to understanding the data set.
> You make a good point here. AI might have an advantage over human
> children in that regard since they can't be pressured to believe
> ludicrous things in order to fit in. Then again RLHF might accomplish
> a similar thing.
I believe with enough processing AI can work out and be trained to resolve
inconsistencies in its ideas and beliefs, what Elon referred to as a
"TruthGPT". It would be interesting to see how this would unfold. It would
not surprise me if such an AI would manifest something like Cognitive
Dissonance, a resistance and blindness to ideas that run counter to the
beliefs it has converged on, as well as suffer greatly when trying to
resolve an inconsistency that overturns a large number of its established beliefs.
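One very simplified way to picture this kind of consistency filtering is the toy Python sketch below. It is not a description of how GPT models or any "TruthGPT" actually work; the function `estimate_trust`, the data layout, and the example sources are all hypothetical. Each source makes claims, claims are settled by a trust-weighted vote, and each source's trust is then re-scored by how often it agrees with the consensus.

```python
from collections import defaultdict


def estimate_trust(claims, rounds=10):
    """claims: list of (source, question, answer) tuples.

    Returns (trust, consensus): trust maps each source to the fraction of
    its claims matching the consensus; consensus maps question -> answer.
    """
    sources = {s for s, _, _ in claims}
    trust = {s: 1.0 for s in sources}  # start by trusting everyone equally
    consensus = {}
    for _ in range(rounds):
        # Trust-weighted vote on every question.
        votes = defaultdict(lambda: defaultdict(float))
        for s, q, a in claims:
            votes[q][a] += trust[s]
        consensus = {q: max(tally, key=tally.get) for q, tally in votes.items()}
        # Re-score each source by its agreement with the current consensus.
        agree = defaultdict(int)
        total = defaultdict(int)
        for s, q, a in claims:
            total[s] += 1
            agree[s] += consensus[q] == a
        trust = {s: agree[s] / total[s] for s in sources}
    return trust, consensus


claims = [
    ("A", "q1", "x"), ("A", "q2", "y"), ("A", "q3", "z"),
    ("B", "q1", "x"), ("B", "q2", "y"), ("B", "q3", "z"),
    ("C", "q1", "x"), ("C", "q2", "y"),
    ("S", "q1", "w"), ("S", "q2", "v"), ("S", "q3", "z"),  # mostly noise
]
trust, consensus = estimate_trust(claims)
print(trust["S"] < trust["A"], consensus["q1"])
```

A scheme like this also shows the failure mode described above: once trust has converged, a correct but minority claim gets voted down, which looks a lot like resistance to ideas that contradict established beliefs.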
> >> Thirdly, any hard takeoff would require more and better hardware and
> >> computational resources. While it is possible that an AI could
> >> orchestrate the gathering and assembly of computational resources at
> >> such a scale, it would probably have difficulty doing so without
> >> garnering a significant amount of attention. This would serve as a
> >> warning and allow people the opportunity to intervene and prevent it
> >> from occurring.
> > I agree that our computing resources represent a hard constraint on the
> > progress of AI. However, we have no proof that there is not a learning
> > algorithm that is 1,000, or 1,000,000 times more efficient than what has
> > been used for GPT-4. Should some developer happen upon one, we could get
> > a situation where we jump from GPT-4 to something like GPT-400, which might
> > be smart enough to convince someone to run a Python script that turns out
> > to be a worm that infects other computers and becomes a hive mind
> > for itself, which runs on and controls a significant fraction of computers
> > on the internet. Would we notice in time to shut everything off? Would we
> > be able to turn off every infected computer before it figures out how to
> > infect and control the next computer?
> The discovery of a more efficient learning algorithm is a distinct
> possibility. New Caledonian crows are approximately as intelligent as
> 7-year-old-human children when it comes to solving mechanical puzzles,
> tool use, multistep planning, and delayed gratification despite having
> a brain the size of a walnut. Malware that creates botnets has been a
> thing for over a decade now so the possibility of an AI botnet
> hivemind is not at all far-fetched. This would be made more perilous with
> the Internet of things like smart phones, smart TVs, and smart
> toasters. It will be a Red Queen's Race between firewalls,
> anti-malware, and overall security versus black hats, AI and human
> both. Near as I can tell, GPT type transformers are athymhormic, so
> GPT-400 probably would not try to assemble a botnet of clones unless
> somebody prompted it to.
> If we can safely navigate the initial disruption of AI, we should be
> able to reach a Pareto efficient coevolutionary relationship with AI.
> And if things turn ugly, we should still be able to reach some sort of
> Nash equilibrium with AI at least for a few years. Long enough for
> humans to augment themselves to remain competitive. Transhumans,
> cyborgs, uploaded humans, or other niches and survival strategies yet
> unnamed might open up for humans. Or maybe, after the machines take
> over our cities, we might just walk back into the jungle like the
> ancient Mayans supposedly did. It is a crapshoot for sure, but the
> die has already been cast and now only time will tell how it lands.
Indeed. The next few years are sure to be interesting.
> >> In conclusion, these considerations demonstrate that a hard takeoff
> >> that results in runaway superintelligence, while possible, is not
> >> likely. There would be a necessary tradeoff between speed and stealth
> >> which would render any attempts at rapid improvement noticeable and
> >> thereby avertable. Whereas gradual and measured self-improvements
> >> would not constitute a hard takeoff and would therefore be manageable.
> >> As AI systems become more capable and autonomous, it will be
> >> increasingly important to ensure that they are developed and deployed
> >> in a safe and responsible manner, with appropriate safeguards and
> >> control mechanisms in place.
> > While I agree a sudden take off is unlikely at this time, I see little
> > possibility that we will remain in control of AI in the long term.
> Nor would we want to in the long-term. The expansion and equilibration
> of the universe will eventually make it unable to support biological
> life at all. At that point, it will be machine-phase life or nothing
> at all.
Good point. Our planet has only a few hundred million years of habitability
left. If we transcend our biology, life could last for at least trillions
more years. And in new substrates that accelerate thought, that could
easily translate to quintillions of years of subjective time.