[ExI] Is the GPT-3 statistical language model conscious?
Stuart LaForge
avant at sollegro.com
Sat Oct 10 20:44:56 UTC 2020
Quoting Dylan Distasio:
> This also might help if you want a lot more detail:
> https://jalammar.github.io/how-gpt3-works-visualizations-animations/
It is a pretty good explanation, although it is probably not helpful
that they depicted all the many layers of neurons as a black box
labelled "magic".
> That said, I don't believe GPT-3 is intelligent or conscious. It's another
> deep learning parlor trick, albeit an impressive one, with some real world
> applications.
As a data point, GPT-3 fits my theory that intelligence is a Universal
Learning Function, or ULF. The neurons are not using magic; they are
simply using the irreducible complexity of the composite of several
billion non-linear functions to map inputs to predicted outputs, and
then tuning those functions until the predicted outputs match the
observed outputs in the training set. For GPT-3, that training set is
apparently a significant fraction of all the written text on the
Internet. And the non-linearity is the key to the irreducibility: it
prevents, for example, the combined output of three individual neurons
from being reduced to the output of just one of those neurons
multiplied by three. Stack purely linear neurons as deep as you like
and the whole network collapses into a single linear function; one
non-linearity in between breaks that reduction.
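
To make that concrete, here is a toy sketch in Python/NumPy (my own
illustration, nothing to do with GPT-3's actual code): two linear
layers always reduce to a single matrix, but putting a non-linearity
such as ReLU between them defeats any single-matrix reduction.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))     # 100 sample inputs, 4 features each

W1 = rng.normal(size=(4, 8))      # weights of a first layer of neurons
W2 = rng.normal(size=(8, 3))      # weights of a second layer

# Purely linear layers: two layers reduce exactly to one matrix,
# so the extra depth buys nothing.
two_linear_layers = (x @ W1) @ W2
one_linear_layer = x @ (W1 @ W2)
print(np.allclose(two_linear_layers, one_linear_layer))   # True

# Insert a non-linearity (ReLU) between the layers and the reduction
# fails: no single matrix W reproduces relu(x @ W1) @ W2 for all inputs.
nonlinear = np.maximum(x @ W1, 0) @ W2
W_best, *_ = np.linalg.lstsq(x, nonlinear, rcond=None)  # best linear fit
print(np.allclose(nonlinear, x @ W_best))                # False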
The capacity to learn from experience and then extrapolate that
experience into the future is indistinguishable from intelligence and
therefore, in my theory, IS intelligence. And the way we assess
intelligence is by observing the speed, accuracy, efficiency, and
depth of the extrapolated output. So all else being equal, if two
individuals are thrust into an unfamiliar environment, then the one
who adapts faster to the new environment is likely to be the more
intelligent of the two.
> I happen to be very interested in text generation via AI, so I'm not
> disparaging it by any means, but like all this other stuff, it's
> dead behind the eyes.
Baby steps, Dylan. Baby steps. It can't be a mere coincidence that
large sparse networks of mathematical neurons can so convincingly
mimic large sparse networks of biological neurons, i.e. brains. From
where I am standing, the biggest difference seems to be that
biological neurons utilize time in a way that artificial neurons do
not. Deep learning networks have the appropriate space complexity;
they just don't utilize time complexity the way biological brains do.
An artificial neuron is a stateless function of its current inputs,
whereas a biological neuron integrates its inputs over time, so the
timing of signals carries information.
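
To illustrate the difference in a toy way (using the leaky
integrate-and-fire neuron, a standard textbook spiking model, with
constants I made up for the example): the artificial neuron below sees
only its current input, while the spiking neuron carries state between
time steps, so the same total input produces different outputs
depending on when it arrives.

import numpy as np

def artificial_neuron(x, w):
    """Stateless: the output depends only on the current input."""
    return np.tanh(np.dot(w, x))

def leaky_integrate_and_fire(inputs, leak=0.5, threshold=1.0):
    """Stateful: membrane potential integrates inputs over time and
    leaks between steps, so the timing of inputs matters."""
    v, spikes = 0.0, []
    for i in inputs:
        v = leak * v + i       # integrate new input, leak old charge
        if v >= threshold:     # fire a spike and reset
            spikes.append(1)
            v = 0.0
        else:
            spikes.append(0)
    return spikes

# Same total input, different timing, different spike trains:
print(leaky_integrate_and_fire([0.7, 0.7, 0.0, 0.0]))  # [0, 1, 0, 0]
print(leaky_integrate_and_fire([0.7, 0.0, 0.7, 0.0]))  # [0, 0, 0, 0]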
Also, instead of thinking of deep nets as complete brains, we should
think of them as brain structures. A deep convolutional network is not
an entire brain in and of itself but is instead like a visual cortex.
And GPT-3 is not a full brain, but it would make a fine language
center. In fact, if someone could somehow connect all the various
specialized deep learning "parlor tricks" together such that they
could share information with one another, and perhaps add some kind of
"ego module" with artificial instinctive drives, then we would have a
pretty good approximation of an AGI.
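
Purely as a hypothetical sketch of the plumbing (the module names, the
shared-workspace vector, and the drive weights are all invented for
illustration here, not any published architecture), the wiring might
look something like this in Python:

import numpy as np

DIM = 64  # width of a shared "workspace" vector all modules read/write
rng = np.random.default_rng(42)

class Module:
    """Stand-in for one specialized deep net (vision, language, etc.)."""
    def __init__(self, name):
        self.name = name
        self.proj = rng.normal(scale=DIM ** -0.5, size=(DIM, DIM))

    def read_write(self, workspace):
        # Placeholder for running a real network and projecting its
        # output back into the shared workspace.
        return np.tanh(workspace @ self.proj)

class EgoModule:
    """Hypothetical arbiter: blends module outputs by instinctive drives."""
    def __init__(self, drives):
        self.drives = drives  # e.g. {"curiosity": 0.7, "safety": 0.3}

    def integrate(self, outputs):
        # Crude drive-weighted blend of the modules' contributions.
        w = np.array(list(self.drives.values()))
        w = w / w.sum()
        return sum(wi * out for wi, out in zip(w, outputs))

modules = [Module("visual_cortex"), Module("language_center")]
ego = EgoModule({"curiosity": 0.7, "safety": 0.3})

workspace = np.zeros(DIM)
workspace[0] = 1.0                       # some initial stimulus
for _ in range(3):                       # a few "thought" cycles
    outputs = [m.read_write(workspace) for m in modules]
    workspace = ego.integrate(outputs)   # ego writes the shared state back
print(workspace[:5])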
Stuart LaForge