[ExI] Is the GPT-3 statistical language model conscious?
Stuart LaForge
avant at sollegro.com
Sat Oct 10 20:44:56 UTC 2020
Quoting Dylan Distasio:
> This also might help if you want a lot more detail:
> https://jalammar.github.io/how-gpt3-works-visualizations-animations/
It is a pretty good explanation, although it is probably not helpful
that they depicted all the many layers of neurons as a black box
labelled "magic".
> That said, I don't believe GPT-3 is intelligent or conscious. It's another
> deep learning parlor trick, albeit an impressive one, with some real world
> applications.
As a data point, GPT-3 fits my theory that intelligence is a Universal
Learning Function, or ULF. The neurons are not using magic; they are
simply using the irreducible complexity of the composite of several
billion non-linear functions to map inputs to predicted outputs, and
then tuning those functions until the predicted outputs match the
observed outputs in the training set. For GPT-3, that training set is
apparently a significant fraction of all the written text on the
Internet. And the non-linearity is the key to the irreducibility: it
prevents, for example, the combined output of three individual neurons
from being reduced to the output of just one of those neurons
multiplied by three. Stack purely linear neurons as deep as you like
and the whole network collapses into a single linear function; one
non-linearity in between breaks that reduction.
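
To make that concrete, here is a toy sketch in Python/NumPy (my own
illustration, nothing to do with GPT-3's actual code): two linear
layers always reduce to a single matrix, but putting a non-linearity
such as ReLU between them defeats any single-matrix reduction.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))     # 100 sample inputs, 4 features each

W1 = rng.normal(size=(4, 8))      # weights of a first layer of neurons
W2 = rng.normal(size=(8, 3))      # weights of a second layer

# Purely linear layers: two layers reduce exactly to one matrix,
# so the extra depth buys nothing.
two_linear_layers = (x @ W1) @ W2
one_linear_layer = x @ (W1 @ W2)
print(np.allclose(two_linear_layers, one_linear_layer))   # True

# Insert a non-linearity (ReLU) between the layers and the reduction
# fails: no single matrix W reproduces relu(x @ W1) @ W2 for all inputs.
nonlinear = np.maximum(x @ W1, 0) @ W2
W_best, *_ = np.linalg.lstsq(x, nonlinear, rcond=None)  # best linear fit
print(np.allclose(nonlinear, x @ W_best))                # False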
The capacity to learn from experience and then extrapolate that
experience into the future is indistinguishable from intelligence and
therefore, in my theory, IS intelligence. And the way we assess
intelligence is by observing the speed, accuracy, efficiency, and
depth of the extrapolated output. So all else being equal, if two
individuals are thrust into an unfamiliar environment, then the one
who adapts faster to the new environment is likely to be the more
intelligent of the two.
> I happen to be very interested in text generation via AI, so I'm not
> disparaging it by any means, but like all this other stuff, it's
> dead behind the eyes.
Baby steps, Dylan. Baby steps. It can't be a mere coincidence that
large sparse networks of mathematical neurons can so convincingly
mimic large sparse networks of biological neurons, i.e. brains. From
where I am standing, the biggest difference seems to be that
biological neurons utilize time in a way that artificial neurons do
not. Deep learning networks have the appropriate space complexity;
they just don't utilize time complexity the way biological brains do.
An artificial neuron is a stateless function of its current inputs,
whereas a biological neuron integrates its inputs over time, so the
timing of signals carries information.
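
To illustrate the difference in a toy way (using the leaky
integrate-and-fire neuron, a standard textbook spiking model, with
constants I made up for the example): the artificial neuron below sees
only its current input, while the spiking neuron carries state between
time steps, so the same total input produces different outputs
depending on when it arrives.

import numpy as np

def artificial_neuron(x, w):
    """Stateless: the output depends only on the current input."""
    return np.tanh(np.dot(w, x))

def leaky_integrate_and_fire(inputs, leak=0.5, threshold=1.0):
    """Stateful: membrane potential integrates inputs over time and
    leaks between steps, so the timing of inputs matters."""
    v, spikes = 0.0, []
    for i in inputs:
        v = leak * v + i       # integrate new input, leak old charge
        if v >= threshold:     # fire a spike and reset
            spikes.append(1)
            v = 0.0
        else:
            spikes.append(0)
    return spikes

# Same total input, different timing, different spike trains:
print(leaky_integrate_and_fire([0.7, 0.7, 0.0, 0.0]))  # [0, 1, 0, 0]
print(leaky_integrate_and_fire([0.7, 0.0, 0.7, 0.0]))  # [0, 0, 0, 0]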
Also, instead of thinking of deep nets as complete brains, we should
think of them as brain structures. A deep convolutional network is not
an entire brain in and of itself but is instead like a visual cortex.
And GPT-3 is not a full brain, but it would make a fine language
center. In fact, if someone could somehow connect all the various
specialized deep learning "parlor tricks" together such that they
could share information with one another, and perhaps add some kind of
"ego module" with artificial instinctive drives, then we would have a
pretty good approximation of an AGI.
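
Purely as a hypothetical sketch of the plumbing (the module names, the
shared-workspace vector, and the drive weights are all invented for
illustration here, not any published architecture), the wiring might
look something like this in Python:

import numpy as np

DIM = 64  # width of a shared "workspace" vector all modules read/write
rng = np.random.default_rng(42)

class Module:
    """Stand-in for one specialized deep net (vision, language, etc.)."""
    def __init__(self, name):
        self.name = name
        self.proj = rng.normal(scale=DIM ** -0.5, size=(DIM, DIM))

    def read_write(self, workspace):
        # Placeholder for running a real network and projecting its
        # output back into the shared workspace.
        return np.tanh(workspace @ self.proj)

class EgoModule:
    """Hypothetical arbiter: blends module outputs by instinctive drives."""
    def __init__(self, drives):
        self.drives = drives  # e.g. {"curiosity": 0.7, "safety": 0.3}

    def integrate(self, outputs):
        # Crude drive-weighted blend of the modules' contributions.
        w = np.array(list(self.drives.values()))
        w = w / w.sum()
        return sum(wi * out for wi, out in zip(w, outputs))

modules = [Module("visual_cortex"), Module("language_center")]
ego = EgoModule({"curiosity": 0.7, "safety": 0.3})

workspace = np.zeros(DIM)
workspace[0] = 1.0                       # some initial stimulus
for _ in range(3):                       # a few "thought" cycles
    outputs = [m.read_write(workspace) for m in modules]
    workspace = ego.integrate(outputs)   # ego writes the shared state back
print(workspace[:5])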
Stuart LaForge