[ExI] LLMs cannot be conscious
Stuart LaForge
avant at sollegro.com
Sun Mar 19 21:13:51 UTC 2023
Quoting Jason Resch via extropy-chat <extropy-chat at lists.extropy.org>:
> On Sun, Mar 19, 2023, 2:04 AM Gordon Swobe via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>> Consider that LLMs are like dictionaries. A complete dictionary can give
>> you the definition of any word, but that definition is in terms of other
>> words in the same dictionary. If you want to understand *meaning* of any
>> word definition, you must look up the definitions of each word in the
>> definition, and then look up each of the words in those definitions, which
>> leads to an infinite regress.
There are multiple LLM architectures. Before OpenAI's GPT, most
natural language models were recurrent neural networks (RNNs), and
attention was modelled as short-term memory loops of neural impulses
in which downstream neurons feed back onto upstream neurons. The
problem is that RNNs are slower to train than feed-forward neural
networks (FNNs). The innovation in the transformer is self-attention,
which is similar to convolution in that attention occurs across a
layer in parallel rather than between layers in loops. This allowed
transformers like the GPT series to train much faster than RNN
language models, at the cost of more layers and some purported
information loss.
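To make that contrast concrete, here is a minimal sketch of scaled
dot-product self-attention in Python/NumPy (the function and variable
names are my own and purely illustrative, not anyone's actual
implementation): every position attends to every other position within
the same layer, in parallel, with no recurrent loop over time.

# Minimal sketch of scaled dot-product self-attention (illustrative names).
# Every position attends to every other position in the same layer, in
# parallel, rather than through recurrent feedback loops across time steps.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; W*: learned projections."""
    Q = X @ Wq                       # queries
    K = X @ Wk                       # keys
    V = X @ Wv                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every position to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V               # each output mixes information from all positions

# Toy usage: 4 tokens, 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)

The whole (4, 4) attention matrix is computed in one shot, which is
what lets transformers parallelize training across the sequence.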
Interestingly, there is evidence that biological connectomes have a
neural network topology similar to RNNs, with loops and such, at least
in Drosophila. Here is the paper if you are interested:
https://www.science.org/doi/10.1126/science.add9330
If biological brains are indeed RNNs, that would suggest that:
1. Biological brains take longer to train than FNNs do. That is borne
out by comparing even the brightest of our children, who take years to
train, with GPT-3, which can be fully trained in a matter of days to
weeks.
2. Biological brains have fewer layers than FNNs do. Check. GPT models
have on the order of a hundred layers (96 in GPT-3), whereas the human
brain has approximately a dozen, counting both input and output layers.
[snip]
> To move forward, we need to answer:
>
> 1. What is meaning?
I have been struggling to find connections between semantics and
information theory for quite a while now. I can summarize my findings
thusly:
1. The meaning of a message is subjective: it will signify different
things to the sender, the intended recipients, and unintended recipients.
2. The meaning of a symbol, word, token, icon, or message is context
dependent. As the 20th-century linguist John Rupert Firth elegantly put
it, "You shall know a word by the company it keeps." This is why
understanding of sentences and text might be an emergent property of
the statistical clustering of words, even in people (see the toy
co-occurrence sketch after this list).
Take, for example, the word "run". The verb form has over 600 different
definitions according to Merriam-Webster. As this excerpt from
Reader's Digest suggests, the only way you can understand the word
"run" is through its context in a larger body of text:
"When you run a fever, for example, those three letters have a very
different meaning than when you run a bath to treat it, or when your
bathwater subsequently runs over and drenches your cotton bath runner,
forcing you to run out to the store and buy a new one. There, you run
up a bill of $85 because besides a rug and some cold medicine, you
also need some thread to fix the run in your stockings and some tissue
for your runny nose and a carton of milk because you’ve run through
your supply at home, and all this makes dread run through your soul
because your value-club membership runs out at the end of the month
and you’ve already run over your budget on last week’s grocery run
when you ran over a nail in the parking lot and now your car won’t
even run properly because whatever idiot runs that Walmart apparently
lets his custodial staff run amok and you know you’re letting your
inner monologue run on and on but, gosh—you’d do things differently if
you ran the world. (And breathe). Maybe you should run for office."
3. Real-world referents might serve as conceptual landmarks or
reference points from which to measure the truth values of abstract
statements. This is the whole notion behind the "Mary the color
scientist" argument: does Mary, raised in a black-and-white
environment, understand color?
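Firth's point about company can be illustrated with a toy sketch (my
own, not a claim about how LLMs are actually trained): count which
words co-occur with which in a tiny corpus, then compare words by the
overlap in their company.

# Toy distributional-semantics sketch: a word's "meaning" as the company it keeps.
# Counts co-occurrences within each short phrase, then compares words by cosine similarity.
from collections import Counter
from itertools import combinations
import math

corpus = [
    "run a fever", "run a bath", "run a bill",
    "draw a bath", "catch a fever", "pay a bill",
]

window_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for w1, w2 in combinations(words, 2):
        window_counts[(w1, w2)] += 1
        window_counts[(w2, w1)] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def vector(word):
    return [window_counts[(word, other)] for other in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# "run" shares company with "catch", "draw", and "pay"; its meaning shifts with that company.
for other in ("catch", "draw", "pay"):
    print(other, round(cosine(vector("run"), vector(other)), 3))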
> 2. Do human brains contain meaning?
They contain memories, and memories have meaning.
> 3. How is meaning present or inherent in the organization of neurons in the
> human brain?
Since the connectomes of biological brains use recurrent loops to
model attention and possibly memory, I would say that is where it resides.
> 4. Can similar organizations that create meaning in the human brain be
> found within LLMs?
As I go into in the other thread, transformer LLMs don't use recurrent
feedback loops the way RNNs do to model attention; instead they model
attention with massively parallel, feed-sideways connections within a
layer, in a process known as self-attention. This allows faster
training of an FNN, at the trade-off of more memory through an
increased number of layers. There is a paper by Facebook/Meta
researchers suggesting there is also some information loss in pure FNN
transformers, but I haven't analyzed it.
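For contrast with the self-attention sketch above, here is an equally
minimal sketch of a vanilla recurrent step (again my own illustrative
code, not any particular model): the hidden state feeds back onto
itself, so tokens must be processed one at a time rather than in
parallel.

# Minimal sketch of a recurrent step: the hidden state feeds back on itself,
# so the sequence must be processed one token at a time (no parallelism
# across positions, unlike self-attention).
import numpy as np

def rnn_forward(X, Wx, Wh, b):
    """X: (seq_len, d_in) token embeddings; returns hidden states, computed sequentially."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in X:                        # each step depends on the previous one
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))              # 4 tokens, 8-dim embeddings
Wx = rng.normal(size=(16, 8))
Wh = rng.normal(size=(16, 16))
b = np.zeros(16)
print(rnn_forward(X, Wx, Wh, b).shape)   # (4, 16)

The sequential dependence on the previous hidden state is exactly what
makes RNN training slower, and what the transformer's in-layer
parallelism trades away.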
>
> Answering these questions is necessary to move forward. Otherwise we will
> only go back and forth with some saying that LLMs are more like
> dictionaries, and others saying LLMs are more like language processing
> centers of human brains.
Those are my thoughts on the matter. I hope that gives us a good
foundation to discuss it. Broca's area of the brain and LLMs might be
similar mappings that are orthogonal to one another. Language centers
might use circular definitions in time, and LLMs might use circular
definitions in space. Of course, dictionaries also contain circular
definitions of word clusters, since synonyms are used to define one
another. Strange loops in space rather than strange loops in time.
Humans and LLMs might have orthogonal consciousnesses.
Stuart LaForge