[ExI] LLMs cannot be conscious
Stuart LaForge
avant at sollegro.com
Sun Mar 19 21:13:51 UTC 2023
Quoting Jason Resch via extropy-chat <extropy-chat at lists.extropy.org>:
> On Sun, Mar 19, 2023, 2:04 AM Gordon Swobe via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>> Consider that LLMs are like dictionaries. A complete dictionary can give
>> you the definition of any word, but that definition is in terms of other
>> words in the same dictionary. If you want to understand *meaning* of any
>> word definition, you must look up the definitions of each word in the
>> definition, and then look up each of the words in those definitions, which
>> leads to an infinite regress.
There are multiple LLM architectures. Before OpenAI's GPT, most
natural language models were recurrent neural networks (RNNs), and
attention was modelled as short-term memory loops of neural impulses
in which downstream neurons feed back onto upstream neurons. The
problem is that RNNs are slower to train than feed-forward neural
networks (FNNs). The innovation in the transformer is self-attention,
which is similar to convolution in that attention occurs across a
layer in parallel rather than between layers in loops. This allowed
transformers like the GPT series to train much faster than RNN
language models, at the cost of more layers and some purported
information loss.
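To make that contrast concrete, here is a minimal sketch of scaled
dot-product self-attention in Python/NumPy (the function and variable
names are my own and purely illustrative, not anyone's actual
implementation): every position attends to every other position within
the same layer, in parallel, with no recurrent loop over time.

# Minimal sketch of scaled dot-product self-attention (illustrative names).
# Every position attends to every other position in the same layer, in
# parallel, rather than through recurrent feedback loops across time steps.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; W*: learned projections."""
    Q = X @ Wq                       # queries
    K = X @ Wk                       # keys
    V = X @ Wv                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every position to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V               # each output mixes information from all positions

# Toy usage: 4 tokens, 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)

The whole (4, 4) attention matrix is computed in one shot, which is
what lets transformers parallelize training across the sequence.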
Interestingly, there is evidence that biological connectomes have a
neural network topology similar to RNNs, with loops and such, at least
in Drosophila. Here is the paper if you are interested:
https://www.science.org/doi/10.1126/science.add9330
If biological brains are indeed RNNs, that would suggest that:
1. Biological brains take longer to train than FNNs do. That is borne
out by comparing even the brightest of our children, who take years to
train, with GPT-3, which can be fully trained in a matter of days to
weeks.
2. Biological brains have fewer layers than FNNs do. Check. GPT models
have on the order of a hundred layers (96 in GPT-3), whereas the human
brain has approximately a dozen, counting both input and output layers.
[snip]
> To move forward, we need to answer:
>
> 1. What is meaning?
I have been struggling to find connections between semantics and
information theory for quite a while now. I can summarize my findings
thusly:
1. The meaning of a message is subjective: it will signify different
things to the sender, the intended recipients, and unintended recipients.
2. The meaning of a symbol, word, token, icon, or message is context
dependent. As the 20th-century linguist John Rupert Firth elegantly put
it, "You shall know a word by the company it keeps." This is why
understanding of sentences and text might be an emergent property of
the statistical clustering of words, even in people (see the toy
co-occurrence sketch after this list).
Take, for example, the word "run". The verb form has over 600 different
definitions according to Merriam-Webster. As this excerpt from
Reader's Digest suggests, the only way you can understand the word
"run" is through its context in a larger body of text:
"When you run a fever, for example, those three letters have a very
different meaning than when you run a bath to treat it, or when your
bathwater subsequently runs over and drenches your cotton bath runner,
forcing you to run out to the store and buy a new one. There, you run
up a bill of $85 because besides a rug and some cold medicine, you
also need some thread to fix the run in your stockings and some tissue
for your runny nose and a carton of milk because you’ve run through
your supply at home, and all this makes dread run through your soul
because your value-club membership runs out at the end of the month
and you’ve already run over your budget on last week’s grocery run
when you ran over a nail in the parking lot and now your car won’t
even run properly because whatever idiot runs that Walmart apparently
lets his custodial staff run amok and you know you’re letting your
inner monologue run on and on but, gosh—you’d do things differently if
you ran the world. (And breathe). Maybe you should run for office."
3. Real-world referents might serve as conceptual landmarks or
reference points from which to measure the truth values of abstract
statements. This is the whole notion behind the "Mary the color
scientist" argument: does Mary, raised in a black-and-white
environment, understand color?
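Firth's point about company can be illustrated with a toy sketch (my
own, not a claim about how LLMs are actually trained): count which
words co-occur with which in a tiny corpus, then compare words by the
overlap in their company.

# Toy distributional-semantics sketch: a word's "meaning" as the company it keeps.
# Counts co-occurrences within each short phrase, then compares words by cosine similarity.
from collections import Counter
from itertools import combinations
import math

corpus = [
    "run a fever", "run a bath", "run a bill",
    "draw a bath", "catch a fever", "pay a bill",
]

window_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for w1, w2 in combinations(words, 2):
        window_counts[(w1, w2)] += 1
        window_counts[(w2, w1)] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def vector(word):
    return [window_counts[(word, other)] for other in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# "run" shares company with "catch", "draw", and "pay"; its meaning shifts with that company.
for other in ("catch", "draw", "pay"):
    print(other, round(cosine(vector("run"), vector(other)), 3))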
> 2. Do human brains contain meaning?
They contain memories, and memories have meaning.
> 3. How is meaning present or inherent in the organization of neurons in the
> human brain?
Since the connectomes of biological brains use recurrent loops to
model attention and possibly memory, I would say that is where it resides.
> 4. Can similar organizations that create meaning in the human brain be
> found within LLMs?
As I go into in the other thread, transformer LLMs don't use recurrent
feedback loops the way RNNs do to model attention; instead they model
attention with massively parallel, feed-sideways connections within a
layer, in a process known as self-attention. This allows faster
training of an FNN, at the trade-off of more memory through an
increased number of layers. There is a paper by Facebook/Meta
researchers suggesting there is also some information loss in pure FNN
transformers, but I haven't analyzed it.
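For contrast with the self-attention sketch above, here is an equally
minimal sketch of a vanilla recurrent step (again my own illustrative
code, not any particular model): the hidden state feeds back onto
itself, so tokens must be processed one at a time rather than in
parallel.

# Minimal sketch of a recurrent step: the hidden state feeds back on itself,
# so the sequence must be processed one token at a time (no parallelism
# across positions, unlike self-attention).
import numpy as np

def rnn_forward(X, Wx, Wh, b):
    """X: (seq_len, d_in) token embeddings; returns hidden states, computed sequentially."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in X:                        # each step depends on the previous one
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))              # 4 tokens, 8-dim embeddings
Wx = rng.normal(size=(16, 8))
Wh = rng.normal(size=(16, 16))
b = np.zeros(16)
print(rnn_forward(X, Wx, Wh, b).shape)   # (4, 16)

The sequential dependence on the previous hidden state is exactly what
makes RNN training slower, and what the transformer's in-layer
parallelism trades away.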
>
> Answering these questions is necessary to move forward. Otherwise we will
> only go back and forth with some saying that LLMs are more like
> dictionaries, and others saying LLMs are more like language processing
> centers of human brains.
Those are my thoughts on the matter. I hope that gives us a good
foundation to discuss it. Broca's area of the brain and LLMs might be
similar mappings that are orthogonal to one another. Language centers
might use circular definitions in time, and LLMs might use circular
definitions in space. Of course, dictionaries also contain circular
definitions of word clusters, since synonyms are used to define one
another. Strange loops in space rather than strange loops in time.
Humans and LLMs might have orthogonal consciousnesses.
Stuart LaForge