[ExI] LLMs and Computability
gordon.swobe at gmail.com
Thu Apr 20 17:05:01 UTC 2023
On Wed, Apr 19, 2023 at 5:25 PM Jason Resch via extropy-chat <
extropy-chat at lists.extropy.org> wrote:
> I believe the only way to explain their demonstrated capacities is to
> assume that they create models.
While, as you know, I argue against the notion that language models have
anything like what most people mean by consciousness, I also find GPT-4 a
good source of information about language models. Here is what it says
about whether it creates internal models.
Q: Do large language models construct internal models of the world? And, if
so, what are the building blocks of these models? How do they work?
A: Yes, large language models like GPT-4 construct internal models of the
world, though these models are not as explicit or well-defined as a human's
mental model. These internal models emerge from the process of learning
patterns in the text data the model has been trained on. The building
blocks of these models are primarily embeddings and neural network layers,
which are responsible for encoding and processing information.
Embeddings: Embeddings are high-dimensional vector representations of
words, phrases, or even sentences. These representations capture semantic
and syntactic relationships between different linguistic elements in a
continuous vector space. Embeddings help the model to understand and
generalize across words and phrases with similar meanings or functions.
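To make the idea of a continuous vector space concrete, here is a minimal
sketch in Python. The three-dimensional vectors are invented purely for
illustration (real models learn vectors with hundreds or thousands of
dimensions), and the names EMBEDDINGS, cosine_similarity and most_similar
are hypothetical, not taken from any library or from GPT-4 itself.

```python
import math

# Hypothetical toy embeddings, made up for illustration only.
# In a trained model these values are learned, not hand-written.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word):
    """Return the other word whose vector points most nearly the same way."""
    target = EMBEDDINGS[word]
    candidates = (w for w in EMBEDDINGS if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, EMBEDDINGS[w]))


if __name__ == "__main__":
    print(most_similar("cat"))
```

With these made-up numbers, "cat" lies much closer to "dog" than to "car",
which is the sense in which nearby vectors stand for related words.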
Neural Network Layers: Language models use deep learning architectures,
such as Transformer-based networks, which consist of multiple layers of
interconnected neurons. These layers enable the model to learn complex
relationships and patterns in the input data. The neural network is
organized into encoder and decoder layers, which are responsible for
processing input sequences and generating output sequences, respectively.
When a large language model receives input, such as a text prompt, it
processes the input through its neural network layers. As the input moves
through the layers, the model makes predictions about the relationships
between words and phrases based on the patterns it has learned during
training. The model then generates a response or output based on these
predictions.
It's essential to understand that the internal models constructed by large
language models are inherently probabilistic and are based on patterns
learned from the text data. They do not have a deep, structured
understanding of the world like humans do, and their knowledge is limited
to the data they have been trained on. While they can generate coherent and
contextually relevant responses, they may not always produce accurate or