[ExI] LLMs and Computability

Wed Apr 19 21:50:36 UTC 2023

There's a running thread in LLM discourse about how LLMs don't have a world
model and therefore are a non-starter on the path to AGI.

And indeed, on a surface level, this is true. An LLM is a function - it
maps token vectors to other token vectors. No cognition involved.

And yet, I wonder if this perspective is missing something important. The
token vectors LLMs are working with aren't just structureless streams of
tokens. They're human language - generated by human-level general
processing.

That seems like it's the secret sauce - the ginormous hint that got ignored
by data/statistics-centric ML researchers. When you learn how to map token
streams with significant internal structure, the function your neural net
is being trained to approximate will inevitably come to implement at least
some of the processing that generated your token streams.

It won't do it perfectly, and it'll be broken in weird ways. But not
completely nonfunctional. Actually pretty darned useful. A direct analogy
that comes to mind would be training a deep NN on mapping assembler program
listings to output. What you will end up with is a learned model that, to
paraphrase Greenspun's Tenth Rule, "contains an ad hoc,
informally-specified, bug-ridden, slow implementation of half of" a
Turing-complete computer.
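
To make the analogy a bit more concrete, here is a minimal sketch of what the
training data for such an experiment might look like. Everything in it is
made up for illustration: the four-instruction toy "assembler", the function
names, and the pair format are assumptions, and the toy has no jumps or
memory, so it is nowhere near Turing-complete. The point is only that every
(listing, output) pair is produced by an actual interpreter, so a model that
learns the mapping well has to approximate some of what run() does.

```python
import random

def run(listing):
    """Execute a toy assembly listing and return what it prints."""
    acc, out = 0, []
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        op, *args = line.split()
        if op == "LOAD":
            acc = int(args[0])
        elif op == "ADD":
            acc += int(args[0])
        elif op == "MUL":
            acc *= int(args[0])
        elif op == "PRINT":
            out.append(str(acc))
        else:
            raise ValueError(f"unknown instruction: {op}")
    return " ".join(out)

def random_listing(rng, length=4):
    """Build a random program: one LOAD, some arithmetic, one PRINT."""
    lines = [f"LOAD {rng.randint(0, 9)}"]
    for _ in range(length):
        lines.append(f"{rng.choice(['ADD', 'MUL'])} {rng.randint(0, 9)}")
    lines.append("PRINT")
    return "\n".join(lines)

def make_pairs(n, seed=0):
    """Return n (listing, output) pairs, the hypothetical training set."""
    rng = random.Random(seed)
    listings = [random_listing(rng) for _ in range(n)]
    return [(listing, run(listing)) for listing in listings]
```

Calling make_pairs(1000) would give a thousand input/output strings that look
like plain token streams to a learner, even though each one was generated by
running a small machine.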

The token streams GPT is trained on represent an infinitesimal fraction of
a tiny corner of the space of all token streams, but they're token streams
generated by human-level general intelligences. This seems to me to suggest
that LLMs could very well be implementing significant pieces of General
Intelligence, and that this is why they're so surprisingly capable.

Thoughts?