[ExI] Emily M. Bender — Language Models and Linguistics (video interview)

Jason Resch jasonresch at gmail.com
Sat Mar 25 22:47:10 UTC 2023


Hi Gordon,

Thanks for sharing this video. I watched it and found the following points
of interest:

*1. She said they can't possibly be understanding, as they are only seeing a
sequence of characters and predicting distributions, and that what these
models do is not the same thing as understanding language.*
My Reply: These models demonstrate many emergent capabilities that were not
things that were programmed in or planned. They can answer questions,
summarize texts, translate languages, write programs, etc. All these
abilities emerged purely from being trained on the single task of
predicting text. Given this, can we be certain that "understanding" is not
another one of the emergent capabilities manifested by the LLM?

*2. She uses the analogy that the LLM looking at characters would be the
same as a human who doesn't understand Cherokee looking at Cherokee
characters.*
My Reply: This is reminiscent of Searle's Chinese Room. The error is
looking at the behavior of the computer only at the lowest level, while
ignoring the goings-on at the higher levels. She sweeps all possible
behavior of a computer under the umbrella of "symbol manipulation", but
anything computable can be framed as "symbol manipulation" if described at
that level (including what atoms, or neurons in the human brain, do).
This therefore fails as an argument that no understanding exists in the
higher-level description of the processing performed by the computer
program.
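
To make the point concrete, here is a small sketch of my own (a toy example,
not anything from the interview): the very same computation can be described
as meaningless bit shuffling at one level and as meaningful arithmetic at
another, so the low-level description settles nothing about understanding.

# Toy illustration (my own example): one computation, two levels of
# description. At the low level this is nothing but AND/XOR/shift on symbols;
# at the high level it is addition, with all the meaning that carries.

def add_bitwise(a: int, b: int) -> int:
    """Add two non-negative integers using only bit operations."""
    while b:
        carry = a & b      # positions where both inputs have a 1 bit
        a = a ^ b          # bitwise sum ignoring carries
        b = carry << 1     # shift carries into the next position
    return a

# Low-level description: shuffling of 0s and 1s.
# High-level description: computing a sum, e.g. a total price.
print(add_bitwise(19, 23))  # 42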

*3. She was asked what a machine would have to do to convince her that it has
understanding. Her example was that if Siri or Alexa were asked to do
something in the real world, like turn on the lights, and it did so, then it
has understanding (by virtue of having done something in the real world).*
My Reply: Perhaps she does not see the analogy between turning a light on or
off and an LLM's ability to output characters to a monitor, which is also an
interaction with the real world (turning many thousands of pixels on the
user's monitor on and off as they read the reply).

*4. She admits her octopus test is exactly like the Turing test. She claims
the hyper-intelligent octopus would be able to send some pleasantries and
temporarily fool the other person, but that it has no real understanding
and this would be revealed if there were any attempt to communicate about
any real ideas.*
My Reply: I think she must be totally unaware of the capabilities of recent
models like GPT-4 to come to a conclusion like this.

*5. The interviewer pushes back and says he has learned a lot about math
despite not seeing or experiencing mathematical objects, and that he has
graded a blind student's paper which appeared to show the student could
visualize mathematical objects despite not being sighted. She says the
octopus never learned language: we acquired a linguistic system, but the
hyper-intelligent octopus has not, and all the octopus has learned is the
distributional patterns of language.*
My Reply: I think the crucial piece missing from her understanding of LLMs
is that the only way for them to achieve the levels of accuracy in the text
they predict is by constructing internal mental models of reality. That is
the only way they can answer hypotheticals concerning novel situations
described to them, or, for example, play chess. The only way an LLM can play
chess is by internally constructing a model of the board and pieces. This
cannot be explained in terms of mere patterns or distributions of language;
otherwise the LLM would be as likely to guess any legal move as an optimal
one. And since one can readily reach a chess position that has never before
appeared in the history of the universe, we can know the LLM is not relying
on memory.
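
As a rough back-of-the-envelope illustration (my own numbers, deliberately
conservative), here is why a mid-game position is virtually guaranteed to be
novel and therefore cannot be handled by memorized move distributions:

# Crude lower bound on distinct chess game prefixes vs. any plausible corpus.
branching_factor = 10    # very conservative; ~30-35 legal moves is typical
plies = 40               # 20 moves by each side

distinct_prefixes = branching_factor ** plies
corpus_games = 10 ** 10  # generous upper bound on games in any training set

print(f"distinct 40-ply prefixes (lower bound): {distinct_prefixes:.1e}")
print(f"games plausibly seen in training:       {corpus_games:.1e}")
print(f"ratio: {distinct_prefixes / corpus_games:.1e}")
# Even this crude bound leaves ~1e30 prefixes per training game, so coherent
# play from such positions cannot be explained by memorization alone.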

*6. The interviewer asks what prevents the octopus from learning language
over time as a human would. She says it requires joint attention: seeing
some object paired with some word at the same time.*
My Reply: Why can't joint attention manifest as the co-occurrence of words
as they appear within a sentence, paragraph, or topic of discussion?
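
As a toy illustration of what I mean (a sketch of my own using an invented
mini-corpus, not a claim about how LLMs are actually implemented), simple
co-occurrence counts already make words that appear in similar contexts look
similar to one another:

from collections import Counter, defaultdict
from math import sqrt

corpus = [
    "turn on the lamp in the bedroom",
    "turn on the light in the kitchen",
    "switch off the lamp before sleeping",
    "switch off the light before leaving",
    "feed the cat in the kitchen",
    "feed the dog in the garden",
]

# Count, for every word, which other words share a sentence with it.
cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for w in words:
        for c in words:
            if c != w:
                cooc[w][c] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

print(cosine(cooc["lamp"], cooc["light"]))  # high: many shared contexts
print(cosine(cooc["lamp"], cooc["dog"]))    # lower: few shared contexts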

*7. The interviewer asks: do you think there is some algorithm that could
possibly exist that could take a stream of words and understand them in that
sense? She answers yes, but that it would require programming in from the
start the structure and meanings of the words and mapping them to a model of
the world, or providing the model with other sensors or imagery. The
interviewer confirms: "You are arguing that just consuming language without
all this extra stuff, that no algorithm could just from that, really
understand language?" She says that's right.*
My Reply: We already know that these models build maps of things
corresponding to reality in their heads. See, for example, the paper I
shared where the AI was given a description of how rooms were connected to
each other, and the AI was then able to draw the layout of the rooms from
this textual description. If that is not an example of understanding, I
don't know what possibly could be. Note also: this was an early model of
GPT-4 before it had been trained on images; it was trained purely on text.
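
To illustrate the general idea (this is my own toy reconstruction, not the
actual experiment from the paper), a purely textual description of how rooms
connect already determines a spatial layout that a simple program can
recover:

from collections import deque

# Hypothetical textual description, one relation per line.
description = [
    "the kitchen is north of the hall",
    "the study is east of the hall",
    "the garden is south of the hall",
]

OFFSETS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}

edges = []
for line in description:
    words = line.split()
    room_a, direction, room_b = words[1], words[3], words[6]
    edges.append((room_a, direction, room_b))

# Assign coordinates by breadth-first propagation from an anchor room.
coords = {"hall": (0, 0)}
queue = deque(["hall"])
while queue:
    current = queue.popleft()
    for a, d, b in edges:
        if b == current and a not in coords:
            dx, dy = OFFSETS[d]
            coords[a] = (coords[b][0] + dx, coords[b][1] + dy)
            queue.append(a)

print(coords)
# {'hall': (0, 0), 'kitchen': (0, 1), 'study': (1, 0), 'garden': (0, -1)}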

*8. She says: imagine that you are dropped into the middle of the Thai
Library of Congress and you have any book you could possibly want, but only
in Thai. Could you learn Thai? The interviewer says: I think so. She asks:
What would you do first, where would you start? She adds that if you just
have form, that's not going to give you information. She then says she would
have to find an encyclopedia or a translation of a book we know.*
My Reply: We know there is information (objectively) in the Thai library,
even if there were no illustrations or copies of books we had translations
of. We know the Thai library contains scrutable information because the text
is compressible. If text is compressible, it means there are discoverable
patterns in the text which can be exploited to reduce the number of bits
needed to represent it. All our understanding can be viewed as a form of
compression. For example, the physical laws that we have discovered
"compress" the amount of information we need to store about the universe.
Moreover, when compression works by constructing an internal toy model of
reality, we can play with and permute the inputs to the model to see how it
behaves under different situations. This provides a genuine understanding of
the outer world on which our sensory inputs are based. I believe the LLM has
successfully done this in order to predict text: it has various internal,
situational models it can deploy to help it in predicting text. Having these
models, and knowing when and how to use them, is, I argue, tantamount to
understanding.
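
Here is a quick sketch of the compressibility point (my own example, using
ordinary English stand-in text rather than Thai): structured text compresses
dramatically, random noise barely at all, and that difference is exactly the
presence of discoverable patterns.

import os
import zlib

structured = ("the library contains many books and every book contains "
              "many sentences and every sentence contains many words ") * 50
random_noise = os.urandom(len(structured))

# Compare original size to compressed size for each input.
print(len(structured), "->", len(zlib.compress(structured.encode())))
print(len(random_noise), "->", len(zlib.compress(random_noise)))
# The structured text shrinks by a large factor; the random bytes hardly
# shrink at all. Exploitable regularity is what a learner (or an LLM) finds.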


Jason



On Sat, Mar 25, 2023 at 4:30 PM Gordon Swobe via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

> I mentioned Emily Bender in another thread. She is Professor of
> Linguistics and Faculty Director of the Master's Program in Computational
> Linguistics at University of Washington.
>
> In the other thread, I made the mistake of introducing her with her
> Octopus thought experiment which I soon realized from the responses here is
> easily misinterpreted outside of the context of her general thesis and the
> academic paper in which she introduced it.
>
> As I learned from this interview, she and her colleague Koller wrote that
> paper in response to a twitter debate in which she found herself arguing
> with non-linguists who insist that language models understand language.
> Like me, she is critical of such claims. She considers them "hype."
>
> The relevant material starts at the 26 minute mark.
>
> https://www.youtube.com/watch?v=VaxNN3YRhBA
>

