[ExI] Emily M. Bender — Language Models and Linguistics (video interview)

Giovanni Santostasi gsantostasi at gmail.com
Sat Mar 25 23:39:17 UTC 2023


How can a linguist not know about the decipherment of Linear B?
Point 8 from Jason's list is exactly what happened with Linear B. LLMs
actually have a much easier task, given that they are pre-trained on
supervised data rather than with the completely unsupervised approach that
people had to use when they tried to decipher Linear B.

Giovanni


On Sat, Mar 25, 2023 at 3:49 PM Jason Resch via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

> Hi Gordon,
>
> Thanks for sharing this video. I watched it and found the following
> points of interest:
>
> *1. She said they can't possibly be understanding, as they are only seeing
> a sequence of characters and predicting distributions, and that what these
> models do is not the same thing as understanding language.*
> My Reply: These models demonstrate many emergent capabilities that were
> not things that were programmed in or planned. They can answer questions,
> summarize texts, translate languages, write programs, etc. All these
> abilities emerged purely from being trained on the single task of
> predicting text. Given this, can we be certain that "understanding" is not
> another one of the emergent capabilities manifested by the LLM?
>
> *2. She uses the analogy that the LLM looking at characters would be the
> same as a human who doesn't understand Cherokee looking at Cherokee
> characters.*
> My Reply: This is reminiscent of Searle's Chinese Room. The error is
> looking at the behavior of the computer only at the lowest level, while
> ignoring the goings-on at the higher levels. She sweeps all possible
> behavior of a computer under the umbrella of "symbol manipulation", but
> anything computable can be framed under "symbol manipulation" if described
> on that level (including what atoms, or neurons in the human brain do).
> This therefore fails as an argument that no understanding exists in the
> higher-level description of the processing performed by the computer
> program.
>
> *3. She was asked what a machine would have to do to convince her it has
> understanding. Her example was that if Siri or Alexa were asked to do
> something in the real world, like turn on the lights, and did it, then it
> would have understanding (by virtue of having done something in the real
> world).*
> My Reply: Perhaps she does not see the analogy between turning a light on
> or off and an LLM's ability to output characters to a monitor, which is
> also interaction with the real world (turning many thousands of pixels on
> the user's monitor on and off as the user reads the reply).
>
> *4. She admits her octopus test is exactly like the Turing test. She
> claims the hyper-intelligent octopus would be able to send some
> pleasantries and temporarily fool the other person, but that it has no real
> understanding and this would be revealed if there were any attempt to
> communicate about any real ideas.*
> My Reply: I think she must be totally unaware of the capabilities of
> recent models like GPT-4 to come to a conclusion like this.
>
> *5. The interviewer pushes back and says he has learned a lot about math
> despite not seeing or experiencing mathematical objects, and that he has
> graded a blind student's paper which appeared to show the student could
> visualize objects in math despite not being sighted. She says the octopus
> never learned language: we acquired a linguistic system, but the
> hyper-intelligent octopus has not, and all the octopus has learned is
> language distribution patterns.*
> My Reply: I think the crucial piece missing from her understanding of LLMs
> is that the only way for them to achieve the levels of accuracy in the text
> that they predict is by constructing internal mental models of reality.
> That is the only way they can answer hypotheticals concerning novel
> situations described to them or, for example, to play chess. The only way
> to play chess with an LLM is if it is internally constructing a model of
> the board and pieces; this cannot be explained in terms of mere patterns or
> distributions of language. Otherwise, the LLM would be as likely to guess
> any legal move as an optimal one. And since one can readily construct a
> chess board position that has never before appeared in the history of the
> universe, we can know the LLM is not relying on memory.
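>
> A rough back-of-the-envelope sketch in Python illustrates the point (the
> ~30-moves-per-position branching factor and the corpus size are assumed
> round numbers, not exact figures):
>
>     # How quickly distinct chess game prefixes outgrow any training corpus.
>     BRANCHING_FACTOR = 30        # assumed average number of legal moves
>     CORPUS_TOKENS = 10 ** 13     # assumed size of a very large text corpus
>
>     for plies in (10, 20, 30, 40):
>         prefixes = BRANCHING_FACTOR ** plies
>         print(f"~{prefixes:.1e} distinct {plies}-ply games "
>               f"vs ~{CORPUS_TOKENS:.0e} training tokens")
>
> Even at 20 plies the number of possible games dwarfs anything that could
> have been memorized, so competent play cannot be explained by lookup.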
>
> *6. The Interviewer asks what prevents the octopus from learning language
> over time as a human would? She says it requires joint-attention: seeing
> some object paired with some word at the same time.*
> My Reply: Why can't joint attention manifest as the co-occurrence of words
> as they appear within a sentence, paragraph, or topic of discussion?
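>
> A minimal sketch of what I mean (the toy corpus and the sentence-level
> window are made up purely for illustration, using only Python's standard
> library):
>
>     from collections import Counter
>     from itertools import combinations
>
>     # Toy corpus; in an LLM's training data such pairings occur billions of times.
>     sentences = [
>         "the cat sat on the mat",
>         "the cat chased the mouse",
>         "the dog chased the cat",
>     ]
>
>     co_occurrence = Counter()
>     for sentence in sentences:
>         words = set(sentence.split())
>         # Words appearing in the same sentence share a context, a text-only
>         # stand-in for two speakers jointly attending to the same scene.
>         for a, b in combinations(sorted(words), 2):
>             co_occurrence[(a, b)] += 1
>
>     print(co_occurrence.most_common(5))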
>
> *7. The interviewer asks do you think there is some algorithm that could
> possibly exist that could take a stream of words and understand them in
> that sense? She answers yes, but that would require programming in from the
> start the structure and meanings of the words and mapping them to a model
> of the world, or providing the model other sensors or imagery. The
> interviewer confirms: "You are arguing that, just from consuming language
> without all this extra stuff, no algorithm could really understand
> language?" She says that's right.*
> My Reply: We already know that these models build internal maps of things
> corresponding to reality. See, for example, the paper I shared where the AI
> was given a description of how rooms were connected to each other, and was
> then able to draw the layout of the rooms from this textual description. If
> that is not an example of understanding, I don't know what possibly could
> be. Note also: this was an early version of GPT-4, before it had been
> trained on images; it was trained purely on text.
>
> *8. She says: imagine that you are dropped into the middle of the Thai
> Library of Congress and you have any book you could possibly want, but only
> in Thai. Could you learn Thai? The interviewer says: I think so. She asks:
> What would you first do, where would you start? She adds that if you just
> have form, that's not going to give you information. She then says she
> would have to find an encyclopedia or a translation of a book we know.*
> My Reply: We know there is information (objectively) in the Thai library,
> even if there were no illustrations or copies of books for which we had
> translations. We know the Thai library contains scrutable information
> because the text is compressible. If text is compressible, it means there
> are discoverable patterns in the text which can be exploited to reduce the
> number of bits needed to represent it. All our understanding can be viewed
> as a form of compression. For example, the physical laws that we have
> discovered "compress" the amount of information we need to store about the
> universe. Moreover, when compression works by constructing an internal toy
> model of reality, we can play with and permute the inputs to the model to
> see how it behaves under different situations. This provides a genuine
> understanding of the outer world on which our sensory inputs are based. I
> believe the LLM has successfully done this in order to predict text: it has
> various internal, situational models it can deploy to help it in predicting
> text. Having these models and knowing when and how to use them, I argue, is
> tantamount to understanding.
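>
> A minimal sketch of the compressibility point, using Python's standard
> zlib (the sample text is made up; any natural-language passage behaves the
> same way):
>
>     import os
>     import zlib
>
>     text = ("the library contains many books and every book contains "
>             "many pages of text " * 200).encode()
>     noise = os.urandom(len(text))   # patternless bytes for comparison
>
>     for label, data in (("structured text", text), ("random bytes", noise)):
>         ratio = len(zlib.compress(data)) / len(data)
>         print(f"{label}: compresses to {ratio:.0%} of its original size")
>
> The structured text shrinks dramatically while the random bytes do not,
> which is the objective signature of discoverable patterns.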
>
>
> Jason
>
>
>
> On Sat, Mar 25, 2023 at 4:30 PM Gordon Swobe via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>> I mentioned Emily Bender in another thread. She is Professor of
>> Linguistics and Faculty Director of the Master's Program in Computational
>> Linguistics at University of Washington.
>>
>> In the other thread, I made the mistake of introducing her with her
>> Octopus thought experiment which I soon realized from the responses here is
>> easily misinterpreted outside of the context of her general thesis and the
>> academic paper in which she introduced it.
>>
>> As I learned from this interview, she and her colleague Koller wrote that
>> paper in response to a Twitter debate in which she found herself arguing
>> with non-linguists who insist that language models understand language.
>> Like me, she is critical of such claims. She considers them "hype."
>>
>> The relevant material starts at the 26 minute mark.
>>
>> https://www.youtube.com/watch?v=VaxNN3YRhBA
>>
>