[ExI] Emily M. Bender — Language Models and Linguistics (video interview)

Jason Resch jasonresch at gmail.com
Sun Mar 26 14:59:43 UTC 2023


Hi Gordon,

Thank you for noticing and pointing out my mistake. I did intend to reply
to the list.

Jason

On Sun, Mar 26, 2023, 10:52 AM Gordon Swobe <gordon.swobe at gmail.com> wrote:

> Jason, I received a reply to my last to you, but before I dig into it, I
> notice that it does not appear addressed to ExI. I am happy to reply
> privately, but I think you probably meant it for the list.
>
> In the meantime, I wanted to add to my comments about emergent properties
> that I have no argument with the idea that GPT might be exhibiting emergent
> properties -- I agree it certainly appears that way -- but I would say they
> are emergent properties of the grammatical relationships between and among
> words, not evidence of any understanding of the meanings. I think Bender
> touches on this subject in the interview, though without actually using the
> term "emergent properties."
>
> It could very well be that something like this is one aspect of human
> intelligence, but I think we also understand the meanings.
>
> -gts
>
>
>
>
>
> On Sun, Mar 26, 2023 at 2:56 AM Gordon Swobe <gordon.swobe at gmail.com>
> wrote:
>
>> I have a smart home. Some of the iPhone apps associated with it have and
>> can display what could be described as internal models representing my
>> home. Does this mean these apps have a conscious understanding of the
>> layout of my home? No, I think not, not as I and most people use the word
>> understand. Only minds can understand things, and despite my home being
>> "smart," I reject the idea that it has a mind of its own. That is nothing
>> more than foolish science fiction.
>>
>> -gts
>>
>> On Sun, Mar 26, 2023 at 12:01 AM Gordon Swobe <gordon.swobe at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sat, Mar 25, 2023 at 4:49 PM Jason Resch via extropy-chat <
>>> extropy-chat at lists.extropy.org> wrote:
>>>
>>>> Hi Gordon,
>>>>
>>>> Thanks for sharing this video. I watched it and found the following
>>>> points of interest:
>>>>
>>>> *1. She said they can't possibly be understanding as they are only
>>>> seeing a sequence of characters and predicting distributions and what these
>>>> models do is not the same thing as understanding language.*
>>>> My Reply: These models demonstrate many emergent capabilities that were
>>>> not things that were programmed in or planned. They can answer questions,
>>>> summarize texts, translate languages, write programs, etc. All these
>>>> abilities emerged purely from being trained on the single task of
>>>> predicting text. Given this, can we be certain that "understanding" is not
>>>> another one of the emergent capabilities manifested by the LLM?
>>>>
>>>
>>> This gets into a philosophical debate about what, exactly, emergent
>>> properties are. As I understand the term, whatever it is that emerges is
>>> somehow hidden but intrinsic prior to the emergence. For example, from the
>>> rules of chess there emerge many abstract properties and strategies of
>>> chess. To someone naive about chess, it is difficult to imagine from the
>>> simple rules of chess how chess looks to a grandmaster, but those emergent
>>> properties are inherent in and follow logically from the simple rules of
>>> chess.
>>>
>>> So how does meaning emerge from mere symbols (words)? Sequences of
>>> abstract characters in no possible way contain the seeds of their meanings,
>>> as we can see by the fact that many different words exist in different
>>> languages and in entirely different alphabets for the same meaning.
>>>
>>>
>>>> *2. She uses the analogy that the LLM looking at characters would be
>>>> the same as a human who doesn't understand Cherokee looking at Cherokee
>>>> characters.*
>>>> My Reply: This is reminiscent of Searle's Chinese Room. The error is
>>>> looking at the behavior of the computer only at the lowest level, while
>>>> ignoring the goings-on at the higher levels. She sweeps all possible
>>>> behavior of a computer under the umbrella of "symbol manipulation", but
>>>> anything computable can be framed under "symbol manipulation" if described
>>>> on that level (including what atoms, or neurons in the human brain do).
>>>> This therefore fails as an argument that no understanding exists in the
>>>> higher-level description of the processing performed by the computer
>>>> program.
>>>>
>>>
>>> Yes, her argument is similar to Searle's. See above. Sequences of
>>> characters (words) in no possible way contain the hidden seeds of their
>>> meanings, as we can see from the fact that many different words exist
>>> in different languages and alphabets for the same meaning.
>>>
>>> *3. She was asked what a machine would have to do to convince her they
>>>> have understanding. Her example was that if Siri or Alexa were asked to do
>>>> something in the real world, like turn on the lights, and if it does that,
>>>> then it has understanding (by virtue of having done something in the real
>>>> world).*
>>>> My Reply: Perhaps she does not see the analogy between turning on or
>>>> off a light, and the ability of an LLM to output characters to a monitor as
>>>> interacting in the real world (turning on and off many thousands of pixels
>>>> on the user's monitor as they read the reply).
>>>>
>>>
>>> I thought that was the most interesting part of her interview. She was
>>> using the word "understanding" in a more generous way than I would prefer
>>> to use it, even attributing "understanding" to a stupid app like Alexa, but
>>> she does not think GPT has understanding. I think she means it in exactly
>>> the way I do, which is why I put it in scare-quotes. As she put it, it is a
>>> "kind of" understanding. As I wrote to you I think yesterday, I will grant
>>> that my pocket calculator "understands" how to do math, but it is
>>> not holding the meaning of those calculations in mind consciously, which is
>>> what I (and most everyone on earth) mean by understanding.
>>>
>>> Understanding involves the capacity to consciously hold something in
>>> mind. Otherwise, pretty much everything understands something and the word
>>> loses meaning. Does the automated windshield wiper mechanism in my car
>>> understand how to clear the rain off my windows when it starts raining? No,
>>> but I will grant that it "understands" it in scare-quotes.
>>>
>>> The other point I would make here is that even if we grant that turning
>>> the pixels on your screen off and on makes GPT sentient or conscious, the real
>>> question is "how can it know the meanings of those pixel arrangements?"
>>> From its point of view (so to speak) it is merely generating meaningless
>>> strings of text for which it has never been taught the meanings except via
>>> other meaningless strings of text.
>>>
>>> Bender made the point that language models have no grounding, which is
>>> something I almost mentioned yesterday in another thread. The symbol
>>> grounding problem in philosophy is about exactly this question. They are
>>> not grounded in the world of conscious experience like you and me. Or, if
>>> we think so, then that is to me something like a religious belief.
>>>
>>>
>>>
>>>> *4. She admits her octopus test is exactly like the Turing test. She
>>>> claims the hyper-intelligent octopus would be able to send some
>>>> pleasantries and temporarily fool the other person, but that it has no real
>>>> understanding and this would be revealed if there were any attempt to
>>>> communicate about any real ideas.*
>>>> My Reply: I think she must be totally unaware of the capabilities of
>>>> recent models like GPT-4 to come to a conclusion like this.
>>>>
>>>
>>> Again, no grounding.
>>>
>>>
>>>> *5. The interviewer pushes back and says he has learned a lot about
>>>> math, despite not seeing or experiencing mathematical objects. And has
>>>> graded a blind student's paper which appeared to show he was able to
>>>> visualize objects in math, despite not being sighted. She says the octopus
>>>> never learned language; we acquired a linguistic system, but the
>>>> hyper-intelligent octopus has not, and that all the octopus has learned is
>>>> language distribution patterns.*
>>>> My Reply: I think the crucial piece missing from her understanding of
>>>> LLMs is that the only way for them to achieve the levels of accuracy in the
>>>> text that they predict is by constructing internal mental models of
>>>> reality. That is the only way they can answer hypotheticals concerning
>>>> novel situations described to them, or for example, to play chess. The only
>>>> way to play chess with an LLM is if it is internally constructing a model of
>>>> the board and pieces. It cannot be explained in terms of mere patterns or
>>>> distributions of language. Otherwise, the LLM would be as likely to guess
>>>> any legal move as an optimal one, and since one can readily set up a chess
>>>> board position that has never before appeared in the history of the
>>>> universe, we can know the LLM is not relying on memory.
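>>>>
>>>> As a quick aside (my own illustration, not anything from the interview), one
>>>> way to see how easily you can reach a never-before-seen position is to play a
>>>> few dozen random legal moves, assuming the third-party python-chess package:
>>>>
>>>> import random
>>>> import chess  # third-party python-chess library (pip install chess)
>>>>
>>>> board = chess.Board()
>>>> for _ in range(30):  # thirty random plies is more than enough to leave "theory"
>>>>     moves = list(board.legal_moves)
>>>>     if not moves:  # checkmate or stalemate reached early
>>>>         break
>>>>     board.push(random.choice(moves))
>>>> print(board.fen())  # almost certainly a position no one has ever recorded
>>>>
>>>> A model that reliably proposes legal, let alone sensible, moves from such a
>>>> position cannot be looking the answer up; it has to be tracking the board
>>>> state in some internal form.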
>>>>
>>>
>>> I don't dispute that LLMs construct internal models of reality, but I
>>> cough when you include the word "mental," as if they have minds
>>> with conscious awareness of their internal models.
>>>
>>> I agree that it is absolutely amazing what these LLMs can do and will
>>> do. The question is, how could they possibly know it any more than my
>>> pocket calculator knows the rules of mathematics or my watch knows the time?
>>>
>>>
>>>
>>>>
>>>> *6. The Interviewer asks what prevents the octopus from learning
>>>> language over time as a human would? She says it requires joint-attention:
>>>> seeing some object paired with some word at the same time.*
>>>> My Reply: Why can't joint attention manifest as the co-occurrence of
>>>> words as they appear within a sentence, paragraph, or topic of discussion?
>>>>
>>>
>>> Because those other words also have no meanings or referents. There is no
>>> grounding and there is no Rosetta Stone.
>>>
>>> Bender co-authored another paper about "stochastic parrots," which is
>>> how she characterizes LLMs and which I like. These models are like parrots
>>> that mimic human language and understanding. It is amazing how talented
>>> they appear, but they are only parrots who have no idea what they are
>>> saying.
>>>
>>>
>>>>
>>>> *7. The interviewer asks do you think there is some algorithm that
>>>> could possibly exist that could take a stream of words and understand them
>>>> in that sense? She answers yes, but that would require programming in from
>>>> the start the structure and meanings of the words and mapping them to a
>>>> model of the world, or providing the model other sensors or imagery. The
>>>> interviewer confirms: "You are arguing that just consuming language without
>>>> all this extra stuff, that no algorithm could just from that, really
>>>> understand language?" She says that's right.*
>>>> My Reply: We already know that these models build maps of things
>>>> corresponding to reality in their head. See, for example, the paper I
>>>> shared where the AI was given a description of how rooms were connected to
>>>> each other, then the AI was able to visually draw the layout of the room
>>>> from this textual description. If that is not an example of understanding,
>>>> I don't know what possibly could be. Note also: this was an early model of
>>>> GPT-4 before it had been trained on images; it was trained purely on text.
>>>>
>>>
>>> This goes back to the question about Alexa. Yes, if that is what you mean
>>> by "understanding" then I am forced to agree that even Alexa and Siri
>>> "understand" language. But, again, I must put it in scare quotes. There is
>>> nobody out there named Alexa who is actually aware of understanding
>>> anything. She exists only in a manner of speaking.
>>>
>>>
>>>>
>>>> *8. She says, imagine that you are dropped into the middle of the Thai
>>>> library of congress and you have any book you could possibly want but only
>>>> in Thai. Could you learn Thai? The Interviewer says: I think so. She asks:
>>>> What would you first do, where would you start? She adds if you just have
>>>> form, that's not going to give you information. She then says she would
>>>> have to find an encyclopedia or a translation of a book we know.*
>>>> My Reply: We know there is information (objectively) in the Thai
>>>> library, even if there were no illustrations or copies of books for which we
>>>> had translations. We know the Thai library contains scrutable information
>>>> because the text is compressible. If text is compressible it means there
>>>> are discoverable patterns in the text which can be exploited to reduce the
>>>> number of bits needed to represent it. All our understanding can be viewed
>>>> as forms of compression. For example, the physical laws that we have
>>>> discovered "compress" the amount of information we need to store about the
>>>> universe. Moreover, when compression works by constructing an internal toy
>>>> model of reality, we can play with and permute the inputs to the model to
>>>> see how it behaves under different situations. This provides a genuine
>>>> understanding of the outer world from which our sensory inputs derive. I
>>>> believe the LLM has successfully done this to predict text: it has various
>>>> internal, situational models it can deploy to help it in predicting text.
>>>> Having these models and knowing when and how to use them, I argue, is
>>>> tantamount to understanding.
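>>>>
>>>> A minimal sketch of the compressibility point (my own illustration, with a
>>>> made-up sample string), using Python's standard zlib: structured text
>>>> compresses far better than random bytes of the same length, which is
>>>> objective evidence of discoverable patterns even in a language we cannot
>>>> read.
>>>>
>>>> import os
>>>> import zlib
>>>>
>>>> # Repetitive, structured "text" stands in for pages from the Thai library.
>>>> sample = ("Words recur, grammar constrains what can follow what, "
>>>>           "and topics persist across sentences. ")
>>>> text = (sample * 50).encode("utf-8")
>>>> noise = os.urandom(len(text))  # patternless bytes of the same length
>>>>
>>>> print(len(text), len(zlib.compress(text)))    # shrinks to a small fraction
>>>> print(len(noise), len(zlib.compress(noise)))  # stays roughly the same size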
>>>>
>>>
>>> How could you possibly know what those "discoverable patterns of text"
>>> mean, given that they are in Thai and there is no Thai to English
>>> dictionary in the Thai library?
>>>
>>> As she points out and I mentioned above, there is no Rosetta Stone.
>>>
>>> Thanks for the thoughtful email.
>>>
>>> -gts
>>>
>>>
>>>