[ExI] all we are is just llms was

Mon Apr 24 04:15:51 UTC 2023

On Sat, Apr 22, 2023 at 4:17 AM Jason Resch via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

>
>
> On Sat, Apr 22, 2023, 3:06 AM Gordon Swobe via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
>> On Fri, Apr 21, 2023 at 5:44 AM Ben Zaiboc via extropy-chat <
>> extropy-chat at lists.extropy.org> wrote:
>>
>>> On 21/04/2023 12:18, Gordon Swobe wrote:
>>
>>> > Yes, still, and sorry no, I haven't watched that video yet, but I will
>>> > if you send me the link again.
>>>
>>>
>>> https://www.youtube.com/watch?app=desktop&v=xoVJKj8lcNQ&t=854s
>>>
>>>
>> Thank you to you and Keith. I watched the entire presentation. I think
>> the Center for Humane Technology is behind the movement to pause AI
>> development. Yes? In any case, I found it interesting.
>>
>>> The thing (one of the things!) that struck me particularly was the
>>> remark about what constitutes 'language' for these systems, and that
>>> made me realise we've been arguing based on a false premise.
>>
>>
>> Near the beginning of the presentation, they talk of how, for example,
>> digital images can be converted into language and then processed by the
>> language model like any other language. Is that what you mean?
>>
>> Converting digital images into language is exactly how I might also
>> describe it to someone unfamiliar with computer programming. The LLM is
>> then only processing more text similar in principle to English text that
>> describes the colors and shapes in the image. Each pixel in the image is
>> described in symbolic language as "red" or "blue" and so on. The LLM then
>> goes on to do what might be amazing things with that symbolic information,
>> but the problem remains that these language models have no access to the
>> referents. In the case of colors, it can process whatever symbolic
>> representation it uses for "red," in whatever programming language it is
>> written in, but it cannot actually see the color red to ground the symbol
>> "red."
>>
>
> That was not my interpretation of his description. LLMs aren't used to
> process other types of signals (sound, video, etc.); it's the "transformer
> model," i.e., the 'T' in GPT.
>
> The transformer model is a recent discovery (2017) found to be adept at
> learning any stream of data containing discernable patterns: video,
> pictures, sounds, music, text, etc. This is why it has all these broad
> applications across various fields of machine learning.
>
> When the transformer model is applied to text (e.g., human language) you
> get an LLM like ChatGPT. When you give it images and text you get something
> not quite a pure LLM, but a hybrid model like GPT-4. If you give it just
> music audio files, you get something able to generate music. If you give it
> speech-text pairs you get something able to generate and clone speech (has
> anyone here checked out ElevenLabs?).
>
> This is the magic that AI researchers don't quite fully understand. It is
> a general purpose learning algorithm that manifests all kinds of emergent
> properties. It's able to extract and learn temporal or positional patterns
> all on its own, and then it can be used to take a short sample of input,
> and continue generation from that point arbitrarily onward.
>
> I think when the Google CEO said it learned translation despite not being
> trained for that purpose, this is what he was referring to: the unexpected
> emergent capacity of the model to translate Bengali text when prompted to
> do so. This is quite unlike how Google Translate (GNMT) was trained, which
> required giving it many samples of explicit language translations between
> one language and another (much of the data was taken from the U.N. records).
>

That is all fine and good, but nowhere do I see any reason to think the AI
has any conscious understanding of its inputs or outputs. You write in
terms of the transformer, but to me all this is covered in my phrase "the
LLM then goes on to do what might be amazing things with that symbolic
information, but..."

>  (has anyone here checked out ElevenLabs?).

Yes. About a week ago, I used GPT-4, ElevenLabs and D-ID.com in
combination. I asked GPT-4 to write a short speech about AI, then converted
it to speech, then created an animated version of my mugshot giving the
speech, then uploaded the resulting video to Facebook where it amazed my
friends.

These are impressive feats in software engineering, interesting and amazing
to be sure, but it's just code.

-gts