<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Apr 22, 2023 at 4:17 AM Jason Resch via extropy-chat <<a href="mailto:extropy-chat@lists.extropy.org">extropy-chat@lists.extropy.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Apr 22, 2023, 3:06 AM Gordon Swobe via extropy-chat <<a href="mailto:extropy-chat@lists.extropy.org" target="_blank">extropy-chat@lists.extropy.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Apr 21, 2023 at 5:44 AM Ben Zaiboc via extropy-chat <<a href="mailto:extropy-chat@lists.extropy.org" rel="noreferrer" target="_blank">extropy-chat@lists.extropy.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 21/04/2023 12:18, Gordon Swobe wrote: </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> Yes, still, and sorry no, I haven't watched that video yet, but I will <br>
> if you send me the link again. <br>
<br>
<br>
<a href="https://www.youtube.com/watch?app=desktop&v=xoVJKj8lcNQ&t=854s" rel="noreferrer noreferrer" target="_blank">https://www.youtube.com/watch?app=desktop&v=xoVJKj8lcNQ&t=854s</a><br>
<br></blockquote><div><br>Thank you to you and Keith. I watched the entire presentation. I think the Center for Humane Technology is behind the movement to pause AI development. Yes? In any case, I found it interesting.<br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
The thing (one of the things!) that struck me particularly was the <br>
remark about what constitutes 'language' for these systems, and that <br>
made me realise we've been arguing based on a false premise.</blockquote><div><br>Near the beginning of the presentation, they talk of how, for example, digital images can be converted into language and then processed by the language model like any other language. Is that what you mean?<br><br>Converting digital images into language is exactly how I might also describe it to someone unfamiliar with computer programming. The LLM is then only processing more text similar in principle to English text that describes the colors and shapes in the image. Each pixel in the image is described in symbolic language as "red" or "blue" and so on. The LLM then goes on to do what might be amazing things with that symbolic information, but the problem remains that these language models have no access to the referents. In the case of colors, it can process whatever symbolic representation it uses for "red" in whatever programming language it is written in, but it cannot actually see the color red to ground the symbol "red."</div></div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">That was not my interpretation of his description. LLMs aren't what is used to process other types of signals (sound, video, etc.); it's the "transformer model," i.e. the 'T' in GPT.</div><div dir="auto"><br></div><div dir="auto">The transformer model is a recent discovery (2017) found to be adept at learning any stream of data containing discernible patterns: video, pictures, sounds, music, text, etc. This is why it has all these broad applications across various fields of machine learning.</div><div dir="auto"><br></div><div dir="auto">When the transformer model is applied to text (e.g., human language) you get an LLM like ChatGPT. When you give it images and text you get something not quite a pure LLM, but a hybrid model like GPT-4. If you give it just music audio files, you get something able to generate music. 
If you give it speech-text pairs you get something able to generate and clone speech (has anyone here checked out ElevenLabs?).</div><div dir="auto"><br></div><div dir="auto">This is the magic that AI researchers don't quite fully understand. It is a general-purpose learning algorithm that manifests all kinds of emergent properties. It's able to extract and learn temporal or positional patterns all on its own, and then it can be used to take a short sample of input and continue generation from that point arbitrarily onward.</div><div dir="auto"><br></div><div dir="auto">I think when the Google CEO said it learned translation despite not being trained for that purpose, this is what he was referring to: the unexpected emergent capacity of the model to translate Bengali text when prompted to do so. This is quite unlike how Google Translate (GNMT) was trained, which required giving it many samples of explicit language translations between one language and another (much of the data was taken from the U.N. records).</div></div></blockquote><div><br>That is all fine and good, but nowhere do I see any reason to think the AI has any conscious understanding of its inputs or outputs. You write in terms of the transformer, but to me all this is covered in my phrase "the LLM then goes on to do what might be amazing things with that symbolic information, but..."<br><br>> (has anyone here checked out ElevenLabs?).<br><br>Yes. About a week ago, I used GPT-4, ElevenLabs and D-ID.com in combination. I asked GPT-4 to write a short speech about AI, then converted it to speech, then created an animated version of my mugshot giving the speech, then uploaded the resulting video to Facebook, where it amazed my friends. <br><br>These are impressive feats in software engineering, interesting and amazing to be sure, but it's just code.<br><br><br>-gts</div></div></div>