<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

span.EmailStyle19

        {mso-style-type:personal-reply;

        font-family:"Calibri",sans-serif;

        color:windowtext;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b>…</b>> <b>On Behalf Of </b>Gadersd via extropy-chat<br><b>Sent:</b> Monday, 6 March, 2023 8:25 AM<br><br><o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Toy models can be, and have been, trained in parallel across consumer computers, but I think you would be disappointed in their intelligence compared to ChatGPT.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>For example, I tried a 6-billion-parameter model, GPT-JT, accessible at <a href="https://huggingface.co/spaces/togethercomputer/GPT-JT">https://huggingface.co/spaces/togethercomputer/GPT-JT</a>.<o:p></o:p></p></div><div><p class=MsoNormal><b>Prompt:</b> "<b>solve 2x+3=-1 step by step. 2x=</b>"<o:p></o:p></p></div><div><p class=MsoNormal><b>Answer:</b> "<i>1, so x=1/2.<br><br>A:<br><br>The answer is $1</i>"<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>This model was trained in parallel as you have suggested. Not very useful, is it?<o:p></o:p></p><div><p class=MsoNormal><br><br><o:p></o:p></p></div><p class=MsoNormal>…<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>In your example, I am getting x = -2, since 2x = -4.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>But no matter: we know how to do algebra with software, and it is good at it.  <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Regarding the value of a toy ChatGPT, it depends on how you look at it.  If I ask ChatGPT to write a two-page essay on civil rights in the 20<sup>th</sup> century, it will do so in a few seconds.  
So imagine I had a microChatGPT and asked it to write a two-page essay on civil rights by tomorrow morning.  It would be analogous to Deep Blue taking 18 hours to do 3 minutes of calculation, ja?  <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The real question is how do we scale ChatGPT down six orders of magnitude and make it a commercial product?  It isn’t yet what we need as long as a company or organization controls and trains it.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>spike<o:p></o:p></p></div></div></body></html>