[ExI] ChatGPT 'Not Interesting' for creative works

Gadersd gadersd at gmail.com
Wed Mar 8 16:50:36 UTC 2023


You can try GPT-JT (not related to ChatGPT) https://huggingface.co/spaces/togethercomputer/GPT-JT. Try it yourself and you will see that it is completely useless compared to ChatGPT. Note that this is a 6 billion parameter model trained using parallel computing as you have suggested. Even this small model is beyond the capabilities of most consumer hardware: one needs a pricey GPU to run it. Running something as large as 175 billion parameters (roughly the scale behind ChatGPT) is impossible on consumer hardware.
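To put rough numbers on that, here is a back-of-the-envelope sketch. I am assuming 2 bytes per parameter (16-bit weights) and ignoring activations and other overhead, so treat the figures as ballpark only:

# Rough memory needed just to hold the model weights.
# Assumes 2 bytes per parameter (16-bit); 32-bit weights would double these numbers.
BYTES_PER_PARAM = 2

def weight_memory_gb(num_params):
    return num_params * BYTES_PER_PARAM / 1e9

print(weight_memory_gb(6e9))    # GPT-JT scale: ~12 GB, already beyond most consumer GPUs
print(weight_memory_gb(175e9))  # GPT-3 scale: ~350 GB, needs a rack of datacenter GPUs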

> Sure, but my contention is that the distributed model would still multiply the same size matrix.  If we need to multiply a 50x50, that task can be effectively distributed into background computing, but it would take reliable bandwidth and probably a lot of redundancy.

The issue with this is that the transformer model uses attention operations whose cost grows quadratically with the input length, in addition to linear operations. The quadratic operations cannot be easily split across devices: they must be done on a single device (GPU) with enough memory to hold giant matrices. This is why one needs a legion of $10,000 GPUs with massive memory to run ChatGPT-level models.
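Here is a minimal sketch of what I mean (single-head self-attention in NumPy; toy sizes and names of my own choosing, not any real implementation). The score matrix has one entry per pair of input positions, so it grows quadratically with the input length and has to live in one device's memory:

import numpy as np

def self_attention(Q, K, V):
    # Q, K, V: (n, d) arrays of queries, keys and values for n input positions.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n): quadratic in input length
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ V                               # every output mixes every input

n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = self_attention(Q, K, V)   # doubling n quadruples the (n, n) score matrix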

It turns out that these quadratic operations are what enabled the intelligence of these models to far surpass previous techniques. They require the entire input to be gathered in one place, because this single step integrates every piece of information with every other. This reflects the nature of our intelligence: it is more than a (linear) sum of parts.
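Written out in my own notation, just to illustrate the point: each attention output is a weighted sum over all input positions,

out_i = \sum_j \mathrm{softmax}_j(q_i \cdot k_j / \sqrt{d}) \, v_j

so no output can be computed without access to every input, which is why the step resists being carved into independent pieces.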

> On Mar 7, 2023, at 10:38 AM, spike jones via extropy-chat <extropy-chat at lists.extropy.org> wrote:
> 
>  
>  
> From: extropy-chat <extropy-chat-bounces at lists.extropy.org> On Behalf Of Gadersd via extropy-chat
> 
> Subject: Re: [ExI] ChatGPT 'Not Interesting' for creative works
>  
>> The year-old prediction is useless of course, but the idea is to compensate for the limited calculation ability and bandwidth by giving it more time.
>  
> >…The analogy does not extend to language models. You cannot compensate for a small model with more computing time. These models have a fixed computing burden that is determined by the model size…
>  
> OK but the explanation you gave doesn’t support that contention.  Read on please:
>  
> >…I think you have the wrong intuition … These models are essentially matrix multiplication. Small matrices multiply faster than large matrices…
>  
> Sure, but my contention is that the distributed model would still multiply the same size matrix.  If we need to multiply a 50x50, that task can be effectively distributed into background computing, but it would take reliable bandwidth and probably a lot of redundancy.  
>  
> Consider the task of finding the determinant of a 50x50.  That can be distributed among 50 computers each finding the determinant of a 49x49, each of which can be distributed into 49 processors and so on.  Matrix multiplies and inversions can likewise be distributed, but of course it would be a brittle process: any one processor could mess it up.
>  
> OK idea: get a bunch of investors together who can kick in a few tens of thousands, rent some unused office or warehouse space somewhere, set up a closed system server farm training toward a particular bias agreed upon by the investors.  You would form a scaled down (but still big) GPT which is intentionally trained in material friendly to libertarianism for instance, or believing that causing the extinction of mosquitoes is good but in general causing extinction is bad.
>  
> Contention: whatever the ChatGPT investors did, a smaller group with less money can do likewise.
>  
> Given that, one could create a commercial chatbot specialized in training students for instance, or spreading religion, or selling products.  Oh I see mega profits trying to be made here.
>  
> spike
>  
>  
