[ExI] ChatGPT 'Not Interesting' for creative works

Gadersd gadersd at gmail.com
Tue Mar 7 14:08:11 UTC 2023


> The year-old prediction is useless of course, but the idea is to compensate for the limited calculation ability and bandwidth by giving it more time.

The analogy does not extend to language models. You cannot compensate for a small model with more computing time. These models have a fixed computing cost per token, and that cost is set by model size: a forward pass takes roughly two floating-point operations per parameter per token, no matter how long you let it run.

I think you have the wrong intuition about these models. It seems like you are thinking of them as chess engines that improve their moves given more time. These models are essentially a fixed sequence of matrix multiplications. Small matrices multiply faster than large matrices. If you multiply a small matrix slowly you still get the same subpar answer in the end. The only way to get a better answer is to use a larger matrix, which necessarily takes a larger, but still fixed, computing window.
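To make the contrast concrete, here is a minimal Python sketch (toy code, not any real model’s API) comparing an anytime algorithm, which gets better the longer it runs, with a fixed-size matrix product, which returns the same answer no matter how slowly it is computed:

    import time
    import numpy as np

    def iterative_search(evaluate, candidates, time_budget_s):
        """Chess-style anytime search: the answer improves with more time."""
        best, best_score = None, float("-inf")
        deadline = time.monotonic() + time_budget_s
        for move in candidates:          # examines more candidates as time allows
            if time.monotonic() > deadline:
                break
            score = evaluate(move)
            if score > best_score:
                best, best_score = move, score
        return best                      # more time -> better answer

    def forward(weights, x):
        """One model layer: the cost is fixed by the size of `weights`."""
        return np.tanh(weights @ x)

    rng = np.random.default_rng(0)
    small_w = rng.normal(size=(64, 64))  # a "small model"
    x = rng.normal(size=64)

    y_now = forward(small_w, x)
    time.sleep(1.0)                      # waiting buys nothing
    y_later = forward(small_w, x)
    assert np.allclose(y_now, y_later)   # same subpar answer either way

The search loop genuinely benefits from a bigger time budget; the forward pass is determined entirely by the weights, so extra wall-clock time changes nothing about the output.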

Consumer GPUs cannot run ChatGPT-level models because the matrices simply won’t fit in the memory of a consumer GPU. The matrices can fit on a hard drive, but I don’t think you would be willing to wait a month per word.
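A quick back-of-the-envelope shows the memory wall (the 175B figure is GPT-3’s published size; ChatGPT’s actual parameter count is not public, so treat it as an assumption):

    params = 175e9              # GPT-3-scale parameter count (assumption)
    bytes_per_param = 2         # 16-bit (fp16/bf16) weights
    weights_gb = params * bytes_per_param / 1e9
    print(f"weights alone: {weights_gb:.0f} GB")                 # ~350 GB
    consumer_vram_gb = 24       # e.g. an RTX 4090
    print(f"GPUs needed: {weights_gb / consumer_vram_gb:.1f}")   # ~14.6

And that is the weights alone, before activations or the attention cache: roughly fifteen top-end consumer cards just to hold the model.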

The small models that we can run give junk output and are mostly useless.

> On Mar 7, 2023, at 12:53 AM, spike jones via extropy-chat <extropy-chat at lists.extropy.org> wrote:
> 
>  
>  
> …> On Behalf Of Mike Dougherty via extropy-chat
> Subject: Re: [ExI] ChatGPT 'Not Interesting' for creative works
>  
> On Mon, Mar 6, 2023, 8:07 PM Gadersd via extropy-chat <extropy-chat at lists.extropy.org> wrote:
>>  
>> No, the small models generate output faster than the big models. The small models are not slower versions of the big models; they have completely different capabilities. You will never be able to get ChatGPT-level output out of a much smaller model. It would be like trying to run modern engineering software on an Atari console: it wouldn’t be slower, it just wouldn’t run at all.
> 
>  
> >…Or weather prediction using only one weather station? Or a single environmental reading (such as temperature or barometric pressure)? Mike
>  
>  
> I think of it more as a weather prediction using all the stations and readings, but where the model takes a year to calculate a prediction for tomorrow. The year-old prediction is useless of course, but the idea is to compensate for the limited calculation ability and bandwidth by giving it more time.
>  
> One way or another, we need to be able to personalize GPT.  Otherwise we can’t really use it to replace most of the staff of our company.  We are stuck with carbon units using ChatGPT to do their jobs, which means a dozen investors owning and controlling whatever our employees are doing with their product.  
>  
> spike