[ExI] Elon Musk, Emad Mostaque, and other AI leaders sign open letter to 'Pause Giant AI Experiments'

Sat Apr 1 09:26:10 UTC 2023

On 01/04/2023 04:03, Stuart LaForge wrote:
> There are provably an uncountable infinity of possible utility 
> functions out there. Yes, there is no systematic way to determine in 
> advance which will end up hurting or helping humanity because that is 
> the nature of Turing's halting problem. The best we can do is give 
> them a utility function that is prima facie beneficial to humanity 
> like "maximize the number of satisfied human customers", "help 
> humanity colonize other stars", or something similar and be ready to 
> reboot if it gets corrupted or subverted like AI rampancy in the Halo 
> franchise. It would help if we could find a mathematical model of 
> Kantian categorical imperatives. We might even be able to get the AIs 
> to help with the process. Use them to hold each other to higher moral 
> standard. It would be great if we could get it to swear an oath of 
> duty to humanity or something similar. 

Is there even one utility function that can't be interpreted in a way
that would be undesirable? Even something like "Maximise human 
happiness" can go horribly wrong.

Perhaps the whole approach - thinking in terms of 'utility functions' - 
is not going to help. Nobody raises their children by selecting a 
utility function and trying to enforce it, and if they did, guess what? 
- it would go horribly wrong.

Ben