[ExI] Ethical AI?

efc at swisscows.email efc at swisscows.email
Fri Apr 21 11:05:38 UTC 2023


Hello everyone,

I saw this paper on hackernews this morning:

https://arxiv.org/pdf/2302.07459.pdf

With the title: "The Capacity for Moral Self-Correction in Large
Language Models".

On page 11, I find this:

"Along these lines, a recent technique called Constitutional AI, trains 
language models to adhere to a human-written set of ethical principles 
(a constitution) by first having models determine whether their outputs 
violate these principles, then training models to avoid such violations 
[4].  Constitutional AI and our work observe the same phenomenon: 
sufficiently large language models, with a modest amount of RLHF 
training to be helpful, can learn how to abide by high-level ethical 
principles expressed in natural language."
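Stripped of the model itself, the loop the quoted passage describes looks
roughly like the sketch below. This is a toy illustration only: every
function here is a hypothetical stand-in (stubbed with trivial string
logic) for what would really be an LLM call, and the principle shown is
an invented example, not one from the paper's constitution.

```python
# Toy sketch of the Constitutional AI critique/revise loop.
# All three "model" functions are stand-ins for LLM calls.

CONSTITUTION = [
    "Do not produce insults.",  # hypothetical example principle
]

def generate(prompt):
    # Stand-in for the base model's first draft.
    return f"Draft answer to: {prompt}"

def critique(response, principle):
    # Stand-in for the model judging whether its own output
    # violates the given principle (True = violation).
    return "insult" in response.lower()

def revise(response, principle):
    # Stand-in for the model rewriting its output so that it
    # no longer violates the principle.
    return response.replace("insult", "[removed]")

def constitutional_pass(prompt):
    # Generate, then check each principle and revise on violation.
    response = generate(prompt)
    for principle in CONSTITUTION:
        if critique(response, principle):
            response = revise(response, principle)
    return response

print(constitutional_pass("write an insult"))
```

In the actual method, pairs of (original, revised) outputs produced this
way become training data, so the final model learns to avoid the
violations up front rather than patching them at inference time.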

What I find interesting here is that, to me, this is about programming
the system to simply follow rules defined by a human being. I do not
see it as having anything to do with morals. The rules can be rewritten
by a human being and, given a sufficiently powerful system, by the
system itself, since we ourselves do not even know the full workings of
what goes on inside LLMs.

The second thing I find interesting is the choice of morals. I see
graphs about discrimination, gender identity, etc., which means that in
my opinion the morals programmed into the system are more left-oriented
than right-oriented.

What I would really like to study is what kind of ethics the machine
would naturally come up with, instead of having rules decided upon and
programmed into it by humans who obviously have their own ideas.

Food for thought.

Best regards, 
Daniel
