<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 2016-05-26 17:18, BillK wrote:<br>

    <blockquote

cite="mid:CAL_armg5N_ehYijL6ZTYTVZfTUZv=OCPwuPrLpvTgr6Vtfa=xg@mail.gmail.com"

      type="cite">

      <pre wrap=""><a class="moz-txt-link-rfc2396E" href="http://www.smbc-comics.com/index.php?id=4122">&lt;http://www.smbc-comics.com/index.php?id=4122&gt;</a>

Serious point though.

If we teach AI about ethical behaviour (for our own safety) what do we

expect the AI to do when it sees humans behaving unethically (to a

greater or lesser extent)?

Can a totally ethical AI even operate successfully among humans?

</pre>

    </blockquote>

    <br>

    What is "totally ethical"? <br>

    <br>

    [Philosopher hat on!]<br>

    <br>

    Normally when we say something like that, we mean somebody who

    follows the One True moral system perfectly. Or at least one moral

    system perfectly. There are no humans who do that, so we do not have

    reliable intuitions about what it would mean. Now, a caricature

    view of moral perfection is somebody being a saintly wuss: super

    kind, but exploitable by imperfect and nasty actors. <br>

    <br>

    But there is no reason to think this is the only choice. You could

    imagine a morally perfect Objectivist, following rules of

    enlightened selfishness. Or a perfect average utilitarian maximizing

    the average happiness of all entities in our future lightcone.

    Neither would be a pushover ("If I give you my wallet there will be

    fewer resources for my von Neumann probe program. So, no, I will not

    give it to you. In fact, I will now force you to give me your money

    - I see that this will enable a further quintillion minds. Thank

    you."). Convergent instrumental goal behavior tends to turn

    wussy nice agents non-wussy.<br>

    <br>

    There is an interesting issue about what to do with imperfect moral

    agents if you are a perfect one. A Kantian agent would presumably

    respect their autonomy and try to guide them to see how to obey the

    categorical imperative. A consequentialist agent would try to

    manipulate them to behave better, but the means might be anything

    from incentives to persuasion to brainwashing. A virtue agent might

    not care at all, just demonstrating its own excellence. A paperclip

    maximizing agent would find non-paperclip maximizers a waste of

    resources and work to remove them.<br>

    <br>

    In fact, most pure moral systems are very bad at "live and let

    live". We humans tend to de facto behave like that because our power

    is about equal; entities that are orders of magnitude more powerful

    may not behave like that unless we get the value code just right. <br>

    <br>

    <pre class="moz-signature" cols="72">-- 

Dr Anders Sandberg

Future of Humanity Institute

Oxford Martin School

Oxford University</pre>

  </body>

</html>