<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 2016-02-25 21:33, John Clark wrote:<br>
    <blockquote
cite="mid:CAJPayv0fa2Shbx9EKkgAo+BzWx0PnH2RTLP86p+FbV4LKvJ=rA@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div class="gmail_default"
          style="font-family:arial,helvetica,sans-serif"><span
            style="font-family:arial,sans-serif">On Thu, Feb 25, 2016 at
            3:25 PM, Anders Sandberg </span><span dir="ltr"
            style="font-family:arial,sans-serif"><<a
              moz-do-not-send="true" href="mailto:anders@aleph.se"
              target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:anders@aleph.se">anders@aleph.se</a></a>></span><span
            style="font-family:arial,sans-serif"> wrote:</span><br>
        </div>
        <div class="gmail_extra">
          <div class="gmail_quote"><br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span class="">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <blockquote class="gmail_quote" style="margin:0px
                        0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                        <div class="gmail_default"
                          style="font-family:arial,helvetica,sans-serif;display:inline">
                          >></div>
                        <font size="4">There are indeed vested
                          interests​
                          <div class="gmail_default"
                            style="font-family:arial,helvetica,sans-serif;display:inline">​
                            ​</div>
                        </font>
                        <div
style="font-size:large;font-family:arial,helvetica,sans-serif;display:inline">but
                          it wouldn't matter even if there weren't, </div>
                        <span style="font-size:large"> </span>
                        <div
style="font-size:large;font-family:arial,helvetica,sans-serif;display:inline">there

                          is no way the friendly AI (aka slave AI) idea
                          could work under any circumstances. You just
                          can't keep outsmarting​ something far smarter
                          than you are indefinitely</div>
                      </blockquote>
                    </div>
                  </div>
                  <br>
                </span>
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  >​</div>
                Actually, yes, you can. But you need to construct
                utility functions with invariant subspaces</div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif;display:inline">​<font
                  size="4">It's the invariant part that will cause
                  problems, any mind with a fixed goal that can never
                  change no matter what is going to end up in a infinite
                  loop, that's why Evolution never gave humans a fixed
                  meta goal, not even the goal of self preservation.</font></div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Sorry, but this seems entirely wrong. A utility-maximizer in a
    complex environment will not necessarily loop (just consider various
    reinforcement learning agents). <br>
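    <br>
    To make that concrete, here is a toy sketch (the five-state chain
    environment and all the parameters are invented purely for
    illustration): a tabular Q-learning agent maximizing expected
    reward. Its behaviour is a state- and history-dependent policy that
    keeps being updated as it learns, not a fixed sequence of steps
    that could wedge into a loop.<br>
    <pre>
# Toy sketch (illustrative only): tabular Q-learning on a 5-state chain.
# Reward sits in the last state; epsilon-greedy exploration plus ongoing
# value updates mean the agent's actions depend on what it has learned,
# not on a hard-wired cycle.
import random

N_STATES, ACTIONS = 5, (0, 1)              # action 0 = step left, 1 = step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # value estimates per state/action
alpha, gamma, eps = 0.1, 0.9, 0.1          # learning rate, discount, exploration

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

state = 0
for t in range(10000):
    if random.random() &lt; eps:            # occasional random exploration
        action = random.choice(ACTIONS)
    else:                                  # otherwise act greedily on Q
        action = 0 if Q[state][0] &gt; Q[state][1] else 1
    nxt, r = step(state, action)
    Q[state][action] += alpha * (r + gamma * max(Q[nxt]) - Q[state][action])
    state = 0 if nxt == N_STATES - 1 else nxt   # restart episode at the goal

print(Q)   # the utility estimates (and hence the policy) keep adapting</pre>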
    <br>
    And evolution doesn't care if organisms get stuck in loops, as long
    as they produce offspring first with high enough probability.
    Consider Pacific salmon. <br>
    <br>
    Sure, simple goal structures can produce simplistic agents. But we
    also know that agents with nearly trivial rules like Langton's ant
    can produce highly nontrivial behaviors (in the ant case whether it
    loops or not is equivalent to the halting problem). We actually do
    not fully know how to characterize the behavior space of utility
    maximizers. <br>
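    <br>
    For anyone who wants to see the Langton's ant point for themselves,
    here is a minimal simulation sketch (plain Python, grid stored as a
    set of black cells; the step count is just a convenient number):
    two trivial rules, yet roughly ten thousand steps of apparent chaos
    before the ant settles into its endlessly repeating "highway".<br>
    <pre>
# Minimal Langton's ant (illustrative sketch). Rules: on a white cell turn
# right, on a black cell turn left; flip the cell's colour and move forward.
black = set()        # coordinates of black cells (grid starts all white)
x, y = 0, 0          # ant position
dx, dy = 0, -1       # facing "up" (screen coordinates, y grows downward)

for step in range(12000):
    if (x, y) in black:        # black cell: turn left, flip to white
        dx, dy = dy, -dx
        black.remove((x, y))
    else:                      # white cell: turn right, flip to black
        dx, dy = -dy, dx
        black.add((x, y))
    x, y = x + dx, y + dy      # move one cell forward

print(len(black), "black cells; ant now at", (x, y))</pre>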
    <br>
    <blockquote
cite="mid:CAJPayv0fa2Shbx9EKkgAo+BzWx0PnH2RTLP86p+FbV4LKvJ=rA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif;display:inline"><font
                  size="4"> If the AI has a meta goal of always obeying
                  humans then sooner or later stupid humans will
                  unintentionally tell the AI to do something that is
                  self contradictory, or tell it to start a task that
                  can never end, and then the AI will stop thinking and
                  do nothing but consume electricity and produce heat.
                   ​</font></div>
              <font size="4"> </font></div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    AI has advanced a bit since the 1950s. You are aware that most
    modern architectures are not that fragile?<br>
    <br>
    Try to crash Siri with a question. <br>
    <br>
    <blockquote
cite="mid:CAJPayv0fa2Shbx9EKkgAo+BzWx0PnH2RTLP86p+FbV4LKvJ=rA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif;display:inline"><br>
              </div>
            </div>
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif;display:inline"><font
                  size="4">And besides, ​if Microsoft can't guarantee
                  that Windows will always behave as we want I think
                  it's nuts to expect a super intelligent AI to.</font><br>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    And *that* is the real problem, which I personally think the
    friendly AI people - many of them people I meet on a daily basis -
    are not addressing enough. Even a mathematically perfect solution
    is not going to be useful if it cannot be implemented; ideally,
    approximate or flawed implementations should converge to the
    solution. <br>
    <br>
    This is why I am spending part of this spring reading up on
    validation methods and building theory for debugging complex
    adaptive technological systems. <br>
    <br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Anders Sandberg
Future of Humanity Institute
Oxford Martin School
Oxford University</pre>
  </body>
</html>