<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 2016-02-26 16:40, John Clark wrote:<br>
    </div>
    <blockquote
cite="mid:CAJPayv2tGYcAcjqLv+=NP6J7RW1P_W-LqMu6-DRNQdtyOHV+8g@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <div dir="ltr"><br>
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif"><font
                  size="4">​No I don't mean that, I mean running around
                  in a circle and making no progress but having no way
                  to know for sure that you're running around in a
                  circle and making no progress. Turing proved there is
                  in general no way to know if you're in a infinite loop
                  or not.</font></div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    No, he did not. You are confusing the halting theorem (there is no
    algorithm that can determine if a program given to it will halt)
    with detecting an infinite loop. Note that a program X can be
    extended to a program X' (or run by an interpreter) that maintains a
    list of past states and check if X returns to a previous state. It
    will accurately detect its infinite looping when it occurs. <br>
    <br>
    It is undecidable to tell if a program with something like the
    Collatz problem (if even, divide X by two, otherwise multiply by
    three and add one; repeat) ends up in the infinite loop 4-2-1-4 or
    does something else. But add the above memory, and it will detect
    when it gets into a loop and report it. <br>
    <br>
    <blockquote
cite="mid:CAJPayv2tGYcAcjqLv+=NP6J7RW1P_W-LqMu6-DRNQdtyOHV+8g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><font size="4"> </font></div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  > ​</div>
                which is a very different thing. But even there we have
                counterexamples: evolution is a fitness-maximizer</div>
            </blockquote>
            <div><br>
            </div>
            <div><font size="4">
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  Evolution's fitness maximizer ​just says "pass as many
                  genes as possible into the next generation" but it
                  says nothing about how to go about that task because
                  it has no idea how to go about it, that's why
                  Evolution needed to make a brain.</div>
              </font></div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Brains are one example of many of how evolution - utterly simplistic
    maximization - can generate creative possibilities. <br>
    <br>
    <blockquote
cite="mid:CAJPayv2tGYcAcjqLv+=NP6J7RW1P_W-LqMu6-DRNQdtyOHV+8g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span class="">
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_quote">
                        <div> </div>
                      </div>
                    </div>
                  </blockquote>
                </span>
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  > ​</div>
                You know boredom is trivially easy to implement in your
                AI? I did it as an undergraduate. <br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div class="gmail_default"
                style="font-family:arial,helvetica,sans-serif"><font
                  size="4">​I know boredom is easy to ​program, good
                  thing too or programing wouldn't be practical; but of
                  course that means a AI could decide that obeying human
                  beings has become boring and it's time to do something
                  different.</font></div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Exactly. Although it is entirely possible to fine tune this, or make
    meta-level instructions not subject to boredom. <br>
    <br>
    <blockquote
cite="mid:CAJPayv2tGYcAcjqLv+=NP6J7RW1P_W-LqMu6-DRNQdtyOHV+8g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  > ​</div>
                That makes the system try new actions if it repeats the
                same actions too often.</div>
            </blockquote>
            <div><br>
            </div>
            <div><font size="4">
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​</div>
                The difficulty is not only in determining how often is
                "too often" but also in determining what constitutes
                "new actions".
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  If </div>
                your goal is to find
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  an​</div>
                 even integer greater than 2 that can not be expressed
                as the sum of two primes
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  then you have either found such a number or you have
                  not. You are constantly examining new numbers so maybe
                  you are getting closer to your goal, or maybe the goal
                  was infinitely far away when you started and still is.
                  When is the correct time to get bored and turn your
                  mind to other tasks that may be more productive?
                  Setting the correct boredom point is tricky, too low
                  and you can't concentrate too high and you have a
                  tendency to becomes obsessed with unproductive lines
                  of thought; Turing showed there is no perfect
                  solution. <br>
                </div>
              </font></div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Sure. But there is a literature on setting hyperparameters in
    learning systems, including how to learn them. There are theorems
    for optimal selection and search. That they are stochastic is not a
    major problem in practice. <br>
    <br>
    <blockquote
cite="mid:CAJPayv2tGYcAcjqLv+=NP6J7RW1P_W-LqMu6-DRNQdtyOHV+8g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span class="">
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_extra">
                        <div class="gmail_quote">
                          <div> </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                </span>
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  > ​</div>
                You seem to assume the fixed goal is something simple,
                expressible as a nice human sentence. Not utility
                maximization over an updateable utility function,</div>
            </blockquote>
            <div><font size="4"><br>
              </font></div>
            <div><font size="4">
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  If the ​</div>
                utility function
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  is ​</div>
                updateable 
                <div class="gmail_default"
                  style="font-family:arial,helvetica,sans-serif;display:inline">​
                  then there is no certainty or even probability that
                  the AI will always obey orders from humans.​</div>
              </font></div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Depends how it is updateable. There can be invariants. But there are
    kinds of updates that profoundly mess up past safety guarantees,
    like the "ontological crises" issue MIRI discovered.<br>
    <a class="moz-txt-link-freetext" href="http://arxiv.org/abs/1105.3821">http://arxiv.org/abs/1105.3821</a><br>
    The core issue in "classic" friendliness theory is to construct
    utility functions that leave some desired properties invariant.
    "Modern" work seems to focus a lot more on getting the right kinds
    of values learned from the start.<br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Dr Anders Sandberg
Future of Humanity Institute
Oxford Martin School
Oxford University
</pre>
  </body>
</html>