[ExI] malevolent machines

Anders Sandberg anders at aleph.se
Thu Apr 10 08:30:45 UTC 2014


William Flynn Wallace <foozler83 at gmail.com>, 9/4/2014 8:56 PM:
We all know that all of our machines are out to get us and frustrate us in various ways, but is it really possible?  In scifi there are dozens of books about entire machine cultures which, of course, are the enemies of humans and maybe all living things.
 
Unless there is something really important that I don't know about computers, it seems to me that having a machine 'wake up', like Mike in Heinlein's The Moon is a Harsh Mistress, is just absurd. It does what it is programmed to do and cannot do anything else.

There is a difference between spontaneously waking up and being designed to become smart (and the latter is more likely). In both cases the problem of making the machine behave well is tricky. But if the waking up or self-improvement is rapid, the behaviour/motivation design problem becomes crucial, since rapid growth produces very powerful entities that may still have flawed goals. (And since flawed goals appear more likely than sensible ones, we have a problem: http://www.nickbostrom.com/superintelligentwill.pdf )
The objection that machines only do what they are told is known as Lady Lovelace's Objection and has a venerable history, but Turing refuted it in his 1950 paper "Computing Machinery and Intelligence". He concludes that section:
"The view that machines cannot give rise to surprises is due, I believe, to a fallacy to which philosophers and mathematicians are particularly subject. This is the assumption that as soon as a fact is presented to a mind all consequences of that fact spring into the mind simultaneously with it. It is a very useful assumption under many circumstances, but one too easily forgets that it is false. A natural consequence of doing so is that one then assumes that there is no virtue in the mere working out of consequences from data and general principles."
Another way of seeing why machines can do new things not explicitly programmed into them is to recognize that a program with learning features can change its behaviour in response to environmental stimuli that the original programmer could not have anticipated. Here the limit is not the programmer's ability to predict, but the fact that the machine is part of a larger, complex system.
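
As a toy illustration (my own sketch, not part of the argument above; the number of actions, the exploration rate and the payoff model are arbitrary assumptions), here is a tiny learner whose eventual behaviour is written nowhere in its source code:

    # Minimal epsilon-greedy learner. The payoff probabilities are drawn at
    # run time and stand in for "environmental stimuli that the programmer
    # cannot know".
    import random

    true_payoffs = [random.random() for _ in range(3)]  # hidden from the programmer
    estimates = [0.0] * 3
    counts = [0] * 3

    for step in range(10000):
        # Explore occasionally, otherwise exploit the current best estimate.
        if random.random() < 0.1:
            action = random.randrange(3)
        else:
            action = max(range(3), key=lambda a: estimates[a])
        reward = 1.0 if random.random() < true_payoffs[action] else 0.0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    # The preferred action is not stated anywhere in the source; it emerges
    # from interaction with the environment.
    print("learned preference:", max(range(3), key=lambda a: estimates[a]))

The more of a system's behaviour is filled in by learning rather than by explicit code, the less the programmer's intentions pin down what it will actually do.
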
The real problem is dealing with logical and environmental unpredictability while still producing reliable behaviour. It *does* work in many restricted domains! But we do not know whether it can work for intelligent systems, which by definition operate in a fairly open-ended domain. It might be that one can restrict or initialise the motivation subsystem in such a way that it remains trustworthy, but so far we have no good theory for this even in very restricted formal cases (just check out the papers at http://intelligence.org/research/ ).
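
To see what a restricted domain buys us (again my own sketch, with arbitrary constants, not drawn from the papers linked above): reliable behaviour can be enforced by construction when the task is narrow and fully specified.

    # A thermostat whose output is hard-clamped to [0, 1]: the bound holds
    # for any finite sensor reading, however surprising the environment is.
    def thermostat(reading_celsius, setpoint=21.0):
        error = setpoint - reading_celsius
        power = 0.2 * error                  # simple proportional control
        return max(0.0, min(1.0, power))     # the clamp is the guarantee

    assert 0.0 <= thermostat(-40.0) <= 1.0
    assert 0.0 <= thermostat(500.0) <= 1.0

Nothing comparable is currently known for an open-ended, self-improving goal system, which is exactly the gap the motivation-design problem points at.
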
Anders Sandberg, Future of Humanity Institute, Philosophy Faculty of Oxford University