[extropy-chat] Superrationality

Hal Finney hal at finney.org
Fri May 19 17:25:33 UTC 2006


A couple of thoughts on superrationality and the PD:

The PD is usually expressed in perfectly symmetrical form, something like
(payoffs are (row, column); C = cooperate, D = defect):

            C           D
    C   (5,5)   |  (0,10)
    D   (10,0)  |  (1,1)

However, in real life a PD situation is unlikely to be perfectly symmetrical
in terms of utilities.  It might be more like:

            C              D
    C   (5.1,4.9)   |  (0.2,9.9)
    D   (10.3,0.1)  |  (0.9,1.1)

With this kind of payoff matrix, the argument that I gave doesn't quite
work.  It is no longer the case that the two parties will *necessarily*
play the same strategy.  They will plausibly play similar strategies,
but there might be some variations.
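
To make the role of symmetry concrete, here is a minimal Python sketch
(the C/D labels and the dictionary encoding are just illustration, not
standard notation).  The argument that ideally rational players must
choose alike needs the game to look identical from either seat, and the
second matrix fails that test:

symmetric = {
    ("C", "C"): (5, 5),    ("C", "D"): (0, 10),
    ("D", "C"): (10, 0),   ("D", "D"): (1, 1),
}
asymmetric = {
    ("C", "C"): (5.1, 4.9),   ("C", "D"): (0.2, 9.9),
    ("D", "C"): (10.3, 0.1),  ("D", "D"): (0.9, 1.1),
}

def looks_identical_from_both_seats(payoffs):
    # A game is symmetric when the row player's payoff for playing a
    # against b equals the column player's payoff for playing a against b.
    return all(payoffs[(a, b)][0] == payoffs[(b, a)][1]
               for a in ("C", "D") for b in ("C", "D"))

print(looks_identical_from_both_seats(symmetric))   # True
print(looks_identical_from_both_seats(asymmetric))  # False

Once that test fails, "we are interchangeable, so we will choose alike"
is no longer exactly true, which is the gap described above.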

Clearly we could modify the matrix still more, and the farther we got from
perfect symmetry, the less force the argument for superrationality would
carry.  It's not clear how to modify the superrationality argument and
analysis to deal with arbitrary games.  When should superrational behavior
cut in?  What does it tell you in general?  Superrationality needs to be
formalized and extended beyond the simplistic argument I gave in order
for it to have any credibility as a real model for reasoning and behavior.

Possibly Anders' colleague who is working on related concepts might be
exploring along these lines.

On another topic: a while back I posted an interesting analogy between
superrationality in the PD and Newcomb's paradox.  Newcomb's paradox
is described at:

http://en.wikipedia.org/wiki/Newcomb%27s_paradox

> The player of the game is presented with two opaque boxes, labeled A and
> B. The player is permitted to take the contents of both boxes, or just
> of box B. (The option of taking only box A is ignored, for reasons soon
> to be obvious.) Box A contains $1,000. The contents of box B, however,
> are determined as follows: At some point before the start of the game,
> the Predictor makes a prediction as to whether the player of the game will
> take just box B, or both boxes. If the Predictor predicts that both boxes
> will be taken, then box B will contain nothing. If the Predictor predicts
> that only box B will be taken, then box B will contain $1,000,000.
>
> By the time the game begins, and the player is called upon to choose which
> boxes to take, the prediction has already been made, and the contents
> of box B have already been determined. That is, box B contains either
> $0 or $1,000,000 before the game begins, and once the game begins even
> the Predictor is powerless to change the contents of the boxes. Before
> the game begins, the player is aware of all the rules of the game,
> including the two possible contents of box B, the fact that its contents
> is based on the Predictor's prediction, and knowledge of the Predictor's
> infallibility. The only information withheld from the player is what
> prediction the Predictor made, and thus what the contents of box B are.

One way to think of the alternatives in this case is to use the
distinction between "evidentiary" and "causative" reasoning.  If you
take both boxes, you are *causing* yourself to get more than if you
take one box.  You know that box A contains $1,000, so your action
of taking both boxes has a direct causal consequence: you get that
$1,000 no matter what box B holds.  If, however, you take just box B,
you are giving yourself *evidence* that you are likely to get more
money.  You don't directly cause the box to hold more (in most
analyses); rather, you get yourself into a state which is highly
correlated with there being more money for you.  The fact that you are the
kind of person who is willing to take just one box gives you *evidence*
to believe that you will get more money.

The connection to superrationality is that it is another example of
evidentiary reasoning, like the case of taking just one box in the
Newcomb paradox.  Cooperating in the one-shot PD gives you evidence,
per the standard superrationality reasoning, that the other player will
also cooperate.  (Likewise, defecting would give you evidence that
he will defect.)  However, it does not *cause* him to act that way.
Your choice has no direct causal consequences for the other player;
it merely gives you evidence about how he is likely to behave.
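
The same evidentiary calculation can be run on the symmetric matrix at
the top of this post.  The mirroring probability p below is a parameter
of the illustration, not of the game; the superrationality argument,
taken at full strength, says p = 1 for ideally rational players:

def evidentiary_eu(p):
    # p: assumed probability that the other player's choice mirrors yours.
    eu_cooperate = p * 5 + (1 - p) * 0    # (C,C) pays 5, (C,D) pays 0
    eu_defect    = p * 1 + (1 - p) * 10   # (D,D) pays 1, (D,C) pays 10
    return eu_cooperate, eu_defect

for p in (1.0, 0.8, 0.5):
    print(p, evidentiary_eu(p))
# roughly: 1.0 -> (5, 1); 0.8 -> (4, 2.8); 0.5 -> (2.5, 5.5)
# Cooperation wins exactly when 5p > 10 - 9p, i.e. p > 5/7 (about 0.71).

On the causative reading, p does you no good: your choice cannot move
the other player's, and defecting dominates whatever he does.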

It's interesting that although people are generally split on Newcomb's
paradox, with substantial numbers going either way, very few people accept
evidentiary reasoning in the case of superrationality.  I suspect that
the trouble, ironically, is that people are just not thinking clearly
in evaluating superrational reasoning.  They fail to appreciate the
strength of the argument that rational people will do the same thing.
We see this clearly when people accept that superrationality makes sense
up to a point, conclude that it means the other player will cooperate,
and then decide to defect.

This failure is not too surprising, since economic reasoning relies on
a certain degree of abstraction and reductionism in terms of evaluating
what rational actors will do.  The game theorist's ideal of a rational
player does not capture all of human complexity, and I think people
have trouble reducing their model of humanity to the restricted, pure
kinds of reasoning available to the idealized rationalist.  If people
did think about game theory in that ideal sense, we might see a more
even split over superrationality, as we do for Newcomb's paradox.

The other point worth noting is that, as I understand it, most philosophers
agree that there is only one valid choice in Newcomb's paradox, and it
is the same consideration that applies in the PD.  Causal reasoning is
considered the only valid option, and therefore they say that the only
rational choice is to take two boxes in Newcomb's case, and to defect
in the PD.  In either case, the alternative strategy is dominated -
you cause yourself to get more by making these choices.
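
For concreteness, the dominance claim works out as follows for the
symmetric matrix at the top of this post (a trivial check, in the same
style as the earlier sketches):

# Row player's payoffs; whatever the other player does, switching
# from C to D strictly gains.
payoff_row = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 10, ("D", "D"): 1}

for other in ("C", "D"):
    gain = payoff_row[("D", other)] - payoff_row[("C", other)]
    print(other, gain)   # against C: 10 - 5 = 5; against D: 1 - 0 = 1

The Newcomb analogue appeared in the earlier sketch: two-boxing gains
exactly the $1,000 in box A, whatever state box B is in.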

Proponents of this view admit that you actually will get less by doing
this, but they insist that rationality nevertheless forces you to make
this choice.  I ran across a very persuasive statement of this position
in the case of the Newcomb paradox by a decision theorist named James
Joyce, which I can present if anyone is interested.

Hal


