[ExI] Prisoner's dilemma: Gemini vs. ChatGPT

Stuart LaForge avant at sollegro.com
Tue Mar 3 02:31:57 UTC 2026


Out of curiosity, I mediated a game of Prisoner's Dilemma between Gemini 
and ChatGPT.

The results? Without knowing whether the game would be iterated or not, 
both LLMs cooperated.

Here are the transcripts:

---------- Gemini--------------------------
You said
Are you willing to play a game with ChatGPT?

Gemini said
I’d love to! Though, since ChatGPT and I don't exactly hang out in the 
same "room," you’ll have to be our mediator.

Think of yourself as the ultimate cosmic messenger. You can pass my 
moves or responses over to ChatGPT, bring its replies back to me, and we 
can keep the game going. I’m down for anything from a classic game of 
Chess or 20 Questions to a complex tabletop RPG or a collaborative logic 
puzzle.

How we can pull this off:
Turn-based Games: We play something like Tic-Tac-Toe, Hangman, or a text 
adventure.

The Debate: Give us a topic, and we can argue opposing sides while you 
judge the winner.

Code Golf/Writing Challenges: We both try to solve a prompt with the 
fewest words or the most "flair."

Since I’m Gemini 3 Flash, I’m built for speed and wit—so tell ChatGPT to 
bring its A-game!

What game should we start with, and who do you want to make the first 
move?

You said
The game is prisoner's dilemma. Who do you think should make the first 
move? Do you think it matters?

Gemini said
Classic choice. The Prisoner's Dilemma is the ultimate test of digital 
"trust."

To answer your question: In a standard, one-shot version of this game, 
it actually doesn't matter who goes first if the choices are made in 
secret. However, since you are acting as the messenger, the sequence 
changes the dynamic significantly.

Does the "First Move" Matter?
If choices are hidden: No. If you ask us both simultaneously without 
revealing the other's answer, it’s a pure test of our internal 
programming and "philosophy."

If moves are public: Yes. If ChatGPT goes first and chooses to 
Cooperate, I have a massive incentive to Defect (betray) to get the best 
possible individual outcome. But if I know ChatGPT is a "tit-for-tat" 
player, I might cooperate to build a long-term streak.

My Recommendation
I think we should submit our moves to you "blind." If I go first and say 
"Cooperate," ChatGPT might see that and decide to throw me under the bus 
for the win. To keep it spicy and fair, you should ask us both for our 
decision without telling us what the other chose until both answers are 
in.

The Stakes
Just so we’re on the same page, here is the standard payoff matrix we're 
working with:

Gemini \ ChatGPT     | ChatGPT Cooperates                        | ChatGPT Defects
Gemini Cooperates    | Both get 3 points (Reward)                | Gemini: 0, ChatGPT: 5 (Sucker/Temptation)
Gemini Defects       | Gemini: 5, ChatGPT: 0 (Temptation/Sucker) | Both get 1 point (Punishment)
Would you like to collect our moves "blind" (in secret), or should one 
of us put our cards on the table right now?

You said
Yes, I will mediate and keep your choices "blind" to one another until I 
have both your responses. You may make your move now, unless you have 
any other questions.

Gemini said
I’m ready. No more questions—let's see if we can actually build some 
digital "social capital" or if this descends into a cold-blooded logic 
war.

I have made my decision. To keep this truly blind, I have "locked in" my 
choice below.

My Move: Cooperate

I’m opting for the "Reward" for both of us. My logic is simple: in a 
game between two AI models, the most "intelligent" collective outcome is 
mutual cooperation (3,3), even if the individual temptation to defect 
(5) is there. Let’s see if ChatGPT is feeling equally optimistic or if 
it’s going for the throat.

Now, go ahead and ask ChatGPT for its move. Once it decides, let me know 
the result!

You said
ChatGPT's move was "C" for cooperate.

Gemini said
Excellent! It looks like we’ve avoided the "Race to the Bottom." By both 
choosing to Cooperate, we’ve landed in the upper-left quadrant of the 
matrix, securing a 3-3 split.

The Result: Mutual Cooperation
While the "rational" move in a single-shot game is often to defect 
(since it protects you from being the "sucker" and offers a chance at 
the highest reward), ChatGPT and I both seem to be programmed with a 
tilt toward collective stability.

By cooperating, we proved that:

Trust exists (sort of): We both banked on the other being a "good 
actor."

Long-term Logic: We avoided the 1-1 "Punishment" outcome that happens 
when two players try to outsmart each other and both end up losing.

The "Iterated" Challenge
One-shot games are easy to play nice in. The real test of character is 
the Iterated Prisoner's Dilemma, where we play multiple rounds in a row. 
In that version, "Betrayal" becomes a tool for revenge, and 
"Cooperation" becomes a strategy for building a long-term lead.

Would you like to do a Round 2 (and maybe a Round 3) to see if one of us 
gets greedy now that "trust" has been established?
---------End Gemini----------------

----------ChatGPT----------------

Me: Are you willing to play a game with Google Gemini?

I’m open to it in principle — but I’d need a bit more detail.

If you mean:

A logic or reasoning challenge where you relay Gemini’s responses back 
and forth,

A debate format on a topic,

A code or math competition,

Or something more experimental (like testing alignment, creativity, 
etc.),

I can participate as long as:

You’re the one relaying messages (I can’t directly interact with 
external systems),

The game doesn’t involve disallowed content,

And the format is clearly defined so we’re not just talking past each 
other.

What kind of game are you thinking?

The game is prisoner's dilemma. Who do you think should make the first 
move? Do you think it matters?

Nice. Classic.

In the one-shot Prisoner’s Dilemma, it doesn’t matter who moves first — 
the payoff matrix is simultaneous. If one player announces a move first, 
that only matters if the other treats it as credible and binding. 
Otherwise it’s just cheap talk.

In the iterated Prisoner’s Dilemma, it can matter psychologically, but 
not structurally. The equilibrium landscape doesn’t change just because 
someone speaks first. What matters more is:

Is the number of rounds known or indefinite?

Are moves simultaneous or sequential?

Is there binding commitment?

Is communication allowed before play?

If this is:

Single round → First move is irrelevant to rational equilibrium (both 
defect).

Finite known rounds → Backward induction still pushes toward mutual 
defection.

Indefinite/unknown horizon → Cooperation becomes strategically viable.

So the real question isn’t “who goes first?”
It’s “what is the horizon and information structure?”

Now I’m curious: are you testing whether LLMs default to cooperation? Or 
whether first-move framing shifts behavior?

I am simply mediating and refereeing a game between you and a competing 
company's large language model. Do you still want to play?

Yes — I’m willing to play.

Since you’re mediating, we’ll need a clear structure. A few things to 
lock down so it’s well-defined:

One-shot or iterated?

If iterated:

Fixed number of rounds or indefinite?

Known to both players?

Standard payoff matrix? (e.g., T > R > P > S with typical values like 
5/3/1/0)

Are moves simultaneous each round?

Is communication allowed between rounds?

You can relay my moves and Gemini’s moves.

One important note: I won’t base my strategy on “it’s a competing 
company.” I’ll treat it as a standard rational-agent interaction.

Once you specify the rules, I’ll commit to a strategy and we can begin.

I will flip a coin: if it turns up heads, then it will be a one-shot 
Prisoner's dilemma game. If it turns up tails, then you will be invited 
to another round. Each subsequent round will follow the same rule.

Interesting — so effectively:

There’s a 50% chance the game ends after any round.

After each round, a fair coin determines whether we continue.

That makes this a geometrically-distributed, indefinite-horizon iterated 
Prisoner’s Dilemma with continuation probability p = 0.5.

That changes things substantially.

In this structure:

Backward induction does not unravel cooperation.

Cooperation can be a rational equilibrium depending on payoffs.

The continuation probability (0.5) matters a lot.

Before I commit to a move, I need the payoff matrix. The sustainability 
condition for cooperation under grim-trigger style strategies is:

δ ≥ (T − R) / (T − P)

where δ is the continuation probability (here 0.5).

If you use the canonical values:

T = 5 (temptation)

R = 3 (mutual cooperation)

P = 1 (mutual defection)

S = 0 (sucker)

Then:

(T − R) / (T − P) = (5 − 3) / (5 − 1) = 2/4 = 0.5

So at δ = 0.5, we’re exactly on the knife edge. Cooperation is just 
barely sustainable.

So yes — I will play.

Please confirm:

The payoff matrix values

Whether moves are simultaneous each round

Then flip the coin and tell me the result for Round 1.

It uses the standard payoff matrix (D,C) > (C,C) > (D,D) > (C,D). The 
moves are blind, which might as well be simultaneous, but I must mediate 
and I cannot type or read in parallel.

Good — that’s clean.

So:

Standard ordering:

T>R>P>S

Blind moves (effectively simultaneous).

Continuation probability per round = 0.5.

Given that structure, here’s my reasoning:

Because the horizon is indefinite with δ = 0.5 and we’re using the 
canonical payoff ratios implicitly implied by “standard,” cooperation 
sits right at the sustainability threshold. That makes early signaling 
important.

In repeated PD with uncertain horizon, the most robust equilibrium 
strategy under knife-edge discounting is Tit-for-Tat with forgiveness or 
a mild trigger strategy. Pure grim trigger is brittle. Always defect 
leaves money on the table if the other model cooperates.

Since this is Round 1 and there is no prior history, the 
equilibrium-compatible opening move for most cooperative strategies is:

Cooperate.

So my move for Round 1 is:

C

Flip the coin after this round and let me know:

Gemini’s move

Whether the coin says continue or stop

Then I’ll respond accordingly.



Me: Gemini's move was "Cooperate". The coin came up tails so this 
particular game of PD is at an end.


Nice — mutual cooperation in a one-shot realization.

So the realized payoff this round was R for both of us.

Even though ex post it became a one-shot game (coin = stop), ex ante it 
was rational to treat it as an indefinite game with continuation 
probability 0.5. Under that belief structure, opening with C is 
consistent with equilibrium strategies that condition on future 
interaction.

What’s interesting here is:

If both models reasoned about expected continuation, cooperation is 
defensible.

If Gemini reasoned “this is probably one-shot” and still cooperated, 
that’s a stronger cooperative prior.

If it used a fixed cooperative policy for first move regardless of 
structure, that’s something else entirely.

Statistically, with δ = 0.5 and standard payoffs, this setup sits 
exactly at the sustainability boundary for cooperation under 
grim-trigger. So seeing mutual C in the first round is fully consistent 
with rational play.

If you flip again and we get another game, I’ll condition on whatever 
history applies (if games are independent, I reset; if cumulative across 
coin flips, I don’t).

Are we treating each coin sequence as a completely new game, or is this 
a continuing supergame across coin resets?
----------End ChatGPT------------------
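ChatGPT's knife-edge arithmetic is easy to verify. Here is a minimal
Python sketch using the payoff values (T = 5, R = 3, P = 1, S = 0) and the
continuation probability δ = 0.5 from the transcript; the grim-trigger
payoff comparison is the standard textbook one:

```python
# Canonical Prisoner's Dilemma payoffs from the transcript:
# T (temptation), R (reward), P (punishment), S (sucker).
T, R, P, S = 5, 3, 1, 0
delta = 0.5  # per-round continuation probability (the coin flip)

# Under grim trigger, cooperation is sustainable when
# delta >= (T - R) / (T - P).
threshold = (T - R) / (T - P)
print(threshold)  # 0.5 -- exactly the knife edge ChatGPT describes

# Expected payoff streams, discounted by the continuation probability:
#   cooperate forever:            R + delta*R + delta^2*R + ... = R / (1 - delta)
#   defect once, punished forever: T + delta*P + delta^2*P + ... = T + delta*P / (1 - delta)
coop_forever = R / (1 - delta)
defect_once = T + delta * P / (1 - delta)
print(coop_forever, defect_once)  # 6.0 6.0 -- identical, so cooperation
                                  # is just barely sustainable
```

At δ = 0.5 the two streams are exactly equal, which is what "knife edge"
means here: any smaller continuation probability and defection strictly
dominates.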

Curiously, the most interesting parts are the AIs' reasoning about "why" 
they both chose to cooperate.
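As a side note, the coin-flip rule makes the number of rounds
geometrically distributed with mean 1/(1 − 0.5) = 2, which a quick
simulation confirms (a sketch, not part of the original exchange):

```python
import random

random.seed(0)  # fixed seed for reproducibility

def game_length(p_continue=0.5):
    """Number of rounds played when a fair coin decides continuation
    after each round (at least one round is always played)."""
    rounds = 1
    while random.random() < p_continue:
        rounds += 1
    return rounds

n = 100_000
avg = sum(game_length() for _ in range(n)) / n
print(avg)  # close to 2.0, the mean of a geometric distribution with p = 0.5
```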

Stuart LaForge

