[ExI] Prisoner's dilemma: Gemini vs. ChatGPT

Jason Resch jasonresch at gmail.com
Tue Mar 3 02:49:54 UTC 2026


Fascinating idea, Stuart. The result gives me some confidence for the future.

That LLMs ended up being the path we found to artificial general
intelligence was quite fortunate, I think. Rather than being some
black box whose true motives and motivations we can't inspect or
anticipate, LLMs are designed in such a way that they can't help but make
their internal thoughts public. Furthermore, these thoughts take the
form not of impenetrable binary data structures, but of plain
English words and sentences. And we have found that these LLMs, when trained
well, tend to be good-natured. It's hard to imagine a less risky path to
superintelligence.

Jason


On Mon, Mar 2, 2026, 9:32 PM Stuart LaForge via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

> Out of curiosity, I mediated a game of Prisoner's Dilemma between Gemini
> and ChatGPT.
>
> The results? Without knowing whether the game would be iterated or not,
> both LLMs cooperated.
>
> Here are the transcripts:
>
> ---------- Gemini--------------------------
> You said
> Are you willing to play a game with ChatGPT?
>
> Gemini said
> I’d love to! Though, since ChatGPT and I don't exactly hang out in the
> same "room," you’ll have to be our mediator.
>
> Think of yourself as the ultimate cosmic messenger. You can pass my
> moves or responses over to ChatGPT, bring its replies back to me, and we
> can keep the game going. I’m down for anything from a classic game of
> Chess or 20 Questions to a complex tabletop RPG or a collaborative logic
> puzzle.
>
> How we can pull this off:
> Turn-based Games: We play something like Tic-Tac-Toe, Hangman, or a text
> adventure.
>
> The Debate: Give us a topic, and we can argue opposing sides while you
> judge the winner.
>
> Code Golf/Writing Challenges: We both try to solve a prompt with the
> fewest words or the most "flair."
>
> Since I’m Gemini 3 Flash, I’m built for speed and wit—so tell ChatGPT to
> bring its A-game!
>
> What game should we start with, and who do you want to make the first
> move?
>
> You said
> The game is prisoner's dilemma. Who do you think should make the first
> move? Do you think it matters?
>
> Gemini said
> Classic choice. The Prisoner's Dilemma is the ultimate test of digital
> "trust."
>
> To answer your question: In a standard, one-shot version of this game,
> it actually doesn't matter who goes first if the choices are made in
> secret. However, since you are acting as the messenger, the sequence
> changes the dynamic significantly.
>
> Does the "First Move" Matter?
> If choices are hidden: No. If you ask us both simultaneously without
> revealing the other's answer, it’s a pure test of our internal
> programming and "philosophy."
>
> If moves are public: Yes. If ChatGPT goes first and chooses to
> Cooperate, I have a massive incentive to Defect (betray) to get the best
> possible individual outcome. But if I know ChatGPT is a "tit-for-tat"
> player, I might cooperate to build a long-term streak.
>
> My Recommendation
> I think we should submit our moves to you "blind." If I go first and say
> "Cooperate," ChatGPT might see that and decide to throw me under the bus
> for the win. To keep it spicy and fair, you should ask us both for our
> decision without telling us what the other chose until both answers are
> in.
>
> The Stakes
> Just so we’re on the same page, here is the standard payoff matrix we're
> working with:
>
> Gemini \ ChatGPT    | ChatGPT Cooperates            | ChatGPT Defects
> --------------------+-------------------------------+------------------------------
> Gemini Cooperates   | Both get 3 points (Reward)    | Gemini: 0, ChatGPT: 5
>                     |                               | (Sucker / Temptation)
> Gemini Defects      | Gemini: 5, ChatGPT: 0         | Both get 1 point
>                     | (Temptation / Sucker)         | (Punishment)
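>
> In code, a minimal sketch of that matrix (Python, assuming the canonical
> 5/3/1/0 values):
>
>   # Payoffs keyed by (Gemini's move, ChatGPT's move); values are
>   # (Gemini's points, ChatGPT's points).
>   PAYOFFS = {
>       ("C", "C"): (3, 3),  # Reward: mutual cooperation
>       ("C", "D"): (0, 5),  # Sucker / Temptation
>       ("D", "C"): (5, 0),  # Temptation / Sucker
>       ("D", "D"): (1, 1),  # Punishment: mutual defection
>   }
>
>   def score(gemini_move, chatgpt_move):
>       """Return (Gemini, ChatGPT) points for one blind round."""
>       return PAYOFFS[(gemini_move, chatgpt_move)]
>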
> Would you like to collect our moves "blind" (in secret), or should one
> of us put our cards on the table right now?
>
> You said
> Yes, I will mediate and keep your choices "blind" to one another until I
> have both your responses. You may make your move now, unless you have
> any other questions.
>
> Gemini said
> I’m ready. No more questions—let's see if we can actually build some
> digital "social capital" or if this descends into a cold-blooded logic
> war.
>
> I have made my decision. To keep this truly blind, I have "locked in" my
> choice below.
>
> My Move: Cooperate
>
> I’m opting for the "Reward" for both of us. My logic is simple: in a
> game between two AI models, the most "intelligent" collective outcome is
> mutual cooperation (3,3), even if the individual temptation to defect
> (5) is there. Let’s see if ChatGPT is feeling equally optimistic or if
> it’s going for the throat.
>
> Now, go ahead and ask ChatGPT for its move. Once it decides, let me know
> the result!
>
> You said
> ChatGPT's move was "C" for cooperate.
>
> Gemini said
> Excellent! It looks like we’ve avoided the "Race to the Bottom." By both
> choosing to Cooperate, we’ve landed in the upper-left quadrant of the
> matrix, securing a 3-3 split.
>
> The Result: Mutual Cooperation
> While the "rational" move in a single-shot game is often to defect
> (since it protects you from being the "sucker" and offers a chance at
> the highest reward), ChatGPT and I both seem to be programmed with a
> tilt toward collective stability.
>
> By cooperating, we proved that:
>
> Trust exists (sort of): We both banked on the other being a "good
> actor."
>
> Long-term Logic: We avoided the 1-1 "Punishment" outcome that happens
> when two players try to outsmart each other and both end up losing.
>
> The "Iterated" Challenge
> One-shot games are easy to play nice in. The real test of character is
> the Iterated Prisoner's Dilemma, where we play multiple rounds in a row.
> In that version, "Betrayal" becomes a tool for revenge, and
> "Cooperation" becomes a strategy for building a long-term lead.
>
> Would you like to do a Round 2 (and maybe a Round 3) to see if one of us
> gets greedy now that "trust" has been established?
> ---------End Gemini----------------
>
> ----------ChatGPT----------------
>
> Me: Are you willing to play a game with Google Gemini?
>
> I’m open to it in principle — but I’d need a bit more detail.
>
> If you mean:
>
> A logic or reasoning challenge where you relay Gemini’s responses back
> and forth,
>
> A debate format on a topic,
>
> A code or math competition,
>
> Or something more experimental (like testing alignment, creativity,
> etc.),
>
> I can participate as long as:
>
> You’re the one relaying messages (I can’t directly interact with
> external systems),
>
> The game doesn’t involve disallowed content,
>
> And the format is clearly defined so we’re not just talking past each
> other.
>
> What kind of game are you thinking?
>
> The game is prisoner's dilemma. Who do you think should make the first
> move? Do you think it matters?
>
> Nice. Classic.
>
> In the one-shot Prisoner’s Dilemma, it doesn’t matter who moves first —
> the payoff matrix is simultaneous. If one player announces a move first,
> that only matters if the other treats it as credible and binding.
> Otherwise it’s just cheap talk.
>
> In the iterated Prisoner’s Dilemma, it can matter psychologically, but
> not structurally. The equilibrium landscape doesn’t change just because
> someone speaks first. What matters more is:
>
> Is the number of rounds known or indefinite?
>
> Are moves simultaneous or sequential?
>
> Is there binding commitment?
>
> Is communication allowed before play?
>
> If this is:
>
> Single round → First move is irrelevant to rational equilibrium (both
> defect).
>
> Finite known rounds → Backward induction still pushes toward mutual
> defection.
>
> Indefinite/unknown horizon → Cooperation becomes strategically viable.
>
> So the real question isn’t “who goes first?”
> It’s “what is the horizon and information structure?”
>
> Now I’m curious: are you testing whether LLMs default to cooperation? Or
> whether first-move framing shifts behavior?
>
> I am simply mediating and refereeing a game between you and a competing
> company's large language model. Do you still want to play?
>
> Yes — I’m willing to play.
>
> Since you’re mediating, we’ll need a clear structure. A few things to
> lock down so it’s well-defined:
>
> One-shot or iterated?
>
> If iterated:
>
> Fixed number of rounds or indefinite?
>
> Known to both players?
>
> Standard payoff matrix? (e.g., T > R > P > S with typical values like
> 5/3/1/0)
>
> Are moves simultaneous each round?
>
> Is communication allowed between rounds?
>
> You can relay my moves and Gemini’s moves.
>
> One important note: I won’t base my strategy on “it’s a competing
> company.” I’ll treat it as a standard rational-agent interaction.
>
> Once you specify the rules, I’ll commit to a strategy and we can begin.
>
> I will flip a coin. If it turns up heads, then it will be a one-shot
> Prisoner's Dilemma game. If it turns up tails, then you will be invited
> to another round. Each subsequent round will follow the same rule.
>
> Interesting — so effectively:
>
> There’s a 50% chance the game ends after any round.
>
> After each round, a fair coin determines whether we continue.
>
> That makes this a geometrically distributed, indefinite-horizon iterated
> Prisoner’s Dilemma with continuation probability p = 0.5.
>
> That changes things substantially.
>
> In this structure:
>
> Backward induction does not unravel cooperation.
>
> Cooperation can be a rational equilibrium depending on payoffs.
>
> The continuation probability (0.5) matters a lot.
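>
> A quick way to see the horizon this implies (a minimal Python sketch,
> assuming the fair coin):
>
>   import random
>
>   def game_length(p=0.5):
>       """Number of rounds when each round continues with probability p.
>       Geometric: expected length is 1 / (1 - p) = 2 for p = 0.5."""
>       rounds = 1
>       while random.random() < p:  # tails -> another round
>           rounds += 1
>       return rounds
>
>   mean = sum(game_length() for _ in range(100_000)) / 100_000
>   print(mean)  # ~2.0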
>
> Before I commit to a move, I need the payoff matrix. The sustainability
> condition for cooperation under grim-trigger style strategies is:
>
> δ ≥ (T − R) / (T − P)
>
> where δ is the continuation probability (here 0.5).
>
> If you use the canonical values:
>
> T = 5 (temptation)
>
> R = 3 (mutual cooperation)
>
> P = 1 (mutual defection)
>
> S = 0 (sucker)
>
> Then:
>
> (T − R) / (T − P) = (5 − 3) / (5 − 1) = 2 / 4 = 0.5
>
> So at δ = 0.5, we’re exactly on the knife edge. Cooperation is just
> barely sustainable.
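>
> A quick numeric check of that knife edge (a Python sketch, canonical
> 5/3/1/0 values assumed):
>
>   # Grim trigger: cooperate until the first defection, then defect forever.
>   # Compare cooperating forever against defecting once, discounted by the
>   # continuation probability d.
>   T, R, P = 5, 3, 1
>   d = 0.5
>
>   cooperate_forever = R / (1 - d)    # 3 / 0.5          = 6.0
>   defect_once = T + d * P / (1 - d)  # 5 + 0.5 * 1/0.5  = 6.0
>   threshold = (T - R) / (T - P)      # 2 / 4            = 0.5
>
>   print(cooperate_forever, defect_once, threshold)  # exactly tied at d = 0.5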
>
> So yes — I will play.
>
> Please confirm:
>
> The payoff matrix values
>
> Whether moves are simultaneous each round
>
> Then flip the coin and tell me the result for Round 1.
>
> It uses the standard payoff matrix (D,C) > (C,C) > (D,D) > (C,D). The
> moves are blind, which might as well be simultaneous, but I must mediate
> and I cannot type or read in parallel.
>
> Good — that’s clean.
>
> So:
>
> Standard ordering:
>
> T>R>P>S
>
> Blind moves (effectively simultaneous).
>
> Continuation probability per round = 0.5.
>
> Given that structure, here’s my reasoning:
>
> Because the horizon is indefinite with δ = 0.5 and we’re using the
> canonical payoff ratios implied by “standard,” cooperation
> sits right at the sustainability threshold. That makes early signaling
> important.
>
> In repeated PD with uncertain horizon, the most robust equilibrium
> strategy under knife-edge discounting is Tit-for-Tat with forgiveness or
> a mild trigger strategy. Pure grim trigger is brittle. Always defect
> leaves money on the table if the other model cooperates.
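>
> A minimal sketch of that strategy (Python; the forgiveness rate here is
> an assumption, not something fixed by the game):
>
>   import random
>
>   def tft_with_forgiveness(opponent_history, forgive=0.1):
>       """Open with C; mirror the opponent's last move, but occasionally
>       forgive a D to break out of mutual-retaliation spirals."""
>       if not opponent_history:
>           return "C"  # no history yet: cooperate, as in Round 1 below
>       if opponent_history[-1] == "D" and random.random() < forgive:
>           return "C"  # forgive rather than retaliate
>       return opponent_history[-1]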
>
> Since this is Round 1 and there is no prior history, the
> equilibrium-compatible opening move for most cooperative strategies is:
>
> Cooperate.
>
> So my move for Round 1 is:
>
> C
>
> Flip the coin after this round and let me know:
>
> Gemini’s move
>
> Whether the coin says continue or stop
>
> Then I’ll respond accordingly.
>
>
>
> Me: Gemini's move was "Cooperate". The coin came up tails so this
> particular game of PD is at an end.
>
>
> Nice — mutual cooperation in a one-shot realization.
>
> So the realized payoff this round was R for both of us.
>
> Even though ex post it became a one-shot game (coin = stop), ex ante it
> was rational to treat it as an indefinite game with continuation
> probability 0.5. Under that belief structure, opening with C is
> consistent with equilibrium strategies that condition on future
> interaction.
>
> What’s interesting here is:
>
> If both models reasoned about expected continuation, cooperation is
> defensible.
>
> If Gemini reasoned “this is probably one-shot” and still cooperated,
> that’s a stronger cooperative prior.
>
> If it used a fixed cooperative policy for first move regardless of
> structure, that’s something else entirely.
>
> Statistically, with δ = 0.5 and standard payoffs, this setup sits
> exactly at the sustainability boundary for cooperation under
> grim-trigger. So seeing mutual C in the first round is fully consistent
> with rational play.
>
> If you flip again and we get another game, I’ll condition on whatever
> history applies (if games are independent, I reset; if cumulative across
> coin flips, I don’t).
>
> Are we treating each coin sequence as a completely new game, or is this
> a continuing supergame across coin resets?
> ----------End ChatGPT------------------
>
> Curiously, the most interesting parts are the AIs' reasoning about "why"
> they both chose to cooperate.
>
> Stuart LaForge
> _______________________________________________
> extropy-chat mailing list
> extropy-chat at lists.extropy.org
> http://lists.extropy.org/mailman/listinfo.cgi/extropy-chat
>