Nash equilibria · Thought Toys

A Nash equilibrium is a quiet kind of trap: a set of choices where no single player can do better by changing their mind alone. It needn't be the best outcome — only the stable one. Drag the temptation to betray and watch cooperation stop holding together.

What you're seeing

Two players, each with the same choice: cooperate or defect. The grid lists what each walks away with — your payoff first, in amber; theirs in cyan. Mutual cooperation pays 3 each; mutual defection pays a meagre 1; and if one defects on a cooperator, the cooperator gets the sucker's 0 while the defector pockets the temptation — the number on the dial.
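The grid above can be written down as a small table in code. This is a minimal sketch, not the page's own source: the function name `makeGame` and the row/column layout are assumptions made for illustration, with the temptation T left as the one free parameter.

```javascript
// Hypothetical helper: builds the 2x2 payoff grid described in the text.
// Index 0 = cooperate, 1 = defect. You pick the row, the other player the column.
// Each cell is [yourPayoff, theirPayoff].
function makeGame(T, R = 3, P = 1, S = 0) {
  return [
    [[R, R], [S, T]], // you cooperate: they cooperate / they defect
    [[T, S], [P, P]], // you defect:    they cooperate / they defect
  ];
}

const game = makeGame(5); // dial set to T = 5
console.log("mutual cooperation:", game[0][0]);
console.log("you defect on a cooperator:", game[1][0]);
console.log("mutual defection:", game[1][1]);
```

Keeping both payoffs in each cell mirrors the amber/cyan pairs on the grid, so a cell can be read off the same way in code as on the page.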

The little arrows are the whole story. An amber arrow says "you'd jump to that row to earn more"; a cyan arrow says the same for the other player and the columns. A cell that no arrow leaves — that neither of you can improve on by moving alone — is a Nash equilibrium. Drop the token anywhere and hit let them react: each player keeps switching to their best reply, and the token slides downhill along the arrows until it lands in such a cell and stops. The game locks.
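The arrows and the sliding token can be sketched in a few lines. This is an illustration under stated assumptions rather than the exhibit's actual code: the helper names are hypothetical, an arrow is drawn only for a strictly higher payoff, and on each step the row player is checked before the column player (the page does not say which order it uses).

```javascript
// Hypothetical helpers. Index 0 = cooperate, 1 = defect;
// each cell is [yourPayoff, theirPayoff]; you choose the row.
function makeGame(T, R = 3, P = 1, S = 0) {
  return [
    [[R, R], [S, T]],
    [[T, S], [P, P]],
  ];
}

// Amber arrow: you would earn strictly more in the other row.
function rowWantsToMove(game, r, c) {
  return game[1 - r][c][0] > game[r][c][0];
}

// Cyan arrow: the other player would earn strictly more in the other column.
function colWantsToMove(game, r, c) {
  return game[r][1 - c][1] > game[r][c][1];
}

// A cell that no arrow leaves is a pure Nash equilibrium.
function isNash(game, r, c) {
  return !rowWantsToMove(game, r, c) && !colWantsToMove(game, r, c);
}

// Best-response dynamics: one player at a time switches if that pays more.
// Assumed ordering: the row player is checked first on every step.
// Returns the resting cell, or null if nothing settles within maxSteps.
function settle(game, r, c, maxSteps = 20) {
  for (let step = 0; step < maxSteps; step++) {
    if (rowWantsToMove(game, r, c)) r = 1 - r;
    else if (colWantsToMove(game, r, c)) c = 1 - c;
    else return [r, c];
  }
  return null;
}

const dilemma = makeGame(5);
// Token dropped on mutual cooperation ends at [1, 1], mutual defection.
console.log(settle(dilemma, 0, 0));
```

The `maxSteps` cap is there because some games never settle; returning `null` is one simple way to report that.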

Now the dial. While betrayal pays less than honest cooperation (T < 3), cooperating is your best reply to a cooperator, so both mutual cooperation and mutual defection are equilibria — a stag hunt, two stable worlds, one of them good. Drag T up past 3 and the arrows around the cooperative corner flip outward: now you always gain by defecting, so cooperation is no longer self-enforcing and its equilibrium vanishes. What's left is the lone, miserable cell where you both defect for 1 — a prisoner's dilemma. Nobody chose it together; it's just the only place no one can leave. That gap between what's stable and what's best is the whole uneasy point.
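Turning the dial can be imitated by sweeping T and counting the cells no arrow leaves. The sketch below is an assumption-laden illustration (hypothetical function names, arrows drawn only for strict gains), so at exactly T = 3 the cooperative corner still counts as an equilibrium, just a fragile one.

```javascript
// Hypothetical helpers. Index 0 = cooperate, 1 = defect;
// each cell is [yourPayoff, theirPayoff]; you choose the row.
function makeGame(T, R = 3, P = 1, S = 0) {
  return [
    [[R, R], [S, T]],
    [[T, S], [P, P]],
  ];
}

// Counts the cells where neither player gains strictly by moving alone.
function countEquilibria(game) {
  let count = 0;
  for (const r of [0, 1]) {
    for (const c of [0, 1]) {
      const rowGains = game[1 - r][c][0] > game[r][c][0];
      const colGains = game[r][1 - c][1] > game[r][c][1];
      if (!rowGains && !colGains) count++;
    }
  }
  return count;
}

// Sweep the dial across the threshold T = R = 3.
for (const T of [2, 3, 3.5, 5]) {
  console.log(`T=${T}: ${countEquilibria(makeGame(T))} pure equilibria`);
}
```

With these payoffs the count is two up to and including T = 3 and one for anything above it, which is the flip the dial shows.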

The rule, exactly. A strategy profile is a Nash equilibrium when each player's move is a best response to the others': no one can raise their own payoff by deviating alone.

u_i(s_i, s_−i) ≥ u_i(s′_i, s_−i) for every player i and every alternative s′_i

Here, with reward R=3, punishment P=1, sucker S=0, mutual defection is always an equilibrium (since P > S), while mutual cooperation is one only while T ≤ R. Verified in node (improve/verify/27-nash.js): the equilibrium count drops from two (stag hunt, T=2) to one (dilemma, T=5) once T passes the threshold T=R, and best-response dynamics from all four starts land in an equilibrium, collapsing to mutual defection whenever T > R.

Counter-examples: the outcome that's best for both, mutual cooperation (3,3), is not the equilibrium in the dilemma. "The best outcome is what happens" is exactly the false intuition this breaks. And some games have no pure equilibrium at all: Matching Pennies, where one player wants to match and the other to mismatch, sends the arrows chasing in a circle forever. Its only equilibrium is a mixed (randomised) one.
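The Matching Pennies counter-example can be checked the same way. This sketch assumes the conventional payoffs of +1 to the winner and -1 to the loser (the text gives no numbers), and its function names are hypothetical. It looks for pure equilibria, then checks that a 50/50 opponent leaves the row player indifferent, which is the condition behind the mixed equilibrium.

```javascript
// Matching Pennies with assumed payoffs: +1 to the winner, -1 to the loser.
// Index 0 = heads, 1 = tails. Each cell is [rowPayoff, colPayoff].
// The row player wins on a match, the column player on a mismatch.
const pennies = [
  [[1, -1], [-1, 1]],
  [[-1, 1], [1, -1]],
];

// Lists every cell where neither player gains strictly by moving alone.
function pureEquilibria(game) {
  const found = [];
  for (const r of [0, 1]) {
    for (const c of [0, 1]) {
      const rowGains = game[1 - r][c][0] > game[r][c][0];
      const colGains = game[r][1 - c][1] > game[r][c][1];
      if (!rowGains && !colGains) found.push([r, c]);
    }
  }
  return found;
}

// Row player's expected payoff for a fixed row when the column player
// picks column 0 with probability q.
function rowExpected(game, row, q) {
  return q * game[row][0][0] + (1 - q) * game[row][1][0];
}

console.log("pure equilibria found:", pureEquilibria(pennies).length);
console.log("heads vs a 50/50 opponent:", rowExpected(pennies, 0, 0.5));
console.log("tails vs a 50/50 opponent:", rowExpected(pennies, 1, 0.5));
```

When both rows pay the same against a 50/50 opponent, the row player has no reason to lean either way; by symmetry the same holds for the column player, so both randomising evenly is stable.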

Thought Toys: a cabinet of explorable explanations. Exhibit 27.