The Iterated Prisoner’s Dilemma
The Game
In each round, each of us can choose to cooperate or to defect. We play repeated rounds.
Scoring
- If we both cooperate, we both get 3 points.
- If you cooperate and I defect, I get 5 points and you get none.
- If you defect and I cooperate, you get 5 points and I get none.
- If we both defect, we both get 1 point.
A new discovery
William Press and Freeman Dyson have discovered a new class of strategies that are quite surprising. One thing they found is that I can decide what I want your score to be, and play in such a way that your average score is whatever I decided. The way you play will affect my score, but over the long run there is nothing you can do to affect your own score.
Let us play!
To illustrate this, I’ve decided I want your average score to be 2. You’ll see that, however you play, after a few hundred moves your average score will be approximately 2.
How does it work?
In the game above I (the computer) am playing a very simple strategy. My play is based purely on how both of us played in the previous round:
- If you cooperated last time (whatever I did), then I cooperate with probability 2/3.
- If I cooperated and you defected, then this time I defect.
- If we both defected last time, I cooperate with probability 1/3.
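These three rules are simple enough to simulate. The following Python sketch plays them against a few opponents. The opponent functions, the opening state (both of us treated as having cooperated before round 1), the seed and the number of rounds are all assumptions made for illustration, not details of the game on this page.

```python
import random

# Payoffs from the scoring rules: (my move, your move) -> (my points, your points)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def computer_move(my_last, your_last, rng):
    """Apply the three rules; returns 'C' (cooperate) or 'D' (defect)."""
    if your_last == "C":
        p_cooperate = 2 / 3
    elif my_last == "C":        # I cooperated and you defected
        p_cooperate = 0.0
    else:                       # we both defected
        p_cooperate = 1 / 3
    return "C" if rng.random() < p_cooperate else "D"


# Hypothetical opponents: each sees the computer's last move and its own last move.
def always_cooperate(computer_last, own_last, rng):
    return "C"


def always_defect(computer_last, own_last, rng):
    return "D"


def tit_for_tat(computer_last, own_last, rng):
    return computer_last


def coin_flip(computer_last, own_last, rng):
    return "C" if rng.random() < 0.5 else "D"


def average_opponent_score(opponent, rounds=100_000, seed=0):
    """Average points per round earned by the opponent (the 'you' of the rules)."""
    rng = random.Random(seed)
    my_last, your_last = "C", "C"   # assumed opening state
    total = 0
    for _ in range(rounds):
        mine = computer_move(my_last, your_last, rng)
        yours = opponent(my_last, your_last, rng)
        total += PAYOFF[(mine, yours)][1]
        my_last, your_last = mine, yours
    return total / rounds


if __name__ == "__main__":
    for opponent in (always_cooperate, always_defect, tit_for_tat, coin_flip):
        print(opponent.__name__, round(average_opponent_score(opponent), 3))
```

Under these assumptions each opponent's average should settle close to 2, whichever of the four is chosen.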
The probabilities in these rules were obtained by setting your target average score to 2 and solving equations [8] and [9] in Press and Dyson’s paper.
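As a rough sketch of that calculation: the symbols p_1 to p_4, β and γ below are labels introduced here, and the linear condition is a reading of how Press and Dyson's result applies to these payoffs, not a quotation of their numbered equations. Write p_1, p_2, p_3, p_4 for my probability of cooperating after the outcomes (I cooperated, you cooperated), (I cooperated, you defected), (I defected, you cooperated) and (we both defected). Your payoffs in those four outcomes are 3, 5, 0 and 1.

```latex
\begin{gather*}
(p_1 - 1,\; p_2 - 1,\; p_3,\; p_4) = \beta\,(3,\, 5,\, 0,\, 1) + \gamma\,(1,\, 1,\, 1,\, 1)
  \quad\Longrightarrow\quad \text{your average score} = -\frac{\gamma}{\beta} \\[4pt]
\text{target score } 2:\quad \gamma = -2\beta
  \quad\Longrightarrow\quad (p_1,\, p_2,\, p_3,\, p_4) = (1 + \beta,\; 1 + 3\beta,\; -2\beta,\; -\beta) \\[4pt]
\beta = -\tfrac{1}{3}:\quad (p_1,\, p_2,\, p_3,\, p_4) = \left(\tfrac{2}{3},\; 0,\; \tfrac{2}{3},\; \tfrac{1}{3}\right)
\end{gather*}
```

All four values are valid probabilities only when β lies between −1/3 and 0, and β = −1/3 is the choice that matches the rules listed above. Because p_1 and p_3 are both 2/3, the rule for "you cooperated last time" does not depend on my own previous move.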
Notes
Although this strategy does belong to the new and interesting class of Press-Dyson strategies, this particular type of Press-Dyson strategy (one that forces your opponent to have a particular fixed score, on average, however they play) was described earlier by Maarten C. Boerlijst, Martin A. Nowak and Karl Sigmund in their 1997 paper “Equal Pay for All Prisoners”.
How about some extortion?
Another interesting class of Press-Dyson strategies is the “extortionate” ones. In this example your best strategy (if you want to maximise your own score) is to cooperate all the time – but then I will occasionally defect and so always do better than you.
In fact, however you play, on average over the long run (my score − 1) will be three times (your score − 1). So unless your play leaves us both stuck on 1 point per round, as “always defect” does, I will do better than you.
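The exact probabilities behind this example are not listed here, so the following Python sketch makes some assumptions. It uses the general form that an extortionate strategy takes for these payoffs, with an extortion factor `chi` of 3 and a free scale parameter `phi` set to 1/26. The value of `phi`, the function names, and the opponents are illustrative choices. The sketch does not simulate random play. It tracks the probability of each outcome round by round, so the expected average scores are computed exactly.

```python
# Outcomes are written (my move, your move); payoffs are (my points, your points).
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def extortion_strategy(chi=3.0, phi=1 / 26):
    """My probability of cooperating after each previous outcome.

    chi is the extortion factor. phi = 1/26 is an assumed value; any phi
    that keeps all four numbers between 0 and 1 gives the same relation.
    """
    return {
        ("C", "C"): 1 - 2 * phi * (chi - 1),
        ("C", "D"): 1 - phi * (1 + 4 * chi),
        ("D", "C"): phi * (4 + chi),
        ("D", "D"): 0.0,
    }


def long_run_scores(mine, yours, rounds=20_000, start=("C", "C")):
    """Exact expected average scores (mine, yours) over `rounds` rounds.

    `mine` and `yours` map the previous outcome (my move, your move) to the
    probability that I / you cooperate next. `start` is an assumed outcome
    "before round 1".
    """
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    my_total = your_total = 0.0
    for _ in range(rounds):
        nxt = {s: 0.0 for s in STATES}
        for prev, weight in dist.items():
            for s in STATES:
                p_me = mine[prev] if s[0] == "C" else 1 - mine[prev]
                p_you = yours[prev] if s[1] == "C" else 1 - yours[prev]
                nxt[s] += weight * p_me * p_you
        dist = nxt
        my_total += sum(w * PAYOFF[s][0] for s, w in dist.items())
        your_total += sum(w * PAYOFF[s][1] for s, w in dist.items())
    return my_total / rounds, your_total / rounds


if __name__ == "__main__":
    strategy = extortion_strategy()
    opponents = {
        "always cooperate": {s: 1.0 for s in STATES},
        "coin flip": {s: 0.5 for s in STATES},
        "always defect": {s: 0.0 for s in STATES},
    }
    for name, opponent in opponents.items():
        me, you = long_run_scores(strategy, opponent)
        print(f"{name}: me={me:.3f} you={you:.3f} "
              f"(me-1)={me - 1:.3f} 3*(you-1)={3 * (you - 1):.3f}")
```

With these assumed values, the last two printed numbers should agree closely for every opponent, and against “always defect” both scores settle at 1.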
Try it and see!