Project 2 focuses on the use of higher order procedures, together with data structures. You will also further develop and demonstrate your ability to write clear, intelligible, well-documented procedures, as well as test cases for your procedures.
In the mid-1920's, the Nebraska
State Police achieved what may still be their finest moment. After a 400-mile
car chase over dirt roads and through corn fields, they finally caught up with
the notorious bank robbers Bunny and Clod. The two criminals were brought back
to the police station in Omaha for further interrogation. Bunny and Clod were
questioned in separate rooms, and each was offered the same deal by the police.
The deal went as follows (since both are the same, we need only describe the
version presented to Bunny):
“Bunny, here's the offer
that we are making to both you and Clod. If you both hold out on us and don't
confess to bank robbery, then we admit that we don't have enough proof to
convict you. However, we will be able to jail you both for one year, for
reckless driving and endangerment of corn. If you turn state's witness and help
us convict Clod (assuming he doesn't confess), then you will go free, and Clod
will get twenty years in prison. On the other hand, if you don't confess and
Clod does, then he will go free and you will get twenty years.”
“What happens if both Clod
and I confess?” asked Bunny.
“Then you both get ten
years,” responded the police.
Bunny, who had been a math major
at Cal Tech before turning to crime, reasoned this way: “Suppose Clod
intends to confess. Then if I don't confess, I'll get twenty years, but if I do
confess, I'll only get ten years. On the other hand, suppose Clod intends to
hold out on the cops. Then if I don't confess, I'll go to jail for a year, but
if I do confess, I'll go free. So no matter what Clod intends to do, I am
better off confessing than holding out. So I'd better confess.”
Naturally, Clod employed the very
same reasoning. Both criminals confessed, and both went to jail for ten years.
The police, of course, were triumphant,
since the criminals would have been free in a year had both remained silent.
The Bunny and Clod story is an
example of a situation known in mathematical game theory as the
“prisoner's dilemma.” A prisoner's dilemma always involves two
“game players,” and each has a choice between
“cooperating” and “defecting.” If the two players cooperate,
they each do moderately well; if they both defect, they each do moderately
poorly. If one player cooperates and the other defects, then the defector does
extremely well and the cooperator does extremely poorly. (In the case of the
Bunny and Clod story, “cooperating” means cooperating with one's
partner - i.e. holding out on the police - and “defecting” means
confessing to bank robbery.) Before formalizing the prisoner's dilemma
situation, we need to introduce some basic game theory notation.
In game theory, we differentiate
between a game, and a play. A game refers to the set of
possible choices and outcomes for the entire range of situations. A play
refers to a specific set of choices by the players, with the associated outcome
for that particular scenario. Thus, in game theory, a two-person
binary-choice game is represented by a two-by-two matrix. Here is a
hypothetical game matrix.
|
|
B cooperates |
B defects |
|
A cooperates |
A gets 5 |
A gets 2 |
|
B defects |
A gets 3 |
A gets 1 |
The two players in this case are
called A and B, and the choices are called
“cooperate” and “defect.” Players A and B
can play a single game by separately (and secretly) choosing either to
cooperate or to defect. Once each player has made a choice, he announces it to
the other player; and the two then look up their respective scores in the game
matrix. Each entry in the matrix is a pair of numbers indicating a score for
each player, depending on their choices. Thus, in the example above, if Player A
chooses to cooperate while Player B defects, then A gets 2 points
and B gets 3 points. If both players defect, they each get 1 point.
Note, by the way, that the game matrix is a matter of public knowledge; for
instance, Player A knows before the game even starts that if he and B
both choose to defect, they will each get 1 point.
In an iterated game, the
two players play repeatedly; thus after finishing one game, A and B
may play another. (Admittedly, there is a little confusion in the terminology
here; thus we refer to each iteration as a “play,” which
constitutes a single “round” of the larger, iterated game.) There
are a number of ways in which iterated games may be played; in the simplest
situation, A and B play for some fixed number of rounds (say
200), and before each round, they are able to look at the record of all
previous rounds. For instance, before playing the tenth round of their iterated
game, both A and B are able to study the results of the previous
nine rounds.
The game depicted by the matrix
above is a particularly easy one to analyze. Let's examine the situation from
Player A's point of view (Player B's point of view is identical):
“Suppose B
cooperates. Then I do better by cooperating myself (I receive five points
instead of three). On the other hand, suppose B defects. I still do
better by cooperating (since I get two points instead of one). So no matter
what B does, I am better off cooperating.”
Player B will, of course,
reason the same way, and both will choose to cooperate. In the terminology of
game theory, both A and B have a dominant choice - i.e., a
choice that gives a preferred outcome no matter what the other player chooses
to do. The matrix shown above, by the way, does not represent a
prisoner's dilemma situation, since when both players make their dominant
choice, they also both achieve their highest personal scores. We'll see an
example of a prisoner's dilemma game very shortly.
To re-cap: in any particular game using the above
matrix, we would expect both players to cooperate; and in an iterated game, we
would expect both players to cooperate repeatedly, on every round.
Now consider the following game
matrix:
|
|
B cooperates |
B defects |
|
A cooperates |
A gets 3 |
A gets 0 |
|
A defects |
A gets 5 |
A gets 1 |
In this case, Players A
and B both have a dominant choice - namely, defection. No matter what
Player B does, Player A improves his own score by defecting, and
vice versa.
However, there is something odd
about this game. It seems as though the two players would benefit by choosing
to cooperate. Instead of winning only one point each, they could win three
points each. So the “rational” choice of mutual defection has a
puzzling self-destructive flavor.
The second matrix is an example
of a prisoner's dilemma game situation. Just to formalize the situation, let CC
be the number of points won by each player when they both cooperate; let DD
be the number of points won when both defect; let CD be the number of
points won by the cooperating party when the other defects; and let DC
be the number of points won by the defecting party when the other cooperates.
Then the prisoner's dilemma situation is characterized by the following
conditions:
![]()
![]()
In the second game matrix, we
have
![]()
so both conditions are met. In
the Bunny and Clod story, by the way, you can verify that:
![]()
Again, these values satisfy the
prisoner's dilemma conditions.
In the late 1970's, political
scientist Robert Axelrod held a computer tournament designed to investigate the
prisoner's dilemma situation
.
Contestants in the tournament submitted computer programs that would compete in
an iterated prisoner's dilemma game of approximately two hundred rounds, using
the second matrix above. Each contestant's program played five iterated games
against each of the other programs submitted, and after all games had been
played the scores were tallied.
The contestants in Axelrod's
tournament included professors of political science, mathematics, computer
science, and economics. The winning program - the program with the highest
average score - was submitted by Anatol Rapoport, a professor of psychology at
the University of Toronto. In this problem set, we will pursue Axelrod's
investigations and make up our own Scheme programs to play the iterated
prisoner's dilemma game.
As part of this problem set, we
will be running a similar tournament, but now involving a three-person
prisoner's dilemma.
Before we look at the two-player
program, it is worth speculating on what possible strategies might be employed
in the iterated prisoner's dilemma game. Here are some examples:
Meanie - a program using the Meanie
strategy simply defects on every round of every game.
Sucker - a program using the Sucker
strategy cooperates on every round of every game.
Spastic - this program cooperates or defects on a
random basis.
Democratic - this program cooperates on the first
round. On all subsequent rounds, Democratic examines the history of the
other player's actions, counting the total number of defections and
cooperations by the other player. If the other player's defections outnumber
her cooperations, Democratic will defect; otherwise this strategy will
cooperate.
Eye-for-Eye - this program cooperates on the first
round, and then on every subsequent round it mimics the other player's previous
move. Thus, if the other player cooperates (defects) on the nth round,
then Eye-for-Eye will cooperate (defect) on the (n+1)st round.
All of these strategies are
extremely simple. (Indeed, the first three do not even pay any attention to the
other player; their responses are uninfluenced by the previous rounds of the
game.) Nevertheless, simplicity is not necessarily a disadvantage. Rapoport's
first-prize program employed the Eye-for-Eye strategy, and achieved the
highest average score in a field of far more complicated programs.
A Scheme program for an iterated
prisoner's dilemma game is shown at the end of this problem set. The procedure play-loop pits two players (or, to be more precise,
two “strategies”) against one another for approximately 100 games,
then prints out the average score of each player.
Player strategies are represented
as procedures. Each strategy takes two inputs - its own “history”
(that is, a list of all its previous “plays,” where for convenience
we will use 1 to represent cooperate, and -1 to represent defect) and its
opponent's “history.” The strategy returns either the number 1 for
“cooperate” or the number -1 for “defect.”
At the beginning of an iterated
game, each history is an empty list. As the game progresses, the histories grow
(via extend-history) into lists of 1's and -1's, thus each
history is stored from most recent to least recent. Note how each strategy must
have its own history as its first input. So in play-loop-iter, strat0 has history0 as its first input, and strat1 has history1 as its first input.
The values from the game matrix are
stored in a list named *game-association-list*. This list is used to calculate the scores at the
end of the iterated game.
(define *game-association-list* (list (list (list 1 1) (list 3 3)) (list (list 1 -1) (list 0 5)) (list (list -1 1) (list 5 0)) (list (list -1 -1) (list 1 1))))
Thus, if both players select 1 for
cooperate, the payoff to each player is a 3.
Some sample strategies are given at the
end of the program. Meanie and Sucker are
particularly simple; each returns a constant value regardless of the histories.
Spastic also ignores the histories and chooses
randomly between cooperation and defection. You should study Democratic and Eye-for-Eye to see that their behavior is consistent
with the descriptions in the previous section.
To be able to test out the system, we need
to complete a definition for extract-entry. This procedure's behavior is as follows: it takes as input a play,
represented as a list of choices for each strategy (i.e., a 1 or a -1), and the
game association list. Each entry in the game association list is a list
itself, with a first element representing a list of game choices, and the
second element representing a list of scores for each player. Thus extract-entry wants to search down the game association
list trying to match its first argument against the first element of each entry
in the game association list, one by one. When it succeeds, it returns that
whole entry.
For example, we expect the following
behavior:
(define a-play (make-play 1 -1))
(extract-entry a-play *game-association-list*)
;Value: ((1 -1) (0 5))
Write the procedure extract-entry, and test it out using the above case *game-association-list*. Turn in a copy of your documented
procedure and some test examples.
Use play-loop to play games among the five defined
strategies. Notice how a strategy's performance varies sharply depending on its
opponent. For example, Sucker does quite well against Eye-for-Eye or against another Sucker, but it loses badly to Meanie. Pay special attention to Eye-for-Eye. Notice how it never beats its opponent - but it never loses badly.
Games involving Democratic tend to be slower than other games. Why
is that so? Use order-of-growth notation to explain your answer.
Alyssa P. Hacker, upon seeing the code for
Democratic, suggested the following iterative
version of the procedure:
(define (Democratic my-history other-history) (define (majority-loop cs ds hist) (cond ((empty-history? hist) (if (> ds cs) -1 1)) ((= (most-recent-play hist) 1) (majority-loop (+ 1 cs) ds (rest-of-plays hist))) (else (majority-loop cs (+ 1 ds) (rest-of-plays hist))))) (majority-loop 0 0 other-history))
Compare this procedure with the original
version. Do the orders of growth (in time) for the two procedures differ? Is
the newer version faster?
Write a new strategy eye-for-two-eyes. The strategy should always cooperate
unless the opponent defected on both of the previous two rounds. (Looked at
another way: eye-for-two-eyes
should cooperate if the opponent cooperated on either of the previous two
rounds.) Play eye-for-two-eyes
against other strategies.
Write a procedure make-eye-for-n-eyes. This procedure should take a number as
input and return the appropriate Eye-for-Eye-like strategy. For example, (make-eye-for-n-eyes 2) should return a strategy equivalent to eye-for-two-eyes.
Write a procedure make-rotating-strategy which takes as input two strategies (say,
strat0 and strat1) and an integer (say switch-point). Make-rotating-strategy should return a strategy which plays strat0 for the first switch-point rounds in the iterated game, then
switches to strat1 for the next switch-point rounds, and so on.
Write a procedure niceify, which takes as input a strategy (say strat) and a number between 0 and 1 (call it niceness-factor). The niceify procedure should return a strategy that plays the
same as strat except: when strat defects, the new strategy should have a niceness-factor chance of cooperating. (If niceness-factor is 0, the return strategy performs
exactly the same as strat; if niceness-factor is
1, the returned strategy performs the same as Sucker.)
Use niceify with a low value for niceness-factor - say, 0.1 - to create two new
strategies: slightly-nice-Meanie and slightly-nice-Eye-for-Eye.
So far, all of our prisoner's dilemma
examples have involved two players (and, indeed, most game-theory research on
the prisoner's dilemma has focused on two-player games). But it is possible to
create a prisoner's dilemma game involve three - or even more - players.
Strategies from the two-player game do not
necessarily extend to a three-person game in a natural way. For example, what
does Eye-for-Eye mean? Should the player defect if either
of the opponents defected on the previous round? Or only if both
opponents defected? And are either of these strategies nearly as effective in
the three-player game as Eye-for-Eye is in the two-player game?
Before we analyze the three-player game
more closely, we must introduce some notation for representing the payoffs. We
use a notation similar to that used for the two-player game. For example, we
let DCC represent the payoff to a defecting player if both opponents cooperate.
Note that the first position represents the player under consideration. The
second and third positions represent the opponents.
Another example: CCD represents the payoff
to a cooperating player if one opponent cooperates and the other opponent
defects. Since we assume a symmetric game matrix, CCD could be written as CDC.
The choice is arbitrary.
Now we are ready to discuss the payoffs
for the three-player game. We impose three rules:
1) Defection should be the dominant choice
for each player. In other words, it should always be better for a player to
defect, regardless of what the opponents do. This rule gives three constraints:
![]()
![]()
![]()
2) A player should always be better off if
more of his opponents choose to cooperate. This rule gives:
![]()
![]()
3) If one player's choice is fixed, the
other two players should be left in a two-player prisoner's dilemma. This rule
gives the following constraints:
![]()
![]()
![]()
![]()
We can satisfy all of these constraints
with the following payoffs:
CDD = 0, DDD = 1, CCD = 3, DCD = 5, CCC = 7, DCC = 9.
Revise the Scheme code for the two-player
game to make a three-player iterated game. The program should take three
strategies as input, keep track of three histories, and print out results for
three players. You need to change only three procedures: play-loop, print-out-results and get-scores (although you may also have to change
your definition of extract-entry if you
did not write it in a general enough manner). You also need to change *game-association-list* as follows:
(define *game-association-list* (list (list (list 1 1 1) (list 7 7 7)) (list (list 1 1 -1) (list 3 3 9)) (list (list 1 -1 1) (list 3 9 3)) (list (list -1 1 1) (list 9 3 3)) (list (list 1 -1 -1) (list 0 5 5)) (list (list -1 1 -1) (list 5 0 5)) (list (list -1 -1 1) (list 5 5 0)) (list (list -1 -1 -1) (list 1 1 1))))
Write strategies Sucker-3, Meanie-3, and spastic-3 that will work in a three-player game.
Try them out to make sure your code is working.
Write two new strategies: tough-Eye-for-Eye and soft-Eye-for-Eye. Tough-Eye-for-Eye should defect if either of the
opponents defected on the previous round. Soft-Eye-for-Eye should defect only if both opponents defected on the previous round.
Play some games using these two new strategies.
A natural idea in creating a prisoner's
dilemma strategy is to try and deduce what kind of strategies the other
players might be using. In this problem, we will implement a simple version of
this idea.
First, we need a procedure that takes
three histories as arguments: call them hist-0, hist-1 and hist-2. The idea is
that we wish to characterize the strategy of the player responsible for hist-0. To do this, we are going to build an
intermediary data structure which keeps track of what player-0 did, correlated
with what the other two players did, over the course of the three histories.

Figure 1: Example of the summary data structure, as a tree.
The top level has three pieces, corresponding to the actions of the other
players. The second level has three pieces, listing the number of times the
player cooperated, defected and the total number of times the situation
specified by the actions of the opponents occured.
You should design and implement a data
structure called a history-summary,
with the overall structure shown in Figure 1. The history-summary has three subpieces, one for the case
where both player-1 and player-2 cooperated, one for when one of them
cooperated and the other defected, and a third for when both of these players
defected. For each piece, there is another data structure that keeps track of
the number of times player-0 cooperated, the number of times she defected, and
the total number of examples. You may find it convenient to think of this as a
kind of tree structure. Thus, your first task is to design constructors and
selectors to implement this multilevel abstraction.
Once you have designed your data
abstraction, build a procedure that takes the three histories as arguments, and
returns a history-summary. If
we extract from this data structure the piece corresponding to cooperate-cooperate, this should give us all the information
about what happened when player-1 and player-2 both cooperated. Thus, we should
be able to extract from this piece the number of times player-0 cooperated and
the number of times she defected.
NOTE: the goal of our data structure is to
correlate player-0’s behavior on round n, with player-1 and player-2’s
behavior on round n-1. For example, the result of an implementation, call it make-history-summary, on
an example set of histories is shown below:
(define summary (make-history-summary (list 1 1 1 1) ;hist-0 (list -1 -1 -1 1) ;hist-1 (list -1 -1 1 1))) ;hist-2
summary
;Value: ((1 0 1) (1 0 1) (1 0 1))
Finally, using this data structure, we can
build a new procedure that will return a list of three numbers: the probability
that the hist-0 player
cooperates given that the other two players cooperated on the previous round,
the probability that the hist-0 player cooperates given that only one other player cooperated on the
previous round, and the probability that the hist-0 player cooperates given that both others defected
on the previous round. To fill out some details in this picture, let's look at
a couple of examples. We will call our procedure get-probability-of-c: here are a couple of sample calls.
(define summary (make-history-summary (list 1 1 1 1) ;hist-0 (list -1 -1 -1 1) ;hist-1 (list -1 -1 1 1))) ;hist-2
(get-probability-of-c summary)
;Value: (1 1 1)
(define new-summary (make-history-summary (list 1 1 1 -1 1) (list -1 1 -1 -1 1) (list -1 1 1 1 1)))
(get-probability-of-c new-summary)
;Value: (0.5 1 ())
In the top example, the returned list
indicates that the first player cooperates with probability 1 no matter what
the other two players do. In the bottom example, the first player cooperates
with probability 0.5 when the other two players cooperate; the first player
cooperates with probability 1 when one of the other two players defects; and
since we have no data regarding what happens when both of the other players
defect, our procedure returns () for that case.
Write the get-probability-of-c procedure.
Using this procedure, you should be able
to write some predicate procedures that help in deciphering another player's
strategy. For instance, here are two possibilities:
(define (test-entry index trial) (cond ((null? index) (if (null? trial) true false)) ((null? trial) false) ((= (car index) (car trial)) (test-entry (cdr index) (cdr trial))) (else false)))
(define (is-he-a-fool? hist0 hist1 hist2) (test-entry (list 1 1 1) (get-probability-of-c (make-history-summary hist0 hist1 hist2)))
(define (could-he-be-a-fool? hist0 hist1 hist2) (test-entry (list 1 1 1) (map (lambda (elt) (cond ((null? elt) 1) ((= elt 1) 1) (else 0))) (get-probability-of-c (make-history-summary hist0 hist1 hist2)))))
Use the get-probability-of-c procedure to write a predicate that tests
whether another player is using the soft-Eye-for-Eye strategy from Problem 9. Also, write a new
strategy named dont-tolerate-fools.
This strategy should cooperate for the first ten rounds; on subsequent rounds
it checks (on each round) to see whether the other players might both be
playing Sucker. If our
strategy finds that both other players seem to be cooperating uniformly, it
defects; otherwise, it cooperates.
Write a procedure make-combined-strategies which takes as input two two-player
strategies and a “combining” procedure. Make-combined-strategies should return a three-player
strategy that plays one of the two-player strategies against one of the
opponents, and the other two-player strategy against the other opponent, then
calls the “combining” procedure on the two two-player results.
Here's an example: this call to make-combined-strategies returns a strategy equivalent to tough-Eye-for-Eye in Problem 9.
(make-combined-strategies Eye-for-Eye Eye-for-Eye (lambda (r1 r2) (if (or (= r1 -1) (= r2 -1)) -1 1)))
The resulting strategy plays Eye-for-Eye against each opponent, and then calls the
combining procedure on the two results. If either of the two two-player
strategies has returned -1, then the three-player strategy will also return -1.
Here's another example. This call to make-combined-strategies returns a three-player strategy that plays
Eye-for-Eye against one opponent, Democratic against another, and chooses randomly
between the two results:
(make-combined-strategies Eye-for-Eye Democratic (lambda (r1 r2) (if (= (random 2) 0) r1 r2)))
For each problem below above, include your code (with identification of the problem number being solved), as well as comments and explanations of your code, and demonstrate your code’s functionality against a set of test cases. Once you have completed this project, your file should be submitted electronically on the 6.001 on-line tutor, using the Submit Project Files button.
Remember that this is Project 2; when you are have completed all the work and saved it in a file, upload that file and submit it for Project 2.
As described earlier, Axelrod held two
computer tournaments to investigate the two-player prisoner's dilemma. We are
going to hold a three-player tournament. You can participate by designing a
strategy for the tournament. You might submit one of the strategies developed
in the problem set, or develop a new one. The only restriction is that the
strategy must work against any other legitimate entry. Any strategies that
cause the tournament software to crash will be disqualified. If you wish to
submit an entry strategy, you should:
The tournament will depend somewhat on the
number of submitted entries. We will try to make the tournament as complete as
possible (i.e., every strategy plays against every other pair). Each game will
consist of approximately 100 rounds.
Well, actually they didn't go to jail.
When they were in court, and heard that they had both turned state's witness,
they strangled each other. But that's another story.
Actually, there were two tournaments.
Their rules and results are described in Axelrod's book: The Evolution of
Cooperation.
Actually, there is no universal definition
for the multi-player prisoner's dilemma. The constraints used here represent
one possible version of the three-player prisoner's dilemma.