In the discussion of noisy observables in section 14.3, we proposed in passing the following sort of model for repeated play of the prisoners' dilemma. Players choose their intended strategies on each round, passing instructions to a referee for implementation. The referee then implements those intentions "noisily": If told to play a noncooperative strategy, the referee plays this strategy with probability .8, but with probability .2, the referee plays cooperatively. If told to play cooperatively, the referee plays cooperatively with probability .8 and noncooperatively with probability .2. Players know what instructions they sent, and at the end of each round the referee's implementations of both strategies are revealed to both players. But players don't know what instructions their opponents sent. Assume that q = .9 and that the random ending of the game in any round and the referee's random implementations of strategies are all mutually and serially independent as random variables.
(a) Prove that it is a Nash equilibrium for players always to send instructions to play noncooperatively. What is the expected payoff to the players in this equilibrium?
(b) Suppose players adopt the strategy: Instruct the referee to play cooperatively until either player fails to cooperate (in terms of the implemented strategy) and thereafter instruct the referee to play noncooperatively. Is this an equilibrium? If so, what are the expected payoffs to the players in this equilibrium?
(c) Suppose players adopt the strategy: Instruct the referee to play cooperatively until either player fails to cooperate. Then instruct the referee to play noncooperatively for N periods. Then instruct the referee to play cooperatively again until the next incident of noncooperation. For which N is this an equilibrium? For the smallest such N, what are the expected payoffs to the players in this equilibrium?
(d) Now suppose we have a less capricious referee - one that follows instructions with probability .95. Redo parts (a), (b), and (c).
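
As a starting point for the computations in parts (a) through (d), the referee's noise can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the problem: the function names, the labels 'C' and 'D' for cooperative and noncooperative play, and the `accuracy` parameter are all assumptions introduced here. The sketch also assumes that q is the probability that the game continues after each round, and it deliberately leaves out stage-game payoffs, which must be taken from the text.

```python
from itertools import product


def flip(action):
    """Return the opposite action; 'C' and 'D' are labels assumed here."""
    return "D" if action == "C" else "C"


def implemented_distribution(instr1, instr2, accuracy=0.8):
    """Distribution over implemented action pairs, given both instructions.

    The referee follows each instruction independently with probability
    `accuracy` (.8 in the main problem, .95 in part (d)).
    """
    dist = {}
    for ok1, ok2 in product([True, False], repeat=2):
        a1 = instr1 if ok1 else flip(instr1)
        a2 = instr2 if ok2 else flip(instr2)
        p1 = accuracy if ok1 else 1 - accuracy
        p2 = accuracy if ok2 else 1 - accuracy
        dist[(a1, a2)] = dist.get((a1, a2), 0.0) + p1 * p2
    return dist


def expected_rounds(q=0.9):
    """Mean number of rounds, assuming q is the per-round continuation probability."""
    return 1 / (1 - q)


if __name__ == "__main__":
    for outcome, prob in sorted(implemented_distribution("C", "C").items()):
        print(outcome, round(prob, 4))
    print("expected rounds:", round(expected_rounds(), 4))
```

When both players instruct cooperation, both implemented actions are cooperative with probability .8 × .8 = .64, so a punishment phase of the kind described in parts (b) and (c) is triggered in a given round with probability .36. Passing `accuracy=0.95` gives the corresponding figures for part (d).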