The best reply dynamic is usually termed the Cournot adjustment model or Cournot learning, after Augustin Cournot, who first proposed it in the context of a duopoly model. Each of the two players selects strategies successively. In each period, a firm chooses the action that is its best response to the action chosen by the rival firm in the previous period. Cournot noted that this process converges to the Nash equilibrium in some duopoly games in which firms successively choose output. The best reply dynamic can be seen as an extreme form of fictitious play in which each firm believes that its opponent uses the same strategy in every period, namely the one it most recently used.
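The adjustment process described above can be illustrated with a minimal sketch. It assumes a textbook linear duopoly that is not specified in the text: inverse demand P = a - b(q1 + q2), a common constant marginal cost c, and both firms updating in every period against the rival's previous output. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def best_response(q_rival, a=100.0, b=1.0, c=10.0):
    """Profit-maximising output against a rival producing q_rival.

    Assumes linear inverse demand P = a - b*(q1 + q2) and constant
    marginal cost c (illustrative assumptions, not from the text).
    """
    return max(0.0, (a - c - b * q_rival) / (2.0 * b))


def cournot_adjustment(q1, q2, periods=50, a=100.0, b=1.0, c=10.0):
    """Iterate the best reply dynamic and return the path of output pairs."""
    path = [(q1, q2)]
    for _ in range(periods):
        # Each firm best-responds to the rival's output from the previous period.
        q1, q2 = best_response(q2, a, b, c), best_response(q1, a, b, c)
        path.append((q1, q2))
    return path


# Hypothetical starting outputs, far from equilibrium.
path = cournot_adjustment(q1=5.0, q2=60.0)

# Symmetric Nash equilibrium output in this assumed model: (a - c) / (3b).
nash = (100.0 - 10.0) / (3.0 * 1.0)

print(path[-1], nash)
```

Under these assumed parameters, each firm's distance from the Nash output of 30 halves in every period, so the path converges. This is one of the cases in which the process converges; it is not a general guarantee for other demand or cost structures.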