Our Pavlov learns by conditioned response, through rewards and punishments, to cooperate or defect. We analyze an extended-play Prisoner's Dilemma in which Pavlov faces various opponents, and we compute the time and cost required to train Pavlov to cooperate. Among our results, Pavlov and his clone learn to cooperate with each other more rapidly than Pavlov does when playing against the Tit for Tat strategy. This fact has implications for the evolution of cooperation.
Keywords: game theory; prisoner's dilemma; Markov chain; evolution of cooperation