TY - JOUR
T1 - Limiting Dynamics for Q-Learning with Memory One in Symmetric Two-Player, Two-Action Games
AU - Meylahn, J.M.
AU - Janssen, L.
PY - 2022/11/8
Y1 - 2022/11/8
AB - We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner’s dilemma, stag hunt, and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
U2 - 10.1155/2022/4830491
DO - 10.1155/2022/4830491
M3 - Article
SN - 1076-2787
VL - 2022
JO - Complexity
JF - Complexity
M1 - 4830491
ER -