Model-Free Reinforcement Learning for Stochastic Parity Games

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

5 Citations (Scopus)
13 Downloads (Pure)

Abstract

This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena (a stochastic game graph with unknown but fixed probability distributions) to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on experimental evaluations of both reductions.
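
To make concrete the kind of model-free learning such a reduction enables, the following is a minimal sketch, not the paper's construction: it omits the ε-parametrised part of the reduction and simply runs tabular Q-learning on a turn-based stochastic reachability game. The arena interface (`states`, `actions`, `owner`, `step`, `target`) consists of hypothetical placeholders to be supplied by the user, and the update backs up a max over actions at Max-owned states and a min at Min-owned states.

```python
import random
from collections import defaultdict


def minimax_q_learning(states, actions, owner, step, target,
                       episodes=10_000, horizon=200,
                       alpha=0.1, gamma=0.99, explore=0.1):
    """Tabular Q-learning sketch for a turn-based stochastic reachability game.

    `states` is a list of states, `actions(s)` the actions available in s,
    `owner(s)` returns "max" or "min", `step(s, a)` samples a successor from
    the unknown transition distribution, and `target` is the set of goal
    states.  All of these are assumed, illustrative interfaces.
    """
    Q = defaultdict(float)

    def best_value(s):
        # Max player maximizes, Min player minimizes over available actions.
        vals = [Q[(s, a)] for a in actions(s)]
        return max(vals) if owner(s) == "max" else min(vals)

    def greedy_action(s):
        key = lambda a: Q[(s, a)]
        return (max(actions(s), key=key) if owner(s) == "max"
                else min(actions(s), key=key))

    for _ in range(episodes):
        s = random.choice(states)              # explore from a random start
        for _ in range(horizon):               # bounded episode length
            if s in target:
                break
            a = (random.choice(actions(s)) if random.random() < explore
                 else greedy_action(s))
            s2 = step(s, a)                    # sample the unknown dynamics
            r = 1.0 if s2 in target else 0.0   # reward only for reaching the target
            td_target = r if s2 in target else gamma * best_value(s2)
            Q[(s, a)] += alpha * (td_target - Q[(s, a)])
            s = s2
    return Q
```

The learner never queries the transition probabilities; it only samples them through `step`, which is the model-free property the abstract's reduction relies on.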
Original language: English
Title of host publication: 31st International Conference on Concurrency Theory (CONCUR 2020)
Editors: Igor Konnov, Laura Kovacs
Place of Publication: Leibniz
Publisher: Dagstuhl
Number of pages: 16
ISBN (Electronic): 9783959771603
ISBN (Print): 978-3-95977-160-3
DOIs
Publication status: Published - 2020
Event: 31st International Conference on Concurrency Theory, CONCUR 2020 - Online
Duration: 1 Sep 2020 – 4 Sep 2020
Conference number: 31

Publication series

Name: Leibniz International Proceedings in Informatics (LIPIcs)
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Volume: 171
ISSN (Print): 1868-8969

Conference

Conference: 31st International Conference on Concurrency Theory, CONCUR 2020
Abbreviated title: CONCUR
Period: 1/09/20 – 4/09/20

Keywords

  • Reinforcement learning
  • Stochastic games
  • Omega-regular objectives
