Good-for-mdps automata for probabilistic analysis and reinforcement learning

Ernst Moritz Hahn*, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

21 Citations (Scopus)
96 Downloads (Pure)

Abstract

We characterize the class of nondeterministic ω-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata ‘good-for-MDPs’ (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit state-space reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties—they are Büchi automata with low branching degree obtained through a simple construction—and show that going beyond limit-deterministic automata may significantly benefit reinforcement learning.

Original languageEnglish
Title of host publicationTools and Algorithms for the Construction and Analysis of Systems
Subtitle of host publication26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25–30, 2020, Proceedings
EditorsArmin Biere, David Parker
Place of PublicationCham
PublisherSpringer
Pages306-323
Number of pages18
VolumePart I
ISBN (Electronic)978-3-030-45190-5
ISBN (Print)978-3-030-45189-9
DOIs
Publication statusPublished - 2020
Externally publishedYes
Event26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020 - Dublin, Ireland
Duration: 25 Apr 202030 Apr 2020
Conference number: 26

Publication series

NameLecture Notes in Computer Science
PublisherSpringer
Volume12078
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020
Abbreviated titleTACAS
Country/TerritoryIreland
CityDublin
Period25/04/2030/04/20

Fingerprint

Dive into the research topics of 'Good-for-mdps automata for probabilistic analysis and reinforcement learning'. Together they form a unique fingerprint.

Cite this