Alternating Good-for-MDPs Automata

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

1 Citation (Scopus)
24 Downloads (Pure)

Abstract

When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are nondeterministic Büchi automata with a restricted type of nondeterminism, albeit not as restricted as in good-for-games automata. Indeed, deterministic Rabin automata have a pretty straightforward translation to such GFM automata, which is bi-linear in the number of states and pairs. Interestingly, the same cannot be said for deterministic Streett automata: a translation to nondeterministic Rabin or Büchi automata comes at an exponential cost, even without requiring the target automaton to be good-for-MDPs. Do we have to pay more than that to obtain a good-for-MDPs automaton? The surprising answer is that we have to pay significantly less when we instead expand the good-for-MDPs property to alternating automata: like the nondeterministic GFM automata obtained from deterministic Rabin automata, the alternating good-for-MDPs automata we produce from deterministic Streett automata are bi-linear in the size of the deterministic automaton and its index. They can therefore be exponentially more succinct than the minimal nondeterministic Büchi automaton.
Original languageEnglish
Title of host publicationAutomated Technology for Verification and Analysis - 20th International Symposium, ATVA 2022, Virtual Event, October 25-28, 2022, Proceedings
EditorsAhmed Bouajjani, Lukás Holík, Zhilin Wu
PublisherSpringer Nature
Pages303-319
Number of pages17
ISBN (Electronic)978-3-031-19992-9
ISBN (Print)978-3-031-19991-2
DOIs
Publication statusPublished - 21 Oct 2022
Event20th International Symposium on Automated Technology for Verification and Analysis, ATVA 2022 - Virtual Event
Duration: 25 Oct 202228 Oct 2022
Conference number: 20

Publication series

NameLecture Notes in Computer Science
PublisherSpringer
Volume13505

Conference

Conference20th International Symposium on Automated Technology for Verification and Analysis, ATVA 2022
Abbreviated titleATVA 2022
CityVirtual Event
Period25/10/2228/10/22

Keywords

  • 2023 OA procedure

Fingerprint

Dive into the research topics of 'Alternating Good-for-MDPs Automata'. Together they form a unique fingerprint.

Cite this