Abstract
When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are nondeterministic Büchi automata with a restricted type of nondeterminism, albeit not as restricted as in good-for-games automata. Indeed, deterministic Rabin automata have a pretty straightforward translation to such GFM automata, which is bi-linear in the number of states and pairs. Interestingly, the same cannot be said for deterministic Streett automata: a translation to nondeterministic Rabin or Büchi automata comes at an exponential cost, even without requiring the target automaton to be good-for-MDPs. Do we have to pay more than that to obtain a good-for-MDPs automaton? The surprising answer is that we have to pay significantly less when we instead expand the good-for-MDPs property to alternating automata: like the nondeterministic GFM automata obtained from deterministic Rabin automata, the alternating good-for-MDPs automata we produce from deterministic Streett automata are bi-linear in the size of the deterministic automaton and its index. They can therefore be exponentially more succinct than the minimal nondeterministic Büchi automaton.
Original language | English |
---|---|
Title of host publication | Automated Technology for Verification and Analysis - 20th International Symposium, ATVA 2022, Virtual Event, October 25-28, 2022, Proceedings |
Editors | Ahmed Bouajjani, Lukás Holík, Zhilin Wu |
Publisher | Springer |
Pages | 303-319 |
Number of pages | 17 |
ISBN (Electronic) | 978-3-031-19992-9 |
ISBN (Print) | 978-3-031-19991-2 |
DOIs | |
Publication status | Published - 21 Oct 2022 |
Event | 20th International Symposium on Automated Technology for Verification and Analysis, ATVA 2022 - Virtual Event Duration: 25 Oct 2022 → 28 Oct 2022 Conference number: 20 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 13505 |
Conference
Conference | 20th International Symposium on Automated Technology for Verification and Analysis, ATVA 2022 |
---|---|
Abbreviated title | ATVA 2022 |
City | Virtual Event |
Period | 25/10/22 → 28/10/22 |
Keywords
- 2023 OA procedure