Abstract
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of ω-regular objectives to an almost-sure reachability problem, and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. We compile ω-regular properties into limit-deterministic Büchi automata instead of the traditional Rabin automata; this choice sidesteps difficulties that have marred previous proposals. Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. We present an experimental evaluation of our technique on benchmark learning problems.
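The reduction described in the abstract can be illustrated with a small sketch. The abstract does not give the construction, so everything here is an assumption made for illustration: the toy product of an MDP with a Büchi automaton, the parameter `zeta`, the sink name `TARGET`, and the choice of tabular Q-learning as the off-the-shelf learner. Each accepting transition is redirected to a fresh target sink with probability `1 - zeta`, so runs that take accepting transitions infinitely often reach the sink almost surely, and a reward of 1 for reaching it turns the objective into reachability.

```python
import random

TARGET = "target"  # hypothetical sink state added by the reduction


def to_reachability(product, zeta):
    """Redirect each accepting transition to TARGET with probability 1 - zeta."""
    reduced = {}
    for state, actions in product.items():
        reduced[state] = {}
        for action, outcomes in actions.items():
            new_outcomes = []
            for prob, nxt, accepting in outcomes:
                if accepting:
                    new_outcomes.append((prob * zeta, nxt))
                    new_outcomes.append((prob * (1.0 - zeta), TARGET))
                else:
                    new_outcomes.append((prob, nxt))
            reduced[state][action] = new_outcomes
    return reduced


def sample(outcomes, rng):
    """Draw a successor state; the learner only ever sees these samples."""
    r, acc = rng.random(), 0.0
    for prob, nxt in outcomes:
        acc += prob
        if r < acc:
            return nxt
    return outcomes[-1][1]


def q_learning(reduced, start, episodes=2000, max_steps=50,
               alpha=0.1, epsilon=0.2, seed=0):
    """Undiscounted tabular Q-learning with reward 1 for reaching TARGET."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in reduced.items()}
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            actions = list(q[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            nxt = sample(reduced[state][action], rng)
            # Reaching TARGET is worth 1; otherwise bootstrap from the successor.
            value = 1.0 if nxt == TARGET else max(q[nxt].values())
            q[state][action] += alpha * (value - q[state][action])
            if nxt == TARGET:
                break
            state = nxt
    return q


# Toy product (assumed): state -> action -> [(probability, successor, accepting?)].
# Action "a" leads to a loop of accepting transitions, "b" to a non-accepting loop.
product = {
    0: {"a": [(1.0, 1, False)], "b": [(1.0, 2, False)]},
    1: {"loop": [(1.0, 1, True)]},
    2: {"loop": [(1.0, 2, False)]},
}
q = q_learning(to_reachability(product, zeta=0.9), start=0)
best_action = max(q[0], key=q[0].get)
print(best_action)
```

In this toy example the learner never inspects the transition table directly; it only samples successors, which is the sense in which the sketch is model-free. The greedy action in state 0 is the one that reaches the accepting loop.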
Original language | English |
---|---|
Title of host publication | Tools and Algorithms for the Construction and Analysis of Systems |
Subtitle of host publication | 25th International Conference, TACAS 2019, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, Prague, Czech Republic, April 6–11, 2019, Proceedings |
Editors | Tomáš Vojnar, Lijun Zhang |
Place of Publication | Cham |
Publisher | Springer |
Pages | 395-412 |
Volume | Part I |
ISBN (Electronic) | 978-3-030-17462-0 |
ISBN (Print) | 978-3-030-17461-3 |
Publication status | Published - 2019 |
Event | 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems conference series, TACAS 2019 - Charles University, Prague, Czech Republic. Duration: 6 Apr 2019 → 11 Apr 2019. Conference number: 25 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 11427 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems conference series, TACAS 2019 |
---|---|
Abbreviated title | TACAS 2019 |
Country/Territory | Czech Republic |
City | Prague |
Period | 6 Apr 2019 → 11 Apr 2019 |
Other | Held as part of the 22nd European Joint Conferences on Theory and Practice of Software, ETAPS 2019 |