Abstract
Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two “sound” variations, which also deliver an upper bound, have recently appeared. In this paper, we present a new sound approach that leverages value iteration’s ability to usually deliver good lower bounds: we obtain a lower bound via standard value iteration, use the result to “guess” an upper bound, and prove the latter’s correctness. We present this optimistic value iteration approach for computing reachability probabilities as well as expected rewards. It is easy to implement and performs well, as we show via an extensive experimental evaluation using our implementation within the mcsta model checker of the Modest Toolset.
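The three-step recipe described in the abstract (iterate from below, guess an upper bound, verify the guess) can be illustrated with a minimal sketch for maximal reachability probabilities. This is not the paper's exact algorithm: the MDP encoding as a Python dict, the ε-bump used for the guess, the single pre-fixed-point check, and the threshold-halving retry schedule are all assumptions made for illustration. The check itself is sound, since any vector that the Bellman operator maps below itself is a pre-fixed point and therefore bounds the least fixed point from above.

```python
def bellman(mdp, goal, values):
    """One step of the max-Bellman operator for reachability probabilities.

    mdp: dict mapping each state to a list of actions, where an action is a
         list of (probability, successor) pairs. Every non-goal state is
         assumed to have at least one action (e.g. a self-loop for deadlocks).
    goal: set of goal states, whose value is 1 by definition.
    """
    return {
        s: 1.0 if s in goal
        else max(sum(p * values[t] for p, t in act) for act in actions)
        for s, actions in mdp.items()
    }

def optimistic_value_iteration(mdp, goal, epsilon=1e-6):
    """Sketch of the optimistic scheme from the abstract: run standard value
    iteration from below, guess an upper bound by adding epsilon, and accept
    the guess if the Bellman operator maps it below itself. The retry
    schedule (halving the VI threshold) is an assumption, not necessarily
    the paper's strategy."""
    lower = {s: 1.0 if s in goal else 0.0 for s in mdp}
    threshold = epsilon
    while True:
        # Phase 1: standard (unsound) value iteration from below.
        while True:
            nxt = bellman(mdp, goal, lower)
            diff = max(abs(nxt[s] - lower[s]) for s in mdp)
            lower = nxt
            if diff <= threshold:
                break
        # Phase 2: optimistically guess an upper bound and try to verify it.
        upper = {s: min(lower[s] + epsilon, 1.0) for s in mdp}
        cand = bellman(mdp, goal, upper)
        if all(cand[s] <= upper[s] + 1e-15 for s in mdp):
            return lower, upper  # verified: the true value lies in [lower, upper]
        threshold /= 2  # guess not verifiable yet: iterate further and retry

if __name__ == "__main__":
    # Toy MDP: from s0, either gamble (0.5 to goal) or move to s1 (0.8 to goal).
    mdp = {
        "s0": [[(0.5, "goal"), (0.5, "sink")], [(1.0, "s1")]],
        "s1": [[(0.8, "goal"), (0.2, "sink")]],
        "goal": [[(1.0, "goal")]],
        "sink": [[(1.0, "sink")]],
    }
    lo, hi = optimistic_value_iteration(mdp, {"goal"})
    print(f"Pmax(reach goal) from s0 in [{lo['s0']:.7f}, {hi['s0']:.7f}]")
```

On this toy model the verification succeeds on the first guess; on harder models the loop tightens the lower bound and retries, which mirrors the idea of leveraging value iteration's usually good lower bounds. The paper's full algorithm additionally covers expected rewards and uses a more refined verification phase.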
Original language | English |
---|---|
Title of host publication | Computer Aided Verification |
Subtitle of host publication | 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21–24, 2020, Proceedings, Part II |
Editors | Shuvendu K. Lahiri, Chao Wang |
Publisher | Springer |
Pages | 488-511 |
Number of pages | 24 |
ISBN (Electronic) | 978-3-030-53291-8 |
ISBN (Print) | 978-3-030-53290-1 |
DOIs | |
Publication status | Published - 14 Jul 2020 |
Event | 32nd International Conference on Computer-Aided Verification, CAV 2020 – Virtual Conference, Los Angeles, United States. Duration: 19 Jul 2020 → 24 Jul 2020. Conference number: 32. http://i-cav.org/2020/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 12225 |
Conference
Conference | 32nd International Conference on Computer-Aided Verification, CAV 2020 |
---|---|
Abbreviated title | CAV 2020 |
Country/Territory | United States |
City | Los Angeles |
Period | 19/07/20 → 24/07/20 |
Internet address | http://i-cav.org/2020/ |