Abstract
We apply reinforcement learning to approximate the optimal probability that a stochastic hybrid system satisfies a temporal logic formula. We consider systems with (non)linear continuous dynamics, random events following general continuous probability distributions, and discrete nondeterministic choices. We present a discretized view of states to the learner, but simulate the continuous system. Once we have learned a near-optimal scheduler resolving the choices, we use statistical model checking to estimate its probability of satisfying the formula. We implemented the approach using Q-learning in the tools HYPEG and modes, which support Petri net- and hybrid automata-based models, respectively. Via two case studies, we show the feasibility of the approach and compare its performance and effectiveness to existing analytical techniques for a linear model. We find that our new approach quickly finds near-optimal prophetic as well as non-prophetic schedulers that maximize or minimize the probability that a given signal temporal logic property is satisfied.
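As a rough illustration of the two-phase workflow described in the abstract, the following Python sketch learns Q-values over grid-discretized states while the underlying simulation remains continuous, and then estimates the satisfaction probability of the learned greedy scheduler by Monte Carlo simulation (statistical model checking). The environment interface (`reset`, `step`, `actions`), the grid width, and the terminal reward encoding are illustrative assumptions, not the implementation in HYPEG or modes.

```python
import random
from collections import defaultdict

def discretize(state, grid=0.5):
    """Map a continuous state (a tuple of floats) onto a coarse grid cell."""
    return tuple(round(x / grid) for x in state)

def q_learning(env, episodes=10_000, alpha=0.1, gamma=1.0, eps=0.1):
    """Learn Q-values over discretized states; reward 1 iff the property holds
    at the end of the run (hypothetical env interface)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False   # continuous initial state
        while not done:
            cell = discretize(state)
            # epsilon-greedy over the discrete nondeterministic choices
            if random.random() < eps:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(cell, a)])
            # simulate the continuous system (dynamics, random event delays)
            next_state, done, satisfied = env.step(action)
            reward = 1.0 if (done and satisfied) else 0.0
            next_cell = discretize(next_state)
            best_next = 0.0 if done else max(Q[(next_cell, a)] for a in env.actions)
            Q[(cell, action)] += alpha * (reward + gamma * best_next - Q[(cell, action)])
            state = next_state
    return Q

def estimate_probability(env, Q, runs=100_000):
    """Statistical model checking: Monte Carlo estimate of the satisfaction
    probability under the greedy scheduler induced by Q."""
    hits = 0
    for _ in range(runs):
        state, done = env.reset(), False
        while not done:
            cell = discretize(state)
            action = max(env.actions, key=lambda a: Q[(cell, a)])
            state, done, satisfied = env.step(action)
        hits += satisfied
    return hits / runs
```

In the actual tools, the reward would come from checking the signal temporal logic property on the simulated trajectory; here a boolean `satisfied` flag returned by the hypothetical environment stands in for that check.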
Original language | English |
---|---|
Title of host publication | MEMOCODE '21 |
Subtitle of host publication | Proceedings of the 19th ACM-IEEE International Conference on Formal Methods and Models for System Design, Virtual Event, China, November 20-22, 2021
Editors | S. Arun-Kumar, Dominique Méry, Indranil Saha, Lijun Zhang |
Publisher | ACM Press |
Pages | 44-55 |
Number of pages | 12 |
ISBN (Print) | 978-1-4503-9127-6 |
DOIs | |
Publication status | Published - 20 Nov 2021 |
Event | 19th ACM-IEEE International Conference on Formal Methods and Models for System Design, MEMOCODE 2021 - Virtual Event
Duration | 20 Nov 2021 → 22 Nov 2021
Conference number | 19
Conference
Conference | 19th ACM-IEEE International Conference on Formal Methods and Models for System Design, MEMOCODE 2021 |
---|---|
Abbreviated title | MEMOCODE 2021 |
City | Virtual Event |
Period | 20/11/21 → 22/11/21 |