Abstract

Continuous-time Markov decision processes (CTMDPs) are widely used for the control of queueing systems, epidemic and manufacturing processes. Various results on optimal schedulers for discounted and average reward optimality criteria in CTMDPs are known, but the typical game-theoretic winning objectives have received scant attention so far. This paper studies various sorts of reachability objectives for CTMDPs. Memoryless schedulers are optimal for simple reachability objectives as it suffices to consider the embedded MDP. Schedulers that may count the number of visits to states are optimal---when restricting to time-abstract schedulers---for timed reachability in uniform CTMDPs. The central result is that for any CTMDP, reward reachability objectives are dual to timed ones. As a corollary, epsilon-optimal schedulers for reward reachability objectives in uniform CTMDPs can be obtained in polynomial time using a simple backward greedy algorithm.
Original languageUndefined
Title of host publicationLogic and Automata: History and Perspectives
EditorsJ. Flum, E. Graedel, Th. Wilke
Place of PublicationAmsterdam
PublisherAmsterdam University Press
Pages53-71
Number of pages19
ISBN (Print)978-90-5356-576-6
StatePublished - Feb 2008

Publication series

NameTexts in Logic and Games
PublisherAmsterdam University Press
Volume2

Fingerprint

Polynomials

Keywords

  • EWI-11634
  • METIS-250854
  • IR-64553

Cite this

Baier, C., Haverkort, B. R. H. M., Hermanns, H., & Katoen, J. P. (2008). Reachability in continuous-time Markov reward decision processes. In J. Flum, E. Graedel, & T. Wilke (Eds.), Logic and Automata: History and Perspectives (pp. 53-71). (Texts in Logic and Games; Vol. 2). Amsterdam: Amsterdam University Press.

Baier, Christel; Haverkort, Boudewijn R.H.M.; Hermanns, H.; Katoen, Joost P. / Reachability in continuous-time Markov reward decision processes.

Logic and Automata: History and Perspectives. ed. / J. Flum; E. Graedel; Th. Wilke. Amsterdam : Amsterdam University Press, 2008. p. 53-71 (Texts in Logic and Games; Vol. 2).

Research output: Scientific - peer-reviewConference contribution

@inbook{b7dec19e91e54ebe948e4f5d7b0f917e,
title = "Reachability in continuous-time Markov reward decision processes",
abstract = "Continuous-time Markov decision processes (CTMDPs) are widely used for the control of queueing systems, epidemic and manufacturing processes. Various results on optimal schedulers for discounted and average reward optimality criteria in CTMDPs are known, but the typical game-theoretic winning objectives have received scant attention so far. This paper studies various sorts of reachability objectives for CTMDPs. Memoryless schedulers are optimal for simple reachability objectives as it suffices to consider the embedded MDP. Schedulers that may count the number of visits to states are optimal---when restricting to time-abstract schedulers---for timed reachability in uniform CTMDPs. The central result is that for any CTMDP, reward reachability objectives are dual to timed ones. As a corollary, epsilon-optimal schedulers for reward reachability objectives in uniform CTMDPs can be obtained in polynomial time using a simple backward greedy algorithm.",
keywords = "EWI-11634, METIS-250854, IR-64553",
author = "Christel Baier and Haverkort, {Boudewijn R.H.M.} and H. Hermanns and Katoen, {Joost P.}",
year = "2008",
month = "2",
isbn = "978-90-5356-576-6",
series = "Texts in Logic and Games",
publisher = "Amsterdam University Press",
pages = "53--71",
editor = "J. Flum and E. Graedel and Th. Wilke",
booktitle = "Logic and Automata: History and Perspectives",
address = "Netherlands",

}

Baier, C, Haverkort, BRHM, Hermanns, H & Katoen, JP 2008, Reachability in continuous-time Markov reward decision processes. in J Flum, E Graedel & T Wilke (eds), Logic and Automata: History and Perspectives. Texts in Logic and Games, vol. 2, Amsterdam University Press, Amsterdam, pp. 53-71.

Reachability in continuous-time Markov reward decision processes. / Baier, Christel; Haverkort, Boudewijn R.H.M.; Hermanns, H.; Katoen, Joost P.

Logic and Automata: History and Perspectives. ed. / J. Flum; E. Graedel; Th. Wilke. Amsterdam : Amsterdam University Press, 2008. p. 53-71 (Texts in Logic and Games; Vol. 2).

Research output: Scientific - peer-reviewConference contribution

TY - CHAP

T1 - Reachability in continuous-time Markov reward decision processes

AU - Baier,Christel

AU - Haverkort,Boudewijn R.H.M.

AU - Hermanns,H.

AU - Katoen,Joost P.

PY - 2008/2

Y1 - 2008/2

N2 - Continuous-time Markov decision processes (CTMDPs) are widely used for the control of queueing systems, epidemic and manufacturing processes. Various results on optimal schedulers for discounted and average reward optimality criteria in CTMDPs are known, but the typical game-theoretic winning objectives have received scant attention so far. This paper studies various sorts of reachability objectives for CTMDPs. Memoryless schedulers are optimal for simple reachability objectives as it suffices to consider the embedded MDP. Schedulers that may count the number of visits to states are optimal---when restricting to time-abstract schedulers---for timed reachability in uniform CTMDPs. The central result is that for any CTMDP, reward reachability objectives are dual to timed ones. As a corollary, epsilon-optimal schedulers for reward reachability objectives in uniform CTMDPs can be obtained in polynomial time using a simple backward greedy algorithm.

AB - Continuous-time Markov decision processes (CTMDPs) are widely used for the control of queueing systems, epidemic and manufacturing processes. Various results on optimal schedulers for discounted and average reward optimality criteria in CTMDPs are known, but the typical game-theoretic winning objectives have received scant attention so far. This paper studies various sorts of reachability objectives for CTMDPs. Memoryless schedulers are optimal for simple reachability objectives as it suffices to consider the embedded MDP. Schedulers that may count the number of visits to states are optimal---when restricting to time-abstract schedulers---for timed reachability in uniform CTMDPs. The central result is that for any CTMDP, reward reachability objectives are dual to timed ones. As a corollary, epsilon-optimal schedulers for reward reachability objectives in uniform CTMDPs can be obtained in polynomial time using a simple backward greedy algorithm.

KW - EWI-11634

KW - METIS-250854

KW - IR-64553

M3 - Conference contribution

SN - 978-90-5356-576-6

T3 - Texts in Logic and Games

SP - 53

EP - 71

BT - Logic and Automata: History and Perspectives

PB - Amsterdam University Press

ER -

Baier C, Haverkort BRHM, Hermanns H, Katoen JP. Reachability in continuous-time Markov reward decision processes. In Flum J, Graedel E, Wilke T, editors, Logic and Automata: History and Perspectives. Amsterdam: Amsterdam University Press. 2008. p. 53-71. (Texts in Logic and Games).