Reachability in continuous-time Markov reward decision processes

Christel Baier, Boudewijn R. Haverkort, H. Hermanns, Joost P. Katoen

    Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

    29 Downloads (Pure)

    Abstract

    Continuous-time Markov decision processes (CTMDPs) are widely used for the control of queueing systems, epidemic and manufacturing processes. Various results on optimal schedulers for discounted and average reward optimality criteria in CTMDPs are known, but the typical game-theoretic winning objectives have received scant attention so far. This paper studies various sorts of reachability objectives for CTMDPs. Memoryless schedulers are optimal for simple reachability objectives as it suffices to consider the embedded MDP. Schedulers that may count the number of visits to states are optimal---when restricting to time-abstract schedulers---for timed reachability in uniform CTMDPs. The central result is that for any CTMDP, reward reachability objectives are dual to timed ones. As a corollary, epsilon-optimal schedulers for reward reachability objectives in uniform CTMDPs can be obtained in polynomial time using a simple backward greedy algorithm.
    Original languageEnglish
    Title of host publicationLogic and Automata
    Subtitle of host publicationHistory and Perspectives
    EditorsJörg Flum, Erich Grädel, Thomas Wilke
    Place of PublicationAmsterdam
    PublisherAmsterdam University Press
    Pages53-71
    Number of pages19
    ISBN (Print)978-90-5356-576-6
    Publication statusPublished - Feb 2008

    Publication series

    NameTexts in Logic and Games
    PublisherAmsterdam University Press
    Volume2

    Keywords

    • Continuous-time Markov decision process (CTMDP)

    Fingerprint Dive into the research topics of 'Reachability in continuous-time Markov reward decision processes'. Together they form a unique fingerprint.

    Cite this