Approximate Dynamic Programming by Practical Examples

Martijn R.K. Mes, Arturo Eduardo Perez Rivera

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

1 Citation (Scopus)

Abstract

Computing the exact solution of an MDP model is generally difficult and possibly intractable for realistically sized problem instances. A powerful technique for solving large-scale, discrete-time, multistage stochastic control problems is Approximate Dynamic Programming (ADP). Although ADP is used as an umbrella term for a broad spectrum of methods to approximate the optimal solution of MDPs, the common denominator is typically to combine optimization with simulation, to use approximations of the optimal values of the Bellman equations, and to use approximate policies. This chapter aims to present and illustrate the basics of these steps through a number of practical and instructive examples. We use three examples: (1) to explain the basics of ADP, relying on value iteration with an approximation of the value functions; (2) to provide insight into implementation issues; and (3) to provide test cases for readers to validate their own ADP implementations.
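To make the procedure described in the abstract concrete, the following minimal Python sketch performs value iteration with a lookup-table approximation of the value function around the post-decision state, in the spirit of the chapter's approach. The small inventory problem, its parameters, the epsilon-greedy exploration, and the harmonic stepsize are illustrative assumptions for this sketch, not the chapter's own examples.

import random

MAX_INV = 10                      # inventory capacity (illustrative)
PRICE, ORDER_COST, HOLD_COST = 6.0, 3.0, 0.5
GAMMA = 0.9                       # discount factor
DEMANDS = [0, 1, 2, 3]            # equally likely demand outcomes


def expected_reward(s, a):
    """Expected one-period profit of ordering a units with s units on hand."""
    total = 0.0
    for d in DEMANDS:
        sold = min(s + a, d)
        total += PRICE * sold - ORDER_COST * a - HOLD_COST * max(s + a - d, 0)
    return total / len(DEMANDS)


def approximate_value_iteration(iterations=20000, eps=0.1, seed=0):
    """Forward ADP pass: simulate a single trajectory, solve the approximate
    Bellman equation in each visited state, and smooth the observed values
    into a lookup-table estimate of the post-decision value function."""
    rng = random.Random(seed)
    v_bar = [0.0] * (MAX_INV + 1)  # one estimate per post-decision inventory level
    s, prev_sx = 0, None           # pre-decision state, last post-decision state

    for n in range(1, iterations + 1):
        # Optimization: greedy decision against the current approximation.
        values = {a: expected_reward(s, a) + GAMMA * v_bar[s + a]
                  for a in range(MAX_INV - s + 1)}
        greedy_a = max(values, key=values.get)
        v_hat = values[greedy_a]   # sampled estimate of the value of state s

        # Learning: update the previously visited post-decision state.
        if prev_sx is not None:
            alpha = 1.0 / n        # harmonic stepsize (one of many possible rules)
            v_bar[prev_sx] = (1.0 - alpha) * v_bar[prev_sx] + alpha * v_hat

        # Simulation: follow an epsilon-greedy action and sample a demand.
        a = greedy_a if rng.random() > eps else rng.randrange(MAX_INV - s + 1)
        prev_sx = s + a
        s = max(prev_sx - rng.choice(DEMANDS), 0)

    return v_bar


if __name__ == "__main__":
    estimates = approximate_value_iteration()
    for sx, v in enumerate(estimates):
        print(f"post-decision inventory {sx:2d}: estimated value {v:7.2f}")

Each iteration combines optimization (the greedy decision against the current approximation), simulation (sampling a demand to move to the next state), and learning (smoothing the observed value into the estimate of the previously visited post-decision state), mirroring the ingredients named in the abstract.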
Original language: English
Title of host publication: Markov Decision Processes in Practice
Editors: Richard Boucherie, Nico M. van Dijk
Publisher: Springer
Pages: 63-101
ISBN (Electronic): 978-3-319-47766-4
ISBN (Print): 978-3-319-47766-4
DOIs: https://doi.org/10.1007/978-3-319-47766-4_3
Publication status: Published - 11 Mar 2017

Publication series

Name: International Series in Operations Research & Management Science
Publisher: Springer
Number: 248

Fingerprint

  • Approximate Dynamic Programming
  • Common denominator
  • Value Iteration
  • Bellman Equation
  • Stochastic Control
  • Approximation
  • Value Function
  • Discrete-time
  • Optimal Solution
  • Exact Solution
  • Optimization
  • Computing
  • Term
  • Simulation

Keywords

  • METIS-318330
  • IR-101811

Cite this

Mes, M. R. K., & Perez Rivera, A. E. (2017). Approximate Dynamic Programming by Practical Examples. In R. Boucherie, & N. M. van Dijk (Eds.), Markov Decision Processes in Practice (pp. 63-101). (International Series in Operations Research & Management Science; No. 248). Springer. https://doi.org/10.1007/978-3-319-47766-4_3
Mes, Martijn R.K. ; Perez Rivera, Arturo Eduardo. / Approximate Dynamic Programming by Practical Examples. Markov Decision Processes in Practice. editor / Richard Boucherie ; Nico M. van Dijk. Springer, 2017. pp. 63-101 (International Series in Operations Research & Management Science; 248).
@inbook{e6286f45667246749ce9858696c00622,
title = "Approximate Dynamic Programming by Practical Examples",
abstract = "Computing the exact solution of an MDP model is generally difficult and possibly intractable for realistically sized problem instances. A powerful technique for solving large-scale, discrete-time, multistage stochastic control problems is Approximate Dynamic Programming (ADP). Although ADP is used as an umbrella term for a broad spectrum of methods to approximate the optimal solution of MDPs, the common denominator is typically to combine optimization with simulation, to use approximations of the optimal values of the Bellman equations, and to use approximate policies. This chapter aims to present and illustrate the basics of these steps through a number of practical and instructive examples. We use three examples: (1) to explain the basics of ADP, relying on value iteration with an approximation of the value functions; (2) to provide insight into implementation issues; and (3) to provide test cases for readers to validate their own ADP implementations.",
keywords = "METIS-318330, IR-101811",
author = "Mes, {Martijn R.K.} and {Perez Rivera}, {Arturo Eduardo}",
year = "2017",
month = "3",
day = "11",
doi = "10.1007/978-3-319-47766-4_3",
language = "English",
isbn = "978-3-319-47766-4",
series = "International Series in Operations Research & Management Science",
publisher = "Springer",
number = "248",
pages = "63--101",
editor = "Richard Boucherie and {van Dijk}, {Nico M.}",
booktitle = "Markov Decision Processes in Practice",

}

Mes, MRK & Perez Rivera, AE 2017, Approximate Dynamic Programming by Practical Examples. in R Boucherie & NM van Dijk (eds), Markov Decision Processes in Practice. International Series in Operations Research & Management Science, no. 248, Springer, pp. 63-101. https://doi.org/10.1007/978-3-319-47766-4_3

Approximate Dynamic Programming by Practical Examples. / Mes, Martijn R.K.; Perez Rivera, Arturo Eduardo.

Markov Decision Processes in Practice. ed. / Richard Boucherie; Nico M. van Dijk. Springer, 2017. p. 63-101 (International Series in Operations Research & Management Science; No. 248).

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

TY - CHAP

T1 - Approximate Dynamic Programming by Practical Examples

AU - Mes, Martijn R.K.

AU - Perez Rivera, Arturo Eduardo

PY - 2017/3/11

Y1 - 2017/3/11

N2 - Computing the exact solution of an MDP model is generally difficult and possibly intractable for realistically sized problem instances. A powerful technique for solving large-scale, discrete-time, multistage stochastic control problems is Approximate Dynamic Programming (ADP). Although ADP is used as an umbrella term for a broad spectrum of methods to approximate the optimal solution of MDPs, the common denominator is typically to combine optimization with simulation, to use approximations of the optimal values of the Bellman equations, and to use approximate policies. This chapter aims to present and illustrate the basics of these steps through a number of practical and instructive examples. We use three examples: (1) to explain the basics of ADP, relying on value iteration with an approximation of the value functions; (2) to provide insight into implementation issues; and (3) to provide test cases for readers to validate their own ADP implementations.

AB - Computing the exact solution of an MDP model is generally difficult and possibly intractable for realistically sized problem instances. A powerful technique for solving large-scale, discrete-time, multistage stochastic control problems is Approximate Dynamic Programming (ADP). Although ADP is used as an umbrella term for a broad spectrum of methods to approximate the optimal solution of MDPs, the common denominator is typically to combine optimization with simulation, to use approximations of the optimal values of the Bellman equations, and to use approximate policies. This chapter aims to present and illustrate the basics of these steps through a number of practical and instructive examples. We use three examples: (1) to explain the basics of ADP, relying on value iteration with an approximation of the value functions; (2) to provide insight into implementation issues; and (3) to provide test cases for readers to validate their own ADP implementations.

KW - METIS-318330

KW - IR-101811

U2 - 10.1007/978-3-319-47766-4_3

DO - 10.1007/978-3-319-47766-4_3

M3 - Chapter

SN - 978-3-319-47766-4

T3 - International Series in Operations Research & Management Science

SP - 63

EP - 101

BT - Markov Decision Processes in Practice

A2 - Boucherie, Richard

A2 - van Dijk, Nico M.

PB - Springer

ER -

Mes MRK, Perez Rivera AE. Approximate Dynamic Programming by Practical Examples. In Boucherie R, van Dijk NM, editors, Markov Decision Processes in Practice. Springer. 2017. p. 63-101. (International Series in Operations Research & Management Science; 248). https://doi.org/10.1007/978-3-319-47766-4_3