### Abstract

Original language | English |
---|---|
Title of host publication | Markov Decision Processes in Practice |
Editors | Richard Boucherie, Nico M. van Dijk |
Publisher | Springer |
Pages | 63-101 |
ISBN (Electronic) | 978-3-319-47766-4 |
ISBN (Print) | 978-3-319-47766-4 |
DOIs | https://doi.org/10.1007/978-3-319-47766-4_3 |
Publication status | Published - 11 Mar 2017 |

### Publication series

Name | International Series in Operations Research & Management Science |
---|---|
Publisher | Springer |
Number | 248 |

### Keywords

- METIS-318330
- IR-101811

### Cite this

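Mes, M. R. K., & Perez Rivera, A. E. (2017). Approximate Dynamic Programming by Practical Examples. In R. Boucherie, & N. M. van Dijk (Eds.), *Markov Decision Processes in Practice* (pp. 63-101). (International Series in Operations Research & Management Science; No. 248). Springer. https://doi.org/10.1007/978-3-319-47766-4_3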

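Mes, Martijn R.K., and Arturo Eduardo Perez Rivera. "Approximate Dynamic Programming by Practical Examples." *Markov Decision Processes in Practice*, edited by Richard Boucherie and Nico M. van Dijk, International Series in Operations Research & Management Science, no. 248, Springer, 2017, pp. 63-101. https://doi.org/10.1007/978-3-319-47766-4_3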

**Approximate Dynamic Programming by Practical Examples.** / Mes, Martijn R.K.; Perez Rivera, Arturo Eduardo.

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

TY - CHAP

T1 - Approximate Dynamic Programming by Practical Examples

AU - Mes, Martijn R.K.

AU - Perez Rivera, Arturo Eduardo

PY - 2017/3/11

Y1 - 2017/3/11

N2 - Computing the exact solution of an MDP model is generally difficult and possibly intractable for realistically sized problem instances. A powerful technique for solving large-scale, discrete-time, multistage stochastic control processes is Approximate Dynamic Programming (ADP). Although ADP is used as an umbrella term for a broad spectrum of methods to approximate the optimal solution of MDPs, the common denominator is typically to combine optimization with simulation, use approximations of the optimal values of the Bellman equations, and use approximate policies. This chapter aims to present and illustrate the basics of these steps through a number of practical and instructive examples. We use three examples (1) to explain the basics of ADP, relying on value iteration with an approximation of the value functions, (2) to provide insight into implementation issues, and (3) to provide test cases for readers to validate their own ADP implementations.

KW - METIS-318330

KW - IR-101811

U2 - 10.1007/978-3-319-47766-4_3

DO - 10.1007/978-3-319-47766-4_3

M3 - Chapter

SN - 978-3-319-47766-4

T3 - International Series in Operations Research & Management Science

SP - 63

EP - 101

BT - Markov Decision Processes in Practice

A2 - Boucherie, Richard

A2 - van Dijk, Nico M.

PB - Springer

ER -
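
The abstract describes ADP as combining optimization with simulation and replacing the exact values of the Bellman equations with approximations. The sketch below illustrates that idea on a toy inventory problem. The problem itself, all parameter values, the epsilon-greedy exploration, the step-size rule, and every function name are assumptions made for illustration and are not taken from the chapter. It pairs an exact backward induction with a forward-pass ADP that keeps a lookup table over post-decision states, so the approximation can be checked against the exact solution in the spirit of item (3) of the abstract.

```python
import random

CAPACITY = 3            # maximum inventory level (assumed toy value)
DEMANDS = (0, 1, 2)     # equally likely demand outcomes (assumed)
PRICE, COST = 6.0, 2.0  # revenue per unit sold, cost per unit ordered (assumed)
HORIZON = 5             # number of decision periods (assumed)


def reward(state, order):
    """Expected one-period contribution: expected revenue minus order cost."""
    stock = state + order
    expected_sales = sum(min(stock, d) for d in DEMANDS) / len(DEMANDS)
    return PRICE * expected_sales - COST * order


def exact_values():
    """Backward induction on the Bellman equations; returns V[t][state]."""
    values = [[0.0] * (CAPACITY + 1) for _ in range(HORIZON + 1)]
    for t in reversed(range(HORIZON)):
        for s in range(CAPACITY + 1):
            values[t][s] = max(
                reward(s, a)
                + sum(values[t + 1][max(s + a - d, 0)] for d in DEMANDS)
                / len(DEMANDS)
                for a in range(CAPACITY - s + 1)
            )
    return values


def adp_values(iterations=5000, epsilon=0.2, seed=0):
    """Forward-pass ADP with a lookup table over post-decision states."""
    rng = random.Random(seed)
    # approx[t][x]: estimated future value when x units are in stock after
    # ordering in period t, before that period's demand is observed.
    approx = [[0.0] * (CAPACITY + 1) for _ in range(HORIZON)]
    visits = [[0] * (CAPACITY + 1) for _ in range(HORIZON)]
    for _ in range(iterations):
        state, prev_post = 0, None
        for t in range(HORIZON):
            actions = range(CAPACITY - state + 1)
            # Optimization step: maximize reward plus approximate future value.
            scores = [reward(state, a) + approx[t][state + a] for a in actions]
            v_hat = max(scores)
            if prev_post is not None:
                # Smooth the new observation into the previous
                # post-decision state's estimate (harmonic step size).
                visits[t - 1][prev_post] += 1
                step = 10.0 / (9.0 + visits[t - 1][prev_post])
                old = approx[t - 1][prev_post]
                approx[t - 1][prev_post] = old + step * (v_hat - old)
            # Epsilon-greedy exploration so every stock level gets sampled.
            if rng.random() < epsilon:
                order = rng.choice(actions)
            else:
                order = scores.index(v_hat)  # actions start at 0
            prev_post = state + order
            # Simulation step: sample demand and move to the next state.
            state = max(prev_post - rng.choice(DEMANDS), 0)
    return approx


def initial_value(approx):
    """ADP estimate of the value of starting period 0 with empty stock."""
    return max(reward(0, a) + approx[0][a] for a in range(CAPACITY + 1))


if __name__ == "__main__":
    print("exact:", exact_values()[0][0])
    print("ADP estimate:", initial_value(adp_values()))
```

The lookup table is indexed by the post-decision state so that the maximization inside the forward pass needs no expectation over demand; the expectation is instead estimated by averaging sampled observations. For problems with larger state spaces, a lookup table of this kind would typically be replaced by a more compact approximation.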