The Logic of Adaptive Behavior - Knowledge Representation and Algorithms for the Markov Decision Process Framework in First-Order Domains

M. van Otterlo

Research output: Thesis › PhD Thesis - Research UT, graduation UT

Abstract

Learning and reasoning in large, structured, probabilistic worlds is at the heart of artificial intelligence. Markov decision processes have become the de facto standard for modeling and solving sequential decision-making problems under uncertainty. Many efficient reinforcement learning and dynamic programming techniques exist that can solve such problems. Until recently, the representational state of the art in this field was based on propositional representations. However, it is hard to imagine a truly general, intelligent system that does not conceive of the world in terms of objects, their properties, and their relations to other objects. To this end, this book studies lifting Markov decision processes, reinforcement learning and dynamic programming to the first-order (or relational) setting. Based on an extensive analysis of propositional representations and techniques, a methodological translation is constructed from the propositional to the relational setting. Furthermore, this book provides a thorough and complete description of the state of the art, surveys vital, related historical developments, and contains extensive descriptions of several new model-free and model-based solution techniques.
Original language: English
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • Nijholt, Anton, Supervisor
  • Meyer, J.J., Supervisor, External person
  • Poel, Mannes, Co-Supervisor
Award date: 30 May 2008
Place of Publication: Enschede
Print ISBNs: 978-90-365-2677-7
Publication status: Published - 30 May 2008

Keywords

  • HMI-IA: Intelligent Agents
