An automated measure of MDP similarity for transfer in reinforcement learning

Haitham Bou Ammar, Eric Eaton, Matthew E. Taylor, Decebal Constantin Mocanu, Kurt Driessens, Gerhard Weiss, Karl Tüyls

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

56 Citations (Scopus)
13 Downloads (Pure)

Abstract

Transfer learning can improve the reinforcement learning of a new task by allowing the agent to reuse knowledge acquired from other source tasks. Despite their success, transfer learning methods rely on having relevant source tasks; transfer from inappropriate tasks can inhibit performance on the new task. For fully autonomous transfer, it is critical to have a method for automatically choosing relevant source tasks, which requires a similarity measure between Markov Decision Processes (MDPs). This issue has received little attention, and is therefore still a largely open problem. This paper presents a data-driven automated similarity measure for MDPs. This novel measure is a significant step toward autonomous reinforcement learning transfer, allowing agents to: (1) characterize when transfer will be useful, and (2) automatically select tasks to use for transfer. The proposed measure is based on the reconstruction error of a restricted Boltzmann machine that attempts to model the behavioral dynamics of the two MDPs being compared. Empirical results illustrate that this measure is correlated with the performance of transfer and therefore can be used to identify similar source tasks for transfer learning.
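
The idea in the abstract can be illustrated with a minimal sketch: fit a restricted Boltzmann machine to (state, action, next state) samples from a source task, then use its reconstruction error on samples from a candidate target task as a dissimilarity score. Everything below is an assumption made for illustration and is not the authors' setup: the hypothetical 1-D toy dynamics in `make_transitions`, the `TinyRBM` class, the network size, the CD-1 training schedule, and the use of mean squared error.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Small RBM for inputs scaled to [0, 1], trained with one-step
    contrastive divergence (CD-1). Illustrative only."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def reconstruct(self, v):
        # Deterministic (mean-field) pass: visible -> hidden -> visible.
        h = sigmoid(v @ self.W + self.b_h)
        return sigmoid(h @ self.W.T + self.b_v)

    def fit(self, data, epochs=200, lr=0.1):
        n = len(data)
        for _ in range(epochs):
            # Positive phase: hidden probabilities given the data.
            h0 = sigmoid(data @ self.W + self.b_h)
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            # Negative phase: one Gibbs step back to visible and hidden.
            v1 = sigmoid(h_sample @ self.W.T + self.b_v)
            h1 = sigmoid(v1 @ self.W + self.b_h)
            # CD-1 parameter updates.
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b_v += lr * (data - v1).mean(axis=0)
            self.b_h += lr * (h0 - h1).mean(axis=0)
        return self

    def reconstruction_error(self, data):
        # Mean squared error between the samples and their reconstruction.
        return float(np.mean((data - self.reconstruct(data)) ** 2))


def make_transitions(step, n=500, seed=0):
    """Hypothetical 1-D task: s' = clip(s + step * direction, 0, 1).
    Rows are (s, a, s') with every column in [0, 1]."""
    rng = np.random.default_rng(seed)
    s = rng.random(n)
    a = rng.integers(0, 2, n).astype(float)  # 0 = move left, 1 = move right
    s_next = np.clip(s + step * (2 * a - 1), 0.0, 1.0)
    return np.column_stack([s, a, s_next])


# Fit on the source task, then score two candidate target tasks.
source = make_transitions(step=0.1, seed=1)
target_same_dynamics = make_transitions(step=0.1, seed=2)
target_other_dynamics = make_transitions(step=0.5, seed=3)

rbm = TinyRBM(n_visible=3, n_hidden=8).fit(source)
err_same = rbm.reconstruction_error(target_same_dynamics)
err_other = rbm.reconstruction_error(target_other_dynamics)
print(f"same dynamics:  {err_same:.4f}")
print(f"other dynamics: {err_other:.4f}")
```

Under this reading, a lower error would suggest that the source model captures the target's dynamics better. Whether the score separates tasks in practice depends on the model and data, and the toy example makes no claim about that.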

Original language: English
Title of host publication: Machine Learning for Interactive Systems
Subtitle of host publication: Bridging the Gap Between Perception, Action and Communication - Papers Presented at the 28th AAAI Conference on Artificial Intelligence, Technical Report
Publisher: AI Access Foundation
Pages: 31-37
Number of pages: 7
ISBN (Electronic): 9781577356684
Publication status: Published - 2014
Externally published: Yes
Event: 28th AAAI Conference on Artificial Intelligence, AAAI 2014 - Québec Convention Centre, Quebec City, Canada
Duration: 27 Jul 2014 to 28 Jul 2014
Conference number: 28

Publication series

Name: AAAI Workshop - Technical Report
Publisher: AAAI
Volume: WS-14-07

Conference

Conference: 28th AAAI Conference on Artificial Intelligence, AAAI 2014
Abbreviated title: AAAI-14
Country/Territory: Canada
City: Quebec City
Period: 27/07/14 to 28/07/14
