On-line building energy optimization using deep reinforcement learning

Elena Mocanu, Decebal Constantin Mocanu, Phuong H. Nguyen, Antonio Liotta, Michael E. Webber, Madeleine Gibescu, J.G. Slootweg

Research output: Contribution to journal › Article › Academic › peer-review

436 Citations (Scopus)
33 Downloads (Pure)

Abstract

Unprecedented volumes of data are becoming available with the growth of the advanced metering infrastructure. These are expected to benefit the planning and operation of the future power system, and to help customers transition from a passive to an active role. In this paper, we explore for the first time in the smart grid context the benefits of using Deep Reinforcement Learning, a hybrid class of methods that combines Reinforcement Learning with Deep Learning, to perform on-line optimization of schedules for building energy management systems. The learning procedure was explored using two methods, Deep Q-learning and Deep Policy Gradient, both of them extended to perform multiple actions simultaneously. The proposed approach was validated on the large-scale Pecan Street Inc. database. This high-dimensional database includes information about photovoltaic power generation, electric vehicles, and building appliances. Moreover, these on-line energy scheduling strategies could be used to provide real-time feedback to consumers to encourage more efficient use of electricity.
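The abstract's extension of Q-learning to "multiple actions simultaneously" can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's implementation: it assumes a factorized Q-function with one linear head per appliance, so the agent chooses an on/off decision for every appliance in a single step. All names, dimensions, and the reward (negative electricity cost) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_APPLIANCES = 3   # hypothetical: one on/off decision per appliance
STATE_DIM = 4      # e.g. time of day, PV output, price, baseline load
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.2

# One linear Q-head per appliance: Q_i(s, a_i) for a_i in {0, 1}.
# Factorizing Q over appliances lets the agent emit all actions at once,
# a simple stand-in for the multi-action extension described above.
W = rng.normal(scale=0.1, size=(N_APPLIANCES, 2, STATE_DIM))

def q_values(state):
    # (N_APPLIANCES, 2, STATE_DIM) @ (STATE_DIM,) -> (N_APPLIANCES, 2)
    return W @ state

def act(state):
    # Epsilon-greedy joint action: one 0/1 choice per appliance.
    q = q_values(state)
    greedy = q.argmax(axis=1)
    explore = rng.random(N_APPLIANCES) < EPS
    return np.where(explore, rng.integers(0, 2, N_APPLIANCES), greedy)

def update(state, actions, reward, next_state):
    # Independent TD(0) update per appliance head.
    target = reward + GAMMA * q_values(next_state).max(axis=1)
    for i, a in enumerate(actions):
        td_error = target[i] - q_values(state)[i, a]
        W[i, a] += ALPHA * td_error * state

# Toy loop: reward penalizes cost (price times number of appliances on).
for step in range(200):
    s = rng.random(STATE_DIM)
    a = act(s)
    price = s[2]
    r = -price * a.sum()
    s_next = rng.random(STATE_DIM)
    update(s, a, r, s_next)

print(act(np.ones(STATE_DIM)))  # joint on/off schedule for all appliances
```

In the paper the linear heads would be replaced by a deep network and the reward would reflect the actual building energy cost model; the sketch only shows the structural idea of selecting all appliance actions in one forward pass.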
Original language: English
Article number: 8356086
Pages (from-to): 3698-3708
Number of pages: 11
Journal: IEEE Transactions on Smart Grid
Volume: 10
Issue number: 4
Early online date: 8 May 2018
DOIs
Publication status: Published - 1 Jul 2019
Externally published: Yes

Keywords

  • Deep neural networks
  • Deep reinforcement learning
  • Demand response
  • Energy consumption
  • Learning (artificial intelligence)
  • Machine learning
  • Minimization
  • Optimization
  • Smart grid
  • Strategic optimization
  • Deep policy gradient
  • Buildings

