Scalable multi-agent reinforcement learning for distributed control of residential energy flexibility

Flora Charbonnier*, Thomas Morstyn, Malcolm D. McCulloch

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a novel scalable type of multi-agent reinforcement learning-based coordination for distributed residential energy. Cooperating agents learn to control the flexibility offered by electric vehicles, space heating and flexible loads in a partially observable stochastic environment. In the standard independent Q-learning approach, the coordination performance of agents under partial observability drops at scale in stochastic environments. Here, the novel combination of learning from off-line convex optimisations on historical data and isolating marginal contributions to total rewards in reward signals increases stability and performance at scale. Using fixed-size Q-tables, prosumers are able to assess their marginal impact on total system objectives without sharing personal data either with each other or with a central coordinator. Case studies are used to assess the fitness of different combinations of exploration sources, reward definitions, and multi-agent learning frameworks. It is demonstrated that the proposed strategies create value at individual and system levels thanks to reductions in the costs of energy imports, losses, distribution network congestion, battery depreciation and greenhouse gas emissions.
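
To make the idea of a marginal-contribution reward combined with fixed-size Q-tables concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the default (baseline) action, the toy quadratic cost, and the learning parameters are all assumptions chosen for illustration. The sketch evaluates the total reward in one place, so it does not reproduce how the paper avoids sharing personal data.

```python
from typing import Callable, List, Sequence


def marginal_rewards(
    total_reward: Callable[[Sequence[int]], float],
    actions: Sequence[int],
    default_action: int = 0,  # assumed baseline action, hypothetical
) -> List[float]:
    """Reward each agent with the change in total reward caused by its own
    action, relative to a counterfactual where it takes the default action."""
    base = total_reward(actions)
    rewards = []
    for i in range(len(actions)):
        counterfactual = list(actions)
        counterfactual[i] = default_action
        rewards.append(base - total_reward(counterfactual))
    return rewards


def q_update(
    q: List[List[float]],
    state: int,
    action: int,
    reward: float,
    next_state: int,
    alpha: float = 0.1,  # learning rate (assumed value)
    gamma: float = 0.9,  # discount factor (assumed value)
) -> None:
    """Standard tabular Q-learning update on one agent's fixed-size table."""
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])


def toy_total(actions: Sequence[int]) -> float:
    """Toy system objective: a quadratic cost on total imports (assumption)."""
    return -float(sum(actions)) ** 2


# Three homes import 1, 2 and 0 units; each receives its own marginal reward.
rewards = marginal_rewards(toy_total, [1, 2, 0])

# Each agent keeps an independent table of shape (n_states, n_actions).
q_table = [[0.0, 0.0, 0.0] for _ in range(2)]
q_update(q_table, state=0, action=1, reward=rewards[0], next_state=1)
```

In this toy case the third home, which already takes the default action, receives a marginal reward of zero, while the other two are penalised in proportion to how much their imports worsen the shared cost. The table size depends only on each agent's own state and action spaces, not on the number of homes.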

Original language: English
Article number: 118825
Journal: Applied Energy
Volume: 314
Early online date: 22 Mar 2022
Publication status: Published - 15 May 2022

Keywords

  • Demand-side response
  • Energy management system
  • Multi-agent reinforcement learning
  • Peer-to-peer
  • Prosumer
  • Smart grid
