Publication Details
Overview

Glenn Ceusters

Thesis

Abstract

Energy systems continue to become increasingly interconnected as energy technologies that enable sector coupling (i.e., the integration of different energy sectors such as electricity, heating, and transport) mature and are more widely implemented. Multi-energy, multi-carrier, multi-commodity, or multi-utility systems allow for the use of flexibility (e.g., storage, controllable loads) within and across all carriers, based on criteria such as energy efficiency, cost, emissions, dependability, or a combination thereof. Model Predictive Control (MPC) is a widely accepted and used optimisation-based control technology in various industries, including the energy industry. However, MPC requires detailed models in advance, such as plant models, disturbance models (external forecasters), and measurement noise models (state estimation). These error-prone a priori models may not be available or economically viable to build, deploy, and maintain: building such models requires additional engineering effort, deploying them increases computational requirements, and maintaining them demands periodic manual tuning. In contrast, reinforcement learning (RL), which does not rely on a known system model and is inherently scalable and adaptive, has seen increased adoption. However, its theoretical foundations – particularly with respect to stability, feasibility, robustness, and constraint handling – remain underdeveloped.

The primary research objective of this thesis was to research, develop, and validate a safe, model-free RL methodology for multi-energy management that guarantees at least nominal hard-constraint satisfaction, adapts dynamically (including its constraints), preserves optimality, can handle non-convexities, functions effectively in stochastic non-linear environments, and remains independent of specific RL algorithms. Specifically, this thesis made several significant contributions to the fields of reinforcement learning, control theory, and multi-energy management:

(1) a first-time demonstration that RL can outperform practically achievable MPC for multi-energy management, given a sufficient training budget and optimised hyperparameters;

(2) methodological contributions showing that a (near-to) optimal multi-energy management policy can be learnt safely online, assuming at least a nominal set of a priori available constraint functions;

(3) the development of a self-improving (adaptive) safety layer featuring general constraint-handling capabilities, along with improved initial utility, convergence, and sample efficiency over existing approaches;

(4) experimental validation of the proposed safe RL framework, demonstrating that it holds up in the real world and that safety layers can be beneficial beyond RL specifically.

Ultimately, this work paves the way for more robust and flexible data-driven energy management solutions, with broader implications for other domains facing similar challenges, and thereby contributes to the gains in energy efficiency that are necessary to achieve global climate protection goals. Nonetheless, multiple key directions for future work are identified, for example regarding the social acceptance of aggressive online learning – even when strict safety guarantees are in place.

Keywords: reinforcement learning, constraint satisfaction, control theory, optimal control, adaptive control, non-linear systems, stochastic environments, artificial intelligence, mathematical programming, energy management system, multi-energy
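To make the abstract's central contrast concrete, the following is a minimal sketch of a receding-horizon MPC loop of the kind the thesis benchmarks against, assuming a hypothetical two-state linear plant. The model matrices A and B, the horizon H, and the input limit u_max are illustrative placeholders rather than the thesis's actual multi-energy plant, which would additionally need disturbance forecasts and state estimation.

```python
# Minimal receding-horizon MPC sketch (illustrative only).
import numpy as np
import cvxpy as cp

# Assumed a priori plant model x_{t+1} = A x_t + B u_t -- exactly the kind
# of hand-built model that model-free RL dispenses with.
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
H = 10        # prediction horizon (assumption)
u_max = 1.0   # hard input constraint (assumption)

def mpc_step(x0: np.ndarray) -> float:
    """Solve the finite-horizon optimal control problem and apply only
    the first input (receding-horizon principle)."""
    x = cp.Variable((2, H + 1))
    u = cp.Variable((1, H))
    cost = 0
    constraints = [x[:, 0] == x0]
    for t in range(H):
        cost += cp.sum_squares(x[:, t + 1]) + 0.1 * cp.sum_squares(u[:, t])
        constraints += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t],  # plant model
                        cp.abs(u[:, t]) <= u_max]                  # hard constraint
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return float(u.value[0, 0])

print(mpc_step(np.array([1.0, 0.0])))  # first input for the current state
```

The point of the sketch is the dependency it exposes: every quantity above except the measured state x0 is an a priori model artefact that must be built, deployed, and maintained.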
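The hard-constraint safety layer of contributions (2) and (3) can be sketched in the same spirit as an action projection: the RL policy proposes an action, and a small quadratic programme minimally corrects it so that the nominal constraints hold before it reaches the plant. The linear constraint set (G, h) and the example proposals are hypothetical placeholders; the thesis's actual constraint functions and its self-improving mechanism are not reproduced here.

```python
# Minimal action-projection safety-layer sketch (illustrative only).
import numpy as np
import cvxpy as cp

# Assumed nominal, a priori known linear constraints: G @ a <= h must hold
# for every action a that reaches the plant.
G = np.array([[1.0, 1.0],
              [-1.0, 0.0]])
h = np.array([1.0, 0.0])

def safe_action(a_rl: np.ndarray) -> np.ndarray:
    """Project the RL policy's proposed action onto the feasible set,
    returning the closest action that satisfies the nominal constraints."""
    a = cp.Variable(a_rl.shape[0])
    cp.Problem(cp.Minimize(cp.sum_squares(a - a_rl)),
               [G @ a <= h]).solve()
    return a.value

print(safe_action(np.array([0.9, 0.9])))  # unsafe proposal: minimally corrected
print(safe_action(np.array([0.2, 0.3])))  # already feasible: passes through
```

Because the projection returns the feasible action closest to the proposal, already-safe actions pass through unchanged, which is how such a layer can guarantee nominal constraint satisfaction without distorting the policy where the constraints are inactive.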
