Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowe, Guillermo A. Pérez
We propose a novel framework to controller design in environments with a two-level structure: a high-level graph in which each vertex is populated by a Markov decision process, called a “room”, with several low-level objectives. We proceed as follows. First, we apply deep reinforcement learning (DRL) to obtain low-level policies for each room and objective. Second, we apply reactive synthesis to obtain a planner that selects which low-level policy to apply in each room. Reactive synthesis refers to constructing a planner for a given model of the environment that satisfies a given objective (typically specified as a temporal logic formula) by design. The main advantage of the framework is formal guarantees. In addition, the framework enables a “separation of concerns”: low-level tasks are addressed using DRL, which enables scaling to large rooms of unknown dynamics, reward engineering is only done locally, and policies can be reused, whereas users can specify high-level tasks intuitively and naturally. The central challenge in synthesis is the need for a model of the rooms. We address this challenge by developing a DRL procedure to train concise “latent” policies together with latent abstract rooms, both paired with probably approximately correct (PAC) guarantees on performance and abstraction quality. Unlike previous approaches, this circumvents a model distillation step. We demonstrate feasibility in a case study involving agent navigation in an environment with moving obstacles.
Delgrange, F, Avni, G, Lukina, A, Schilling, C, Nowe, A & Pérez, GA 2024, 'Controller Synthesis from Deep Reinforcement Learning Policies', Paper presented at 17th European Workshop on Reinforcement Learning, Toulouse, France, 28/10/24 - 30/10/24. <https://openreview.net/forum?id=KDiCsArAKs>
Delgrange, F., Avni, G., Lukina, A., Schilling, C., Nowe, A., & Pérez, G. A. (2024). Controller Synthesis from Deep Reinforcement Learning Policies. Paper presented at 17th European Workshop on Reinforcement Learning, Toulouse, France. https://openreview.net/forum?id=KDiCsArAKs
@conference{902851fb0c7b45b6aaaf73998f4790e8,
title = "Controller Synthesis from Deep Reinforcement Learning Policies",
abstract = "We propose a novel framework to controller design in environments with a two-level structure: a high-level graph in which each vertex is populated by a Markov decision process, called a “room”, with several low-level objectives. We proceed as follows. First, we apply deep reinforcement learning (DRL) to obtain low-level policies for each room and objective. Second, we apply reactive synthesis to obtain a planner that selects which low-level policy to apply in each room. Reactive synthesis refers to constructing a planner for a given model of the environment that satisfies a given objective (typically specified as a temporal logic formula) by design. The main advantage of the framework is formal guarantees. In addition, the framework enables a “separation of concerns”: low-level tasks are addressed using DRL, which enables scaling to large rooms of unknown dynamics, reward engineering is only done locally, and policies can be reused, whereas users can specify high-level tasks intuitively and naturally. The central challenge in synthesis is the need for a model of the rooms. We address this challenge by developing a DRL procedure to train concise “latent” policies together with latent abstract rooms, both paired with probably approximately correct (PAC) guarantees on performance and abstraction quality. Unlike previous approaches, this circumvents a model distillation step. We demonstrate feasibility in a case study involving agent navigation in an environment with moving obstacles.",
keywords = "Planning and Reasoning under Uncertainty, Controller Synthesis, Model Checking, Reinforcement Learning, Representation Learning",
author = "Florent Delgrange and Guy Avni and Anna Lukina and Christian Schilling and Ann Nowe and P{\'e}rez, {Guillermo A.}",
year = "2024",
month = oct,
day = "28",
language = "English",
note = "17th European Workshop on Reinforcement Learning, EWRL 2024; Conference date: 28-10-2024 through 30-10-2024",
url = "https://ewrl.wordpress.com/ewrl17-2024/",
}