Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowe, Guillermo A. Pérez
We propose a novel framework for controller design in environments with a two-level structure: a known high-level graph (“map”) in which each vertex is populated by a Markov decision process, called a “room”. The framework “separates concerns” by using different design techniques for low- and high-level tasks. We apply reactive synthesis for high-level tasks: given a specification as a logical formula over the high-level graph and a collection of low-level policies given on “concise” latent structures, we construct a “planner” that selects which low-level policy to apply in each room. We develop a reinforcement learning procedure to train low-level policies on latent structures, which, unlike previous approaches, circumvents a model distillation step. It pairs the policy with probably approximately correct guarantees on its performance and abstraction quality, which are lifted to guarantees on the high-level task. These formal guarantees are the main advantage of the framework. Other advantages include scalability (rooms are large and their dynamics are unknown) and reusability of low-level policies. We demonstrate feasibility in challenging case studies involving agent navigation in environments with moving obstacles and visual inputs.
Delgrange, F, Avni, G, Lukina, A, Schilling, C, Nowe, A & Pérez, GA 2025, 'Controller Synthesis from Deep Reinforcement Learning Policies', Paper presented at PRL Workshop Series, Philadelphia, United States, 4/03/25 - 4/03/25. <https://prl-theworkshop.github.io/prl2025-aaai/papers/23.pdf>
Delgrange, F., Avni, G., Lukina, A., Schilling, C., Nowe, A., & Pérez, G. A. (2025). Controller Synthesis from Deep Reinforcement Learning Policies. Paper presented at PRL Workshop Series, Philadelphia, Pennsylvania, United States. https://prl-theworkshop.github.io/prl2025-aaai/papers/23.pdf
@conference{a3c3075f766d4d3a8a2aaed05054d8b4,
title = "Controller Synthesis from Deep Reinforcement Learning Policies",
abstract = "We propose a novel framework for controller design in environments with a two-level structure: a known high-level graph (“map”) in which each vertex is populated by a Markov decision process, called a “room”. The framework “separates concerns” by using different design techniques for low- and high-level tasks. We apply reactive synthesis for high-level tasks: given a specification as a logical formula over the high-level graph and a collection of low-level policies given on “concise” latent structures, we construct a “planner” that selects which low-level policy to apply in each room. We develop a reinforcement learning procedure to train low-level policies on latent structures, which, unlike previous approaches, circumvents a model distillation step. It pairs the policy with probably approximately correct guarantees on its performance and abstraction quality, which are lifted to guarantees on the high-level task. These formal guarantees are the main advantage of the framework. Other advantages include scalability (rooms are large and their dynamics are unknown) and reusability of low-level policies. We demonstrate feasibility in challenging case studies involving agent navigation in environments with moving obstacles and visual inputs.",
keywords = "Reactive Synthesis, Reinforcement Learning, Model Checking, Planning and Reasoning under Uncertainty",
author = "Florent Delgrange and Guy Avni and Anna Lukina and Christian Schilling and Ann Nowe and P{\'e}rez, {Guillermo A.}",
year = "2025",
month = mar,
day = "4",
language = "English",
note = "PRL Workshop Series: Bridging the Gap Between AI Planning and Reinforcement Learning, PRL@AAAI 2025 ; Conference date: 04-03-2025 Through 04-03-2025",
url = "https://prl-theworkshop.github.io/prl2025-aaai/",
}