Florent Delgrange, Mathieu Reymond, Ann Nowé, Guillermo A. Pérez
In real-world problems, decision makers often have to balance multiple objectives, which can result in trade-offs. One approach to finding a compromise is to use a multi-objective approach, which builds a set of all optimal trade-offs called a Pareto front. Learning the Pareto front requires exploring many different parts of the state-space, which can be time-consuming and increase the chances of encountering undesired or dangerous parts of the state-space. In this preliminary work, we propose a method that combines two frameworks, Pareto Conditioned Networks (PCN) and Wasserstein auto-encoded MDPs (WAE-MDPs), to efficiently learn all possible trade-offs while providing formal guarantees on the learned policies. The proposed method learns the Pareto-optimal policies while providing safety and performance guarantees, especially towards unexpected events, in the multi-objective setting.
Delgrange, F, Reymond, M, Nowé, A & Pérez, GA 2023, WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks. in F Cruz, CF Hayes, C Wang & C Yates (eds), Proc. of the Adaptive and Learning Agents Workshop (ALA 2023). 15 edn, paper 42, London, UK, pp. 1-7, 2023 Adaptive and Learning Agents Workshop at AAMAS, London, United Kingdom, 29/05/23. <https://alaworkshop2023.github.io/papers/ALA2023_paper_42.pdf>
Delgrange, F., Reymond, M., Nowé, A., & Pérez, G. A. (2023). WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks. In F. Cruz, C. F. Hayes, C. Wang, & C. Yates (Eds.), Proc. of the Adaptive and Learning Agents Workshop (ALA 2023) (15th ed., pp. 1-7). Article 42. https://alaworkshop2023.github.io/papers/ALA2023_paper_42.pdf
@inproceedings{cecc9e6fed7f41b28544a70752d34418,
title = "WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks",
abstract = "In real-world problems, decision makers often have to balance multiple objectives, which can result in trade-offs. One approach to finding a compromise is to use a multi-objective approach, which builds a set of all optimal trade-offs called a Pareto front. Learning the Pareto front requires exploring many different parts of the state-space, which can be time-consuming and increase the chances of encountering undesired or dangerous parts of the state-space. In this preliminary work, we propose a method that combines two frameworks, Pareto Conditioned Networks (PCN) and Wasserstein auto-encoded MDPs (WAE-MDPs), to efficiently learn all possible trade-offs while providing formal guarantees on the learned policies. The proposed method learns the Pareto-optimal policies while providing safety and performance guarantees, especially towards unexpected events, in the multi-objective setting.",
keywords = "Multi-objective, Reinforcement Learning, Formal Methods, Representation Learning",
author = "Florent Delgrange and Mathieu Reymond and Ann Now{\'e} and P{\'e}rez, {Guillermo A.}",
year = "2023",
month = may,
day = "29",
language = "English",
pages = "1--7",
editor = "Francisco Cruz and Hayes, {Conor F.} and Wang, {Caroline} and Connor Yates",
booktitle = "Proc. of the Adaptive and Learning Agents Workshop (ALA 2023)",
edition = "15",
note = "2023 Adaptive and Learning Agents Workshop at AAMAS, ALA 2023 ; Conference date: 29-05-2023 Through 30-05-2023",
url = "https://alaworkshop2023.github.io",
}