Florian Felten, Lucas Nunes Alegre, Ann Nowe, Ana Bazzan, El-Ghazali Talbi, Grégoire Danoy, Bruno Castro da Silva
Multi-objective reinforcement learning (MORL) algorithms extend standard reinforcement learning (RL) to scenarios where agents must optimize multiple, potentially conflicting, objectives, each represented by a distinct reward function. To facilitate and accelerate research and benchmarking in multi-objective RL problems, we introduce a comprehensive collection of software libraries that includes: (i) MO-Gymnasium, an easy-to-use and flexible API enabling the rapid construction of novel MORL environments. It also includes more than 20 environments under this API. This allows researchers to effortlessly evaluate any algorithms on any existing domains; (ii) MORL-Baselines, a collection of reliable and efficient implementations of state-of-the-art MORL algorithms, designed to provide a solid foundation for advancing research. Notably, all algorithms are inherently compatible with MO-Gymnasium; and (iii) a thorough and robust set of benchmark results and comparisons of MORL-Baselines algorithms, tested across various challenging MO-Gymnasium environments. These benchmarks were constructed to serve as guidelines for the research community, underscoring the properties, advantages, and limitations of each particular state-of-the-art method.
Felten, F, Nunes Alegre, L, Nowe, A, Bazzan, A, Talbi, E-G, Danoy, G & Castro da Silva, B 2023, A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning. in Advances in Neural Information Processing Systems. vol. 36, Advances in Neural Information Processing Systems, Curran Associates, Inc., pp. 23671-23700. <https://proceedings.neurips.cc/paper_files/paper/2023/file/4aa8891583f07ae200ba07843954caeb-Paper-Datasets_and_Benchmarks.pdf>
Felten, F., Nunes Alegre, L., Nowe, A., Bazzan, A., Talbi, E.-G., Danoy, G., & Castro da Silva, B. (2023). A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning. In Advances in Neural Information Processing Systems (Vol. 36, pp. 23671-23700). (Advances in Neural Information Processing Systems). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2023/file/4aa8891583f07ae200ba07843954caeb-Paper-Datasets_and_Benchmarks.pdf
@inproceedings{601f5f20f55d4aec8ed5108491dfaef2,
title = "A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning",
abstract = "Multi-objective reinforcement learning (MORL) algorithms extend standard reinforcement learning (RL) to scenarios where agents must optimize multiple---potentially conflicting---objectives, each represented by a distinct reward function. To facilitate and accelerate research and benchmarking in multi-objective RL problems, we introduce a comprehensive collection of software libraries that includes: (i) MO-Gymnasium, an easy-to-use and flexible API enabling the rapid construction of novel MORL environments. It also includes more than 20 environments under this API. This allows researchers to effortlessly evaluate any algorithms on any existing domains; (ii) MORL-Baselines, a collection of reliable and efficient implementations of state-of-the-art MORL algorithms, designed to provide a solid foundation for advancing research. Notably, all algorithms are inherently compatible with MO-Gymnasium; and (iii) a thorough and robust set of benchmark results and comparisons of MORL-Baselines algorithms, tested across various challenging MO-Gymnasium environments. These benchmarks were constructed to serve as guidelines for the research community, underscoring the properties, advantages, and limitations of each particular state-of-the-art method.",
author = "Florian Felten and {Nunes Alegre}, Lucas and Ann Nowe and Ana Bazzan and El-Ghazali Talbi and Gr{\'e}goire Danoy and {Castro da Silva}, Bruno",
note = "Funding Information: We would like to thank Willem R{\"o}pke for his implementation of PQL, Denis Steckelmacher and Conor F. Hayes for the original implementation of EUPG, and Mathieu Reymond for the original implementation of PCN. Jordan K. Terry, Mark Towers, Manuel Goul{\~a}o, and the broader Farama Foundation team for supporting MO-Gymnasium. Shengyi Huang, Antonin Raffin, and Quentin Gallou{\'e}dec for their advice on experimental setup and openrlbenchmark integration. This work was funded by: Coordena{\c c}{\~a}o de Aperfei{\c c}oamento de Pessoal de N{\'i}vel Superior - Brazil (CAPES) - Finance Code 001; CNPq (Grants 140500/2021-9, 304932/2021-3); FAPESP/MCTI/CGI (Grant 2020/05165-1); the Fonds National de la Recherche Luxembourg (FNR), CORE program under the ADARS Project, ref. C20/IS/14762457; the Research Foundation Flanders (FWO) [G062819N]; the AI Research Program from the Flemish Government (Belgium); and the Francqui Foundation. Publisher Copyright: {\textcopyright} 2023 Neural information processing systems foundation. All rights reserved.",
year = "2023",
language = "English",
issn = "1049-5258",
volume = "36",
series = "Advances in Neural Information Processing Systems",
publisher = "Curran Associates, Inc.",
pages = "23671--23700",
booktitle = "Advances in Neural Information Processing Systems",
}