Interactively Learning the User’s Utility for Best-Arm Identification in Multi-Objective Multi-Armed Bandits

Mathieu Reymond, Eugenio Bargiacchi, Diederik M. Roijers, Ann Nowe