Publication Details
Overview
 
 
Roxana Radulescu
 

Thesis

Abstract 

The prevalence of artificial agents in our world raises the need to ensure that they are able to handle the salient properties of the environment, in order to plan or learn how to solve specific tasks.

A first important aspect is the fact that real-world problems are not restricted to one agent, and often involve multiple agents acting in the same environment. Such settings have already proven to be challenging to solve, with a few examples including traffic systems, electricity grids, or warehouse management. Furthermore, the majority of these multi-agent system implementations aim to optimise the agents' behaviour with respect to a single objective, despite the fact that many problem domains inherently involve multiple objectives. By taking a multi-objective perspective on decision-making problems, complex trade-offs can be managed; e.g., supply chain management involves a complex coordination process for optimising the information and material flow between all the components of the supply chain, while minimising overall costs and complying with the conflicting demands of the involved partners.

In this work, we focus on these highlighted aspects and discuss how the process of decision-making and learning of artificial agents can be formalised and approached when there are multiple agents involved, and multiple objectives that need to be considered in the process. To analyse such problems, we adopt a utility-based perspective, and advocate that compromises between competing objectives should be made on the basis of the utility that these compromises have for the users; in other words, it should depend on the desirability of the outcomes.

As a first contribution, we develop a novel taxonomy to classify these settings.
This allows us to offer a structured view of the field, to clearly delineate the current state of the art in multi-objective multi-agent decision-making approaches, and to identify promising directions for future research.

As a second contribution, we proceed to analyse and investigate game-theoretic equilibria under different multi-objective optimisation criteria, and provide theoretical results concerning the existence of, and conditions for arriving at, such solutions in these scenarios. We additionally show that it is possible for Nash equilibria not to exist in certain multi-objective multi-agent settings.

As a final contribution, we present the first study of the effects of opponent modelling on multi-objective multi-agent interactions. We contribute novel reinforcement learning algorithms for this setting, along with extensions that incorporate opponent behaviour reconstruction and learning with opponent learning awareness (i.e., learning while anticipating one's impact on the opponent's learning step).
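The utility-based perspective mentioned in the abstract can be illustrated with a small sketch (my own illustration, not code from the thesis). In the multi-objective literature, two commonly studied optimisation criteria are the expected scalarised return (ESR), which applies the user's utility function before taking the expectation, and the scalarised expected return (SER), which applies it after. The utility function `u` and the example policy below are hypothetical choices; with a nonlinear `u`, the two criteria can rank the same stochastic policy very differently:

```python
import numpy as np

def esr(outcomes, probs, u):
    """ESR: expected utility of the realised payoff vectors, E[u(V)]."""
    return sum(p * u(np.asarray(v)) for v, p in zip(outcomes, probs))

def ser(outcomes, probs, u):
    """SER: utility of the expected payoff vector, u(E[V])."""
    expected = sum(p * np.asarray(v) for v, p in zip(outcomes, probs))
    return u(expected)

# Hypothetical nonlinear utility: the product of the two objectives.
u = lambda v: v[0] * v[1]

# A policy that yields payoff vector (4, 0) or (0, 4) with equal probability.
outcomes = [(4.0, 0.0), (0.0, 4.0)]
probs = [0.5, 0.5]

print(esr(outcomes, probs, u))  # 0.0 -- every realised vector has utility 0
print(ser(outcomes, probs, u))  # 4.0 -- the expected vector (2, 2) has utility 4
```

Under ESR this policy is worthless to the user, while under SER it looks ideal, which is why the criterion chosen matters when analysing equilibria in such settings.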
