Adaptive Load Balancing of Parallel Applications with Multi-Agent Reinforcement Learning on Heterogeneous Systems

Johan Parent, Katja Verbeeck, Jan Lemeire, E. Dirkx, Ann Nowe, Kris Steenhaut

Abstract

We report on the improvements that can be achieved by applying machine learning techniques, in particular reinforcement learning, to the dynamic load balancing of parallel applications. The applications considered in this paper are coarse-grain, data-intensive applications. Such applications put high pressure on the interconnect of the hardware. Synchronization and load balancing in complex, heterogeneous networks need fast, flexible, adaptive load balancing algorithms. By viewing a parallel application as a one-state coordination game in the framework of multi-agent reinforcement learning, and by using a recently introduced multi-agent exploration technique, we are able to improve upon the classic job farming approach. The improvements are achieved with limited computation and communication overhead.