Programmatic Reinforcement Learning using Critic-Moderated Evolution

Senne Deproost, Denis Steckelmacher, Ann Nowe

Abstract

We propose a new method for generating a program from a Reinforcement Learning (RL) policy. Compared to previous methods, ours exploits more RL-specific elements, such as the critic's value network. Improved actions obtained from the critic steer a Genetic Programming process through its fitness function.