Programmatic Reinforcement Learning using Critic-Moderated Evolution
 
Senne Deproost, Denis Steckelmacher, Ann Nowe
 
Abstract 

We propose a new method for generating a program from a Reinforcement Learning (RL) policy. Compared to previous methods, it exploits more RL-specific elements, such as the critic (the value network). Improved actions obtained from the critic steer a Genetic Programming process via a fitness function.
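
To make the idea of a critic-derived fitness concrete, here is a minimal sketch under stated assumptions. The abstract does not say how improved actions are computed or how fitness is measured, so everything below is hypothetical: the toy critic, the gradient-ascent improvement rule, the mean-squared-error fitness, and all function names are illustrative choices, not the authors' implementation.

```python
import numpy as np

def critic(state, action):
    # Hypothetical toy critic: Q(s, a) is highest when action == -state.
    return -np.sum((action + state) ** 2, axis=-1)

def critic_grad_action(state, action):
    # Analytic gradient of the toy critic with respect to the action.
    return -2.0 * (action + state)

def improved_action(state, action, step=0.1, n_steps=20):
    # Assumed improvement rule: a few gradient-ascent steps on Q(s, a).
    a = np.array(action, dtype=float)
    for _ in range(n_steps):
        a = a + step * critic_grad_action(state, a)
    return a

def fitness(program, states):
    # Higher is better: negative mean squared distance between the
    # program's actions and the critic-improved versions of those actions.
    proposed = np.array([program(s) for s in states])
    targets = improved_action(states, proposed)
    return -float(np.mean((proposed - targets) ** 2))

# Two hand-written candidate "programs" standing in for evolved individuals.
states = np.random.default_rng(0).normal(size=(32, 2))
good_program = lambda s: -s  # already matches the toy critic's optimum
bad_program = lambda s: s    # moves in the wrong direction

# A selection step would prefer the candidate with the higher fitness.
best = max([good_program, bad_program], key=lambda p: fitness(p, states))
```

In this toy setting, a candidate whose actions the critic cannot improve scores a fitness of zero, while candidates far from the critic's preferred actions score lower, so selection pressure pushes the population toward programs the critic agrees with.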