Publication Details
Senne Deproost, Denis Steckelmacher, Ann Nowe
 

Unpublished contribution to conference

Abstract 

We propose a new method for generating a program from a Reinforcement Learning (RL) policy. Compared to previous methods, it exploits more RL-specific elements, such as the critic's value network. Improved actions derived from the critic steer a Genetic Programming process through its fitness function.
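The abstract does not specify how the critic improves actions, how programs are represented, or what the fitness function looks like. The following is therefore only a minimal illustrative sketch of the general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy `critic` (a hand-written quadratic standing in for a learned value network), the finite-difference `improve_action` step, the tuple-based expression trees, and the mean-squared-error `fitness`.

```python
import random

# Hypothetical toy critic: Q(s, a) peaks at a = 2*s.
# It stands in for a learned value network (an assumption, not from the paper).
def critic(state, action):
    return -(action - 2.0 * state) ** 2

def improve_action(state, action, step=0.1, iters=50, eps=1e-4):
    """Nudge an action uphill on the critic using a finite-difference gradient."""
    for _ in range(iters):
        grad = (critic(state, action + eps) - critic(state, action - eps)) / (2 * eps)
        action += step * grad
    return action

# Programs are small expression trees over the state s, stored as nested tuples:
# ('s',), ('const', c), ('add', left, right), ('mul', left, right).
def random_tree(rng, depth=3):
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ('s',)
        return ('const', rng.choice([1.0, 2.0, 3.0]))
    op = rng.choice(['add', 'mul'])
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, s):
    op = tree[0]
    if op == 's':
        return s
    if op == 'const':
        return tree[1]
    left, right = evaluate(tree[1], s), evaluate(tree[2], s)
    return left + right if op == 'add' else left * right

def mutate(tree, rng, depth=3):
    """Replace one randomly chosen subtree, keeping the total depth bounded."""
    if depth == 0 or tree[0] in ('s', 'const') or rng.random() < 0.3:
        return random_tree(rng, depth)
    if rng.random() < 0.5:
        return (tree[0], mutate(tree[1], rng, depth - 1), tree[2])
    return (tree[0], tree[1], mutate(tree[2], rng, depth - 1))

def fitness(tree, states, targets):
    """Negative mean squared error between program outputs and improved actions."""
    errors = [(evaluate(tree, s) - t) ** 2 for s, t in zip(states, targets)]
    return -sum(errors) / len(states)

def evolve(states, targets, rng, pop_size=50, generations=30):
    """Elitist evolutionary loop: keep the top fifth, refill with mutants."""
    pop = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, states, targets), reverse=True)
        elite = pop[: max(1, pop_size // 5)]
        pop = elite + [mutate(rng.choice(elite), rng) for _ in range(pop_size - len(elite))]
    return max(pop, key=lambda t: fitness(t, states, targets))

if __name__ == "__main__":
    rng = random.Random(0)
    states = [-1.0, -0.5, 0.0, 0.5, 1.0]
    # Targets are the critic-improved actions, starting from a placeholder action of 0.
    targets = [improve_action(s, 0.0) for s in states]
    best = evolve(states, targets, rng)
    print(best, fitness(best, states, targets))
```

In this sketch the critic never scores programs directly. It only supplies target actions, and the fitness function rewards programs that reproduce them. This is one plausible reading of "steer ... via a fitness function"; the actual method may differ.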
