In this paper we describe an event-based integration approach for building a human-robot spoken interaction system on the Nao robot platform with the Urbi middleware. The system serves as a testbed for investigating child-robot interaction in the ALIZ-E project. Our integrated system includes components for the recognition, interpretation and generation of speech and gestures, and for dialogue management. We describe the components and how we adapted and extended them for use in a system that interacts with children in Italian. We then present the event-based approach we chose for implementing the interfaces and the integration mechanism.