Publication Details
Jens Nevens



Language is a unique hallmark of human intelligence. Our linguistic systems not only serve our communicative needs meticulously, they are also incredibly robust to noise, adaptive to change, and efficient to learn. One of the grand challenges in AI is the development of truly intelligent autonomous agents with bidirectional communication systems that offer the same robustness, adaptivity and efficiency as human languages. Ideally, these communication systems are also transparent and explainable in human-interpretable terms, which facilitates the verification of their internal consistency, thereby eliciting trust. Today, the design of adequate representations and learning mechanisms that lead to such communication systems remains an unsolved problem. In this dissertation, I present novel representations and learning mechanisms that are suitable for the communication systems of autonomous agents and that exhibit the key desirable properties of human languages. These representations and learning mechanisms are validated through case studies that tackle visual question answering benchmarks. Visual question answering involves answering natural language questions about images and requires perceptual, linguistic, and reasoning abilities. I present four main contributions that allow an agent to represent and acquire linguistic structures on the conceptual, morpho-syntactic, and semantic levels. The first contribution is a computational construction grammar that has sufficient representational capacity for the benchmark dataset. Specifically, it provides bidirectional mappings between all questions of the dataset and their underlying procedural semantic representations. The remaining contributions consist of experiments in which the agent acquires this grammar through task-based communicative interactions. Concretely, in the first experiment, the agent learns to extract meaningful concepts from continuous sensorimotor observations. Here, I build on earlier experiments within the language game paradigm and introduce a novel concept representation that is based on prototype theory. The second experiment provides an alternative, hybrid approach, which combines symbolic and sub-symbolic techniques to capture the same concepts in the form of hybrid procedural semantic representations. In the final experiment, I introduce a mechanistic model of two cognitive processes that are central in usage-based theories of language acquisition: intention reading and pattern finding. Through these processes, the agent is able to acquire a grammar through task-based communicative interactions. In particular, the intention reading process enables the agent to hypothesize about the intended meaning of an observed utterance, and the pattern finding process enables the agent to construct abstract schemata that capture generalisations over form-meaning mappings. Together, the contributions that I present in this dissertation push forward the state of the art in the development of autonomous agents with communication systems offering human-like properties. The methodological advances that I introduce are relevant to a wide range of application domains, including, for example, human-robot interaction systems, conversational agents and intelligent tutoring systems.