Natural language processing (NLP) refers to a collection of data processing and machine learning methods for text mining. NLP can be applied to natural human language in a wide range of applications, from sentiment analysis, text classification, and summarization to machine translation and question answering. Furthermore, NLP can be used for information extraction, transforming large corpora of high-dimensional data into structured information that can feed decision support systems.
As in many other machine learning domains, deep learning has achieved state-of-the-art performance on the majority of NLP problems. Nevertheless, this comes with a drawback: deep models are often regarded as black boxes, with many performance improvements originating from empirical model refinement. To improve the trustworthiness and acceptance of these models, recent developments in interpretable and explainable deep learning aim to develop novel techniques for designing and analyzing deep networks in a transparent way.
This thesis will investigate deep models for natural language processing, with particular emphasis on information extraction and text classification. On the one hand, we will study methods that extract information from text corpora in order to detect and characterize events; on the other hand, we will investigate methods for the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. The focus will be on models that are interpretable and explainable, so that they support trustworthy decisions based on the data.
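To make the XMC task formulation concrete, the following is a minimal sketch of the problem setup: each document is associated with a subset of a label collection, and at prediction time all labels are scored and the top-ranked ones are returned. It uses a simple linear one-vs-rest baseline and hypothetical example data purely for illustration; it is not the deep, interpretable model studied in this thesis.

```python
# Minimal XMC-style setup (illustrative only; hypothetical data and a linear baseline).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical training documents, each tagged with a subset of the label collection.
docs = [
    "patient admitted with chest pain and shortness of breath",
    "earthquake reported near the coast, thousands evacuated",
    "new vaccine trial shows promising results in elderly patients",
]
labels = [
    {"cardiology", "emergency"},
    {"natural-disaster", "evacuation"},
    {"clinical-trial", "vaccination"},
]

binarizer = MultiLabelBinarizer()           # labels -> binary indicator matrix
Y = binarizer.fit_transform(labels)

model = make_pipeline(
    TfidfVectorizer(),                      # sparse bag-of-words features
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),  # one binary classifier per label
)
model.fit(docs, Y)

# For a new text, score every label and return the top-k most relevant ones.
query = "hospital treats patients injured in the earthquake"
scores = model.predict_proba([query])[0]
top_k = np.argsort(scores)[::-1][:3]
print([binarizer.classes_[i] for i in top_k])
```

In realistic XMC settings the label collection contains thousands to millions of labels, so scoring every label independently, as above, becomes impractical and is typically replaced by label-tree, embedding-based, or deep neural approaches.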
The state-of-the-art performance of the envisioned methods will be demonstrated on diverse corpora, including social media streams and electronic health records.