Our project addresses a timely question: how to build a pipeline that collects health data and social media data so that machine learning and artificial intelligence techniques can be applied to infectious disease prediction and forecasting. The envisioned solution aims to overcome current shortcomings in predicting infectious diseases from a vast corpus of structured and unstructured medical records and unstructured big data from online sources. To fulfil this vision, the project will address the following intertwined objectives:
OBJ 1. To create a big data platform that ingests data from patient health records, clinical narratives, coded medical records, and online sources (particularly social media and meteorological APIs), then processes these data and learns (time-series) representations of them.
OBJ 2. To create an innovative AI system that can be trained to predict infectious diseases from the data sources described above.
OBJ 3. To design interpretable AI models that learn to fuse data from the diverse sources and predict infectious disease rates over time from the fused time-series data.
OBJ 4. To make the shared platform independent of any specific EHR system, scalable, and able to provide timely alarms and extend the capacity of medical registration staff.
Keywords: Big data mining, data analytics, infectious disease outbreaks

Runtime: 2020 - 2023