Publication Details
Michiel Dhont, Elena Tsiporkova, Nicolás González-Deleito

Smart Transportation Systems 2022

Contribution To Book Anthology


Road networks are increasingly equipped with a multitude of sensors that monitor a wide range of operating and contextual parameters. The availability of real-time sensor data enables diverse data-driven applications, e.g., anomaly detection, identification of insightful patterns, monitoring of the evolution of relevant trends over time, and delivery of actionable decision support. However, depending on the application, such streaming data may contain large numbers of missing values. This makes it very challenging, if not impossible, to fully exploit the potential of data analysis and machine learning for these data sources, and real-time analysis in particular becomes infeasible. In this paper we propose an imputation methodology dedicated to multi-source streaming data, with a focus on the mobility domain. The proposed approach is based on spatio-temporal profiling of the streaming behaviour, derived from historical data via non-negative matrix factorisation. The profiling method relies on an adaptive segmentation strategy that splits the data into rolling time windows (chunks), so that the limited non-missing data are used as effectively as possible. The identified profiles make it possible to devise a dynamic and scalable imputation strategy that reliably estimates missing values in incoming streaming data as soon as they arrive.
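As a rough illustration of the general idea of imputing through non-negative matrix factorisation, the sketch below factorises a sensors-by-time matrix using only its observed entries and fills the gaps from the low-rank reconstruction. It is not the paper's method: the masked multiplicative updates, the function names `masked_nmf` and `impute`, the rank, and the synthetic data are all assumptions made for this example, and the adaptive windowing and streaming steps described above are not covered.

```python
import numpy as np

def masked_nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Factorise X ~ W @ H (W, H >= 0) using only the non-NaN entries of X.

    Assumed layout: rows are sensors, columns are time slots in one window.
    Uses standard multiplicative updates weighted by an observation mask.
    """
    mask = ~np.isnan(X)               # True where a value was observed
    X0 = np.where(mask, X, 0.0)       # missing entries contribute nothing
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        WH = (W @ H) * mask           # reconstruction on observed cells only
        H *= (W.T @ X0) / (W.T @ WH + eps)
        WH = (W @ H) * mask
        W *= (X0 @ H.T) / (WH @ H.T + eps)
    return W, H

def impute(X, W, H):
    """Keep observed values and fill NaNs with the low-rank estimate."""
    return np.where(np.isnan(X), W @ H, X)

# Synthetic example (hypothetical data): 6 sensors, 24 time slots, rank 2.
rng = np.random.default_rng(42)
X_true = rng.random((6, 2)) @ rng.random((2, 24))
X = X_true.copy()
X[rng.random(X.shape) < 0.3] = np.nan   # drop roughly 30% of the values

W, H = masked_nmf(X, rank=2)
X_hat = impute(X, W, H)
```

In this sketch the rows of `H` act as temporal profiles and the rows of `W` give each sensor's weighting of those profiles; how profiles are actually defined and updated across rolling windows in the paper is not specified here.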
