Generalizable deep learning for a frequency modulated continuous wave radar-based human activity recognition

Habib-ur-Rehman Khalid

Abstract

Human Activity Recognition (HAR) is vital for understanding human behavior and improving people's lives. The overall goal of a reliable HAR system is to automatically analyze and comprehend human actions based on data collected from various sensors and sources across different environments. In this research, we utilize a Frequency Modulated Continuous Wave (FMCW) radar sensor. Radar-based HAR has several advantages over camera-based HAR. It is less affected by external environmental factors, such as lighting conditions, smoke, or dust, making it easily deployable and cost-effective. Furthermore, compared to camera-based HAR, which may raise privacy concerns, radar-HAR is less invasive because it does not capture visual details of the participants. In recent years, machine learning algorithms have made significant progress in analyzing big data, and HAR is one of the key areas where they have been applied successfully. Much of this success is due to supervised learning, which relies heavily on the availability of large labeled datasets. However, creating labeled datasets is a time-consuming and expensive process that requires significant human effort and domain expertise. This makes supervised learning infeasible for unique or niche domains with limited data availability, such as radar-HAR. Additionally, supervised learning algorithms may not perform well on unseen data if there is a significant difference between the training and testing datasets, a problem known as “domain-shift” or “dataset bias”. In this thesis, we focus on the generalizability of deep learning-based models in the presence of domain-shift for an indoor radar-HAR application. In this context, radar target tracking-based auxiliary features and preprocessing steps for the radar data cube are proposed. The tracking-based features provide the dynamic context of the participants in an indoor environment.
At the same time, the proposed preprocessing steps for the radar data cube are driven by the Doppler and range energy dispersion-based profiles of the participants' micro-motion. A novel Multi-View CNN-LSTM-based multi-model approach is proposed, which efficiently combines the complementary holistic view of the participants, given by the dynamic auxiliary contextual features, with the energy dispersion-based spatiotemporal features from the preprocessed radar data cubes. Moreover, to facilitate robust and class-agnostic feature extraction, an unsupervised Convolutional Auto-encoder-based training and model initialization step is proposed, followed by supervised training and fine-tuning steps. Lastly, to address the domain-shift problem, the proposed robust model training methodology is extended with a CORrelation ALignment (CORAL)-based Multi-View Unsupervised Domain Adaptation step for model adaptation.
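
For readers unfamiliar with CORAL, the following is a minimal sketch of the standard single-view CORAL loss, which penalizes the squared Frobenius distance between the feature covariance matrices of a labeled source domain and an unlabeled target domain, scaled by 1/(4d^2) for d-dimensional features. It is not the thesis's Multi-View formulation, whose details the abstract does not give. The function names `covariance` and `coral_loss` and the toy feature values are assumptions made purely for illustration, and plain Python lists are used in place of a deep learning framework to keep the example self-contained.

```python
from typing import List

Matrix = List[List[float]]


def covariance(features: Matrix) -> Matrix:
    """Unbiased sample covariance of row-wise feature vectors (n rows, d columns)."""
    n, d = len(features), len(features[0])
    if n < 2:
        raise ValueError("need at least two samples to estimate a covariance")
    # Column means, then mean-centered copies of every row.
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    # d x d covariance matrix with the (n - 1) normalization.
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source: Matrix, target: Matrix) -> float:
    """Standard CORAL loss: ||C_s - C_t||_F^2 / (4 d^2)."""
    d = len(source[0])
    if len(target[0]) != d:
        raise ValueError("source and target features must share a dimension")
    c_s, c_t = covariance(source), covariance(target)
    squared_frobenius = sum(
        (c_s[i][j] - c_t[i][j]) ** 2 for i in range(d) for j in range(d)
    )
    return squared_frobenius / (4.0 * d * d)


# Hypothetical toy features: identical statistics give zero loss,
# while a target with a different spread gives a positive loss.
source_features = [[0.0, 1.0], [2.0, 0.0], [1.0, 3.0]]
shifted_target = [[0.0, 2.0], [4.0, 0.0], [2.0, 6.0]]
same_domain_loss = coral_loss(source_features, source_features)
cross_domain_loss = coral_loss(source_features, shifted_target)
```

In an unsupervised domain adaptation setting, a term of this kind is typically added to the supervised classification loss on the source data, so that no target labels are required.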