There is a substantial demand for deep learning methods that can work with limited, high-dimensional, and noisy datasets. Nonetheless, current research mostly neglects this area, especially in the absence of prior expert knowledge or knowledge transfer. In this work, we bridge this gap by studying the performance of deep learning methods on the true data distribution in a limited, high-dimensional, and noisy data setting. To this end, we conduct a systematic evaluation that reduces the available training data while retaining the challenging properties mentioned above. Furthermore, we extensively search the space of hyperparameters and compare state-of-the-art architectures and models built and trained from scratch to advocate for the use of multi-objective tuning strategies. Our experiments highlight the lack of performant deep learning models in current literature and investigate the impact of training hyperparameters. We analyze the complexity of the models and demonstrate the advantage of choosing models tuned under multi-objective criteria in lower data regimes to reduce the likelihood of overfitting. Lastly, we demonstrate the importance of selecting a proper inductive bias given a dataset of limited size. Given our results, we conclude that tuning models using a multi-objective criterion results in simpler yet competitive models when reducing the number of data points.
Jaxy, S, Nowé, A & Libin, P 2024, A Systematic Analysis of Deep Learning Algorithms in High-Dimensional Data Regimes of Limited Size. in 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, pp. 515-523. https://doi.org/10.1109/ICTAI62512.2024.00079
Jaxy, S., Nowé, A., & Libin, P. (2024). A Systematic Analysis of Deep Learning Algorithms in High-Dimensional Data Regimes of Limited Size. In 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 515-523). IEEE. https://doi.org/10.1109/ICTAI62512.2024.00079
@inproceedings{480cee8875864238a189212f931083e8,
title = "A Systematic Analysis of Deep Learning Algorithms in High-Dimensional Data Regimes of Limited Size",
abstract = "There is a substantial demand for deep learning methods that can work with limited, high-dimensional, and noisy datasets. Nonetheless, current research mostly neglects this area, especially in the absence of prior expert knowledge or knowledge transfer. In this work, we bridge this gap by studying the performance of deep learning methods on the true data distribution in a limited, high-dimensional, and noisy data setting. To this end, we conduct a systematic evaluation that reduces the available training data while retaining the challenging properties mentioned above. Furthermore, we extensively search the space of hyperparameters and compare state-of-the-art architectures and models built and trained from scratch to advocate for the use of multi-objective tuning strategies. Our experiments highlight the lack of performant deep learning models in current literature and investigate the impact of training hyperparameters. We analyze the complexity of the models and demonstrate the advantage of choosing models tuned under multi-objective criteria in lower data regimes to reduce the likelihood of overfitting. Lastly, we demonstrate the importance of selecting a proper inductive bias given a dataset of limited size. Given our results, we conclude that tuning models using a multi-objective criterion results in simpler yet competitive models when reducing the number of data points.",
keywords = "Deep learning, Training, Analytical models, Systematics, High dimensional data, Training data, Data models, Noise measurement, Tuning, Knowledge transfer, Limited Data, Multi Objective Optimization, Overfit",
author = "Simon Jaxy and Ann Now{\'e} and Pieter Libin",
note = "Funding Information: S.J. gratefully acknowledges support from Fonds Wetenschappelijk Onderzoek (FWO) via FWO PhD Fellowship strategic basic research, Belgium 1SHHV24N. P.J.K.L. wishes to express gratitude for the support received from FWO via postdoctoral fellowship, Belgium 1242021N and the research council of the Vrije Universiteit Brussel (OZR-VUB via grant number OZR3863BOF). This research was supported by funding from the Flemish Government under the ``Onderzoeksprogramma Artifici{\"e}le Intelligentie (AI) Vlaanderen'' program and through the IMAGIca project by the Interdisciplinary Research Program of the Vrije Universiteit Brussel (reference IRP8 b). Lastly, we want to thank the HPC administration and support service of Vrije Universiteit Brussel that helped tremendously during the experimental phase, Bart Bogaerts for providing us with essential feedback and guidance throughout the development of this research and finally, Bram Silue, Denis Steckelmacher, and Samuele Pollaci for proofreading. Publisher Copyright: {\textcopyright} 2024 IEEE.",
year = "2024",
doi = "10.1109/ICTAI62512.2024.00079",
language = "English",
isbn = "9798331527242",
series = "2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI)",
publisher = "IEEE",
pages = "515--523",
booktitle = "2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI)",
}