This study investigates the potential of various computational models to forecast water potability two days in advance. The models evaluated include Long Short-Term Memory (LSTM), K-Nearest Neighbors (KNN), Gaussian Process, Random Forest, Gradient Boosting, XGBoost, LightGBM, Support Vector Machines (SVM), and Artificial Neural Networks (ANN). These models were trained, validated, and tested on an enhanced dataset incorporating key water quality parameters such as pH, temperature, dissolved oxygen, and conductivity. The results indicate that LightGBM and XGBoost outperformed other models, achieving an accuracy of 95.4%, while Gradient Boosting also demonstrated strong performance. To ensure a fair comparison, all models were evaluated under identical experimental conditions and datasets. This research highlights the effectiveness of advanced machine learning techniques in predicting water potability, offering valuable tools for water quality management and decision-making.
Ehmimed, N, Chkouri, MY & Touhafi, A 2026, Temporal SMOTE and Model Diversity: Boosting Accuracy in Water Quality Forecasting. in S Motahhir, B Bossoufi & JM Guerrero (eds), Digital Technologies and Applications - Proceedings of ICDTA 2025, Volume 4. Lecture Notes in Networks and Systems, vol. 1642 LNNS, Springer, pp. 103-114. https://doi.org/10.1007/978-3-032-07915-2_9
Ehmimed, N., Chkouri, M. Y., & Touhafi, A. (2026). Temporal SMOTE and Model Diversity: Boosting Accuracy in Water Quality Forecasting. In S. Motahhir, B. Bossoufi, & J. M. Guerrero (Eds.), Digital Technologies and Applications - Proceedings of ICDTA 2025, Volume 4 (pp. 103-114). (Lecture Notes in Networks and Systems; Vol. 1642 LNNS). Springer. https://doi.org/10.1007/978-3-032-07915-2_9
@inproceedings{d25159a9a6b3423aa85f81c8a8cabb5a,
title = "Temporal SMOTE and Model Diversity: Boosting Accuracy in Water Quality Forecasting",
abstract = "This study investigates the potential of various computational models to forecast water potability two days in advance. The models evaluated include Long Short-Term Memory (LSTM), K-Nearest Neighbors (KNN), Gaussian Process, Random Forest, Gradient Boosting, XGBoost, LightGBM, Support Vector Machines (SVM), and Artificial Neural Networks (ANN). These models were trained, validated, and tested on an enhanced dataset incorporating key water quality parameters such as pH, temperature, dissolved oxygen, and conductivity. The results indicate that LightGBM and XGBoost outperformed other models, achieving an accuracy of 95.4\%, while Gradient Boosting also demonstrated strong performance. To ensure a fair comparison, all models were evaluated under identical experimental conditions and datasets. This research highlights the effectiveness of advanced machine learning techniques in predicting water potability, offering valuable tools for water quality management and decision-making.",
author = "Nadir Ehmimed and Chkouri, {Mohamed Yassin} and Abdellah Touhafi",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.",
year = "2026",
doi = "10.1007/978-3-032-07915-2_9",
language = "English",
isbn = "978-3-032-07914-5",
series = "Lecture Notes in Networks and Systems",
volume = "1642",
publisher = "Springer",
pages = "103--114",
editor = "Saad Motahhir and Badre Bossoufi and Guerrero, {Josep M.}",
booktitle = "Digital Technologies and Applications - Proceedings of ICDTA 2025, Volume 4",
}