Temporal SMOTE and Model Diversity: Boosting Accuracy in Water Quality Forecasting

Nadir Ehmimed, Mohamed Yassin Chkouri, Abdellah Touhafi

Abstract

This study evaluates how well a range of computational models forecast water potability two days in advance. The models are Long Short-Term Memory (LSTM), K-Nearest Neighbors (KNN), Gaussian Process, Random Forest, Gradient Boosting, XGBoost, LightGBM, Support Vector Machines (SVM), and Artificial Neural Networks (ANN). Each model was trained, validated, and tested on an enhanced dataset of key water quality parameters, including pH, temperature, dissolved oxygen, and conductivity. LightGBM and XGBoost outperformed the other models, achieving an accuracy of 95.4%, and Gradient Boosting also performed strongly. To ensure a fair comparison, all models were evaluated under identical experimental conditions and on the same datasets. The results highlight the effectiveness of advanced machine learning techniques for predicting water potability and offer practical tools for water quality management and decision-making.
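As an illustration of the two-day-ahead setup described above, the following minimal sketch pairs the measurements from day t with the potability label from day t + 2 and then splits the result chronologically into training, validation, and test portions. It is not the authors' pipeline: the record layout, the field names (`ph`, `temperature`, `dissolved_oxygen`, `conductivity`, `potable`), the assumption of one record per day, the helper names, and the 60/20/20 split are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

HORIZON_DAYS = 2  # forecast horizon stated in the abstract

# Hypothetical feature names; the abstract lists these parameters as examples.
FEATURES = ["ph", "temperature", "dissolved_oxygen", "conductivity"]


def make_supervised(
    records: List[Dict[str, float]], horizon: int = HORIZON_DAYS
) -> Tuple[List[List[float]], List[int]]:
    """Pair features on day t with the potability label on day t + horizon.

    Assumes `records` is sorted by date with exactly one record per day.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1 day")
    if len(records) <= horizon:
        raise ValueError("not enough records for the requested horizon")
    # Inputs stop `horizon` days before the end; labels start `horizon` days in.
    X = [[r[name] for name in FEATURES] for r in records[:-horizon]]
    y = [int(r["potable"]) for r in records[horizon:]]
    return X, y


def chronological_split(X: list, y: list, train_frac: float = 0.6,
                        val_frac: float = 0.2):
    """Split without shuffling so validation and test data lie in the future."""
    n = len(X)
    i = int(n * train_frac)
    j = int(n * (train_frac + val_frac))
    return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])
```

Every model under comparison would then be fitted on the same training portion and scored on the same test portion, which is one simple way to keep the experimental conditions identical. The split is chronological rather than random so that no information from later days leaks into training.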