Improving GALDIT-based groundwater vulnerability predictive mapping using coupled resampling algorithms and machine learning models

Rahim Barzegar*, Siamak Razzagh, John Quilty, Jan Adamowski, Homa Kheyrollah Pour, Martijn J. Booij

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



Developing accurate groundwater vulnerability maps is important for the sustainable management of groundwater resources. In this research, resampling methods [e.g., Bootstrap Aggregating (BA) and Disjoint Aggregating (DA)] are combined with machine learning (ML) models, namely eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), Adaptive Boosting (AdaBoost), Categorical Boosting (CatBoost), and Random Forest (RF), to improve the GALDIT groundwater vulnerability mapping framework, which considers Groundwater occurrence (G) (i.e., aquifer type), Aquifer hydraulic conductivity (A), depth to groundwater Level (L), Distance from the seashore (D), Impact of existing seawater intrusion status (I), and aquifer Thickness (T). The proposed approach overcomes the subjectivity of the weights and ratings given to the six variables in the GALDIT framework (via the ML methods) and helps address the small dataset issue (via resampling methods) common to groundwater vulnerability predictive mapping. For the Shabestar Plain aquifer, situated northeast of Lake Urmia (Iran), the vulnerability indices predicted by GALDIT were adjusted using total dissolved solids (TDS, an indicator of drinking water quality) concentrations and were then modeled with the ML models. Pearson’s correlation coefficient (r) and distance correlation (DC) between the predicted vulnerability indices and TDS were used to validate the models. Using a validation set, the GALDIT framework (r = 0.447 and DC = 0.511) was compared against the best performing standalone (XGBoost-GALDIT, r = 0.613, DC = 0.647) and coupled resampling (BA-XGBoost-GALDIT, r = 0.659, DC = 0.699 and DA-RF-GALDIT, r = 0.616, DC = 0.662) ML models, revealing that the proposed framework significantly increases both r and DC. In general, the BA resampling method led to better performing ML models than DA. However, in all cases, integrating resampling methods with ML models was found to be a promising way to improve the accuracy of GALDIT vulnerability models.
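
The resampling and validation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the disjoint split assumes that DA partitions the training samples into non-overlapping subsets (the abstract does not define it), and simple averaging of per-subset predictions is an assumed aggregation rule. The two metrics follow the standard sample definitions of Pearson's r and distance correlation.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # row means equal column means (symmetric)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D sequences, in [0, 1]."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def bootstrap_subsets(n_samples, n_subsets, rng):
    """BA-style resampling: each subset draws indices with replacement."""
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_subsets)]


def disjoint_subsets(n_samples, n_subsets, rng):
    """DA-style resampling (assumed): shuffle once, split without overlap."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[k::n_subsets] for k in range(n_subsets)]


def aggregate(predictions):
    """Average the predictions of the per-subset models (assumed rule)."""
    return [sum(col) / len(col) for col in zip(*predictions)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical index subsets for 12 training wells and 3 base models.
    print(bootstrap_subsets(12, 3, rng))
    print(disjoint_subsets(12, 3, rng))
```

In such a workflow, one base model (for example XGBoost or RF) would be fitted on each index subset, the per-model vulnerability predictions would be combined with `aggregate`, and the combined predictions would be compared against TDS using `pearson_r` and `distance_correlation`.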
Original language: English
Article number: 126370
Number of pages: 15
Journal: Journal of Hydrology
Early online date: 26 Apr 2021
Publication status: Published - 1 Jul 2021

