Author(s): Gomez-Escalonilla V; Baron H; Read T; Watson J; Rodriguez Del Rosario M; Martinez-Santos P; Keller V; Rickards N; Darch G
Keywords: Groundwater quality; Machine learning; Nitrate
Abstract: This study predicts the spatial distribution of groundwater nitrate concentrations (above or below a specific threshold) in the Chalk Aquifer of East Anglia using supervised classification combined with stakeholder co-creation. A multi-year database of approximately 150 boreholes was compiled, containing nitrate measurements along with 27 explanatory variables covering geological properties, topographical factors, soil-related variables, climatic information, and potential nitrogen sources. The machine learning models showed high predictive performance, with F1 scores exceeding 0.85 in many cases. To enhance model interpretability, SHAP (SHapley Additive exPlanations) values were used to assess the relative influence of each predictor on nitrate occurrence. The resulting spatial probability map indicates that the highest probabilities of nitrate presence are primarily concentrated along the western boundary of the Chalk Aquifer, in areas where the aquifer is unconfined or lies at shallow depth beneath its confining deposits. This integrated approach highlights the potential of combining machine learning and stakeholder knowledge to support informed groundwater management and mitigation strategies for nitrate contamination.
Year: 2026
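
The abstract frames the task as binary classification (nitrate above or below a threshold) scored with F1. The following is a minimal sketch of that labelling and scoring step only, not the authors' pipeline. The threshold value, the borehole concentrations, the predictions, and the function names `label_exceedance` and `f1_score` are all hypothetical and chosen purely for illustration; the record does not state the threshold used.

```python
def label_exceedance(concentrations_mg_l, threshold_mg_l):
    """Return 1 where a nitrate concentration exceeds the threshold, else 0."""
    return [1 if c > threshold_mg_l else 0 for c in concentrations_mg_l]


def f1_score(y_true, y_pred):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical borehole nitrate concentrations (mg/L) and an assumed threshold.
observed = [12.0, 55.3, 48.9, 61.0, 30.2, 72.5]
threshold = 50.0  # assumption for illustration; not taken from the study

y_true = label_exceedance(observed, threshold)
y_pred = [0, 1, 1, 1, 0, 0]  # hypothetical classifier output

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

In practice a library implementation would normally replace the hand-written metric; the explicit version is shown here only to make the definition of F1 visible.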