Abstract

Highlights

  • The study highlights unsafe groundwater fluoride zones via machine-learning methods.
  • Random Forest and Gradient Boosting techniques predict high fluoride zones with high accuracy.
  • The study highlights the acceptability of machine learning in groundwater quality surveillance.

This study applied machine learning techniques to monitor and predict fluoride concentrations in the Vea Catchment area, a region affected by endemic fluoride contamination. The aim was to identify high-risk areas with unsafe fluoride levels, supporting public health surveillance and water management efforts. Water quality data, including fluoride concentrations and various physico-chemical parameters, were collected from multiple sampling sites. Machine learning models, including Logistic Regression, Random Forest, and Gradient Boosting, were used for regression, classification, and spatial analysis to classify regions as safe or unsafe based on World Health Organization (WHO) fluoride guidelines. The results demonstrated the effectiveness of the models: Random Forest and Gradient Boosting both achieved precision and F1-score values of 0.88 in predicting unsafe fluoride levels, successfully identifying high-risk areas with concentrations exceeding the WHO limit of 1.5 mg/L. Anomaly detection techniques revealed localized areas of concern. This study highlights the value of machine learning in water quality management, providing a data-driven approach to predicting fluoride contamination and informing public health interventions and water management strategies. The integration of predictive modeling with spatial analysis represents a significant advancement, offering the potential for real-time water quality monitoring in fluoride-endemic regions.
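As a reading aid, the two reported metrics (precision and F1-score) can be sketched in plain Python. The labels below are made-up illustrative values, not the study's data, and the function name `precision_and_f1` is hypothetical. Treating "unsafe" as the positive class is an assumption about how the models were scored.

```python
def precision_and_f1(y_true, y_pred, positive="unsafe"):
    """Return (precision, f1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, f1


# Hypothetical labels for eight sampling sites (illustrative only).
y_true = ["unsafe", "unsafe", "safe", "safe", "unsafe", "safe", "unsafe", "safe"]
y_pred = ["unsafe", "unsafe", "safe", "unsafe", "unsafe", "safe", "safe", "safe"]

precision, f1 = precision_and_f1(y_true, y_pred)
print(f"precision={precision:.2f}, F1={f1:.2f}")
```

Precision measures how many sites flagged as unsafe truly are unsafe, while F1 balances that against recall, so it also penalizes unsafe sites that the model misses.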

Introduction

Fluoride contamination in groundwater poses a significant environmental and public health issue, particularly in regions where groundwater serves as the primary source of drinking water. Fluoride occurs naturally in groundwater due to the weathering of fluoride-bearing minerals such as fluorite, apatite, and rock phosphate (Al Sabti et al., 2023; Mukherjee and Singh, 2018; Onipe et al., 2020; Zango et al., 2022). While fluoride at low concentrations is beneficial for dental health, excessive levels can lead to adverse health effects (Das et al., 2020). The World Health Organization (WHO) recommends a fluoride limit of 1.5 mg/L in drinking water (Ahmad Dar and Kurella, 2023; Kerdoun et al., 2022). Exceeding this limit can result in serious health conditions, especially in regions with prolonged exposure (Dhar and Bhatnagar, 2009).

Chronic ingestion of excess fluoride primarily manifests as dental and skeletal fluorosis (Kashyap et al., 2021; Srivastava and Flora, 2020). Dental fluorosis is evident in the discoloration, mottling, and pitting of teeth and typically occurs when children are exposed to high fluoride levels during tooth development (Liao, 2021; Mao and Wang, 2021). Skeletal fluorosis, a more severe condition, arises from long-term consumption of water with elevated fluoride levels and affects bones and joints, causing pain, stiffness, and, in advanced stages, crippling deformities (Ahmad Dar and Kurella, 2023; Srivastava and Flora, 2020; Zhou et al., 2023). These health impacts are particularly problematic in rural areas where communities often rely on fluoride-contaminated groundwater for drinking and domestic use (Onipe et al., 2020; Rudra, 2021).

In the Upper East Region of Ghana, high fluoride concentrations in groundwater have been reported in numerous studies (Asare-Donkor and Adimado, 2020; Obiri-Nyarko et al., 2022; Zango et al., 2021, 2022, 2024).
This region is semi-arid and characterized by a prolonged dry season, making groundwater an essential water resource for local communities. The Vea Catchment area, situated in this region, is an example of a fluoride-endemic zone where fluoride-contaminated groundwater has become a pressing concern (Zango et al., 2024). Given the health risks associated with excessive fluoride intake, monitoring and managing fluoride levels in groundwater sources is crucial for safeguarding public health in this region.

Monitoring fluoride concentrations in endemic regions remains a significant challenge due to the variability of fluoride levels across geographical locations and the limited capacity for continuous water quality testing. Traditional surveillance systems rely on periodic sampling and laboratory testing, which are resource-intensive and often provide incomplete spatial coverage. The heterogeneous distribution of fluoride in groundwater makes it difficult to identify high-risk areas through sporadic sampling alone. Furthermore, existing fluoride monitoring methods may not provide the timely data necessary for effective management and intervention (Solanki et al., 2022; Ali et al., 2023; Sunkari et al., 2024).

The spatial variability of fluoride concentrations, often influenced by local geological conditions, highlights the need for more sophisticated surveillance approaches. In many cases, groundwater samples collected from wells or boreholes located a few kilometers apart show significant variations in fluoride content.
Thus, understanding the spatial distribution of fluoride and identifying regions with unsafe levels requires advanced methods capable of leveraging both spatial and water quality data (Adimalla et al., 2020; Egbueri et al., 2023; Yassin et al., 2024).

Recent advances in machine learning (ML) techniques (Ravindra et al., 2023; Abu et al., 2024; Kazapoe et al., 2024) offer promising solutions for improving fluoride surveillance and risk prediction in such regions. By utilizing water quality data such as electrical conductivity (EC), total dissolved solids (TDS), and the concentrations of key ions (e.g., sodium, calcium, chloride), ML models can be developed to predict fluoride concentrations across a catchment. These models can provide insights into the relationships between fluoride and other water quality parameters, allowing for more accurate predictions and the identification of areas at risk of excessive fluoride exposure. Furthermore, ML methods enable the classification of regions based on fluoride risk, supporting better-targeted interventions.

The Vea Catchment, located in the Upper East Region of Ghana, covers an area characterized by a tropical savannah climate with distinct wet and dry seasons. During the long dry season, groundwater is the principal source of drinking water for the local population. The communities within the catchment rely on wells and boreholes for their daily water needs, making groundwater quality a critical concern for public health.

Geologically, the region is dominated by basement complex rocks, including granite and metamorphic formations (Zango et al., 2021, 2023). These rocks contain fluoride-bearing minerals, which leach into the groundwater and contribute to elevated fluoride levels (Zango et al., 2022, 2023). The dissolution of these minerals varies across the catchment, resulting in uneven fluoride concentrations in different parts of the region.
While some communities may access water that is safe for consumption, others may unknowingly be exposed to hazardous fluoride levels.

The Vea Catchment’s water resources are essential in sustaining local livelihoods, particularly in agriculture and domestic use (Arfasa et al., 2023; Zango et al., 2021). Ensuring that groundwater in the region remains safe for consumption is critical for protecting human health and maintaining the region’s economic and social well-being. Understanding the spatial distribution of fluoride concentrations across the catchment is necessary for the effective management of these water resources.

This study aims to address the challenge of fluoride contamination in the Vea Catchment by employing machine learning techniques to enhance surveillance and prediction. The primary objective is to develop models that can predict fluoride concentrations using other water quality parameters, providing a more efficient method for monitoring fluoride levels in the region. Additionally, the study seeks to classify areas within the catchment into risk categories based on fluoride concentrations, helping to identify locations where fluoride levels exceed the WHO recommended limit of 1.5 mg/L.

This research also aims to employ spatial analysis techniques to identify and map high-risk areas due to fluoride contamination within the Vea Catchment. By providing a visual representation of fluoride distribution across the region, the study seeks to pinpoint spatial patterns in fluoride concentrations. These insights are intended to be actionable, aiding local water authorities and policymakers in making informed decisions. The study is guided by two main research questions: First, how can machine learning techniques be utilized to identify high-risk areas of fluoride contamination in the Vea Catchment?
Second, what spatial patterns of fluoride concentrations can be detected within the catchment, and how can these patterns be leveraged to inform targeted interventions for managing water quality?

These questions aim to address the core challenges of fluoride surveillance in endemic regions and explore the potential of machine learning in improving the monitoring and prediction of fluoride levels in groundwater.
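The classification target described above, safe versus unsafe relative to the WHO limit of 1.5 mg/L, can be illustrated with a minimal Python sketch. The site identifiers and concentrations are hypothetical, as is the helper name `label_site`. Treating a sample at exactly 1.5 mg/L as safe is an assumption, based on the text describing unsafe levels as those that exceed the limit.

```python
WHO_FLUORIDE_LIMIT_MG_L = 1.5  # WHO guideline value cited in the text


def label_site(fluoride_mg_l, limit=WHO_FLUORIDE_LIMIT_MG_L):
    """Label a sample 'unsafe' when it exceeds the limit, otherwise 'safe'."""
    return "unsafe" if fluoride_mg_l > limit else "safe"


# Hypothetical samples: (site id, fluoride concentration in mg/L).
samples = [("BH-01", 0.4), ("BH-02", 1.5), ("BH-03", 2.3), ("BH-04", 4.0)]

labels = {site: label_site(conc) for site, conc in samples}
high_risk = [site for site, label in labels.items() if label == "unsafe"]

print(labels)
print("high-risk sites:", high_risk)
```

Labels of this kind would serve as the target variable for a classifier, with the other water quality parameters as inputs.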

Section snippets

Water quality monitoring

Groundwater contamination by naturally occurring fluoride is a global issue, particularly in regions where fluoride-bearing minerals are prevalent in the geological substrate. Numerous studies have highlighted the significance of monitoring water quality in fluoride-endemic regions, given the adverse health effects of prolonged exposure to elevated fluoride levels (Mridha et al., 2021; Nizam et al., 2022; Senthilkumar et al., 2021; Sunkari et al., 2024; Yuan et al., 2020). Monitoring fluoride…

Study area

The Vea catchment, a sub-catchment of Ghana’s White Volta Basin, spans across parts of Burkina Faso, and the Bongo District and Bolgatanga Municipality in Ghana’s Upper East Region. Situated between latitudes 10.43°N and 11°N, and longitudes 0.45°W and 1°W (see Fig. 1), this catchment encompasses a drainage area of approximately 305 km². Its average slope and elevation are 0.2% and 196.5 m above sea level, respectively (Zango et al., 2021).

The catchment is part of the larger Volta River Basin, …

Descriptive analysis

The mean fluoride concentration (F) across all sampled locations is approximately 1.7 mg/L, with a range of 0.4 mg/L to 4.0 mg/L, highlighting both safe and unsafe levels according to the World Health Organization’s standard of 1.5 mg/L. The electrical conductivity (EC), a measure of the water’s capacity to conduct electric current and an indicator of total ion concentration, has an average value of 410.4 µS/cm, with a minimum of 176.0 µS/cm and a maximum of 1352.0 µS/cm, showing moderate …
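Summary statistics of the kind reported here (mean, minimum, maximum) can be computed with Python's standard library, as in the rough sketch below. The concentrations are invented values chosen only to span the reported 0.4 to 4.0 mg/L range; they are not the study's measurements. The `share_above_limit` figure is an extra illustrative quantity that the text does not report.

```python
from statistics import mean

WHO_FLUORIDE_LIMIT_MG_L = 1.5  # WHO guideline value cited in the text

# Invented fluoride concentrations in mg/L (illustrative only).
fluoride_mg_l = [0.4, 0.8, 1.3, 1.7, 2.5, 4.0]

summary = {
    "mean": mean(fluoride_mg_l),
    "min": min(fluoride_mg_l),
    "max": max(fluoride_mg_l),
    # Fraction of samples strictly above the WHO limit.
    "share_above_limit": sum(c > WHO_FLUORIDE_LIMIT_MG_L for c in fluoride_mg_l)
    / len(fluoride_mg_l),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```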

Discussion

The study revealed several important findings related to the prediction of fluoride contamination and the identification of high-risk areas within the Vea Catchment. The ML models demonstrated robust predictive capabilities, with Random Forest and Gradient Boosting models achieving perfect accuracy in the training set and strong generalization in the testing set. Both ensemble models exhibited superior performance over Logistic Regression, particularly in identifying unsafe fluoride levels with …

Conclusion

This study has demonstrated the significant potential of machine learning in the surveillance of fluoride contamination, offering a powerful tool for safeguarding public health in fluoride-endemic regions such as the Vea Catchment. By applying advanced predictive models like Random Forest, Gradient Boosting, and Logistic Regression, the study successfully identified key areas at risk of unsafe fluoride levels, providing accurate and actionable insights that could enhance the effectiveness of …

Abstract online at https://www.sciencedirect.com/science/article/abs/pii/S1474706525000270