Please use this identifier to cite or link to this item: http://ir.mu.ac.ke:8080/jspui/handle/123456789/8341
Title: Predictive geochemical mapping using machine learning in western Kenya
Authors: Humphrey, Olivier S.
Cave, Mark
Hamilton, Elliott M.
Osano, Odipo
Menya, Diana
Watts, Michael J.
Keywords: Random Forest
Machine learning
Soil
Geochemistry
Uncertainty
Issue Date: 25-Oct-2023
Publisher: Elsevier
Abstract: Digital soil mapping techniques represent a cost-effective method for obtaining detailed information regarding the spatial distribution of chemical elements in soils. Machine learning (ML) algorithms using random forest (RF) models have been developed for classification, pattern recognition and regression tasks; they are capable of modelling non-linear relationships using a range of datasets, identifying hierarchical relationships, and determining the importance of predictor variables. In this study, we describe a framework for spatial prediction based on RF modelling where inverse distance weighted (IDW) predictors are used in conjunction with ancillary environmental covariates. The model was applied to predict the total concentration (mg kg⁻¹) and assess the prediction uncertainty of 56 elements, soil pH and organic matter content using 466 soil samples in western Kenya; the results for iodine (I), selenium (Se), zinc (Zn) and soil pH are highlighted in this work. These elements were selected due to their contrasting biogeochemical cycles and widespread dietary deficiencies in sub-Saharan Africa, whilst soil pH is an important parameter controlling soil chemical reactions. Algorithm performance was evaluated by determining the relative importance of each predictor variable and the model's response using partial dependence profiles. The accuracy and precision of each RF model were assessed by evaluating out-of-bag predicted values. The models' R² values range from 0.31 to 0.64, whilst CCC values range from 0.51 to 0.77. The IDW predictor variables had the greatest impact on assessing the distribution of soil properties in the study area; however, the inclusion of ancillary environmental data improved model performance for all soil properties. The results presented in this paper highlight the benefits of ML algorithms, which can incorporate multiple layers of data for spatial prediction, uncertainty assessment and attributing variable importance.
Additional research is now required to ensure health practitioners and the agri-community utilise the geochemical maps presented here for assessing the relationship between environmental geochemistry, endemic diseases and preventable micronutrient deficiency.
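The abstract names two quantities that a short sketch may make more concrete: an IDW value used as a predictor, and the CCC used to score out-of-bag predictions. The Python below is illustrative only and is not the authors' implementation. It assumes a distance power of 2, a leave-one-out scheme so that a sample does not contribute to its own IDW value, and that CCC denotes Lin's concordance correlation coefficient. None of these choices is stated in this record, and the function names are hypothetical.

```python
import math


def idw_predict(target, points, values, power=2.0):
    """Inverse-distance-weighted estimate at `target` from known (x, y) points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a sample: use its value directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def loo_idw_feature(points, values, power=2.0):
    """Leave-one-out IDW value per sample (assumed scheme, see lead-in)."""
    feature = []
    for i, target in enumerate(points):
        other_points = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        feature.append(idw_predict(target, other_points, other_values, power))
    return feature


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


if __name__ == "__main__":
    # Toy coordinates and concentrations, invented purely for illustration.
    coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    conc = [1.0, 2.0, 3.0]
    idw_feature = loo_idw_feature(coords, conc)
    print(idw_feature)
    print(concordance_cc(conc, idw_feature))
```

In a workflow of the kind the abstract outlines, a column such as `idw_feature` would sit beside the environmental covariates as input to the RF model, and `concordance_cc` would be applied to the observed values and the out-of-bag predictions. Unlike R², the CCC penalises systematic offset between the two as well as scatter.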
URI: http://ir.mu.ac.ke:8080/jspui/handle/123456789/8341
Appears in Collections: School of Public Health

Files in This Item:
File: MENYA.pdf; Size: 5.11 MB; Format: Adobe PDF

