Moi University Open Access Repository

Predictive geochemical mapping using machine learning in western Kenya

dc.contributor.author Humphrey, Olivier S.
dc.contributor.author Cave, Mark
dc.contributor.author Hamilton, Elliott M.
dc.contributor.author Osano, Odipo
dc.contributor.author Menya, Diana
dc.contributor.author Watts, Michael J.
dc.date.accessioned 2023-11-10T08:15:12Z
dc.date.available 2023-11-10T08:15:12Z
dc.date.issued 2023-10-25
dc.identifier.uri http://ir.mu.ac.ke:8080/jspui/handle/123456789/8341
dc.description.abstract Digital soil mapping techniques represent a cost-effective method for obtaining detailed information regarding the spatial distribution of chemical elements in soils. Machine learning (ML) algorithms using random forest (RF) models have been developed for classification, pattern recognition and regression tasks; they are capable of modelling non-linear relationships using a range of datasets, identifying hierarchical relationships, and determining the importance of predictor variables. In this study, we describe a framework for spatial prediction based on RF modelling where inverse distance weighted (IDW) predictors are used in conjunction with ancillary environmental covariates. The model was applied to predict the total concentration (mg kg⁻¹) and assess the prediction uncertainty of 56 elements, soil pH and organic matter content using 466 soil samples in western Kenya; the results for iodine (I), selenium (Se), zinc (Zn) and soil pH are highlighted in this work. These elements were selected due to their contrasting biogeochemical cycles and widespread dietary deficiencies in sub-Saharan Africa, whilst soil pH is an important parameter controlling soil chemical reactions. Algorithm performance was evaluated by determining the relative importance of each predictor variable and the model's response using partial dependence profiles. The accuracy and precision of each RF model were assessed by evaluating out-of-bag predicted values. The models' R² values range from 0.31 to 0.64, whilst concordance correlation coefficient (CCC) values range from 0.51 to 0.77. The IDW predictor variables had the greatest impact on assessing the distribution of soil properties in the study area; however, the inclusion of ancillary environmental data improved model performance for all soil properties. The results presented in this paper highlight the benefits of ML algorithms which can incorporate multiple layers of data for spatial prediction, uncertainty assessment and attributing variable importance. Additional research is now required to ensure health practitioners and the agri-community utilise the geochemical maps presented here for assessing the relationship between environmental geochemistry, endemic diseases and preventable micronutrient deficiency. en_US
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.subject Random Forest en_US
dc.subject Machine learning en_US
dc.subject Soil en_US
dc.subject Geochemistry en_US
dc.subject Uncertainty en_US
dc.title Predictive geochemical mapping using machine learning in western Kenya en_US
dc.type Article en_US
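
Illustrative note: the abstract mentions inverse distance weighted (IDW) predictors and the concordance correlation coefficient (CCC) without giving formulas. The Python sketch below shows one common way to compute each. It is not the authors' implementation. The function names, the leave-one-out construction, the power parameter of 2 and the toy sample values are all assumptions made for illustration, and the CCC follows Lin's standard definition with population (1/n) variances.

```python
import math


def idw_predict(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples is a list of (sx, sy, value) tuples. power=2.0 is an assumed
    default, not a value taken from the paper.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            # The query point coincides with a sample: return its value.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_idw(samples, power=2.0):
    """IDW estimate at each sample location using all *other* samples.

    Hypothetical helper: leaving the sample out keeps its own measurement
    from leaking into a predictor built for it.
    """
    preds = []
    for i, (sx, sy, _) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        preds.append(idw_predict(sx, sy, others, power))
    return preds


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Toy, made-up data: (easting, northing, measured value). Not from the study.
toy_samples = [(0.0, 0.0, 5.1), (1.0, 0.0, 5.6), (0.0, 1.0, 6.0), (1.0, 1.0, 6.4)]
toy_observed = [value for _, _, value in toy_samples]
toy_idw = leave_one_out_idw(toy_samples)
toy_ccc = concordance_cc(toy_observed, toy_idw)
```

In a workflow like the one the abstract describes, values such as `toy_idw` would be supplied to a random forest as one predictor alongside environmental covariates, and the CCC would be computed between measured values and the model's out-of-bag predictions rather than the raw IDW estimates.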

