- Title
Determination of bioavailable arsenic threshold and validation of modeled permissible total arsenic in paddy soil using machine learning.
- Authors
Mandal, Jajati; Jain, Vinay; Sengupta, Sudip; Rahman, Md. Aminur; Bhattacharyya, Kallol; Rahman, Mohammad Mahmudur; Golui, Debasis; Wood, Michael D.; Mondal, Debapriya
- Abstract
Minimizing arsenic intake from food consumption is a key aspect of the public health response in arsenic (As)‐contaminated regions. In many of these regions, rice is the predominant staple food. Here, we present a validated maximum allowable concentration of total As in paddy soil and provide the first derivation of a maximum allowable soil concentration for bioavailable As. We have previously used meta‐analysis to predict the maximum allowable total As in soil based on decision tree (DT) and logistic regression (LR) models. The models were defined using the maximum tolerable concentration (MTC) of As in rice grains as per the Codex recommendation. In the present study, we validated these models using three test data sets derived from purposely collected field data. The DT model performed better than the LR in terms of accuracy and Matthews correlation coefficient (MCC). Therefore, the DT estimated maximum allowable total As in paddy soil of 14 mg kg⁻¹ could confidently be used as an appropriate guideline value. We further used the purposely collected field data to predict the concentration of bioavailable As in the paddy soil with the help of random forest (RF), gradient boosting machine (GBM), and LR models. The category of grain As (<MTC and >MTC) was considered as the dependent variable; bioavailable As (BAs), total As (TAs), pH, organic carbon (OC), available phosphorus (AvP), and available iron (AvFe) were the predictor variables. LR performed better than RF and GBM in terms of accuracy, sensitivity, specificity, kappa, precision, log loss, F1 score, and MCC. From the better‐performing LR model, BAs, TAs, AvFe, and OC were significant variables for grain As. From the partial dependence plots (PDP) and individual conditional expectation (ICE) of the LR model, 5.70 mg kg⁻¹ was estimated to be the limit for BAs in soil.
Core ideas: Decision tree (DT) and logistic regression (LR) models are tested with field data. For rice cultivation, the better performing DT model predicts 14 mg kg⁻¹ total As as the soil limit. Both LR and random forest models identified available Fe, P, and organic carbon as important variables governing bioavailable As. From LR model, 5.70 mg kg⁻¹ is the threshold limit for soil bioavailable As for rice.
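The abstract compares classifiers by accuracy and Matthews correlation coefficient (MCC). As a minimal sketch of how those two metrics follow from a binary confusion matrix, the Python below computes both for a grain As classification (1 = above MTC, 0 = below MTC). The label vectors and the function names are hypothetical, chosen only for illustration; they are not the study's data or code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = >MTC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels, NOT taken from the study: 1 = grain As above MTC.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the <MTC and >MTC classes are unbalanced.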
- Subjects
ARSENIC; MACHINE learning; INDEPENDENT variables; SOILS; RANDOM forest algorithms; CONDITIONAL expectations
- Publication
Journal of Environmental Quality, 2023, Vol 52, Issue 2, p315
- ISSN
0047-2425
- Publication type
Article
- DOI
10.1002/jeq2.20452