Abstract
Soil extracellular electron transfer (EET) is a pivotal biological process in soil, yet predictive models for it are lacking. Herein, a machine learning model was developed to predict soil EET, using soil physicochemical properties as independent input variables and EET capability, expressed as current density (jmax) and Coulombic charge (Cout), as dependent output variables. The autoencoder ensemble stacking (AES) model integrates support vector machine, multilayer perceptron, extreme gradient boosting, and light gradient boosting machine algorithms as the stacking algorithms. With 10-fold cross-validation, the AES model achieved average test R² values of 0.83 for jmax and 0.84 for Cout, surpassing those of the single machine learning (ML) models and the basic ensemble model. Using partial dependence plots (PDPs), SHapley Additive exPlanations (SHAP) values, and SHAP decision plots, we quantitatively explained the impact and contribution of the input variables on the AES model's predictions of jmax and Cout. According to the SHAP analysis of the AES model, total carbon (TC) was the most relevant descriptor for jmax, while total organic carbon (TOC) was the most relevant descriptor for Cout. Predicting jmax and Cout with a multitask ML approach allowed the AES model to benefit from information shared across the input variables, thereby enhancing its overall generalizability. This study provides a feasible tool for predicting soil EET from soil physicochemical properties and advances the understanding of the relationship between soil physicochemical properties and EET capability.
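The evaluation protocol named above, an average test R² over 10-fold cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the regressor is a one-feature least-squares line standing in for the AES model, and the helper names (`r2_score`, `fit_line`, `kfold_indices`, `cross_val_r2`) are hypothetical.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (stand-in model, NOT the AES model)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_r2(x, y, k=10, seed=0):
    """Return the per-fold test R2 values from k-fold cross-validation."""
    scores = []
    for test_idx in kfold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = [a + b * x[i] for i in test_idx]
        scores.append(r2_score([y[i] for i in test_idx], y_pred))
    return scores


# Synthetic data (assumption, for illustration only): one feature, noisy linear response.
rng = random.Random(42)
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.5) for xi in x]

scores = cross_val_r2(x, y, k=10)
print(f"mean test R2 over {len(scores)} folds: {mean(scores):.2f}")
```

Each fold's model is fitted only on the other nine folds, so the averaged score reflects performance on samples the model has not seen, which is the sense in which a cross-validated test R² speaks to generalizability.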
Additional information
This work was supported by Guangdong Basic and Applied Basic Research Foundation (Grant No. 2023B1515040022) and the National Natural Science Foundation of China (Grant Nos. 42177270 and 42207340).
Supporting Information
The supporting information is available online at tech.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
Electronic supplementary material
11431_2023_2537_MOESM1_ESM.pdf
Supplementary materials: Predicting microbial extracellular electron transfer activity in paddy soils with soil physicochemical properties using machine learning
About this article
Cite this article
Ou, J., Luo, X., Liu, J. et al. Predicting microbial extracellular electron transfer activity in paddy soils with soil physicochemical properties using machine learning. Sci. China Technol. Sci. 67, 259–270 (2024). https://doi.org/10.1007/s11431-023-2537-y
DOI: https://doi.org/10.1007/s11431-023-2537-y