Abstract
Studies of the relationship between trace elements and diseases often need to build a classification or regression model, and the accuracy of such a model directly determines its applicability. The goal of this study is to explore the feasibility of applying boosting, a new strategy from machine learning, to model the relationship between trace elements and diseases. Two examples illustrate the technique in classification and regression, respectively. The first example involves the diagnosis of anorexia from the concentrations of six elements (a classification task). Decision stumps serve as the weak (base) algorithm, and a support vector machine serves as the reference algorithm. The second example involves the prediction of breast cancer mortality based on the intake of trace elements (a regression task). Here, partial least squares serves as both the weak (base) algorithm and the reference algorithm. The results from both examples confirm the potential of boosting for modeling the relationship between trace elements and diseases.
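To make the idea of boosting decision stumps concrete, the following is a minimal sketch of discrete AdaBoost in plain Python. It is not the authors' implementation and does not use the study's data: the function names, the exhaustive threshold search, and the one-feature toy data set are all assumptions chosen for illustration. Labels are coded as +1 and -1.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: return `polarity` if the feature exceeds the threshold, else its opposite."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None  # (feature, threshold, polarity, weighted_error)
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
        for threshold in thresholds:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[3]:
                    best = (feature, threshold, polarity, err)
    return best


def adaboost_fit(X, y, n_rounds=10, eps=1e-10):
    """Discrete AdaBoost: reweight samples so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble = []      # list of (alpha, feature, threshold, polarity)
    for _ in range(n_rounds):
        feature, threshold, polarity, raw_err = fit_stump(X, y, w)
        err = min(max(raw_err, eps), 1 - eps)    # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, feature, threshold, polarity))
        if raw_err <= eps:
            break  # a perfect stump leaves nothing to reweight
        # Increase the weight of misclassified samples, decrease the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in ensemble)
    return 1 if score >= 0 else -1


def accuracy(ensemble, X, y):
    return sum(adaboost_predict(ensemble, row) == yi for row, yi in zip(X, y)) / len(y)


# Hypothetical toy data: one feature, labels that no single threshold can separate.
X_toy = [[float(v)] for v in range(10)]
y_toy = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

acc_1 = accuracy(adaboost_fit(X_toy, y_toy, n_rounds=1), X_toy, y_toy)
acc_3 = accuracy(adaboost_fit(X_toy, y_toy, n_rounds=3), X_toy, y_toy)
print(acc_1, acc_3)
```

On this toy set a single stump misclassifies three of the ten samples, while the weighted vote of three stumps fits all of them. That only demonstrates the mechanism on training data; it says nothing about the accuracy reported in the study, which would need to be assessed on held-out samples.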
Acknowledgements
This work was supported by Sichuan Province Science Foundation for Youths (09ZQ026-066) and Scientific Research Startup Fund for Doctor, Yibin University (2008B06).
Cite this article
Tan, C., Chen, H. & Zhu, W. Application of Boosting Classification and Regression to Modeling the Relationships Between Trace Elements and Diseases. Biol Trace Elem Res 134, 146–159 (2010). https://doi.org/10.1007/s12011-009-8468-9
DOI: https://doi.org/10.1007/s12011-009-8468-9