A variable importance criterion for variable selection in near-infrared spectral analysis

  • Jin Zhang
  • Xiaoyu Cui
  • Wensheng Cai
  • Xueguang Shao (Email author)


Variable selection is a universal problem in building multivariate calibration models, such as quantitative structure-activity relationship (QSAR) models and models relating a quantity or property to spectral data. The prediction ability of a model can be improved significantly by reducing the bias induced by uninformative variables. In this study, a new criterion, denoted C, is proposed to evaluate the importance of the variables in a model. The value of C is defined as the average contribution of a variable to the model and is calculated from the statistics of models built with different combinations of the variables. In the calculation, a large number of partial least squares (PLS) models are built, each on a subset of variables selected by random resampling. This yields a vector of prediction errors, expressed as the root mean squared error of cross validation (RMSECV), and a matrix of 1s and 0s indicating which variables were selected and which were not in each model. When multiple linear regression (MLR) is used to model the relationship between the RMSECV values and this matrix, the coefficients of the MLR model serve as a criterion for the contribution of each variable to the RMSECV. To enhance the efficiency of the method, a multi-step shrinkage strategy is used. The criterion was compared with Monte Carlo uninformative variable elimination (MC-UVE), the randomization test (RT) and competitive adaptive reweighted sampling (CARS) on three NIR benchmark datasets. The results show that the proposed criterion is effective for selecting informative variables from the spectra and thereby improving the prediction ability of the models.
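
The calculation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pls1_fit_predict`, `rmsecv`, `importance_c`), the hand-written PLS1 routine, the independent random inclusion of each variable, the fold layout and all default parameter values are hypothetical choices made for the example. The multi-step shrinkage strategy is not reproduced, and the sign convention may differ from the paper's definition of C.

```python
import numpy as np

def pls1_fit_predict(X_train, y_train, X_test, n_components):
    """Minimal PLS1 (NIPALS) regression; returns predictions for X_test."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xr, yr = X_train - x_mean, y_train - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = yr @ t / tt               # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    if not W:                         # fall back to the mean prediction
        return np.full(X_test.shape[0], y_mean)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # regression vector
    return (X_test - x_mean) @ B + y_mean

def rmsecv(X, y, n_components, n_folds=5):
    """Root mean squared error of k-fold cross validation."""
    n = len(y)
    folds = np.arange(n) % n_folds    # assumption: interleaved folds
    press = 0.0
    for f in range(n_folds):
        test = folds == f
        pred = pls1_fit_predict(X[~test], y[~test], X[test], n_components)
        press += np.sum((y[test] - pred) ** 2)
    return np.sqrt(press / n)

def importance_c(X, y, n_models=500, frac=0.5, n_components=3, seed=0):
    """Regress RMSECV on a 0/1 selection matrix; return one coefficient per variable."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    S = np.zeros((n_models, n_vars))  # 1 = variable used in that model
    errors = np.empty(n_models)
    for i in range(n_models):
        mask = rng.random(n_vars) < frac      # assumption: independent inclusion
        while not mask.any():                 # avoid an empty subset
            mask = rng.random(n_vars) < frac
        S[i, mask] = 1.0
        k = min(n_components, int(mask.sum()))
        errors[i] = rmsecv(X[:, mask], y, k)
    # MLR with an intercept: errors ~ b0 + S @ b
    A = np.column_stack([np.ones(n_models), S])
    coef, *_ = np.linalg.lstsq(A, errors, rcond=None)
    return coef[1:]
```

In this sketch, a clearly negative coefficient suggests that including the variable tends to lower the RMSECV, so such variables would be candidates to retain, while coefficients near zero or positive point to uninformative variables.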


Keywords: near-infrared spectroscopy, variable selection, multivariate calibration, multi-step strategy





This work was supported by the National Natural Science Foundation of China (21475068, 21775076).



Copyright information

© Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Jin Zhang (1)
  • Xiaoyu Cui (1)
  • Wensheng Cai (1)
  • Xueguang Shao (1, 2, 3, 4, 5), Email author

  1. Research Center for Analytical Sciences, College of Chemistry, Nankai University, Tianjin, China
  2. Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin, China
  3. State Key Laboratory of Medicinal Chemical Biology, Tianjin, China
  4. Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, China
  5. Xinjiang Laboratory of Native Medicinal and Edible Plant Resources Chemistry, College of Chemistry and Environmental Science, Kashgar University, Kashgar, China
