
Combining support vector regression with feature selection for multivariate calibration

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Multivariate calibration is a classic problem in analytical chemistry, frequently solved with partial least squares (PLS) and artificial neural networks (ANNs) in previous work. A distinctive characteristic of multivariate calibration is high dimensionality combined with small sample size. Here, we apply support vector regression (SVR), together with ANNs and PLS, to the multivariate calibration problem of determining three aromatic amino acids (phenylalanine, tyrosine and tryptophan) in their mixtures by fluorescence spectroscopy. Leave-one-out results show that SVR performs better than the other methods and appears to be a good choice for this task. Furthermore, feature selection is performed for SVR to remove redundant features, and a novel algorithm named Prediction RIsk based FEature selection for support vector Regression (PRIFER) is proposed. Results on the above multivariate calibration data set show that PRIFER is a powerful tool for solving multivariate calibration problems.
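
The abstract describes two ingredients: leave-one-out evaluation of SVR, and prediction-risk-based feature selection. The following minimal Python sketch illustrates both ideas using scikit-learn; the synthetic data, hyperparameters, and the mean-substitution risk estimate are illustrative assumptions for exposition, not the authors' exact PRIFER procedure or experimental setup.

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(0)
    # Hypothetical data: 27 mixture samples x 50 spectral channels,
    # with only the first 3 channels actually informative.
    X = rng.normal(size=(27, 50))
    w = np.zeros(50)
    w[:3] = [1.0, 0.5, -0.3]
    y = X @ w + 0.05 * rng.normal(size=27)

    # Leave-one-out predictions with an RBF-kernel SVR
    svr = SVR(kernel="rbf", C=10.0, epsilon=0.01)
    y_hat = cross_val_predict(svr, X, y, cv=LeaveOneOut())
    print("LOO RMSE:", np.sqrt(np.mean((y - y_hat) ** 2)))

    # Prediction-risk ranking: refit on all data, then replace each
    # channel with its mean and measure the increase in squared error;
    # channels with the smallest risk are candidates for removal.
    svr.fit(X, y)
    base = np.mean((svr.predict(X) - y) ** 2)
    risk = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = Xp[:, j].mean()
        risk[j] = np.mean((svr.predict(Xp) - y) ** 2) - base
    print("5 least informative channels:", np.argsort(risk)[:5])

In this spirit, low-risk features would be pruned and the SVR retrained, repeating until prediction quality degrades; the paper's PRIFER algorithm formalizes such a loop for the fluorescence calibration data.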



Acknowledgments

Thanks to the late Professor Nian-Yi Chen for his advice on this paper. This work was supported in part by the Natural Science Foundation of China under grant nos. 20503015 and 60873129, by the Shanghai Rising-Star Program under grant no. 08QA14032, and by open funding from the Institute of Systems Biology of Shanghai University.

Author information

Corresponding author

Correspondence to Guo-Zheng Li.


About this article

Cite this article

Li, G.Z., Meng, H.H., Yang, M.Q. et al. Combining support vector regression with feature selection for multivariate calibration. Neural Comput & Applic 18, 813–820 (2009). https://doi.org/10.1007/s00521-008-0202-6


  • DOI: https://doi.org/10.1007/s00521-008-0202-6
