Ensemble Support Vector Regression: A New Non-parametric Approach for Multiple Imputation

  • Conference paper
Advanced Statistical Methods for the Analysis of Large Data-Sets

Part of the book series: Studies in Theoretical and Applied Statistics (STASSPSS)

Abstract

The complex case in which several variables contain missing values needs to be analyzed by means of an iterative procedure. The imputation methods most commonly employed, however, rely on parametric assumptions. In this paper we propose a new non-parametric method for multiple imputation based on Ensemble Support Vector Regression. This procedure works under quite general assumptions and has been tested with different simulation schemes. We show that the results obtained in this way are better than the ones obtained with other methods usually employed to get a complete data set.
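The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a linear epsilon-insensitive SVR fitted by subgradient descent, bagging as the ensemble scheme, and mean imputation as the starting point. All function names and parameter values (`fit_linear_svr`, `impute_once`, `multiple_impute`, `eps`, `lam`, `lr`) are hypothetical, and every column is assumed to have at least one observed value.

```python
import random

def fit_linear_svr(X, y, rng, eps=0.1, lam=0.01, lr=0.01, epochs=200):
    """Linear epsilon-insensitive SVR via subgradient descent (illustrative only)."""
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            resid = sum(wj * xj for wj, xj in zip(w, X[i])) + b - y[i]
            # Subgradient of the loss: zero inside the eps-tube, +/-1 outside.
            g = 0.0 if abs(resid) <= eps else (1.0 if resid > 0 else -1.0)
            w = [wj - lr * (lam * wj + g * xj) for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b

def predict(model, x):
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def impute_once(data, n_models=5, n_iter=3, seed=0):
    """Return one completed copy of `data` (list of rows, None marks missing)."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    missing = [[v is None for v in row] for row in data]
    # Initial fill: column means of the observed values.
    means = []
    for j in range(p):
        obs = [row[j] for row in data if row[j] is not None]
        means.append(sum(obs) / len(obs))
    filled = [[means[j] if v is None else v for j, v in enumerate(row)]
              for row in data]
    for _ in range(n_iter):                      # iterate over incomplete variables
        for j in range(p):
            mis_rows = [i for i in range(n) if missing[i][j]]
            if not mis_rows:
                continue
            obs_rows = [i for i in range(n) if not missing[i][j]]

            def feats(i):
                # Predictors: all other variables, at their current fill.
                return [filled[i][k] for k in range(p) if k != j]

            models = []
            for _ in range(n_models):            # bagging: one SVR per bootstrap sample
                boot = [rng.choice(obs_rows) for _ in obs_rows]
                models.append(fit_linear_svr([feats(i) for i in boot],
                                             [filled[i][j] for i in boot], rng))
            for i in mis_rows:                   # impute with the ensemble average
                x = feats(i)
                filled[i][j] = sum(predict(m, x) for m in models) / n_models
    return filled

def multiple_impute(data, m=5, **kwargs):
    """m completed data sets; they differ through the bootstrap seeds."""
    return [impute_once(data, seed=s, **kwargs) for s in range(m)]
```

Calling `multiple_impute(rows, m=5)` on a list of numeric rows with `None` for missing entries returns five completed copies; observed values are left untouched, and only the missing cells vary between copies.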

Author information

Corresponding author

Correspondence to Daria Scacciatelli.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Scacciatelli, D. (2012). Ensemble Support Vector Regression: A New Non-parametric Approach for Multiple Imputation. In: Di Ciaccio, A., Coli, M., Angulo Ibanez, J. (eds) Advanced Statistical Methods for the Analysis of Large Data-Sets. Studies in Theoretical and Applied Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21037-2_15
