
Bagging of Instance Selection Algorithms

  • Marcin Blachnik
  • Mirosław Kordos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8468)

Abstract

The paper presents bagging ensembles of instance selection algorithms. We use bagging to improve instance selection in two respects: data compression and prediction accuracy. The instance selection algorithms examined for classification are ENN, CNN, RNG and GE; for regression, they are the Generalized CNN and Generalized ENN algorithms developed by us. The results of a comparative experimental study performed with different configurations on several datasets show that the bagging-based approach allowed for significant improvement, especially in terms of data compression.
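The abstract describes the approach only at a high level. As an illustration, the following is a minimal Python sketch of one plausible instantiation: bagging of Wilson's ENN editing, where an instance is retained only if it survives selection in at least a given fraction (an acceptance threshold) of the bootstrap samples that contain it. The function names, parameters, and the exact voting scheme are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def enn_keep_mask(X, y, k=3):
    """Wilson's ENN: keep an instance only if its label agrees with the
    majority label of its k nearest neighbors (excluding itself)."""
    n = len(X)
    keep = np.zeros(n, dtype=bool)
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the instance itself
        nn = np.argsort(d)[:k]
        majority = Counter(y[nn]).most_common(1)[0][0]
        keep[i] = (majority == y[i])
    return keep

def bagged_enn(X, y, n_bags=10, threshold=0.5, k=3, rng=None):
    """Hypothetical bagged instance selection: run ENN on bootstrap
    samples and keep an instance if it was selected in at least
    `threshold` of the bags in which it appeared."""
    rng = np.random.default_rng(rng)
    n = len(X)
    votes = np.zeros(n)
    appearances = np.zeros(n)
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)   # bootstrap sample with replacement
        uniq = np.unique(idx)              # distinct instances in this bag
        keep = enn_keep_mask(X[uniq], y[uniq], k=k)
        appearances[uniq] += 1
        votes[uniq[keep]] += 1
    seen = appearances > 0
    selected = np.zeros(n, dtype=bool)
    selected[seen] = votes[seen] / appearances[seen] >= threshold
    return selected
```

On two well-separated clusters with a single mislabeled point, every bag's ENN run tends to vote the noisy instance out while retaining almost all clean instances, which is the compression-with-accuracy effect the abstract refers to.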

Keywords

Regression Problem, Instance Selection, Regression Task, Acceptance Threshold, Prototype Selection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marcin Blachnik¹
  • Mirosław Kordos²
  1. Department of Management and Informatics, Silesian University of Technology, Katowice, Poland
  2. Department of Mathematics and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland
