A New Training Sample Selection Method Avoiding Over-Fitting Based on Nearest Neighbor Rule

  • Guang Li
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 885)

Abstract

Sample selection is an important task. Many existing sample selection methods are based on the nearest neighbor rule, but most of them do not consider the over-fitting problem. To overcome this disadvantage, this paper proposes a new sample selection method that uses pruning and cross-validation to avoid over-fitting. It divides the original sample set into several disjoint subsets. In each round, one subset serves as the validation set and is used to prune the samples selected from the other subsets; every subset takes a turn as the validation set. The final result is obtained by combining all the selected sample sets. Experiments show that, compared with existing methods, the new method yields a smaller selected sample set, and better classifiers can be trained on the selected samples.
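The procedure outlined in the abstract can be sketched roughly as follows. The abstract does not specify the selection step, the pruning rule, or how the per-round results are combined, so this sketch fills those in with assumptions: Hart-style condensed nearest neighbor for selection, a greedy rule that drops a sample when 1-NN accuracy on the validation subset does not fall, and a plain union to combine. The names `condense`, `prune`, and `select_samples` are hypothetical and are not taken from the paper.

```python
import random

def nearest_label(x, samples):
    """Label of the sample nearest to x (squared Euclidean distance)."""
    best = min(samples, key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s[0])))
    return best[1]

def accuracy(selected, validation):
    """1-NN accuracy of `selected` measured on `validation`."""
    if not selected or not validation:
        return 0.0
    hits = sum(nearest_label(x, selected) == y for x, y in validation)
    return hits / len(validation)

def condense(train):
    """Assumed selection step: Hart-style condensed nearest neighbor.
    A sample is added only if the current subset misclassifies it."""
    selected = [train[0]]
    changed = True
    while changed:
        changed = False
        for x, y in train:
            if (x, y) not in selected and nearest_label(x, selected) != y:
                selected.append((x, y))
                changed = True
    return selected

def prune(selected, validation):
    """Assumed pruning rule: drop a sample when removing it does not
    lower 1-NN accuracy on the validation subset."""
    kept = list(selected)
    for s in selected:
        if len(kept) == 1:
            break
        trial = [t for t in kept if t != s]
        if accuracy(trial, validation) >= accuracy(kept, validation):
            kept = trial
    return kept

def select_samples(data, k=5, seed=0):
    """Split into k disjoint subsets; each takes a turn as validation set
    for pruning the samples selected from the rest. Results are unioned."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    result = []
    for i, validation in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        for s in prune(condense(train), validation):
            if s not in result:
                result.append(s)
    return result

# Toy data: two well-separated one-dimensional clusters, one per class.
data = [((float(i), 0.0), 0) for i in range(10)]
data += [((float(i) + 20.0, 0.0), 1) for i in range(10)]
selected = select_samples(data, k=5)
```

Each sample is a `(features, label)` pair with the features stored as a tuple. The distance computation here is brute force, which is adequate for a toy set but would need an index structure for anything large.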

Keywords

Sample selection; Nearest neighbor rule; Over-fitting; Cross-validation

Notes

Acknowledgments

This work was supported by the Natural Science Basic Research Plan in Shaanxi Province of China (Program No. 2016JQ6078) and the Fundamental Research Funds for the Central Universities of Chang’an University (300102328107, 0009—2014G6114024).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Electronic and Control Engineering, Chang’an University, Xi’an, China
