Soft Computing, Volume 23, Issue 24, pp 13235–13245

Constraint nearest neighbor for instance reduction

  • Lijun Yang
  • Qingsheng Zhu (Email author)
  • Jinlong Huang
  • Quanwang Wu
  • Dongdong Cheng
  • Xiaolu Hong
Methodologies and Application

Abstract

Instance-based machine learning algorithms often suffer from prohibitive computational cost and storage requirements. To overcome these problems, various instance reduction techniques have been developed to remove noisy and/or redundant instances. Condensation is the most frequently used approach; it aims to remove instances that lie far from the decision surface. Editing is another popular approach; it removes noisy instances to improve classification accuracy. Drawbacks of these existing techniques include parameter dependency and relatively low accuracy and reduction rates. To address these drawbacks, this paper proposes the constraint nearest neighbor-based instance reduction (CNNIR) algorithm. We first introduce the concept of the natural neighbor and apply it to instance reduction in order to eliminate noise and find core instances. We then define a constraint nearest neighbor chain, which consists of only three instances, and use it to select border instances that form a rough decision boundary. After that, a specific strategy is applied to reduce the border set. Finally, the reduced set is obtained by merging the border and core instances. Experimental results show that, compared with existing algorithms, the proposed algorithm effectively reduces the number of instances and achieves higher classification accuracy. Moreover, it does not require any user-defined parameters.
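
To make the two classical families named in the abstract concrete, the sketch below shows one well-known representative of each: Wilson's editing rule (1972), which drops instances that disagree with their k nearest neighbors, and Hart's condensed nearest neighbor rule (1968), which keeps only the instances needed for 1-NN to classify the rest correctly. This is not the CNNIR algorithm; the abstract does not specify its natural neighbor or constraint chain steps. The function names, the toy data and the choice k=3 are assumptions made for illustration only.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_label(query, points, labels, k, skip=None):
    """Majority label among the k points nearest to `query` (index `skip` excluded)."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: dist(query, points[i]),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def wilson_edit(points, labels, k=3):
    """Editing: return indices of instances whose k nearest neighbors agree with their label."""
    return [
        i for i in range(len(points))
        if knn_label(points[i], points, labels, k, skip=i) == labels[i]
    ]


def hart_condense(points, labels):
    """Condensation: grow a subset until 1-NN on it classifies every instance correctly."""
    keep = [0]  # seed with the first instance (an arbitrary choice)
    changed = True
    while changed:
        changed = False
        for i in range(len(points)):
            if i in keep:
                continue
            sub_pts = [points[j] for j in keep]
            sub_lbl = [labels[j] for j in keep]
            if knn_label(points[i], sub_pts, sub_lbl, k=1) != labels[i]:
                keep.append(i)  # misclassified by the current subset, so retain it
                changed = True
    return keep


# Toy data: two well-separated classes plus one mislabeled point (index 8).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = wilson_edit(points, labels, k=3)  # removes the noisy instance
core = hart_condense([points[i] for i in kept], [labels[i] for i in kept])
print(kept, core)  # [0, 1, 2, 3, 4, 5, 6, 7] [0, 4]
```

Note that the editing step needs a user-chosen k, which is the kind of parameter dependency the abstract lists as a drawback of existing techniques.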

Keywords

Instance reduction · Natural neighbor · Constraint nearest neighbor · Instance-based learning

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61802360 and 61502060), the Project of Chongqing Education Commission (No. KJZH17104), the Fundamental Research Funds for the Central Universities (No. 2018NQN05) and the China Postdoctoral Science Foundation (No. 2016M602651).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants and animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Lijun Yang (1, 2)
  • Qingsheng Zhu (2), Email author
  • Jinlong Huang (3)
  • Quanwang Wu (2)
  • Dongdong Cheng (2)
  • Xiaolu Hong (2)

  1. School of Computer Science and Technology, Southwest Minzu University, Chengdu, China
  2. College of Computer Science, Chongqing University, Chongqing, China
  3. College of Computer Engineering, Yangtze Normal University, Chongqing, China
