
Pairwise feature evaluation for constructing reduced representations

  • Theoretical Advances
  • Published in Pattern Analysis and Applications

Abstract

Feature selection methods are often used to determine a small set of informative features that guarantees good classification results. Such procedures usually consist of two components: a separability criterion and a selection strategy. The most basic choices for the latter are individual ranking, forward search and backward search; many intermediate methods, such as floating search, are also available. Both forward and backward selection may lead to a lossy evaluation of the criterion and/or to overtraining of the final classifier in high-dimensional spaces and small-sample-size problems; backward selection may, in addition, become computationally prohibitive. Individual ranking, on the other hand, suffers because it neglects dependencies between features. A new strategy based on pairwise evaluation has recently been proposed by Bo and Jonassen (Genome Biol 3, 2002) and Pękalska et al. (International Conference on Computer Recognition Systems, Poland, pp 271–278, 2005). Since it considers interactions between features, yet is always restricted to two-dimensional spaces, it may circumvent the small-sample-size problem. In this paper, we evaluate this idea in a more general framework for the selection of features as well as prototypes. Our finding is that such a pairwise selection may improve over traditional procedures, and we present artificial and real-world examples to support this claim. We have also found, however, that the set of problems for which pairwise selection is effective is small.
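
The pairwise strategy is easy to sketch. The fragment below is a minimal illustration of the general idea rather than the exact procedure of the paper: it assumes cross-validated accuracy of a linear discriminant as the two-dimensional separability criterion and harvests features greedily from the best-scoring pairs; the function name and parameter settings are hypothetical.

    import itertools

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score


    def pairwise_feature_selection(X, y, n_features, cv=5):
        """Score every feature pair with a separability criterion and
        greedily collect features from the best-scoring pairs."""
        scores = {}
        for i, j in itertools.combinations(range(X.shape[1]), 2):
            # Criterion (illustrative choice): mean cross-validated accuracy
            # of a linear discriminant on the 2D subspace spanned by (i, j).
            clf = LinearDiscriminantAnalysis()
            scores[(i, j)] = cross_val_score(clf, X[:, [i, j]], y, cv=cv).mean()

        selected = []
        # Visit pairs from best to worst; add unseen features until enough
        # have been collected.
        for pair in sorted(scores, key=scores.get, reverse=True):
            for f in pair:
                if f not in selected:
                    selected.append(f)
                if len(selected) == n_features:
                    return selected
        return selected


    if __name__ == "__main__":
        from sklearn.datasets import make_classification

        # Small-sample, many-feature toy problem (hypothetical settings).
        X, y = make_classification(n_samples=60, n_features=20,
                                   n_informative=4, random_state=0)
        print(pairwise_feature_selection(X, y, n_features=6))

Because every criterion evaluation is confined to a two-dimensional subspace, it can be estimated reliably even when the number of features far exceeds the number of samples, which is exactly the small-sample-size setting the strategy targets.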


References

  1. Alon U, Barkai N, Notterman D, Gish K, Ybarra S, Mack D, Levine A (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96(12):6745–6750

  2. Bennett CH, Gacs P, Li M, Vitányi PMB, Zurek W (1998) Information distance. IEEE Trans Inf Theory IT-44(4):1407–1423

  3. Bo T, Jonassen I (2002) New feature subset selection procedures for classification of expression profiles. Genome Biol 3

  4. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, California

  5. Brodatz P (1996) Textures: a photographic album for artists and designers. Dover, New York

  6. Bunke H, Sanfeliu A (1990) Syntactic and structural pattern recognition: theory and applications. World Scientific, Singapore

  7. Cover TM, van Campenhout JM (1977) On the possible orderings in the measurement selection problem. IEEE Trans Syst Man Cybern SMC-7(9):657–661

  8. Das S (2001) Filters, wrappers and a boosting-based hybrid for feature selection. In: International Conference on Machine Learning, pp 74–81

  9. Dubuisson MP, Jain AK (1994) A modified Hausdorff distance for object matching. In: International Conference on Pattern Recognition, vol 1, pp 566–568

  10. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York

  11. Duin RPW, Juszczak P, de Ridder D, Paclík P, Pękalska E, Tax DMJ (2004) PR-Tools, Pattern Recognition Tools. http://www.prtools.org

  12. Fukunaga K (1990) Introduction to statistical pattern recognition, 2nd edn. Academic Press, San Diego

  13. Hall M (2000) Correlation-based feature selection for machine learning. Ph.D. thesis, University of Waikato

  14. Jain AK, Zongker D (1997) Feature selection: evaluation, application, and small sample performance. IEEE Trans Pattern Anal Mach Intell 19(2):153–158

  15. Jain AK, Zongker D (1997) Representation and recognition of handwritten digits using deformable templates. IEEE Trans Pattern Anal Mach Intell 19(12):1386–1391

  16. Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37

  17. John GH, Kohavi R, Pfleger K (1994) Irrelevant features and the subset selection problem. In: Machine learning: Proceedings of the Eleventh International Conference. Morgan Kaufmann

  18. Kohavi R (1995) The power of decision tables. In: Proceedings of the Eighth European Conference on Machine Learning (ECML-95), Lecture Notes in Artificial Intelligence, vol 914. Springer, Berlin Heidelberg New York, pp 174–189

  19. Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97:273–324

  20. Li L, Weinberg CR, Darden TA, Pedersen LG (2001) Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method. Bioinformatics 17:1131–1142

  21. Lozano M, Sotoca JM, Sanchez JS, Pla F, Pękalska E, Duin RPW (2006) Experimental study on prototype optimisation algorithms for dissimilarity-based classifiers. Pattern Recognit 39(10):1827–1838

  22. Paclík P, Novovičová J, Somol P, Pudil P (2000) Road sign classification using Laplace kernel classifier. Pattern Recognit Lett 21(13–14):1165–1173

  23. Pękalska E, Duin RPW (2005) The dissimilarity representation for pattern recognition: foundations and applications. World Scientific, Singapore

  24. Pękalska E, Harol A, Lai C, Duin RPW (2005) Pairwise selection of features and prototypes. In: International Conference on Computer Recognition Systems, Poland, pp 271–278

  25. Pękalska E, Duin RPW, Paclík P (2002) A generalized kernel approach to dissimilarity-based classification. J Mach Learn Res 2(2):175–211

  26. Pękalska E, Duin RPW, Paclík P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recognit 39(2):189–208

  27. Pudil P, Novovičová J, Kittler J (1994) Floating search methods in feature selection. Pattern Recognit Lett 15:1119–1125

  28. Vapnik V (1998) Statistical learning theory. Wiley, New York

  29. Veltkamp RC, Hagedoorn M (2000) Shape similarity measures, properties and constructions. In: Advances in Visual Information Systems, pp 467–476

  30. Wilson CL, Garris MD (1992) Handprinted character database 3. Technical report, National Institute of Standards and Technology

  31. Xing E, Jordan M, Karp R (2001) Feature selection for high-dimensional genomic microarray data. In: International Conference on Machine Learning, pp 601–608

  32. Yu L, Liu H (2003) Feature selection for high-dimensional data: a fast correlation-based filter solution. In: International Conference on Machine Learning, Washington


Acknowledgments

This work is supported by the Dutch Organization for Scientific Research (NWO) and the Dutch Cancer Institute (NKI). The authors thank Prof. Anil Jain and Dr. Douglas Zongker for providing the Digit dissimilarity data and Dr. Pavel Paclík for providing the RoadSign dissimilarity data.

Author information

Corresponding author

Correspondence to Artsiom Harol.


About this article

Cite this article

Harol, A., Lai, C., Pękalska, E. et al. Pairwise feature evaluation for constructing reduced representations. Pattern Anal Applic 10, 55–68 (2007). https://doi.org/10.1007/s10044-006-0050-x

