
Introduction

  • Sanghamitra Bandyopadhyay
  • Sriparna Saha

Abstract

This chapter introduces the clustering problem, the different data types used, the problems of model selection and model order selection, outliers, and the associated research issues, challenges, and application domains. It starts with a brief overview of the different data types, e.g., binary, categorical, ordinal, and quantitative, with several examples. The steps in automatic machine recognition of patterns are then described in detail, including data collection, feature selection, classification, and clustering. Different distance measures used for clustering are mentioned briefly, and some ways of dealing with outliers and missing values in a data set are described. Finally, applications of pattern recognition techniques in different domains are highlighted.

Keywords

Feature Selection, Cluster Algorithm, Multiobjective Optimization, Cluster Technique, Linear Discriminant Function
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sanghamitra Bandyopadhyay (1)
  • Sriparna Saha (2)
  1. Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
  2. Dept. of Computer Science, Indian Institute of Technology, Patna, India
