Median Topographic Maps for Biomedical Data Sets

  • Barbara Hammer
  • Alexander Hasenfuss
  • Fabrice Rossi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5400)

Abstract

Median clustering extends popular neural data analysis methods, such as the self-organizing map and neural gas, to general data structures that are characterized only by a dissimilarity matrix. This yields flexible and robust methods for global data inspection that are particularly suited to the variety of data found in biomedical domains. In this chapter, we give an overview of median clustering, its properties, and its extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
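
To make the idea concrete, the following is a minimal sketch of a batch clustering loop in the spirit of median neural gas, working from a dissimilarity matrix alone. It is an illustration under stated assumptions, not the chapter's implementation: the function name `median_neural_gas`, the random initialization, and the multiplicative annealing schedule for the neighbourhood range are all choices made here for the example. The key point it shows is that prototypes are restricted to positions of data items, so no vector space is needed.

```python
import math
import random


def median_neural_gas(D, n_prototypes, n_epochs=20, seed=0):
    """Cluster items given only a dissimilarity matrix D (list of lists).

    Prototypes are indices of data items (the "median" restriction).
    Returns (prototype_indices, assignment), where assignment[j] is the
    position in prototype_indices of the prototype closest to item j.
    """
    n = len(D)
    rng = random.Random(seed)
    protos = rng.sample(range(n), n_prototypes)  # assumed initialization
    lam0 = n_prototypes / 2.0                    # assumed initial neighbourhood range
    for epoch in range(n_epochs):
        # Assumed schedule: shrink the neighbourhood range from lam0 to 0.01.
        lam = lam0 * (0.01 / lam0) ** (epoch / max(1, n_epochs - 1))
        # Step 1: for every item, rank the prototypes by dissimilarity and
        # turn each rank into a neighbourhood weight exp(-rank / lam).
        weights = [[0.0] * n for _ in range(n_prototypes)]
        for j in range(n):
            order = sorted(range(n_prototypes), key=lambda i: D[j][protos[i]])
            for rank, i in enumerate(order):
                weights[i][j] = math.exp(-rank / lam)
        # Step 2: move each prototype to the data item that minimizes its
        # weighted sum of dissimilarities (a generalized median).
        for i in range(n_prototypes):
            protos[i] = min(
                range(n),
                key=lambda l: sum(weights[i][j] * D[j][l] for j in range(n)),
            )
    assignment = [
        min(range(n_prototypes), key=lambda i: D[j][protos[i]]) for j in range(n)
    ]
    return protos, assignment


# Usage on a toy example: two well-separated groups on a line,
# described only through their pairwise absolute differences.
points = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in points] for a in points]
protos, assignment = median_neural_gas(D, n_prototypes=2)
print(sorted(protos), assignment)
```

Step 2 scans all items for every prototype, so one epoch of this naive version costs on the order of the number of prototypes times the squared number of items. That quadratic cost is what makes efficient implementations a concern for large data sets.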

Keywords

Dissimilarity Matrix, Biomedical Data, Biomedical Domain, Median Cluster, Standard Batch
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Barbara Hammer (1)
  • Alexander Hasenfuss (1)
  • Fabrice Rossi (2)
  1. Clausthal University of Technology, Clausthal-Zellerfeld, Germany
  2. INRIA Rocquencourt, Domaine de Voluceau, Rocquencourt, Le Chesnay Cedex, France
