Symbolic Data Analysis Approach to Clustering Large Datasets

  • Conference paper
Classification, Clustering, and Data Analysis

Abstract

The paper builds on the representation of units and clusters by a special type of symbolic object that consists of the distributions of the variables. Two compatible clustering methods are developed: the leaders method, which reduces a large dataset to a smaller set of symbolic objects (clusters), and a hierarchical clustering method, which is applied to these objects to reveal their internal structure. The proposed approach is illustrated on the USDA Nutrient Database.
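The two-stage idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each unit is taken to be a list of per-variable distributions, the dissimilarity is assumed to be a sum of squared differences between distributions, a leader is assumed to be the (weighted) average of its members, and all function names (`to_distribution`, `dissimilarity`, `mean_object`, `leaders`, `agglomerate`) are hypothetical. The paper itself may use a different dissimilarity and merge criterion.

```python
from collections import Counter

def to_distribution(values, categories):
    """Turn observed category labels into relative frequencies."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]

def dissimilarity(x, y):
    """ASSUMPTION: sum over variables of squared differences of distributions."""
    return sum((a - b) ** 2 for vx, vy in zip(x, y) for a, b in zip(vx, vy))

def mean_object(objects, weights):
    """Weighted average of symbolic objects, variable by variable."""
    total = sum(weights)
    return [
        [sum(w * o[v][i] for o, w in zip(objects, weights)) / total
         for i in range(len(objects[0][v]))]
        for v in range(len(objects[0]))
    ]

def leaders(units, initial_leaders, max_iter=100):
    """Alternate assignment and leader update until the assignment is stable."""
    current = [[list(d) for d in leader] for leader in initial_leaders]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(current)), key=lambda k: dissimilarity(u, current[k]))
            for u in units
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for k in range(len(current)):
            members = [u for u, a in zip(units, assignment) if a == k]
            if members:  # an empty cluster keeps its previous leader
                current[k] = mean_object(members, [1] * len(members))
    sizes = [assignment.count(k) for k in range(len(current))]
    return current, assignment, sizes

def agglomerate(leader_objects, sizes):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = {k: (l, s)
                for k, (l, s) in enumerate(zip(leader_objects, sizes)) if s > 0}
    history, next_id = [], len(leader_objects)
    while len(clusters) > 1:
        ids = sorted(clusters)
        a, b = min(
            ((i, j) for n, i in enumerate(ids) for j in ids[n + 1:]),
            key=lambda p: dissimilarity(clusters[p[0]][0], clusters[p[1]][0]),
        )
        (la, sa), (lb, sb) = clusters.pop(a), clusters.pop(b)
        clusters[next_id] = (mean_object([la, lb], [sa, sb]), sa + sb)
        history.append((a, b, next_id))
        next_id += 1
    return history

# Toy data (made up): four units, two variables with 2 and 3 categories.
units = [
    [[1.0, 0.0], [1.0, 0.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0, 1.0]],
    [[0.0, 1.0], [0.0, 0.0, 1.0]],
]
found, assignment, sizes = leaders(units, [units[0], units[2]])
history = agglomerate(found, sizes)
print(assignment)  # [0, 0, 1, 1]
print(history)     # [(0, 1, 2)]
```

Because a leader has the same form as a unit (one distribution per variable), the leaders produced by the first stage can be passed directly to the hierarchical stage, which is what makes the two methods compatible in this sketch.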

References

  • BATAGELJ, V. (1988): Generalized Ward and related clustering problems. In: H.-H. Bock (Ed.): Classification and Related Methods of Data Analysis. North-Holland, Amsterdam, 67–74.

  • BOCK, H.-H. (2000): Symbolic Data. In: H.-H. Bock and E. Diday (Eds.): Analysis of Symbolic Data. Exploratory methods for extracting statistical information from complex data. Springer, Heidelberg.

  • BOCK, H.-H. and DIDAY, E. (2000): Symbolic Objects. In: H.-H. Bock and E. Diday (Eds.): Analysis of Symbolic Data. Exploratory methods for extracting statistical information from complex data. Springer, Heidelberg.

  • DIDAY, E. (1979): Optimisation en classification automatique, Tome 1.,2. INRIA, Rocquencourt (in French).

  • DOUGHERTY, J., KOHAVI, R., and SAHAMI, M. (1995): Supervised and unsupervised discretization of continuous features. Proceedings of the Twelfth International Conference on Machine Learning (pp. 194–202). Tahoe City, CA: Morgan Kaufmann. http://citeseer.nj.nec.com/dougherty95supervised.html

  • HARTIGAN, J.A. (1975): Clustering Algorithms. Wiley, New York.

  • KORENJAK-ČERNE, S. and BATAGELJ, V. (1998): Clustering large datasets of mixed units. In: Rizzi, A., Vichi, M., Bock, H.-H. (Eds.): Advances in Data Science and Classification. Springer.

  • VERDE, R., DE CARVALHO, F.A.T. and LECHEVALLIER, Y. (2000): A Dynamic Clustering Algorithm for Multi-nominal Data. In: Kiers, H.A.L., Rasson, J.-P., Groenen, P.J.F., Schader, M. (Eds.): Data Analysis, Classification, and Related Methods. Springer.

  • USDA (2001): USDA Nutrient Database for Standard Reference, Release 14. U.S. Department of Agriculture, Agricultural Research Service. Nutrient Data Laboratory Home Page, http://www.nal.usda.gov/fnic/foodcomp.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Korenjak-Černe, S., Batagelj, V. (2002). Symbolic Data Analysis Approach to Clustering Large Datasets. In: Jajuga, K., Sokołowski, A., Bock, HH. (eds) Classification, Clustering, and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56181-8_35

  • DOI: https://doi.org/10.1007/978-3-642-56181-8_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43691-1

  • Online ISBN: 978-3-642-56181-8

  • eBook Packages: Springer Book Archive
