Analyzing Symbolic Data

Problems, Methods, and Perspectives

  • Conference paper
Cooperation in Classification and Data Analysis

Abstract

Classical data analysis considers data vectors with real-valued or categorical components. In contrast, Symbolic Data Analysis (SDA) deals with data vectors whose components are intervals, sets of categories, or even frequency distributions. SDA generalizes common methods of multivariate statistics to the case of symbolic data tables. This paper presents a brief survey of the basic problems and methods of this fast-developing branch of data analysis. As an alternative to the current, more or less heuristic, approaches, we propose a new probabilistic approach in this context. Our presentation concentrates on visualization, dissimilarities, and partition-type clustering for symbolic data.
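To make the notion of interval-valued data vectors and their dissimilarities concrete, the following minimal sketch represents each component as a (lower, upper) pair and compares two objects by the Hausdorff distance between corresponding intervals, summed over components. This is only one common choice of dissimilarity for interval data and is not claimed to be the measure used in the paper; the names `hausdorff_interval`, `interval_dissimilarity`, and the example objects are hypothetical.

```python
from typing import Sequence, Tuple

# A closed interval [lower, upper] with lower <= upper (illustrative type alias).
Interval = Tuple[float, float]


def hausdorff_interval(x: Interval, y: Interval) -> float:
    """Hausdorff distance between two closed intervals on the real line.

    For intervals [a, b] and [c, d] this equals max(|a - c|, |b - d|).
    """
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def interval_dissimilarity(u: Sequence[Interval], v: Sequence[Interval]) -> float:
    """Sum of componentwise Hausdorff distances.

    Summation is an assumed aggregation; a maximum or a Euclidean-type
    combination would be equally plausible alternatives.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(hausdorff_interval(a, b) for a, b in zip(u, v))


# Two hypothetical objects, each described by two interval-valued variables.
obj_a = [(10.0, 20.0), (1.0, 3.0)]
obj_b = [(12.0, 25.0), (2.0, 3.5)]
d = interval_dissimilarity(obj_a, obj_b)
```

A dissimilarity of this kind could serve as input to a partition-type clustering procedure, where each cluster is summarized by a prototype interval vector.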

Author information

Corresponding author

Correspondence to H.-H. Bock.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bock, HH. (2009). Analyzing Symbolic Data. In: Gaul, W., Bock, HH., Imaizumi, T., Okada, A. (eds) Cooperation in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00668-5_1
