Analyzing Symbolic Data

Bock, H.-H.

doi:10.1007/978-3-642-00668-5_1

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Classical data analysis considers data vectors with real-valued or categorical components. In contrast, Symbolic Data Analysis (SDA) deals with data vectors whose components are intervals, sets of categories, or even frequency distributions. SDA generalizes common methods of multivariate statistics to the case of symbolic data tables. This paper presents a brief survey on basic problems and methods of this fast-developing branch of data analysis. As an alternative to the current more or less heuristic approaches, we propose a new probabilistic approach in this context. Our presentation concentrates on visualization, dissimilarities, and partition-type clustering for symbolic data.
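
To make the notion of a symbolic data vector concrete, the following is a minimal sketch of interval-valued observations together with one possible dissimilarity. It is an illustration only, not the method proposed in the chapter: the Hausdorff distance between intervals, summed over components, is assumed here as one common choice, and the names `Interval`, `hausdorff`, `interval_dissimilarity`, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] used as one component of a symbolic data vector."""
    lo: float
    hi: float


def hausdorff(a: Interval, b: Interval) -> float:
    """Hausdorff distance between two closed intervals on the real line."""
    return max(abs(a.lo - b.lo), abs(a.hi - b.hi))


def interval_dissimilarity(x: Sequence[Interval], y: Sequence[Interval]) -> float:
    """Sum of component-wise Hausdorff distances (an assumed, illustrative choice)."""
    return sum(hausdorff(a, b) for a, b in zip(x, y))


# Two hypothetical interval-valued data vectors with two components each.
object_a = (Interval(60.0, 72.0), Interval(110.0, 130.0))
object_b = (Interval(64.0, 80.0), Interval(115.0, 125.0))

d = interval_dissimilarity(object_a, object_b)
print(d)
```

A dissimilarity of this kind could serve as input to a partition-type clustering procedure; other choices, such as L₂-type distances on interval bounds, would change the resulting values.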

Author information

Authors and Affiliations

Institute of Statistics, RWTH Aachen University, 52056, Aachen, Germany
H.-H. Bock

Corresponding author

Correspondence to H.-H. Bock.

Cite this paper

Bock, HH. (2009). Analyzing Symbolic Data. In: Gaul, W., Bock, HH., Imaizumi, T., Okada, A. (eds) Cooperation in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00668-5_1

DOI: https://doi.org/10.1007/978-3-642-00668-5_1
Published: 05 May 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00667-8
Online ISBN: 978-3-642-00668-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
