An Agglomerative Hierarchical Clustering Algorithm for Improving Symbolic Object Retrieval

  • Chapter
Selected Contributions in Data Analysis and Classification

Abstract

One of the main novelties of symbolic data analysis is the introduction of symbolic objects (SOs): “aggregated data” that synthesize information concerning a group of individuals of a population. SOs are particularly suitable for representing (and managing) census data that require the availability of aggregated information. This paper proposes a new (conceptual) hierarchical agglomerative clustering algorithm whose output is a “tree” of progressively more general SO descriptions. Such a tree can be used to improve the resource retrieval task, specifically to find the SO to which an individual belongs and/or to determine a more general representation of a given SO (i.e., to find a more general segment of information to which an SO belongs).
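The idea of a tree of progressively more general descriptions can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the chapter, whose details are not given in this abstract. It assumes that each SO is described by one closed interval per variable, that generalizing two SOs means taking the per-variable interval hull, and that dissimilarity is the hull length minus the overlap length summed over variables. The names `Node`, `generalize`, `dissimilarity`, `build_tree`, `covers`, and `retrieve`, as well as the toy census data, are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]
Description = Dict[str, Interval]  # assumed SO form: one interval per variable


@dataclass
class Node:
    desc: Description
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None  # set only on leaves (the original SOs)


def generalize(a: Description, b: Description) -> Description:
    """Smallest interval description covering both a and b (assumed operator)."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1])) for v in a}


def dissimilarity(a: Description, b: Description) -> float:
    """Hull length minus overlap length, summed over variables (assumed measure)."""
    total = 0.0
    for v in a:
        hull = max(a[v][1], b[v][1]) - min(a[v][0], b[v][0])
        overlap = max(0.0, min(a[v][1], b[v][1]) - max(a[v][0], b[v][0]))
        total += hull - overlap
    return total


def build_tree(objects: Dict[str, Description]) -> Node:
    """Repeatedly merge the two least dissimilar nodes into their generalization."""
    nodes = [Node(desc=d, label=name) for name, d in objects.items()]
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: dissimilarity(nodes[p[0]].desc, nodes[p[1]].desc),
        )
        merged = Node(generalize(nodes[i].desc, nodes[j].desc), nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def covers(desc: Description, individual: Dict[str, float]) -> bool:
    """True if every value of the individual lies inside the matching interval."""
    return all(lo <= individual[v] <= hi for v, (lo, hi) in desc.items())


def retrieve(node: Node, individual: Dict[str, float]) -> Optional[str]:
    """Descend the tree, pruning every subtree whose description excludes the individual."""
    if not covers(node.desc, individual):
        return None
    if node.label is not None:
        return node.label
    for child in (node.left, node.right):
        found = retrieve(child, individual)
        if found is not None:
            return found
    return None  # covered by a generalization but by no original SO


# Toy data, invented for illustration only.
census = {
    "young": {"age": (18, 30), "income": (10, 25)},
    "mid": {"age": (31, 50), "income": (20, 60)},
    "senior": {"age": (51, 80), "income": (15, 40)},
}
tree = build_tree(census)
match = retrieve(tree, {"age": 60, "income": 30})  # "senior"
```

Under these assumptions, a lookup discards a whole subtree as soon as its generalized description fails to cover the individual, which is where the retrieval benefit comes from. The description stored at the parent of a leaf gives a more general representation of that SO.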

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Esposito, F., d’Amato, C. (2007). An Agglomerative Hierarchical Clustering Algorithm for Improving Symbolic Object Retrieval. In: Brito, P., Cucumel, G., Bertrand, P., de Carvalho, F. (eds) Selected Contributions in Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73560-1_5
