Abstract
One of the main novelties of symbolic data analysis is the introduction of symbolic objects (SOs): “aggregated data” that synthesize information concerning a group of individuals of a population. SOs are particularly suitable for representing (and managing) census data, which require the availability of aggregated information. This paper proposes a new (conceptual) hierarchical agglomerative clustering algorithm whose output is a “tree” of progressively more general SO descriptions. Such a tree can be used to improve the resource retrieval task, specifically for finding the SO to which an individual belongs and/or for determining a more general representation of a given SO (i.e., finding a more general segment of information to which an SO belongs).
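The abstract does not give the algorithm itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes that every SO is described by interval-valued variables, that generalizing two SOs means taking the smallest intervals covering both, and that the pair to merge is chosen by a stand-in dissimilarity (the total width of the generalized description). The names `Node`, `generalize`, `cost`, `build_tree`, `covers` and `retrieve` are hypothetical, as is the toy data.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]
Description = Dict[str, Interval]  # variable name -> (low, high)


@dataclass
class Node:
    description: Description
    children: Tuple["Node", ...] = ()
    label: Optional[str] = None  # set only on leaves (the original SOs)


def generalize(a: Description, b: Description) -> Description:
    """Smallest interval description that covers both a and b (assumed operator)."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1])) for v in a}


def cost(a: Description, b: Description) -> float:
    """Stand-in dissimilarity: total interval width of the generalization."""
    return sum(hi - lo for lo, hi in generalize(a, b).values())


def build_tree(leaves: List[Node]) -> Node:
    """Agglomerative loop: repeatedly merge the cheapest pair into a more general node."""
    nodes = list(leaves)
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: cost(nodes[p[0]].description, nodes[p[1]].description),
        )
        merged = Node(
            generalize(nodes[i].description, nodes[j].description),
            (nodes[i], nodes[j]),
        )
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def covers(d: Description, individual: Dict[str, float]) -> bool:
    """True if every value of the individual lies inside the matching interval."""
    return all(lo <= individual[v] <= hi for v, (lo, hi) in d.items())


def retrieve(root: Node, individual: Dict[str, float]) -> Optional[str]:
    """Descend the tree, skipping subtrees whose description cannot cover the individual."""
    if not covers(root.description, individual):
        return None
    if not root.children:
        return root.label
    for child in root.children:
        found = retrieve(child, individual)
        if found is not None:
            return found
    return None


# Toy data (hypothetical): three SOs over two interval-valued variables.
leaves = [
    Node({"age": (18, 30), "income": (10, 20)}, label="young_low"),
    Node({"age": (20, 32), "income": (40, 60)}, label="young_high"),
    Node({"age": (60, 80), "income": (15, 30)}, label="senior"),
]
tree = build_tree(leaves)
match = retrieve(tree, {"age": 25, "income": 50})  # "young_high"
no_match = retrieve(tree, {"age": 45, "income": 25})  # None: no SO covers it
```

In this sketch each internal node is a more general description of its children, so a lookup can discard a whole subtree as soon as its description fails to cover the individual, which is the kind of retrieval benefit the abstract refers to.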
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Esposito, F., d’Amato, C. (2007). An Agglomerative Hierarchical Clustering Algorithm for Improving Symbolic Object Retrieval. In: Brito, P., Cucumel, G., Bertrand, P., de Carvalho, F. (eds) Selected Contributions in Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73560-1_5
DOI: https://doi.org/10.1007/978-3-540-73560-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73558-8
Online ISBN: 978-3-540-73560-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)