Abstract
Several learning tasks involve hierarchies. The quality of a learned hierarchy is often evaluated by comparing it with a “gold standard”. We assembled various similarity metrics that have been proposed in different disciplines and compared them within a unified interdisciplinary framework for hierarchical evaluation, which is based on the distinction of three fundamental dimensions. Having identified deficiencies in measuring structural similarity, we suggest three new measures for this purpose, either extending existing ones or based on new ideas. Experiments with an artificial dataset were performed to compare the different measures. Our results show that the measures vary greatly in their properties.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bade, K., Benz, D. (2009). Evaluation Strategies for Learning Algorithms of Hierarchies. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_7
DOI: https://doi.org/10.1007/978-3-642-01044-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01043-9
Online ISBN: 978-3-642-01044-6
eBook Packages: Mathematics and Statistics (R0)