Inducing Taxonomy from Tags: An Agglomerative Hierarchical Clustering Framework

  • Xiang Li
  • Huaimin Wang
  • Gang Yin
  • Tao Wang
  • Cheng Yang
  • Yue Yu
  • Dengqing Tang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7713)

Abstract

By amassing the ‘wisdom of the crowd’, social tagging systems are drawing increasing academic attention as a means of interpreting Internet folk knowledge. To uncover their hidden semantics, several studies have attempted to induce an ontology-like taxonomy from tags. As far as we know, these methods all need to compute an overall or relative generality for each tag, which is difficult and error-prone. In this paper, we propose an agglomerative hierarchical clustering framework that relies only on the pairwise similarity between tags. We enhance the framework by integrating it with a topic model to capture thematic correlations among tags. Experiments on a designated online tagging system show that our method can disclose new semantic structures that supplement the output of previous approaches. Finally, we demonstrate the effectiveness of our method with quantitative evaluations.
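
As a rough illustration of clustering tags from pairwise similarity alone, the following minimal sketch builds a binary tree over a handful of tags. It is not the authors' algorithm: the cosine similarity over tag-to-resource counts, the average-linkage merge rule, the function names, and the toy data are all assumptions made for this example, and no topic model is involved.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def leaves(tree):
    """Return the tags under a node (a tag string or a pair of subtrees)."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def agglomerate(tag_vectors):
    """Repeatedly merge the two most similar clusters into one binary tree.

    Assumption: cluster similarity is the average pairwise cosine
    similarity between member tags (average linkage).
    """
    clusters = list(tag_vectors)  # every tag starts as its own cluster

    def avg_sim(a, b):
        pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
        total = sum(cosine(tag_vectors[x], tag_vectors[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append((a, b))  # the merged pair becomes a new internal node
    return clusters[0]


# Hypothetical toy data: tag -> {resource id: number of times the tag was applied}
toy = {
    "python": {"r1": 3, "r2": 1},
    "django": {"r1": 2, "r2": 1},
    "guitar": {"r3": 4},
    "piano": {"r3": 1, "r4": 2},
}
tree = agglomerate(toy)
```

On this toy input, "python" and "django" share resources and end up in one subtree, while "guitar" and "piano" end up in the other. No per-tag generality score is computed at any point; only pairwise similarities drive the merges.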

Keywords

social tagging; semantics; tag taxonomy; tag generality; agglomerative hierarchical clustering; topic model

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Xiang Li (1)
  • Huaimin Wang (1)
  • Gang Yin (1)
  • Tao Wang (1)
  • Cheng Yang (1)
  • Yue Yu (1)
  • Dengqing Tang (2)

  1. National Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha, China
  2. College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha, China
