Tripartite Hidden Topic Models for Personalised Tag Suggestion

  • Morgan Harvey
  • Mark Baillie
  • Ian Ruthven
  • Mark J. Carman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5993)


Social tagging systems provide methods for users to categorise resources using their own choice of keywords (or “tags”) without being bound to a restrictive set of predefined terms. Such systems typically provide simple tag recommendations to increase the number of tags assigned to resources. In this paper we extend the latent Dirichlet allocation topic model to include user data and use the estimated probability distributions to provide personalised tag suggestions to users. We describe the resulting tripartite topic model in detail and show how it can be utilised to make personalised tag suggestions. Then, using data from a large-scale, real-life tagging system, we test our system against several baseline methods. Our experiments show a statistically significant increase in performance of our model over all key metrics, indicating that the model could be successfully used to provide further social tagging tools such as resource suggestion and collaborative filtering.
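The abstract does not state the model's equations, so the following is only an illustrative sketch of how estimated topic distributions could be turned into a personalised tag ranking. It assumes three already-estimated quantities: a topic distribution for the user, a topic distribution for the resource, and a tag distribution per topic. The function name `suggest_tags`, the product-of-preferences weighting, and the toy numbers are all assumptions made for illustration and should not be read as the paper's actual formulation.

```python
def suggest_tags(p_topic_given_user, p_topic_given_resource, p_tag_given_topic, k=3):
    """Rank tags by sum_z p(tag | z) * w(z), with w(z) proportional to
    p(z | user) * p(z | resource).

    This weighting is an assumed, simplified scoring rule, not the
    rule used in the paper.
    """
    # Combine the user's and the resource's topic preferences, then normalise.
    weights = [pu * pr for pu, pr in zip(p_topic_given_user, p_topic_given_resource)]
    total = sum(weights)
    if total == 0:
        raise ValueError("user and resource share no topic mass")
    weights = [w / total for w in weights]

    # Accumulate each tag's probability across topics, weighted by w(z).
    scores = {}
    for z, tag_dist in enumerate(p_tag_given_topic):
        for tag, p in tag_dist.items():
            scores[tag] = scores.get(tag, 0.0) + weights[z] * p

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy data: two topics (programming, animals).
P_TAG_GIVEN_TOPIC = [
    {"python": 0.6, "code": 0.3, "snake": 0.1},
    {"snake": 0.5, "zoo": 0.4, "python": 0.1},
]
RESOURCE_TOPICS = [0.5, 0.5]   # an ambiguous resource
PROGRAMMER = [0.9, 0.1]        # user who mostly tags programming content
ZOOLOGIST = [0.1, 0.9]         # user who mostly tags animal content

programmer_tags = suggest_tags(PROGRAMMER, RESOURCE_TOPICS, P_TAG_GIVEN_TOPIC)
zoologist_tags = suggest_tags(ZOOLOGIST, RESOURCE_TOPICS, P_TAG_GIVEN_TOPIC)
```

With this toy data the same ambiguous resource yields different suggestions for the two users, which is the kind of personalisation the abstract describes.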


Topic Model, Latent Dirichlet Allocation, Dirichlet Process, Latent Topic, Tripartite Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Morgan Harvey (1)
  • Mark Baillie (1)
  • Ian Ruthven (1)
  • Mark J. Carman (2)
  1. CIS Department, University of Strathclyde, Glasgow, UK
  2. Faculty of Informatics, University of Lugano, Lugano, Switzerland
