Folksonomy-Based Collabulary Learning

  • Leandro Balby Marinho
  • Krisztian Buza
  • Lars Schmidt-Thieme
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5318)


The growing popularity of social tagging systems promises to alleviate the knowledge bottleneck that slows down the full materialization of the Semantic Web, since these systems allow ordinary users to create and share knowledge in a simple, cheap, and scalable representation, usually known as a folksonomy. However, for the sake of knowledge workflow, one needs to find a compromise between the uncontrolled nature of folksonomies and the controlled, more systematic vocabulary of domain experts. In this paper, we address this concern by devising a method that automatically enriches a folksonomy with domain expert knowledge and by introducing a novel algorithm, based on frequent itemset mining techniques, that efficiently learns an ontology over the enriched folksonomy. To assess our method quantitatively, we propose a new benchmark for task-based ontology evaluation, in which the quality of an ontology is measured by how helpful it is for the task of personalized information finding. We conduct experiments on real data and empirically show the effectiveness of our approach.
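The abstract names frequent itemset mining as the basis of the learning algorithm but does not spell the algorithm out. As a rough illustration of the underlying technique only, the sketch below runs a plain Apriori-style search over the tag sets attached to resources. It is not the authors' algorithm: the function name `frequent_itemsets`, the `max_size` cap, and the toy `posts` data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support count} for every tag set that occurs in at
    least `min_support` transactions (generic Apriori-style search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single tags and keep the frequent ones.
    counts = Counter(frozenset([tag]) for t in transactions for tag in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    size = 2
    while current and size <= max_size:
        # Join step: unions of frequent (size-1)-itemsets with `size` items.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, size - 1))}
        # Count the support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        size += 1
    return result


# Hypothetical toy folksonomy: each set holds the tags assigned to one resource.
posts = [
    {"python", "programming", "code"},
    {"python", "programming"},
    {"java", "programming", "code"},
    {"python", "snake"},
    {"java", "programming"},
]

itemsets = frequent_itemsets(posts, min_support=2)
for tags, support in sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(tags), support)
```

On this toy data, tags that co-occur often with a more general tag (here "programming") surface as frequent pairs, which is the kind of co-occurrence signal an ontology learner could build on. How the paper turns such itemsets into an ontology is not stated in the abstract.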


Association Rule, Recommender System, Frequent Itemset, Semantic Mapping, Frequent Itemset Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Leandro Balby Marinho (1)
  • Krisztian Buza (1)
  • Lars Schmidt-Thieme (1)
  1. Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Hildesheim, Germany
