Emergent Semantics from Folksonomies: A Quantitative Study

  • Lei Zhang
  • Xian Wu
  • Yong Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4090)


Defining ontologies and using them to annotate web resources with semantic markup is generally perceived as the primary way to implement the vision of the Semantic Web. An ontology provides shared, machine-understandable semantics for web resources that agents and applications can utilize. This top-down approach (in the sense that an ontology is first defined on top of existing web resources and then used to mark them up), however, has a high barrier to entry and is difficult to scale up. In this paper, we investigate a bottom-up approach to semantically annotating web resources, as supported by the now widely popular social bookmarking services on the web, where users annotate and categorize web resources using “tags” that they choose freely, without any pre-existing global semantic model. Such informal social categories are known as “folksonomies”. We show how global semantics can be statistically inferred from folksonomies to semantically annotate web resources. The global semantic model also disambiguates tags and groups synonymous tags together. Finally, we show that there are indeed hierarchical relations among the concepts that emerge from the folksonomy, and that it is plausible to identify them further using more advanced probabilistic models.
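
As a rough illustration of how tag relations might be read off co-occurrence statistics, the following minimal sketch groups tags that tend to be applied to the same resources. It is not the probabilistic generative model studied in the paper: it uses plain cosine similarity over tag-resource counts, and the bookmark triples, the function names, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: (user, resource, tag) bookmark triples.
BOOKMARKS = [
    ("u1", "r1", "mac"), ("u1", "r1", "apple"),
    ("u2", "r1", "apple"), ("u2", "r2", "mac"),
    ("u3", "r2", "apple"), ("u3", "r3", "fruit"),
    ("u4", "r3", "recipe"), ("u4", "r4", "fruit"),
    ("u5", "r4", "recipe"),
]


def tag_vectors(bookmarks):
    """Map each tag to a Counter of how often it was applied to each resource."""
    vectors = defaultdict(Counter)
    for _user, resource, tag in bookmarks:
        vectors[tag][resource] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def related_tags(bookmarks, threshold=0.5):
    """Return alphabetically ordered tag pairs whose resource usage is similar."""
    vectors = tag_vectors(bookmarks)
    tags = sorted(vectors)
    return [
        (first, second)
        for i, first in enumerate(tags)
        for second in tags[i + 1:]
        if cosine(vectors[first], vectors[second]) >= threshold
    ]


if __name__ == "__main__":
    for pair in related_tags(BOOKMARKS):
        print(pair)
```

A heuristic like this only surfaces tags that are used together; it cannot by itself separate the different senses of an ambiguous tag, which is one motivation for the latent-variable models the abstract refers to.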


Keywords: Semantic Annotation; Ontology Engineering; Social Bookmark; Ontology Learning; Probabilistic Generative Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lei Zhang
    • 1
  • Xian Wu
    • 1
  • Yong Yu
    • 1
  1. APEX Data and Knowledge Management Lab, Department of Computer Science and Engineering, Shanghai JiaoTong University, Shanghai, China
