Characterizing the Semantic Web on the Web

  • Li Ding
  • Tim Finin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4273)


Semantic Web languages are being used to represent, encode and exchange semantic data in many contexts beyond the Web – in databases, multiagent systems, mobile computing, and ad hoc networking environments. The core paradigm, however, remains what we call the Web aspect of the Semantic Web – its use by independent and distributed agents who publish and consume data on the World Wide Web. To better understand this central use case, we have harvested and analyzed a collection of Semantic Web documents from an estimated ten million available on the Web. Using a corpus of more than 1.7 million documents comprising over 300 million RDF triples, we describe a number of global metrics, properties and usage patterns. Most of the metrics, such as the size of Semantic Web documents and the use frequency of Semantic Web terms, were found to follow a power law distribution.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Li Ding (1)
  • Tim Finin (2)
  1. Knowledge Systems Laboratory, Stanford University, Stanford, USA
  2. Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, USA
