Abstract
By leveraging Semantic Web technologies, Linked Open Data provides an extensive amount of structured information across a wide variety of domains. The principles of Linked Data facilitate access to and re-use of semantic information, for both human and machine consumption. However, information overload caused by the large amount of available semantic data, together with the need for automatic interpretation and analysis of the Web of Data, requires systematic approaches to managing the quality of published information. Identifying the value of the information provided by Linked Data enables a general understanding of the significance of semantic resources, which can lead to better information filtering in Semantic Web-based applications. The aim of this paper is to propose a novel approach, derived from information theory, to measuring informativeness in the context of the Web of Data. We experiment with the metric and present its applications in a variety of areas, including Linked Data quality analysis, faceted browsing, and ranking.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meymandpour, R., Davis, J.G. (2013). Linked Data Informativeness. In: Ishikawa, Y., Li, J., Wang, W., Zhang, R., Zhang, W. (eds) Web Technologies and Applications. APWeb 2013. Lecture Notes in Computer Science, vol 7808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37401-2_61
DOI: https://doi.org/10.1007/978-3-642-37401-2_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37400-5
Online ISBN: 978-3-642-37401-2
eBook Packages: Computer Science, Computer Science (R0)