Abstract
The goal of the Semantic Desktop is to enable better organization of the personal information on our computers by applying semantic technologies on the desktop. However, information on our desktop is often incomplete, as it is based on our subjective view of, or limited knowledge about, an entity. The Web of Data, on the other hand, contains information about virtually everything, generally from multiple sources. Connecting the desktop to the Web of Data would thus enrich and complement desktop information. Bringing in information from the Web of Data automatically would take the burden of searching for information off the user. In addition, connecting the two networks of data opens up the possibility of advanced personal services on the desktop.
Our solution tackles the problems raised above by using a semantic search engine for the Web of Data, such as Sindice, to find and retrieve a relevant subset of entities from the web. We present a matching framework that uses a combination of configurable heuristics and rules to compare data graphs and achieves a high degree of precision in the linking decision. We evaluate our methodology with real-world data: we create a gold standard from relevance judgements by experts and measure the performance of our system against it. We show that it is possible to automatically link desktop data with web data in an effective way.
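The abstract gives no implementation details of the matching framework, so the following is only a minimal sketch of what a combination of configurable, weighted heuristics with a linking threshold could look like. Everything in it is an assumption made for illustration: the flattened property-to-value view of an entity, the names Heuristic, name_similarity, exact_match, match_score and link, the weights, and the 0.8 threshold. It is not the authors' algorithm and does not call Sindice.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# Hypothetical simplification: an entity's data graph flattened to property -> value.
Entity = Dict[str, str]


def name_similarity(a: Entity, b: Entity) -> float:
    """String similarity of the 'name' property, 0.0 if either side lacks it."""
    na, nb = a.get("name"), b.get("name")
    if not na or not nb:
        return 0.0
    return SequenceMatcher(None, na.lower(), nb.lower()).ratio()


def exact_match(prop: str) -> Callable[[Entity, Entity], float]:
    """Build a heuristic that scores 1.0 when both entities share the same value."""
    def heuristic(a: Entity, b: Entity) -> float:
        va, vb = a.get(prop), b.get(prop)
        return 1.0 if va is not None and va == vb else 0.0
    return heuristic


@dataclass
class Heuristic:
    weight: float                              # configurable importance (assumed)
    score: Callable[[Entity, Entity], float]   # returns a value in [0, 1]


def match_score(desktop: Entity, candidate: Entity,
                heuristics: List[Heuristic]) -> float:
    """Weighted average of all heuristic scores."""
    total = sum(h.weight for h in heuristics)
    return sum(h.weight * h.score(desktop, candidate) for h in heuristics) / total


def link(desktop: Entity, candidates: List[Entity],
         heuristics: List[Heuristic], threshold: float = 0.8) -> List[Entity]:
    """Keep only candidates whose score reaches the (assumed) threshold."""
    return [c for c in candidates
            if match_score(desktop, c, heuristics) >= threshold]


if __name__ == "__main__":
    rules = [Heuristic(1.0, name_similarity), Heuristic(1.0, exact_match("homepage"))]
    me = {"name": "Stefan Decker", "homepage": "http://example.org/sd"}
    web = [
        {"name": "Stefan Decker", "homepage": "http://example.org/sd"},
        {"name": "Stefan Becker", "homepage": "http://example.org/sb"},
    ]
    print(link(me, web, rules))
```

In this sketch a high threshold trades recall for precision: a candidate that merely has a similar name, with no agreeing identifying property, is not linked.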
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Drăgan, L., Delbru, R., Groza, T., Handschuh, S., Decker, S. (2011). Linking Semantic Desktop Data to the Web of Data. In: Aroyo, L., et al. The Semantic Web – ISWC 2011. ISWC 2011. Lecture Notes in Computer Science, vol 7032. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25093-4_3
DOI: https://doi.org/10.1007/978-3-642-25093-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25092-7
Online ISBN: 978-3-642-25093-4
eBook Packages: Computer Science, Computer Science (R0)