KD2R: A Key Discovery Method for Semantic Reference Reconciliation

  • Danai Symeonidou
  • Nathalie Pernelle
  • Fatiha Saïs
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7046)

Abstract

The reference reconciliation problem consists of deciding whether different identifiers refer to the same real-world entity. Some existing reference reconciliation approaches use key constraints to infer reconciliation decisions. In the context of Linked Open Data, however, this knowledge is not available. We propose KD2R, a method that automatically discovers key constraints associated with OWL2 classes. These keys are discovered from RDF data, which may be incomplete. The proposed algorithm performs this discovery without having to scan all the data. KD2R has been tested on data sets from the international OAEI contest and obtains promising results.
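
To make the notion of a key constraint concrete, the following is a minimal sketch of a naive key check over instances of a single class. It is not the KD2R algorithm: it compares every pair of instances, which is exactly the full scan that KD2R is designed to avoid. The data layout (a dictionary mapping each instance identifier to its multi-valued properties), the function names `is_key` and `discover_minimal_keys`, and the treatment of a missing property value as "never matching" are all assumptions made for illustration.

```python
from itertools import combinations


def is_key(instances, properties):
    """Return True if no two distinct instances share a value for every property.

    Assumption: a missing property is read as an empty value set, so it
    never matches anything (one possible reading of incomplete data).
    """
    for (_, a), (_, b) in combinations(instances.items(), 2):
        # Two instances collide when, for each property in the candidate
        # key, their value sets have at least one value in common.
        if all(a.get(p, set()) & b.get(p, set()) for p in properties):
            return False
    return True


def discover_minimal_keys(instances, all_properties):
    """Enumerate candidate property sets by increasing size (brute force)."""
    keys = []
    for size in range(1, len(all_properties) + 1):
        for candidate in combinations(sorted(all_properties), size):
            # Skip supersets of an already discovered key: they are not minimal.
            if any(set(k) <= set(candidate) for k in keys):
                continue
            if is_key(instances, candidate):
                keys.append(candidate)
    return keys


# Hypothetical toy data: four instances of one class, one of them incomplete.
people = {
    "p1": {"name": {"Ann"}, "city": {"Paris"}},
    "p2": {"name": {"Ann"}, "city": {"Orsay"}},
    "p3": {"name": {"Bob"}, "city": {"Paris"}},
    "p4": {"name": {"Bob"}},  # no known city
}

print(discover_minimal_keys(people, {"name", "city"}))
```

On this toy data neither `name` nor `city` alone distinguishes all instances, while the pair does. A discovered key of this kind is what a reconciliation method could then use: two identifiers that agree on every property of a key can be inferred to denote the same entity.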

Keywords

Child Node, Resource Description Framework, Link Open Data, Property Expression, Real World Entity
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Danai Symeonidou (1)
  • Nathalie Pernelle (1)
  • Fatiha Saïs (1)
  1. LRI (CNRS & Paris-Sud XI University)/INRIA Saclay, LRI, Orsay, France
