Explaining Reference Reconciliation Decisions: A Coloured Petri Nets Based Approach

  • Souhir Gahbiche
  • Nathalie Pernelle
  • Fatiha Saïs
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 398)

Abstract

Data integration systems aim to facilitate the management of heterogeneous data sources. When huge amounts of data have to be integrated, resorting to human validation is not possible. However, fully automatic integration methods may give rise to decision errors and approximate results. Hence, such systems need explanation modules to enhance user confidence in the integrated data. In this paper, we focus on reference reconciliation methods, which compare data descriptions to decide whether they refer to the same real-world entity. Numerical reference reconciliation methods that are global and ontology-driven exploit semantic knowledge to model the dependencies between similarities and to propagate them to other references. In order to explain the similarity scores and the reconciliation decisions obtained by such methods, we have developed an explanation model based on Coloured Petri Nets, which provides graphical and comprehensive explanations to the user. This model makes it possible to show the relevance of a decision and to diagnose possible anomalies in the domain knowledge or in the similarity measures that are used.

Keywords

  • Similarity Score
  • Explanation Model
  • Data Integration System
  • Variable Place
  • Ontology Knowledge
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2012

Authors and Affiliations

  • Souhir Gahbiche
    • 1
  • Nathalie Pernelle
    • 2
  • Fatiha Saïs
    • 2
  1. LIMSI, Bât 508, Université Paris 11, Orsay Cedex, France
  2. Université Paris-Sud 11, INRIA Saclay, Orsay Cedex, France
