Tolerant Ad Hoc Data Propagation with Error Quantification

  • Philipp Rösch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4254)

Abstract

Nowadays, everybody uses a variety of systems that manage similar information, for example in the home entertainment sector. Unfortunately, these systems are largely heterogeneous, mostly with respect to the data model but at least with respect to the schema, which makes synchronization and propagation of data a daunting task. Our goal is to cope with this situation in a best-effort manner. To this end, we introduce a symmetric instance-level matching approach that establishes mappings without any user interaction, schema information, dictionaries, or ontologies. Because these mappings are inexact and incomplete, the quality of the propagation has to be quantified. For this purpose, different quality dimensions such as accuracy and completeness are introduced. Additionally, visualizing the quality allows users to evaluate the performance of the data propagation process.
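To make the idea of matching on instances rather than schemas more concrete, the following is a minimal sketch and not the method of the paper. It assumes flat records, compares fields purely by the overlap of their observed values (Jaccard similarity), and picks a greedy one-to-one mapping. The function names, the threshold of 0.5, the sample records, and the definition of completeness as the share of mapped source fields are all illustrative assumptions.

```python
from itertools import product


def value_sets(records):
    """Collect the set of normalized values observed for each field name."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, set()).add(str(value).strip().lower())
    return columns


def jaccard(a, b):
    """Symmetric overlap of two value sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_fields(source, target, threshold=0.5):
    """Greedy one-to-one field mapping based only on instance values.

    Hypothetical helper: field names are never compared, only values.
    """
    src, tgt = value_sets(source), value_sets(target)
    scored = sorted(
        ((jaccard(src[s], tgt[t]), s, t) for s, t in product(src, tgt)),
        reverse=True,
    )
    mapping, used = {}, set()
    for score, s, t in scored:
        if score < threshold:
            break  # remaining pairs are too dissimilar to map
        if s not in mapping and t not in used:
            mapping[s] = (t, score)
            used.add(t)
    return mapping


def completeness(mapping, source):
    """Assumed definition: fraction of source fields that received a mapping."""
    fields = value_sets(source)
    return len(mapping) / len(fields) if fields else 0.0


# Made-up sample data: two systems describing the same films differently.
source = [
    {"title": "Alien", "year": 1979, "dir": "Ridley Scott"},
    {"title": "Heat", "year": 1995, "dir": "Michael Mann"},
]
target = [
    {"name": "Alien", "released": "1979", "rating": 8},
    {"name": "Heat", "released": "1995", "rating": 9},
]

mapping = match_fields(source, target)
quality = completeness(mapping, source)
print(mapping, quality)
```

In this sketch, `title` and `year` find counterparts while `dir` stays unmapped, so the completeness score falls below 1. Measuring accuracy would additionally require a reference mapping, which is not assumed here.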

Keywords

Child Node; Cluster Group; Data Propagation Process; Schema Match; Source Document
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Philipp Rösch (1)
  1. Database Technology Group, Technische Universität Dresden, Germany
