
Estimating the quality of databases

  • Conference paper
Flexible Query Answering Systems (FQAS 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1495)


Abstract

With more and more electronic information sources becoming widely available, the issue of the quality of these often-competing sources has become germane. We propose a standard for specifying the quality of databases, which is based on the dual concepts of data soundness and data completeness. The relational model of data is extended by associating a quality specification with each relation instance, and by extending its algebra to calculate the quality specifications of derived relation instances. This provides a method for calculating the quality of answers to arbitrary queries from the overall quality specification of the database. We show practical methods for estimating the initial quality specifications of given databases, and we report on experiments that test the validity of our methods. Finally, we describe how quality estimations are being applied in the Multiplex multidatabase system to resolve cross-database inconsistencies.
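The abstract names soundness and completeness but does not give formulas for them. As a minimal sketch, assuming the two measures behave like precision and recall against a hypothetical reference ("real-world") instance of the same relation, they could be computed as follows. The function names, the set-of-tuples representation, and the sample data are illustrative assumptions, not the authors' definitions.

```python
from fractions import Fraction


def soundness(stored: set, real: set) -> Fraction:
    """Assumed definition: share of stored tuples that also occur in the reference instance."""
    if not stored:
        raise ValueError("soundness is undefined for an empty stored relation")
    return Fraction(len(stored & real), len(stored))


def completeness(stored: set, real: set) -> Fraction:
    """Assumed definition: share of reference tuples that also occur in the stored instance."""
    if not real:
        raise ValueError("completeness is undefined for an empty reference relation")
    return Fraction(len(stored & real), len(real))


# Hypothetical relation instances, each tuple is (employee, department).
stored = {("alice", "sales"), ("bob", "hr"), ("carol", "it")}
real = {("alice", "sales"), ("bob", "it"), ("carol", "it"), ("dave", "hr")}

print(soundness(stored, real))     # 2 of the 3 stored tuples are correct
print(completeness(stored, real))  # 2 of the 4 reference tuples are stored
```

A full reference instance is rarely available in practice, which is presumably why the abstract speaks of estimating quality specifications rather than computing them exactly.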




Editor information

Troels Andreasen, Henning Christiansen, Henrik Legind Larsen


Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Motro, A., Rakov, I. (1998). Estimating the quality of databases. In: Andreasen, T., Christiansen, H., Larsen, H.L. (eds) Flexible Query Answering Systems. FQAS 1998. Lecture Notes in Computer Science, vol 1495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0056011


  • DOI: https://doi.org/10.1007/BFb0056011


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65082-9

  • Online ISBN: 978-3-540-49655-7

  • eBook Packages: Springer Book Archive
