Data Quality in Web Information Systems

  • Conference paper
Web Information Systems Engineering - WISE 2008 (WISE 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5175)

Abstract

The World Wide Web has brought a wave of revolutionary changes in how people and organizations generate, disseminate and use data. With unprecedented access to massive amounts of data and powerful information-gathering capabilities enabled by Web-based technologies, the traditional closed-world assumption for database systems has been challenged. More and more data from the Web are used today as essential information sources, directly or indirectly, for all types of decision making, not only in personal but also in many business and scientific applications. A user of such Web data, however, has to constantly rely on their own judgement of data quality, such as correctness, currency, consistency and completeness. This is an unreliable and often very difficult process, as the quality of this judgement itself often relies on the quality of other information obtained from the Web, and the relationships among the data used can be very complex and sometimes hidden from the user.

While the issue of data quality is as old as data itself, it is now exposed at a much higher, broader and more critical level due to the scale, diversity and ubiquity of Web Information Systems. The intrinsic mismatch between the intended use and actual use of the data on the Web is a fundamental cause of poor data quality for Web-based applications. In this talk, we will introduce the notion of data quality, from its roots in management information systems research to new issues and challenges in the context of large-scale Web Information Systems. After a brief introduction to organizational and architectural solutions to the data quality problem, this talk will focus on current research activities and results on computational solutions from the database community in data profiling, record linking, conditional functional constraints, data provenance and data uncertainty. These technical solutions will be examined for their promise and limitations with respect to the problem of data quality in Web Information Systems. Finally, we will discuss a list of open research problems.

Editor information

James Bailey, David Maier, Klaus-Dieter Schewe, Bernhard Thalheim, Xiaoyang Sean Wang

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, X., Sadiq, S., Deng, K. (2008). Data Quality in Web Information Systems. In: Bailey, J., Maier, D., Schewe, KD., Thalheim, B., Wang, X.S. (eds) Web Information Systems Engineering - WISE 2008. WISE 2008. Lecture Notes in Computer Science, vol 5175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85481-4_1

  • DOI: https://doi.org/10.1007/978-3-540-85481-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85480-7

  • Online ISBN: 978-3-540-85481-4

  • eBook Packages: Computer Science, Computer Science (R0)