Data-Quality-Aware Skyline Queries

  • Hélène Jaudoin
  • Olivier Pivert
  • Grégory Smits
  • Virginie Thion
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8502)

Abstract

This paper deals with skyline queries in the context of “dirty databases”, i.e., databases that may contain poor-quality or suspect data. We assume that each tuple or attribute value of a given dataset is associated with a quality level, and we define several extensions of skyline queries that take data quality into account when checking whether one tuple is dominated by another. This leads to the computation of different types of gradual (fuzzy) skylines.
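The abstract does not give the paper's definitions, so the following is only a minimal sketch of the general idea, not the authors' method. It shows classical Pareto dominance and the classical skyline, then one hypothetical way a tuple-level quality level could turn the skyline into a graded one: a tuple's degree is reduced by the quality of its most trustworthy dominator. The names `Item`, `skyline_degrees`, the "smaller is better" convention, and the `1 - max quality` formula are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Item(NamedTuple):
    values: Tuple[float, ...]  # attribute values; smaller is better (assumption)
    quality: float             # hypothetical tuple-level quality in [0, 1]


def dominates(a: Item, b: Item) -> bool:
    """Classical Pareto dominance: a is no worse on every attribute
    and strictly better on at least one."""
    pairs = list(zip(a.values, b.values))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def skyline(items: Sequence[Item]) -> List[Item]:
    """Classical skyline: tuples dominated by no other tuple (quality ignored)."""
    return [t for t in items if not any(dominates(u, t) for u in items)]


def skyline_degrees(items: Sequence[Item]) -> List[float]:
    """Illustrative graded skyline (an assumption, not the paper's definition):
    degree(t) = 1 - max quality among the tuples that dominate t,
    or 1.0 when nothing dominates t."""
    degrees = []
    for t in items:
        dominator_qualities = [u.quality for u in items if dominates(u, t)]
        degrees.append(1.0 - max(dominator_qualities, default=0.0))
    return degrees
```

Under this sketch, a tuple dominated only by low-quality (suspect) tuples keeps a high degree of membership in the skyline, whereas a tuple dominated by a fully trusted tuple gets degree 0, which recovers the classical skyline when every quality level equals 1.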

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hélène Jaudoin (1)
  • Olivier Pivert (1)
  • Grégory Smits (1)
  • Virginie Thion (1)

  1. Université de Rennes 1, Irisa, France
