Rule-Based Multidimensional Data Quality Assessment Using Contexts

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2016)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9829))

Abstract

It is widely accepted that whether a value of a data quality metric is acceptable depends on the context in which the data are produced and consumed. In particular, in a data warehouse (DW), the context for the value of a measure is given by the dimensions and by external data. In this paper we propose the use of logic rules to assess the quality of measures in a DW, accounting for the context in which these measures are considered. To this end, we propose three sets of rules: one for representing the DW; a second for defining the particular context of the measures in the warehouse; and a third for representing data quality metrics. This provides a uniform, elegant, and flexible framework for context-aware DW quality assessment. Our representation is implementation independent, and allows us to assess not only the quality of measures at the lowest granularity level in a data cube, but also the quality of aggregate and dimension data.
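The three rule sets named in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' formalism: the paper uses logic rules, whereas this sketch stands in for them with plain Python data and functions. The names SALES, STORE_CITY, EXPECTED_RANGE, context, is_acceptable, fact_quality, and city_quality, and all values, are hypothetical.

```python
from collections import defaultdict

# Rule set 1 (hypothetical stand-in): the warehouse itself.
# Each fact is (store, day, amount): a measure at the lowest granularity.
SALES = [
    ("s1", "2016-01-01", 120.0),
    ("s2", "2016-01-01", 15.0),
    ("s3", "2016-01-01", 900.0),
]
# Rollup store -> city within an assumed Store dimension.
STORE_CITY = {"s1": "CityA", "s2": "CityA", "s3": "CityB"}

# Rule set 2 (hypothetical stand-in): the context of a measure,
# built from dimension data plus external data (expected ranges per city).
EXPECTED_RANGE = {"CityA": (50.0, 500.0), "CityB": (100.0, 2000.0)}


def context(store):
    """Return the expected amount range that applies to a store."""
    return EXPECTED_RANGE[STORE_CITY[store]]


# Rule set 3 (hypothetical stand-in): a data quality metric evaluated in context.
def is_acceptable(store, amount):
    """A measure is acceptable if it falls inside its contextual range."""
    low, high = context(store)
    return low <= amount <= high


def fact_quality():
    """Quality of each measure at the lowest granularity level."""
    return {(s, d): is_acceptable(s, a) for s, d, a in SALES}


def city_quality():
    """Quality at an aggregate level: fraction of acceptable facts per city."""
    ok, total = defaultdict(int), defaultdict(int)
    for store, _day, amount in SALES:
        city = STORE_CITY[store]
        total[city] += 1
        ok[city] += is_acceptable(store, amount)
    return {city: ok[city] / total[city] for city in total}
```

The same amount can pass in one context and fail in another: 15.0 is rejected for a CityA store only because the assumed range for CityA starts at 50.0. Keeping the three parts separate means the context or the metric can be swapped without touching the warehouse description.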

Corresponding author

Correspondence to Alejandro Vaisman.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Marotta, A., Vaisman, A. (2016). Rule-Based Multidimensional Data Quality Assessment Using Contexts. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_20

  • DOI: https://doi.org/10.1007/978-3-319-43946-4_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43945-7

  • Online ISBN: 978-3-319-43946-4

  • eBook Packages: Computer Science, Computer Science (R0)
