Abstract
It is widely accepted that whether a value of a data quality metric is acceptable depends on the context in which data are produced and consumed. In a data warehouse (DW), the context for the value of a measure is given by the dimensions and by external data. In this paper we propose the use of logic rules to assess the quality of measures in a DW, taking into account the context in which these measures are considered. To this end, we propose three sets of rules: one for representing the DW, a second for defining the particular context of the measures in the warehouse, and a third for representing data quality metrics. This provides a uniform, elegant, and flexible framework for context-aware DW quality assessment. Our representation is implementation independent, and allows us to assess not only the quality of measures at the lowest granularity level in a data cube, but also the quality of aggregate and dimension data.
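To make the three-layer idea concrete, the following is a minimal sketch in Python rather than in the logic-rule language of the paper. It is an illustration under assumptions, not the authors' formalization: the sales cube, the store-to-city roll-up, the per-city thresholds, and all names (`Fact`, `context_threshold`, `is_acceptable`, `acceptable_ratio`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One cell of a hypothetical sales cube at the lowest granularity."""
    product: str
    store: str
    month: str
    sales: float


# Layer 1 (hypothetical): the warehouse, as facts plus a dimension roll-up.
FACTS = [
    Fact("p1", "s1", "2016-01", 120.0),
    Fact("p1", "s2", "2016-01", 15.0),
    Fact("p2", "s1", "2016-01", 300.0),
]
STORE_TO_CITY = {"s1": "Montevideo", "s2": "Buenos Aires"}

# Layer 2 (hypothetical): the context, here external data giving the
# minimum sales value considered plausible in each city.
MIN_EXPECTED_SALES = {"Montevideo": 100.0, "Buenos Aires": 50.0}


def context_threshold(fact: Fact) -> float:
    """Derive the threshold for a fact from its dimension members."""
    city = STORE_TO_CITY[fact.store]
    return MIN_EXPECTED_SALES[city]


# Layer 3 (hypothetical): a quality metric evaluated against the context.
def is_acceptable(fact: Fact) -> bool:
    """A measure is acceptable if it reaches its contextual threshold."""
    return fact.sales >= context_threshold(fact)


def acceptable_ratio(facts: list[Fact]) -> float:
    """Quality of an aggregate: the share of acceptable underlying facts."""
    if not facts:
        return 1.0
    return sum(is_acceptable(f) for f in facts) / len(facts)
```

In this sketch the same measure value could be acceptable in one city and not in another, because the threshold is derived from the dimensions and external data rather than fixed in the metric itself.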
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Marotta, A., Vaisman, A. (2016). Rule-Based Multidimensional Data Quality Assessment Using Contexts. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_20
DOI: https://doi.org/10.1007/978-3-319-43946-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science (R0)