Maturing Pay-as-you-go Data Quality Management: Towards Decision Support for Paying the Larger Bills

  • Conference paper
Data Management Technologies and Applications (DATA 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 737)

Abstract

Data quality management is a major challenge today because abundant and heterogeneous datasets continue to proliferate. Organizations that build and maintain data-intensive applications have to deal with data quality problems on a daily basis. In these organizations, data quality problems are registered in natural language, and the organizations then rely on ad hoc, non-systematic, and expensive solutions to categorize and resolve the registered problems. In this contribution we present a formal description of an innovative data quality resolving architecture that semantically and dynamically maps the descriptions of data quality problems to data quality attributes. This mapping reduces complexity, because the dimensionality of data quality attributes is far smaller than that of the natural language space, and it enables data analysts to directly use the methods and tools proposed in the literature. Another challenge in data quality management is choosing appropriate solutions for data quality problems, given the lack of insight into the long-term or broader effects of candidate solutions. This difficulty becomes particularly prominent in flexible architectures where loosely linked data are integrated (e.g., in data spaces or open data settings). We also present a decision support framework for the solution selection process that evaluates the cost-benefit values of candidate solutions. The paper reports on a proof-of-concept tool for the proposed architecture and on its evaluation.

References

  1. Choenni, S., Leertouwer, E.: Public safety mashups to support policy makers. In: Andersen, K.N., Francesconi, E., Grönlund, Å., Engers, T.M. (eds.) EGOVIS 2010. LNCS, vol. 6267, pp. 234–248. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15172-9_22

  2. Netten, N., van den Braak, S., Choenni, S., Leertouwer, E.: Elapsed times in criminal justice systems. In: Proceedings of the 8th International Conference on Theory and Practice of Electronic Governance (ICEGOV), pp. 99–108. ACM (2014)

  3. van Dijk, J., Kalidien, S., Choenni, S.: Smart monitoring of the criminal justice system. Government Information Quarterly. Elsevier (2016). doi:10.1016/j.giq.2015.11.005

  4. Christoulakis, M., Spruit, M., van Dijk, J.: Data quality management in the public domain: a case study within the Dutch justice system. Int. J. Inf. Qual. 4(1), 1–17 (2015)

  5. Birman, K.P.: Consistency in distributed systems. In: Guide to Reliable Distributed Systems, pp. 457–470. Springer, Heidelberg (2012)

  6. Davenport, T.H., Glaser, J.: Just-in-time delivery comes to knowledge management. Harvard Bus. Rev. 80(7), 107–111 (2002)

  7. Bargh, M.S., van Dijk, J., Choenni, S.: Dynamic data quality management using issue tracking systems. IADIS Int. J. Comput. Sci. Inf. Syst. 10(2), 32–51 (2015). Isaias, P., Paprzycki, M. (eds.)

  8. Bargh, M.S., Mbgong, F., van Dijk, J., Choenni, S.: A framework for dynamic data quality management. In: Proceedings of the IADIS International Conference Information Systems Post-Implementation and Change Management, pp. 134–142 (2015)

  9. Bargh, M., van Dijk, J., Choenni, S.: Management of data quality related problems - exploiting operational knowledge. In: Proceedings of the 5th International Conference on Data Management Technologies and Applications (DATA), pp. 31–42. SciTePress (2016)

  10. Batini, C., Cappiello, C., Francalanci, C., Maurino, A.: Methodologies for data quality assessment and improvement. ACM Comput. Surv. 41(3), 16–52 (2009). Article no. 16

  11. Wand, Y., Wang, R.Y.: Anchoring data quality dimensions in ontological foundations. Commun. ACM 39(11), 86–95 (1996). ACM

  12. Davoudi, S., Dooling, J.A., Glondys, B., Jones, T.D., Kadlec, L., Overgaard, S.M., Ruben, K., Wendicke, A.: Data quality management model (2015 Update). J. AHIMA 86(10), 62–65 (2015). expanded web version

  13. Knowledgent: White Paper Series: Building a Successful Data Quality Management Program. http://knowledgent.com/whitepaper/building-successful-data-quality-management-program/. Accessed 31 Oct 2015

  14. Halevy, A., Rajaraman, A., Ordille, J.: Data integration: the teenage years. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 9–16. VLDB Endowment (2006)

  15. Wang, R.Y., Strong, D.M.: Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst. 12(4), 5–33 (1996)

  16. Price, R., Shanks, G.: A semiotic information quality framework. In: Proceedings of International Conference on Decision Support Systems (DSS), pp. 658–672 (2004)

  17. Woodall, P., Borek, A., Parlikad, A.K.: Data quality assessment: the hybrid approach. Inf. Manage. 50, 369–382 (2013)

  18. Bargh, M.S., Choenni, S., Meijer, R.: Privacy and information sharing in a judicial setting: a wicked problem. In: Proceedings of the 16th Annual International Conference on Digital Government Research, pp. 97–106. ACM, New York (2015)

  19. Jiang, L., Barone, D., Borgida, A., Mylopoulos, J.: Measuring and comparing effectiveness of data quality techniques. In: Eck, P., Gordijn, J., Wieringa, R. (eds.) CAiSE 2009. LNCS, vol. 5565, pp. 171–185. Springer, Heidelberg (2009). doi:10.1007/978-3-642-02144-2_17

  20. Bugzilla Website. https://www.bugzilla.org. Accessed 31 Oct 2015

  21. JIRA Software Website. https://www.atlassian.com/software/jira. Accessed 31 Oct 2015

  22. H2desk Website. https://www.h2desk.com. Accessed 31 Oct 2015

  23. TOPdesk Website. http://www.topdesk.nl. Accessed 31 Oct 2015

  24. Canovas Izquierdo, J.L., Cosentino, V., Rolandi, B., Bergel, A., Cabot, J.: GiLA: GitHub label analyzer. In: IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering, pp. 479–483, Montreal, Canada (2015)

  25. Environmental Protection Agency: Data quality assessment: a reviewer’s guide. Technical report EPA/240/B-06/002, EPA QA/G-9R (2006)

  26. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM 45(4), 211–218 (2002). ACM

  27. Eppler, M.J., Wittig, D.: Conceptualizing information quality: a review of information quality frameworks from the last ten years. In: Proceedings of the Conference on Info Quality, pp. 83–96 (2000)

  28. Lee, Y.: Crafting rules: context-reflective data quality problem solving. J. Manage. Inf. Syst. 20(3), 93–119 (2003)

  29. Ryu, K.S., Park, J.S., Park, J.H.: A data quality management maturity model. ETRI J. 28(2), 191–204 (2006)

  30. Kornai, A.: The algebra of lexical semantics. In: Mathematics of Language, pp. 174–199. Springer, Heidelberg (2010)

  31. Mooney, R.J.: Learning for semantic parsing. In: Gelbukh, A. (ed.) CICLing 2007. LNCS, vol. 4394, pp. 311–324. Springer, Heidelberg (2007). doi:10.1007/978-3-540-70939-8_28

Acknowledgements

Partial results of this work were presented earlier in [9]. Tables, figures and equations have their origin in this paper, unless stated otherwise.

Author information

Corresponding author

Correspondence to Jan van Dijk.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

van Dijk, J., Bargh, M.S., Choenni, S., Spruit, M. (2017). Maturing Pay-as-you-go Data Quality Management: Towards Decision Support for Paying the Larger Bills. In: Francalanci, C., Helfert, M. (eds) Data Management Technologies and Applications. DATA 2016. Communications in Computer and Information Science, vol 737. Springer, Cham. https://doi.org/10.1007/978-3-319-62911-7_6

  • DOI: https://doi.org/10.1007/978-3-319-62911-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62910-0

  • Online ISBN: 978-3-319-62911-7

  • eBook Packages: Computer Science, Computer Science (R0)
