Maturing Pay-as-you-go Data Quality Management: Towards Decision Support for Paying the Larger Bills

  • Jan van Dijk
  • Mortaza S. Bargh
  • Sunil Choenni
  • Marco Spruit
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 737)


Data quality management is a major challenge today because of the growing proliferation of abundant, heterogeneous datasets. Every organization that builds and maintains data-intensive applications has to deal with data quality related problems on a daily basis. In these organizations, data quality related problems are registered in natural language, and the organizations then rely on ad hoc, non-systematic, and expensive solutions to categorize and resolve the registered problems. In this contribution we present a formal description of an innovative data quality resolving architecture that semantically and dynamically maps the descriptions of data quality related problems to data quality attributes. This mapping reduces complexity, because the dimensionality of data quality attributes is far smaller than that of the natural language space, and it enables data analysts to directly use the methods and tools proposed in the literature. Another challenge in data quality management is choosing appropriate solutions for data quality problems, because insight into the long-term or broader effects of candidate solutions is lacking. This difficulty becomes particularly prominent in flexible architectures where loosely linked data are integrated (e.g., in data spaces or open data settings). We also present a decision support framework for the solution selection process that evaluates the cost-benefit values of candidate solutions. The paper reports on a proof-of-concept tool for the proposed architecture and on its evaluation.


Data quality issues; Data quality management; Knowledge mapping; User generated inputs; Solution management



Partial results of this work were presented earlier in [9]. Tables, figures and equations have their origin in this paper, unless stated otherwise.


  1. Choenni, S., Leertouwer, E.: Public safety mashups to support policy makers. In: Andersen, K.N., Francesconi, E., Grönlund, Å., Engers, T.M. (eds.) EGOVIS 2010. LNCS, vol. 6267, pp. 234–248. Springer, Heidelberg (2010). doi: 10.1007/978-3-642-15172-9_22
  2. Netten, N., van den Braak, S., Choenni, S., Leertouwer, E.: Elapsed times in criminal justice systems. In: Proceedings of the 8th International Conference on Theory and Practice of Electronic Governance (ICEGOV), pp. 99–108. ACM (2014)
  3. van Dijk, J., Kalidien, S., Choenni, S.: Smart monitoring of the criminal justice system. Government Information Quarterly. Elsevier (2016). doi: 10.1016/j.giq.2015.11.005
  4. Christoulakis, M., Spruit, M., van Dijk, J.: Data quality management in the public domain: a case study within the Dutch justice system. Int. J. Inf. Qual. 4(1), 1–17 (2015)
  5. Birman, K.P.: Consistency in distributed systems. In: Guide to Reliable Distributed Systems, pp. 457–470. Springer, Heidelberg (2012)
  6. Davenport, T.H., Glaser, J.: Just-in-time delivery comes to knowledge management. Harvard Bus. Rev. 80(7), 107–111 (2002)
  7. Bargh, M.S., van Dijk, J., Choenni, S.: Dynamic data quality management using issue tracking systems. IADIS Int. J. Comput. Sci. Inf. Syst. 10(2), 32–51 (2015). Isaias, P., Paprzycki, M. (eds.)
  8. Bargh, M.S., Mbgong, F., van Dijk, J., Choenni, S.: A framework for dynamic data quality management. In: Proceedings of the IADIS International Conference Information Systems Post-Implementation and Change Management, pp. 134–142 (2015)
  9. Bargh, M., van Dijk, J., Choenni, S.: Management of data quality related problems - exploiting operational knowledge. In: Proceedings of the 5th International Conference on Data Management Technologies and Applications (DATA), pp. 31–42. SciTePress (2016)
  10. Batini, C., Cappiello, C., Francalanci, C., Maurino, A.: Methodologies for data quality assessment and improvement. ACM Comput. Surv. 41(3), 16–52 (2009). Article no. 16
  11. Wand, Y., Wang, R.Y.: Anchoring data quality dimensions in ontological foundations. Commun. ACM 39(11), 86–95 (1996). ACM
  12. Davoudi, S., Dooling, J.A., Glondys, B., Jones, T.D., Kadlec, L., Overgaard, S.M., Ruben, K., Wendicke, A.: Data quality management model (2015 Update). J. AHIMA 86(10), 62–65 (2015). Expanded web version
  13. Knowledgent: White Paper Series: Building a Successful Data Quality Management Program. Accessed 31 Oct 2015
  14. Halevy, A., Rajaraman, A., Ordille, J.: Data integration: the teenage years. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 9–16. VLDB Endowment (2006)
  15. Wang, R.Y., Strong, D.M.: Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst. 12(4), 5–33 (1996)
  16. Price, R., Shanks, G.: A semiotic information quality framework. In: Proceedings of International Conference on Decision Support Systems (DSS), pp. 658–672 (2004)
  17. Woodall, P., Borek, A., Parlikad, A.K.: Data quality assessment: the hybrid approach. Inf. Manage. 50, 369–382 (2013)
  18. Bargh, M.S., Choenni, S., Meijer, R.: Privacy and information sharing in a judicial setting: a wicked problem. In: Proceedings of the 16th Annual International Conference on Digital Government Research, pp. 97–106. ACM, New York (2015)
  19. Jiang, L., Barone, D., Borgida, A., Mylopoulos, J.: Measuring and comparing effectiveness of data quality techniques. In: Eck, P., Gordijn, J., Wieringa, R. (eds.) CAiSE 2009. LNCS, vol. 5565, pp. 171–185. Springer, Heidelberg (2009). doi: 10.1007/978-3-642-02144-2_17
  20. Bugzilla Website. Accessed 31 Oct 2015
  21. JIRA Software Website. Accessed 31 Oct 2015
  22. H2desk Website. Accessed 31 Oct 2015
  23. TOPdesk Website. Accessed 31 Oct 2015
  24. Canovas Izquierdo, J.L., Cosentino, V., Rolandi, B., Bergel, A., Cabot, J.: GiLA: GitHub label analyzer. In: IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering, pp. 479–483. Montreal, Canada (2015)
  25. Environmental Protection Agency: Data quality assessment: a reviewer’s guide. Technical report EPA/240/B-06/002, EPA QA/G-9R (2006)
  26. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM 45(4), 211–218 (2002). ACM
  27. Eppler, M.J., Wittig, D.: Conceptualizing information quality: a review of information quality frameworks from the last ten years. In: Proceedings of the Conference on Info Quality, pp. 83–96 (2000)
  28. Lee, Y.: Crafting rules: context-reflective data quality problem solving. J. Manage. Inf. Syst. 20(3), 93–119 (2003)
  29. Ryu, K.S., Park, J.S., Park, J.H.: A data quality management maturity model. ETRI J. 28(2), 191–204 (2006)
  30. Kornai, A.: The algebra of lexical semantics. In: Mathematics of Language, pp. 174–199. Springer, Heidelberg (2010)
  31. Mooney, R.J.: Learning for semantic parsing. In: Gelbukh, A. (ed.) CICLing 2007. LNCS, vol. 4394, pp. 311–324. Springer, Heidelberg (2007). doi: 10.1007/978-3-540-70939-8_28

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jan van Dijk (1)
  • Mortaza S. Bargh (1)
  • Sunil Choenni (1, 2)
  • Marco Spruit (3)

  1. Research and Documentation Centre, Ministry of Security and Justice, The Hague, The Netherlands
  2. Research Centre Creating 010, Rotterdam University of Technology, Rotterdam, The Netherlands
  3. Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands
