A Quality Management Workflow Proposal for a Biodiversity Data Repository

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8823))

Abstract

The importance of quality-assured data in scientific analysis necessitates the inclusion of data quality management (DQM) functionality in research data repositories, in addition to their primary roles of data storage, sharing, and integration. Typically, the DQM workflow in data repositories is fixed and semi-automated for datasets whose structure and semantics are known a priori; for other types of datasets, DQM is either manual or minimal. In comparison, classical DQM methodology (especially in data warehousing research) has established standard, typically manually undertaken, DQM procedures for different types of data. Our proposal therefore aims at customizing and semi-automating the classical DQM procedures for biodiversity data repositories. Rather than reviewing the scientific content of the data, we focus on technical data quality. Our proposed workflow includes DQM criteria specification, client- and server-side validation, data profiling, error detection analysis, data enhancement and correction, and quality monitoring.
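To make three of the workflow stages named in the abstract more concrete (criteria specification, validation, and data profiling with a simple error detection step), the following is a minimal Python sketch. It is not the authors' implementation: the `CRITERIA` layout, the field names, the function names, and the choice of an interquartile-range rule for outlier detection are all assumptions made purely for illustration.

```python
from statistics import quantiles

# Hypothetical criteria specification: field name -> expected type and bounds.
# The fields and limits are illustrative assumptions, not taken from the paper.
CRITERIA = {
    "species": {"type": str, "required": True},
    "count": {"type": int, "required": True, "min": 0},
    "latitude": {"type": float, "required": False, "min": -90.0, "max": 90.0},
}


def validate(record, criteria=CRITERIA):
    """Return a list of human-readable violations for one record."""
    problems = []
    for field, rule in criteria.items():
        value = record.get(field)
        if value is None:
            if rule.get("required"):
                problems.append(f"{field}: missing required value")
            continue
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{field}: above maximum {rule['max']}")
    return problems


def profile(records, field):
    """Basic column profile: row count, missing values, distinct values."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Made-up example records.
records = [
    {"species": "Fagus sylvatica", "count": 12, "latitude": 51.1},
    {"species": "Picea abies", "count": -3, "latitude": 95.0},
    {"species": None, "count": 7},
]
for rec in records:
    print(validate(rec))
print(profile(records, "species"))
print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

In a repository setting, the same criteria table could in principle drive both client-side and server-side checks, so that the two validation steps stay consistent; whether the proposed workflow does this is not stated in the abstract.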




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Owonibi, M., Koenig-Ries, B. (2014). A Quality Management Workflow Proposal for a Biodiversity Data Repository. In: Indulska, M., Purao, S. (eds) Advances in Conceptual Modeling. ER 2014. Lecture Notes in Computer Science, vol 8823. Springer, Cham. https://doi.org/10.1007/978-3-319-12256-4_17


  • DOI: https://doi.org/10.1007/978-3-319-12256-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12255-7

  • Online ISBN: 978-3-319-12256-4

  • eBook Packages: Computer Science, Computer Science (R0)
