The Identification and Selection of Good Quality Data Using Pedigree Matrix

  • Conference paper
Sustainable Design and Manufacturing 2020

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 200)

Abstract

Most data-based studies need large amounts of data to support decision-making. Beyond increasing data quantity, scientists tend to be aware that data quality influences the robustness of the results. This paper presents a Pedigree matrix method that characterizes data quality aspects and quantifies a quality rating. Five quality aspects (reliability, completeness, and temporal, geographical and technological representativeness) describe how well the reference data fits the underlying study. Reference rules for allocating the quality rating are defined subjectively; they enable a computer to select appropriate data effectively from among different data sources. An overall data quality rating is calculated to reflect the quality level and is converted to a four-parameter Beta probability distribution for uncertainty quantification. This is complemented by Monte Carlo simulation, which identifies uncertainty hotspots so that the quality of the identified data can be improved further. The study provides an effective way to identify good-quality data through the definition of reference rules. Such rules help users capture the descriptive information about data quality and assess quality levels consistently. The four-parameter Beta distribution is used for the quantitative transformation because it is appropriate for representing expert judgement; the distribution parameters can therefore be defined flexibly according to the expert's understanding of uncertainty. This flexibility extends the method's applicability to different data systems. Further research can focus on developing reference rules for different quality aspects, as well as on integrating the Pedigree matrix into various data systems.
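
The workflow summarized in the abstract (rate five aspects, select the best-rated source, convert the rating to a four-parameter Beta distribution, then run Monte Carlo sampling to find uncertainty hotspots) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 1 to 5 scoring scale (1 = best), the plain mean used for the overall rating, the linear rating-to-range mapping, the fixed shape parameters and the purely additive model are all assumptions made for illustration, and every function name is hypothetical.

```python
import random

ASPECTS = ("reliability", "completeness", "temporal",
           "geographical", "technological")

def overall_rating(scores):
    """Average the five aspect scores (assumed scale: 1 = best, 5 = worst).
    Aggregating by a plain mean is an assumption, not the paper's rule."""
    for aspect in ASPECTS:
        if not 1 <= scores[aspect] <= 5:
            raise ValueError(f"{aspect} score must be between 1 and 5")
    return sum(scores[a] for a in ASPECTS) / len(ASPECTS)

def select_best(sources):
    """Return the name of the source with the lowest (best) overall rating."""
    return min(sources, key=lambda name: overall_rating(sources[name]))

def beta_parameters(value, rating):
    """Hypothetical mapping from a rating to (alpha, beta, lower, upper).
    The half-width grows linearly from 5% of the value at rating 1 to 45%
    at rating 5; alpha = beta = 2 stands in for expert-chosen shapes."""
    half_width = abs(value) * (0.05 + 0.10 * (rating - 1))
    return 2.0, 2.0, value - half_width, value + half_width

def sample_beta4(rng, alpha, beta, lower, upper):
    """Draw from a four-parameter Beta by rescaling a standard Beta draw."""
    return lower + (upper - lower) * rng.betavariate(alpha, beta)

def hotspot_shares(inputs, n=5000, seed=0):
    """Monte Carlo for an additive model (output = sum of the inputs).
    `inputs` maps a name to (value, overall rating). Returns each input's
    share of the output variance; for an additive model the shares sum to 1."""
    rng = random.Random(seed)
    draws = {
        name: [sample_beta4(rng, *beta_parameters(value, rating))
               for _ in range(n)]
        for name, (value, rating) in inputs.items()
    }
    totals = [sum(draws[name][i] for name in draws) for i in range(n)]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    shares = {}
    for name, xs in draws.items():
        mean_x = sum(xs) / n
        cov = sum((x - mean_x) * (t - mean_total)
                  for x, t in zip(xs, totals)) / n
        shares[name] = cov / var_total
    return shares
```

In this sketch, an input with a poor rating receives a wider Beta range and therefore a larger variance share, which marks it as a hotspot whose data would be worth improving first. In practice the bounds and shape parameters would come from expert judgement rather than from the placeholder mapping used here.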

Acknowledgements

This work was conducted as part of the PLEIADES project, which received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement No. 603843.

Author information

Corresponding author

Correspondence to Xiaobo Chen.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, X., Lee, J. (2021). The Identification and Selection of Good Quality Data Using Pedigree Matrix. In: Scholz, S.G., Howlett, R.J., Setchi, R. (eds) Sustainable Design and Manufacturing 2020. Smart Innovation, Systems and Technologies, vol 200. Springer, Singapore. https://doi.org/10.1007/978-981-15-8131-1_2

  • DOI: https://doi.org/10.1007/978-981-15-8131-1_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-8130-4

  • Online ISBN: 978-981-15-8131-1

  • eBook Packages: Engineering, Engineering (R0)