A Method for Collecting Provenance Data: A Case Study in a Brazilian Hemotherapy Center

  • Conference paper
Data and Information in Online Environments (DIONE 2020)

Abstract

Data provenance aims to provide an overview of the origin of the data used by information systems, in particular by identifying the data sources and the transformations the data has undergone over time. This paper proposes a data collection method based on the W3C PROV Data Model (PROV-DM), intended for use in Brazilian hemotherapy centers. Its overall purpose is to store data on anemia indices together with their provenance. The work draws on concepts from data provenance, knowledge provenance, and scientific workflow techniques. It is exploratory research of a practical and deductive nature, applied through a case study. Actual data was extracted from reports generated by a Brazilian hemotherapy center, covering the years 2000 to 2018. Candidates deemed unfit for blood donation because their anemia indices warranted rejection were quantified and analyzed. In total, 197,551 blood donor candidates who attended the hemotherapy center over these 19 years were analyzed. This made it possible to quantify the unfit candidates with the highest anemia indices: 1,011 male and 4,039 female candidates, corresponding to 4.02% and 16.09%, respectively, of the donors unfit for blood donation.
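
As a rough illustration of the kind of record a PROV-DM based collection method might produce, the sketch below models the three core PROV-DM types (entity, activity, agent) and a few of its standard relations (used, wasGeneratedBy, wasAssociatedWith, wasDerivedFrom) using only the Python standard library. It is not the method described in the paper: the `ProvDocument` class, the `ex:` identifiers, and the attribute names are all hypothetical and chosen only to show how a derived anemia index table could be traced back to a source report.

```python
from dataclasses import dataclass, field


@dataclass
class ProvDocument:
    """Tiny in-memory store for PROV-DM style records (illustrative only)."""
    entities: dict = field(default_factory=dict)
    activities: dict = field(default_factory=dict)
    agents: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def entity(self, ident, **attrs):
        self.entities[ident] = attrs
        return ident

    def activity(self, ident, **attrs):
        self.activities[ident] = attrs
        return ident

    def agent(self, ident, **attrs):
        self.agents[ident] = attrs
        return ident

    def relate(self, kind, subject, obj):
        # Store each relation as a (kind, subject, object) triple.
        self.relations.append((kind, subject, obj))

    def sources_of(self, ident):
        """Follow wasDerivedFrom edges back to every upstream entity."""
        found, stack = [], [ident]
        while stack:
            current = stack.pop()
            for kind, subject, obj in self.relations:
                if kind == "wasDerivedFrom" and subject == current and obj not in found:
                    found.append(obj)
                    stack.append(obj)
        return found


# Hypothetical identifiers; none of these names come from the paper.
doc = ProvDocument()
report = doc.entity("ex:annual_report", kind="report")
table = doc.entity("ex:anemia_index_table", kind="dataset")
extract = doc.activity("ex:extract_anemia_indices")
analyst = doc.agent("ex:analyst")

doc.relate("used", extract, report)                # the activity read the report
doc.relate("wasGeneratedBy", table, extract)       # the table came out of the activity
doc.relate("wasAssociatedWith", extract, analyst)  # who was responsible for it
doc.relate("wasDerivedFrom", table, report)        # entity-to-entity lineage

print(doc.sources_of(table))  # ['ex:annual_report']
```

Keeping relations as plain triples keeps the lineage query to a simple graph walk; a real deployment would more likely serialize to one of the W3C PROV formats rather than use an ad hoc structure like this one.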

Author information

Corresponding author

Correspondence to Márcio José Sembay.

Copyright information

© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Sembay, M.J., de Macedo, D.D.J., Lima Dutra, M. (2020). A Method for Collecting Provenance Data: A Case Study in a Brazilian Hemotherapy Center. In: Mugnaini, R. (eds) Data and Information in Online Environments. DIONE 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 319. Springer, Cham. https://doi.org/10.1007/978-3-030-50072-6_8

  • DOI: https://doi.org/10.1007/978-3-030-50072-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-50071-9

  • Online ISBN: 978-3-030-50072-6

  • eBook Packages: Computer Science (R0)
