Abstract
With the increasing collection of medical data in digital format, the use and reuse of this data is also increasing. This introduces new challenges in the selection, de-identification, storage, and handling of imaging data. When building large data collections for training and validating machine learning models, merely collecting a lot of data is not enough. The quality of the data must be sufficient for the intended application in order to obtain valid results. This chapter discusses data quality by examining the curation of medical images and related data, and the different aspects involved in this process as the field moves into the era of AI.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
van Ooijen, P.M.A. (2019). Quality and Curation of Medical Images and Data. In: Ranschaert, E., Morozov, S., Algra, P. (eds) Artificial Intelligence in Medical Imaging. Springer, Cham. https://doi.org/10.1007/978-3-319-94878-2_17
DOI: https://doi.org/10.1007/978-3-319-94878-2_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94877-5
Online ISBN: 978-3-319-94878-2
eBook Packages: Medicine, Medicine (R0)