Abstract
Retrospective data mining has tremendous potential in research but is time- and labor-intensive. Current data mining software offers many advanced search features but is limited in its ability to identify patients who meet multiple complex, independent search criteria. Simple keyword and Boolean search techniques are ineffective when more complex searches are required or when a search must match multiple mutually inclusive variables. This is particularly true when trying to identify patients with a specific set of radiologic findings, or with findings in close temporal proximity, across multiple imaging modalities. A further challenge in retrospective data mining is that there is still considerable variation in how imaging findings are described in radiology reports. We present an algorithmic approach to this problem and describe a specific use case in which we applied our technique to a real-world data set to identify patients who matched several independent variables in our institution’s picture archiving and communication system (PACS) database.
Acknowledgments
We would like to thank the leadership within the Department of Diagnostic Radiology at Henry Ford Hospital for their support, as well as Dr Donald Peck and his team within the Department of Physics. We would also like to thank Dr Joseph Craig and Dr Courtney Scher within the Department of Musculoskeletal Radiology at Henry Ford Hospital for their contribution to the research project titled “Retrospective Review of Bone Marrow Signal Alterations Involving the Subchondral Bone Plate on Magnetic Resonance Imaging after Acute Knee Trauma,” which served as the use case scenario for this project. Finally, we would like to thank Taryn Simon and Ward Detwiler at the Henry Ford Innovation Institute for their continued advisement of the Datafish Project during its ongoing research and development.
Cite this article
Kelley, B.P., Klochko, C., Halabi, S. et al. Datafish Multiphase Data Mining Technique to Match Multiple Mutually Inclusive Independent Variables in Large PACS Databases. J Digit Imaging 29, 331–336 (2016). https://doi.org/10.1007/s10278-015-9817-1
DOI: https://doi.org/10.1007/s10278-015-9817-1