Data Mining in Medicine

Machine Learning for Data Science Handbook

Abstract

Clinical databases collect large volumes of information. Relationships and patterns within these data could provide new medical knowledge. Data mining, whose major objective is the discovery of knowledge from large amounts of data, offers many possibilities for identifying data features that are less visible or hidden to common analysis techniques. This chapter focuses on a selection of techniques and illustrates their applicability to medical diagnostic and prognostic problems.

Correspondence to Carlo Combi.

© 2023 Springer Nature Switzerland AG

Amico, B., Combi, C., Shahar, Y. (2023). Data Mining in Medicine. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_27
