Clinical data analysis based on iterative subgroup discovery: experiments in brain ischaemia data analysis

Applied Intelligence

Abstract

This paper presents a case study of insightful analysis of clinical data collected in regular hospital practice. The approach is applied to a database describing patients suffering from brain ischaemia, either permanent, as brain stroke with a positive computed tomography (CT) finding, or reversible, with a normal brain CT. The goal of the analysis is to extract useful knowledge that can help in the diagnosis, prevention and better understanding of vascular brain disease. The paper demonstrates the applicability of subgroup discovery to such analysis and describes the expert’s process of converting the induced rules into useful medical knowledge. Important results of the analysis include the detection of coexisting risk factors, the selection of relevant discriminative points for numerical descriptors, and the detection and description of characteristic patient subpopulations. Graphical representation is used extensively to illustrate the dependencies detected in the available clinical data.

Author information

Corresponding author

Correspondence to Dragan Gamberger.

Additional information

This work was supported by the Croatian Ministry of Science, Education and Sport project “Machine Learning Algorithms and Applications”, the Slovenian Ministry of Higher Education, Science and Technology project “Knowledge Technologies”, and the EU FP6 project “Heartfaid: A knowledge based platform of services for supporting medical–clinical management of the heart failure within the elderly population”.

About this article

Cite this article

Gamberger, D., Lavrač, N., Krstačić, A. et al. Clinical data analysis based on iterative subgroup discovery: experiments in brain ischaemia data analysis. Appl Intell 27, 205–217 (2007). https://doi.org/10.1007/s10489-007-0068-9

  • DOI: https://doi.org/10.1007/s10489-007-0068-9
