A new approach to distinguish migraine from stroke by mining structured and unstructured clinical data-sources

  • Elham Sedghi (email author)
  • Jens H Weber
  • Alex Thomo
  • Maximilian Bibok
  • Andrew M. W. Penn
Original Article


Distinguishing migraine from stroke is challenging because the two conditions share many signs and symptoms. Hospitalization is costly, and neurologists and stroke nurses spend considerable time seeing, diagnosing, and assigning appropriate care to patients; new ways to distinguish stroke, migraine, and other types of mimics can therefore save time and cost and improve decision-making. In this study, we used text and data mining methods to extract the most important predictors from clinical reports in order to build a migraine detection model that distinguishes migraine patients from stroke or other types of mimic (non-stroke) cases. The available data were a heterogeneous mix of free-text fields, such as triage main complaints and specialist final impressions, and numeric patient data, such as age and blood pressure. After carefully combining these sources, we obtained a highly imbalanced dataset in which migraine cases made up only about 6% of the records. Our main challenge was tackling this imbalance: classifiers built on the dataset in its original form were biased towards the majority class and against the minority (migraine) class. We used a sampling method to address the problem. First, the different data sources were preprocessed and balanced datasets were generated; second, attribute selection algorithms were used to reduce the dimensionality of the data; third, a novel combination of data mining algorithms was employed to distinguish migraine from other cases. We achieved a sensitivity of about 80% and a specificity of about 75%, compared with a sensitivity of 15.7% and a specificity of 97% when classifiers were built on the original imbalanced data.
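As a minimal sketch of the balancing step and of the two reported metrics, the following assumes random undersampling of the majority class. The abstract says only that "a sampling method" was used, so the technique, the function names (`undersample`, `sensitivity_specificity`), and the toy data are illustrative assumptions, not the authors' implementation.

```python
import random


def undersample(records, labels, minority_label, seed=0):
    """Randomly drop majority-class records until both classes have equal size.

    Assumption: random undersampling stands in for the unspecified
    sampling method; the majority class must be at least as large as
    the minority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [records[i] for i in kept], [labels[i] for i in kept]


def sensitivity_specificity(y_true, y_pred, positive_label):
    """Return (sensitivity, specificity), treating positive_label as positive.

    Assumes y_true contains at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive_label and p == positive_label for t, p in pairs)
    fn = sum(t == positive_label and p != positive_label for t, p in pairs)
    tn = sum(t != positive_label and p != positive_label for t, p in pairs)
    fp = sum(t != positive_label and p == positive_label for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 6 migraine cases among 100 records, mirroring the ~6% minority share.
labels = ["migraine"] * 6 + ["other"] * 94
records = list(range(100))  # placeholder for real feature rows

balanced_records, balanced_labels = undersample(records, labels, "migraine")
print(balanced_labels.count("migraine"), balanced_labels.count("other"))  # 6 6

# A model that always predicts the majority class has perfect specificity
# but misses every migraine case.
always_other = ["other"] * len(labels)
print(sensitivity_specificity(labels, always_other, "migraine"))  # (0.0, 1.0)
```

The last two lines illustrate the bias described above in its extreme form: on imbalanced data, a classifier can score very high specificity while its sensitivity for the minority class collapses.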


Keywords (machine-generated, not supplied by the authors): Migraine; Minority class; Data mining algorithm; Imbalanced data; Data mining method



The authors would like to acknowledge Kristine Votova, PhD, project manager for the SpecTRA Research Project, and the Island Health clinical research team at the Stroke Rapid Assessment Unit for their support. Funding for the natural experiment in stroke care and the large-scale personalized medicine for mass spectrometry in rapid TIA triage comes from the Canadian Institutes of Health Research (2009–2012) and Genome Canada/BC (2013–2017).



Copyright information

© Springer-Verlag Wien 2016

Authors and Affiliations

  • Elham Sedghi (1), email author
  • Jens H Weber (1)
  • Alex Thomo (1)
  • Maximilian Bibok (2)
  • Andrew M. W. Penn (2)

  1. Department of Computer Science, University of Victoria, Victoria, Canada
  2. SpecTRA Research Project, Vancouver Island Health Authority, Victoria, Canada
