Application of density estimation algorithms in analyzing co-morbidities of migraine

In this study, we propose a density estimation-based data analysis procedure to investigate the co-morbid associations between migraine and suspected diseases. The primary objective is to develop a novel analysis procedure that can discover insightful knowledge from large medical databases. The procedure consists of two stages. In the first stage, a kernel density estimation algorithm named relaxed variable kernel density estimation (RVKDE) is invoked to identify the samples of interest. In the second stage, a density estimation algorithm based on generalized Gaussian components, named G2DE, is invoked to provide a summarized description of the distribution. Applying the proposed two-stage procedure to the co-morbidities of migraine showed that it could effectively identify a number of clusters of samples with distinctive characteristics, and that these characteristics were consistent with observations reported in recently published articles. Accordingly, the proposed analysis procedure may provide valuable clues to pathogenesis and facilitate the development of proper treatment strategies. Electronic supplementary material: The online version of this article (doi:10.1007/s13721-013-0028-8) contains supplementary material, which is available to authorized users.


Introduction
In recent years, data analysis based on large medical and clinical databases has gained attention among biomedical researchers (Himes et al. 2009; Lai et al. 2010; Lugardon et al. 2007). One major merit of this type of study is that these databases collect cases with good demographic diversity. In addition, researchers can verify their hypotheses quickly because they do not need to spend significant effort recruiting cases. Nevertheless, most studies have been conducted with conventional bio-statistical approaches. Accordingly, scientists have turned to advanced machine learning and/or data mining approaches to extract valuable clues hidden in large medical and clinical databases (Himes et al. 2009; Lancashire et al. 2005; Li et al. 2004; Niederkohr and Levin 2005). For example, the Bayesian network has been used to identify the co-morbidity between chronic obstructive pulmonary disease and asthma (Himes et al. 2009). Furthermore, the decision tree algorithm has been used to guide diagnostic interpretation and therapeutic options for temporal arteritis (Niederkohr and Levin 2005).
In our study, we aim to exploit density estimation algorithms in the analysis of large medical/clinical databases. Density estimation is a classical problem in statistics that aims to construct an approximate probability density function from samples randomly and independently taken from an underlying distribution. In the proposed approach, we exploit the relaxed variable kernel density estimation (RVKDE) algorithm (Oyang et al. 2005) and the generalized Gaussian component based density estimation (G2DE) algorithm (Hsieh et al. 2009), both developed by our research team in recent years. The RVKDE algorithm is used to identify those case samples that share some distinctive features in comparison with the control samples. Then, the G2DE algorithm is invoked to provide a summarized and highly interpretable description of the underlying distribution.
To assess the actual effects of the proposed analysis procedure, we applied it to the co-morbidities of migraine. Migraine is a prevalent neurological disorder in which patients suffer from recurrent headache attacks, nausea, photophobia, and phonophobia. Recent demographic studies showed that migraine is more common in women than in men and that its burden has been underestimated. Many illnesses, physical or psychiatric, have been reported to be co-morbid with migraine (Aamodt et al. 2007; Bigal et al. 2010; Buse et al. 2010; Hagen et al. 2002; Kurth et al. 2008; Le et al. 2011); these disorders occur at a greater coincidental rate among migraine patients than among the general population. Understanding the association of migraine with other health conditions can help clinicians provide better care and investigate the pathogenesis of these disorders.

Density estimation algorithms
In this section, we describe the main features of the RVKDE and G2DE algorithms exploited in the proposed analysis procedure and the effects they are intended to achieve. The RVKDE algorithm was designed to construct an approximate probability density function with high accuracy, whereas the G2DE algorithm was designed to provide a summarized and highly interpretable description of the underlying distribution.
Let $\{s_1, s_2, \ldots, s_n\}$ be a set of samples randomly and independently taken from the distribution governed by probability density function $f$ in a $d$-dimensional vector space. Then, the RVKDE algorithm constructs an approximate probability density function $\hat{f}$ based on the following general form:
$$\hat{f}(v) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{\sqrt{2\pi}\,\sigma_i}\right)^{d}\exp\left(-\frac{\lVert v - s_i\rVert^{2}}{2\sigma_i^{2}}\right),$$
where $\sigma_i = \beta\,\dfrac{R(s_i)\sqrt{\pi}}{\sqrt[d]{(k+1)\,\Gamma(d/2+1)}}$; $R(s_i)$ is the maximum distance between $s_i$ and its $k$ nearest training instances; $\Gamma(\cdot)$ is the gamma function (Artin 1964); $\beta$ and $k$ are parameters to be set either through cross validation or by the user. The general form of the RVKDE algorithm indicates that, for each sample, a Gaussian function is placed at its corresponding coordinates in the vector space. Accordingly, the approximate function constructed by the RVKDE algorithm is composed of a large number of Gaussian functions, and it is difficult for a user to gain an abstract image of the underlying distribution in a multi-dimensional vector space. Therefore, our research team designed the G2DE algorithm to provide the complementary feature. The approximate function constructed by the G2DE algorithm is composed of a limited number of generalized Gaussian components, as shown in the following:
$$\hat{f}(v) = \sum_{i=1}^{k} w_i\,\frac{1}{\sqrt{(2\pi)^{d}\,\lvert\Sigma_i\rvert}}\exp\left(-\frac{1}{2}(v-\mu_i)^{T}\,\Sigma_i^{-1}\,(v-\mu_i)\right),$$
where $k$ here denotes the number of Gaussian components (not the neighbor count used by RVKDE), $d$ is the dimension of the vector space, and $w_i$, $\mu_i$, and $\Sigma_i$ are the weight, center, and covariance matrix of the $i$-th Gaussian component, respectively.
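The RVKDE general form can be illustrated with a small standard-library sketch. It assumes the form published by Oyang et al. (2005): one spherical Gaussian per sample, with bandwidth sigma_i = beta * R(s_i) * sqrt(pi) / ((k + 1) * Gamma(d/2 + 1))^(1/d). The function names, the toy points, and the values of k and beta are hypothetical and are not taken from the study.

```python
import math

def rvkde_bandwidths(samples, k, beta):
    """Per-sample bandwidth sigma_i (assumed RVKDE form, see lead-in)."""
    d = len(samples[0])
    denom = ((k + 1) * math.gamma(d / 2 + 1)) ** (1 / d)
    sigmas = []
    for i, s in enumerate(samples):
        # Distances from s to every other sample, in ascending order.
        dists = sorted(math.dist(s, t) for j, t in enumerate(samples) if j != i)
        r = dists[k - 1]  # R(s_i): distance to the k-th nearest neighbour
        # Note: duplicated samples can give r == 0 and need special handling.
        sigmas.append(beta * r * math.sqrt(math.pi) / denom)
    return sigmas

def rvkde_density(v, samples, sigmas):
    """Average of one spherical Gaussian per sample, evaluated at v."""
    d = len(v)
    total = 0.0
    for s, sigma in zip(samples, sigmas):
        norm = (1.0 / (math.sqrt(2 * math.pi) * sigma)) ** d
        total += norm * math.exp(-math.dist(v, s) ** 2 / (2 * sigma ** 2))
    return total / len(samples)

# Hypothetical toy data: four points close together and one far away.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (1.0, 1.0)]
sigmas = rvkde_bandwidths(samples, k=2, beta=1.0)
near = rvkde_density((0.05, 0.05), samples, sigmas)
far = rvkde_density((3.0, 3.0), samples, sigmas)
```

Because the bandwidth follows the local neighbour distance, samples in dense regions receive narrow kernels and isolated samples receive wide ones, which is why the estimate consists of as many Gaussians as there are samples.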
Since each Gaussian component in a G2DE based probability model corresponds to a cluster of samples, we can examine the centers and the covariance matrices of the Gaussian components to obtain an abstract image of the underlying distribution. Nevertheless, it must be noted that the number of parameters in a G2DE based probability model is equal to $k(d+2)(d+1)/2$. As a result, if we do not set k and d to small integers, we need to examine a large number of parameter values, and it may be difficult to interpret their physical meanings.
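The parameter count can be checked by adding up the pieces of one component: one weight, d center coordinates, and d(d+1)/2 free entries of a symmetric covariance matrix. The sketch below is illustrative only; the function name is hypothetical, and the example values (two components, five or ten features) are chosen to mirror the feature counts discussed in this article.

```python
def g2de_parameter_count(k: int, d: int) -> int:
    """Number of free parameters of k generalized Gaussian components in d dimensions."""
    # 1 weight + d center coordinates + d(d+1)/2 entries of a symmetric covariance matrix
    per_component = 1 + d + d * (d + 1) // 2
    return k * per_component

small = g2de_parameter_count(k=2, d=5)    # 2 * 21 = 42
large = g2de_parameter_count(k=2, d=10)   # 2 * 66 = 132
```

Doubling the number of features roughly triples the number of values to inspect, which is the motivation for reducing the dimension before the second stage.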

The clinical database
The study reported in this article was conducted based on the Research Database released by the National Health Insurance Program in Taiwan. The National Health Insurance (NHI) program in Taiwan was launched in 1995 and, as of December 2010, covered about 23,074,000 insurants, accounting for over 99 % of the entire population of Taiwan. In addition, almost all hospitals and clinics in Taiwan have joined the program. As of December 2010, there were 25,031 medical institutions enrolled in the program. In 2000, the Bureau of the program began to release the National Health Insurance Research Database (NHIRD) to facilitate medical research. The updated version used in this study contains the ambulatory and hospitalization claims records of 1,000,000 randomly selected insurants over the period from 1996 to 2010, without significant difference in age, sex, and insurance cost relative to the whole population.

Case patient definition and control selection
The cases in this study include those patients who were diagnosed with migraine in outpatient and/or inpatient records during 2004-2008. The ICD-9 CM codes (International Classification of Diseases, 9th Revision, Clinical Modification; http://icd9cm.chrisendres.com/) used for screening include 346.09, 346.19, 346.89, and 346.99, which correspond to patients with migraine with or without aura. For each migraine case, five controls without any migraine record during 1996-2010 and with matched gender and age were randomly selected from the NHIRD. As a result, the cohort contained 19,356 migraine cases and 96,780 controls. For each case, the date of the first migraine diagnosis was defined as the index date, and the same index date was assigned to the matched controls.
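The 1:5 matching described above can be sketched as follows. This is a minimal illustration, not the procedure actually run on the NHIRD: it assumes exact matching on sex and age and sampling without replacement, neither of which is specified in the text, and all record fields and identifiers are hypothetical.

```python
import random
from collections import defaultdict

def select_matched_controls(cases, candidates, n_controls=5, seed=0):
    """Pick n_controls candidates with the same sex and age for every case."""
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in candidates:
        pool[(person["sex"], person["age"])].append(person["id"])
    for ids in pool.values():
        rng.shuffle(ids)  # random order within each sex/age stratum

    matched = {}
    for case in cases:
        ids = pool[(case["sex"], case["age"])]
        if len(ids) < n_controls:
            raise ValueError(f"not enough controls for case {case['id']}")
        # Assumption: each control is used at most once.
        chosen = [ids.pop() for _ in range(n_controls)]
        # Each matched control inherits the index date of its case.
        matched[case["id"]] = [(cid, case["index_date"]) for cid in chosen]
    return matched

# Hypothetical toy records.
cases = [
    {"id": "case-1", "sex": "F", "age": 40, "index_date": "2005-03-01"},
    {"id": "case-2", "sex": "M", "age": 55, "index_date": "2006-07-15"},
]
candidates = (
    [{"id": f"ctrl-F{i}", "sex": "F", "age": 40} for i in range(6)]
    + [{"id": f"ctrl-M{i}", "sex": "M", "age": 55} for i in range(6)]
)
matched = select_matched_controls(cases, candidates)
```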

Medication exposure utilized as features
In our analysis, each cohort subject was associated with a feature vector that recorded the exposure of the subject to the commonly used medications for migraine treatment during the study period, including amitriptyline, flunarizine, propranolol, topiramate, and valproic acid. The exposure was measured by the number of days and the dosage in milligrams. The dosage was also calculated in defined daily doses (DDD), as defined by the World Health Organization (http://www.whocc.no/atc_ddd_index/), for validation. The exposure to each category of medications was counted separately. Accordingly, the feature vector is composed of ten elements (five medications, each measured by duration and by dosage). We further normalized the feature values corresponding to the same element in the feature vector by applying the standard min-max normalization.
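Min-max normalization rescales each feature column to the range [0, 1] by subtracting the column minimum and dividing by the column range. The sketch below shows the operation on hypothetical two-column data; mapping a constant column to 0.0 is an assumption made here to avoid division by zero, not something stated in the text.

```python
def min_max_normalize(rows):
    """Rescale every column of a list of equal-length rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0 (assumption)
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Hypothetical exposure values: (days, milligrams) for three subjects.
raw = [[0, 10], [5, 20], [10, 40]]
scaled = min_max_normalize(raw)
```

Normalizing per column keeps features measured in days and features measured in milligrams on a comparable scale before distances between feature vectors are computed.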
The five categories of drugs for migraine treatment mentioned above are all preventive medicines. To validate the medication records of our study population, we also analyzed the prescription orders during the study period for ergotamine, which is frequently used as a relief treatment for migraine attacks.

Diseases utilized as outcomes
Our study focused on those diseases that had been reported to be co-morbidities of migraine (Aamodt et al. 2007; Bigal et al. 2010; Buse et al. 2010; Hagen et al. 2002; Le et al. 2011). Based on the ICD-9 CM codes, these diseases can be classified into six categories: mental disorders, otolaryngology, musculoskeletal illnesses, metabolism and endocrinology, cardiovascular and neurological diseases, and gastroenterology and hepatology. For each subject, outpatient and/or inpatient diagnoses of these disorders during the study period were analyzed. Demographics and clinical variables were compared between migraine cases and controls using the Chi-square test or Student's t test, as appropriate. We employed the odds ratio (OR) with 95 % confidence interval to quantify the risk of a co-morbidity of migraine in different groups of patients. All tests were two-tailed, and p values < 0.05 were considered significant.
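As an illustration of the odds ratio with a 95 % confidence interval, the sketch below uses the common log-based (Woolf) interval for a 2x2 table. The text does not state which interval method was used, so this choice is an assumption, and the counts are hypothetical rather than taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and log-based confidence interval for a 2x2 table.

    a: cases with the disease      b: cases without the disease
    c: controls with the disease   d: controls without the disease
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR); requires all four cells to be non-zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 cases and 10 of 100 controls have the disease.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)  # or_ = (20 * 90) / (80 * 10) = 2.25
```

An interval that excludes 1 corresponds to a significant association at the two-tailed 0.05 level under this approximation.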

The analysis procedure
The analysis procedure consists of two stages. During the first stage, the RVKDE algorithm was invoked to construct one approximate probability density function for the cases, denoted by $\hat{f}$, and another probability density function for the controls, denoted by $\hat{f}_0$. Then, all the cases were examined one by one. Let $s_i$ denote the feature vector corresponding to the i-th case in the dataset. If $\hat{f}(s_i)/\hat{f}_0(s_i)$ was greater than a threshold, the case was labeled as a sample of interest. As mentioned earlier, this screening process aimed to identify those cases that shared some distinctive features in comparison with the controls. During the second stage, the G2DE algorithm was invoked to cluster the cases of interest and provide summarized descriptions of the clusters. However, as mentioned earlier, the number of features, which corresponds to the dimension of the vector space and thus the dimension of the covariance matrices output by the G2DE algorithm, should be limited to a small integer so that an abstract image of the underlying distribution can be obtained easily. Accordingly, we incorporated a feature selection process before invoking the G2DE algorithm. The feature selection process proceeded as follows. First, the correlation matrix of the original ten features was derived based on the cases of interest identified in the first stage of analysis. Then, those eigenvectors with a corresponding eigenvalue larger than 1 were selected to form the factor space. Finally, the factor space was rotated orthogonally, and the component features of the rotated factors with a loading larger than 0.4 were selected to form a subspace into which the original dataset was projected.
Table 1 shows the demographics of the entire dataset, which includes 19,356 migraine cases and 96,780 controls. As expected, the distributions of ages and genders are identical among migraine cases and controls.
Furthermore, for both preventive medicines (i.e., amitriptyline, flunarizine, propranolol, topiramate, and valproic acid) and relief treatment of migraine (i.e., ergotamine), case patients have significantly higher proportions of utilization than control samples. However, for propranolol, topiramate, and valproic acid, case patients have lower exposure dosages and durations. The mean prescription dosage of migraine medication in the current study follows the corresponding DDD (≤1 DDD per day). Figure 1 shows the results obtained with the conventional analysis procedure, i.e., without invoking the proposed density estimation-based procedure. The blue bars show the relative risks of suffering co-morbidities among migraine cases and controls. The odds ratios with respect to the following co-morbidities are: alcohol abuse 1.8/1.67, anxiety state 3.14/3.36, bipolar disorder 2.11/2.6, depression … The results in Fig. 1 reveal that migraine patients were more likely than age- and sex-matched controls to suffer these illnesses. Please refer to Supplementary Table 1 for more detailed statistics.

Results
The red bars in Fig. 1, with the detailed data in Supplementary Table 2, show the relative risks of suffering co-morbidities between the migraine cases classified as samples of interest during the first stage of the proposed analysis procedure and their age- and sex-matched controls. The RVKDE algorithm identified 7,146 migraine patients as samples of interest. Based on the data shown in Fig. 1 and the statistics shown in Supplementary Tables 1 and 2, we can conclude that those migraine cases of interest suffered even higher risks of co-morbidities.
According to the demographics shown in Table 2, the 7,146 cases of interest have a lower male proportion than the remaining 12,210 migraine cases (24.8 vs. 29.1 %; p < 0.001). Moreover, the mean age of the cases of interest is higher than that of the remaining migraine cases (45.3 vs. 41.8; p < 0.001). For both preventive medicines and relief treatment of migraine, cases of interest have significantly higher utilization proportions than the remaining migraine patients. However, for topiramate and valproic acid, the cases of interest have lower exposure dosages and durations. Figure 2 and Supplementary Table 3 show the relative risks of co-morbidities among the cases of interest and the remaining migraine cases. We observed that the cases of interest suffered higher risks of co-morbidities than the remaining migraine patients.
Since Figs. 1 and 2 (and Supplementary Tables 2, 3) confirm that the first stage of the proposed analysis procedure successfully identified, according to characteristics of medication exposure, a subset of migraine cases who suffered higher risks of developing co-morbidities, it is highly desirable to conduct an in-depth analysis. Accordingly, in the second stage of the proposed analysis procedure, the G2DE algorithm was invoked to identify the main clusters among the 7,146 cases of interest. As mentioned earlier, before invoking the G2DE algorithm, factor analysis was carried out to identify the most informative features. The set of cases of interest passed the two criteria commonly adopted to measure the adequacy of applying factor analysis: the Kaiser-Meyer-Olkin (KMO) test yielded a value of 0.502, which is higher than the commonly adopted threshold of 0.5, and Bartlett's test yielded a p value smaller than 0.001, indicating that the correlations among the features are significant. The end result of the factor analysis is that the exposure dosages (in milligrams) of the five preventive medicines for migraine, i.e., amitriptyline, flunarizine, propranolol, topiramate, and valproic acid, were selected.
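The text does not state which software or formulas were used for the two adequacy checks. For reference only, the textbook definitions are given below, assuming that Bartlett's test refers to the test of sphericity usually paired with KMO and that both are computed from the correlation matrix $R = (r_{ij})$ of the $m = 10$ exposure features over the $n = 7{,}146$ cases of interest. The symbols $a_{ij}$, $q_{ij}$, $m$, and $n$ are introduced here for illustration.

$$\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}, \qquad a_{ij} = -\frac{q_{ij}}{\sqrt{q_{ii}\,q_{jj}}}, \qquad (q_{ij}) = R^{-1}$$

$$\chi^{2} = -\left(n - 1 - \frac{2m + 5}{6}\right)\ln\lvert R\rvert, \qquad \mathrm{df} = \frac{m(m-1)}{2}$$

Here $a_{ij}$ is the partial correlation between features $i$ and $j$ given all other features. KMO approaches 1 when the partial correlations are small relative to the plain correlations, and the $\chi^{2}$ statistic tests whether $R$ differs from the identity matrix.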
The G2DE algorithm identified two clusters with distinctive characteristics, shown in Table 3. Comparing the cases in cluster 0 and cluster 1, we find that the cases in cluster 1 were generally older (52.5 vs. 44.7; p < 0.001) but had almost the same gender distribution. Furthermore, for both preventive medicines and relief treatment of migraine attacks, the cases in cluster 1 had significantly larger exposure dosages and longer durations. According to the results shown in Fig. 3 and …

Co-morbidities of migraine
According to the results shown in Fig. 1 and Supplementary Table 1, our study confirms co-morbid relationships between migraine and various diseases even without carrying out the screening process to identify samples of interest. In our study, the diseases included for co-morbidity analysis can be classified into six categories.

Mental disorders
The correlation between mental disorders and migraine has been studied extensively in recent years, and our results match the previous observations. The American Migraine Prevalence and Prevention (AMPP) study demonstrated that both depression (OR = 2.0) and anxiety (OR = 1.8) were included in the co-morbidity profiles of chronic migraine and episodic migraine patients. Based on the Italian version of the Mini International Neuropsychiatric Interview (MINI), Beghi et al. (2010) reported that significant proportions of depression and moderate proportions of anxiety were found among migraine and tension-type headache patients. Dilsaver et al. (2009) showed the association between bipolar disorder and migraine by observing that patients with a family history of bipolar disorder were 4.38 times (OR = 4.38) more likely to have migraine headaches than those without. A recent questionnaire survey revealed that migraine was far more prevalent among abusers of substances such as alcohol, benzodiazepines, or opioids (Beckmann et al. 2012). Because study designs and data sources differ, our quantitative results cannot be compared directly with benchmark values from the literature. Nevertheless, our results in Fig. 1 and Supplementary Table 1 confirm that migraine patients are more likely than controls to suffer mental disorders, which is in conformity with the observations reported in previous studies. Shared serotonergic dysfunction between migraine and affective disorders may contribute to these associations.

Otolaryngology
The association between migraine and asthma is still under debate. The Head-HUNT study showed that both migraine and non-migrainous headache were 1.5 times (OR = 1.5) more prevalent among those with asthma than among those without (Aamodt et al. 2007). On the contrary, another study showed that the risk of developing follow-up incident asthma was not materially higher for migraine patients (Becker et al. 2008). Our results support the co-morbid associations between migraine and allergic rhinitis (OR = 2.19/2.34) as well as chronic pulmonary disease (OR = 1.94/1.84). Recent evidence has suggested that activation and sensitization of primary afferent meningeal nociceptive neurons trigger migraine attacks and that the triggering factor is the involvement of mast cells (Levy et al. 2006). These findings may explain why allergic nasal symptoms accompany migraine. Finally, it has been reported that patients with Meniere's disease had a higher prevalence of migraine and that Meniere's disease patients with migraine suffered more severe vertigo or hearing loss (Cha et al. 2007). Again, the results from our population-based study are in conformity with these findings.
Fig. 2 Relative risks of co-morbidities among cases of interest and the remaining migraine cases for the study period of 24 months before the index date (blue bars), and for the study period of 12 months after the index date (red bars) (color figure online)

Musculoskeletal illnesses
The Nord-Trondelag Health Survey found that the prevalence of chronic headache was 4.6 times (OR = 4.6) higher among individuals with musculoskeletal symptoms than among those without (Hagen et al. 2002). Similarly, 92 consecutive Israeli patients with migraine from a tertiary headache clinic showed a high incidence of fibromyalgia syndrome (Ifergane et al. 2006). In addition, the National Health and Nutrition Examination Survey (NHANES) showed that adults with headache/migraine had increased odds of rheumatoid arthritis (OR = 1.95) (Kalaydjian and Merikangas 2008). Our results in Fig. 1 and Supplementary Table 1 confirm the co-morbid associations between migraine and various musculoskeletal illnesses.

Metabolism and endocrinology
Reports on the association between migraine and diabetes are conflicting: some showed co-morbidity (OR = 1.4) (Bigal et al. 2010), some did not (Le et al. 2011), and yet another reported an inverse association (Burn et al. 1984). This disagreement may be why our results show only a slight co-morbid association between migraine and diabetes mellitus (OR = 1.16/1.15). Similarly, the International Headache Society (IHS) Classification of Headache Disorders, Second Edition, includes "Headache attributed to hypothyroidism", and it was observed that approximately 30 % of 102 hypothyroid patients had bilateral, continuous headache (Moreau et al. 1998). Our observations also support this conclusion (OR = 1.61/1.77), but another population-based study obtained a conflicting result with a negative correlation (OR = 0.5) (Hagen et al. 2001). Elevated levels of cholesterol (OR = 5.97) and triglycerides (OR = 4.42) have been reported to be associated with migraine (Rist et al. 2011), but, to our knowledge, no direct significant association between electrolyte imbalance and migraine has been reported that would support our results (OR = 1.78/1.56). Finally, one epidemiologic study found a positive association between migraine and obesity (Peterlin et al. 2010). This finding is also supported by our analyses (OR = 1.73/1.94), while another population-based study disputed the association (OR = 1.03) (Winter et al. 2009).

Cardiovascular and neurological diseases
For over a decade, it has been a consensus among biomedical scientists that migraine increases atherosclerosis risk and triggers cardiovascular disorders such as angina, ischemic heart disease (OR = 1.94-2.2), and stroke (OR = 1.5-5.46) (Bigal et al. 2010; Kurth et al. 2008; Stang et al. 2005). Schurks et al. (2008) suggested that the MTHFR 677TT genotype magnifies the risk of cardiovascular disease among migraine patients. Bigal et al. (2010) demonstrated a higher cardiovascular risk profile among migraine patients with higher cholesterol and blood pressure levels. On the other hand, the co-morbidity between migraine and epilepsy has been suggested in one recent Dutch study (OR = 1.39) (Nuyen et al. 2006). The linkage between epilepsy and visual aura migraine possibly results from a gene defect located at chromosome 9q21-q22 (Deprez et al. 2007). In our population-based study, all these cardiovascular/neurological illnesses were more prevalent among migraine patients than among matched controls.

Gastroenterology and hepatology
One recent study has concluded that kidney stone is a co-morbidity of migraine (OR = 1.43) (Le et al. 2011), which coincides with our analyses (OR = 1.92/1.83). It was suggested that the dosage of topiramate, which is commonly used for migraine preventive treatment, was inversely correlated with urinary citrate excretion and led to an increased risk of stone formation (Kaplon et al. 2011). On the other hand, Helicobacter pylori infection might cause both hepatic encephalopathy and migraine symptoms in patients with cirrhosis (Hong et al. 2007). Although non-steroidal anti-inflammatory drugs, which are used for the symptomatic relief of headache and migraine, may be ulcer-causing medications, peptic ulcer disease did not have a high prevalence among US headache patients (Rozen and Fishman 2012). This contradicts our observations for the co-morbid relation between migraine and peptic ulcer disease (OR = 2.33), and prescriptions of headache relief drugs without the side effect of ulcer may explain this difference. Finally, increased plasma concentrations of endothelin-1 have been described in both migraine and renal disease patients; this might be the reason for their co-morbid association (Noll et al. 1996).
Fig. 3 Relative risks of co-morbidities among the clusters identified by G2DE for the study period of 24 months before the index date (blue bars), and for the study period of 12 months after the index date (red bars) (color figure online)

Analysis results of density estimation
The co-morbid associations of migraine and various kinds of illnesses can be observed in Fig. 1 … age- and sex-matched controls, or to the remaining 12,210 migraine cases, they were even more likely to suffer these co-morbid illnesses. Our study verifies the effectiveness of density estimation algorithms in medical information analyses. The extracted migraine "patients of interest" had higher utilization proportions of both preventive medicines and relief treatment for migraine than the cases that were filtered out. Because migraine is a common, chronic, recurrent condition, it is believed that patients with significant medication utilization are more representative of this disease. Since some of the co-morbid illnesses studied belong to the Charlson (Charlson et al. 1987) or Elixhauser index (Elixhauser et al. 1998), it is suggested that physicians screen these patients for further risks of poor health conditions. Moreover, 489 of the 7,146 migraine cases of interest could be identified by G2DE according to the characteristics of medication exposures for migraine. Although the selected 489 cases and the remaining 6,657 did not show significant differences in utilization proportions for flunarizine and ergotamine, these migraine patients had larger exposure dosages and longer durations for all kinds of drugs studied. This can be treated as a migraine severity measurement. According to the results shown in Fig. 3 and Supplementary Table 4, exposure dosage/duration of medicines discriminates best for the mental disorders and cardiovascular/neurological diseases. It was observed that the worse the pain profile, the worse the physical functioning and mental health (Wang et al. 2001). Our results are therefore in conformity with the previous conclusions.
Although conventional regression analysis algorithms are applicable to data mining of medical and/or clinical information, they borrow the idea of multi-dimensional contingency tables to determine associations between the dependent variable and the risk factors. Rather than fitting a more saturated model, they tend to reflect an interaction structure between the dependent variables and the corresponding risk factors. In this research, however, we draw on the concept of discriminant analysis: an object that comes from one of two populations with associated densities f1 and f2 can be classified based on the likelihood ratio f1/f2. A significant difference between the density distributions is expected to represent the variation of the dependent variables across distinct groups of the independent variable, e.g., an overall migraine severity measurement quantified by synergistic medication exposures. In fact, we also tried categorizing the migraine patients of interest by age in a contingency table, but this grouping could not discriminate mental disorders the way G2DE can (data not shown). The proposed density estimation-based analysis procedure therefore conceivably provides valuable insights that might be overlooked by conventional methods.
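The likelihood-ratio idea can be sketched in a few lines: given a density for the cases and a density for the controls, a sample is flagged when the ratio of the two exceeds a threshold. In the sketch below, two one-dimensional Gaussians stand in for the estimated densities, and the threshold of 1.0 is hypothetical; the threshold actually used in the study is not reported in the text.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def samples_of_interest(cases, f_case, f_control, threshold):
    """Indices of cases whose density ratio f_case / f_control exceeds threshold."""
    selected = []
    for i, s in enumerate(cases):
        denom = f_control(s)
        # Treat a vanishing control density as an infinitely large ratio.
        ratio = math.inf if denom == 0.0 else f_case(s) / denom
        if ratio > threshold:
            selected.append(i)
    return selected

# Stand-in densities (assumption): cases centred at 2, controls centred at 0.
f_case = lambda x: gaussian_pdf(x, 2.0, 1.0)
f_control = lambda x: gaussian_pdf(x, 0.0, 1.0)

cases = [-1.0, 0.5, 1.5, 3.0]
# With these densities the ratio equals exp(2x - 2), which exceeds 1 only for x > 1.
picked = samples_of_interest(cases, f_case, f_control, threshold=1.0)
```

Any pair of density estimators can be passed in place of the two stand-in functions, since the screening step only needs to evaluate each density at the case's feature vector.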

Limitations
A major strength of our study was utilization of a large population-based medical claims database, but there were some limitations. First, administrative claims reported by hospitals or clinics may be less accurate than clinical diagnoses and observer-rating scales. Second, prescriptions of medications for migraine do not guarantee drug adherence. Third, the administrative claims data of NHIRD did not include detailed personal information like body mass index, living habits, or results of laboratory tests, which might be important confounding factors. Finally, more confounding factors of the outcome diseases, e.g., age, sex, medication drugs, treatment procedures, or associated symptoms, should be taken into account.

Conclusions
In recent years, data analysis based on large medical and clinical databases has gained attention among biomedical researchers. Furthermore, scientists have turned to advanced machine learning and/or data mining approaches to extract valuable clues hidden in large medical and clinical databases. In this paper, we have proposed a density estimation-based data analysis procedure to investigate the co-morbid associations between migraine and suspected diseases by characteristics of medication exposure. The primary objective of this study is to develop a novel analysis procedure that can discover insightful knowledge from large medical databases. The results obtained by applying the proposed two-stage procedure to the co-morbidities of migraine reveal that the procedure can effectively identify a number of clusters of cases with distinctive characteristics. Furthermore, the distinctive characteristics of the clusters are in conformity with recently discovered knowledge in biomedical research. Accordingly, it is conceivable that the proposed analysis procedure can be exploited to provide valuable clues to pathogenesis and facilitate the development of proper treatment strategies.
Three lines of further work are planned. First, since the effectiveness of the proposed analysis procedure has been verified, the method will be used to investigate the characteristics of other diseases, such as osteoporosis or herpes zoster. Second, appropriate statistical tests will be applied to the mined facts to strengthen the persuasiveness of this approach. Finally, the application of various advanced machine learning/data mining algorithms to medical and/or clinical databases will also be studied.