Multi-database study of multiple sclerosis: identification, validation and description of MS patients in two countries

Objective To describe the resources and methods used to identify and validate patients with multiple sclerosis (MS) and matched non-MS patients in each of two databases, and to characterize their demographics, comorbidities and concomitant medications. Methods This study was conducted in two separate electronic medical databases, the United States Department of Defense (DOD) military health care system and the United Kingdom’s Clinical Practice Research Datalink (CPRD) GOLD. We identified patients with a first recorded diagnosis of MS in 2001–2016 (CPRD) or 2004–2017 (DOD) and matched non-MS patients using algorithms appropriate to each database. We described patient symptoms, comorbidities, and medication use at the time of the MS diagnosis and compared them with those of the non-MS cohort. Results We identified 8695 patients with MS and 86,934 matched non-MS patients in the DOD database and 6932 patients with MS and 68,526 matched non-MS patients in CPRD GOLD. Most MS patients were female (around 70%) and were diagnosed before age 60 (88%). MS patients had higher prevalence of depression and other psychiatric conditions at MS diagnosis compared to non-MS patients. Epilepsy, fractures and infections were also more common. MS patients had many expected symptoms and treatments documented in their records prior to the MS diagnosis. Conclusion These results are consistent between the two databases, as well as with previous studies of MS. Future analyses of these patients’ experience after MS diagnosis will provide valuable insights into disease and treatment patterns in relation to risk of chronic diseases and mortality.


Introduction
Multiple sclerosis (MS) is the most prevalent permanently disabling neurological disease among young adults in Europe and North America, and is associated with diminished quality of life and high socio-economic cost [1][2][3]. There have been recent important advancements in the treatment of this condition, but the long-term implications of these treatments on comorbidities and mortality are not well understood. We conducted a study to follow MS patients and matched non-MS patients to assess changes in comorbidities and drug use in the two cohorts during follow-up and over the progression of the disease in the MS patients. The study covered the period 2001-2017 and was conducted separately in three databases across three countries to further compare patient characteristics, treatments, and outcomes in different patient populations. In this paper, we describe the Department of Defense military health system database (DOD) (United States (US)), and the Clinical Practice Research Datalink (CPRD) GOLD (United Kingdom (UK)). Information on the third contributor to this study (Swedish registry data) will be reported at a later date. The objective of this manuscript is to describe the resources and methods used to identify and validate the MS patients and the matched non-MS cohorts in each of the two databases, and to characterize the demographics of MS patients in the US and the UK, as well as comorbidities and concomitant medications at first MS diagnosis, and to compare these with those of a matched general population.

Methods
This study was conducted in two separate databases to describe patients with MS and their variations in comorbidities and concomitant medications at the time of MS diagnosis across different countries. Both data resources are described below. See Table 1 for a comparison of database details and coding systems used in each.

Department of Defense (US)
The US study was conducted utilizing clinical and administrative data from the DOD military health care system and Health ResearchTx, a health research organization that works with the DOD to provide healthcare information for research purposes [4]. The DOD data are a US-based, ongoing longitudinal database with health information on approximately 10 million active beneficiaries (49% female). It comprises data contributed by members of the US DOD, retirees and dependents. Only 14% are active service members. Data are available and of sufficient quality for use in research from October 2003 to 2017 and the average follow-up per patient is 8 years.
For patients who receive medical care at DOD facilities, the data contain virtually complete electronic medical records (EMR), including demographic information, prescription details, clinical events, referrals, hospital admissions, health care resource use, laboratory results, and vital patient characteristics, such as blood pressure, smoking, alcohol consumption and height and weight, and have well-documented validity. The data for all patients in the database are recorded in enrollment, pharmacy, and inpatient/outpatient (diagnoses, procedures) files. Drug prescriptions are coded using the National Drug Code (NDC) and each drug claim includes information on the specific product dispensed, the date, quantity, length of supply and refills provided. All medical events and procedures are coded with the International Classification of Diseases (ICD)-9/10 coding system, DRG codes, and CPT codes (procedures). These include inpatient, outpatient and emergency encounters. Up to 20 diagnoses are listed under each hospitalization in a DOD hospital, and 12 in private sector/civilian hospitals. Around 40% of all patients receive medical care at DOD facilities while the remaining 60% are seen for care in the civilian or private sector. Deaths are identified for all patients through a master death file that is updated by a recurring Social Security Death Index (SSDI) feed from the Social Security Administration.
There is also access to chart review of the original records to validate outcomes and obtain additional clinical details. The long average follow-up time and relatively large and stable population are important features of the data. Finally, the DOD covers patients in all 50 states and is thus

Clinical Practice Research Datalink (CPRD) GOLD
CPRD GOLD, established in 1987, is a large, prospectively collected, anonymized medical record database encompassing over 500 UK general practices, covering over 10 million patients and over 65 million person-years of follow-up. It is a population-based resource broadly representative of the UK population (including England, Scotland, Wales, and Northern Ireland) in terms of age, sex, and minority distribution [5]. CPRD GOLD contains virtually complete electronic medical records, including demographic information, prescription details, clinical events, referrals, hospital admissions, laboratory results, health care resource use, and lifestyle details, such as smoking, alcohol consumption and height and weight, and has well-documented validity. CPRD GOLD data are recorded in multiple files, including the registration, drug, events (diagnoses, procedures), laboratory, and patient characteristics files. The Drug file contains detailed information on all drugs prescribed by the general practitioner using the Gemscript coding system. Drug details include the date, precise drug formulation, strength, and quantity of drug prescribed, and the dosing instructions. In certain instances, a specialist may prescribe a course of treatment; these prescriptions are not captured in the database, though future drug therapy is usually directed through the general practitioner and thus captured in the database. Treatments that do not result in a written prescription, such as infusions, are also only rarely captured. The Event file contains all clinically relevant patient diagnoses (inpatient, outpatient, and emergency department) along with the date of the event. In addition to usual care in the office, general practitioners are required to enter the indication for any new drug therapy, as noted above, as well as all diagnoses resulting from hospitalizations, consultations, or emergency medical care. 
Because the general practitioner is the primary caregiver for all patients in the National Health Service (NHS), all consultants are required to send a letter to the general practitioner describing the relevant clinical events and final diagnoses whenever a patient is seen in hospital or by an outpatient specialist. The key contents of these letters, primarily clinical diagnoses, are then entered into the computer file by the general practitioner. CPRD GOLD uses the Read coding system which provides more detailed clinical codes than ICD. There is also a file with Additional Clinical Details which contains patient information such as blood pressure, height, weight, smoking status, alcohol use, and other lifestyle characteristics. Information on laboratory tests and results is available in a lab file. Finally, additional clinical information and data validation can be obtained through questionnaires to the GPs. These data are not claims based.
In addition to the information in the CPRD GOLD record, it is possible to link around 60% of all CPRD GOLD patients to hospitalization and death registry data (for practices from England only, not Wales, Scotland or Northern Ireland). The Hospital Episode Statistics (HES) contain details of admissions to NHS hospitals in England including dates of admission and discharge, the primary diagnosis, additional diagnoses, and procedures. These data include treatments, such as infusions, provided in outpatient hospital clinics. Diagnoses are recorded using ICD-10. The death registry contains details of all deaths in England through mid-2017 including primary and secondary causes of death as noted on the patient's death certificate. We also received the Office for National Statistics (ONS) death registry data for all patients in the study population with registry linkage. These data include the primary cause of death and up to 15 secondary causes, coded with the ICD-10 coding system.
The UK provides a unique medical environment for epidemiological research because of the universal health system which minimizes selection bias related to differential access to healthcare, and because all patient care is centralized with the general practitioner. Information from all inpatient and outpatient medical encounters is reported to the general practitioner in the form of consultant and discharge letters, and is coded in the electronic patient record resulting in virtually complete ascertainment of all medical outcomes for all patients in CPRD GOLD. Thus, concern about missing events treated by specialists is minimized in this study. The results of validation studies found the data to be of high quality and completeness [6,7]. Access to original records via questionnaires and death registry data, long average follow-up time (> 8 years) and a relatively stable population enrolled in a state-controlled health system are all important features of the data. All patients in this study were identified from the general practice setting and, therefore, are representative of patients attending general practice care in the UK. We have used the CPRD GOLD data for many studies, including many prior studies of MS [8][9][10][11][12][13][14][15].

Study population
Within each database, we identified all people with a first recorded diagnosis of MS in years 2001-2016 (CPRD) or 2004-2017 (DOD). To identify newly diagnosed cases of MS, we required at least 1 year of enrollment in the database before the first MS diagnosis. Cases were then validated through multiple steps that varied according to database. The objective of data recording is fundamentally different in the CPRD compared to the DOD database. One is a GP electronic record (CPRD) while the other is a claims database (DOD); thus, the recording of diagnoses differs greatly between the two, and the criteria for identifying MS patients and defining comorbidities also differ (Table 1). These differences are reflected below in the selection and validation of study outcomes.
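The incident-case requirement described above can be sketched in a few lines. This is a minimal illustration rather than the study's actual code: the function and argument names are hypothetical, the study windows are assumed to run over whole calendar years, and 365 days is assumed to stand for 1 year of prior enrollment.

```python
from datetime import date, timedelta

# Windows for the first recorded MS diagnosis, taken from the text.
# Treating them as full calendar years is an assumption of this sketch.
STUDY_WINDOW = {
    "CPRD": (date(2001, 1, 1), date(2016, 12, 31)),
    "DOD": (date(2004, 1, 1), date(2017, 12, 31)),
}

def is_newly_diagnosed(database, enrollment_start, first_ms_dx):
    """Return True if the first MS code falls in the study window and
    follows at least 1 year (assumed: 365 days) of enrollment."""
    start, end = STUDY_WINDOW[database]
    in_window = start <= first_ms_dx <= end
    enough_history = first_ms_dx - enrollment_start >= timedelta(days=365)
    return in_window and enough_history
```

A patient enrolled only 5 months before the first MS code would be rejected by this check, since the diagnosis could be a prevalent rather than a new one.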
In the DOD, we included all patients with at least one diagnosis of MS or demyelinating disease and at least one prescription for an MS disease-modifying treatment or dalfampridine. We then classified these patients as probable or possible MS patients. Probable cases were those that had at least ten diagnoses of MS or demyelinating disease on different dates and at least five prescriptions for an MS disease-modifying treatment, or five or more MS diagnoses on different dates and at least ten disease-modifying treatment prescriptions or infusion codes. Possible cases were those with demyelinating disease codes (without any MS codes) plus disease-modifying treatment prescriptions, or one to ten MS diagnosis codes and one to ten disease-modifying treatment prescriptions. In US claims databases, a code for the presumptive diagnosis is included each time a test is conducted or a visit occurs to further investigate the disease. Thus, in the DOD database, patients with fewer than ten codes for MS were less likely to have true MS. We reviewed the electronic records from a sample of all MS patients in collaboration with a DOD physician, to validate the MS case selection process.
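The DOD case classification can be expressed as a small decision function. The sketch below is one reading of the criteria, not the study's implementation: the class and field names are hypothetical, patients with demyelinating codes but no MS codes are assumed to be possible cases regardless of their counts, and every included patient who does not meet the probable definition is assumed to be a possible case.

```python
from dataclasses import dataclass

@dataclass
class DodCounts:
    ms_dx_dates: int   # distinct dates with an MS diagnosis code
    any_dx_dates: int  # distinct dates with an MS or demyelinating disease code
    dmt_records: int   # DMT prescriptions or infusion codes (incl. dalfampridine)

def classify_dod(c: DodCounts) -> str:
    # Inclusion: at least one diagnosis and at least one treatment record.
    if c.any_dx_dates == 0 or c.dmt_records == 0:
        return "not included"
    # Assumption: demyelinating codes without any MS code are always "possible".
    if c.ms_dx_dates == 0:
        return "possible"
    many_dx = c.any_dx_dates >= 10 and c.dmt_records >= 5
    many_rx = c.ms_dx_dates >= 5 and c.dmt_records >= 10
    if many_dx or many_rx:
        return "probable"
    # Assumption: all remaining included patients are "possible".
    return "possible"
```

For example, `classify_dod(DodCounts(45, 45, 30))` returns `"probable"`, while a patient with three demyelinating codes, no MS codes and two treatment records is classified as `"possible"`.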
CPRD GOLD is an EMR and not a claims database; consequently, codes are entered for the purposes of the GP's medical record keeping. Unlike in the DOD database, GPs record confirmed, not provisional, diagnoses. Once the diagnosis has been entered in the patient record, it is not necessary to repeat it, and 66% of MS cases in the CPRD had only one or two codes for MS. Furthermore, MS disease-modifying treatments are not regularly captured in the CPRD because many of them are administered as infusions outside the GP's surgery. Thus, to have confidence that a patient truly had MS, we validated MS cases in the CPRD through supporting codes. Probable MS cases were those whose records contained two or more MS diagnosis codes on different dates plus treatment and/or symptom codes. Possible MS cases were those whose records contained (1) at least one MS diagnosis plus codes for symptomatic treatments or symptoms that may or may not have been related to MS, or (2) two or more MS diagnosis codes on different dates with no supporting treatment or symptom codes. Unlikely cases of MS were those whose records contained one MS diagnosis code only and no supporting treatment or symptom codes, or an alternate diagnosis such as prior stroke.
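The CPRD categories can likewise be written as a short decision function. This is an illustrative sketch under stated assumptions, not the study's code: the function and argument names are hypothetical, a single flag stands in for any supporting treatment or symptom code, and an alternate diagnosis is assumed to take precedence over the other criteria.

```python
def classify_cprd(ms_code_dates: int,
                  has_supporting_codes: bool,
                  has_alternate_dx: bool = False) -> str:
    """Classify a CPRD patient with at least one MS diagnosis code.

    ms_code_dates: number of distinct dates with an MS diagnosis code.
    has_supporting_codes: any treatment or symptom code supporting MS.
    has_alternate_dx: an alternate diagnosis (e.g. prior stroke) is recorded.
    """
    if ms_code_dates < 1:
        raise ValueError("patient has no MS diagnosis code")
    # Assumption: an alternate diagnosis overrides the other criteria.
    if has_alternate_dx:
        return "unlikely"
    if ms_code_dates >= 2:
        return "probable" if has_supporting_codes else "possible"
    # Exactly one MS code.
    return "possible" if has_supporting_codes else "unlikely"
```

Under this reading, a patient with a single MS code and no supporting codes is `"unlikely"`, and adding either a second dated MS code or a supporting code moves the patient to `"possible"`.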
In light of the small number of MS diagnoses recorded per patient and the absence of treatment information in GP records, we conducted a validation study in the CPRD population: we requested and received questionnaires from GPs on a sample of patients to assess our MS case algorithm. Thus, despite the differences in the databases and the basis for recording diagnoses in each, we were able to develop and validate database-appropriate case selection procedures customized to each resource.
To exclude uncertain or incorrect MS diagnoses, cases in the CPRD and DOD were excluded where (1) amyotrophic lateral sclerosis was coded at any time in the record, (2) another alternate diagnosis was present at some time in the record and only one MS code (CPRD) or fewer than five MS codes (DOD) were present, or (3) only one MS code (CPRD) or fewer than five MS codes (DOD) were present and the patient had a code for stroke or transient ischemic attack at any time prior to or up to 6 months after the MS code.
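The three exclusion rules can be combined into a single check. The following is a minimal sketch, not the study's implementation: all names are hypothetical, "the MS code" is assumed to mean the first recorded MS code, and 6 months is approximated as 183 days.

```python
from datetime import date, timedelta

# Minimum number of MS codes below which rules (2) and (3) apply.
MIN_MS_CODES = {"CPRD": 2, "DOD": 5}

def is_excluded(database, n_ms_codes, first_ms_date,
                has_als, has_alternate_dx, stroke_tia_dates):
    # Rule 1: amyotrophic lateral sclerosis coded at any time.
    if has_als:
        return True
    # Rules 2 and 3 only apply to patients with few MS codes.
    if n_ms_codes >= MIN_MS_CODES[database]:
        return False
    # Rule 2: another alternate diagnosis at any time.
    if has_alternate_dx:
        return True
    # Rule 3: stroke/TIA any time before, or up to ~6 months after, the MS code.
    cutoff = first_ms_date + timedelta(days=183)  # assumed length of 6 months
    return any(d <= cutoff for d in stroke_tia_dates)
```

For instance, a CPRD patient with one MS code on 1 January 2010 and a stroke code on 1 March 2010 would be excluded, whereas the same stroke code would not exclude a patient with two or more MS codes.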
The MS diagnosis date (cohort entry date) was the date of the first recorded MS code in both the DOD and CPRD GOLD databases. All patients with MS were then matched to up to ten patients without MS on age, sex, month and year of cohort entry in the database, and geography (same practice in the CPRD; same region in the DOD).
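A simple way to picture the matching step is exact matching on a composite key. The sketch below is an illustration only and makes several assumptions not stated in the text: age is matched on exact birth year, controls are drawn at random without replacement, the record fields are hypothetical, and the requirement that a control be present in the database in the case's month and year of cohort entry is omitted for brevity.

```python
import random
from collections import defaultdict

def match_controls(cases, pool, max_controls=10, seed=0):
    """Match each case to up to `max_controls` non-MS patients
    sharing birth year, sex and geography (practice or region)."""
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for p in pool:
        by_key[(p["birth_year"], p["sex"], p["region"])].append(p["id"])
    for ids in by_key.values():
        rng.shuffle(ids)  # random order within each matching stratum
    matches = {}
    for c in cases:
        ids = by_key[(c["birth_year"], c["sex"], c["region"])]
        n = min(max_controls, len(ids))
        # pop() removes the control so it cannot be reused for another case
        matches[c["id"]] = [ids.pop() for _ in range(n)]
    return matches
```

Because matching is "up to ten", a case in a sparsely populated stratum simply receives fewer controls, which is consistent with the overall ratio of non-MS to MS patients being slightly below ten in both databases.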

Study outcomes
In each database, for each MS and non-MS patient, we identified MS symptoms, comorbidities and concomitant medications present at the time of cohort entry (MS diagnosis date) or at the matched date (month and year) for the non-MS patients. See Tables 3, 4, 5 and 6 for outcomes included.
A patient was considered to have a chronic study comorbidity, such as asthma, epilepsy or liver disease, if the disease was recorded at least once (CPRD) or at least five times (DOD) any time prior to or on the cohort entry date. Chronic study outcomes in the DOD require more recordings because of the claims nature of the data and to avoid misclassification of outcomes. Treated depression, hypertension and type II diabetes were defined as the presence of at least one diagnosis code and one respective prescription within 90 days of each other. In the CPRD, cancer was defined as one or more diagnosis codes for cancer or history of cancer. In the DOD, cancer was defined as five or more diagnosis codes within 6 months of each other, or ten or more diagnosis codes at any time before cohort entry, or at least one diagnosis code for a history of cancer or a secondary cancer before cohort entry. For acute outcomes, a patient was only required to have one diagnosis code to be included. A patient was considered to have an acute infection comorbidity if they had an infection code in the year prior to cohort entry. We required multiple diagnosis codes and supporting treatments or symptom codes for meningitis diagnoses in the DOD. Hospitalized infections were inpatient claims (DOD) or diagnosis codes with a hospitalization code within 1 week (CPRD). Suicidal behaviors were defined as having at least one code for suicidal ideations, suicide attempts, intentional self-harm or overdose.
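Two of the comorbidity definitions above lend themselves to a short sketch: the database-specific code count for chronic conditions and the 90-day diagnosis-prescription window for treated conditions. The function names are hypothetical, and the restriction of treated-condition records to those on or before cohort entry is an assumption of this example.

```python
from datetime import date

# Minimum number of recordings on or before cohort entry, per the text.
MIN_CHRONIC_CODES = {"CPRD": 1, "DOD": 5}

def has_chronic_condition(database, dx_dates, cohort_entry):
    n = sum(1 for d in dx_dates if d <= cohort_entry)
    return n >= MIN_CHRONIC_CODES[database]

def has_treated_condition(dx_dates, rx_dates, cohort_entry, window_days=90):
    """True if some diagnosis and some prescription lie within
    `window_days` of each other (both assumed on or before cohort entry)."""
    dx = [d for d in dx_dates if d <= cohort_entry]
    rx = [r for r in rx_dates if r <= cohort_entry]
    return any(abs((d - r).days) <= window_days for d in dx for r in rx)
```

The higher threshold in the DOD reflects the point made in the text: in claims data a single code may be provisional, so several recordings are required before a chronic condition is counted.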
Concomitant medications, recorded in the year prior to MS diagnosis date or matched date in the non-MS patients, included treatments for MS symptoms, as well as treatments for cardiovascular disease and other comorbidities. Many anticonvulsant therapies are indicated for both epilepsy and MS symptoms. We categorized anticonvulsant therapies as epilepsy treatments when the patient had a current or prior diagnosis of epilepsy. Otherwise, we categorized anticonvulsant therapies as treatments for MS symptoms.
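The anticonvulsant rule reduces to a single check on the patient's epilepsy history. This sketch assumes that "current or prior" is evaluated relative to the prescription date; the function name and category labels are hypothetical.

```python
from datetime import date

def categorize_anticonvulsant(rx_date, epilepsy_dx_dates):
    """Assign an anticonvulsant prescription to a treatment category."""
    # Assumption: an epilepsy diagnosis on or before the prescription date
    # counts as a "current or prior" diagnosis.
    if any(d <= rx_date for d in epilepsy_dx_dates):
        return "epilepsy treatment"
    return "MS symptom treatment"
```

A patient with no epilepsy diagnosis, or one recorded only after the prescription, has the prescription counted among treatments for MS symptoms.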

Statistical analysis
We described basic characteristics and study outcomes of the MS and non-MS patients at cohort entry (date of first recorded MS code/matched date). Proportions were compared using a chi-square test or, where cell sizes were less than 5, a Fisher's exact test. Statistical analyses were carried out using SAS Release 9.3 (DOD study) and SAS Release 9.4 (CPRD study) (SAS Institute Inc., Cary NC, USA). Testing was not adjusted for multiplicity as this was a descriptive study of two independent large data sets exploring clinically relevant trends.
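The choice between the two tests can be illustrated for a 2×2 table (outcome present/absent by MS/non-MS). The analyses in this study were run in SAS; the sketch below is a standard-library Python illustration only. It assumes an uncorrected Pearson chi-square, a two-sided Fisher's exact test, and that "cell sizes" refers to observed counts; all function names are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value, 1 df."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum of the probabilities of all
    tables with the same margins that are no more likely than the observed."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def compare_proportions(a, b, c, d):
    # Assumption: the "< 5" rule is applied to observed cell counts.
    if min(a, b, c, d) < 5:
        return "fisher", fisher_exact_2x2(a, b, c, d)
    return "chi-square", chi_square_2x2(a, b, c, d)[1]
```

With cohorts of this size, most comparisons fall under the chi-square branch; the exact test matters only for rare outcomes where at least one cell is very small.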

Results
DOD
We identified 8695 patients with MS and 86,934 matched non-MS patients in the DOD database. Most MS patients were female (71%), and a large portion were diagnosed before age 40 (45%) or between ages 40 and 59 (43%). See Table 2 for more details. Most cases were considered probable cases (N = 7172, 82.5%), i.e. they had many MS diagnoses (median = 45) and many disease-modifying treatment prescriptions (median = 30). Since we determined that in the DOD people with no disease-modifying treatments were not cases, there were few people who were considered only possible cases: 37 (0.4%) people had demyelinating disease codes with treatment and no MS codes, 125 (1.4%) people had five to nine MS diagnoses and five to nine MS disease-modifying treatments, and 1361 (15.7%) people had fewer than five MS diagnoses and/or fewer than five MS disease-modifying treatments. Possible cases had shorter records after the first MS diagnosis than probable cases (median 3.1 versus 7.5 years), which may account for the smaller number of diagnoses and prescriptions in this group. Among possible MS patients with records longer than 3 years, many had either few diagnoses of any kind in their records or had a few prescriptions for MS disease-modifying treatments and then stopped. MS patients differed from the matched non-MS patients at cohort entry. MS patients had a history of many more symptoms of MS, including optic neuritis, disturbances of skin sensation, paresis, paralysis and muscle weakness, spasms/involuntary movements/lack of coordination, abnormality of gait, neuropathic pain and neuropathies, dizziness, etc. They were also more likely to have treated depression prior to the date of the first MS code or matched date in the non-MS patients (cohort entry date) compared to non-MS patients. See Table 3. MS patients were also slightly more likely to have had asthma or COPD, epilepsy, autoimmune disorders, hypertension, dyslipidemia, various cardiovascular diseases, and fracture.
MS and non-MS patients were similar with respect to their history of other comorbidities. See Table 4. Finally, MS patients were more likely to have had many types of infections in the year prior to cohort entry compared to non-MS patients though the differences were small for many infection types. The greatest differences between MS and non-MS patients were found for urinary and kidney infections, and respiratory and throat infections. See Table 5. The use of medications that correspond to the treatment of MS symptoms and the comorbidities in the year before cohort entry differed between MS and non-MS patients in a predictable manner. MS patients received more drugs to treat movement disorders, pain, and fatigue. They also received more treatments for epilepsy, depression and other psychiatric conditions, and infections. See Table 6.

CPRD GOLD
We identified 6932 patients with MS and 68,526 matched non-MS patients in CPRD GOLD. The age at onset and sex distribution of MS cases are similar in the CPRD and the DOD data (see above). See Table 2 for details. After receiving questionnaires for a sample of all MS patients to validate the diagnoses, we calculated a positive predictive value (PPV) of 92% overall. The PPV was 100% for probable MS patients (53.5% of all MS patients), 84% among possible MS cases and 56% among unlikely MS patients. As in the DOD data, MS patients differed in many ways from the matched non-MS patients at the date of the first MS code (cohort entry), or the matched date in the non-MS patients. MS patients had a history of many more symptoms of MS, including optic neuritis, nerve disorders, ataxia, and paresthesias, and more neurology referrals and visits. See Table 3. They were also more likely to have treated depression prior to the cohort entry date compared to non-MS patients. Differences in the prevalence of other comorbidities, including acute infections, were also similar to the DOD findings. See Tables 4 and 5. The greatest differences in infection rates were found for urinary and kidney infections, and eye and ear infections. The use of medications that correspond to the treatment of MS symptoms and the comorbidities in the year before cohort entry was more prevalent in the MS patients, and the differences were similar in the CPRD GOLD and DOD databases. See Table 6.

Discussion
This study used data from the CPRD GOLD and DOD, both valuable databases for the conduct of disease epidemiology studies, to characterize MS patients and matched patients who did not have MS, with respect to symptoms, comorbidities, and treatments at the time of the MS diagnosis (or the matching date in the non-MS patients). The results of this study are consistent with what is known about early MS symptoms and their treatments, and MS comorbid conditions. In addition, new associations were observed that will add valuable clinical insights to support earlier disease diagnosis and treatment. For example, fracture was more common at cohort entry in MS compared to non-MS patients as was asthma/COPD. Breast cancer was less common in MS compared to non-MS patients and cardiovascular disease was similar in MS and non-MS patients at the time of MS diagnosis. Future analyses of these patients' experience after MS diagnosis will provide valuable insights into disease and treatment patterns in relation to risk of chronic diseases and mortality. Note that the results of this study address comorbidities of MS at the time of the first recorded MS diagnosis. We did not attempt to identify the date of MS onset for this study so all results should be interpreted accordingly. This study adds to the knowledge of signs, symptoms and concomitant disease present when the MS diagnosis has been made and is not a study of causes of MS.
While the findings of this study are broadly similar to earlier findings, there are some apparent differences between US and UK patients in both the MS and non-MS cohorts. For example, MS patients tend to be diagnosed at a younger age in the US compared to the UK. In the US, the most common comorbidities present in MS patients at cohort entry were treated depression, treated hypertension, fracture, dyslipidemia and 'other psychiatric disorders'. The most common comorbidities in the UK MS population, at cohort entry, were fracture and treated depression, followed by asthma/COPD, treated hypertension and dyslipidemia. It should be noted that asthma was more prevalent in the UK while hypertension was more prevalent in the US in both MS and non-MS patients. These differences are likely to reflect a combination of differences in the health care systems, the source of data (claims versus EMR), as well as differences in the US and UK populations. It is also possible that the longer look back in the CPRD (data go back as far as 1988 for some patients) could explain the higher prevalence of certain chronic conditions in the CPRD. Other conditions that were less prevalent overall but more common in MS patients compared to non-MS patients include epilepsy and venous thromboembolism (in the DOD) or peripheral vascular disease (in the UK). Certain terms including vestibular and labyrinthine disorders/vertigo, neuritis/neuralgia/radiculitis, and disturbance of skin sensation/paraesthesia, numbness or tingling are grouped differently in the DOD compared with the CPRD (ICD versus Read), demonstrating that, for some conditions, direct comparison of frequencies between DOD and CPRD patients is difficult to interpret.
Our results generally agree with the few prior reports that evaluated comorbidities at time of MS diagnosis. In the current study, in both databases, the prevalence of depression in MS patients (~ 21%) was higher than in non-MS patients, as was found in a Canadian study that reported 19.1% prevalence of depression at MS diagnosis [16]. The prevalence of other psychiatric diagnoses at MS diagnosis was also higher compared to non-MS patients in Canadian [16] and French MS patients [17]. Finally, a Danish study reported an odds ratio of 1.4 (95% confidence interval (CI) 1.05-1.88) for a diagnosis of depression or anxiety during the 2 years prior to MS diagnosis [18]. In both the US and UK databases, the prevalences of treated diabetes and dyslipidemia at MS diagnosis were similar for MS and non-MS patients. In the CPRD only, the prevalence of treated hypertension was also similar in MS and non-MS patients. Treated hypertension was marginally higher in MS patients in the DOD cohort compared to non-MS patients. In the Canadian study, the prevalences of hypertension, diabetes and dyslipidemia at MS diagnosis were 15.2%, 5.69%, and 6.89%, respectively, and both hypertension and diabetes (but not dyslipidemia) had higher prevalence in MS compared to non-MS patients [16]. Among French patients aged 15-45, the combined prevalence of Type I and Type II diabetes was substantially higher in MS patients (18.5%) than non-MS patients (8.6%) [17]. In our study, the age range was up to age 90 and we reported on type II diabetes only; therefore, the results of these studies are not comparable.
In both cohorts in our study, the prevalence of epilepsy at MS diagnosis was approximately twice as high as in non-MS patients, as was seen in the Canadian study (MS: 1.93% vs. non-MS: 0.89%) [16]. Similarly, in both the CPRD and DOD cohorts, the prevalence of "asthma or COPD" was higher for MS than non-MS patients (5.2% vs 4.5% and 16.0% vs 14.7%), consistent with the finding of the Canadian study, which reported a higher prevalence of chronic lung disease at MS diagnosis versus non-MS patients (12.1% vs. 9.14%) [16].
Other diagnoses common in MS patients in both the US and the UK data included multiple types of infections such as urinary and kidney, respiratory and throat, and eye and ear. All were significantly more common amongst the MS compared to the non-MS patients in the year prior to the MS diagnosis. Although the prevalence and incidence of infections are higher among MS patients after diagnosis [19], our study adds important new evidence of heightened infection risk prior to MS diagnosis. One of the findings, an increased risk of urinary tract infections, could be due to MS-related reduction of muscular control, but there may be alternate explanations. Although many studies have reported associations of bacterial and viral infections with subsequent MS, in a meta-analysis, only Epstein-Barr virus (and seropositivity to EBV nuclear antigen (anti-EBNA IgG)) and infectious mononucleosis have consistent positive associations with MS [20].
The presence of many MS symptoms at the time of MS diagnosis was predictably common among MS patients in both the US and UK data resources. Results were consistent with a recent meta-analysis that reported the following prevalences in patients with existing MS: neuropathic pain 28.5%, painful spasms 15.0%, and trigeminal neuralgia 3.8% [21]. Neurologic conditions, such as optic neuritis, paresthesia, and dizziness, were similarly common prior to the diagnosis of MS, as expected, in both the US and UK. Optic neuritis is often the presenting symptom of MS: longitudinal studies show that 34-75% of patients presenting with optic neuritis in the UK and US develop MS [22,23], and a study of Chinese individuals with optic neuritis followed for 9 years reported a relative risk > 30 for MS [24]. The prevalences of other symptoms such as treated depression and malaise or fatigue were also similar in the two databases. Likewise, the treatments for these symptoms were common among MS patients in both the DOD and CPRD. Use of drugs to treat MS symptoms was notable in that a higher proportion of MS patients in the DOD compared to the CPRD received drugs for MS symptoms including spasticity, convulsions, and pain. Use of pain medications was also more common in non-MS patients in the DOD versus the CPRD. Recognition of patterns of symptoms and treatments that manifest at MS onset but before first diagnosis could lead to earlier diagnosis and treatment for MS patients and potentially improved long-term prognosis. Use of statins, steroids, and antibiotics was also higher in the DOD for both MS and non-MS patients, which may reflect a general tendency toward higher prescribing in the US compared to the UK.
The DOD and CPRD databases each have their own strengths and limitations. We used different criteria to identify MS patients in the DOD and CPRD databases. In the US DOD database, patients with confirmed MS are treated with MS-specific therapies. Hence, MS treatment records were included as a requirement in the case definition. In the UK, most MS treatments are not recorded on GP computers because they are not outpatient prescriptions but rather infusions given in specialty clinics or are prescribed by specialists. In addition, not all people with MS receive MS-specific treatment in the UK. Furthermore, while the nature of claims data leads to repeated entries for chronic diseases such as MS in the DOD database, the EMR character of the CPRD results in as few as one or two codes for chronic conditions over a span of many years. Thus, in the CPRD, the MS case definition did not include MS treatment, nor multiple MS records as required conditions. On the other hand, the CPRD does contain supplementary codes for symptoms and services related to MS which were used to identify likely cases. These differences required different validation processes and inclusion criteria to identify "true" MS cases in the two databases. Despite these differences in the database-specific case criteria, we were able to identify populations of MS patients that we validated through questionnaires or electronic record review in both data sources. The similarity in results between the two data resources provides confidence in the case selection process. Despite this, it is possible that some patients who did not have MS were included in the study. However, it is unlikely that there were many such cases or that their inclusion had much influence on the study results. The CPRD has been used for many studies of MS in the past where extensive validation was done [8][9][10][11][12][13][14][15]. These studies demonstrated the high predictive value of the MS case definition applied in this study. There has been one prior study of MS in the DOD database [4].
Information on patient characteristics such as smoking and BMI was available for most patients in the CPRD (> 80%) but not in the DOD, where data were only available for around 35%. Thus, evaluation of the associations of these characteristics with MS will be restricted to the CPRD data.
Information on treatments for MS and concomitant medications is available in the DOD data and includes outpatient prescriptions as well as infusions and injections. Concomitant medications are also captured in the CPRD and treatments for MS symptoms were noted for many patients. However, a limitation of the CPRD is the absence of treatments administered outside the GP office. Thus, most MS treatments were not captured in the CPRD portion of the study. This limitation does not impact the results provided here but will have implications for further analyses of MS outcomes according to MS treatments.
The DOD database in the US and the CPRD in the UK constitute valuable resources for the study of MS patients, their symptoms, comorbidities, and medication use at diagnosis and after. Long follow-up will enable us to describe the prognosis of MS patients compared to patients without MS over time.