Crohn’s disease (CD) and ulcerative colitis (UC), collectively referred to as the inflammatory bowel diseases (IBDs), are characterized by chronic inflammation of the gastrointestinal tract. The causes of these diseases are unknown; however, genetics/family history and environmental factors appear to influence disease risk. Epidemiological studies of IBD in the USA are necessary to quantify the public health burden of disease and inform policy regarding the allocation of resources and provision of health services for affected individuals. Because IBD is not a reportable condition in the USA and comprehensive, nationwide registries for IBD surveillance have not been established, published studies on the epidemiology of IBD in the USA [1–6] are limited and primarily comprise studies that sampled small, geographically restricted populations. Furthermore, no studies of IBD prevalence have been published using data from the last 5 years; current time trends therefore remain unknown.

The aims of this study were (1) to estimate the current prevalence of IBD in the USA as of 2009 using a large, regionally diverse population of commercially insured individuals, and (2) to examine recent trends in disease prevalence. A further aim was to characterize differences in disease epidemiology by geographic region, age, and sex.

Materials and Methods

Study Design and Data Source

Using the inpatient, outpatient, and pharmaceutical insurance claims data sets contained within the PharMetrics Choice Patient-Centric Database (IMS Health, Watertown, MA), we performed three consecutive cross-sectional studies for the following 2-year time periods: 1 January 2004–31 December 2005, 1 January 2006–31 December 2007, and 1 January 2008–31 December 2009. This database pools claims data from 20 to 25 commercial health plans covering all four US census regions. The majority of patients in the database have both medical and pharmacy benefits. Mean health plan enrollment is 6.4 [interquartile range (IQR) 5.3–7.2] years for the 2004–2005 cohort, 5.8 (IQR 4.3–7.1) years for the 2006–2007 cohort, and 4.6 (IQR 3.2–6.4) years for the 2008–2009 cohort. Prior studies have reported PharMetrics to be representative of the national commercially insured population on a variety of demographic measures [7].

Source Populations

All patients in the database with continuous health plan enrollment for at least one of the above 2-year time periods comprised the source population for this study. Individuals turning 65 years of age or older (year of period end − year of birth) during each 2-year period were removed from the source population to avoid the possibility of missing data due to Medicare dual-enrollment. The following demographic elements were available for all members of the source population: year of birth, gender, and US census region (Northeast, South, Midwest, and West).
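The eligibility rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the `Member` record, its field names, and the representation of enrollment as a single continuous span are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Member:
    birth_year: int
    enrolled_from: date  # start of a continuous enrollment span (hypothetical field)
    enrolled_to: date    # end of that span (hypothetical field)


def in_source_population(m: Member, period_start: date, period_end: date) -> bool:
    """True if the member is enrolled for the whole 2-year period and is under 65."""
    continuously_enrolled = (
        m.enrolled_from <= period_start and m.enrolled_to >= period_end
    )
    # Age is approximated as year of period end minus year of birth, as in the text.
    age_at_period_end = period_end.year - m.birth_year
    return continuously_enrolled and age_at_period_end < 65
```

Under this sketch, a member born in 1944 is excluded from the 2008–2009 period (2009 − 1944 = 65), whereas a member born in 1945 with full enrollment is retained.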

Case Identification

The case definition used to identify prevalent cases of IBD was adapted from a previously validated University of Manitoba IBD Epidemiology Database definition [8] and has been used in prior epidemiological studies of IBD [3, 9–12]. For each 2-year period, cases were identified as individuals with at least three healthcare contacts, on different days, associated with an International Classification of Diseases, 9th revision (ICD-9) diagnosis code for CD (555.x) or UC (556.x). Additionally, individuals with at least one claim for CD or UC who also had at least one pharmacy claim for an IBD-specific medication (mesalamine, olsalazine, balsalazide, sulfasalazine, 6-mercaptopurine, azathioprine, infliximab, adalimumab, certolizumab, natalizumab, methotrexate, and enteral-release budesonide) were included in our case definition. For those patients who had submitted claims for both CD and UC, disease assignment was made according to the majority of the last nine claims. If a distinction between CD and UC could not be made using these criteria, cases were designated as IBD not otherwise specified (NOS). Each 2-year time period had a unique base population depending on the number of patients who met the above-mentioned eligibility criteria. Some individuals were included in more than one time period if they met the inclusion criteria over multiple years.
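The case definition can be expressed as a short classification routine. The following Python sketch is illustrative only: the `Claim` record, the boolean `has_ibd_rx` flag (standing in for a pharmacy claim for any of the listed medications), and the treatment of an exact tie among the last nine claims as IBD NOS are assumptions, since the text does not specify these details.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Claim:
    service_date: date
    icd9: str  # diagnosis code, e.g. "555.9" (hypothetical record layout)


def disease_of(code: str) -> Optional[str]:
    """Map an ICD-9 code to CD (555.x), UC (556.x), or None."""
    if code.startswith("555"):
        return "CD"
    if code.startswith("556"):
        return "UC"
    return None


def classify(claims: List[Claim], has_ibd_rx: bool) -> Optional[str]:
    """Return 'CD', 'UC', 'IBD NOS', or None if the case definition is not met."""
    ibd = sorted(
        (c for c in claims if disease_of(c.icd9) is not None),
        key=lambda c: c.service_date,
    )
    contact_days = {c.service_date for c in ibd}
    # Rule 1: at least three contacts on different days.
    # Rule 2: at least one IBD claim plus an IBD-specific pharmacy claim.
    is_case = len(contact_days) >= 3 or (len(ibd) >= 1 and has_ibd_rx)
    if not is_case:
        return None
    # Assign disease by the majority of the last nine IBD claims.
    counts = Counter(disease_of(c.icd9) for c in ibd[-9:])
    if counts["CD"] > counts["UC"]:
        return "CD"
    if counts["UC"] > counts["CD"]:
        return "UC"
    return "IBD NOS"  # assumption: an exact tie cannot be resolved
```

For example, a patient with a single UC claim and a mesalamine prescription would be classified as UC, whereas a patient with two CD claims and no IBD medication would not be counted as a case.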

Statistical Analysis

Analyses were performed for each of the time periods: 2004–2005, 2006–2007, and 2008–2009. Prevalence estimates were made for CD, UC, and total IBD (CD + UC + IBD NOS) as of 31 December 2005, 31 December 2007, and 31 December 2009, respectively, by dividing the number of identified cases by the corresponding number of individuals in the source population. All estimates were reported as cases per 100,000 persons, and also stratified by age, gender, and US census region. Exact 95 % confidence limits (95 % CI) were calculated using a variation on the Wilson Score confidence interval method.
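To make the prevalence calculation concrete, the sketch below computes a prevalence per 100,000 persons and a standard Wilson score interval. The text states only that "a variation on" the Wilson method was used, so the unmodified Wilson formula shown here is an assumption and may differ from the variant applied in the study; the function names are illustrative.

```python
import math
from typing import Tuple


def prevalence_per_100k(cases: int, population: int) -> float:
    """Point prevalence expressed as cases per 100,000 persons."""
    return cases / population * 100_000


def wilson_interval_per_100k(
    cases: int, population: int, z: float = 1.959964
) -> Tuple[float, float]:
    """Standard Wilson score interval (95 % by default), per 100,000 persons."""
    n = population
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (centre - half_width) * 100_000, (centre + half_width) * 100_000
```

Unlike the simple normal-approximation interval, the Wilson interval never extends below zero, which matters for sparse strata such as young children within a single census region.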

To determine the independent effects of age, gender, and region on the prevalence of IBD in the final time period, we used multivariable logistic regression to compare prevalence by age group, gender, and geographic region, with each factor adjusted simultaneously for the other two.
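As a simplified illustration of the quantity such a model reports, the sketch below computes an unadjusted odds ratio with a Wald (Woolf) confidence interval from a 2 × 2 table. This is what a logistic regression with a single binary covariate would return; it is not the multivariable model used in the study, which additionally adjusts for the other factors. The function name and argument names are assumptions.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    cases_group: int, n_group: int, cases_ref: int, n_ref: int, z: float = 1.959964
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio (group vs. reference) with a Wald confidence interval."""
    a, b = cases_group, n_group - cases_group  # cases / non-cases in the group
    c, d = cases_ref, n_ref - cases_ref        # cases / non-cases in the reference
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) by Woolf's formula.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to a statistically significant difference in prevalence between the two groups at the chosen confidence level.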

To estimate the projected number of people with IBD in the US population as of 2009 and to determine adjusted prevalence estimates, we multiplied the age, gender, and region-specific prevalences obtained in our sample by the corresponding US population according to census data.
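The projection described above amounts to direct standardization: each stratum-specific prevalence from the sample is applied to the census count for the same stratum, and the results are summed. A minimal Python sketch follows; the dictionary-based layout keyed by (age group, gender, region) and the function names are assumptions made for illustration.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str, str]  # (age group, gender, region); hypothetical key layout


def project_cases(
    sample_cases: Dict[Stratum, int],
    sample_pop: Dict[Stratum, int],
    census_pop: Dict[Stratum, int],
) -> float:
    """Expected number of cases in the census population."""
    total = 0.0
    for stratum, n_census in census_pop.items():
        stratum_prevalence = sample_cases[stratum] / sample_pop[stratum]
        total += stratum_prevalence * n_census
    return total


def standardized_prevalence_per_100k(
    sample_cases: Dict[Stratum, int],
    sample_pop: Dict[Stratum, int],
    census_pop: Dict[Stratum, int],
) -> float:
    """Directly standardized prevalence per 100,000 persons."""
    projected = project_cases(sample_cases, sample_pop, census_pop)
    return projected / sum(census_pop.values()) * 100_000
```

Because the weights come from the census rather than from the sample, the standardized estimate is not distorted by strata that are over- or under-represented among the commercially insured.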

Next, in order to facilitate our analysis of trends in the prevalence of CD and UC across time periods, we standardized the estimates obtained in each period to reflect the age, gender, and regional distribution of the US population according to 2009 census data and conducted pairwise comparisons (2006–2007 vs. 2004–2005; 2008–2009 vs. 2004–2005; 2008–2009 vs. 2006–2007) using t tests. We recognize that t tests are based on the assumption that the samples in the different time periods are independent and, in fact, some individuals in this study contribute to more than one time period. However, such repeated sampling can be expected to increase the precision of the comparison between time periods, making the tests conducted here conservative.
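A pairwise comparison of two standardized prevalences can be sketched as follows. The text does not give the test formula, so this example assumes one common large-sample formulation: a binomial variance within each stratum, census-share weights, independence between periods, and a normal approximation for the two-sided P value. All names are illustrative.

```python
import math
from typing import Dict, Tuple

Counts = Dict[str, int]
Period = Tuple[Counts, Counts]  # (cases by stratum, population by stratum)


def standardized_rate_and_variance(
    cases: Counts, pop: Counts, weights: Dict[str, float]
) -> Tuple[float, float]:
    """Directly standardized prevalence and its variance; weights sum to 1."""
    rate = 0.0
    variance = 0.0
    for stratum, w in weights.items():
        p = cases[stratum] / pop[stratum]
        rate += w * p
        variance += w**2 * p * (1 - p) / pop[stratum]
    return rate, variance


def compare_periods(
    earlier: Period, later: Period, weights: Dict[str, float]
) -> Tuple[float, float]:
    """Test statistic and two-sided P value for later minus earlier prevalence."""
    r0, v0 = standardized_rate_and_variance(earlier[0], earlier[1], weights)
    r1, v1 = standardized_rate_and_variance(later[0], later[1], weights)
    # Treats the two periods as independent samples (see the caveat in the text).
    statistic = (r1 - r0) / math.sqrt(v0 + v1)
    p_value = math.erfc(abs(statistic) / math.sqrt(2))
    return statistic, p_value
```

With source populations in the millions, a statistic of this form behaves essentially like the t statistic described in the text, and even small absolute differences in prevalence yield very small P values.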

All statistical analyses were performed using SAS ver. 9.1 or above (SAS Institute, Cary, NC).

Ethical Considerations

As this study was an analysis of existing, de-identified data, it is not considered to be human subject research.


Results

Source Population

The source population for each 2-year period comprised 5,621,520 individuals (29 % children <20 years of age) in 2004–2005, 8,139,941 individuals (28 % children) in 2006–2007, and 12,538,475 individuals (28 % children) in 2008–2009. Additional demographics for the source population are shown in Table 1.

Table 1 Source population and demographics by time period in a commercially insured US population (2004–2009)

IBD Prevalence in 2009

In this commercially insured population, the overall prevalences of CD and UC in the pediatric population were 58 and 34 per 100,000, respectively. In the adult population, the overall prevalences of CD and UC were 241 and 263 per 100,000, respectively. Additional details on the region-specific prevalence estimates are provided in Tables 2 and 3.

Table 2 Prevalence of Crohn’s disease in a commercially insured US population (2008–2009)
Table 3 Prevalence of ulcerative colitis in a commercially insured US population (2008–2009)

Demographic Distribution of IBD in 2009

Age and region were significantly associated with the prevalence of CD and UC in both the pediatric and adult populations (Table 4). Differences by sex were significant only for CD.

Table 4 Independent effects of age, gender, and region on the prevalence of inflammatory bowel disease in a commercially insured US population (2008–2009)

Effect of Age

The prevalence of both CD and UC increased with increasing age, although for CD the prevalence remained fairly stable between 30 and 50 years of age (Figs. 1, 2).

Fig. 1 Age-specific prevalence of Crohn’s disease (CD) per 100,000 persons in a commercially insured US population, 2008–2009

Fig. 2 Age-specific prevalence of ulcerative colitis (UC) per 100,000 persons in a commercially insured US population, 2008–2009

Effect of Geographic Region

There was significant regional variation for both children and adults with CD and UC. For both conditions, the prevalence was lower in the South and West, as compared with the Northeast and Midwest, except for UC among children where only the Northeast had a significantly higher prevalence (Table 4).

Effect of Sex

For children, the prevalence of CD was lower in girls than in boys [odds ratio (OR) 0.88; 95 % CI 0.81–0.97]. No significant difference by sex was observed for UC (OR 0.93; 95 % CI 0.83–1.04). In adults, the prevalence of CD was higher among women (OR 1.17; 95 % CI 1.14–1.20), with no difference observed for UC (OR 1.00; 95 % CI 0.97–1.03) (Table 4).

Extrapolation to US Population

After standardizing our data by age, gender, and region according to 2009 US census data, we estimate that 1,171,000 Americans have IBD (565,000 CD and 593,000 UC). Of these, 62,000 are children (38,000 CD and 23,000 UC). Thus, approximately 5 % of all IBD cases in the USA occur in individuals of pediatric age (<20 years).

Time Trends in IBD Prevalence

Standardized prevalence estimates based on 2009 census data for each time period are shown in Table 5. The prevalence of both CD and UC increased slightly in both the pediatric and adult populations between 2004–2005 and 2008–2009 (P < 0.001 for all comparisons). Among those individuals younger than 20 years, standardized CD prevalence estimates increased from 43 to 48 per 100,000, and UC prevalence increased from 27 to 29 per 100,000 (Figs. 3, 4). Among those individuals 20 years of age and older, CD prevalence increased from 214 to 236 per 100,000 and UC prevalence increased from 235 to 248 per 100,000 (Figs. 3, 4).

Table 5 Standardized prevalence estimates of CD and UC in a commercially insured US population (2004–2009)
Fig. 3 Standardized prevalence estimates of CD per 100,000 persons in a commercially insured US population by age group, 2004–2009

Fig. 4 Standardized prevalence estimates of UC per 100,000 persons in a commercially insured US population by age group, 2004–2009


Discussion

In our source population of 12,538,475 commercially insured Americans during the period 2008–2009, the prevalence of IBD among individuals aged 20 years and older was estimated to be 241 per 100,000 for CD and 263 per 100,000 for UC. We also estimated the prevalences for the pediatric population (<20 years of age) to be 58 per 100,000 for CD and 34 per 100,000 for UC. When extrapolating to the US population in 2009 based on census data, we estimated that approximately 1,171,000 Americans have IBD (565,000 CD and 593,000 UC). Furthermore, the prevalences of both CD and UC have increased slightly, as is to be expected for conditions with no cure and low mortality.

Our overall CD and UC prevalence estimates in 2009 were somewhat higher than most of those reported in other recent US and Canadian studies [1–3, 5, 13], which may reflect the increasing prevalence of IBD over time (especially CD), given that these other studies did not use data subsequent to 2004. The higher prevalences in our study compared to previous studies could also have resulted from our exclusion of adults aged 65 years and over; the other studies included all adults 18 years and over. Our prevalence estimates are consistent with recent data from Olmsted County, MN, which report estimates of 273 per 100,000 for UC and 222 per 100,000 for CD in 2005. As in our study, data from Olmsted County also demonstrate increasing IBD prevalence in recent years. The prevalence of CD has been reported to have increased between 1991 and 2001 [5] and that of both CD and UC to have increased between 2001 and 2005 [14].

With regard to pediatric-specific prevalence data, our estimates fall between the prevalences estimated by Abramson et al. in northern California (12 and 20 per 100,000 for CD and UC, respectively) and by Loftus et al. in Olmsted County, MN (115 and 107 per 100,000 for CD and UC, respectively) [5, 6]. Our estimates are similar to those from a Canadian study in which Bernstein et al. reported pediatric-specific prevalence estimates ranging from 31 to 71 per 100,000 for CD and from 18 to 31 per 100,000 for UC [13].

In our study we observed increasing prevalence with age for children and adults with CD and UC; however, the prevalence of CD leveled off somewhat for adults over 30. These age effects are consistent with previous studies [1–3, 6]. We also observed a higher prevalence among males than females in pediatric CD, with the opposite occurring in the adult population. We did not find any significant difference in the prevalence of UC between the sexes in either adults or children. These results are similar to those reported in other US and Canadian studies [3, 4, 6, 13].

The major strengths of this study are the large sample size and the geographic diversity of the sample. These strengths allowed us to provide up-to-date estimates of the prevalence of IBD in a US population and describe differences by geography, age, sex, and time period. Because claims data lack clinical details, the results of this study are subject to some degree of diagnostic misclassification. To reduce the effects of misclassification, we used strict case definitions which required several healthcare contacts associated with IBD-specific ICD-9 codes or pharmacy claims for IBD medications. Although similar to administrative case definitions that have been previously validated in the USA and Canada [1, 2, 13], the specific case definition has not been validated in this database.

Additionally, because individuals in the USA change health plans approximately every 2 years, we divided our analysis into distinct 2-year intervals where the source population for each period was the population of plan enrollees with continuous insurance coverage. This may have led to an underestimation of prevalent cases with low healthcare utilization. However, any misclassification would likely have been stable over the time periods studied, thus not affecting the time trend analyses. Another limitation to this study is the lack of data on race/ethnicity, socioeconomic status, or precise geographic location of the study population; thus, we were not able to evaluate the impact of these factors on IBD epidemiology. A final limitation is that this study included only individuals with commercial insurance who were younger than 65 years; thus, Medicaid and self-insured patients were not included. As such, our findings may not be completely generalizable to the broader US population and may lack comparability to other studies.

In conclusion, the burden of IBD in the USA continues to increase, owing largely to the continued steady increase in prevalence for these chronic conditions. The results of our study, using insurance claims data, suggest that nearly 1.2 million Americans are living with IBD. These up-to-date epidemiological data may be used to support disease surveillance and inform policy, including anticipating the need for clinical services by this patient population, establishing research priorities, and supporting IBD advocacy efforts.