Introduction

A total of 366 million people worldwide—8.3% of the global population—have diabetes mellitus, and an additional 280 million persons have impaired glucose tolerance. Although the prevalence of diabetes varies greatly between ethnic groups and geographical regions, it disproportionately affects persons aged 65 years and older [1]. The public health toll of diabetes is on an upward trajectory, with its prevalence estimated to increase to more than 552 million persons worldwide by 2030 [1, 2]; furthermore, 2011 expenditures for diabetes and diabetes-related complications in the North American/Caribbean region alone are estimated at US$223 billion [1]. Trials assessing interventions to prevent and treat diabetes and its complications are needed, but it is currently unclear whether the numerous clinical trials active in this therapeutic arena are capable of addressing deficiencies in our understanding of diabetes care.

ClinicalTrials.gov, a web-based registry maintained by the National Library of Medicine (NLM) at the US National Institutes of Health (NIH), was created to give the public and healthcare providers easy access to information about clinical trials. In 2007, its scope was expanded to include the mandatory registration of all phase 2–4 interventional trials conducted under US regulatory auspices that have evaluated a drug, a biological therapy or a medical device [3]. Registration of trials with ClinicalTrials.gov or another comparable registry is also a prerequisite for publication in many peer-reviewed journals [4].

In order to evaluate the current state of clinical trials in diabetes, we conducted a descriptive analysis of diabetes-related trials registered with ClinicalTrials.gov from 2007 to 2010 and compared this information with the current clinical picture of diabetes within the USA and worldwide to determine whether the current scope of trials permits us to effectively address disease prevention, management and safety of therapy in a diverse population. In particular, we examined whether recently registered trials were likely to enrol patients from high-risk or under-studied age groups, or from regions marked by a high disease prevalence. We also sought to define the percentage of trials focused on diabetes prevention as opposed to treatment, and the percentage of trials studying interventions other than drugs. Agencies including the European Medicines Agency (EMA) and the US Food and Drug Administration (FDA) have emphasised the need to better assess relative or comparative therapeutic effectiveness [5]; thus, we planned to evaluate the proportion of trials using active vs placebo comparators and the number of interventional arms in the trials. We also planned to describe the number of trials with outcomes that included clinically significant cardiovascular complications, and to identify trials focused upon areas of emerging interest, such as malignancies, bone metabolism/fractures or pancreatitis.

Methods

The methods used by ClinicalTrials.gov to register clinical trials have been described in detail elsewhere [6]. Briefly, trial sponsors and investigators from around the world can enter trial information through a web-based data entry system. The sample we examine in the present study includes trials registered to comply with statutory obligations, as well as those registered voluntarily to meet publication requirements or for other reasons.

Creation of the ClinicalTrials.gov dataset

A dataset comprising 96,346 clinical studies registered in ClinicalTrials.gov was downloaded in XML format on 27 September 2010. This date was significant because it fell 3 years after the enactment of the Food and Drug Administration Amendments Act and the corresponding legal obligation for sponsors to register applicable interventional trials. We next designed and implemented a relational database to facilitate the aggregate analysis of data from ClinicalTrials.gov, as described in detail elsewhere [7].

Creation of the diabetes study dataset

Our analysis was restricted to studies categorised as being of ‘interventional’ study type that were registered with ClinicalTrials.gov from October 2007 to September 2010. The diabetes study dataset was created by using disease condition terms (both Medical Subject Heading [MeSH] and non-MeSH) provided by the trial data submitters, as well as additional MeSH condition terms generated by an NLM algorithm. A subset of the 2010 MeSH thesaurus [8] and a list of non-MeSH disease condition terms provided by data submitters that appeared in at least five studies in the analysis dataset were reviewed and annotated by clinical specialists in endocrinology, metabolism and nutrition at Duke University Medical Center (W. C. Lakey, J. B. Green, B. C. Batch and K. Barnard) and the University of Oxford (M. A. Bethel).

As a first step, terms were annotated according to their relevance to the endocrinology domain. This domain is expansive and includes terms related to gland- and hormone-related diseases, as well as nutritional and metabolic conditions. Therefore, as a second step, terms selected for inclusion in the endocrinology domain were reviewed and annotated with respect to their relevance to diabetes and/or diabetes-related complications. In order to identify trials enrolling patients with prediabetes, terms such as ‘impaired fasting glucose’, ‘impaired glucose tolerance’ and ‘hyperglycaemia’ were included.

A total of 9,031 unique MeSH terms and 1,220 unique non-MeSH terms were reviewed. From this review, 1,031 unique MeSH terms and 146 unique non-MeSH terms were relevant to the endocrinology domain and used in the database search. A total of 8,302 studies were identified that had at least one condition term (MeSH or non-MeSH) relevant to endocrinology. In these studies, 1,353 unique MeSH terms occurred among the submitted conditions or NLM-generated MeSH terms, of which 19 were relevant to diabetes; of the 146 non-MeSH terms, 36 were relevant to diabetes (see electronic supplementary material [ESM] Table 1). Using the diabetes annotation, 2,484 studies were identified that had at least one condition term or condition MeSH term relevant to diabetes. Figure 1 displays a flow diagram showing the steps involved in creating the dataset.

Fig. 1 Flow diagram illustrating the creation of the diabetes study dataset
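The two-step selection described above (endocrinology domain first, then diabetes) amounts to keeping every study that carries at least one annotated condition term. The following is a minimal sketch of that filter, not the relational database implementation used in the study. The function name `select_studies`, the study identifiers and the toy term lists are hypothetical, and case-insensitive matching is an assumption.

```python
from typing import Dict, Iterable, Set


def select_studies(study_terms: Dict[str, Set[str]],
                   relevant_terms: Iterable[str]) -> Set[str]:
    """Return IDs of studies with at least one condition term in relevant_terms.

    Matching is case-insensitive (an assumption made for this sketch).
    """
    relevant = {term.casefold() for term in relevant_terms}
    return {
        study_id
        for study_id, terms in study_terms.items()
        if any(term.casefold() in relevant for term in terms)
    }


# Hypothetical toy records: study ID -> submitted or NLM-generated condition terms.
studies = {
    "STUDY-A": {"Diabetes Mellitus, Type 2", "Hypertension"},
    "STUDY-B": {"Hypothyroidism"},
    "STUDY-C": {"impaired glucose tolerance"},
    "STUDY-D": {"Asthma"},
}

# Hypothetical annotations: step 1 (endocrinology domain), step 2 (diabetes).
endocrine_terms = {"Diabetes Mellitus, Type 2", "Hypothyroidism",
                   "Impaired Glucose Tolerance"}
diabetes_terms = {"Diabetes Mellitus, Type 2", "Impaired Glucose Tolerance"}

endocrine_studies = select_studies(studies, endocrine_terms)
diabetes_studies = select_studies(studies, diabetes_terms)
```

In this toy example the diabetes set is a subset of the endocrinology set, mirroring the narrowing from 8,302 endocrinology studies to 2,484 diabetes studies.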

Primary outcomes

An investigator (J. B. Green) manually reviewed the listed primary trial outcomes because these data were entered as unstandardised free text. A review of all 2,500 free-text descriptions of primary outcomes was performed to identify outcomes of interest, including mortality or clinically significant cardiovascular complications such as myocardial infarction or stroke. Manual reviews were also performed for outcomes related to malignancies, bone metabolism/fractures or pancreatitis. This was followed by a text search for relevant keywords (J. B. Green, K. Chiswell and W. C. Lakey) to ensure that all listed outcomes of interest for the 2,484 diabetes studies had been identified (ESM Table 2).
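The confirmatory keyword search can be illustrated with a simple substring match over the free-text outcome descriptions. This is a sketch only: the keyword lists below are hypothetical stand-ins (the keywords actually used are given in ESM Table 2), and the names `OUTCOME_KEYWORDS` and `flag_outcome` are introduced here for illustration.

```python
from typing import Dict, List, Set

# Hypothetical keyword lists; the real search terms are listed in ESM Table 2.
OUTCOME_KEYWORDS: Dict[str, List[str]] = {
    "cardiovascular": ["mortality", "death", "myocardial infarction", "stroke"],
    "bone": ["fracture", "bone mineral density"],
    "malignancy": ["cancer", "malignan"],
    "pancreatitis": ["pancreatitis"],
}


def flag_outcome(description: str) -> Set[str]:
    """Return every outcome category whose keywords appear in the description."""
    lowered = description.lower()
    return {
        category
        for category, keywords in OUTCOME_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }
```

A description can be flagged in more than one category, so a single outcome may contribute to several totals. Substring matching of this kind over-selects (for example, 'death' in an unrelated context), which is why it complements rather than replaces manual review.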

Derived funding source

The NLM defines the ‘lead sponsor’ for a trial as the organisation primarily responsible for study implementation and data analysis, and defines ‘collaborators’ as those who provide other meaningful trial-related support [9]. Agency names in these data elements are classified as ‘industry’, ‘NIH’, ‘US federal (excluding NIH)’ or ‘other’. We derived the probable funding source from the ‘lead sponsor’ and ‘collaborator’ fields using the following algorithm: If the lead sponsor was from industry, or the NIH was neither a lead sponsor nor a collaborator and at least one collaborator was from industry, then the study was categorised as ‘industry funded’. If the lead sponsor was not from industry, and the NIH was either a lead sponsor or a collaborator, then the study was categorised as ‘NIH funded’. Otherwise, if the lead sponsor and collaborator fields were non-missing, the study was considered to be funded by ‘other’.
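The derivation rule above can be written as a short decision function. This is a minimal sketch under stated assumptions: the function name `derive_funding` and the lowercase class label `"industry"` are illustrative, an empty collaborator list is treated as a valid (non-missing) value, and a missing lead sponsor yields no derived category.

```python
from typing import Optional, Sequence


def derive_funding(lead: Optional[str],
                   collaborators: Sequence[str]) -> Optional[str]:
    """Derive the probable funding source from agency classes.

    `lead` and each collaborator are one of 'industry', 'NIH',
    'US federal' or 'other'. Returns None when the lead sponsor is
    missing (an assumption about how missing data are handled).
    """
    if lead is None:
        return None

    nih_involved = lead == "NIH" or "NIH" in collaborators

    # Industry lead sponsor, or industry collaborator with no NIH involvement.
    if lead == "industry" or (not nih_involved and "industry" in collaborators):
        return "industry"
    # Lead sponsor is not industry here, so any NIH involvement means NIH funded.
    if nih_involved:
        return "NIH"
    return "other"
```

Note the asymmetry in the rule: an industry lead sponsor takes precedence over an NIH collaborator, whereas an industry collaborator does not take precedence over NIH involvement.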

Statistical methods

Frequencies and percentages are provided for categorical trial characteristics. Unless otherwise indicated, missing values were excluded from the denominators before calculating the percentages. Means, medians and 25th and 75th percentiles are reported for continuous characteristics. For studies reporting an interventional model of ‘single group’ and the number of arms as ‘1’, the value of allocation (if missing) was assigned as ‘non-randomised’ and the value of blinding (if missing) was assigned as ‘open’.
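The two data-handling rules in this section (assigning missing allocation and blinding for single-arm studies, and excluding missing values from denominators) can be sketched as follows. The field names (`model`, `arms`, `allocation`, `blinding`) and function names are hypothetical and do not reflect the actual database schema.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional


def impute_design(study: dict) -> dict:
    """Fill missing allocation/blinding for single-group, one-arm studies."""
    out = dict(study)  # copy so the input record is left unchanged
    if out.get("model") == "single group" and out.get("arms") == 1:
        if out.get("allocation") is None:
            out["allocation"] = "non-randomised"
        if out.get("blinding") is None:
            out["blinding"] = "open"
    return out


def percentages(values: Iterable[Optional[Hashable]]) -> Dict[Hashable, float]:
    """Percentage per category, with missing values (None) excluded from the denominator."""
    observed = [value for value in values if value is not None]
    if not observed:
        return {}
    counts = Counter(observed)
    return {category: round(100 * n / len(observed), 1)
            for category, n in counts.items()}
```

Values that are already present are never overwritten; only missing entries are assigned, and only when both conditions on the study design are met.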

Results

The overall characteristics of the diabetes-related trials are shown in Table 1. The 2,484 trials identified accounted for 6.0% of the 40,970 interventional trials overall and 6.4% of the 38,985 trials with a disease area classification. Of trials with an available start date, 81.2% began in 2007 or later. As of 27 September 2010, the largest proportions of diabetes trials were listed as ‘completed’ (36.3%) or ‘recruiting’ (34.1%), followed by ‘active but not recruiting’ (14.2%) and ‘not yet recruiting’ (9%). Of the trials with completion year data available, the majority were to have completed follow-up for their primary endpoint subsequent to 2008 and 12.9% were scheduled to complete follow-up in 2012 or later. The majority of these completion dates were anticipated (61.3%) rather than actual (38.7%). The mean and median times to primary trial completion (where available) were 1.8 and 1.4 years, respectively.

Table 1 Characteristics of diabetes-related interventional trials registered with ClinicalTrials.gov, 2007–2010

Diabetes trials were distributed relatively evenly across the early and late phases of development: 15.6% in phase 1 or phase 1/phase 2, 18.3% in phase 2 or phase 2/phase 3, 17.4% in phase 3, and 16.9% in phase 4. The largest percentage of trials overall (31.4%) had study phase listed as ‘not applicable’; however, smaller percentages of trials involving drug (13%) or biological/vaccine interventions (14%) did not identify a study phase. Among the 2,449 trials listing enrolment, 91.1% had an actual or anticipated enrolment of ≤500 participants; 58.6% were designed to enrol ≤100 participants.

Of the 2,327 trials with a primary purpose listed, the majority had a therapeutic purpose (74.8%), followed by prevention (10%) and basic science (7.2%). Smaller percentages were focused upon diagnosis, supportive care, screening or health services research. Most trials involved drug interventions (63.1%), followed by behavioural interventions (11.7%), or had interventions classified as ‘other’ (11.7%). Smaller numbers included device-related (7.5%) or procedural (5.2%) interventions.

Of the trials for which data were available, 65.1% were parallel-design while 18% were described as single-arm; 49.9% were open-label and 38.9% were double-blind; and 82.5% of trials had a stated randomised allocation to therapy. Just over half (54.7%) were described as having two treatment arms; 19.1% were single-arm and 13.4% had three arms. The discrepancy between the two percentages reported for single-arm trials (18% and 19.1%) arises because they are derived from responses to two separate questions. Among the trials reporting two or more study arms and with arm type available, 51.5% reported use of an active comparator arm while 37.2% reported the use of a placebo comparator arm. Trials of drug interventions were most likely to include two treatment arms (54.0%), followed by one (15.8%), three (14.7%), four (8.7%) or five or more (6.8%) (data not shown).

Within the diabetes-related trials dataset, 92% accepted both male and female participants; much smaller percentages excluded women (4.8%) or men (3.3%). A total of 3.7% of trials limited enrolment to participants aged ≤18 years, while 89.6% specifically excluded patients <18 years. Patients >65 years of age were excluded from 30.8% of studies; those >75 years were excluded from 54.9%. Very few trials (0.6%) selectively enrolled patients ≥65 years; only one was designed to enrol patients ≥75 years. Data regarding planned or actual enrolment by race or ethnicity were not available.

A search for MeSH or non-MeSH terms specific to type 1 diabetes yielded 305 studies, or 12.3% of the total diabetes trials. However, this classification was not verified by manual review of all the trial descriptions, and thus may not accurately reflect the percentage of trials dedicated to study of this form of diabetes.

The mean and median numbers of listed trial primary outcome measures were 1.2 and 1, respectively. Among trials that listed secondary outcome measures, the mean and median numbers of secondary outcome measures were 3 and 1, respectively. Of trials reporting study classification (the type of primary outcome that the protocol was designed to evaluate), the largest percentage had a safety/efficacy endpoint (45.6%) and 37.4% had an efficacy endpoint. A total of 18.6% of studies had at least one primary outcome measuring safety, and 27.9% had at least one secondary safety outcome. A manual review of 2,500 free-text descriptions of trial outcomes yielded 35 studies (1.4%) with at least one primary outcome related to mortality or clinically significant cardiovascular endpoints such as myocardial infarction or stroke, seven studies with at least one primary outcome related to bone metabolism, one study with a primary outcome related to malignancy, and none related to pancreatitis (ESM Table 2). One trial outcome description included both death and malignancy and was counted in both the ‘cardiovascular’ and ‘malignancy’ totals.

Lead sponsorship from industry was identified in 42.8% of trials; 51.4% had an industrial source as the lead sponsor or collaborator. Much smaller percentages were identified as having NIH (3.0%) or US federal lead sponsorship (1.5%), although the NIH was identified as a lead sponsor or collaborator in 7.6% of trials. The largest percentage of lead sponsorship (52.8%) was identified as being from other sources. Manual review indicated that many of these trials listed funding from universities or academic institutions. The derived funding source for the majority of the studies was industry (50.9%), followed by other (41.6%) and the NIH (7.5%). Among the 2,237 trials for which data regarding sponsorship and number of centres were available, the largest percentage (37.7%) comprised single-centre trials without funding from the NIH or industry. The next largest identifiable percentages were multicentre (27.4%) or single-centre (22.0%) trials with funding determined to be from industry. We found that the NIH provided funding for 6.1% of studies that were single-centre and 1.7% of studies that were multicentre trials.

Of the 2,237 studies that provided such information, 1,473 (65.8%) took place at a single location. For studies with more than one location, the mean number of trial facilities was 34.6. Of the 764 multisite trials, 25% had two or three sites, 50% had ≤11 sites and 75% had ≤44 sites. The top 10% of multisite trials (73 trials in total) reported having more than 89 sites. A total of 40.5% of trials had facilities only in the USA, 49.7% had facilities only outside the USA and 9.8% were conducted in the USA and other regions. The majority of studies (56.1%) had at least one site located in North America, and 33.5% had at least one site in Europe. Much smaller percentages of studies were located in Asian regions (Eastern Asia, 13.5%; South Asia, 6.3%; North Asia, 5.1%; South-East Asia, 3.7%), South or Central America (7% and 4.1%, respectively), the Middle East (5.4%), Africa (3.5%) or Pacifica (3.5%). The distribution of trials by country is shown in Fig. 2.

Fig. 2 Distribution of diabetes studies by country

Discussion

A review of the data available from diabetes-related trials registered in ClinicalTrials.gov from 2007 to 2010 provides an important window on the current clinical research enterprise in this therapeutic area. Our descriptive analysis found that the majority of registered trials involve drug therapies rather than preventive or non-drug interventions. Trials appear to include relatively small numbers of patients, are primarily conducted at single sites and are of fairly short duration. Trials often exclude children and elderly participants, their global distribution does not correlate with regional disease prevalence, and only small numbers of trials have focused upon mortality or clinically significant cardiovascular complications.

The International Diabetes Federation (IDF) and the ADA emphasise diabetes prevention as a focus of future research [10, 11]. Previous trials have demonstrated that various lifestyle and pharmacological interventions may delay the onset of diabetes in high-risk persons [12–15]; however, additional study is needed to enhance the implementation of preventive strategies in practice and assess the utility of novel interventions. Despite convincing evidence that intensive glycaemic control minimises the onset and progression of complications [16, 17], a significant percentage of persons with diabetes have not achieved optimal glycaemic control [18, 19]. Further study regarding the translation of effective educational, preventive and therapeutic interventions in the community setting is also encouraged by the ADA [11]. We have found that most diabetes-related studies in ClinicalTrials.gov focus on treatment (usually drug-related rather than behavioural), while only small percentages are primarily concerned with prevention, health services research, supportive care, diagnosis or screening. Although the ideal proportion of trials focused on prevention has not been established, the current trials portfolio, comprising studies with smaller sample sizes and shorter durations, appears to be inadequate for expanding and refining preventive efforts or translating effective care strategies into the community setting.

The IDF, ADA and others have emphasised the need for trials designed to compare the effects of therapies in diverse, high-risk and representative populations [10, 11, 20]. The prevalence of diabetes varies by global region and country [1] and by race/ethnicity [21]. In addition, rates of complications including diabetic retinopathy, lower extremity amputation and end-stage renal disease vary among ethnic groups [22–25]. To achieve the greatest impact upon clinical care, trials should enrol patients representative of populations disproportionately affected by diabetes, including high-risk patients and those ≥65 years. A better understanding of responses to interventions among diverse individuals and groups may inform individualised treatments of greater effectiveness and tolerability [26, 27].

Race and ethnicity of trial populations are not required fields for registering studies with ClinicalTrials.gov; therefore, there is no readily available information for this category within the dataset. However, the location of trials within countries provides insight into the relationship between clinical trial activities and highly affected populations. Registration with ClinicalTrials.gov is not required for studies taking place outside US jurisdiction; nevertheless, approximately half of the trials registered in ClinicalTrials.gov did not have any US sites. Thus, although our dataset affords an incomplete view of trials activity worldwide, it is still likely that it provides a reasonably accurate global perspective.

The IDF list of the ten locations most affected by diabetes includes multiple Middle Eastern countries in which the prevalence of diabetes among adults is approximately 20% [1]. However, our analysis suggests that this region is minimally involved in diabetes-related trials. A comparison of trial activities in countries with the highest prevalence of diabetes among adults reveals over 500 (1,126) trials in the USA; however, we also noted state- and regional-level exceptions to this, as detailed in the ESM text and ESM Fig. 1. China, India and Mexico participated in 101–250 trials each; however, the Russian Federation (12.6 million persons affected) and Brazil (12.4 million affected) are involved in only 51–100 registered trials despite heavy disease burdens [1].

Trials registered in ClinicalTrials.gov are predominantly conducted in North America, Western Europe and a small number of countries in Asia. Notably, most of Africa is either uninvolved or minimally involved in registered studies. Thus, current trials appear unlikely to provide significant insight into the management of patients from many highly affected or under-studied areas.

The ClinicalTrials.gov database permits a review of the ages of participants sought for (or excluded from) trials. Although those aged 40–59 years constitute the largest number of persons affected by diabetes worldwide, older persons are at greatest risk of the disease. For example, 26.9% of US residents ≥65 years were estimated to have diabetes in 2010 [20]. Our analysis found that persons >65 years were excluded from 30.8% of trials, and that the majority of trials excluded those aged >75 years. Thus, the current clinical research portfolio may not allow us to robustly address issues in older persons with diabetes.

Less than 4% of registered trials targeted the enrolment of participants ≤18 years. This may be appropriate given the number of children affected by diabetes; however, the estimated 3% annual increase in incidence of type 1 diabetes may warrant greater representation [1]. Furthermore, the increase in type 2 diabetes among adolescents, particularly noticeable in wealthier nations, is of considerable concern. It is unclear whether findings obtained from adults with diabetes are readily translatable to paediatric/adolescent populations. The inclusion of younger participants in diabetes trials is essential to ensure safe and effective clinical interventions for these groups, particularly given their risk of developing disease complications early in life. Current clinical trials do not appear to be appropriately positioned to address issues related to disease prevention or management in the young.

Organisations including the US Institute of Medicine have encouraged comparative effectiveness trials to comprehensively assess the benefits and risks of multiple therapeutic options for diabetes and other diseases [27, 28]. Trials comparing the safety, effectiveness and durability of the many glucose-lowering therapies now available will create a reliable evidence base for clinical care guidelines.

The majority of currently registered diabetes trials have a parallel intervention model; however, most have two treatment arms (54.7%) or one treatment arm (19.1%), leaving only a small percentage with three or more arms. Among trials with two or more arms, 51.5% include an active comparator. Furthermore, the relatively short duration typical of these trials may compromise our ability to ascertain the durability of therapeutic interventions or the effects of interventions upon long-term complications. The single-site nature and limited enrolment of most trials are likely to limit the conclusions drawn from their results. Therefore, the current set of trials may not contribute to meaningful changes in recommendations for care.

Diabetes care organisations worldwide have emphasised a need to minimise diabetes-related complications. Groups including the ADA have strategically prioritised investigations that will enhance our understanding of these complications, including cardiovascular disease [11]. The relationship between glycaemic therapeutic targets, hypoglycaemia and cardiovascular complications remains inadequately understood and contentious, despite multiple recent outcomes studies [29–32]. In addition, scrutiny of the cardiovascular effects of individual glucose-lowering therapies has increased following concerns about rosiglitazone and other drugs in development [33], resulting in new FDA and EMA guidelines for evaluating the cardiovascular safety of new glucose-lowering agents [34, 35]. Of the 2,439 trials in the dataset with available outcome descriptions, only 35 show a primary outcome related to mortality or clinically significant cardiovascular endpoints (e.g. myocardial infarction or stroke). Only small numbers of trials reported primary outcomes related to bone metabolism, malignancy or pancreatitis despite significant clinical interest in the relationship between these issues and glucose-lowering therapy or diabetes itself [36–39].

There are limitations to our ability to draw firm conclusions from the data available, many of which have been previously described [40]. Although ClinicalTrials.gov encompasses a substantial proportion of clinical trials and an estimated 80% of studies in the WHO portal, it does not include all studies performed worldwide [40]. Incorporating non-duplicate trials registered with other international databases would have provided a more complete global perspective; however, such an undertaking would require relatively intensive curation efforts to ensure that duplicate studies were removed and categories appropriately matched, and thus lies beyond the scope of the present work.

Requirements and methods for collecting information about trials have changed over time, and data completeness and quality are variable—an unsurprising finding, as the data collection was not initially designed to support aggregate analysis. Missing data, classification of data as ‘other’ in many circumstances, and non-standardised free-text descriptions also complicated our analysis, particularly when reviewing data related to funding sources and trial outcomes. Funding sources are also classified in a manner most relevant to US-based trials. We were able to identify the presence of a trial within countries and specific US states; however, the number of unique sites per country or state could not be determined, thus limiting our capacity to assess the proportion of trial activity in relation to the population density of interest within a given area. In addition, information about facilities that had not yet been activated when the database was downloaded or had been removed from the current study record is excluded from this investigation. Future refinements to data collection may permit a more complete and sophisticated analysis of trials characteristics.

With respect to the data analysis, the non-hierarchical MeSH classifications may categorise a condition in multiple locations, potentially leading to false positives upon querying for a specific condition. Endocrinology experts at two institutions annotated the database; however, this annotation has not yet been externally validated. In this initial overview, we did not examine whether various characteristics have changed over time and are thus unable to discern meaningful trends in the design or implementation of clinical trials.

In summary, this descriptive analysis of data from the ClinicalTrials.gov registry provides a broad overview of interventional clinical studies related to diabetes. Although many trials will provide valuable information upon completion, our review suggests that the current portfolio does not adequately address disease prevention, management or therapeutic safety. This information may be meaningful in the allocation of future research activities and resources.