Basal insulins are commonly used in individuals with type 2 diabetes who fail to achieve and maintain satisfactory glycaemic control with non-insulin glucose-lowering agents [1]. However, the initiation of basal insulin is often delayed because of several hurdles, in particular, the high risk of hypoglycaemia [2]. This risk was particularly apparent with NPH and ultralente insulins, as a result of inappropriate peaking of insulin concentration in the middle of the night after bedtime administration [3]. The development of long-acting insulin analogues (glargine and detemir) has significantly reduced the risk of hypoglycaemic events, particularly nocturnal events [4], and improved the chance of ensuring stricter glycaemic control in the morning after overnight fasting. This was mainly due to flatter plasma insulin pharmacokinetics and pharmacodynamics, longer duration of action and improved day-to-day reproducibility of insulin effects [3]. To further improve these features, new insulin formulations have been developed: insulin degludec and insulin glargine 300 U/ml (glargine U300) [5]. Insulin degludec owes its long half-life to conjugation to a fatty acid, which allows binding to circulating albumin, and to its formulation with an excess of zinc and phenol, while it is the greater concentration of glargine U300 that ensures slower dissolution of the subcutaneous insulin depot and prolongation of its activity [6]. These new basal insulins have been shown to reduce the risk of hypoglycaemia compared with that associated with insulin glargine [7, 8], but whether degludec and glargine U300 are equivalent with respect to glycaemic control and risk of hypoglycaemia remains to be fully ascertained. The results of the CONCLUDE trial are published in this issue of Diabetologia [9]. 
In this study, a total of 1609 type 2 diabetic individuals previously treated with basal insulin and oral glucose-lowering agents (with the exclusion of secretagogues) were randomised to receive either degludec 200 U/ml (degludec U200) or glargine U300. During the maintenance period, HbA1c improved to a similar extent in the two groups with no significant difference in the rate of overall hypoglycaemia (the primary endpoint of the study), while rates of nocturnal symptomatic and severe hypoglycaemia (secondary endpoints) were lower with degludec U200 than with glargine U300. Can we CONCLUDE that insulin degludec may offer an opportunity to improve glycaemic control while exposing individuals to a lower risk of hypoglycaemia than insulin glargine U300? Answering this question requires some points of interest to be addressed.

Point 1: primary vs secondary endpoints

The primary endpoint of the CONCLUDE trial was the rate of overall symptomatic hypoglycaemic events during the maintenance period. The difference between the two treatments for this endpoint failed to reach statistical significance and, accordingly, the authors conclude that no significant harm is associated with the use of degludec U200. The rates for the secondary endpoints, which included nocturnal symptomatic hypoglycaemic events and severe hypoglycaemic events during the maintenance period, and overall symptomatic, nocturnal symptomatic and severe hypoglycaemic events during the total treatment period, were all lower with degludec than with glargine U300. The interpretation of secondary endpoints when the primary endpoint is not statistically significant is controversial. According to the current interpretation, if a hierarchical testing procedure is used and the null hypothesis for the primary endpoint is not rejected, analyses of secondary endpoints become exploratory. The guidelines of the European Medicines Agency (EMA) state ‘Secondary endpoints may provide additional clinical characterisation of treatment effects but are, by themselves, not sufficiently convincing to establish the main evidence in an application for a licence or for an additional labelling claim’. They also state ‘Secondary endpoints may be related to secondary objectives that become the basis for an additional claim, once the primary objective has been established’ [10].
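
The gatekeeping logic described above can be sketched in a few lines. This is a minimal illustration, assuming a fixed-sequence hierarchy tested at a two-sided alpha of 0.05; the function name, the endpoint labels and the p values are hypothetical and are not the CONCLUDE results or the trial's actual statistical analysis plan.

```python
def fixed_sequence(ordered_p_values, alpha=0.05):
    """Classify endpoints tested in a prespecified order (hypothetical helper).

    Endpoints are confirmatory until the first one fails to reach
    significance; every endpoint after that point is exploratory only.
    """
    results = []
    gate_open = True
    for name, p in ordered_p_values:
        if not gate_open:
            results.append((name, "exploratory"))
        elif p < alpha:
            results.append((name, "confirmatory: significant"))
        else:
            results.append((name, "confirmatory: not significant"))
            gate_open = False  # the hierarchy stops here
    return results


# Illustrative p values only (assumption, not trial data).
example = [
    ("primary", 0.17),
    ("secondary_1", 0.01),
    ("secondary_2", 0.04),
]
classified = fixed_sequence(example)
for name, status in classified:
    print(f"{name}: {status}")
```

In this sketch the secondary endpoints are labelled exploratory even though their nominal p values fall below alpha, because the primary endpoint did not open the gate.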

Interpretation of p values is currently a matter of discussion. Although a call has been made to ban statistical significance in favour of compatible effect sizes [11], this has been claimed to foster statistical confusion and generate problematic issues with data interpretation, whereas relying on predefined uniform statistical rules allows more reliable comparisons and conclusions [12]. Therefore, the only solid conclusion that can be drawn from the results of the CONCLUDE trial and its statistical analysis is that there is uncertainty about the true superiority of insulin degludec over insulin glargine U300 with respect to the risk of hypoglycaemia. Moreover, if there was no difference in overall symptomatic hypoglycaemic events but nocturnal events were less frequent with degludec, one could argue that the rate of diurnal hypoglycaemia must have been lower with glargine U300. Such speculation adds further uncertainty regarding the real advantages/disadvantages of the two basal insulin analogues.
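
The arithmetic behind the diurnal argument can be written out explicitly. The following is a sketch under two assumptions that are not stated in the trial report: every symptomatic event is classified as either nocturnal or diurnal, and the overall rates in the two arms are taken as exactly equal (an idealisation of 'no significant difference'). The symbols $R$ (event rate per unit of exposure) and the superscripts deg and gla are introduced here for illustration only.

```latex
\begin{align*}
R_{\text{overall}}^{\text{deg}} &= R_{\text{noct}}^{\text{deg}} + R_{\text{day}}^{\text{deg}},
&
R_{\text{overall}}^{\text{gla}} &= R_{\text{noct}}^{\text{gla}} + R_{\text{day}}^{\text{gla}} \\[4pt]
R_{\text{overall}}^{\text{deg}} = R_{\text{overall}}^{\text{gla}}
&\;\Longrightarrow\;
R_{\text{day}}^{\text{deg}} - R_{\text{day}}^{\text{gla}}
 = R_{\text{noct}}^{\text{gla}} - R_{\text{noct}}^{\text{deg}} \\[4pt]
R_{\text{noct}}^{\text{deg}} < R_{\text{noct}}^{\text{gla}}
&\;\Longrightarrow\;
R_{\text{day}}^{\text{deg}} > R_{\text{day}}^{\text{gla}}
\end{align*}
```

Under these assumptions, any nocturnal advantage for degludec is matched by a diurnal advantage of the same size for glargine U300.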

Point 2: how could insulin degludec reduce risk of hypoglycaemia?

Interpretation of the results of a trial may go beyond statistics and draw on other evidence that corroborates the findings. In this respect, differences in pharmacokinetics/pharmacodynamics could account for the different risks of hypoglycaemia. As mentioned, degludec and glargine U300 have flatter and more stable steady-state pharmacokinetic and pharmacodynamic profiles than earlier basal insulin analogues [5], with insulin degludec being claimed to have lower day-to-day variability in the glucose-lowering effect [13]. Of interest, these data were obtained with degludec U100, whereas degludec U200 was used in CONCLUDE. No formal comparison between this insulin strength and glargine U300 is currently available. Similar pharmacokinetic/pharmacodynamic properties for degludec U100 and U200 have been inferred from the reproducibility of the insulin serum profile after the subcutaneous injection of the two insulins [14]. True differences between degludec and glargine U300, however, are difficult to determine as studies performed using the same clamp technique have suggested that glargine U300 may provide, as compared with degludec U100, less fluctuating 24 h pharmacodynamics and a more even pharmacokinetic profile [15]. It is also worth noting that all pharmacokinetic/pharmacodynamic parameters were obtained in individuals with type 1 diabetes to avoid the interference of residual endogenous insulin secretion. Therefore, these pharmacokinetic/pharmacodynamic estimates do not necessarily apply directly to individuals with type 2 diabetes, who retain variable degrees of endogenous insulin secretion. This, along with several other potentially confounding factors (Table 1), calls for some caution in interpreting pharmacokinetic/pharmacodynamic data in individuals with type 2 diabetes.

Table 1 Factors potentially affecting assessment of insulin pharmacokinetics/pharmacodynamics

In summary, how insulin degludec might account for a lower risk of hypoglycaemia is uncertain, and carefully designed mechanistic studies may be necessary to identify differences that could justify a lower rate of hypoglycaemia, if this is in fact the case, in individuals with type 2 diabetes.

Point 3: how does CONCLUDE compare with similar studies?

CONCLUDE is the second randomised clinical trial comparing degludec and glargine U300, the first one being the BRIGHT study [16]. The two studies come to similar conclusions as far as glycaemic control is concerned. HbA1c was significantly reduced at the end of the two studies with no difference observed between the two insulins. The rate of hypoglycaemia in BRIGHT was comparable for glargine U300 and degludec during the entire study, although a lower rate was reported with glargine U300 during the titration period. Conversely, in CONCLUDE, the rate of symptomatic hypoglycaemia over the entire study was numerically lower with degludec [9]. However, there are major differences between the two trials that may render a direct comparison troublesome (Table 2). First, the two studies had different primary endpoints: glycaemic control in BRIGHT and the rate of overall symptomatic hypoglycaemic events in CONCLUDE. Second, the study populations are not comparable, as the BRIGHT study recruited insulin-naive individuals, while insulin-experienced individuals with type 2 diabetes were enrolled in CONCLUDE. Accordingly, the CONCLUDE population included individuals with a longer duration of diabetes and lower kidney function. Third, the risk of hypoglycaemia is not necessarily attributable to insulin treatment alone, as concomitant glucose-lowering agents may also contribute. With respect to this, use of sulfonylureas or glucagon-like peptide 1 receptor agonists was not allowed in CONCLUDE, while 66% of the BRIGHT participants received a sulfonylurea. All these elements should be kept in mind when interpreting (and comparing) the results of the two studies. To some extent the two trials could be seen as complementary, since one looks at the time of insulin initiation while the other explores the effect of switching from NPH, glargine U100 or detemir to newer basal insulin formulations. This also implies that the results of the two trials cannot be generalised to the entire diabetic population.

Table 2 Main characteristics of the study population in CONCLUDE and BRIGHT

Assessment of effectiveness and safety in real-world studies may provide better information on the impact of the two insulins in a broader type 2 diabetes population. In the Clinical Outcome assessmeNt of the eFfectiveness of Insulin degludec in Real-life Medical practice (CONFIRM) study [17], data on 4056 subjects were analysed. After 180 days of follow-up, degludec was associated with a larger reduction in HbA1c (−3.0 mmol/mol [−0.27%]; p = 0.03) and a greater reduction in the likelihood of hypoglycaemia (OR 0.64; p < 0.01) compared with glargine U300. In the DELIVER Naive D real-world study [18], mean decreases in HbA1c were comparable in the glargine U300 and degludec cohorts (−18.3 ± 24.4 mmol/mol vs −17.3 ± 24.0 mmol/mol [−1.67 ± 2.22% vs −1.58 ± 2.20%]; p = 0.51) with no difference in the incidence of hypoglycaemia. The likelihood of insulin discontinuation may reflect the degree of risk of hypoglycaemia and affect patients' quality of life, as well as healthcare costs. The two studies generated different results for this outcome as well and, in line with their respective hypoglycaemia findings, discontinuation was 27% less likely with degludec than with glargine U300 in the CONFIRM study, while there was no difference between the discontinuation rates (29.2% vs 32.6% for glargine U300 vs degludec, respectively; p = 0.14) in the DELIVER Naive D study. In summary, no final conclusion can be drawn with respect to differentiation between insulin degludec and glargine U300 from the results of randomised clinical trials and real-world studies.

Point 4: the impact of protocol amendment

During the trial, the CONCLUDE investigators identified some inconsistency between self-monitored blood glucose values measured by the participants and values measured in the laboratory. Moreover, the number of participants reporting blood glucose-confirmed hypoglycaemic events was unexpectedly low, while pseudo-hypoglycaemic events (>3.9 mmol/l with symptoms) were more frequent. Also, individual reports from patients and investigators showed inconsistency between readings from the patients’ own blood glucose meters and those from the meter supplied in the trial. These observations suggested potential failures of the glycaemic data collection system. Careful assessment of the blood glucose meters used in the study showed that they did not meet the accuracy requirements specified by the International Organization for Standardization (ISO) and the US Food and Drug Administration (FDA) [19]. In particular, the devices displayed falsely high results in the hypoglycaemic range, accounting for an unusual pattern of hypoglycaemic events [20]. The investigators and the sponsor should be commended for pinpointing the problem and for ensuring the continuation of the study while protecting the patients’ safety and preserving the scientific integrity of the study. This was achieved by a protocol amendment that introduced a 16 week variable maintenance period followed by a 36 week maintenance period [21].

In spite of the faultless handling of this unfortunate situation by the investigators and the trial’s sponsor, the potential implications of the changes to the study design for the subsequent conduct of the trial should be considered. Falsely elevated glucose readings were likely to trigger excessive insulin titration, with an unwanted increase in the number of hypoglycaemic events. Recurrent hypoglycaemic episodes in the titration and early maintenance period might have altered counterregulatory, symptom and cognitive function responses to subsequent hypoglycaemic events in the final maintenance period. Alternatively, the same excess of hypoglycaemic events could have made participants more cautious, leading them to aim for slightly higher blood glucose levels in the fasting state and rendering them more alert with respect to the risk of hypoglycaemia. All these elements may have interfered with the assessment of the true risk of hypoglycaemia associated with degludec and glargine U300.

Final reflections on the CONCLUDE trial

The CONCLUDE trial was an ambitious and innovative trial as its endpoints were based on the risk of hypoglycaemia rather than the glucose-lowering efficacy. However, its results are not conclusive because the statistical interpretation does not support formal superiority of insulin degludec vs insulin glargine U300. Moreover, there is uncertainty about the mechanisms through which insulin degludec may reduce the risk of hypoglycaemia. The distribution of hypoglycaemic events between night and day remains to be fully explored. Finally, the generalisability of the results of CONCLUDE (and BRIGHT) is uncertain, and this uncertainty cannot be addressed by findings from currently available real-world studies.

In spite of all this, CONCLUDE can teach us a useful clinical lesson. As already mentioned, the investigators must be commended for recognising the poor performance of the glucose meter initially used in the trial. The importance of careful surveillance of blood glucose monitoring systems for safe and reliable estimates of glucose control in clinical trials has recently been emphasised [22]. However, if this is critical in a trial, unreliable glucose monitoring systems become a matter of major concern in the real-world clinical setting. Recent publications report that the accuracy of different meters can vary widely [23], with mean absolute relative differences (MARDs), the most common metric used to assess the performance of these devices, ranging from 5.6% to 20.8% [24]. Moreover, accuracy tends to be lower within the hypoglycaemic range [24]. Such variability, coupled with low adherence to self-management of the disease [25] and the well-known relationship between frequency of self-monitoring of blood glucose and attainment of HbA1c target values [26], can easily offset the advantages of any new insulin formulation in terms of both glucose-lowering efficacy and reduction of the risk of hypoglycaemia. Therefore, the clinician must always keep in mind that the translation of any potential benefit of new insulin analogues into clinical practice cannot rely on the demonstration of superior efficacy and/or safety alone; it also requires careful integration into the patient’s self-management and education, as well as continuous surveillance of glucose meters and their proper utilisation.
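
For readers unfamiliar with the MARD metric mentioned above, the following minimal sketch shows how it is usually computed from paired meter and reference readings. The function names, the paired values and the 3.9 mmol/l cut-off used to isolate the low range are assumptions chosen for illustration; they are not data from CONCLUDE or from the cited meter evaluations.

```python
def mard(meter_values, reference_values):
    """Mean absolute relative difference (%) between meter and reference readings."""
    if len(meter_values) != len(reference_values) or not meter_values:
        raise ValueError("need two non-empty sequences of equal length")
    relative_errors = [
        abs(m - r) / r for m, r in zip(meter_values, reference_values)
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


def mard_below(meter_values, reference_values, threshold):
    """MARD restricted to pairs whose reference value is below `threshold`."""
    pairs = [
        (m, r) for m, r in zip(meter_values, reference_values) if r < threshold
    ]
    if not pairs:
        raise ValueError("no reference values below threshold")
    meters, refs = zip(*pairs)
    return mard(meters, refs)


# Hypothetical paired readings in mmol/l (assumption, not trial data).
reference = [3.0, 3.5, 6.0, 10.0]
meter = [3.6, 4.2, 6.3, 10.5]

overall = mard(meter, reference)
low_range = mard_below(meter, reference, threshold=3.9)
print(f"overall MARD: {overall:.1f}%")
print(f"MARD below 3.9 mmol/l: {low_range:.1f}%")
```

In this made-up example the meter over-reads most in the low range, so the MARD restricted to low reference values is higher than the overall figure, which is the pattern of reduced accuracy within the hypoglycaemic range described in the text.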