Reason

Wang et al. recently published a systematic review and meta-analysis on the impact of primary prophylaxis (PP) with granulocyte colony-stimulating factors (G-CSF) on febrile neutropenia (FN) during chemotherapy [1]. In the article, the authors state four times that “over all chemotherapy cycles, there was a numerical but statistically nonsignificant increase in the FN risk for lipegfilgrastim PP versus pegfilgrastim PP.”

In our view, this claim is not justified, for the following three major reasons:

  1. It is based on indirect and mixed treatment comparisons, although direct evidence is available from two recent head-to-head randomized controlled trials (RCTs) showing a comparable risk.

  2. A non-significant result is not conclusive and does not provide evidence that an increase exists.

  3. The indirect comparison has poor validity because it lacks homogeneity, similarity, and consistency.

Contradiction

  1. Ad 1)

    The study by Wang et al. aimed to assess the G-CSF products administered as primary prophylaxis to cancer patients receiving myelosuppressive chemotherapy. A systematic literature review identified 27 publications (1990 to 2013) representing 30 randomized controlled trials evaluating PP with filgrastim, pegfilgrastim, lenograstim, or lipegfilgrastim in adults receiving myelosuppressive chemotherapy for solid tumors or non-Hodgkin lymphoma (see Fig. 1). Direct, indirect, and mixed treatment comparisons (MTCs) were used to estimate the odds ratio (OR) of FN during cycle 1 and during all cycles of chemotherapy combined, without adjusting for differences in relative dose intensity (RDI) between study treatment arms.

    Fig. 1 Overview of data from RCTs on G-CSF PP included in the meta-analysis

    Fig. 1 shows that two head-to-head studies directly compared lipegfilgrastim with pegfilgrastim [2, 3], including a total of 306 patients.

    The results of the meta-analysis for this direct comparison are shown in Fig. 2; seven events of FN occurred in each treatment group. The odds ratio for the “combined view” is 0.98.

    Fig. 2 Meta-analysis directly comparing lipegfilgrastim with pegfilgrastim

    Table 1 shows the posterior odds ratios for FN from all cycles with and without the assumption of consistency.

    Table 1 Posterior odds ratios for febrile neutropenia from all cycles with and without the assumption of consistency for lipegfilgrastim vs. pegfilgrastim

    The OR for the direct comparison is 1.00 (indicating equal risks of FN with lipegfilgrastim and pegfilgrastim). The result of the “indirect comparison” is completely different (OR = 2.00), seemingly indicating an “increased” risk of FN with lipegfilgrastim. The MTC, which combines direct and indirect comparison, lies between the two (OR = 1.39).

    Which approach provides the best evidence?

    The Cochrane Handbook for Systematic Reviews of Interventions [4] states: “In situations when both direct and indirect comparisons are available in a review, then unless there are design flaws in the head-to-head trials, the two approaches should be considered separately and the direct comparison should take precedence as a basis of forming conclusions.”

    Leading institutions for health technology assessment (HTA), such as the Canadian Agency for Drugs and Technologies in Health (CADTH) [5], the National Institute for Health and Clinical Excellence (NICE), London [6], and the German Institute for Quality and Efficiency in Health Care (IQWiG), Cologne [7], have a clear recommendation for this situation:

    • They have a preference for data from head-to-head RCTs.

    • Evidence from mixed treatment analyses may be presented if it is considered to add information.

    • If data from head-to-head RCTs are not available, indirect treatment comparison methods may be used.

    Many methodological questions about MTCs remain to be answered [8]. In a recent review, Song et al. [9] report that significant differences between the results of indirect and direct comparisons occur more frequently than previously assumed. Because of the high risk of biased results and the numerous unresolved methodological problems, no certain proof of benefit of a medical intervention can currently be inferred from results of indirect comparisons.

    Therefore, the claim of a non-significant but numerically higher risk of FN for lipegfilgrastim compared with pegfilgrastim is not tenable. The two head-to-head studies together recorded seven FN events in each treatment group, whereas the indirect comparison yields an OR of 2.0. This discrepancy indicates that applying inconsistent indirect or mixed methods is not only unnecessary but can lead to biased risk estimates.
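To see how strongly the indirect evidence pulls the mixed estimate away from the head-to-head result, the three odds ratios quoted under Ad 1) can be related by a rough calculation. The sketch below assumes that the MTC estimate behaves approximately like a weighted average of the direct and indirect estimates on the log-odds scale; this is a simplification for illustration, not the Bayesian model used by Wang et al., and the function name `implied_direct_weight` is hypothetical.

```python
import math


def implied_direct_weight(or_direct, or_indirect, or_mixed):
    """Return the weight w on the direct estimate that satisfies
    log(or_mixed) = w * log(or_direct) + (1 - w) * log(or_indirect).

    ASSUMPTION: the mixed estimate is a log-scale weighted average
    of the direct and indirect estimates (an approximation only).
    """
    log_direct = math.log(or_direct)
    log_indirect = math.log(or_indirect)
    log_mixed = math.log(or_mixed)
    if log_direct == log_indirect:
        raise ValueError("direct and indirect estimates coincide; weight is undefined")
    return (log_indirect - log_mixed) / (log_indirect - log_direct)


# Odds ratios as quoted in the text: direct 1.00, indirect 2.00, mixed 1.39
w = implied_direct_weight(1.00, 2.00, 1.39)
print(f"implied weight on direct evidence:   {w:.2f}")
print(f"implied weight on indirect evidence: {1 - w:.2f}")
```

Under this approximation, only a little over half of the weight in the mixed estimate comes from the head-to-head trials; the remainder comes from the indirect comparison whose validity is questioned here.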

  2. Ad 2)

    Results from clinical studies may be conclusive only if hypothesis testing leads to statistically significant results. Therefore, it is impossible to claim a “numerical difference” in favor of a group without statistical significance. For example, in the head-to-head studies against pegfilgrastim, lipegfilgrastim showed “numerical but not statistically significant” superiority in several other efficacy parameters. So far, such advantages have not been claimed, due to statistical non-significance.
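Why a non-significant point estimate is inconclusive can be illustrated with a confidence interval for the head-to-head data. The following is a minimal sketch using a crude pooled 2×2 table and a Woolf (log-scale) Wald interval. The seven FN events per group are taken from the text, but the arm sizes are hypothetical (306 patients split evenly into 153 per group, which is an assumption), and crude pooling is not the meta-analytic method behind Fig. 2.

```python
import math


def odds_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Odds ratio of arm A versus arm B with a Woolf (log-scale) Wald interval.

    Returns (odds_ratio, lower_bound, upper_bound).
    """
    a, b = events_a, n_a - events_a  # arm A: events, non-events
    c, d = events_b, n_b - events_b  # arm B: events, non-events
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple formula")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# 7 FN events per group come from the text;
# 153 patients per arm is a HYPOTHETICAL even split of the 306 patients.
or_, lo, hi = odds_ratio_ci(7, 153, 7, 153)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With these assumed arm sizes, the interval runs from roughly 0.34 to 2.92. Such data are compatible with a markedly lower as well as a markedly higher risk, so a point estimate on either side of 1 would not by itself support a claim of difference.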

  3. Ad 3)

    The indirect comparison itself lacks validity: the meta-analysis of five studies comparing pegfilgrastim vs. placebo shows strong heterogeneity due to the oldest study (Vogel et al. [10]). The meta-analysis of the pegfilgrastim studies shows a total event rate of 15.1 % in the control group, whereas the lipegfilgrastim studies have a total event rate of 8 %. This indicates a marked dissimilarity in the types of patients included, which could be the reason for the inconsistency between direct and indirect comparison.
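The role of similarity can be made explicit with the standard adjusted indirect comparison (Bucher method), given here as a sketch. The symbols are introduced for illustration only, and it is an assumption that the indirect estimate links lipegfilgrastim (L) and pegfilgrastim (P) through a common comparator C such as placebo or no G-CSF; the Bayesian MTC of Wang et al. is more elaborate but rests on the same principle.

```latex
\log \mathrm{OR}^{\mathrm{ind}}_{LP}
  = \log \mathrm{OR}_{LC} - \log \mathrm{OR}_{PC},
\qquad
\operatorname{Var}\!\left(\log \mathrm{OR}^{\mathrm{ind}}_{LP}\right)
  = \operatorname{Var}\!\left(\log \mathrm{OR}_{LC}\right)
  + \operatorname{Var}\!\left(\log \mathrm{OR}_{PC}\right)
```

The subtraction cancels the comparator effect only if the control arms on both sides of the comparison are similar, which control-group event rates of 15.1 % and 8 % call into question. Because the variances add, the indirect estimate is also less precise than either of its inputs.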

Conclusion

It is not justified to claim a numerical difference in favor of a specific treatment based on non-significant study results. The result of a mixed treatment comparison with poor validity is less reliable than the evidence from the available direct comparisons. Thus, the FN risk with lipegfilgrastim PP is comparable to that with pegfilgrastim PP.