Introduction

Heart failure (HF), a clinical syndrome of impaired ventricular filling or ejection caused by abnormalities of cardiac structure and function, affects about 26 million people worldwide [1]. The prevalence of HF is 1–2% of the adult population in developed countries [2, 3], and China has about 4.5 million patients with HF [4]. Although the treatment of chronic heart failure (CHF) has made great progress, mortality and rehospitalization remain high: only half of patients survive for more than 5 years [5, 6]. The mortality of hospitalized patients with CHF was 4.1% according to the China-HF registry study [7]. From 2000 to 2010, cardiovascular hospitalization for CHF did not decrease [8].

Coronary heart disease (CHD) is the leading cause of CHF among all primary diseases [9, 10], so the prevention and treatment of CHF caused by CHD is an important part of cardiovascular health decision-making. Traditional Chinese medicine (TCM) has been widely used to treat all kinds of CHF and can effectively reduce levels of N-terminal pro-brain natriuretic peptide (NT-proBNP) [11]. However, evidence from TCM clinical trials has been neither universally acknowledged in the international medical system nor included in clinical practice guidelines. The available randomized controlled trials (RCTs) are suboptimal, with diverse outcome measures, many of which showed only improvement of symptoms. To understand the status quo of outcome measures in RCTs of TCM for CHF caused by CHD, we conducted a systematic review to evaluate the outcome measures, identify relevant problems and propose solutions.

Methods

Eligibility criteria

We included RCTs meeting the following criteria: (1) performed in CHF patients with CHD as the primary disease; (2) assessing TCM treatment compared with a control group (without restriction). Exclusion criteria were: (1) duplicate publication; (2) studies without full text.

Information sources

Seven electronic databases were searched from inception to October 8, 2018: Embase, PubMed, the Cochrane Central Register of Controlled Trials (CENTRAL), China National Knowledge Infrastructure (CNKI), the Chinese Scientific Journal Database (VIP), Wanfang and the Chinese Biomedicine Literature Database (CBM). Bibliographies of selected articles were also checked for additional trials not detected in the initial searches.

We also searched the Cochrane Database of Systematic Reviews (CDSR) to collect Cochrane systematic reviews (CSRs) of CHF for comparative analysis.

Search

We conducted a systematic search. “Medicine, Chinese Traditional [MeSH]”, “Heart Failure [MeSH]” and “Randomized Controlled Trial [Publication Type]” were applied as search terms, and free-text words were used according to the characteristics of each database. The detailed search strategy is shown in Additional file 1.

Study selection

Two reviewers (JY H and RJ Q) independently selected the eligible studies, first by title and abstract and then by full text. Any disagreements during selection were discussed; if discussion could not resolve them, we consulted a third author (M L) and reached consensus.

Data collection process and data items

Reviewers JY H and CY L independently extracted information from the studies using a standardized data extraction form covering the first author, year of publication, disease type, sample size, interventions in the treatment and control groups, and outcome measures.

Risk of bias in individual studies

We used the Cochrane Handbook for Systematic Reviews of Interventions version 5.1.0 [12] to assess the risk of bias of the included RCTs. Two reviewers (JY H and RJ Q) independently assessed the risk of bias, and any disagreements were resolved through discussion with a third author (HC S).

Summary measures

We calculated the reporting rate of each outcome measure in the included RCTs and compared it with that in the CSRs of CHF. Because the aim was to analyze outcome measures, we neither synthesized trial data nor conducted a meta-analysis.
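The reporting rate used here is a simple proportion: the number of trials reporting an outcome divided by the number of trials assessed. The Python sketch below illustrates that arithmetic only. The function name `reporting_rate` and the two-decimal rounding are assumptions made for illustration, not the authors' actual tooling.

```python
def reporting_rate(n_reporting: int, n_total: int) -> float:
    """Return the percentage of trials that report an outcome.

    Hypothetical helper: the name and the two-decimal rounding are
    illustrative assumptions, not taken from the review's methods.
    """
    if n_total <= 0:
        raise ValueError("n_total must be a positive number of trials")
    if not 0 <= n_reporting <= n_total:
        raise ValueError("n_reporting must lie between 0 and n_total")
    return round(100 * n_reporting / n_total, 2)


# Counts quoted in the Results of this review (out of 31 RCTs or 16 CSRs).
mortality_or_rehospitalization = reporting_rate(4, 31)
adverse_drug_reactions = reporting_rate(17, 31)
csr_all_cause_mortality = reporting_rate(16, 16)
```

With the counts quoted in the Results, the helper gives 12.9 for 4 of 31 trials and 54.84 for 17 of 31 trials, matching the percentages reported there.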

Additional analyses

Two authors (JY H, RJ Q) independently evaluated the reporting quality of outcome measures in the included RCTs based on the Management of Otitis Media with Effusion in Cleft Palate (MOMENT) criteria [13], considering the following 6 items:

  1) Is the primary outcome clearly stated?
  2) Is the primary outcome clearly defined so that another researcher would be able to reproduce its measurement? Where appropriate, this should include clear descriptions of time points, the person measuring the outcome, how the outcome was measured (for example, tools and methods used) and where the outcome was measured.
  3) Are the secondary outcomes clearly stated?
  4) Are the secondary outcomes clearly defined?
  5) Do the authors explain the use of the outcomes they have selected?
  6) Are methods used to enhance the quality of outcome measurement (for example, repeated measurement, training) if appropriate?
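The six-item checklist above can be tallied per item across trials. The Python sketch below shows one possible way to do so, assuming each trial is recorded as six yes/no answers. The function name, the data layout and the four example trials are hypothetical placeholders and are not data from this review.

```python
from typing import Dict, List

N_ITEMS = 6  # number of checklist items listed above


def item_reporting_rates(assessments: Dict[str, List[bool]]) -> List[float]:
    """Return, for each checklist item, the percentage of trials meeting it.

    `assessments` maps a trial label to six booleans, one per item.
    This layout is an illustrative assumption, not the authors' form.
    """
    if not assessments:
        raise ValueError("at least one trial assessment is required")
    for trial, items in assessments.items():
        if len(items) != N_ITEMS:
            raise ValueError(f"{trial} must have exactly {N_ITEMS} answers")
    n_trials = len(assessments)
    return [
        round(100 * sum(items[i] for items in assessments.values()) / n_trials, 2)
        for i in range(N_ITEMS)
    ]


# Hypothetical placeholder assessments (NOT the review's data).
example = {
    "trial_A": [True, True, True, True, False, True],
    "trial_B": [False, True, False, True, True, False],
    "trial_C": [False, True, False, True, False, False],
    "trial_D": [False, True, False, True, False, False],
}
rates = item_reporting_rates(example)
```

Keeping one boolean per item per trial preserves the per-study detail while still allowing the per-item rates to be derived from the same record.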

Results

Study selection

We identified 1910 records from the seven databases. We first excluded 171 duplicate records and 1023 records on the basis of titles and abstracts. Then 679 full-text articles were assessed for eligibility, and 648 were excluded for the reasons shown in Fig. 1. Finally, we included and analyzed 31 RCTs [14–44] in the review. We also screened 16 CSRs of CHF [45–60].

Fig. 1 Flowchart of study selection

Study characteristics

All 31 included studies were conducted in China; 29 were published in Chinese and two in English [39, 40]. The main information of each study is shown in Table 1 and the information of the 16 CSRs of CHF in Table 2.

Table 1 Information of included studies (n = 31)
Table 2 Information of CSRs of CHF (n = 16)

Risk of bias within studies

Among the 31 RCTs, only seven studies [21, 28, 30, 39, 40, 43, 44] used a random number table or statistical software to generate the random sequence; the others mentioned “random” without describing the specific method. Two studies [39, 40] described allocation concealment and blinding methods. Three studies [30, 39, 40] reported dropouts and withdrawals. Overall, the risk of bias within the included RCTs was classified as high (see Fig. 2).

Fig. 2 Risk of bias within studies

Results of individual studies

Reporting of outcome measures

Outcome measures differed across the included RCTs. Mortality and rehospitalization, the end points of CHF, were reported by only 4 studies (4/31, 12.90%); the other studies all reported surrogate outcomes, including efficacy of cardiac function (83.87%), left ventricular ejection fraction (LVEF) (54.84%), 6-minute walk distance (6MWD) (45.16%) and brain natriuretic peptide (BNP) (16.13%). No study reported related cardiovascular events. Seventeen studies (17/31, 54.84%) reported adverse drug reactions (ADRs), while 14 studies (14/31, 45.16%) did not report any safety measures.

By contrast, all of the CSRs of CHF reported all-cause mortality (16/16, 100%); they focused on end points and safety measures and analyzed all-cause and cause-specific mortality or hospitalization separately. The overall reporting of outcome measures is shown in Table 3 and Fig. 3.

Table 3 Overall reporting of outcome measures
Fig. 3 Outcomes reporting rate

Additional analysis

Reporting quality of outcome measures

All 31 RCTs reported specific definitions of outcomes, but only two [39, 40] clearly stated the primary and secondary outcome measures and were therefore considered to have high reporting quality of outcomes. Eight studies [14, 17, 29, 31, 32, 34, 35, 41] explained the use of the outcomes they had reported, and five [19, 21, 23, 28, 40] adopted methods to enhance the quality of outcome measurement, including training the investigators and assigning designated staff to measure the outcomes. Tables 4 and 5 show the assessment of outcome reporting quality [13].

Table 4 Reporting status of each item for the assessment of outcome reporting quality
Table 5 Reporting rate of the items for assessment of outcome reporting quality

Discussion

This systematic review analyzed outcome measures in RCTs assessing the efficacy of TCM in treating CHF caused by CHD. We included 31 trials meeting the eligibility criteria and extracted outcome measures from these studies. The outcome measures were mortality, rehospitalization, efficacy of cardiac function, LVEF, 6MWD and BNP, of which mortality and rehospitalization are end points for patients with CHF while the others are surrogate outcomes [61]. Only four studies (4/31, 12.90%) reported mortality or rehospitalization, whereas all 16 CSRs of CHF analyzed all-cause mortality. This difference indicates that current TCM trials mostly assess surrogate outcomes and lack evaluation of CHF end points.

In this review, nearly half of the included studies (14/31, 45.16%) did not mention any ADRs or adverse events, which undermines the assessment of safety.

Apart from problems in selecting outcome measures, the reporting quality of outcome measures was generally low. Twenty-nine trials (29/31, 93.55%) did not clearly state which outcomes were primary and which were secondary, which could confuse readers about the main objectives of the trials and what the interventions can actually improve.

In terms of methodology, only two of the included RCTs [39, 40] were considered high quality. In general, the risk of bias of these trials was classified as high. The design and implementation of most studies fell far short of an optimal RCT with respect to random sequence generation, allocation concealment, blinding, statistics and reporting.

The selection of outcome measures is a critically important step in clinical trials. Scientifically rigorous outcomes can provide significant and comprehensive information about the efficacy and safety of a specific intervention [62], which has a positive impact on physicians' clinical choices and decisions. In large-scale trials of heart failure, end points such as mortality and hospitalization are mostly set as primary outcomes [63, 64], and treatments that reduce mortality or morbidity are recommended in influential clinical guidelines [65, 66]. We compared the included trials with CSRs, which are widely regarded as high-quality information for health decision-making, to identify current problems with outcome measures in studies conducted by TCM researchers. Our findings suggest that evaluating improvement in clinical symptoms without robust evidence on clinical end points might be the primary reason why TCM interventions have not been widely recognized [67].

A European Society of Cardiology (ESC) consensus on the outcomes of HF trials [61], which is included in the Core Outcome Measures in Effectiveness Trials (COMET) database, highlighted that clinical end points can support the consolidation of therapeutic strategies, whereas surrogate outcomes reflecting manifestations are typically applied in earlier phases of drug or device development to support proof of concept (Fig. 4). We recommend that future TCM trials refer to this consensus when selecting outcome measures.

Fig. 4 Schematic of outcomes for chronic heart failure trials [61]

The assessment of safety is indispensable for any clinical trial. In the included RCTs, patients with CHF secondary to CHD mostly had one or more comorbid conditions that could cause treatment conflicts [68]. Researchers should attach great importance to ADRs, adverse events and other safety outcomes throughout their studies, and are responsible for assessing whether the intervention harms patients or aggravates heart failure and thereby affects mortality or hospitalization [69]. We strongly recommend that TCM researchers pay sufficient attention to the evaluation and reporting of safety in every trial.

Through this review, we propose that TCM clinical trials should focus on the assessment of clinical endpoints when evaluating TCM interventions for CHF. However, we are aware that the included trials were all too small to assess clinical endpoints. In terms of the number of participants, the duration of the trial and the areas involved, none of these trials can be regarded as large-scale. The shortest duration among the included trials was 1 week [39], within which it would be practically impossible to record mortality, rehospitalization or other endpoints. Indeed, differences in endpoints between treatment and control groups may emerge when follow-up lasts 6 months or longer [20, 22].

It is indeed difficult to conduct a TCM trial of sufficient size and duration to evaluate endpoints of heart failure, as this requires appropriate organization and funding. We need high-quality prospective, multicenter RCTs [11, 70] rather than the current repetitive small-scale trials to promote the healthy development of TCM [71]. We recommend collaboration among hospitals, research institutes and TCM enterprises to conduct multicenter clinical trials that assess endpoints and generate convincing evidence that can genuinely guide TCM clinical practice.

This review has several limitations. First, 31 trials might not be enough to analyze the various outcome measures. Second, neither our review nor the included trials distinguished heart failure with reduced ejection fraction from heart failure with preserved ejection fraction, which would affect the selection and evaluation of the corresponding outcome measures. Third, the proportion and reporting quality of the outcomes analyzed in this review cannot capture comprehensive information about outcome measures in RCTs. The methods used to measure outcomes, the timing of measurement, ways to enhance the quality of outcome measurement, follow-up of the primary outcomes and the assessment of composite outcomes are all important aspects of outcome measures, and our future research will focus on them. Fourth, given the aims of the review, we did not conduct a meta-analysis of the 31 RCTs. In future work, we will include trials with little or no heterogeneity, comprehensively analyze outcomes and evaluate the efficacy and safety of TCM treatments.

Conclusions

Several problems with outcomes exist in current trials of TCM in treating CHF caused by CHD, including insufficient attention to the clinical end points of HF, inadequate safety evaluation and low reporting quality. Moreover, the risk of bias was classified as high. To produce robust and convincing evidence for TCM in treating CHF caused by CHD, further studies should be rigorous and well designed, set clinical end points as the primary outcome measures and strengthen the evaluation of safety.