The use of complementary and alternative medicine (CAM) has become increasingly popular in both adult and pediatric populations. For example, in 1992, 11% of a Montreal-based pediatric sample used CAM [1]; by 1997, this figure had grown to 17% [2]. The 1994 National Population Health Survey suggested that 15% of Canadians (of any age) had used CAM during the preceding year [3]. By 1999, this figure had grown to roughly 70% [4]. Similar data have been reported for other jurisdictions [5, 6].

What is less clear is the quality of evidence for the use of these products and practices. Focusing our attention on reports of randomized controlled trials (RCTs) allows us to examine the gold standard for evaluating an intervention's effectiveness. This enables readers to judge the extent to which the results are internally valid and free of bias.

Within conventional medicine there is substantial evidence about the quality of reports of RCTs and the consequences of lower quality reporting. Schulz and colleagues have documented that only about a third of RCT reports adequately describe allocation concealment [7]. Trials with inadequately concealed allocation, compared with those reporting adequate concealment, exaggerate estimates of an intervention's effectiveness by about 30%, on average [8-10]. These investigations and others have typically focused on interventions such as pharmaceuticals, and on adult populations.

We are unaware of any systematic effort to examine the extent of bias in reports of CAM RCTs specifically targeting pediatric populations. This is important because if CAM studies are not subjected to rigorous evaluation they may jeopardize the health of children and their families [11]. Our primary focus was to evaluate the quality of reports of pediatric CAM (PedCAM) RCTs. As a secondary question, we also examined whether the quality of reporting changed over time.


We have previously described the assembly of a comprehensive database of PedCAM RCTs [12]. Briefly, after defining CAM we searched 13 bibliographic databases using one of three search strategies. We also identified RCTs from cited references of 47 PedCAM systematic reviews [13]. The search results were downloaded to a reference database and screened. After identifying the PedCAM RCTs, clusters of trials relating to specific disease conditions and intervention types were identified. Three members of the research team (DM, MS, LL) nominated clusters for further consideration. Our goal was to identify 300 trials for closer examination, ensuring broad coverage of diseases and interventions from the full set of identified reports.

Once these reports were retrieved we extracted descriptive information using a 17-item structured data collection form. The questions pertained to the type of CAM used, the condition under investigation (according to the International Classification of Disease – ICD-9), the number and gender of the included children, the number and type of outcomes used, information about the reporting of adverse events and whether the authors reported on any cost information. The complete questionnaire can be obtained from the authors.

We also completed a comprehensive quality assessment of each report using three methods. First, the revised CONSORT statement checklist [14] was modified so that multiple-part items were listed separately, resulting in 32 items. Each item was assigned a yes or no response depending on whether the authors had reported it. Second, the reporting of allocation concealment was assessed as adequate, inadequate, or unclear [8]. Third, the Jadad scale [15], which contains two questions each on randomization and masking and one question on the reporting of dropouts and withdrawals, was used to assess quality. Each question has a yes or no response option. In total, five points can be awarded, with higher scores indicating superior quality. Three reviewers (DM, LL and MS) completed all of these evaluations.
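The five-item Jadad scoring described above can be sketched as a simple additive function. This is an illustrative reconstruction, not the authors' instrument; the function name and parameter names are our own, and it implements the simple one-point-per-yes scheme the text describes.

```python
def jadad_score(randomized, randomization_appropriate,
                double_blinded, blinding_appropriate,
                withdrawals_described):
    """Illustrative Jadad-style score (0-5) from five yes/no judgements.

    One point each for: reported as randomized, appropriate
    randomization method, reported as double-blind, appropriate
    blinding method, and description of dropouts/withdrawals.
    """
    items = [randomized, randomization_appropriate,
             double_blinded, blinding_appropriate,
             withdrawals_described]
    return sum(1 for item in items if item)
```

For example, a trial reported as randomized with withdrawals described, but with no other items reported, would score 2 of 5 under this scheme.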

We did not conduct any formal training prior to evaluating the RCTs using any of the three methods. We have extensive experience using these methods and have previously conducted training with results indicating substantial agreement between raters [16]. Discrepancies were resolved by consensus between the three raters.

We compared the number of checklist criteria included in each report and the mean number of criteria included within each subheading of the CONSORT checklist. We also assessed the percentage of studies reporting unclear allocation concealment, and the item-specific and overall quality scores derived from the Jadad instrument. The number of CONSORT checklist items reported was compared over time (1970s, 1980s, 1990s, 2000s) using analysis of variance. A similar approach was used to assess the individual components and overall total score of the Jadad scale. The percentage of unclear allocation concealment was evaluated using χ² tests.
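The analyses described above can be sketched with standard library routines. The counts below are made up for illustration only; they are not the study's data, and the exact procedures the authors used may have differed.

```python
# Sketch of a one-way ANOVA over decades and a chi-square test of
# allocation-concealment clarity, as described in the text.
# All numbers here are hypothetical.
from scipy.stats import f_oneway, chi2_contingency

# CONSORT item counts per report, grouped by publication decade
# (hypothetical values).
consort_by_decade = {
    "1970s": [9, 11, 10, 12],
    "1980s": [10, 11, 12, 10],
    "1990s": [13, 14, 12, 15],
    "2000s": [12, 13, 14],
}
f_stat, p_anova = f_oneway(*consort_by_decade.values())

# Unclear vs. clear allocation concealment by decade
# (hypothetical 2 x 4 contingency table).
concealment = [
    [20, 35, 150, 10],  # unclear
    [5, 8, 20, 3],      # clear (adequate or inadequate)
]
chi2, p_chi2, dof, expected = chi2_contingency(concealment)
```

The ANOVA treats decade as a grouping factor on the per-report item counts, while the χ² test compares the proportion of unclear concealment across decades.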


Database searching identified 3580 citations. Of these, 2975 were screened, from which 1468 PedCAM RCTs were identified (Figure 1). We systematically sampled 301 of these reports for further study. Twenty reports failed to meet our eligibility criteria and a further 30 reports were not evaluated (Figure 1), leaving 251 reports for which descriptive information and quality assessments were completed (Table 1).

Figure 1

Flow of citations and articles through the phases of screening and eligibility evaluation

Table 1 Conditions and Interventions selected for further study

The reported objective of the CAM intervention was to manage or minimize current symptoms in two thirds of the reports (66.7%), and about one third of the RCTs were undertaken to help prevent disease (32.5%). There were 157 reports involving fewer than 100 children (mean = 47.31; standard deviation = 24.14); 63 reports included between 100 and 1000 children (mean = 298.38; standard deviation = 232.51); and 9 reports included more than 1000 children (mean = 6766; standard deviation = 8402.72).

Forty percent (12.7 out of 32) of the CONSORT checklist items were included in the reports of PedCAM RCTs (Table 2). There was an increase over time (p < 0.001) in the number of checklist items included in the reports (Table 2). Between the 1980s, when a mean of 10.8 checklist items was reported, and the 1990s, when 13.4 were reported, there was a 24% increase in the number of checklist items reported (p = 0.001; mean difference = -2.55; 95% confidence interval: -4.28 to -0.81). There was a minor decrease in the number of CONSORT items reported in the 2000s.
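The relative increase quoted above follows directly from the decade means:

```python
# Decade means for CONSORT items reported, taken from the text.
mean_1980s = 10.8
mean_1990s = 13.4

# Relative increase from the 1980s to the 1990s.
relative_increase = (mean_1990s - mean_1980s) / mean_1980s
print(f"{100 * relative_increase:.0f}%")  # prints "24%"
```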

Table 2 CONSORT checklist criteria included in 251 reports of complementary and alternative medicine randomized controlled trials in children published over four decades

The majority (83.1%) of RCTs reported unclear allocation concealment (Table 3). We were unable to detect a change over time (p = 0.496; Table 3). On average, reports achieved approximately 40% of the maximum possible total score on the Jadad scale (Table 3). There was no improvement in the quality of reporting over time (p = 0.174).

Table 3 Quality of reports of 251 complementary and alternative medicine randomized controlled trials in children using the Jadad assessment scale and the adequacy of allocation concealment

Information regarding adverse events was reported in less than one quarter of the RCTs (22.4%). Similarly, information regarding costs (e.g., cost effectiveness) was mentioned in only a minority of reports (4.5%).


Approximately 40% of the CONSORT checklist items were included in reports of PedCAM RCTs. Although there is still considerable room for improvement in how these studies are reported, there has been a notable increase in the quality of their reports since the 1980s. The 24% increase in the number of reported CONSORT items is encouraging and probably reflects important reductions in bias in the results of RCTs. Similar results were observed when the assessments focused on the adequacy of allocation concealment or the Jadad scale. Unfortunately, these results also suggest that the validity of some PedCAM RCT results is questionable. This is particularly true for how randomization was reported: only about half of the reports documented how the random numbers were generated, and more than three quarters of the reports had unclear allocation concealment.

Conducting a randomized trial is a complex series of tasks and it may not always be possible to minimize some potential biases. For example, double blinding (masking) is questionable both ethically and scientifically in surgical trials. However, in every randomized trial it is possible to ensure that the random numbers are appropriately generated (e.g., by computer) and concealed from all parties involved until the child has been randomized (e.g., through centralized randomization). High quality reports always include this information.

Our results indicate that the quality of reports of PedCAM RCTs may be lower than that found for conventional medicine, although it is difficult to be certain because the degree of journal overlap between studies is unknown. In a recent assessment of 77 RCTs published in 1998 in three high impact factor journals, the average number of CONSORT items included was 27.1 out of 40 [16]. The average Jadad score was 62% of the maximum possible score, and 39% of the reports had unclear allocation concealment. Linde and colleagues recently reported on the quality of 207 trials of homeopathy, herbs and acupuncture [17]. The average Jadad score ranged from about 40% (acupuncture) to 60% (herbs) of the maximum possible total score. These Jadad scores are similar to those observed in this study; however, the CONSORT results reported here are considerably lower than those observed elsewhere.

One way to improve the quality of reporting of PedCAM RCTs is for more pediatric journals to endorse the CONSORT statement. There is evidence to suggest that journals using the CONSORT statement have higher quality reports of RCTs than those not doing so [16]. Of course, examining the quality of reporting is 'after the fact', once the trial is already completed. CONSORT can also be used by granting agencies [18] (Allan Bernstein, President of the Canadian Institutes of Health Research, personal communication) to encourage prospective investigators to improve the conduct of their RCTs.

It is not immediately clear why we observed lower quality scores in these reports. It is possible that the PedCAM community conducts fewer RCTs and is therefore less experienced. We have observed that prior to 1975 there were very few published reports of PedCAM RCTs [12], although there has been a sharp increase in the number of reports during the 1990s. These results might also reflect that the PedCAM community has been slower to train researchers in the appropriate conduct of RCTs.

Beyond examining the quality of these reports, we were disappointed to find that so few mentioned anything about adverse events. This result is similar to one recently reported [19]. Although information on adverse events is extremely important, authors have typically devoted less space to it than to their names and affiliations [20]. Only about one in twenty reports mentioned costs, such as a cost-benefit analysis. If clinicians and policy makers are to make decisions about the utility of CAM interventions for the pediatric population, they will need more information than simply the efficacy of the intervention.

This study had a number of limitations. Our focus was on the quality of reporting of PedCAM RCTs; it is possible that the trials were appropriately conducted but deficiently reported. Despite the paucity of data addressing this question, the available evidence suggests a reasonably good correlation between how investigators conduct their trials and how they subsequently report them [21, 22]. We did not take a random sample of all 1468 trials identified, so it is possible that our sample does not reflect the total population and that these results cannot be generalized to all PedCAM RCTs. However, we selected the reports to broadly reflect the ICD categories and CAM interventions of the 1468 reports, and we believe that our sampling approach, although systematic rather than random, is representative and enables us to generalize the observed results. The results for the 2000s need to be interpreted with caution and are probably not representative of the population, as there were only 13 reports; this small number probably reflects the delay in indexing studies in electronic databases. We excluded 22 reports because they were written in languages other than English. It is possible that the quality of these reports differs in some systematic way from English language reports, although previous research suggests that the quality of RCTs reported in non-English languages is similar to that of those reported in English [23].

Randomized trials are an important tool for evidence based health care decisions. If these studies are to be relevant to the evaluation of CAM interventions, it is important that they are conducted and reported to the highest possible standards. There is a need to redouble efforts to ensure that children and their families are participating in RCTs that are conducted and reported with minimal bias. Such studies will increase their usefulness to a broad spectrum of interested stakeholders.