Background

Abstracts of primary research can provide concise information on a study’s purpose, methods, main results and conclusions. Abstracts are often used to aid decision-making, especially when full reports cannot be accessed [1, 2]. However, there is a risk that abstracts are reported inconsistently with their corresponding full reports, which would then distract or mislead readers [3, 4]. Although the EQUATOR (Enhancing the Quality and Transparency of Health Research) network has provided some guidelines to enhance the reporting of abstracts [5], adherence to them remains unsatisfactory [6,7,8,9]. Some studies assessing the inconsistency between abstracts and full reports have reported striking findings of abstract inaccuracy [10,11,12]. Nevertheless, no summary is available that maps the abstract reporting problem across the literature or provides overarching recommendations for primary studies and future research. Therefore, as part of our series on the state of reporting of primary biomedical research [13], we used a scoping review to summarize the evidence from systematic reviews and surveys, in order to investigate the current state of inconsistent abstract reporting, and also to evaluate factors that are associated with improved reporting. This was done by comparing abstracts and their corresponding full reports.

Methods

We performed and reported our study based on the methodological guidance for the conduct of a scoping review from the Joanna Briggs Institute [14] and the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guideline [15]. Details on the methods can be found in our protocol [13]. Surveys and systematic reviews were considered eligible if they compared abstracts within full reports with the reports themselves, or if they compared conference abstracts with subsequent full reports emanating from the same study. We did not distinguish between these comparisons in this scoping review.

Search strategy and study selection

Briefly, we searched EMBASE (Excerpta Medica Database), Web of Science, MEDLINE, and CINAHL (Cumulative Index to Nursing and Allied Health Literature) from January 1st 1996 to September 30th 2016 to retrieve relevant studies, using key descriptors for systematic reviews or surveys, abstracts, and reporting or inconsistency. The reference lists of the included surveys and reviews were also searched manually for further eligible studies. All the searches were limited to the English language. Studies were excluded if: 1) they were not systematic reviews or surveys; 2) they did not have a study objective of comparing abstracts with full reports; 3) no data were reported on inconsistency between abstracts and full reports; 4) they were duplicates (in which case only one copy was retained); 5) they were published only as abstracts, letters, editorials or commentaries, without full-text articles providing further detailed information; or 6) they did not focus on primary studies.

All the screening of titles, abstracts and full-text articles was conducted by two reviewers (IN and YJ) in duplicate and independently. We used the Kappa statistic to quantify the level of agreement between the two reviewers [16]. Discrepancies between the two reviewers were resolved by consensus, or, failing that, a third reviewer (GL) made a final decision.
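
As an illustration of the agreement statistic, the following is a minimal sketch that assumes the unweighted Cohen's kappa for two raters; the text above does not state which variant of kappa was computed, and the function name and the screening decisions shown are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of records")
    n = len(rater_a)
    # Observed agreement: share of records on which the two decisions match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories used by either rater.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both raters use a single identical category;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for ten records (not data from this review).
reviewer_1 = ["include"] * 3 + ["exclude"] * 7
reviewer_2 = ["include"] * 2 + ["exclude"] * 8
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

In this made-up example the reviewers agree on 9 of 10 records (observed agreement 0.90) while chance agreement is 0.62, so kappa is about 0.74.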

Outcome measures

The primary outcome was the level of inconsistency between abstracts and full reports, expressed either as a percentage (a lower percentage indicating better reporting) or as a categorized rating (such as major/minor difference, or high/medium/low inconsistency), as reported by the authors. We also extracted details on inconsistency for the study-validity-related factors including research question or objective, population or sample size, intervention or exposure, comparator, outcome, study duration, study design, statistical analysis, result presentation, result interpretation, and conclusion or recommendation [13]. Secondary outcomes were the factors associated with inconsistent abstract reporting.

Data collection

Data collection was conducted independently by two reviewers (LA and IN) using a pilot-tested data extraction form. Data extracted were: 1) general characteristics of included systematic reviews or surveys (authors, journal, publication year, study design, field of study, data sources for abstracts and full reports, study search frame, numbers of included abstracts and full reports, study country of primary studies, study sample size in primary studies, and funding information for the systematic reviews or surveys); 2) definitions of inconsistency between abstracts and full reports and the main findings in the included studies; 3) information on inconsistency for the study-validity-related factors; 4) factors related to improved reporting between abstracts and full reports; and 5) authors’ conclusions or recommendations in the included studies. We also collected the terminologies that were used to describe the abstract reporting problem, and their frequencies.

Quality assessment of included systematic reviews

We assessed the study quality of included systematic reviews based on the AMSTAR (a measurement tool to assess systematic reviews) criteria [17]. Some items, such as item 9 (“Were the methods used to combine the findings of studies appropriate?”) and item 10 (“Was the likelihood of publication bias assessed?”) were not applicable to the included systematic reviews, and these items were therefore excluded from our overall quality score evaluation. We calculated the scores by summing the number of items on AMSTAR that the included systematic reviews met. No study quality assessment was performed for the surveys because there were no validated evaluation tools available.
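
The scoring rule described above can be sketched as follows. This is only an illustration: it relies on the original AMSTAR checklist having 11 items, so that excluding items 9 and 10 leaves 9 applicable items, and the names and the example ratings are hypothetical.

```python
# AMSTAR has 11 items; items 9 and 10 were treated as not applicable above.
NOT_APPLICABLE = {9, 10}
APPLICABLE_ITEMS = [item for item in range(1, 12) if item not in NOT_APPLICABLE]


def amstar_score(items_met):
    """Count the applicable AMSTAR items that a review satisfied."""
    return sum(1 for item in APPLICABLE_ITEMS if item in items_met)


# Hypothetical ratings for one review (not taken from the included studies).
example_items_met = {2, 3, 4, 5, 6, 7, 8, 9}
example_score = amstar_score(example_items_met)  # item 9 is ignored
max_score = len(APPLICABLE_ITEMS)
```

Under these assumptions the hypothetical review scores 7 out of a maximum of 9, because item 9 does not count towards the total.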

Evidence synthesis

We used word clouds to show the frequencies of the terminologies employed to describe the problems identified in abstract reporting. The online program Wordle (www.wordle.net) was used to draw the word clouds, based on input of the terminologies and the numbers of included studies that used them to describe inconsistent abstract reporting. The relative size of the terminologies in the word clouds corresponded to the frequency of their use. We used medians and interquartile ranges to describe the level of inconsistency across studies. Evidence from the included systematic reviews or surveys was summarized qualitatively, but not quantitatively.
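
For readers who wish to reproduce this kind of summary, a minimal sketch of the median and interquartile range calculation is given below. The quartile method is an assumption (the default "exclusive" method of Python's `statistics.quantiles`), since the text does not specify one, and the input values are hypothetical.

```python
import statistics


def median_and_iqr(percentages):
    """Return (median, first quartile, third quartile) of study-level values."""
    if len(percentages) < 2:
        raise ValueError("at least two values are needed to estimate quartiles")
    # quantiles(..., n=4) returns the three cut points Q1, Q2 and Q3.
    q1, _, q3 = statistics.quantiles(percentages, n=4)
    return statistics.median(percentages), q1, q3


# Hypothetical study-level inconsistency percentages (not the review's data).
levels = [10, 20, 30, 40, 50, 60, 70]
median, q1, q3 = median_and_iqr(levels)
```

Different software packages use different quartile definitions, so small discrepancies in the reported interquartile range are possible when the number of studies is small.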

Results

There were 9123 records retrieved from the electronic databases. After duplicates were removed and titles and abstracts were screened, a total of 84 studies remained for full-text article assessment (kappa = 0.85, 95% confidence interval [CI]: 0.79 - 0.91 for record screening). Of these, 16 studies met the eligibility criteria [2,3,4, 7, 10,11,12, 18,19,20,21,22,23,24,25,26] and one additional study [27] was identified from the reference lists, yielding 17 studies that were included for the data collection and analyses (kappa = 0.65, 95% CI: 0.57 - 0.75 for data extraction). Figure 1 presents the flow diagram of the study selection process.

Fig. 1

Study flow diagram showing the study selection process

Table 1 summarizes the characteristics of the included studies, which comprised three systematic reviews [3, 12, 19] and fourteen surveys [2, 4, 7, 10, 11, 18, 20,21,22,23,24,25,26,27]. Eleven studies compared conference abstracts with their subsequent full reports [2, 7, 19,20,21,22,23,24,25,26,27], while the others investigated inconsistency between the abstract section and the main text in the same publication [3, 4, 10,11,12, 18]. Three studies reported that their primary studies were mostly from North America or Europe [2, 19, 24]. The median sample size in the primary studies ranged from 5 to 452. Three studies reported that they received academic funding to support their surveys [4, 11, 21]. Study quality was evaluated for the three systematic reviews, whose AMSTAR scores were 8 (out of 9) [19], 7 [12] and 6 [3]. None of these three reviews provided information on conflicts of interest for either the systematic review itself or the included primary studies [3, 12, 19]. Two reviews did not consider the grey literature in their search strategies [3, 12]. Because no information (such as a protocol, ethics approval or registration) was available on the a priori research question and inclusion criteria before the conduct of the review, one study scored zero on AMSTAR item 1 (“Was an 'a priori' design provided?”) [3].

Table 1 General characteristics of included systematic reviews or surveys

The most frequently used terminologies to describe the abstract reporting problem were “inconsistency” (n = 14 out of the 17 included studies, 82%), “deficiency” (n = 11, 65%), “accuracy” (n = 10, 59%), and “discrepancy” (n = 8, 47%). Other terminologies included “omission”, “misreporting”, “discordance”, “poor”, “biased”, “inadequate”, “incomplete”, and “selective reporting”, each of which appeared in at most 4 studies. Figure 2 shows the word cloud of the terminologies used in the included studies.
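
The percentages quoted above follow directly from the study counts and the denominator of 17 included studies. A short sketch of that arithmetic, using only the counts reported in this paragraph (the variable names are illustrative), is:

```python
TOTAL_STUDIES = 17
# Number of included studies using each term, as reported in the paragraph above.
term_counts = {
    "inconsistency": 14,
    "deficiency": 11,
    "accuracy": 10,
    "discrepancy": 8,
}

# Percentage of included studies using each term, rounded to a whole number.
term_percentages = {
    term: round(100 * count / TOTAL_STUDIES) for term, count in term_counts.items()
}
```

The same term-to-count mapping is the kind of input from which a frequency-weighted word cloud can be drawn.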

Fig. 2

Word clouds of the terminologies used in the included studies, with the relative size of the terms in the word cloud corresponding to the frequency of their use

Table 2 shows definitions, main findings and authors’ conclusions of inconsistency between abstracts and full reports in the included studies. The level of inconsistency ranged from 4% to 78%, with a median of 39% (interquartile range: 14% - 54%). In the studies that differentiated major from minor inconsistencies [2, 19, 20, 27], the level of major inconsistency ranged from 5% to 45% (median: 19%, interquartile range: 7% - 31%), which originated from the specification of the study design (5%) or sample size (37%), designation of a primary outcome measure (from 14% to 28%), presentation of main results (19%), or drawing a conclusion (6%). All the included studies concluded that abstracts were frequently inconsistently reported, and that efforts were needed to improve abstract reporting in primary biomedical research (Table 2).

Table 2 Definitions, main findings and authors’ conclusions of inconsistency between abstracts and full reports in the included studies

Table 3 shows the details on inconsistency for the study-validity-related factors between abstracts and full reports. Except for the research question or objective, intervention or exposure, study duration or design, and statistical analysis, inconsistencies were frequently reported in the other factors of this type, with percentages of >10% in most cases. For instance, in the nine studies that assessed a total of 896 abstract-full-report pairs, conclusions in abstracts were found to be inconsistent with the full reports (ranging from 15% to 35%), or made stronger statements than in the full reports (17%). As presented in Table 4, three studies investigated factors associated with inconsistent reporting between conference abstracts and full reports [2, 20, 21]. A longer time interval before publication of the full reports was found to be the only factor that was marginally or significantly related to an increased likelihood of reporting inconsistencies.

Table 3 Details on inconsistency for the study-validity-related factors between abstracts and full reports
Table 4 Factors reported to be associated with inconsistent reporting between abstracts and full reports

Discussion

In this scoping review assessing inconsistency between abstracts and full reports, we summarized the evidence from systematic reviews and surveys to map the literature on inconsistent abstract reporting in primary biomedical research. Abstracts frequently differed from their corresponding full reports, with a high level of inconsistency. The length of time between the appearance of conference abstracts and the publication of full reports was the only factor reported to be associated with inconsistent reporting.

Readers usually rely on an initial assessment of an abstract in deciding whether to access the full report, draw conclusions about the study, or even make their decisions, especially when a full report is not available [4, 28]. For instance, Bhandari et al. found that over 50% of the chapters in the latest editions of some of the most influential orthopedic textbooks referenced at least one conference abstract, and that these abstracts were frequently cited in lectures and rounds [2]. Therefore, given their potential impact, all the summary information in abstracts should, at a minimum, be accurate and consistent with their full reports. However, abstracts are frequently prepared with the least care [1, 3]. Our current review found a high level of inconsistency between abstracts and full reports, especially with respect to sample sizes, outcome measures, result presentation and interpretation, and conclusions or recommendations (Table 3). Unlike the included individual studies that evaluated a specific research area, or a group of journals or diseases, our review summarized all the available evidence from systematic reviews and surveys in various areas of the biomedical literature, and we consistently found severe problems in abstract reporting in the primary biomedical community. More efforts are warranted to reverse and prevent the inconsistency of abstract reporting.

Two studies also assessed spin in abstract reporting, where spin was defined as an overly optimistic abstract that tried to claim significant results or strong recommendations in a study with overall non-significant results [3, 4]. Spin may not always be relevant to our objective of identifying inconsistency between abstracts and full reports, because the spin could be the same in content and magnitude in both the abstract and the full report. However, when there is an attempt to incorrectly convince the audience of a favorable finding or conclusion, spin in abstracts risks distorting study findings and misleading readers of the biomedical literature, especially when readers do not go on to consult the full reports for the detailed study results.

Many journals have adopted a policy of requiring structured abstracts, because they have been shown to be more informative, have greater readability, and be of better quality [29, 30]. However, one study has argued that the problem of inconsistent abstract reporting will not be mitigated by using structured abstracts [10]. On the contrary, if structured abstracts inappropriately emphasize their main points, the inaccurate information that they convey could have a stronger impact on the biomedical community. For instance, some structured abstracts used spin to over-emphasize favorable effects in subgroups of patients, for secondary outcomes or in deliberately modified populations, or they drew over-optimistically strong conclusions and recommendations, which would further mislead the audience [3, 4]. Furthermore, word count limitations in structured abstracts can sometimes cause key information to be omitted [21]. Whether guideline checklists improve the consistency of abstract reporting remains unknown, with sparse evidence available in the literature. The CONSORT (Consolidated Standards of Reporting Trials) guideline for abstracts, which was published to aid in improving structured abstract reporting for RCTs [31], might not prevent subtle inconsistencies, especially if editorial staff do not refer to full reports for painstaking scrutiny [10]. Similarly, one trial provided authors with instructions on ensuring data accuracy in abstracts, but found that this was ineffective in actually improving abstract reporting [32].

Some included studies explored possible explanations for the inconsistencies in conference abstract reporting. For example, given that some conference abstracts were presented when studies were ongoing or at an early stage, sample sizes in full reports would probably be updated from the preliminary results described in abstracts [20, 21]. However, one study argued that the inconsistencies may be deliberate, because authors avoided providing details or explanations of the inconsistencies (such as handling patients who were lost to follow-up or who withdrew differently, or adding more exclusion criteria) to achieve more favorable results in full reports [2]. Furthermore, in order to show significant findings in the full reports, authors may selectively report favorable findings, or deliberately change the way they define primary outcomes, present and interpret results, or draw conclusions [2, 19, 21, 23]. Three studies reported that a longer time before the publication of full reports was associated with an increased risk of inconsistent abstract reporting (Table 4) [2, 20, 21]. This might be partly explained if delayed publications had experienced several rejections from journals, and if authors then consciously or subconsciously modified their full reports to cater to the subsequent peer-review processes [20]. Some delays in publishing were due to an extended study duration. In long-duration studies, large amounts of data may be collected, which may then yield statistical analysis results that differ from the preliminary results presented in the conference abstracts [2, 33].

To reduce or prevent inconsistency between abstracts and full reports, we recommend that authors, reviewers and editorial staff carefully scrutinize the consistency and accuracy of abstract reporting during the submission and peer-review processes [34]. Copyediting and proofreading should be performed strictly to avoid any confusion or inconsistency in abstracts after submissions are accepted. Journals may also consider more flexible word counts in structured abstracts to allow more details to be presented. Moreover, guidance and/or checklists are needed to help authors, reviewers and editorial staff to assess promptly any inconsistency between abstracts and full reports. For conference abstracts, one might argue that conferences or meetings should require a publication-ready manuscript for their abstract submission [24]. However, this expectation is probably unrealistic and its impact remains largely unknown. Instead, we recommend that editorial staff and reviewers refer to the previously presented conference abstracts during the peer-review process, and that authors provide explanations of any inconsistencies between those abstracts and the full reports.

Our scoping review has some limitations. We limited the search to articles in English, which would omit studies in other languages. As most included studies focused on RCTs and/or conference abstracts, the findings for non-randomized studies, basic science and/or comparisons between abstract sections and main texts in the same publications remain largely unknown. Also, we could not assess the quality of the included surveys because no validated assessment tool was available. Similarly, the lack of information on the factors associated with inconsistent reporting between the abstract section and main text in the same publications restricted our investigations and recommendations in this area.

Conclusion

In this scoping review of the state of abstract reporting in primary biomedical research, we found that abstracts were frequently inconsistent with full reports, based on evidence from systematic reviews and surveys in the literature. Efforts are needed to improve the consistency of abstract reporting in the primary biomedical community.