Background

The number of citations that a publication receives is often regarded as an indicator of scientific quality [1]. The rationale behind this is that high-quality work will lead to more citations by peer scientists compared to low-quality work. It can be questioned, however, whether it is scientific merit alone that drives the number of citations or whether other factors have an influence as well.

Because the number of scientific articles in the biomedical domain is large and growing, and many journals limit the number of references, it is not always feasible to cite all available literature. Ideally, a representative sample should be cited, but it is unclear on what grounds researchers select the articles they cite. If this selection is not representative, but instead associated with specific characteristics of the cited literature, we speak of selective citation.

Citation bias is a special case of selective citation [2]. It concerns the selective citation of studies based on their outcome. Citation bias results in disproportionate attention for a specific segment of the literature that is in line with a hypothesis, while publications with an alternative view are systematically ignored. This can lead to a false consensus that is not evidence-based [3]. For instance, it has been shown that biased citation of previous evidence shapes the conclusions of reviews [4]. Citation bias can also lead to research waste by influencing funding decisions and guiding research in a wrong direction [4, 5].

Citation analyses can be performed in several ways. Greenberg [3] performed a network analysis based on a specific scientific claim: the relationship between a muscle disease called inclusion body myositis and beta-amyloid proteins. After identifying all the literature on this claim, he compared the number of citations to supportive empirical publications with the number of citations to critical ones. Greenberg concluded that 94% of the citations were assigned to the supportive data [3]. Besides a claim-specific network, a network analysis can also be based on a specific journal or time period. For instance, Fanelli [5] performed a citation analysis across several research domains. Articles were included if they tested any hypothesis and were published between 2000 and 2007. It was found that positive studies received on average 32% more citations than negative studies. However, this percentage differed substantially between the scientific domains [5].

In the current study, we performed a claim-specific citation analysis. We focused on the relation between swimming in chlorinated water and the development of childhood asthma.

Chlorinated water and asthma

The prevalence of childhood asthma is high in the developed world [6] and rising in the developing world [7]. Asthma patients are often recommended to swim because it has been shown that exercise can help to alleviate asthmatic symptoms (e.g., [8]). This recommendation has been endorsed by some health authorities and official guidelines worldwide (e.g., [9]).

In 2003, however, Bernard et al. postulated that swimming in chlorinated water could also increase the risk of developing asthma [10], especially in children. This can be explained as follows. Human beings secrete sweat and urine in swimming water, and both of these contain urea. A chemical reaction of urea with the free chlorine in chlorinated water leads to the generation of chloramines. In fact, it is these chloramines that are responsible for the typical smell of chlorinated swimming pools, not the chlorine itself. One of these chloramines, called trichloramine or nitrogen trichloride (NCl3), does not dissolve in water and will be released into the air [11, 12]. This substance can cause irritation to the eyes and lungs. The actual exposure to trichloramine depends on many factors like the level of free chlorine in the swimming water, whether the swimming pool is indoors or outdoors, ventilation, the number of people in the swimming pool, whether they have showered before, and the duration and intensity of the exercise. Children might be particularly prone to the adverse effects of trichloramine, as their lungs are still developing and relatively sensitive.

The pool chlorine hypothesis builds on this potential relationship between swimming in chlorinated water and the development of asthma in children. It postulates that the alleged rise in childhood asthma in developed countries can at least partly be explained by the popularity of indoor chlorinated pools [6] (but see also [7] for counter-evidence on the rise of asthma in the developed world). If swimming in chlorinated water indeed has an adverse effect on respiratory health, it would call into question the official recommendation for asthmatic patients to swim. It would also suggest that other methods should be applied for the disinfection of swimming pool water.

However, no consensus has yet been reached on the question whether swimming in chlorinated water does increase the likelihood of developing asthma, and as such, it is still an object of ongoing research. After all, it is hard to find conclusive evidence that would settle the debate in the form of a single study, even more so because most of the studies on humans are observational rather than experimental. In addition, asthma is difficult to diagnose and has been assessed in a variety of ways. It is therefore necessary to combine different sources of information to reach a representative overview of the literature and an evidence-based consensus.

To investigate how these different sources of information are combined in this field, we performed a citation analysis on all published literature investigating the claim that swimming in chlorinated water is related to the development of childhood asthma. Extending Greenberg’s approach [3], we also took into account determinants other than study outcome. Specifically, our aim was to investigate which determinants influenced the likelihood of being cited in the scientific literature on the relation between swimming in chlorinated water and the development of childhood asthma.

Method

Prior to performing this citation network analysis, we described the methodology in a study protocol and stored it at an online repository [13]. Deviations from this protocol are mentioned in the digital Additional file 1. In brief, we applied a search strategy to the Web of Science Core Collection, identified relevant literature, downloaded these records with their reference lists, extracted data for each article, built a dataset with potential citations, and used specialized software to determine which citations had occurred. These steps will be explained in more detail below.

Search strategy

The search strategy combined terms related to (a) the determinant (terms like chlorinated pool, swimming, trichloramine), (b) the health outcome (asthma diagnosis and symptoms, lung measures), and (c) the population age (children and adolescents). The exact search strategy can be found in Additional file 2.

Publications were identified via the Web of Science Core Collection. Identification of articles via reference list checking was not applied, as this could result in an overrepresentation of cited articles. The search was performed by BD and updated until 20 June 2017. There were no restrictions with regard to publication language or year.

Both empirical and non-empirical articles were included. Articles were included if they presented data or contained a statement on the association between swimming in indoor chlorinated pools and childhood asthma or other asthma-related health outcome measures. Articles on swimming as a recommendation to asthmatic patients were excluded, as well as those on swimming pool accidents. We made some minor deviations from the criteria described in the study protocol; the updated list can be found in Additional file 1. Selection of articles was conducted by two authors (BD and MJEU) followed by a consensus meeting. Agreement was reached in all cases.

Data extraction

A range of variables were extracted from each included publication. Data extraction was performed by two authors (BD and MJEU), followed by a consensus meeting. Agreement was reached in all cases. In addition, we developed measures for the within-network authority of the authors and for the occurrence of self-citations. These extracted and developed variables were classified in three distinct categories: article characteristics (study outcome, other content-related, not content-related), author characteristics, and citation characteristics.

Article characteristics––study outcome

We differentiated between two ways of looking at study outcome: data-based conclusion and authors’ conclusion. Selective citation based on either of these classifications of study outcome would signify citation bias.

The data-based conclusion was based on the asthma diagnosis results as reported in the result sections of empirical articles. Asthma diagnosis was assessed in different ways in the various articles. We ranked these assessments in order of decreasing validity: (1) physician’s assessment, (2) self-assessment, (3) occurrence of asthma symptoms, (4) positive lung tests, and (5) blood tests indicative of increased lung permeability. A data-based conclusion was scored as positive if a statistically significant, positive relationship of swimming in chlorinated water with asthma diagnosis was reported. In case of contradicting results, we used the asthma diagnosis with the highest validity. For instance, if blood tests showed a positive relationship with swimming but the physician’s assessment did not, the data-based conclusion was scored as negative.
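The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys and the function name are hypothetical, and the sketch reduces the outcome to positive versus negative, leaving aside any mixed or unclear category.

```python
# Validity ranking taken from the text: a lower rank means higher validity.
# The key names are hypothetical labels chosen for this sketch.
VALIDITY_RANK = {
    "physician_assessment": 1,
    "self_assessment": 2,
    "asthma_symptoms": 3,
    "lung_test": 4,
    "blood_test": 5,
}


def data_based_conclusion(results):
    """Score an empirical article from its reported asthma results.

    `results` maps an assessment type to True when a statistically
    significant positive association with swimming was reported for that
    assessment, and to False otherwise.
    """
    if not results:
        raise ValueError("at least one asthma assessment is required")
    # When assessments contradict each other, the most valid one decides.
    most_valid = min(results, key=VALIDITY_RANK.__getitem__)
    return "positive" if results[most_valid] else "negative"


# Example from the text: blood tests positive, physician's assessment not.
print(data_based_conclusion({"blood_test": True, "physician_assessment": False}))
```

Taking the minimum rank implements the tie-breaking rule directly, so the example from the text is scored by the physician's assessment alone.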

The authors’ conclusion was scored in a similar way, but then based on the authors’ interpretation of the results instead of the statistical significance of asthma diagnosis. The authors’ conclusion was extracted from the discussion or abstract of a publication.

Article characteristics––other content-related

The following variables were in this subcategory: article type (and study design), sample size, study quality, and specificity.

Article type was classified into empirical articles and non-empirical articles (narrative reviews and commentaries). For some analyses, empirical articles were further classified into study design: experimental studies and observational studies (such as cross-sectional, cohort, ecological, case control studies, case studies, and systematic reviews of observational studies).

Sample size concerned the number of underage participants in the articles (younger than 18). Narrative reviews had no sample size; the other study designs were classified into three categories of roughly equal size.

The study quality of cross-sectional designs was rated with the NIH National Heart Lung and Blood Institute’s assessment for cross-sectional designs [14]. According to this scale, articles could be classified as good, fair, or poor. Other designs were not rated since the vast majority of the empirical studies were cross-sectional and the other designs had a very low number of publications.

The specificity of the articles varied. Some articles dealt only with the statement under investigation (i.e., the relationship between swimming in chlorinated water and the development of asthma in children); others were broader (e.g., the health effects of swimming in the general population). The higher the specificity of an article, the better the article fits in the network. Specificity ranged from 1 (very broad) to 5 (highly specific) and was assessed based on the title of the article.

Article characteristics––not content-related

The following variables were in this category: language (English or not), conclusiveness of the title, funding source, number of authors, number of affiliations, number of references, and journal impact factor. Title conclusiveness was coded as yes if a clear outcome was stated in the title (e.g., “swimming and asthma are related” or “(...) not related”), otherwise as no (e.g., “swimming and asthma”). Funding source was coded as non-profit (e.g., government or university), for profit, both, or not reported. Journal impact factor, in the publication year of the potentially cited article, was retrieved from the Journal Citation Reports (JCR) database.

Author characteristics

The following variables were in this category: gender of the corresponding author (assessed by first name and/or salutation), country of the corresponding author, and affiliation of the corresponding author. Affiliation was classified as government, university, industry, or other.

Citation characteristics

There were some variables that depend on the cited article as well as the citing article: time to citation, authority, and self-citation. (For clarification: when we write about cited articles, citing articles, and citation paths, we refer to potential citations that may or may not occur.)

Time to citation was the number of years between the publication date of the cited article and the submission date of the citing article. This variable was also used to determine the dataset of potential citation paths (see “Statistical analysis” below).

As for the publication date, we used either the electronic publication date or the paper publication date, depending on which one was earlier. The average duration from submission to publication was 7 months in this network. Submission date was not always given. If submission date was missing, it was estimated by subtracting 7 months from an article’s publication date.
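The date handling described above can be sketched as follows. The function names are hypothetical, and two details are assumptions made for this sketch because the text does not specify them: the day of the month is clamped to 28 when subtracting months, and years are computed as days divided by 365.25.

```python
from datetime import date

AVG_SUBMISSION_LAG_MONTHS = 7  # network average reported in the text


def subtract_months(d, months):
    """Move a date back by whole months (day clamped to 28, a simplification)."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


def earliest_publication(electronic, printed):
    """Use whichever of the electronic or paper publication dates is earlier."""
    return min(d for d in (electronic, printed) if d is not None)


def submission_date(submitted, published):
    """Return the reported submission date, or estimate it when missing."""
    if submitted is not None:
        return submitted
    return subtract_months(published, AVG_SUBMISSION_LAG_MONTHS)


def time_to_citation_years(cited_published, citing_submitted):
    """Years between publication of the cited and submission of the citing article."""
    return (citing_submitted - cited_published).days / 365.25


# A citing article published on 15 March 2010 without a reported submission date.
print(submission_date(None, date(2010, 3, 15)))
```

A positive value of `time_to_citation_years` means the cited article was already published when the citing article was submitted.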

Within-network authority was a measure for the authority of the authors of a cited article within the network. It was calculated for each author and each year separately, by counting the number of within-network citations to all publications in which the author had been involved. As the number of citations is likely to increase each year, so does the author’s authority. Because we were interested in the authority at the moment of citation, the authority value of a cited article also depends on the publication year of the citing article. In case of multiple authors, we used the authority value of the author with the highest authority in that year.
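A minimal sketch of this authority measure is given below. The data layout and function names are hypothetical. The sketch also assumes that only citations from articles published strictly before the citing article's publication year count towards authority; the text does not state how the boundary year is handled.

```python
def author_authority(author, year, articles, citations):
    """Count within-network citations to the author's publications before `year`.

    `articles` maps an article id to {"authors": set, "year": int};
    `citations` is a list of (citing_id, cited_id) pairs that actually occurred.
    """
    return sum(
        1
        for citing, cited in citations
        if author in articles[cited]["authors"] and articles[citing]["year"] < year
    )


def article_authority(cited_id, citing_year, articles, citations):
    """Authority of a cited article: the highest authority among its authors."""
    return max(
        author_authority(author, citing_year, articles, citations)
        for author in articles[cited_id]["authors"]
    )


# Toy network with made-up authors, for illustration only.
articles = {
    "A": {"authors": {"Smith"}, "year": 2003},
    "B": {"authors": {"Jones", "Smith"}, "year": 2005},
    "C": {"authors": {"Lee"}, "year": 2006},
}
citations = [("B", "A"), ("C", "A"), ("C", "B")]
print(article_authority("B", 2008, articles, citations))
```

Because the value depends on `citing_year`, the same cited article can carry a different authority on each potential citation path.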

A self-citation was defined as a citation between two articles that have at least one author in common.

Statistical analysis

The dataset consisted of all potential citation paths between citing and cited articles. A potential citation path means that the cited article is published before submission of the citing article (i.e., time to citation has a positive value). The underlying assumption is that articles can only cite up to their submission date and can only be cited from their publication date onwards. This assumption was met for the entire network with one exception: one article had cited another article that was not yet published at the moment of submission of the citing article. The same authors were involved in both articles, which explains why they could be aware of the cited article before it was published. (This citation was not considered a potential citation and therefore excluded from our analyses.)
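The construction of this dataset can be illustrated with a short Python sketch. The authors built their dataset in R, so the code below is not their implementation; the record layout and names are hypothetical. It enumerates every ordered pair of articles, keeps the pairs with a positive time to citation, and flags self-citations using the shared-author definition given earlier.

```python
from datetime import date
from itertools import permutations


def potential_citation_paths(articles):
    """List all (citing, cited) pairs in which the cited article was
    published before the citing article was submitted.

    `articles` maps an id to {"published": date, "submitted": date,
    "authors": set}.
    """
    paths = []
    for citing, cited in permutations(articles, 2):
        gap_days = (articles[citing]["submitted"] - articles[cited]["published"]).days
        if gap_days > 0:  # time to citation must be positive
            shared = articles[citing]["authors"] & articles[cited]["authors"]
            paths.append({
                "citing": citing,
                "cited": cited,
                "time_to_citation_years": gap_days / 365.25,
                "self_citation": bool(shared),
            })
    return paths


# Toy records with made-up dates and authors, for illustration only.
toy = {
    "A": {"published": date(2003, 6, 1), "submitted": date(2002, 11, 1), "authors": {"Smith"}},
    "B": {"published": date(2005, 3, 1), "submitted": date(2004, 8, 1), "authors": {"Smith", "Jones"}},
    "C": {"published": date(2005, 1, 15), "submitted": date(2004, 6, 15), "authors": {"Lee"}},
}
for path in potential_citation_paths(toy):
    print(path["citing"], "->", path["cited"], path["self_citation"])
```

Each resulting row is one observation for the regression, with the dependent variable (cited or not) added once the actual citations are known.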

Our dependent variable was citation or, in other words, whether a potential citation path was used or not. We used the built-in algorithm of CitNetExplorer to determine whether a citation had occurred [15]. This algorithm makes use of reference lists that can be downloaded from the Web of Science Core Collection. The reference lists of all articles in the network were linked by the algorithm with the actual articles in the network. If possible, this linkage was done by DOI, a unique Digital Object Identifier assigned to most present-day articles; otherwise, it was based on a combination of first author’s surname, first author’s first initial, publication year, volume number, and first page number. Manual checking of the reference lists of the included articles showed that all classified citations were correct and that no citations were missed by the algorithm. The determinants of citation were the characteristics of the cited article as described above.
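The matching rule described above (DOI first, otherwise a composite key) can be sketched as follows. This is an illustration of the rule as stated in the text, not CitNetExplorer's actual code; the field names are hypothetical, and the choice to fall back to the composite key when a DOI is present but unmatched is an assumption of this sketch.

```python
KEY_FIELDS = (
    "first_author_surname",
    "first_author_initial",
    "year",
    "volume",
    "first_page",
)


def fallback_key(record):
    """Composite key used when no DOI match is available (None if incomplete)."""
    parts = tuple(str(record.get(field) or "").strip().lower() for field in KEY_FIELDS)
    return parts if all(parts) else None


def match_reference(reference, network):
    """Return the id of the network article a reference points to, or None."""
    doi = (reference.get("doi") or "").strip().lower()
    if doi:
        for article_id, article in network.items():
            if (article.get("doi") or "").strip().lower() == doi:
                return article_id
    key = fallback_key(reference)
    if key is None:
        return None
    for article_id, article in network.items():
        if fallback_key(article) == key:
            return article_id
    return None


# Made-up records, for illustration only.
network = {
    "art1": {"doi": "10.1000/example.1", "first_author_surname": "Alpha",
             "first_author_initial": "A", "year": 2003, "volume": 60, "first_page": 385},
    "art2": {"doi": None, "first_author_surname": "Beta",
             "first_author_initial": "B", "year": 2006, "volume": 12, "first_page": 101},
}
print(match_reference({"doi": "10.1000/EXAMPLE.1"}, network))
```

Matching on the DOI first avoids false links between articles that happen to share an author, year, and page number.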

Since each article could cite multiple other articles, the potential citation paths were related. Therefore, we used a multilevel approach in which the potential citations were nested under the citing article. Specifically, we performed a univariate random-effects logistic regression for each determinant of citation. We repeated these analyses while adjusting for article type.
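One common way to write such a model is as a random-intercept logistic regression. The notation below is an assumption made for illustration, since the exact model specification is not given in the text: y_ij indicates whether citing article j cites article i, x_ij is the determinant under study, z_ij is article type, and u_j is the random intercept of citing article j.

```latex
% Univariate model: one determinant x_{ij}, random intercept per citing article j
\operatorname{logit}\,\Pr(y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)

% Adjusted model: article type z_{ij} added as a covariate
\operatorname{logit}\,\Pr(y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} + \beta_2 z_{ij} + u_j

% Odds ratio for the determinant
\mathrm{OR} = \exp(\beta_1)
```

Under this formulation, the random intercept absorbs differences between citing articles in how many references they include overall.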

Where applicable, we also calculated whether the cited and the citing articles had the same characteristics (concordance). This would for instance be the case if positive articles would prefer to cite other positive articles and if negative articles would prefer to cite other negative articles. If citation would be based on the concordance of study outcome, it would be another sign of citation bias. To test if concordance on several characteristics has an impact on the likelihood of citation, univariate and adjusted fixed-effects logistic regression analyses were applied.

Software

We used the built-in algorithm of CitNetExplorer 1.0.0 to extract the actual citations between articles. We used R 3.2.4 to create a dataset with all potential citation paths, based on the data extraction sheet and the actual citations, and also to calculate the within-network authority, self-citation score, and time to citation for each potential citation path. Finally, we used Stata 13.1 to analyze the results and VOSviewer 1.6.0 to assess the co-authorship networks. CitNetExplorer and VOSviewer were also used to visualize the network.

Results

Thirty-six articles on the relationship between swimming in chlorinated water and the development of childhood asthma, published between 2002 and 2015, were identified via the Web of Science [6, 10, 12, 16–48] (see Additional file 3). There were 191 actual citations between these articles, out of a total of 570 potential citation paths (34%). The characteristics of the publications are depicted in Table 1. Of the 36 research articles on this topic, 14 were non-empirical articles, 13 cross-sectional studies, 3 cohort studies, 2 pre-experimental designs (without control group), and a few other designs (case study, ecological study, “multiple studies” study, meta-analysis). Notable is the relatively high number of narrative reviews in this network, compared to the data-generating, empirical articles. Sixteen articles concluded that there was support for the association between swimming and asthma, ten articles concluded that there was no support for this association, and ten articles presented a mixed or an unclear conclusion (see Additional file 4).

Table 1 Characteristics of all 36 articles in chlorinated water network

A ranking of the most cited articles and authors can be found in Table 2. Articles were cited between 0 and 26 times (with a median of 4). It seems that Bernard’s article from 2003 is generally seen as the first study that started this line of research [10]. There was an earlier publication in 2002 [16], but asthma was not explicitly mentioned, and one of the studies reported in this publication was also reported in the 2003 publication. (Both publications reported on multiple studies.) A more recent publication, by Font-Ribera, was cited 11 times out of 12 potential citations; in other words, almost all subsequent articles cited this publication [35]. The table further shows that, within this line of research, Bernard is clearly the most cited author. It also shows that 5 out of the 6 researchers with the highest authority work in the same research group.

Table 2 Top 6 of articles (above) and authors (below) within network, based on the number of received citations up to 2016

The results from the regression analyses of citation are shown in Table 3. We used odds ratios for the interpretation of the results. An odds ratio greater than 1 signifies an increased likelihood of citation. Originally, we had planned to adjust for article type and sample size as we consider both of them to be justified determinants of citation. However, article type and sample size were highly related and adjusting for both led to unstable results. Therefore, we only adjusted for article type.
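To make the interpretation of an odds ratio concrete, the sketch below computes a crude odds ratio with a 95% confidence interval from a 2×2 table, using the standard log-odds (Woolf) approximation. It is a simplified illustration only: it ignores the nesting of citation paths within citing articles and any adjustment, so it is not the model behind Table 3, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a: paths to articles with the characteristic that were cited
    b: paths to articles with the characteristic that were not cited
    c: paths to articles without the characteristic that were cited
    d: paths to articles without the characteristic that were not cited
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
odds_ratio, lower, upper = odds_ratio_ci(40, 60, 20, 80)
print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that lies entirely above 1 indicates that the characteristic is associated with an increased likelihood of citation.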

Table 3 Odds ratios (95% CIs) for the chance of being cited (all types of articles included, N = 36, n = 570)

First of all, if we look at the content-related article characteristics, there was some evidence for citation bias within this network: articles with a positive authors’ conclusion were cited more often than articles with a negative conclusion (odds ratio (OR) 1.8, 95% confidence interval (CI) 1.1–2.9). The data-based conclusion did not seem to impact the chance of citation. Articles with an empirical research design were cited more often than non-empirical ones (OR 4.2, CI 2.6–6.7), and a higher sample size was associated with a higher number of citations (OR 5.8, CI 2.9–11.6). Articles that specifically focused on our hypothesis under investigation, and thus fitted well in our network, were cited more often than those that did not (one step increase in specificity: OR 1.4, CI 1.2–1.7). However, the quality of the cross-sectional studies did not seem to have an impact on citation.

Other article characteristics that increased the number of citations were journal impact factor, number of authors, number of references, and reporting of a funding statement. Author characteristics that showed an impact on the number of citations were gender and region. The within-network authority of the cited article also increased the likelihood of citation (OR 4.1, CI 2.1–8.0), and there was strong evidence for self-citation in this network (OR 5.2, CI 3.1–8.8).

In addition, we tested whether articles with similar characteristics were more likely to cite each other. The results showed that concordance between articles did not have much impact on the likelihood of being cited, except possibly for concordance of study quality (Additional file 5).

We performed post hoc sensitivity analyses in which we excluded the narrative reviews and commentaries (with sample size being equal to 0) and adjusted for both study design and sample size (Additional file 6). The results were very similar to those of the complete data set. We performed an additional analysis in which we also excluded the ecological study and the case study (both with a wildly varying sample size). These results showed somewhat stronger evidence for citation bias, as studies with positive authors’ conclusions and data-based conclusions accumulated about three times more citations (Additional file 6). The results on the other determinants were fairly consistent with the previous findings.

The occurrence of citation bias was tested with two operationalizations of study outcome. The authors’ conclusion on the relation between swimming and asthma, as was often stated in the abstract, seemed to have an impact on citation, while the conclusion based on the underlying data seemed to have no impact. This implied that the authors’ conclusion and the data-based conclusion did not always concur. A post hoc chi square test showed that 40% of the negative results were interpreted as positive by the authors of those articles, while 0% of the positive results were interpreted as negative (data not shown; χ²(1) = 6.3, p = 0.012).

Also, because self-citation seemed abundant in this field, we wondered whether particular researchers or research groups were responsible for this. First we assessed the number of research groups based on co-author relations of authors with at least three publications. Two research groups could be distinguished (Fig. 1). We then coded group membership for each article depending on which groups the authors belong to. Twelve articles stemmed from Bernard’s research group, nine articles came from the other identified research group, and 15 articles were written by authors that belonged to neither group. There were no articles in which the authors from both research groups collaborated. Interestingly, 11 of the 12 articles from Bernard’s research group had a positive authors’ conclusion, whereas the other research group only published articles with a negative or mixed/unclear conclusion (Additional file 7). Stratified logistic regression analyses showed that, indeed, self-citation occurred in both research groups (Additional file 7). This finding was confirmed by the self-citation rates of individual authors: the top 6 of self-citing authors stem from both of the research groups (Additional file 7).

Fig. 1 Network visualization - authors. Each circle represents an author. The bigger the circle, the higher the number of publications. Each line represents a shared authorship. This graph shows that there are two main research groups active in this network. Authors with less than three publications were excluded

Discussion

Our research aim was to find out which determinants have an impact on the likelihood of being cited in the literature on swimming in chlorinated water and childhood asthma. We found that self-citation, a large sample size, and an empirical design play a major role: in this network, each is associated with four to six times higher odds of citation. Other determinants associated with at least a doubling of the citation odds are journal impact factor, reporting of funding information, number of authors, authority, geographical region, and number of references.

There is some evidence for citation bias in this field of research. Articles in which the authors concluded that swimming in chlorinated water and asthma were related are cited more often than those in which they did not. However, this finding is not robust and depends on the statistical test performed. Moreover, when we looked at the data-based conclusion rather than the authors’ conclusion, it did not have much impact on citation. This suggests that citation bias, if it occurs, is mostly based on interpretations and not so much on the underlying data. This would be in line with previous findings from our meta-analysis on citation bias, which showed that the authors’ conclusion has a higher impact on the likelihood of being cited than a data-based conclusion [2]. Still, the impact of study outcome on citation in the current network is small and not robust.

One explanation for why we did not find strong evidence for citation bias might be the small size of the network. It is easy to have an overview of all published work in this field, without relying on the people that one knows, or on references that one happens to find. Also, this network seems to be quite balanced between proponents and opponents of the pool chlorine hypothesis, if we look at the number of positive (15) and negative (10) articles. This contrasts, for instance, with the field of trans fatty acids and cholesterol, where 83% of all the published literature supports the view of an association, and citation bias was found to be much more prevalent (Urlings et al.: Citation bias in the trans fatty acid literature: evidence from a citation network analysis on the effect of industrially produced trans fat and cholesterol, submitted). Such balance in viewpoints secures the possibility of adequate contradiction so that no point of view disappears until the debate is settled.

This debate, on whether swimming in chlorinated water increases the chance of developing asthma later during childhood, could best be settled by means of longitudinal research. Nevertheless, the majority of empirical articles used a cross-sectional design, and one third of all articles on this topic are not even empirical. We identified only three prospective cohort studies within the network [35, 36, 46], one of which did not assess asthma per se but lung function in general. Of the remaining two studies, Voisin et al. concluded that they had found support for the positive association between swimming and asthma development [46], whereas Font-Ribera et al. concluded the opposite: that there was a smaller chance of current asthma for children who had swum more [35].

Our citation analysis has several limitations. First of all, the network is small relative to the number of predictors, which makes it vulnerable to chance findings. Similarly, some research design categories were represented by only one publication. Secondly, there was a high degree of multicollinearity between study design and sample size, which made it impossible to adjust for all the factors we had originally planned. We dealt with these two limitations by running several sensitivity analyses, with different adjustments and with inclusion of different research designs. Thirdly, asthma is difficult to diagnose, and in the literature, we found different ways to assess asthma, with a varying degree of validity. To make these articles comparable, we combined these different assessments into one measure for data-based conclusion on asthma.

Our network consists of publications identified by the Web of Science Core Collection. This database offers the option to download reference lists as part of the search output. Other databases, such as MEDLINE and Embase, do not provide this option yet. Because these reference lists are needed by the software we used (CitNetExplorer) to build a citation network, we did not include these other databases in our search strategy. It is therefore likely that we missed parts of the literature. However, we see no reason why the citation dynamics in the Web of Science Core Collection would be different from other databases. Nevertheless, it would be an improvement if reference lists were added to the search output of these other databases so that this literature can be included in future citation networks.

Conclusions

There is clear evidence of selective citation in the research field studying the relation between swimming in chlorinated water and childhood asthma: several factors have an impact on the chance of being cited. Authors particularly prefer to cite their own work over that of other authors on the same topic. The evidence for citation bias in this field was neither strong nor consistent. The impact of citation bias on the development of knowledge therefore seems limited in this field.