Does the journal impact factor reflect the impact of German medical guideline contributions?

Contributions to medical guidelines have so far hardly been considered in the measurement and evaluation of research performance in Germany. We therefore examined 70 high-quality medical guidelines from the years 2017 and 2018 and investigated the type of publications cited by the guidelines, whether the citation rates of articles differ between those substantiating guideline recommendations and those cited in background text, and whether or not the Journal Impact Factor (JIF) is correlated with the guidelines’ citation frequency of individual journals. Our study found that the guidelines cited original articles much more often than books, reviews, or other guidelines. Slightly less than 10% of the citations came from the 2 years preceding guideline publication, and more than 50% of the references were at least 8 years old. A subsample showed that articles which only provided background information were cited less frequently outside the guidelines than those that substantiated a specific recommendation. Lastly, there was only a weak correlation (0.1 ≤ Tau ≤ 0.35) between the citation counts of individual journals in the guidelines and their respective JIFs, regardless of guideline subject. Our study suggests that the JIF is not an appropriate tool to assess the clinical relevance of medical research.


Introduction
In Germany, clinical researchers are incentivized to publish in high-ranking journals. These journals are ranked by their journal impact factor (JIF) score, which is calculated by dividing the number of citations a journal receives in a given year to the articles it published in the previous 2 years by the number of articles it published in those 2 years. On the one hand, the JIF has been praised for its objectivity as a scientific performance indicator as opposed to peer review networks. On the other hand, the deficiencies of the JIF, including its vulnerability to various forms of "gaming" and the fact that higher citation counts are less likely in lesser-known or more specialized research fields, are well documented (Alberts, 2013; PLoS Medicine, 2006). When evaluated by the JIF, authors are encouraged to pursue research on safe topics that a large number of others can reference, as opposed to unique or specialist work (Alberts, 2013). While the JIF measures impact through journal citations and provides evidence of scientific impact through the spread of citations in the scientific community, medical guidelines might provide insight into which research affects patient care (Eriksson et al., 2020) and inherently have societal impact through policy (Tousoulis & Stefanadis, 2014). However, most performance or evaluation measures do not acknowledge contributions to medical guidelines through either cited references or guideline authorship (Herrmann-Lingen et al., 2014).
The Association of the Scientific Medical Societies (AWMF) serves as the foremost publisher of medical guidelines in Germany. The AWMF is a professional, scientific network consisting of 180 member societies and three associated societies from the whole range of medical specialties and health-related areas. This network advises the government of the Federal Republic of Germany and the governments of the German federal states on all topics of scientific medicine and medical research and classification. It represents Germany in the Council for International Organizations of Medical Sciences (CIOMS). In 1995, the Advisory for Concerted Action in Health Care requested that the AWMF and its scientific members create a quality-controlled collection of medical recommendations. Hereafter, these recommendations are called the AWMF medical guidelines. These guidelines are an accumulation of scientific research intended to provide relevant diagnostic, preventative, and treatment information (Kryl et al., 2012), which helps to bridge the gap between research and practice (Burgers et al., 2002). Ovseiko et al. (2012) suggest that citation in the guidelines demonstrates utility. It is difficult to argue against incentivizing utilizable research, because it benefits researchers, the public, and policy and provides health gains (Hanney et al., 2003). While contributions to the medical guidelines might be considered a worthy measure of societal impact to be incentivized, evidence-based policy change is challenging and research intensive (Ovseiko et al., 2012). Considering the challenges involved in policy change, this article explores to what extent current performance indicators (JIF) already capture the impact of clinical guideline contributions by investigating whether and to what extent the journal impact factor of a reference is relevant in guideline development.
We hypothesize that (a) the 2-5 year JIF window does not reflect the "knowledge cycle" from article production to inclusion in the AWMF medical guidelines (Grant et al., 2000); (b) articles cited as justification for specific guideline recommendations are typically also cited more often by other publications than articles only mentioned in the guideline background text; and (c) the number of citations a journal receives in the guidelines is not correlated with its JIF. Addressing these issues should provide clarity on whether or not current evaluation solutions are sufficient to reflect contributions to the AWMF medical guidelines as a measure of impact.

Methods
We identified high-quality medical guidelines from the AWMF website. Following the identification of the appropriate guidelines for this study, we explored the type of cited references (books, articles, or reviews) through a reference analysis. An additional reference analysis was performed by extracting a sample of the guidelines and reviewing the literature, which either only provided background information or was used for supporting recommendations. Through a publication date analysis, we investigated how long after publication articles were considered relevant. We were mainly interested in how many references had been published during the 2 years preceding guideline publication ("2-year JIF window"). Lastly, we investigated the relevance of the JIF for selecting references for inclusion in the guidelines by performing correlation tests between the citation count of articles from particular journals and the JIF of publications referenced by the guidelines as a whole and between the citation count and the JIF of publications within topic-related guidelines.
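The publication date analysis described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the study: the function name `age_shares`, the input format of (guideline year, publication year) pairs, and the choice to count a reference as inside the window when its age is at most two calendar years are assumptions made for this example, and the sample data are invented.

```python
def age_shares(references, window=2, old_threshold=8):
    """Return (share inside the window, share of old references).

    references: iterable of (guideline_year, publication_year) pairs.
    Age is measured in whole calendar years, which is an assumption
    of this sketch rather than a rule stated in the paper.
    """
    ages = [guideline_year - publication_year
            for guideline_year, publication_year in references]
    if not ages:
        raise ValueError("no references given")
    # References published at most `window` years before the guideline
    recent = sum(1 for age in ages if 0 <= age <= window)
    # References that are at least `old_threshold` years old
    old = sum(1 for age in ages if age >= old_threshold)
    return recent / len(ages), old / len(ages)


# Invented example data: (guideline year, publication year of the reference)
sample = [(2018, 2017), (2018, 2016), (2018, 2010), (2017, 2005), (2017, 2014)]
recent_share, old_share = age_shares(sample)
```

Changing `window` to 5 would give the corresponding share for a 5-year window under the same assumptions.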

Medical guidelines
We identified the S2e and S3 medical guidelines from 2017 and 2018 as high-quality guidelines relevant to our study (see Online Appendix A). These guidelines are categorized as either "Evidence-based" (S2e) or "Evidence- and consensus-based" (S3; see Table 1). They were extracted in January 2019 from the AWMF website (https://www.awmf.org/leitlinien/leitlinien-suche.html). The medical societies in charge and the guidelines they coordinated were documented based on the subject code of each guideline. The number of guidelines by medical societies in charge can be found in Online Appendix B.
The guidelines are available as PDF documents and have neither metadata nor separate, automatically extractable reference lists. The references were therefore extracted, using the 'PDF-XChange Editor' (Tracker Software Products (Canada) Ltd., Version 7), into an Excel sheet and then imported into 'Citavi' (Swiss Academic Software GmbH, Version 6.3), a reference manager. The reference manager matched the imported references with online databases (mainly PubMed and Crossref) and filled in relevant bibliographic data (authors, title, journal, year published, affiliations, etc.) when possible. References that were not matched in Citavi were revised by hand and completed when possible. The references were organized by the guidelines in which they were cited.

Reference and publication date analyses
Table 1 Description of the two evidence levels of the included guidelines (https://www.awmf.org/leitlinien/awmf-regelwerk/ll-entwicklung/awmf-regelwerk-01-planung-und-organisation/po-stufenklassifikation.html)
The reference manager's various search fields identified the type of reference. The Review, Guideline, and Book categories were determined by searching in the 'title' field for 'review,' 'guideline,' or 'Leitlinie' and in the 'reference type' field for 'book' and 'book, [...] (Table 2) and separately (Online Appendix C).

Background vs. recommendation
A sample of the guidelines, characterized by author affiliation to at least one of three pilot faculties in Germany, was taken. References of the guidelines authored by members of at least one of these affiliations were qualitatively reviewed for whether they directly supported clinical recommendations or whether they were only cited in the background text, providing context to the recommendations.

Top journal citations: JIF correlations and representation in the guidelines
The references from the top 100 cited journals in the medical guidelines were counted. The journals, sorted by citation count, can be found in the appendix with their respective 2017 (or last recorded) JIFs, citation and JIF rankings, and their relative and cumulative citations (Online Appendix D). We assessed to what extent the number of citations correlates with the journal impact factor for both the top 50 and the top 100 cited journals. Lastly, the relative and cumulative citations were calculated for each of the journals in the top 100 journal data set.
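To make the terms concrete, the relative and cumulative citation shares can be computed as in the following sketch. It is an illustration under stated assumptions only: the function name `citation_shares`, the dictionary input, and the tie-breaking by journal name are hypothetical choices, and the journal names and counts are invented.

```python
def citation_shares(journal_counts, total_citations=None):
    """Rank journals by guideline citations and add relative/cumulative shares.

    journal_counts: dict mapping journal name -> number of guideline citations.
    total_citations: denominator for the shares; defaults to the sum of the
    given counts (pass the overall total when only the top journals are given).
    """
    total = sum(journal_counts.values()) if total_citations is None else total_citations
    if total <= 0:
        raise ValueError("total number of citations must be positive")
    # Sort by descending citation count; ties are broken alphabetically
    ranked = sorted(journal_counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    cumulative = 0
    for rank, (journal, count) in enumerate(ranked, start=1):
        cumulative += count
        rows.append({
            "rank": rank,
            "journal": journal,
            "citations": count,
            "relative_pct": 100 * count / total,
            "cumulative_pct": 100 * cumulative / total,
        })
    return rows


# Invented example counts
rows = citation_shares({"Journal A": 50, "Journal B": 30, "Journal C": 20})
```

Passing `total_citations` explicitly matters when the table covers only the most cited journals, because the shares are then meant relative to all guideline citations rather than to the listed subset.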

Top journal citations: JIF correlations within guideline categories
We randomly selected three guidelines from each of the five medical societies that have developed the most guidelines (see Online Appendix B). For each guideline, the references were sorted by citation count, and when possible, the journal impact factor of the guideline's publication year for each cited journal was recorded. For journals with no impact factor, we used a value of 0.2 as recommended by the AWMF for purposes of research fund allocation. For further analysis, we computed Kendall's Tau correlations between citations and their respective journal impact factors. Correlations were calculated not only for the individual guidelines in the specific category but also for the entire guideline subject.
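The correlation step can be illustrated with a small, dependency-free sketch. The paper does not state which variant of Kendall's Tau was used or which software computed it; the code below assumes the tie-corrected tau-b and counts pairs directly. The function names `kendall_tau_b` and `tau_for_guideline` are hypothetical, and the journal data are invented. Only the fallback value of 0.2 for journals without an impact factor comes from the text above.

```python
from math import sqrt

DEFAULT_JIF = 0.2  # fallback for journals without an impact factor (from the text)


def kendall_tau_b(xs, ys):
    """Kendall's tau-b (tie-corrected) by direct O(n^2) pair counting."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            if dx == 0:
                ties_x += 1  # tied only in x
            elif dy == 0:
                ties_y += 1  # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denominator = sqrt((concordant + discordant + ties_x)
                       * (concordant + discordant + ties_y))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


def tau_for_guideline(journal_citations, jif_by_journal):
    """Correlate per-journal citation counts with JIFs for one guideline."""
    journals = sorted(journal_citations)
    counts = [journal_citations[j] for j in journals]
    jifs = [jif_by_journal.get(j, DEFAULT_JIF) for j in journals]
    return kendall_tau_b(counts, jifs)


# Invented example: "Journal D" has no recorded JIF and receives 0.2
citations = {"Journal A": 10, "Journal B": 5, "Journal C": 2, "Journal D": 1}
jifs = {"Journal A": 3.0, "Journal B": 8.0, "Journal C": 1.5}
tau_example = tau_for_guideline(citations, jifs)
```

Because many journals are cited equally often within a single guideline, a tie-corrected coefficient seems a natural choice here, but this is an assumption of the sketch; significance testing is omitted.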

Results

Medical guidelines
Seventy medical practice guidelines that met the criteria were found (see Online Appendix

Reference and publication date analyses
A total of 33,473 references were extracted from the 70 guidelines. Of these, 31,894 could be specified with at least basic information, including title, author(s), and year of publication. A total of 26,577 references included additional publication data for analysis. In the guidelines, 86.2% of references were original articles; reviews represented 9.49% of the references that included sufficient publication data, books 3.73% of cited references, and guidelines 3.39% of references.
[...] (Fig. 4). However, the correlation was only of moderate size (Tau = 0.35) and further decreased to Tau = 0.31 when excluding the top 3 journals with extraordinarily high JIFs. The effect of outliers was even more extensive if only the top 50 journals were considered, where numbers of citations showed only small correlations with the JIF (Tau = 0.22 overall and Tau = 0.10 without outliers). Interestingly, the two journals most cited by the guidelines (Journal of Clinical Oncology, Cochrane Database of Systematic Reviews) show much lower JIFs than the typical "high impact" journals.

Guideline representation
Lastly, we explored relative and cumulative citations. The top 50 journals in the guidelines represent 44.02% of guideline citations, and the top 100 journals in the guidelines represent 57.95% of guideline citations. All journals after the 11th citation-ranked journal each represent less than 1% of the total citations in the guidelines. Journals after the 100th citation-ranked journal each represent less than 0.192% of the total citations in the guidelines. The relative and cumulative citations for the most cited 100 journals can be found in Online Appendix D.

Guideline categories
Kendall's Tau correlations between citation count and JIF were calculated for three guidelines from each of the subject areas 021 (Gastroenterology, Digestive and Metabolic Diseases), 032 (Cancer), 043 (Urology), 053 (General and Family Medicine), and 083 (Dentistry, Oral, and Maxillofacial Medicine). These correlations and the years of guideline publication can be found in Table 3. Figure 5a and b show scatterplots of the most cited journals in the sample and their respective journal impact factors, now broken down to the level of individual guidelines. As can be seen, most correlations for individual guidelines or subject areas were also in the low positive range and mostly not significant. Some of the correlations were even negative, though also not significantly so. Only one of the guidelines (053-024) showed a significantly positive correlation of moderate size (Tau = 0.497; p = 0.001).

Discussion
Our study analyzed citations by German high-quality medical guidelines. Because these guidelines have not yet been studied from a bibliometric perspective, we found relevant new information on the temporal distribution of guideline references, citation frequencies for references in background text vs. clinical recommendations, and correlations between journal citation numbers by the guidelines and the respective journals' impact factors. The Web of Science shows decreasing citation trends 2-3 years after publication (Eriksson et al., 2020). We found, however, that although recent publication years are cited more often in the AWMF clinical guidelines than older ones, less than 10% of references were from the 2 years preceding guideline publication, and more than 50% of guideline references were more than 8 years old. These findings are in agreement with Tousoulis and Stefanadis (2014), who also report that older publications are still frequently cited by medical guidelines, which testifies to their continued relevance beyond the JIF window.
In the development of the AWMF clinical guidelines, journal articles are preferred over books, reviews, and other guidelines. The guidelines also cite about twice as many references for background information as for direct recommendations, although references supporting direct recommendations are cited more often externally. Also, when looking at the guidelines as a whole, there was only a weak correlation (Tau ≤ 0.35) between guideline citation numbers for particular journals and the JIF. More importantly, correlations are very low when looking at most guidelines from a guideline-specific or subject-specific perspective. Furthermore, references in the guidelines maintain usefulness regardless of their impact factor. For example, the Cochrane Database of Systematic Reviews had a moderate 2017 journal impact factor of 6.754. However, it is cited 811 times in the guidelines, making it the second most-cited source journal. On the other end of the spectrum, CA-A Cancer Journal for Clinicians boasts a JIF of 244.585 (2017) but is cited only 18 times in the guidelines and did not even make it into the top 100 journals. This suggests that authors of the guidelines scan through hundreds of different sources, independent of JIF, to meet their needs.

Strengths and limitations
The strength of our analyses lies in the full coverage of German high-quality guidelines from two successive years, yielding a good representation of guidelines across the whole field of medicine and of the references cited by them.
One issue in bibliometric studies is the accuracy and completeness of the retrievable data, especially when it comes to dated research. While we were able to retrieve an extraordinary number of references, their completeness and accuracy are not perfect. Some references were missing data, and the Citavi reference manager may have introduced some errors through its automated recognition of sources. Determining to what extent the downloaded metadata of the references are accurate would require a manual check of over 33,000 references. Such a manual check might also provide some insight into what types of data are missing.
Moreover, the AWMF medical guidelines currently lack metadata or labeled information. The references of the guidelines are not marked with successor-predecessor indicators and each guideline would need to be reviewed by hand to determine which references continue to be relevant, which references newly contribute to the guideline topic, or which references are simply carried over from previous versions.
In agreement with Eriksson et al. (2020), we conclude that the AWMF medical guidelines, like other clinical practice guidelines, would benefit from well-managed and well-labeled references in a digital database. Not only would this provide a sense of transparency and help readers of the guidelines quickly review changes, but it would also help interested parties analyze the guidelines and provide a richer source for impact evaluation.

Conclusion
Whether or not it fully captures the scope of research impact, the journal impact factor plays a role when evaluating research or researchers. Investigating to what extent it can be used to measure or evaluate other forms of impact must be a starting point when discussing research evaluation. With that said, our first findings suggest that the temporal distribution of guideline references exceeds the 2 or even 5 year "JIF window." The relevance of older and newer articles could be reviewed with comprehensive labeling, review, or justification of the use of older citations (e.g., this citation continues to be relevant because the treatment continues to be used widely in clinics across Germany). This inquiry in particular could provide insights into not only how the guidelines are updated but also how long research continues to 'impact' patient care.
Also interesting is the shift in which research is cited more often. As it stands, research supporting clinical recommendations is cited more often externally and is therefore more likely to be rewarded. It is not inconceivable that integrating a medical guideline performance indicator could balance the incentive, to some extent, toward producing quality research whether it provides clear clinical recommendations or not.
Altogether, our study found that the Journal Impact Factor plays no relevant role in the medical guidelines, either with regard to publication dates or as a criterion for the selection of references. In order to capture the impact of guideline references, a new indicator should be considered and developed that reflects not only the "knowledge cycle" (Grant et al., 2000) from article to guideline publication, but also the societal impact. While medical guidelines innately demonstrate societal value, further study might show that the references cited by the AWMF guidelines, like those of many other clinical guidelines, have higher citation rates than other articles in their respective journals (Thelwall et al., 2017), which would favor evaluating the citations of the references rather than their JIF.
Funding Open Access funding enabled and organized by Projekt DEAL. The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. This work was supported by the Federal Ministry of Education and Research (BMBF) under the funding code 01PU17011B.

Declarations
Conflict of interest Christopher Traylor declares no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. CHL is receiving research funding from University of Göttingen Medical Center, partially based on indicator-based research allocation. During the past years he has received a lecture honorarium from Pfizer and Novartis and royalties from Hogrefe Publishers for the German version of the Hospital Anxiety and Depression scale. He is also receiving research funding from the German Ministry of Education and Research, the German Research Foundation, and the European Commission.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.