Introduction

It was at the end of the 1990s, while I served on the faculty of the University of Manchester, that I had a formal meeting with the Head of my department as part of the annual performance appraisal. During this meeting, he took a copy of the Journal Citation Reports, looked over the listing of neuroscience journals, and asked me: “Why do you not publish in Brain? It has a very high impact factor, and thus publications in this journal would help you to advance your career.” Although the answer to this question is obvious (Brain provides a platform for basic research and clinical studies in neurology, but not for investigations in my areas of research, neuroethology and comparative neurobiology), this anecdote exemplifies an ever-increasing trend to use the journal impact factor far beyond what it was originally designed for: evaluation of individual articles, investigators, institutions, and even whole countries. Despite its wide (ab)use, and the tremendous influence it has on the careers of scientists and on the research and funding landscape as a whole, most investigators and readers of scientific journals know surprisingly little about how the journal impact factor is calculated, what it can be used for, and where its limits are.

What is the journal impact factor?

The journal impact factor was devised by Eugene Garfield, founder of the Institute for Scientific Information in Philadelphia, Pennsylvania (now part of the Intellectual Property and Science business of Thomson Reuters, a for-profit corporation) (Garfield 1964, 1972). The term was first used in the 1961 Science Citation Index, which led to the publication of an annual by-product, the Journal Citation Reports. The Journal Citation Reports draw their data from the Web of Science database, which in its science edition covers more than 8,400 journals.

The journal impact factor of a given year is defined as the average number of times that articles published in the two preceding years were cited during that year. For example, the 2012 journal impact factor (JIF) for journal X is calculated as

$$\text{JIF} = A/B$$

where

A is the number of times articles published in journal X in 2010 and 2011 were cited by articles in indexed journals during 2012, and B is the total number of ‘citable items’ published by journal X in 2010 and 2011.

‘Citable items’ in the denominator include only primary research papers and review articles, but not ‘front matters’, such as editorials, letters to the editor, or other items, such as the ‘News and Views’ articles in Nature. Books and book chapters are not regarded as ‘citable items’. For the determination of the numerator, on the other hand, every citation to the journal’s content is counted, irrespective of its type, including, for example, citations to editorials or letters to the editor, a clear mismatch between the items counted in the numerator and the ‘citable items’ counted in the denominator. Furthermore, there is a clear bias toward publications in English, which applies both to the journals listed in the Web of Science database and to the articles they cite (Artus 1996; Winkmann et al. 2002).
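As a purely hypothetical illustration of the arithmetic: suppose journal X published 200 citable items in 2010 and 2011 combined, and that articles from those two years were cited 500 times by indexed journals during 2012. Its 2012 impact factor would then be

$$\text{JIF}_{2012} = \frac{A}{B} = \frac{500}{200} = 2.5$$

If some of these 500 citations referred to editorials or letters to the editor, they would still count toward the numerator even though the cited items never entered the denominator, which is precisely the mismatch described above.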

The journal impact factor: an unbiased measure of journal quality?

Almost since its inception, the journal impact factor has received mounting criticism addressing both the methodology by which it is calculated and the purposes for which it is used. One obvious flaw, mentioned above, is the mismatch between the items cited in the numerator and the ‘citable items’ counted in the denominator. An even more serious issue has been raised by independent audits of samples of the database used by Thomson Reuters to determine the journal impact factor (Rossner et al. 2007; Vanclay 2012). These evaluations revealed a number of weaknesses that call into question the integrity of the data and the transparency of their acquisition. The flaws included a substantial number of data-entry errors and incorrect article-type designations (such as the inclusion of ‘front matters’ in the denominator). As a consequence, values of the journal impact factor calculated from different databases were found to differ by as much as sixfold (Vanclay 2012).

A second weakness raising questions about the reliability of the journal impact factor as an indicator of journal quality is its vulnerability to editorial manipulation aimed at improving a journal’s ranking. In general, review articles attract more citations than original research articles, and thus publication of a larger number of reviews will inflate the impact factor. Similarly, editorials and letters to the editor that cite items from the journal’s own pages improve its citation record. Naturally, this option is available only to journals that have ‘front matters’ in place, such as Nature with its ‘News and Views’ section, putting them at a considerable advantage compared to their competitors without such an option. Some editors have taken such manipulation one step further: the editor of Leukemia, for example, has been accused of sending letters to authors who had submitted a paper to the journal, asking them to increase the number of references to papers published in Leukemia (Smith 1997).

A third feature of the journal impact factor that often leads to its misinterpretation is that the citations on which its calculation is based are highly skewed: a few articles in a journal are cited frequently, whereas many articles are rarely cited. In a self-assessment of its 2004 journal impact factor, Nature found that 89 % of the citations were generated by only 25 % of its papers (Nature Editors 2005). Its most cited paper from 2002–2003 received 522 citations, whereas the great majority of papers attracted fewer than 20 citations, markedly fewer than the journal impact factor of 32.2 would suggest. Similar, although less pronounced, skewed citation distributions are found for more specialized journals (Seglen 1997). Thus, the journal impact factor, being the arithmetic mean of the citations received by a journal’s citable items, is far from an adequate estimate of the number of citations that a typical paper attracts. From a statistical point of view, the mode or median would clearly provide a better estimate of this figure.
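A deliberately exaggerated, hypothetical set of citation counts makes the statistical point concrete: suppose a journal’s ten citable items were cited 1, 1, 1, 2, 2, 3, 3, 4, 5, and 78 times, respectively, during the census period. Then

$$\text{mean} = \frac{1+1+1+2+2+3+3+4+5+78}{10} = 10, \qquad \text{median} = \frac{2+3}{2} = 2.5$$

The single highly cited paper pulls the mean, and hence the impact factor, far above the citation count of the typical article, just as a few exceptionally cited papers do for Nature.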

The arbitrarily defined 2-year citation window causes a fourth problem: a bias toward disciplines in which a high proportion of citations go to rather recent literature. Such a rapid peak in citation rate is typical of many research areas in biochemistry and molecular biology. By contrast, disciplines like ecology, anatomy, comparative neurobiology, or mathematics are characterized by long-lived results that receive citations over many years following their publication; they are, therefore, clearly disadvantaged by the 2-year rule. Differences in the impact factors of journals representing these different disciplines may thus primarily reflect differences in the ‘durability’ of research results and in citation culture, rather than differences in the impact that the journals make in their own fields.

Taken together, the weaknesses inherent in the journal impact factor are so severe that a number of authors have concluded that this indicator is an “ill-defined and manifestly unscientific number” (Rossner et al. 2007, p. 3053) that “lacks transparency, repeatability and rigor” (Vanclay 2012, p. 16), and thus is an “outmoded surrogate for quality” (PLoS Medicine Editors 2006, p. 0708). Even on a less critical note, the above analysis clearly calls for caution when using the journal impact factor as an indicator of journal quality. For example, comparison of impact factors is not meaningful for journals with different editorial formats (e.g., pure review journals versus journals that publish both original research papers and review articles), or across different areas of research (e.g., neuroscience journals that publish in the broader field of biomedicine versus journals that focus on zoologically oriented physiology).

Despite the above criticism, one might assume that authors benefit from publishing in journals with a high impact factor, because the increased visibility and prestige of their articles should attract more citations. However, a comprehensive study by Seglen (1994) found that the citedness of journal articles was not influenced by the status of the journal in which they were published.

The journal impact factor as a surrogate of research assessment

If publication in high-impact journals does not appear to help authors garner more citations, what, then, remains as an incentive to publish in high-impact-factor journals? It is clearly the fact that the journal impact factor is widely used as a proxy for article quality, and thus as a measure to judge the performance of individual researchers, institutions, and even whole countries. I have witnessed renowned institutions that, as part of the faculty recruitment process, annotate each article on a candidate’s publication list with the journal impact factor, and that base tenure decisions, in part, on the impact factors of the journals in which a junior faculty member has published. I have also seen critiques in response to grant proposals submitted to national funding agencies in which a reviewer acknowledged the high quality of past research, but noted its publication in low-impact-factor journals as a weakness.

Such frequent abuse of the journal impact factor has even prompted the inventor of this indicator to warn against using impact factors as surrogates in faculty evaluations (Garfield 1996). The original purpose of the journal impact factor was to provide librarians and publishers with a tool for making informed decisions on journal subscriptions and advertising rates, respectively. It was never designed to assess the quality of any specific article in a given journal.

The pressure upon authors to publish in high-impact-factor journals leads to consequences unintended by the inventor of this metric. Authors are forced to submit their papers to journals with the highest impact factor instead of to journals that are best suited for promoting their research. High-impact-factor journals are thereby flooded with manuscripts, resulting in an excessive burden on editors and reviewers. The cycle of submission, review, and rejection often repeats several times before a manuscript is finally accepted for publication. On the other hand, many publishers and editors see it as their duty to implement editorial policies that will increase the journal impact factor. Scientists, confronted daily with the importance of the journal impact factor, feel pressured to work in areas of research that are highly populated, thereby increasing their chances of being cited by their peers. Such strategies effectively lead to the eradication of entire research fields in which it is difficult to gain credit by publishing in high-impact-factor journals.

How to put the genie back into the bottle

An increasing number of scientists, editors, and policymakers realize what devastating consequences the abuse of the journal impact factor has on the careers of individuals and on the science landscape. The most prominent call to stop using the journal impact factor as a measure of the scientific quality of individual articles has recently been made by a group of editors and publishers of scholarly journals. They developed a set of recommendations referred to as the San Francisco Declaration on Research Assessment (http://www.ascb.org/SFdeclaration.html). Its central recommendation is to refrain from using journal-based metrics, such as the journal impact factor, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions. The overwhelmingly positive response to this call gives reason to hope that the journal impact factor, which has evolved from a journal metric into an instrument of research assessment, will be returned to its original purpose. No doubt, authors will play a pivotal role in this process. What they need to do is what they have done for many decades: publish in the journals that have the highest impact in their area of research, regardless of their impact factors. The Journal of Comparative Physiology A will surely be among those journals in neurobiology, neuroethology, and sensory physiology.