Pritchard (1969) coined the term “bibliometrics” to mean the application of quantitative analysis and statistics to scholarly outputs, such as journal articles, citation counts, and journal impact. September 1978 marked the debut of the journal Scientometrics. Scientometrics, a broader concept covering the quantitative features and characteristics of science and scientific research, is attributed to Vassily V. Nalimov by Hood and Wilson (2001), who examine the similarities and differences among bibliometrics, scientometrics and informetrics. Webometrics is now considered a distinct approach to research rankings. Originally coined by Almind and Ingwersen (1997), the term refers to the application of bibliometric techniques to the web. Webometrics entered the mainstream with the December 2004 special issue of the Journal of the American Society for Information Science and Technology. Table 1 tracks the leading universities and countries producing bibliometric literature. 1969 was the first year that WOS included articles on bibliometrics, and the number has increased every year since. Papers on “bibliometrics and university rankings” account for about 10 % of all bibliometric papers.
Between the first article on bibliometrics in 1969 and October 2, 2013, there were 4474 articles in WOS and 5760 in SCOPUS, with almost 34,000 citations in WOS and 54,800 in SCOPUS (searching bibliometric* as a topic in WOS and as a keyword in SCOPUS). Fig. 1 illustrates growth by decade.
No matter what term is used, the rankings are only as good as one’s understanding of the underlying measurements described below, and anyone using a ranking should check its documentation and methodology. The earlier rankings used peer review (now referred to as “reputation”) and countable output, such as journal articles in a group of “top” journals, proceedings, number of actual pages, number of normalized pages based on characters per page, or doctoral degrees by school (Cleary and Edwards 1960). Some give full credit to each author; some distribute fractional credit across the authors’ schools; a few credit only the first author, as the sketch below illustrates. Peer review may cover one to three years; other output measures cover anywhere from one year to decades. Article counts may include book reviews, editorials and comments. All of these methods have their strengths and weaknesses. To select the international research university ranking that reflects an organization’s needs today, it is necessary to understand the bibliometrics that are used.
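As an illustration, the following is a minimal sketch of the three credit-allocation schemes just described (full counting, fractional counting and first-author counting); the function name and the affiliation data are hypothetical, and real rankings layer further rules, such as FTE adjustments, on top of such counts.

```python
from collections import defaultdict

def credit_by_school(papers, scheme="full"):
    """Allocate publication credit to schools under three common schemes.

    papers: one affiliation list per paper, in author order,
            e.g. [["MIT", "Yale"], ["Yale"]] (hypothetical data).
    """
    counts = defaultdict(float)
    for affiliations in papers:
        if scheme == "full":          # every author slot earns a full count
            for school in affiliations:
                counts[school] += 1.0
        elif scheme == "fractional":  # one count per paper, split evenly
            share = 1.0 / len(affiliations)
            for school in affiliations:
                counts[school] += share
        elif scheme == "first":       # only the first author's school counts
            counts[affiliations[0]] += 1.0
    return dict(counts)

papers = [["MIT", "Yale", "MIT"], ["Yale"]]
print(credit_by_school(papers, "full"))        # MIT: 2.0, Yale: 2.0
print(credit_by_school(papers, "fractional"))  # MIT: ~0.67, Yale: ~1.33
print(credit_by_school(papers, "first"))       # MIT: 1.0, Yale: 1.0
```

The same two papers thus yield three different institutional totals, which is one reason rankings built on different counting rules are not directly comparable.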
Eugene Garfield’s (1955) proposal for the Science Citation Index laid the groundwork for the change from qualitative, manually counted scholarly output to the new era of citation metrics. He originally positioned citation indexes as a subject approach to the literature and a way to check the validity of an article through its cited references. In 1963, he wrote about the value of using citation data for the evaluation of publications (Garfield and Sher 1963). By 1979, in an article in volume one of Scientometrics, he raised concerns about using citations as an evaluation tool that today’s researchers are still examining, such as negative citations and self-citations, the counting of multiple authors, and the disambiguation of author names (Garfield 1979).
Today bibliometrics is a primary tool for organizations, such as universities and government bodies, to measure research performance. Widespread use of bibliometrics is possible because of easy access to articles, citations and analytical tools in both Thomson Reuters’ Web of Science (WOS) and Elsevier’s Scopus. Many individuals turn to Google Scholar as well.
Measurement in today’s academic environment is evidence-based; as Leung (2007) notes, “There is now mounting pressure all over the world for academics to publish in the most-cited journals and rake in as many citations to their work as possible”.
Individuals, researchers, departments, universities and outside bodies are all counting output. Departments employ bibliometrics to evaluate faculty for hiring, tenure and promotion decisions, using numbers of publications, citation counts, journal impact and additional tools such as the H-Index. Academic output such as articles and citations provides the data for internal and external benchmarking. Universities increasingly use bibliometrics for reporting output to governments and stakeholders, and country-level benchmarking and comparisons rely on bibliometrics as well.
International data in any field pose problems of standardization and cross-country comparison. University research rankings, which combine quality measures such as peer review with metrics, compound these issues. Usher (2009) notes that “as rankings have spread around the world, a number of different rankings efforts have managed to violate every single one” of the rankings principles. Federkeil (2009) adds that “The only field typified by valid international indicators is research in the natural and life sciences….” He also notes that there is no “valid concept for a global ranking of teaching quality…”
Even if rankers agree to use a standard source for tracking articles or citations, there is no consensus on how to count multiple authors. Abramo et al. (2013b) studied the multi-author issue and suggested a further weighting based on how much each author contributed to the research. Other counting questions arise over authors who have changed universities, and over whether to use a total figure, which favors large institutions, or a per-faculty count, which favors smaller institutions. A per-faculty denominator has issues of its own, however: whom to count as faculty and how to calculate full-time equivalents (FTEs).
It is necessary to understand the strengths and weaknesses of each bibliometric tool when analyzing and applying it to real-world situations, and it is important to check the methodology, including definitions and weightings, when comparing rankings or making time-series comparisons with the same tool. Table 2 organizes the most commonly used bibliometrics for research assessment by what they measure and which sources use them.
The H-Index balances quality against quantity within a given database: an author has an H-Index of h when h of his or her papers have at least h citations each. For example, if an author has 44 papers in SCOPUS with 920 citations and the 16th most-cited paper has 16 citations, the H-Index is 16; if the same author has 36 papers in WOS with 591 citations and the 13th most-cited paper has 13 citations, the H-Index in WOS is 13. The same author created an author ID in Google Scholar, which tracks articles and citations; there the author has 65 publications and 1921 citations, and the 21st most-cited article has 21 citations, for an H-Index of 21.
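The computation is easy to replicate. The sketch below ranks papers from most to least cited and finds the largest h such that h papers have at least h citations each; the citation values are illustrative, not the actual distribution behind the example above.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    while h < len(ranked) and ranked[h] >= h + 1:
        h += 1
    return h

# 44 papers where the 16th most-cited has 16 citations (illustrative values)
cites = [50, 40] + [16] * 14 + [3] * 28
print(h_index(cites))  # 16
```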
Other approaches use weighted averages or scores, per-capita output, and output normalized to subject or country norms; they may also adjust for multiple authors from different organizations. Metrics should be stable and consistent in order to measure changes over time, and replicable from the underlying inputs so that users can verify them. A subject-normalization sketch follows.
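Subject normalization is often implemented by dividing a paper’s citation count by the average citation rate of its field, as in this minimal sketch; the field baselines here are invented for illustration.

```python
# Hypothetical field baselines: mean citations per paper in each field.
FIELD_AVERAGE = {"cell biology": 25.0, "mathematics": 4.0}

def field_normalized_impact(citations, field):
    """Citations relative to the field's average citation rate:
    1.0 means exactly the field average, 2.0 twice the average."""
    return citations / FIELD_AVERAGE[field]

# Twelve citations is below average in cell biology but strong in mathematics.
print(field_normalized_impact(12, "cell biology"))  # 0.48
print(field_normalized_impact(12, "mathematics"))   # 3.0
```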
One of the most controversial metrics is the Journal Impact Factor from Thomson Reuters’ Journal Citation Reports (Werner and Bornmann 2013). Concern among publishers, editors and researchers about the overuse of this metric in the evaluation of faculty led to DORA, the San Francisco Declaration on Research Assessment (San Francisco 2013), the outcome of the December 2012 meeting of the American Society for Cell Biology. Not only is there concern about the misuse of the impact factor as a rating instrument but also about its effect on scientific research itself. Alberts (2013) notes that the impact factor encourages publishers to favor high-impact disciplines such as biomedicine and discourages researchers from taking on risky new work, which takes time to reach publication.
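The standard two-year impact factor itself is a simple ratio: the citations a journal receives in year Y to items it published in years Y-1 and Y-2, divided by the number of citable items it published in those two years. A minimal sketch with hypothetical counts:

```python
def impact_factor(cites_in_year_y, citable_items):
    """Two-year impact factor for year Y: citations received in Y to
    items published in Y-1 and Y-2, divided by the citable items
    (articles and reviews) published in Y-1 and Y-2."""
    return cites_in_year_y / citable_items

# Hypothetical journal: 150 citable items in 2011-2012 drew 600
# citations in 2013, giving a 2013 impact factor of 4.0.
print(impact_factor(600, 150))  # 4.0
```

Much of the controversy stems less from this arithmetic than from applying a journal-level average to the evaluation of individual papers and people.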
The JCR impact factor is being challenged by newer measures of journal quality that are appearing in university ranking scores. These include the Eigenfactor, SNIP and SJR, all of which are freely available on the web. The Bergstrom Lab (2013) at the University of Washington developed the Eigenfactor, under which journals are considered influential if they are cited often by other influential journals; the Eigenfactor is now incorporated into Journal Citation Reports. SNIP, Source Normalized Impact per Paper, from Leiden’s CWTS, measures contextual citation impact by weighting citations based on the total number of citations in a subject field: a single citation is given higher value in subject areas where citations are less likely, and vice versa. SCImago’s SJR2 recognizes the value of citations from closely related journals (Journal Metrics 2012). A sketch of the Eigenfactor’s central idea follows.
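The Eigenfactor’s circular definition (influential journals are those cited by influential journals) is resolved by an eigenvector computation on the journal citation network, much as in PageRank. Below is a minimal power-iteration sketch on a toy three-journal citation matrix; the published Eigenfactor algorithm adds refinements omitted here, such as a damping factor, weighting by article counts, and a five-year citation window.

```python
import numpy as np

# Toy citation matrix: C[i, j] = citations from journal j to journal i.
# Self-citations are excluded, as in the Eigenfactor method.
C = np.array([[ 0.0, 30.0, 10.0],
              [20.0,  0.0, 40.0],
              [ 5.0, 10.0,  0.0]])

# Column-normalize so each citing journal distributes one unit of influence.
M = C / C.sum(axis=0)

# Power iteration: a journal's influence is the influence-weighted
# sum of the citations it receives.
influence = np.full(3, 1.0 / 3.0)
for _ in range(100):
    influence = M @ influence
    influence /= influence.sum()

print(influence.round(3))  # [0.37, 0.444, 0.185]: the second journal leads
```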
New tools using webometrics and altmetrics, which incorporate social media, question the old model of scholarly impact (Konkiel 2013). The growing bodies of literature on webometrics and altmetrics lie beyond the scope of this article. Björneborn and Ingwersen, in a special webometrics issue of the Journal of the American Society for Information Science and Technology, warned against taking the analogy between citation analysis and link analysis too far (Björneborn and Ingwersen 2004). However, we can no longer ignore the role of the web in academic research.
Despite the rise of alternative measures of scientific output, Web of Science (WOS) and Scopus remain the two major English-language commercial bibliographic sources used by the research rankings. WOS is the current iteration of the original Science Citation Index. The full suite of databases includes Science Citation Index (SCI-E, from 1900), Social Science Citation Index (SSCI, from 1900) and Arts & Humanities Citation Index (A&HCI, from 1975), along with conference proceedings and book databases in the sciences and social sciences. An institution can subscribe to any or all of the databases, for as many years of coverage as it can afford. WOS has two search interfaces: General Search and Cited Reference Search. General Search includes only those articles that WOS indexes; each record lists the references in the article and the times the article has been cited by other WOS publications. This interface supplies the institution-level data used in the rankings, and users can create their own rankings with its analysis tools, ranking output by subject area, document type, leading authors, source titles, institutions and countries. Each author’s information (institution, country) receives one count, though not all articles include addresses. An H-Index is also calculated. The Cited Reference Search includes all citations in WOS articles from any reference source and is primarily used for data on individual researchers. Until the end of 2011, Thomson provided a listing of highly cited papers that was also used in international rankings; this is now part of Essential Science Indicators, a separate subscription service. Thomson Reuters also publishes Science Watch, covering metrics and research analytics; registration is required (Thomson-Reuters 2013).
Elsevier’s SCOPUS began in late 2004 and includes citations received since 1996. A subscription covers all document types for all available years across four broad subject areas: Health Sciences, Physical Sciences, Life Sciences and Social Sciences. Added features are author and affiliation searches, analysis of citing journals, authors and institutions, and an H-Index. Elsevier publishes Research Trends, a quarterly newsletter that provides insights into research trends based on bibliometric analysis, with articles on different aspects of ranking, from assessing the ARWU (Shanghai Ranking) to explaining the soon-to-be-released U-Multirank (Research Trends 2008).
Google Scholar is the third and most controversial source of citations. The search engine has improved since authors such as Peter Jacsó exposed the errors that limited the use of Google Scholar for comparative evaluation purposes (Jacsó 2008). Today Scholar has an advanced search feature for searching by author name; it has improved its ability to differentiate dates from citation counts; it allows downloads to bibliographic software; it has its own metrics for measuring journal quality; and it now links to article citations on publisher pages. It still lacks the editorial control of WOS and Scopus, a controlled vocabulary with subject terms, and any information on how articles and citations are selected for inclusion. Meho and Yang (2007) discuss the impact of data sources on citation counts and provide a balanced review, while pointing out the thousands of hours required for data cleansing with Google Scholar.
All three systems have mechanisms for authors to identify themselves, their affiliations and their publications if they choose to do so. Researchers may also create a single unique ID through ORCID (http://orcid.org).
WOS and SCOPUS understate the numbers of articles and citations, especially for universities that are not strong in the sciences, and SCOPUS, because it only includes citations from articles published after 1995, also understates the citations of older authors. Google Scholar is not a viable alternative for quality university rankings. Table 3 compares the features of WOS, SCOPUS and Google Scholar.
WOS and SCOPUS offer quality and standardization; however, they are slower to reflect changes in scientific communication.