Influence of omitted citations on the bibliometric statistics of the major Manufacturing journals

Abstract

Bibliometrics is a relatively young and rapidly evolving discipline. Essential to it are bibliometric databases and the information they hold on scientific publications and the relevant citations. Unfortunately, databases are affected by errors, whose main consequence is omitted citations, i.e., citations that should be ascribed to a certain (cited) paper but, for some reason, are lost. This paper studies the impact of omitted citations on the bibliometric statistics of the major Manufacturing journals. The methodology is based on a recent automated algorithm, introduced in (Franceschini et al., J Am Soc Inf Sci Technol 64(10):2149–2156, 2013), which is applied to the Web of Science (WoS) and Scopus databases. Two important results of this analysis are that: (i) on average, the omitted-citation rate (p) of WoS is slightly higher than that of Scopus; and (ii) for both databases, p values do not change drastically from journal to journal and tend to decrease slightly with the issue year of the citing papers. Although omitted citations might not seem to be a substantial problem, they may significantly affect indicators based on citation statistics. This paper analyses the effect of omitted citations on popular bibliometric indicators such as the average citations per paper and its most famous variant, the ISI Impact Factor, showing that journal classifications based on these indicators may lead to questionable discriminations.

Notes

  1. According to the 2011 JCR (Thomson Reuters 2015).

  2. The same portfolio of cited/citing papers was used in another work of ours (Franceschini et al. 2014), which demonstrates the link between the omitted-citation rate and the publishers (e.g., Elsevier, Springer, Taylor & Francis) of the citing papers.

  3. The authors are aware that a more rigorous approach would be to test the differences between the CPP* values of pairs of journals (Schenker and Gentleman 2001). Nevertheless, the qualitative approach adopted here is simpler and more straightforward.

References

  • Adam, D. (2002). Citation analysis: The counting house. Nature, 415(6873), 726–729.

  • Arnold, D. N., & Fowler, K. K. (2011). Nefarious numbers. Notices of the American Mathematical Society, 58(3), 434–437.

  • Bar-Ilan, J. (2010). Ranking of information and library science journals by JIF and by h-type indices. Journal of Informetrics, 4(2), 141–147.

  • Buchanan, R. A. (2006). Accuracy of cited references: The role of citation databases. College & Research Libraries, 67(4), 292–303.

  • DORA. (2013). San Francisco declaration on research assessment. http://am.ascb.org/dora/. 20 May 2014.

  • ERA. (2010). Excellence in research for Australia initiative. http://www.arc.gov.au/era/era_2010/era_2010.htm. 20 May 2014.

  • Falagas, M. E., Kouranos, V. D., Arencibia-Jorge, R., & Karageorgopoulos, D. E. (2008). Comparison of SCImago journal rank indicator with journal impact factor. The FASEB Journal, 22(8), 2623–2628.

  • Franceschini, F., & Maisano, D. (2011). Influence of database mistakes on journal citation analysis: remarks on the paper by Franceschini and Maisano, QREI (2010). Quality and Reliability Engineering International, 27(7), 969–976.

  • Franceschini, F., Maisano, D., & Mastrogiacomo, L. (2013). A novel approach for estimating the omitted-citation rate of bibliometric databases. Journal of the American Society for Information Science and Technology, 64(10), 2149–2156.

  • Franceschini, F., Maisano, D., & Mastrogiacomo, L. (2014). Scientific Journal Publishers and Omitted Citations in Bibliometric Databases: Any Relationship? Journal of Informetrics, 8(3), 751–765.

  • Franceschini, F., Maisano, D., & Mastrogiacomo, L. (2015). Errors in DOI indexing by bibliometric databases. Scientometrics (to appear). doi:10.1007/s11192-014-1503-4.

  • Hicks, D. (2009). Evolving regimes of multi-university research evaluation. Higher Education, 57, 393–404.

  • Jacsó, P. (2006). Deflated, inflated and phantom citation counts. Online Information Review, 30(3), 297–309.

  • Jacsó, P. (2012). Grim tales about the impact factor and the h-index in the Web of Science and the Journal Citation Reports databases: Reflections on Vanclay’s criticism. Scientometrics, 92(2), 325–354.

  • Labbé, C. (2010). Ike Antkare, one of the great stars in the scientific firmament. ISSI Newsletter, 6(2), 48–52.

  • Li, J., Burnham, J. F., Lemley, T., & Britton, R. M. (2010). Citation analysis: Comparison of Web of Science, Scopus, Scifinder, and Google Scholar. Journal of Electronic Resources in Medical Libraries, 7(3), 196–217.

  • Lowry, P. M., Humpherys, S. L., Malwitz, J., & Nix, J. (2007). A scientometric study of the perceived quality of business and technical communication journals. IEEE Transactions on Professional Communication, 50(4), 352–378.

  • Meho, L. I., & Yang, K. (2007). Impact of data sources on citation counts and rankings of LIS faculty: Web of Science versus Scopus and Google Scholar. Journal of the American Society for Information Science and Technology, 58(13), 2105–2125.

  • Moed, H. F. (2005). Citation analysis in research evaluation. Information sciences and knowledge management: Vol. 9. Dordrecht: Springer. http://dx.doi.org/10.1007/1-4020-3714-7. ISBN: 978-1-4020-3713-9.

  • Moed, H. F. (2011). The source-normalized impact per paper (SNIP) is a valid and sophisticated indicator of journal citation impact. Journal of the American Society for Information Science and Technology, 62(1), 211–213.

  • Neuhaus, C., & Daniel, H. D. (2008). Data sources for performing citation analysis: An overview. Journal of Documentation, 64(2), 193–210.

  • Olensky, M. (2013). Accuracy assessment for bibliographic data. Proceedings of the 13th international conference of the international society for scientometrics and informetrics (ISSI), Vol. 2, pp. 1850–1851, Vienna, Austria.

  • Ross, S. M. (2009). Introduction to probability and statistics for engineers and scientists. New York: Academic Press.

  • Rossner, M., Van Epps, H., & Hill, E. (2008). Irreproducible results: A response to Thomson Scientific. The Journal of General Physiology, 131(2), 183–184.

  • Schenker, N., & Gentleman, J. F. (2001). On judging the significance of differences by examining the overlap between confidence intervals. The American Statistician, 55(3), 182–186.

  • Schubert, A., & Glänzel, W. (1983). Statistical reliability of comparisons based on the citation impact of scientific publications. Scientometrics, 5(1), 59–74.

  • Scopus Elsevier. (2015). Scopus content coverage. http://www.scopus.com. 20 May 2014.

  • Thomson Reuters. (2015). http://thomsonreuters.com/products_services/science/science_products/a-z/journal_citation_reports/. 20 May 2014.

  • Van Noorden, R. (2013). New record: 66 Journals banned for boosting impact factor with self-citations. Nature News Blog. http://blogs.nature.com/news/2013/06/new-record-66-journals-banned-for-boosting-impact-factor-with-self-citations.html. 20 May 2014.

  • VQR. (2011). Italian quality research evaluation VQR 2004–2010. http://www.anvur.org/anvur/. 20 May 2014.

  • Zitt, M. (2010). Citing-side normalization of journal impact: A robust variant of the Audience Factor. Journal of Informetrics, 4(3), 392–406.

Author information

Corresponding author

Correspondence to Fiorenzo Franceschini.

Appendix

Analysis of the distribution of omitted citations

Study at the level of the journal of cited papers

The dispersion of the $p_J$ value of each journal (defined in Sect. 4.1) can be roughly estimated through a simple expedient. Each $p_J$ value can be expressed as:

$$ p_{J} = \frac{\sum\limits_{i = 1}^{P_{J}} \left[ \left( p_{J} \right)_{i} \cdot \left( \gamma_{J} \right)_{i} \right]}{\sum\limits_{i = 1}^{P_{J}} \left( \gamma_{J} \right)_{i}}, $$
(8)

where $(p_J)_i = (\omega_J)_i / (\gamma_J)_i$ is the percentage of citations omitted by the database of interest, referring to the i-th article published by J.

Equation 8 shows that $p_J$ can be seen as a weighted average of the omitted-citation rates of individual papers (i.e., the $(p_J)_i$ values). These contributions have a variable weight, given by the number of “theoretically overlapping” citations of each i-th article of interest (i.e., $(\gamma_J)_i$). Of course, articles with no citations have zero weight.
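As an illustration of Eq. 8, the following minimal sketch computes a journal-level rate from per-paper counts. The function name `journal_omitted_rate` and the toy numbers are hypothetical and are not taken from the paper's data. It assumes that, for each paper, the number of omitted citations and the number of theoretically overlapping citations are available.

```python
from typing import Sequence


def journal_omitted_rate(omitted: Sequence[int], overlapping: Sequence[int]) -> float:
    """Weighted average of per-paper omitted-citation rates, as in Eq. 8.

    omitted[i]     -- citations to paper i that the database omits (assumed input)
    overlapping[i] -- theoretically overlapping citations of paper i (the weight)
    """
    if len(omitted) != len(overlapping):
        raise ValueError("both sequences must have one entry per paper")
    numerator = 0.0
    denominator = 0
    for w, g in zip(omitted, overlapping):
        if g == 0:
            continue  # uncited papers carry zero weight and drop out
        rate = w / g            # per-paper rate (p_J)_i
        numerator += rate * g   # rate weighted by (gamma_J)_i
        denominator += g
    if denominator == 0:
        raise ValueError("no overlapping citations: the rate is undefined")
    return numerator / denominator


# Toy data for five hypothetical papers (not from the study).
omega = [0, 0, 3, 0, 1]
gamma = [10, 0, 12, 5, 8]
p_j = journal_omitted_rate(omega, gamma)
print(round(p_j, 4))
```

Papers with many overlapping citations dominate the result, which is why a few heavily cited papers with omissions can drive the journal-level rate.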

Since $p_J$ is a weighted quantity, the distribution of the $(p_J)_i$ values can be represented by a special box-plot based on weighted quartiles, defined as ${}^{w}Q_J^{(1)}$, ${}^{w}Q_J^{(2)}$ and ${}^{w}Q_J^{(3)}$, i.e., the weighted first, second (or weighted median) and third quartiles of the $(p_J)_i$ values. Weighted quartiles are reported in Table 9. These indicators are obtained by sorting the $(p_J)_i$ values of the articles of interest in ascending order and taking the values at which the cumulative weight equals 25, 50 and 75 % of the total weight, respectively.
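The construction of the weighted quartiles can be sketched as follows. This is one possible reading of the procedure, not the authors' code: it takes the first sorted value at which the cumulative weight reaches the target share, without interpolation, which is an assumption. The names `weighted_quantile` and `weighted_quartiles` and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_quantile(values: Sequence[float], weights: Sequence[float], q: float) -> float:
    """Smallest value whose cumulative weight reaches a share q of the total weight."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    # Sort by value and drop zero-weight items (e.g. papers with no citations).
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = q * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point rounding


def weighted_quartiles(values: Sequence[float], weights: Sequence[float]) -> Tuple[float, float, float]:
    """Weighted first quartile, median and third quartile."""
    return tuple(weighted_quantile(values, weights, q) for q in (0.25, 0.50, 0.75))


# Toy per-paper rates and weights (not from the study).
rates = [0.0, 0.25, 0.0, 0.125]
weights = [10, 12, 5, 8]
q1, q2, q3 = weighted_quartiles(rates, weights)
print(q1, q2, q3)
```

Other conventions (for example, interpolating between neighbouring values) would give slightly different quartiles when the threshold falls between two distinct rates.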

Table 9 Weighted quartiles related to the distributions of the $(p_J)_i$ values, for the scientific journals listed in Table 3

The differences between the $(p_J)_i$ distributions of the Manufacturing journals seem insignificant for both WoS and Scopus. The reason is that the notches related to the majority of the journals overlap (Figs. 7 and 8). In particular, we note that most of the notches are “collapsed” on the line corresponding to $(p_J)_i = 0$ and that all ${}^{w}Q_J^{(1)}$ values are zero, as are almost all ${}^{w}Q_J^{(2)}$ values, for both WoS and Scopus. This result is interesting because it shows that omitted citations are generally concentrated in a relatively small number of articles. As confirmation, for each of the journals analyzed, the weighted median of the $(p_J)_i$ values (i.e., ${}^{w}Q_J^{(2)}$ in Table 9) is systematically lower than the weighted average, i.e., $p_J$.

Study at the level of the age of citing papers

The dispersion related to the $p_Y$ values of each journal (defined in Sect. 4.2) can be roughly estimated through an expedient similar to that presented in Sect. 6.1.1. Each $p_Y$ value can be expressed as:

$$ p_{Y} = \frac{\sum\limits_{i = 1}^{P} \left[ \left( p_{Y} \right)_{i} \cdot \left( \gamma_{Y} \right)_{i} \right]}{\sum\limits_{i = 1}^{P} \left( \gamma_{Y} \right)_{i}}, $$
(9)

where $(p_Y)_i = (\omega_Y)_i / (\gamma_Y)_i$ is the percentage of citations omitted by the database of interest, among those obtained in the year Y, referring to the i-th article examined.

Equation 9 shows that the $p_Y$ value relating to a database can be seen as a weighted average of the omitted-citation rates of individual papers ($(p_Y)_i$). These contributions have a variable weight, given by the number of theoretically overlapping citations ($(\gamma_Y)_i$).
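As a short worked step, substituting the per-paper rate into Eq. 9 makes the weights cancel. This assumes that $(\omega_Y)_i$ denotes the number of omitted citations of the i-th article in year Y (the symbol is not defined in this appendix) and that the sums run over articles with $(\gamma_Y)_i > 0$:

$$ p_{Y} = \frac{\sum\limits_{i} \frac{\left( \omega_{Y} \right)_{i}}{\left( \gamma_{Y} \right)_{i}} \cdot \left( \gamma_{Y} \right)_{i}}{\sum\limits_{i} \left( \gamma_{Y} \right)_{i}} = \frac{\sum\limits_{i} \left( \omega_{Y} \right)_{i}}{\sum\limits_{i} \left( \gamma_{Y} \right)_{i}}. $$

Under this reading, $p_Y$ is simply the pooled share of omitted citations over all theoretically overlapping citations in year Y.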

The dispersion of the $(p_Y)_i$ values can be roughly estimated by examining the relevant weighted quartiles, defined as ${}^{w}Q_Y^{(1)}$, ${}^{w}Q_Y^{(2)}$ and ${}^{w}Q_Y^{(3)}$. The construction of these indicators is analogous to that described in Sect. 4.1.

The surprising result is that all the weighted quartiles are zero for both databases. This is not incompatible with the fact that the weighted quartiles for individual journals (Table 9) were not all zero. In this case, we used time windows of a single year when counting the (omitted) citations of citing papers; the incidence of articles with zero omitted citations is therefore greater than in the previous case. The practical consequence is that all non-zero $(p_Y)_i$ values fall beyond the third weighted quartile of the corresponding (weighted) distribution. As an example, the graph in Fig. 9 represents the weighted cumulative distribution of the $(p_Y)_i$ values for the year 2012, according to WoS. It can be seen that the first seventy-six weighted percentiles are all zero. Similar results are found for the remaining years.
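The following sketch shows, on invented toy data, how all three weighted quartiles can be zero while the weighted average stays positive. The function name `weighted_cdf` and the numbers are hypothetical; they do not reproduce the figures of the study.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple


def weighted_cdf(rates: Sequence[float], weights: Sequence[float]) -> List[Tuple[float, float]]:
    """Sorted (rate, cumulative weight share) points of the weighted distribution."""
    mass = defaultdict(float)
    for rate, weight in zip(rates, weights):
        if weight > 0:
            mass[rate] += weight  # pool the weight of papers sharing the same rate
    total = sum(mass.values())
    if total == 0:
        raise ValueError("total weight must be positive")
    points = []
    running = 0.0
    for rate in sorted(mass):
        running += mass[rate]
        points.append((rate, running / total))
    return points


# Toy data: most of the citation weight sits on papers with no omitted citations.
rates = [0.0, 0.0, 0.0, 0.5, 0.2]
weights = [30, 25, 25, 10, 10]

cdf = weighted_cdf(rates, weights)
zero_share = cdf[0][1] if cdf[0][0] == 0.0 else 0.0
weighted_mean = sum(r * w for r, w in zip(rates, weights)) / sum(weights)

print(cdf)
print(zero_share >= 0.75, weighted_mean > 0)
```

Because the zero-rate papers hold at least 75 % of the weight in this toy case, the first, second and third weighted quartiles all equal zero, even though the weighted mean is not.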

Fig. 7 “Weighted” box-plot of the $(p_J)_i$ values relating to the papers in each journal (J), according to the WoS database. ${}^{w}Q_J^{(1)}$, ${}^{w}Q_J^{(2)}$ and ${}^{w}Q_J^{(3)}$ are the first, second and third weighted quartiles of the distributions of interest. Journal abbreviations are reported in Table 3

Fig. 8 “Weighted” box-plot of the $(p_J)_i$ values relating to the papers in each journal (J), according to the Scopus database. ${}^{w}Q_J^{(1)}$, ${}^{w}Q_J^{(2)}$ and ${}^{w}Q_J^{(3)}$ are the first, second and third weighted quartiles of the distributions of interest. Journal abbreviations are reported in Table 3

Fig. 9 Weighted cumulative distribution relating to the $(p_Y)_i$ values for the year 2012, according to WoS

This result confirms that, although the $p_Y$ values of the two databases tend to decrease over time, these variations are quite weak from a statistical viewpoint.

Additional tables

See Tables 10 and 11.

Table 10 Annual number of articles (P) issued by each of the journals analyzed and indexed by both WoS and Scopus
Table 11 CPP, CPP* and relevant statistics for each of the journals examined, in the years from 2008 to 2012

Cite this article

Franceschini, F., Maisano, D. & Mastrogiacomo, L. Influence of omitted citations on the bibliometric statistics of the major Manufacturing journals. Scientometrics 103, 1083–1122 (2015). https://doi.org/10.1007/s11192-015-1583-9
