Unified citation parameters for journals and individuals: Beyond the journal impact factor or the h-index alone

Pramana

Abstract

We seek a unified and distinctive citation description of both journals and individuals. The journal impact factor has a restrictive definition that constrains its extension to individuals, whereas the h-index for individuals can easily be applied to journals. Going beyond any single parameter, each negative-slope Hirsch curve of citations vs. rank index has a distinctive shape. This shape can be described through five minimal parameters or ‘flags’: the h-index itself on the curve; the average citation of each segment on either side of h; and the two axis endpoints. We obtain the five flags from real data for two journals and 10 individual faculty members, showing that they provide unique citation fingerprints and enable detailed comparative assessments. A computer code is provided that takes citation data as input and returns the five flags as output. Since papers (citations) can form nodes (links) of a network, Hirsch curves and five flags could carry over to describe local degree sequences of general networks.

References

  1. D J de Solla Price, Science 149, 510 (1965)

  2. D J de Solla Price, J. Am. Soc. Inform. Sci. 27, 292 (1976)

  3. S Redner, Eur. Phys. J. B 4, 131 (1998)

  4. S Redner, Phys. Today 58, 49 (2005)

  5. R Albert and A-L Barabasi, Rev. Mod. Phys. 74, 47 (2002)

  6. G Bianconi, Multilayer networks (Oxford University Press, 2018)

  7. G Bianconi and A-L Barabasi, Europhys. Lett. 54, 436 (2001)

  8. A-L Barabasi and Z Oltvai, Nature Rev.: Genetics 5, 104 (2004)

  9. G Bianconi, Pramana – J. Phys. 70, 1135 (2008)

  10. E Garfield, Science 178, 471 (1972)

  11. E Garfield, http://www.garfield.library.upenn.edu/papers/jifchicago2005.pdf

  12. http://www.garfield.library.upenn.edu/commentaries/tsv12(03)p10y19980202.pdf

  13. http://www.garfield.library.upenn.edu/commentaries/tsv12(14)p12y19980706.pdf

  14. E Garfield, Council of Scientific Editors Annual Meeting, May, 2000

  15. E Garfield, The Scientist 10(17), 13 (1996)

  16. E Garfield, Science 144, 649 (1964)

  17. https://clarivate.com/webofsciencegroup/essays/impact-factor

  18. J Hirsch, Proc. Natl. Acad. Sci. 102, 16569 (2005)

  19. S Alonso, F J Cabrerizo, E Herrera-Viedma and F Herrera, J. Informetrics 3, 273 (2009)

  20. S Lehmann, A D Jackson and B E Lautrup, Nature 444, 1003 (2006)

  21. A L Kinney, Proc. Natl. Acad. Sci. 104, 17943 (2007)

  22. J-F Molinari and A Molinari, Scientometrics 75, 163 (2008)

  23. A Chatterji, A Ghosh and B Chakrabarti, PLoS One (2016), https://doi.org/10.1371/journal.pone.0146762

  24. A Khaleque, A Chatterji and P Sen, J. Scientometric Res. 5(1), 25 (2016)

  25. P Wouters, C R Sugimoto, V Larivière, M E McVeigh, B Pulverer, S de Rijcke and L Waltman, Nature 569, 621 (2019)

  26. D Hicks, P Wouters, L Waltman, S de Rijcke and I Rafols, Nature 520, 429 (2015)

  27. https://youtu.be/GW4s58u8PZo

  28. R Adler, J Ewing and P Taylor, Stat. Sci. 24, 1 (2009)

  29. P O Seglen, Brit. Med. J. 314, 497 (1997)

  30. A M Grimwade, Front. Res. Metr. Anal. (2018), https://doi.org/10.3389/frma.2018.00014

  31. M Rossner, H Van Epps and E Hill, J. Exp. Med. 204, 3052 (2007)

  32. M Rossner, H Van Epps and E Hill, J. Exp. Med. 205, 260 (2008)

  33. N-X Wang, Nature 476, 253 (2011)

  34. M Price, https://www.sciencemag.org/careers013/09/should-we-ditch-journal-impact-factor

  35. J Bollen, H Van de Sompel, A Hagberg and R Chute, https://arxiv.org/abs/0902.2183

  36. M R Berenbaum, Proc. Natl. Acad. Sci. 116, 16659 (2019)

  37. A Fersht, Proc. Natl. Acad. Sci. 106, 688 (2009)

  38. San Francisco Declaration on Research Assessment (DORA) (2012), https://sfdora.org/read/

  39. V Larivière, V Kiermer, C J MacCallum, M McNutt, M Patterson, B Pulverer, S Swaminathan, S Taylor and S Curry, https://www.biorxiv.org/content/early/2016/07/05/062109

  40. T Braun, W Glanzel and A Schubert, Scientometrics 69, 169 (2006)

  41. Garfield later stated [11], “Further, I myself deplore the quotation of impact factors to three decimal places. ISI uses three decimal places to reduce the number of journals with identical impact rank. It matters very little whether the impact of JAMA (J. American Medical Association) is quoted as 21.5 rather than 21.455”

  42. Novel citational correlations may be discovered by analysing proprietary databases that are properly subscribed to and with the database use duly acknowledged in the paper. However, authors may still not be allowed to make their detailed research analysis available to colleagues, in a journal data depository. See Data Availability section of L Bornmann, Quant. Sci. Stud. 1, 1553 (2020)

  43. S Saha, S Saint and D A Christakis, J. Med. Libr. Assoc. 91, 42 (2003)

  44. A I Pudovkin, Front. Res. Metr. Anal., https://doi.org/10.3389/frma.2018.00002

  45. L Waltman and V A Traag, https://arxiv.org/ftp/arxiv/papers/1703/1703.02334.pdf

  46. Garfield notes [11] “Thus the impact factor is used to estimate the influence of individual papers, which is rather dubious considering the known skewness observed for most journals”

  47. Database searches for individuals yield \(N_{\rm items}\) items, but not all are (original or review) research articles. Items displayed could include arXiv preprints, conference abstracts, seminar notices, etc. Eliminating them ‘by hand’ could be tedious. However, empirical examination shows that such ‘ephemera’ either have zero cites (\(N_0\) items) or are cited only once (\(N_1\) items). Subtracting these yields a pruned number of papers \(N_p \equiv N_{\rm items} - ({N_0} + {N_1})\), from which ephemera tend to be automatically filtered out. The resultant \(N_p (A)\) items cited more than once, i.e. with \(c(s =N_p) \ge 2\), are taken as the number of research papers. New research publications would eventually get cited, meet this criterion and be included. Similarly, for journals, a database search for ‘all item’ mentions would include non-research items such as editorials, letters of opinion, news items, etc. Again, we retain only those items cited more than once, to filter out ephemera. For the ten faculty members, the average fractions discarded are \(\langle N_0/N_{\rm items}\rangle = 0.28\) and \(\langle N_1/N_{\rm items}\rangle = 0.08\). For the two journals, J1 has \(N_0/N_{\rm items}= 0.18\), \(N_1/N_{\rm items} = 0.11\), while J2 has \(N_0/N_{\rm items} = 0.02\), \( N_1/N_{\rm items} = 0.03\)

  48. B-H Jin, L-M Liang, R Rousseau and L Egghe, Chin. Sci. Bull. 52, 855 (2007). Their parameters are related to the 5F as ‘A-index’ = \(hac\); ‘R-index’ = \(\sqrt{h \times hac}\)

  49. Each 5F data set could be depicted by a symbol with three Cartesian axes of \((x,y,z)= (nac,hac,h)\). The other two 5F parameters could enter through variations in symbol size (diameter \(\sim \ln u\)) and symbol colour (\(0 < r < 1\) fixes position in colour bar). In a simpler 2D plot of \(hac\) vs. \(nac\), the more well-cited individuals or journals will be points near the upper right corner

  50. R Koch, The 80:20 principle (Little Brown, 2013)

  51. R Sinatra, D Wang, P Deville and A-L Barabasi, Science 354, 6312 (2016)

  52. Garfield recognised that [10] the “citation frequency of a journal is thus a function not only of the scientific significance of the material it publishes (as reflected by citation), but also of the amount of material it publishes”

  53. https://www.topuniversities.com/university-rankings/world-university-rankings/2022

  54. H Jeong, B Tombor, R Albert, Z N Oltvai and A L Barabasi, Nature 407, 651 (2000)

  55. G Bagler, Physica A 387, 2972 (2008)

  56. P Bak, C Tang and K Wiesenfeld, Phys. Rev. Lett. 59, 381 (1987)

  57. D Dhar and R Ramaswamy, Phys. Rev. Lett. 63, 1659 (1989)

  58. The five flags defined in §3 can be obtained as follows from Google Scholar, which lists papers in order of decreasing citations. Note down your academic age \(A\), the number of years since your first paper. Find the citations \(c(s)\) to your \(s=1,2,\ldots \) papers, with the highest being \(c(1) = C_{\rm max}\). Note down the largest \(s\) for which \(c(s) \ge s\): this is your \(h\)-index. The largest serial number \(s\) of papers cited more than once, \(c(s=N_p) \ge 2\), fixes \(N_p(A)\). Three of the 5F are then known: \(h\), \(r= h/N_p\) and \(u= C_{\rm max} /h\). The average citation of the first \(h\) papers, over \(s= 1,\ldots ,h\), is the \(hac\)-number. The average citation of the remaining \(n= N_p -h\) papers, over \(s= h+1,\ldots ,N_p\), is the \(nac\)-number. These are the five flag components \({\phi }_5 = (h,r,u,nac,hac) \)

  59. We have also developed, and provide, a computer code that yields the 5F as output directly from citation data, in any order, as input. This is useful when adding new papers to previous-year data files. See URL https://citation-profiler.tifrh.res.in. The source code is also available at URL https://github.com/pankajpopli/cit-prof

  60. A-W Harzing and S Alakangas, Scientometrics 106, 787 (2016). See also the Harzing blog for useful packages to obtain citational information from Google Scholar. URL: https://harzing.com/

  61. J Li, S Fortunato and D Wang, Nat. Rev. Phys. 1, 302 (2019)

  62. S E Cozzens, Scientometrics 15, 437 (1989)

  63. T S Kuhn, The structure of scientific revolutions (University of Chicago Press, 1962)

  64. P Popli and S R Shenoy, unpublished (2022)
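The by-hand recipe of note 58, with the pruning of note 47, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' code of note 59: the names `FiveFlags` and `five_flags` are hypothetical, the citation counts are made up, and the choice to report `nac` as zero when no papers lie beyond rank \(h\) is an assumption not fixed by the text.

```python
from math import sqrt
from typing import NamedTuple, Sequence


class FiveFlags(NamedTuple):
    h: int       # h-index: largest rank s with c(s) >= s
    r: float     # h / N_p
    u: float     # C_max / h
    nac: float   # average citation of the N_p - h papers ranked below h
    hac: float   # average citation of the top h papers


def five_flags(citations: Sequence[int]) -> FiveFlags:
    """Compute (h, r, u, nac, hac) from per-item citation counts in any order."""
    # Keep only items cited more than once (the pruning of note 47),
    # ranked by decreasing citations so that c(1) = C_max.
    papers = sorted((c for c in citations if c >= 2), reverse=True)
    n_p = len(papers)
    if n_p == 0:
        raise ValueError("no items are cited more than once")

    # c(s) >= s holds for ranks 1..h and fails afterwards, so counting works.
    h = sum(1 for s, c in enumerate(papers, start=1) if c >= s)

    hac = sum(papers[:h]) / h
    tail = papers[h:]
    # Assumption: nac is reported as 0.0 when no papers lie beyond rank h.
    nac = sum(tail) / len(tail) if tail else 0.0
    return FiveFlags(h=h, r=h / n_p, u=papers[0] / h, nac=nac, hac=hac)


# Made-up citation counts, for illustration only.
flags = five_flags([3, 0, 10, 1, 5, 2, 8, 0, 4])
r_index = sqrt(flags.h * flags.hac)  # 'R-index' of note 48; the 'A-index' is hac
print(flags, r_index)
```

Sorting inside the function means the input need not be ranked beforehand, which matches the 'any order' input described in note 59.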

Acknowledgements

It is a pleasure to thank Mustansir Barma, Smarajit Karmakar, Prasad Perlekar and Surajit Sengupta for helpful discussions.

Author information

Corresponding author

Correspondence to Subodh R Shenoy.

Appendix A. Properties of the IF

We introduce a notation to describe citations to published papers. Citations to papers involve pairs of publication year/citation year, and a suitable notation is needed to uniquely identify the various citation parameters.

Consider \(n_p(y)\) papers published in the year y, in a publication block y(A) of duration A years. The total number of papers is \(N_p (A) = \sum _{y(A)} n_p(y)\). We define \(P_c (C; y,Y)\) as the number of citations C, to papers published in the year y, garnered in the year Y within a citation block Y(B) of duration B years. The difference between the start of the publication block of A years and the start of the citation block of B years is the delay \(D \equiv \text {Start } (B)-\text {Start } (A) \) years, i.e., zero for \(B=A\) block coincidence.

A useful notation to describe different citation variables is the sum over all allowed C, y, Y that defines the total citations \(S_c(A, B; D)\):

$$\begin{aligned} S_c (A,B; D) ~ \equiv \sum _C N_c(C; A,B,D), \end{aligned}$$
(A.1)

where

$$\begin{aligned} N_c(C; A,B,D) \equiv \sum _{y(A)} \sum _{Y (B)} P_c (C;y,Y). \end{aligned}$$
(A.2)

We make four observations.

1.1 Observation 1: The A-year IF is not an A-year average

In the familiar case of a citation average, the publication/citation year blocks are equal, coincident and not sequential. Thus, \(A=B\) and \(D =0\). The total number of citations is \(N_{c,\mathrm {tot}} = \sum _C N_c (C; A,A,0) = S_c(A,A;0)\), where \(N_c (C; A,A,0) \equiv \sum _{y(A) }\sum _{Y(A)} P_c (C; y,Y)\). The average citation per paper \({\bar{C}}\) over A years is

$$\begin{aligned} {{\bar{C}}} (A) = S_c(A,A; 0) / N_p (A). \end{aligned}$$
(A.3)

Here \(N_c(C;5,5,0)\) is the 5-year citation frequency, written simply as \( N_c (C)\) and shown in figures 2 and 3. On the other hand, the A-year current impact factor IF(A) has publication/citation year blocks that are unequal and not coincident, but sequential, so \(A \ne B\) and \(D \ne 0\). The single citation year \(B=1\) commences right after A, and so \(D=A\). Thus

$$\begin{aligned} \mathrm {IF} (A) = S_c (A, 1;A)/ N (A). \end{aligned}$$
(A.4)

Clearly, IF\((A) \ne {{\bar{C}}}(A)\). Here, the number of papers \(N_p(A)\) (cited more than once [47]) is replaced by the number of items N(A) (with any citations).

1.2 Observation 2: \({I\!F}(A)\) has far fewer citations than the A-year average

The JIF, with one citation year, has restricted (and hence fewer) publication–citation year pairs. A toy model for the 2018 citation year is illustrative. Publications in \(y =2016\) have citations \(P_c(C; 16, 16)\), \(P_c(C; 16, 17)\), \( P_c(C;16, 18)\). Publications in \(y= 2017\) have citations \(P_c (C; 17, 17)\), \(P_c (C; 17,18)\). Now suppose that 100 papers are published in each year, and that each year's papers collect \(P_c =2000\) citations in the year of publication, 1500 in the second year, 500 in the third year and zero thereafter. For the JIF in the year 2018, the two allowed publication–citation pairs are \((y,Y) =(16,18)\), (17, 18) and so the total citations are \( [(500) + (1500)] =2000\), for a smaller \(\mathrm {IF}(2) \equiv \mathrm {JIF} = 2000/ [100 +100] = 10\). For the average citation with the same two publication years, the pairs are \((y,Y) = (16,16), (16,17),(17,17)\) and so the total citations are the larger \([(2000+ 1500) + (2000)] =5500\) cites, yielding \({{\bar{C}}} = 5500/[100 +100] =27.5\) cites \(>\) \(\mathrm {JIF} =10\) cites.
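The toy model can be checked numerically with a short Python sketch of the block sum \(S_c(A,B;D)\). The function names `total_citations` and `toy_cites` are hypothetical, and the sketch assumes that each (publication year, citation year) pair carries a single total citation count, as in the toy numbers above.

```python
def total_citations(cites, start, A, B, D):
    """Sketch of S_c(A, B; D): citations summed over publication years y in
    an A-year block beginning at `start`, and over citation years Y in a
    B-year block beginning D years after `start`."""
    pub_years = range(start, start + A)
    cite_years = range(start + D, start + D + B)
    # Pairs that never occur (e.g. citation before publication) count as zero.
    return sum(cites.get((y, Y), 0) for y in pub_years for Y in cite_years)


def toy_cites(pub_years, profile=(2000, 1500, 500)):
    """Toy model: every publication year collects 2000, 1500 and 500 cites
    in its first, second and third year, and none thereafter."""
    return {(y, y + k): c for y in pub_years for k, c in enumerate(profile)}


cites = toy_cites([2016, 2017])
papers = 100 + 100  # 100 papers in each of the two publication years

# JIF for 2018: A = 2 publication years, one citation year starting D = A later.
jif = total_citations(cites, 2016, A=2, B=1, D=2) / papers
# A-year average: coincident blocks, A = B = 2 and D = 0.
avg = total_citations(cites, 2016, A=2, B=2, D=0) / papers

print(jif, avg)  # 10.0 27.5
```

Changing only the arguments `B` and `D` moves between the two quantities, which makes explicit that they differ in the allowed publication–citation pairs and not in the underlying data.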

1.3 Observation 3: Different parameters give different rankings

In the early scientometric literature, an adjective made a difference: Current IF means one year of citations and cumulative IF means summing up several years of citations. A multiple-citation-year parameter is [11] the 5-year ‘cumulated’ impact factor with \(B=5\) years of citations, e.g. 1999–2004, from one \(A=1\) publication year of say 1999, with the same start year and so \(D=0\). It is given by \(\sim \!\!S_c (1, 5; 0) /N (1)\) and was applied to JAMA for different single years of publication.

Another parameter is the 15-year ‘cumulative’ impact factor [12, 13] for citations over \(B=15\) years, e.g. 1981–1995, to \(A= 2\) years of publications, e.g. 1981–1982, with the same start year and so \(D=0\). It is \(\sim \!\!\!S_c( 2, 15;0)/N(2)\) and is applied to generate rankings for 100 journals [12, 13] and compare them with the JIF rankings for JCR reference year 1983.

With \(\Delta R\) the difference between the two rankings for a given journal, the average ranking-change magnitude \(\langle | \Delta R | \rangle \) can be found, over subsets of the ranks. For the top 10 journals, the average \(\langle | \Delta R |\rangle = 2 \) is small, consistent with a claimed insensitivity [12, 13]. However, for all the 100 journals, the average ranking shift shows a substantial shuffle, \(\langle |\Delta R|\rangle \simeq 34\). The specific JIF rankings depend on the chosen JIF parameter: other choices could give other journal rankings.
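A minimal Python sketch of the average ranking shift \(\langle |\Delta R| \rangle\) follows. The function name `mean_abs_rank_shift`, the journal labels and the ranks are all hypothetical and purely illustrative; they are not the data of refs [12, 13]. Selecting the 'top' subset by the first ranking is also an assumption.

```python
def mean_abs_rank_shift(rank_a, rank_b, top=None):
    """Average |Delta R| between two rankings given as {journal: rank} dicts.

    If `top` is given, only journals ranked within the first `top` places
    of rank_a are included (an assumed convention for choosing the subset).
    """
    journals = [
        j for j in rank_a
        if j in rank_b and (top is None or rank_a[j] <= top)
    ]
    if not journals:
        raise ValueError("no journals common to both rankings")
    return sum(abs(rank_a[j] - rank_b[j]) for j in journals) / len(journals)


# Hypothetical ranks of five journals under two different parameters.
rank_cumulative = {"P": 1, "Q": 2, "R": 3, "S": 4, "T": 5}
rank_jif = {"P": 2, "Q": 1, "R": 5, "S": 3, "T": 4}

shift_all = mean_abs_rank_shift(rank_cumulative, rank_jif)
shift_top2 = mean_abs_rank_shift(rank_cumulative, rank_jif, top=2)
print(shift_all, shift_top2)
```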

1.4 Observation 4: The \({I\!F}(A)\) definition makes rankings A-insensitive

For different durations A, how different are the rankings obtained from the A-year current impact factor of \(\mathrm {IF}(A) = S_c(A,1; A) /N (A)\)? Surprisingly, the large-A ranking can be close to the usual \(A=2\) ranking from the JIF. Suppose that the numerator rises as more years are included, but then flattens to a constant, for A larger than the half-life (‘old papers are less cited’). Suppose further, that the denominator varies as the number of years A in the block, or \(N (A) = A N (1)\) (‘journal size is the same, every year’). In such a case, \(\mathrm {IF}(A) \simeq (2/A) \mathrm {IF}(2)\) and the journal rankings (not values) can be A-insensitive, from the definition. The ranking commonality does not imply that the JIF ranking has any property of uniqueness, or of optimisation [35].
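The scaling argument can be illustrated with a small Python sketch under exactly the two suppositions above: a numerator that has flattened to a constant, and a denominator \(N(A) = A N(1)\). The function names and the three journals with their numbers are hypothetical.

```python
def impact_factor(saturated_cites, papers_per_year, A):
    """IF(A) when the numerator is constant and N(A) = A * N(1)."""
    return saturated_cites / (A * papers_per_year)


# Hypothetical journals: (saturated citations in the citation year, N(1)).
journals = {"X": (3000, 100), "Y": (1200, 80), "Z": (500, 20)}


def ranking(A):
    """Journal labels ordered by decreasing IF(A)."""
    return sorted(
        journals,
        key=lambda j: impact_factor(*journals[j], A),
        reverse=True,
    )


print(ranking(2), ranking(5))
for name, (cites, n1) in journals.items():
    # Values shrink by the common factor 2/A, so the order is unchanged.
    print(name, impact_factor(cites, n1, 2), impact_factor(cites, n1, 5))
```

Because every journal's value is multiplied by the same factor \(2/A\), the ordering cannot change within this model; any A-dependence of real rankings would have to come from departures from the two suppositions.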

Cite this article

Popli, P., Shenoy, S.R. Unified citation parameters for journals and individuals: Beyond the journal impact factor or the h-index alone. Pramana - J Phys 96, 189 (2022). https://doi.org/10.1007/s12043-022-02413-z

  • DOI: https://doi.org/10.1007/s12043-022-02413-z
