Breast Cancer Research and Treatment, Volume 127, Issue 3, pp 845–851

The h index and the identification of global benchmarks for breast cancer research output

Authors

  • N. A. Healy
    • Department of Surgery, National University of Ireland, Galway, Clinical Science Institute
  • R. W. Glynn
    • Department of Surgery, National University of Ireland, Galway, Clinical Science Institute
  • Cristian Scutaru
    • Department of Information Science, Charité-Universitätsmedizin Berlin, Free University Berlin
  • David Groneberg
    • Institute for Occupational, Social and Environmental Medicine, Johann Wolfgang Goethe-Universität
  • M. J. Kerin
    • Department of Surgery, National University of Ireland, Galway, Clinical Science Institute
  • K. J. Sweeney
    • Department of Surgery, National University of Ireland, Galway, Clinical Science Institute
Brief Report

DOI: 10.1007/s10549-011-1436-z

Cite this article as:
Healy, N.A., Glynn, R.W., Scutaru, C. et al. Breast Cancer Res Treat (2011) 127: 845. doi:10.1007/s10549-011-1436-z

Abstract

The h index is used to assess an individual’s contribution to the literature. This metric should not be employed to compare individuals across research areas; rather, each subject should have its own baseline and standard. This work aimed to identify global bibliometric benchmarks for those involved in breast cancer research, and specifically, to describe the bibliographic characteristics of breast surgeons in the UK and Ireland. Authorship data were extracted from breast cancer-related output from 1945 to 2008, as indexed in the Web of Science. Authors’ publications, citations and h indexes were identified. The breast-related output of 277 UK and Irish breast surgeons was evaluated, and a citation report generated for each. Strong correlation was noted between the h index and number of publications (r = 0.642, P < 0.001) and number of total citations (r = 0.922, P < 0.001). The author with the highest h index is B Fisher (h index = 80). 23.0% of surgeons had not published original research pertaining to the breast; the remainder had together produced 2,060 articles, accounting for 59,002 citations. The top quartile was responsible for 83% of output; the 90th percentile was 20 publications. The range of h index values for the surgeons was 0–50, with a median h index returned of 3 (IQR 1–6); the 90th percentile was 13.5. This work has identified bibliometric benchmarks to which those involved in breast cancer research might aspire. Our findings suggest that there is a need for wider involvement of surgeons in the research process and raise questions regarding the future of scientific breast surgery.

Keywords

h Index; Breast surgeon; Research output; Breast cancer

Abbreviations

BCCOM: Breast Cancer Clinical Outcome Measures

SIBS: The Society of Irish Breast Surgeons

UK: United Kingdom

WOS: Web of Science

Introduction

Rapid progress in medical research over the past 50 years has been accompanied by vast increases in research funding and resource allocation. Nowhere have these increases been as evident as in breast cancer research; one recent analysis demonstrated an average 15% increase in output annually since 1945, and a greater than 100% increase since the millennium alone [1]. Allied to the growth in output has been the development of the internet and the emergence of a “publish or perish” culture wherein researchers feel increasingly pressured to generate “publishable” results [2]. Determination of the quality and quantity of this published work is increasingly considered essential in the evaluation and comparison of these individuals, both for employment purposes and in the determination of their suitability for grant allocations [3, 4].

Until recently, quantitative and qualitative assessment of an individual’s research output remained problematic, and there was a lack of consensus as to how to optimally score individual research performance [5]. In order to address this issue, Jorge Hirsch developed a novel tool, the h index, in 2005 [6]. It has rapidly gained acceptance because of its ability to judge authors on the value of their significant works only. Whilst previously the total number of publications or citations, or the number of citations per publication have been used to assess productivity and relevance of a body of work, the h index now incorporates these factors into a single statistic [7]. The popularity of this measure is demonstrated by the fact that Hirsch’s original paper has already received over 400 citations and over 110,000 downloads [8]. As with any bibliometric indicator, however, a scientist’s h index must be viewed strictly in the context of the specialty area within which he/she is working, and should not be extrapolated for comparison with the output of those operating in other areas of science or the humanities. This article thus aimed to establish the spread of h indexes for those working in breast cancer specifically, such that discipline-specific normative and elite values might be set for those tasked with setting standards and evaluating individuals in this area. A secondary objective of this work was to examine the breast cancer-related output of consultant breast surgeons working in the United Kingdom and Ireland specifically.

Methods

The first part of this work involved the extraction of authorship data from all breast cancer-related output indexed in the Web of Science (WOS) Science Citation Index Expanded database (SCI-Expanded) from 1945 to 2008 inclusive. The WOS was searched with the following breast cancer-related terms, combined using the Boolean operator ‘OR’:

TS = ((phyllodes tumo$r$) OR (Cystosarcoma Phyllo$des) OR (Malignant Cystosarcoma Phyllodes) OR (breast invasive ductal carcinoma) OR (infiltrating duct carcinoma$) OR (mammary ductal carcinoma$) OR (breast cancer) OR (breast neoplasm$) OR (breast tumo$r$) OR (human mammary neoplasm$) OR (human mammary carcinoma$)).
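In the search string above, the "$" symbol is a Web of Science wildcard that stands for zero or one character, so a single term such as tumo$r$ covers both spellings and the plural. The short Python sketch below illustrates that behaviour by translating a term into a regular expression. The function name is hypothetical, the use of "\w?" is only an approximation of the wildcard, and none of this is part of the authors' search procedure.

```python
import re

def wos_term_to_regex(term):
    """Approximate a WOS search term in which '$' means zero or one character."""
    # Split on the wildcard, escape the literal pieces, then rejoin them
    # with an optional single word character in place of each '$'.
    pieces = [re.escape(piece) for piece in term.split("$")]
    return re.compile(r"\b" + r"\w?".join(pieces) + r"\b", re.IGNORECASE)

pattern = wos_term_to_regex("tumo$r$")
for word in ["tumor", "tumour", "tumors", "tumours", "tumr"]:
    # Reports whether each candidate spelling is covered by the term.
    print(word, bool(pattern.search(word)))
```

Under this approximation the first four spellings are matched and "tumr" is not, which is why the search terms did not need to list British and American spellings separately.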

Since this work was designed to assess overall activity in relation to breast cancer, we did not restrict our search to particular document types such as original articles or reviews, nor did we exclude others such as letters and editorials. Computer software developed at the Charité University in Berlin was subsequently utilised to extract and collate data regarding publishing authors from all published items. The authors with the greatest number of publications and citations, and their associated h indexes, were then identified.

Hirsch explained his h index as follows: “A scientist has index h if h of his/her Np papers have at least h citations each and the other (Np-h) papers have no more than h citations each” [6]. In other words, the h index is the largest number h such that the individual is the author or co-author of h articles that have each been cited at least h times. A h index of 6, therefore, indicates that an author has at least 6 publications that have each been cited at least 6 times, but does not have 7 publications that have each been cited at least 7 times.
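Hirsch's definition can be expressed as a short calculation. The following Python sketch is one straightforward way to compute it from a list of per-paper citation counts; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited, then find the last rank at
    # which the paper's citation count is still at least its rank.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical author with nine papers (illustrative values only).
example_citations = [25, 8, 6, 6, 6, 6, 3, 1, 0]
print(h_index(example_citations))  # prints 6
```

The example author has nine papers but an h index of 6, because only six of them have been cited at least six times; the seventh most cited paper has just three citations.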

The second part of this work involved the assessment of the breast-related research output of consultant breast surgeons working in the UK and Ireland. The output of 277 surgeons participating in the UK’s Breast Cancer Clinical Outcome Measures (BCCOM) Project, and 36 consultant members of The Society of Irish Breast Surgeons (SIBS) was evaluated using the WOS Science Citation Index Expanded database. This was performed by searching for each surgeon according to their surname and the first initial of their first name with the field tag “author”, combined with the topic search tag “breast”. The resulting dataset was then examined to identify inappropriate entries, and a citation report was produced for each surgeon. All data, including each surgeon’s total number of publications, associated citations and h index, were transferred to Microsoft Excel and graphically represented. Statistical analysis was performed using SPSS v17.0.

Results

Authorship, 1945–2008

During the period from 1945 to 2008 (1974 excluded, n = 352), a total of 180,126 items were produced on breast cancer, as catalogued in the WOS. Authorship was examined in items catalogued from 1949 onwards. A total of 286,191 authors had contributed to at least one published item over the period under study. The average number of authors increased steadily over this timeframe, reaching a peak in 2008 of 5.92 authors on average per article (Fig. 1). The most prolific author identified in our search was GN Hortobagyi (n = 626 items), followed by AU Buzdar (n = 385) and RW Blamey (n = 377).
https://static-content.springer.com/image/art%3A10.1007%2Fs10549-011-1436-z/MediaObjects/10549_2011_1436_Fig1_HTML.gif
Fig. 1 Increase in number of authors per published item, over time

The total output and their associated h index for the top 15 most prolific authors are shown in Fig. 2a. Similarly, Fig. 2b relates the output of these authors to the average number of citations per item published. The author with the highest associated h index is B Fisher (h index = 80), followed by M. E. Lippman (77) and A. L. Harris (76) (Table 1). Strong correlation was noted between the h index of the top 200 most prolific authors and their associated output (r = 0.642, P < 0.001) and between their h index and their total citations (r = 0.922, P < 0.001) (Fig. 3a, b, respectively).
https://static-content.springer.com/image/art%3A10.1007%2Fs10549-011-1436-z/MediaObjects/10549_2011_1436_Fig2_HTML.gif
Fig. 2 Most prolific authors and (a) their associated h-indexes, and (b) their average citations per item published

Table 1

Top 20 most prolific authors and top 20 h-index authors

Most prolific authors                              Authors with highest h-index
Author          No. items  No. citations  h-index  Author          No. items  No. citations  h-index
Hortobagyi, GN  626        20538          71       Fisher, B       215        30122          80
Buzdar, AU      385        16435          61       Lippman, ME     286        22058          77
Blamey, RW      377        11854          54       Harris, AL      339        19587          76
Dowsett, M      367        15221          64       Willett, WC     293        18686          75
Ellis, IO       350        10373          50       Osborne, CK     253        18527          74
Goldhirsch, A   343        13588          56       Hortobagyi, GN  626        20538          71
Harris, AL      339        19587          76       Colditz, GA     288        18058          71
Howell, A       322        13914          60       McGuire, WL     204        22616          70
Jordan, VC      314        12549          58       Norton, L       228        21812          64
Coombes, RC     303        13016          59       Dowsett, M      367        15221          64
Willett, WC     293        18686          75       Wolmark, N      156        24280          64
Colditz, GA     288        18058          71       Easton, DF      180        19354          63
Lippman, ME     286        22058          77       Buzdar, AU      385        16435          61
Baum, M         285        11375          40       Howell, A       322        13914          60
Klijn, JGM      254        14623          60       Klijn, JGM      254        14623          60
Osborne, CK     253        18527          74       Coombes, RC     303        13016          59
Veronesi, U     243        17521          59       Veronesi, U     243        17521          59
Narod, SA       242        11817          54       Jordan, VC      314        12549          58
Kaufmann, M     239        9930           41       Clark, GM       154        17386          58
Singletary, SE  231        7055           47       Ponder, BAJ     145        14904          57

https://static-content.springer.com/image/art%3A10.1007%2Fs10549-011-1436-z/MediaObjects/10549_2011_1436_Fig3_HTML.gif
Fig. 3 Relationship between authors’ h-index and (a) their number of publications and (b) number of citations received
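The correlation coefficients reported above can be illustrated with a short calculation. The Python sketch below assumes Pearson's product-moment coefficient (the report does not state which coefficient was used) and applies it to the 20 most prolific authors listed in Table 1. Because the study analysed the top 200 authors, the values produced here are not expected to reproduce r = 0.642 or r = 0.922; the function and variable names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# (items, citations, h index) for the 20 most prolific authors in Table 1.
prolific = [
    (626, 20538, 71), (385, 16435, 61), (377, 11854, 54), (367, 15221, 64),
    (350, 10373, 50), (343, 13588, 56), (339, 19587, 76), (322, 13914, 60),
    (314, 12549, 58), (303, 13016, 59), (293, 18686, 75), (288, 18058, 71),
    (286, 22058, 77), (285, 11375, 40), (254, 14623, 60), (253, 18527, 74),
    (243, 17521, 59), (242, 11817, 54), (239, 9930, 41), (231, 7055, 47),
]
items = [row[0] for row in prolific]
cites = [row[1] for row in prolific]
h_values = [row[2] for row in prolific]

r_items = pearson_r(h_values, items)   # h index against number of items
r_cites = pearson_r(h_values, cites)   # h index against total citations
print(f"h vs items: r = {r_items:.3f}")
print(f"h vs citations: r = {r_cites:.3f}")
```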

h Index of breast surgeons in the UK and Ireland

Of 277 bibliographic profiles, 38 were excluded due to lack of clarity in relation to author identity. Of the remaining 239 profiles, 55 (23.0%) surgeons had not published original research pertaining to the breast; the remaining 184 surgeons had together produced 2,060 articles, which together accounted for 59,002 citations. The median number of publications amongst the group was 2 (IQR 1–8); the top quartile of surgeons were together responsible for 1,710 (83.0%) outputs. The 90th percentile for output was 20 publications. The median number of years since the most recent publication was 3.5 (IQR 1–8).
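The summary statistics above (median, interquartile range, 90th percentile and the share of output produced by the top quartile) can be derived from a list of per-surgeon publication counts. The Python sketch below shows one way to do so using hypothetical counts, since the individual surgeon data are not given in this report. It uses the "inclusive" interpolation method from Python's statistics module, which is an assumption; SPSS and other packages may apply different percentile conventions and return slightly different values.

```python
import math
import statistics

def summarise_output(pubs):
    """Summarise per-surgeon publication counts (needs at least 2 values)."""
    q1, median, q3 = statistics.quantiles(pubs, n=4, method="inclusive")
    p90 = statistics.quantiles(pubs, n=10, method="inclusive")[-1]
    # Share of all publications produced by the most productive quarter.
    ranked = sorted(pubs, reverse=True)
    top_n = math.ceil(len(ranked) / 4)
    total = sum(ranked)
    top_share = sum(ranked[:top_n]) / total if total else 0.0
    return {
        "median": median,
        "iqr": (q1, q3),
        "p90": p90,
        "top_quartile_share": top_share,
    }

# Hypothetical publication counts for 12 surgeons (not study data).
hypothetical_pubs = [0, 0, 1, 1, 2, 2, 3, 5, 8, 13, 21, 40]
summary = summarise_output(hypothetical_pubs)
print(summary)
```

A skewed distribution of this kind, in which a few individuals account for most of the output, is the reason the median and percentiles are more informative here than the mean.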

The median total number of citations per surgeon was 62.0 (IQR 10.0–216.5), and the median average number of citations per item published, per surgeon, was 12.2 (IQR 5.0–25.7). The range of h index values for the surgeons was 0–50, with a median h index returned of 3 (IQR 1–6). The 90th percentile was 13.5. Strong correlation was noted between a surgeon’s h index and his/her total output (r = 0.953, P < 0.001) and the total number of citations associated with that output (r = 0.888, P < 0.001) (Fig. 4). In addition, correlation was noted between the h index and the average number of citations per publication (r = 0.707, P < 0.001), and there was a significant negative relationship between the h index and the number of years since last publication (r = −0.306, P < 0.001).
https://static-content.springer.com/image/art%3A10.1007%2Fs10549-011-1436-z/MediaObjects/10549_2011_1436_Fig4_HTML.gif
Fig. 4 Correlation between authors’ h-index and (a) total number of publications, (b) total number of times cited, (c) average number of citations per item, and (d) years since last publication

Discussion

As noted in the introduction to this article, the h index was first proposed by Hirsch [9], and has since been accepted as a method of ranking individuals relative to their research output. This metric is a novel bibliometric tool that has generated considerable interest within the research community due to its objectivity in terms of rating scientific performance [10]. It is a number that reflects not only the quantity but also the quality of an individual’s work and is automatically calculated using bibliometric databases such as Web of Science and Google Scholar. Our finding that one’s h index correlates well with the more traditional markers of research impact, including total number of articles published and total number of citations received, echoes that found in similar recent studies in other branches of medicine [11, 12], and reiterates its utility as a reasonable proxy index of research impact.

Hirsch’s original paper examined the h index for physicists and estimated that, in this area, a h index of 12 would approximate with promotion to associate professor stage, and that h = 18 might be a typical value for advancement to full professor. Exceptional h values within physics were estimated at 45 and above [6]. This work initially looked at global breast cancer-related research output from 1945–2008 and assigned the highest h index value to Bernard Fisher (h = 80), based at the Department of Surgery at the University of Pittsburgh, who has been intimately involved with the National Surgical Adjuvant Breast and Bowel Project and other chemotherapy-based clinical trials since the late 1960s [13]. Comparing the 20 most prolific authors over the study period with those 20 authors who had achieved the highest h indexes, 13 are on both lists. The range of h index values for all 27 authors was 40 (Baum, M) to 80; one could surmise that those with h index values of 40 and above can be regarded as truly exceptional within the field of breast cancer research. Indeed, an analysis of the 286,191 authors who had contributed at least one item revealed that just 115 (0.04%) returned a h value of 40 or greater.
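The counts quoted in this paragraph can be checked directly against Table 1. The Python sketch below is an illustrative check rather than part of the original analysis: it takes the author names and h index values from the two halves of Table 1, then computes how many authors appear on both lists, how many distinct authors there are, the range of their h index values, and the percentage of all authors with an h index of 40 or greater. The variable names are hypothetical.

```python
# Author -> h index, transcribed from Table 1 (most prolific authors).
prolific = {
    "Hortobagyi, GN": 71, "Buzdar, AU": 61, "Blamey, RW": 54, "Dowsett, M": 64,
    "Ellis, IO": 50, "Goldhirsch, A": 56, "Harris, AL": 76, "Howell, A": 60,
    "Jordan, VC": 58, "Coombes, RC": 59, "Willett, WC": 75, "Colditz, GA": 71,
    "Lippman, ME": 77, "Baum, M": 40, "Klijn, JGM": 60, "Osborne, CK": 74,
    "Veronesi, U": 59, "Narod, SA": 54, "Kaufmann, M": 41, "Singletary, SE": 47,
}
# Author -> h index, transcribed from Table 1 (highest h-index authors).
highest_h = {
    "Fisher, B": 80, "Lippman, ME": 77, "Harris, AL": 76, "Willett, WC": 75,
    "Osborne, CK": 74, "Hortobagyi, GN": 71, "Colditz, GA": 71, "McGuire, WL": 70,
    "Norton, L": 64, "Dowsett, M": 64, "Wolmark, N": 64, "Easton, DF": 63,
    "Buzdar, AU": 61, "Howell, A": 60, "Klijn, JGM": 60, "Coombes, RC": 59,
    "Veronesi, U": 59, "Jordan, VC": 58, "Clark, GM": 58, "Ponder, BAJ": 57,
}

on_both = set(prolific) & set(highest_h)
combined = {**prolific, **highest_h}
lowest = min(combined, key=combined.get)
highest = max(combined, key=combined.get)
pct_elite = 115 / 286191 * 100  # authors with h >= 40 among all authors

print(len(on_both))                 # prints 13
print(len(combined))                # prints 27
print(lowest, combined[lowest])     # prints Baum, M 40
print(highest, combined[highest])   # prints Fisher, B 80
print(f"{pct_elite:.2f}%")          # prints 0.04%
```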

In order to test the above hypothesis in the clinical setting, we investigated the h values of 239 breast surgeons working in the UK and Ireland. The range of h index values amongst this group was 0–50; the highest value was that of Professor J. F. Robertson, who was the only surgeon with an h index above 40. Values at or above the 90th percentile lay between 14 and 50. Importantly, the majority of the surgeons included in this latter study will have come through a general surgical training scheme; many will thus have a general surgical workload and research interest in addition to their breast-related practice and, if all of their research output had been considered, the range for the group as a whole, and indeed for the 90th percentile, would unquestionably have been much higher. This demonstrates the importance of placing h index values and normative standards in the context of the community for which they have been designed. What can be accurately surmised from our work, however, is that while a h index value of 40 or greater indicates an exceptional author in terms of global breast cancer research output, amongst surgeons in the UK and Ireland a breast-related h index of 14 should be regarded as the benchmark for excellence.

It was interesting to note that 23% of the UK and Ireland-based breast surgeons included in this work do not appear to have published any breast-related research. The above caveat regarding the general surgical background of many of these surgeons notwithstanding, this finding may raise concerns as to the future of breast-related surgical research in the UK and Ireland.

There are a number of limitations to this work. Output from 1974 (n = 352, 0.2% of total output) was accidentally excluded during data collection, and hence was not included in the subsequent analysis. In addition, this study has focused on entries contained in the WOS only, and it should be noted that the employment of other databases including PubMed and Scopus may have yielded slightly different results [14, 15]. The aforementioned discussion notwithstanding, there are a number of important limitations associated with the h index. In particular, this metric is considered time sensitive, such that researchers at earlier stages of their career are placed at a significant disadvantage, regardless of the quality of their work during that career period [16]. In addition, the h index fails to make an adjustment for the ordering of authors on research articles, suggesting that some authors with minimal contributions to specific research publications may nevertheless improve their h index.

Conclusions

This work has identified the benchmarks to which those involved in breast cancer research might aspire. In addition, we have demonstrated for the first time the research output of breast surgeons in the UK and Ireland, and have shown that a small minority are responsible for the majority of output. Our findings suggest that there is a need for wider involvement of surgeons in the research process, and raise questions regarding the future of scientific breast surgery. Perhaps just as importantly, however, we have demonstrated that for the majority of surgeons a smaller level of output is the norm, and this needs to be highlighted to those tasked with assessing those individuals and setting expected standards into the future.

Conflict of interest

None.

Copyright information

© Springer Science+Business Media, LLC. 2011