Introduction

Allocating authorship credit for multi-authored publications according to a harmonic progression was originally suggested by Hodge and Greenberg (1981) in a letter to Science. Their letter was a response to Derek De Solla Price who, although aware that coauthors did not contribute equally, had proposed equal division of publication and citation credit among coauthors as “a deterrent to the otherwise pernicious practice of coining false brownie points by awarding each author full credit for the whole thing” (Price 1981). Ironically, both Price’s proposal for equal division of authorship credit (fractional counting) and the practice he opposed (inflated counting) have persisted as routine bibliometric methods for nearly 30 years. By contrast, harmonic counting went virtually unnoticed until it was reproposed, without acknowledgement to Hodge and Greenberg, in the 17 October 2008 issue of Science (cf. Hagen 2009).

Recently, harmonic counting was shown to improve the accuracy of h index scores by removing distorting bibliometric bias from the input data (Hagen 2008). Such bias is generated by equal allocation of authorship credit, either by inflated or fractional counting, and has the potential to distort all derived bibliometric measures.

In the present study harmonic authorship credit scores are validated by comparison with previously published empirical data from medicine, psychology and chemistry. Such validation does not imply causation, and for that reason harmonic counting is also assessed ethically by contrasting its main features with previously proposed counting schemes from the bibliometric literature, including arithmetic (‘proportional’, Van Hooydonk 1997), geometric (Egghe et al. 2000) and fractional counting (Lindsey 1980; Price 1981).

Methods

Empirical validation

Harmonic authorship credit for the ith author of a publication with N coauthors was calculated according to the following formula:

$$ {\text{Harmonic}}\,i{\text{th}}\,{\text{author}}\,{\text{credit}} = {\frac{{{\frac{1}{i}}}}{{\left[ {1 + {\frac{1}{2}} + \cdot \cdot \cdot + {\frac{1}{N}}} \right]}}} $$
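The formula above can be sketched in a few lines of Python. This is a minimal illustration and not code from the study; the function name `harmonic_credit` and the use of exact fractions are assumptions made here for clarity.

```python
from fractions import Fraction


def harmonic_credit(i, n):
    """Credit of the i-th of n coauthors (1-based rank) under harmonic counting."""
    if not 1 <= i <= n:
        raise ValueError("author rank i must satisfy 1 <= i <= n")
    # Normalizing sum 1 + 1/2 + ... + 1/n from the denominator of the formula.
    denominator = sum(Fraction(1, k) for k in range(1, n + 1))
    return Fraction(1, i) / denominator


# Credits for a paper with three coauthors: 6/11, 3/11 and 2/11.
credits = [harmonic_credit(i, 3) for i in range(1, 4)]
print([str(c) for c in credits])
```

Exact fractions make it easy to confirm that the credits for one paper sum to exactly one; floating-point values would serve equally well for plotting.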

For medical research, where the corresponding author is customarily listed last to signify elevated status (Wren et al. 2007; Zuckerman 1968), the harmonic authorship credit was calculated assuming approximate equality between the contributions of the first and last authors (Hagen 2008, Fig. 5C therein).

Empirical data from the bibliometric literature were obtained as follows: data for psychology were obtained from an internet-based study on how name-ordering conventions in three different disciplines affect inferences about authorship credit (Maciejovsky et al. 2009). The data for psychology were used because this discipline has a tradition of hierarchical byline positioning, whereas the other two, marketing and economics, do not. For psychology, authorship credit per author for papers with 2, 3, or 4 coauthors was assigned by analyzing responses from 52 faculty members and advanced graduate students. The data were obtained by scanning figure A2 from Maciejovsky et al. (2009), and using the ImageJ (http://rsbweb.nih.gov/ij/download.html) image analysis program to measure the average credit scores for psychology papers with non-alphabetical name ordering.

Empirical data for medicine were obtained from a survey of perceived authorship credit allotted by 87 promotion committee members from a wide selection of American medical schools. The data consisted of mean authorship credit scores and standard deviations for papers with three or five coauthors and the last author as corresponding author (Wren et al. 2007, Table 1 therein).

Empirical data for chemistry were obtained from tabulated authorship scores based on extensive empirical and theoretical investigations (Vinkler 2000, Table 4 therein). The data consisted of authorship credit scores for papers with up to six coauthors. The data were used with one minor correction: the first author credit for a paper with six coauthors was altered from 0.33 to 0.35 in order to make the total credit sum to unity, as was Vinkler’s intention, while maintaining a consistent internal increment of 0.05.

Lack of fit

Lack of fit was calculated as a standardized departure from model predictions as follows:

$$ {\text{Lack}}\,{\text{of}}\,{\text{fit}} = {\frac{1}{(n - 1)}}\sum {{\frac{{(O - E)^{2} }}{E}}} $$

where n is the total number of empirical observations, O is the empirical observation, and E is the model prediction.
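As a minimal sketch, the lack-of-fit score could be computed as follows. The function name `lack_of_fit` is hypothetical, and the observed values in the example are made up purely for illustration; they are not data from the studies cited here.

```python
def lack_of_fit(observed, expected):
    """Standardized departure of observations O from model predictions E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    n = len(observed)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Sum of (O - E)^2 / E over all observations, divided by (n - 1).
    total = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return total / (n - 1)


# Illustrative (invented) observations against harmonic predictions for N = 3.
expected = [6 / 11, 3 / 11, 2 / 11]
observed = [0.50, 0.30, 0.20]
print(lack_of_fit(observed, expected))
```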

Model predictions of authorship credit for the ith author of a publication with N coauthors were calculated according to the following formulas:

$$ {\text{Arithmetic}}\,i{\text{th}}\,{\text{author}}\,{\text{credit}} = {\frac{N + 1 - i}{(1 + 2 + \cdot \cdot \cdot + N)}} $$
$$ {\text{Geometric}}\,i{\text{th}}\,{\text{author}}\,{\text{credit}} = {\frac{{2^{N - i} }}{{2^{N} - 1}}} $$
$$ {\text{Fractional}}\,i{\text{th}}\,{\text{author}}\,{\text{credit}} = {\frac{1}{N}} $$
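The three comparison models can be sketched in the same way. The function names below are assumptions introduced for illustration; the arithmetic denominator uses the closed form N(N + 1)/2 for the sum 1 + 2 + ... + N.

```python
from fractions import Fraction


def arithmetic_credit(i, n):
    """Arithmetic ('proportional') credit of the i-th of n coauthors."""
    return Fraction(n + 1 - i, n * (n + 1) // 2)


def geometric_credit(i, n):
    """Geometric credit of the i-th of n coauthors."""
    return Fraction(2 ** (n - i), 2 ** n - 1)


def fractional_credit(i, n):
    """Fractional credit: an equal share for each of n coauthors."""
    return Fraction(1, n)


# Compare first-author credit under each model for a four-author paper.
for model in (arithmetic_credit, geometric_credit, fractional_credit):
    print(model.__name__, model(1, 4))
```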

Results

Validation of the harmonic counting model

It is evident that the harmonic authorship credit scores are in close agreement with the empirical data from psychology (Fig. 1a, Maciejovsky et al. 2009), medicine (Fig. 1b, Wren et al. 2007) and chemistry (Fig. 1c, Vinkler 2000). For medicine the harmonic credit scores were calculated on the assumption that the first and last (corresponding) authors were perceived as equal contributors. This assumption is supported by the close fit between the harmonic credit scores and the empirical means. The large error bars associated with first and last author credit in medicine may be an indication of diverging opinion among the 87 promotion committee members of the original survey about whether the last author position signifies approximate equality with the first author.

Fig. 1 Harmonic authorship credit scores compared with previously published empirical data from (a) psychology (Maciejovsky et al. 2009), (b) medicine (Wren et al. 2007) and (c) chemistry (Vinkler 2000). n, number of coauthors

The overall fit between the predicted harmonic authorship credit scores and the empirical data was close to the line of perfect fit, with no outliers (Fig. 2). The excellent fit to the harmonic authorship credit scores was quantified by a standardized score that estimated the overall departure from the model’s prediction at a mere 0.0035 (Fig. 3).

Fig. 2 Relationship between predicted harmonic authorship credit scores and previously published empirical data from psychology (Maciejovsky et al. 2009), medicine (Wren et al. 2007) and chemistry (Vinkler 2000). The diagonal line indicates perfect fit between prediction and observation. N = 37 observations

Fig. 3 Lack of fit between authorship credit scores predicted by harmonic, arithmetic, geometric and fractional counting models, and previously published empirical data from psychology (Maciejovsky et al. 2009), medicine (Wren et al. 2007) and chemistry (Vinkler 2000). N = 37 observations

Contrasting the bibliometric counting methods

The harmonic counting model fits the empirical data better than the arithmetic, geometric or fractional counting methods (Fig. 3). The fractional model, which allocates equal credit to all coauthors, exhibits the greatest discrepancy between model prediction and empirical data with a standardized departure score of 0.064, an 18-fold increase over harmonic counting. Arithmetic and geometric counting models have an intermediate lack of fit, with standardized departure scores for arithmetic more than double, and for geometric more than 6-fold greater than for harmonic counting. To further elucidate the differential lack of fit, a more detailed juxtaposition of how these models allocate authorship credit follows (Fig. 4; Table 1).

Fig. 4 Comparison of bibliometric counting models: (a) harmonic, (b) arithmetic, (c) geometric and (d) fractional counting. Curves comparing allocated authorship credit are plotted for the first five authors for publications with N ≤ 20 coauthors

Table 1 Authorship credit scores for papers with up to N = 6 coauthors

In harmonic counting (Fig. 4a), the ratio of credit allotted to the ith and jth authors is always j:i, regardless of the total number of coauthors (N) (Hodge and Greenberg 1981), i.e. the 1st author always gets twice as much credit as the 2nd author, the 2nd author always gets 1.5 times as much as the 3rd, the 3rd author always gets 1.33 times as much as the 4th author, and so on.
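The fixed ratio follows directly from the harmonic formula, because the normalizing sum cancels when two credits from the same paper are divided. The shorthand H_N for that sum is introduced here for illustration only and is not notation from the source:

$$ {\frac{{\text{Harmonic}}\,i{\text{th}}\,{\text{author}}\,{\text{credit}}}{{\text{Harmonic}}\,j{\text{th}}\,{\text{author}}\,{\text{credit}}}} = {\frac{(1/i)/H_{N}}{(1/j)/H_{N}}} = {\frac{j}{i}},\quad H_{N} = 1 + {\frac{1}{2}} + \cdots + {\frac{1}{N}} $$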

Arithmetic counting also allots twice as much credit to the 1st author when there are only two coauthors (Fig. 4b), but has no fixed ratio of allotment when N increases. First author credit decreases rapidly and continuously, whereas last author credit initially increases and thereafter decreases slowly as N increases, e.g. the 4th author gets 0.1 credits as last author but >0.1 credits for 5 ≤ N < 15.
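The behaviour of fourth-author credit under arithmetic counting can be checked numerically. The sketch below is illustrative only; the function name `arithmetic_credit` and the range N ≤ 20 (matching the range plotted in Fig. 4) are choices made here.

```python
from fractions import Fraction


def arithmetic_credit(i, n):
    """Arithmetic credit of the i-th of n coauthors."""
    return Fraction(n + 1 - i, n * (n + 1) // 2)


# Fourth-author credit for papers with 4 to 20 coauthors.
fourth = {n: arithmetic_credit(4, n) for n in range(4, 21)}

# Values of N for which the fourth author receives more than 0.1 credits.
above = [n for n, credit in fourth.items() if credit > Fraction(1, 10)]
print(above)  # [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

The credit is exactly 0.1 both at N = 4 and at N = 15, which is why the interval is open at its upper end.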

Geometric counting always allots twice as much credit to the ith author as to the (i + 1)th author (Fig. 4c), which implies that the allotted authorship credit rapidly approximates asymptotic values as N increases, such that the first few authors get most of the credit while negligible credit is allotted to the rest.
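Dividing the numerator and denominator of the geometric formula by 2^N makes the asymptotic values explicit. This rearrangement is a sketch added for illustration, not a derivation taken from the source:

$$ {\frac{{2^{N - i} }}{{2^{N} - 1}}} = {\frac{{2^{ - i} }}{{1 - 2^{ - N} }}} \to 2^{ - i} \quad {\text{as}}\quad N \to \infty $$

On this reading, first author credit approaches 1/2, second author credit 1/4 and third author credit 1/8, so that the first three authors together approach 7/8 of the total.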

Fractional counting (Fig. 4d) systematically favors secondary authors by allotting equal credit to all coauthors. The amount by which secondary authors are favored is equal to the difference between fractional and harmonic authorship credit, and is referred to as equalizing bias. For primary authors the equalizing bias is negative (Hagen 2008, Fig. 3 therein).
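Equalizing bias, as defined above, can be sketched as the difference between the two credit scores. The function names are hypothetical and the three-author example is chosen here purely for illustration.

```python
from fractions import Fraction


def harmonic_credit(i, n):
    """Harmonic credit of the i-th of n coauthors."""
    denominator = sum(Fraction(1, k) for k in range(1, n + 1))
    return Fraction(1, i) / denominator


def equalizing_bias(i, n):
    """Fractional credit minus harmonic credit for the i-th of n coauthors."""
    return Fraction(1, n) - harmonic_credit(i, n)


# For three coauthors the bias is -7/33, 2/33 and 5/33 by author rank.
bias = [equalizing_bias(i, 3) for i in range(1, 4)]
print([str(b) for b in bias])
```

Because both schemes share out exactly one credit per paper, the biases for a given paper sum to zero: what fractional counting adds to secondary authors it takes from primary authors.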

Discussion

Harmonic counting matches established notions of the relationship between authorship credit and authorship rank in psychology, medicine and chemistry, by providing a robust fit to empirical data from three independent studies using disparate methodologies. It would appear, therefore, that harmonic counting provides a fair and accurate representation of the perceived quantitative norms of the byline hierarchy in branches of scientific publishing where unequal coauthor contribution is the norm. Furthermore, harmonic counting succeeds in capturing the essence of the unadorned byline by ensuring that three basic ethical criteria for equitable sharing of authorship credit are met (Hagen 2008):

  1. one publication credit is shared among all coauthors,

  2. the first author gets the most credit, and in general the ith author receives more credit than the (i + 1)th author, and

  3. the greater the number of authors, the less credit per author.

In contrast, arithmetic counting does not consistently satisfy criterion 3 as the credit of the former last author is initially increased by adding more authors (Fig. 4b). Geometric counting does not consistently satisfy either criterion 1 or 3 because authorship credit rapidly approximates asymptotic values as N increases, so that the first few authors get most of the credit while negligible credit is allotted to the rest (Fig. 4c). And fractional counting violates criterion 2 by systematically favoring secondary authors at the expense of primary authors (Hagen 2008, Fig. 3 therein). In addition, these counting methods do not match the empirical data nearly as well as does the harmonic counting formula (Fig. 3).

Harmonic counting easily accommodates further decoding of explicit byline information about equal contribution of some coauthors (Hu 2009), or implicit information about the approximate equality of contributions by first and last authors, as in biomedical research where the corresponding author is customarily listed last (Buehring et al. 2007; Hagen 2008, Fig. 5 therein; Wren et al. 2007). However, the kind of ambiguity that may arise due to divergent opinion on the preferential status of corresponding last authors (e.g. Buehring et al. 2007; Hodge and Greenberg 1981), or as a result of unwritten conventions about coauthor equality and alphabetical name-ordering (e.g. Boas 1964; Endersby 1996; Maciejovsky et al. 2009), needs to be resolved by requesting unequivocal byline information, explicit contribution statements or editorial clarification.

In conclusion, it would seem that harmonic counting provides unrivalled accuracy, fairness and flexibility to the long overdue task of standardizing bibliometric allocation of publication and citation credit (cf. Larsen 2008).