Introduction

Since the 1970s, there has been a growing focus on gender issues (Zuckerman & Cole, 1975), accompanied by significant efforts to foster gender equality at both international and national levels (UNESCO, 2024). Beyond policymakers, researchers and various organizations have also become increasingly interested in, and dedicated to, narrowing the gender gap. Extensive evidence in the scientific literature indicates that, in general, the academic productivity of men surpasses that of women, despite notable improvements in the representation and performance of women in science overall (Larivière et al., 2013; Chan & Torgler, 2020; Elsevier, 2021; OECD, 2021). Gender disparities in academic productivity and success have been thoroughly examined through the application of standard bibliometric indicators in science studies. However, the results of these studies are not consistent. The gender gaps in productivity, citation, and promotion have been evolving, albeit at a gradual pace, as evidenced by recent reviews of gender disparities in science (Kwiek & Roszka, 2021). The increasing participation of women in academic science alters the framework within which gender disparities in productivity, citation impact, research areas/topics, and international research collaboration are currently examined. Recent bibliometric studies increasingly employ diverse methods to determine the gender of authors and authorships (Halevi, 2019), while comprehensive research on gender disparities in science is being conducted (e.g., Chan & Torgler, 2020; Thelwall et al., 2019), shedding light on the extent of ongoing transformations. In Table 1, we summarize the findings of some recent large-scale studies on the gender gap in science. This purposive sample only aims to demonstrate that the prevailing significance of gender differences has also recently been challenged from various approaches; it is by no means intended as a comprehensive review.

Table 1 Results from recent large-scale studies on the gender gap in science

A large number of bibliometric analyses have been conducted in recent years on gender disparities in different fields of science, addressing gender differences in research/scientific productivity and women's contribution to science in the life sciences (DesRoches et al., 2010), medicine (Henderson et al., 2014; Pashkova et al., 2013), economics (Maske et al., 2003), astronomy, immunology and oceanography (Leta and Levinson, 2003), psychology (D’Amico et al., 2011), and criminal justice and criminology (Snell et al., 2009). The results and conclusions of this research provide a growing body of evidence on women's participation in science.

Gender disparities vary by research field and country (Chan & Torgler, 2020; Larivière et al., 2013). Bibliometric evaluation of gender imbalances in different research areas is especially important since bibliometric indicators are considered important features of research excellence. Showing the degree of gender disparities in different fields of science can help inform policymakers, create more equal opportunities for women scientists, and improve governance, and it can also have an impact on research priorities and funding allocations.

The selected subject of the present research is the area of anti-doping science (ADS), a relatively young and emerging specialty. The main reason for selecting ADS for investigating gender differences was that, though specific in scope, this interdisciplinary domain is made up of STEM and SSH reference fields, which are often characterized by different gender-related patterns. ADS, as a “melting pot” emerging from interdisciplinary interactions, can therefore be regarded as a case where these dynamics potentially interfere (e.g. in knowledge integration, also studied in our work). Anti-doping science intersects with various fields due to its multidisciplinary nature. It encompasses sports science, medicine, pharmacology, toxicology, and analytical chemistry, but also law, ethics, public health, and psychology. Together, these interdisciplinary perspectives shape the foundation of anti-doping science, driving research, policy development, and enforcement efforts to address the complex challenges associated with doping in sports. This characteristic makes the domain of interest for a case study with valuable contributions to the more general topic of gender-related issues in science.

Related work on gender-related trends in sport science and anti-doping research

Gender-related differences in sport sciences have been examined in several bibliometric studies in terms of academic productivity, authorship position (Chang-Yeon et al., 2019; Dynako et al., 2020; Loder et al., 2021; Mujika & Taipale, 2019; Ryan et al., 2020), editorial leadership position (Martínez-Rosales et al., 2021), academic positions, women scientists’ career development, and international collaboration. Analyses of trends in gender distribution in sport sciences provide a growing body of evidence that the presence, role, and contribution of women have increased significantly in recent decades, but that the gender gap remains a global phenomenon in the field.

For instance, Mujika and Taipale (2019) investigated gender differences in authorship within the International Journal of Sports Physiology and Performance. They found that only 13% of the authors were women in the first 5 issues of the journal published in 2019. In the realm of sport-related medical science, Chang-Yeon et al. (2019) analyzed orthopedic sports medicine literature from 1972 to 2018 to observe changes in the proportions of female authors across various authorship positions. Although only 16.6% of the authors were female, there was a significant increase in female authorship from 2.6 to 14.7% over the 46 years. Ryan et al. (2020) evaluated articles in the Sports Health journal from 2009 to 2018, noting an increase in publications with at least one female author from 52 to 64% over the study period. They highlighted Sports Health's comparatively high rates of female authorship compared to other journals. Loder et al. (2021) conducted a bibliometric analysis of English musculoskeletal literature over 30 years, revealing gender differences in first and corresponding author positions. The percentage of female first authors increased from 10.8 to 23.7% between 1985–1987 and 2015–2016, while corresponding authorship increased from 8.9 to 18.9% over the same period. Dynako et al. (2020) examined trends in two American sports medicine journals over 30 years, finding an average of 13.3% and 8.1% female first authors for the American Journal of Sports Medicine and Arthroscopy, respectively.

Only a few studies provide bibliometric analyses specific to anti-doping science. The study by Agulló-Calatayud et al. (2008) identified key research centres and authors of scientific articles on anabolic steroids. A working paper by Engelberg & Moston (n.d.) focuses on doping-related papers published in sport management journals.

The present research aims to provide an in-depth analysis of gender-related patterns that goes beyond the simple identification of the gender gap. It is also a follow-up to our previous study, completing the picture of gender-related patterns in ADS. In that study (Kiss et al., 2022), we examined the contribution of women to anti-doping research and provided a picture of the relational structure of gender aspects of country-related, authorial, and topical features via bibliometric analysis. To our knowledge, it was the first original article in anti-doping science that specifically studied the presence and role of women based on a bibliometric analysis of international academic literature.

Aims and research questions

The research design was based on the overall research aim of investigating whether gender in a specific field (ADS) has an effect on different aspects of research impact, including (1) the size of the citation impact obtained by the research output, (2) the impact on the development of the knowledge base of ADS, expressed as the capacity to integrate knowledge from different research areas, and (3) the (expected) type of research impact, targeting either societal or scientific developments (or both). This latter aspect (societal vs. scientific impact) is not directly investigated: rather than analyzing societal/scientific impact itself, we used a proxy that can be expected to predict the type of impact, namely the research aim or, to put it differently, the research orientation. In particular, it is a plausible hypothesis that the outcomes of research with, e.g., a “societal” orientation/aim will be valorized in societal progress, as a type of (societal) impact. We capitalized on this expectation by mapping the relationship between gender and research aim, which is assumed to be causally related to the type of subsequent impact, within the field.

Accordingly, our three research questions were the following:

RQ1. How do women’s authorship features (first, last, corresponding, position in the author network) affect scientific impact within ADS?

RQ2. Are there gendered differences in knowledge integration between different areas behind ADS?

RQ3. Is there a difference between dominantly female vs. dominantly male authored papers in research goal orientation (i.e. orientation toward societal progress vs. scientific progress)?

Methods

For RQ1, we employ a form of regression analysis to disentangle the effects of several potential factors on citation impact, including female authorship and its patterns in the anti-doping literature. Via this tool, we aim to capture and quantify the individual effect of female authorship on research impact (if any) and separate it from the contribution of other factors. At the same time, it offers a natural way to test whether a “gender gap”, well documented in earlier studies for many research areas, exists in the field(s) of ADS.

In the study of knowledge integration (RQ2), we introduce a new metric to uncover gender differences in authors’ capacity to connect the diverse reference fields behind ADS.

For RQ3 we applied qualitative content analysis to investigate the difference between dominantly female vs. dominantly male authored papers in research goal orientation. Our focus (in this part of the research) was the identification of research orientation which can be expected to predict the type of impact (scientific progress, societal progress) for the paper. However, the concept of impact is not directly targeted in this approach.

The study concept is depicted in Fig. 1.

Fig. 1 The logical framework of investigations

Data collection

To delineate the anti-doping research field, bibliometrics-aided retrieval was used in line with the methodological paper of Gal et al. (2015). Initially, a core dataset was established using a keyword profile based on a core publication (Acute Topics In Anti-Doping) for ADS (Core set, C). Subsequently, a broader dataset of publications and research outputs was assembled using a more expansive search profile (Broad search term set, B). Lastly, citation-based similarity measures were employed among the document sets to gain a final dataset with a high degree of precision.

For data collection and field delineation, the Web of Science (WoS) Core Collection databases were used (including the Science Citation Index, the Social Sciences Citation Index, and the Arts & Humanities Citation Index). The Core set was based on the WoS database with a search query of “anti-doping* OR antidoping*” OR “anti-doping”*. The search terms were applied to the title only in WoS. No limitations were placed on the dates of the searches; the final date of data collection was October 20, 2021. The search resulted in a core set of 572 publications.

For the Broad search term set, B, we identified relevant anti-doping-specific search terms in sport science journals, systematic reviews, highly cited research papers, and WADA documents. We created a broad search term using the following query: “doping control*” OR “doping prevention*” OR “doping-free sport”* OR “clean sport*” OR “drug-free sport*” OR “anti doping in sport*” OR “fair sport*” OR “WADA*” OR “fair play*” OR “anti-doping*” OR “antidoping*” OR “anti doping*” in (Topic) in the WoS database. The broad search term set included 2390 research papers.

In the next step, citation-based similarity measures were applied to integrate the core dataset and the broad search term set, resulting in two further document groups. Group I included papers in the broad search term set citing the core dataset at least twice (N = 889). Group II comprised papers in the broad search term set with at least 10% of their references among the pooled references of the core dataset (N = 1594). The final set included publications from the core dataset (C) and publications belonging to group I or II, thus ensuring the inclusion of relevant topics only. After eliminating duplicates, the final set comprised 1802 publications, published between 1998 and 2021. After excluding studies whose authors could not be assigned a gender, 1341 studies were included in the analyses. Figure 2 illustrates the identification workflow of relevant research papers.

Fig. 2 Flowchart of database construction. Source: Kiss et al. (2022)
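The delineation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the input structures and the two readings of the thresholds (a paper "cites the core dataset at least twice" if it cites at least two core papers; the 10% threshold refers to the share of a paper's own references found among the pooled core references) are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def delineate_field(
    core_ids: Set[str],
    core_refs: Set[str],
    broad: Dict[str, Set[str]],
    min_core_citations: int = 2,
    min_ref_share: float = 0.10,
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Hypothetical helper combining a core set with a broad search set.

    core_ids  -- identifiers of the core publications (set C)
    core_refs -- pooled references cited by the core publications
    broad     -- broad-set paper id -> set of references cited by that paper
    """
    # Group I: broad-set papers citing at least `min_core_citations` core papers
    # (assumed reading of "citing the core dataset at least twice").
    group_1 = {
        pid for pid, refs in broad.items()
        if len(refs & core_ids) >= min_core_citations
    }
    # Group II: broad-set papers sharing at least 10% of their references
    # with the pooled core references (assumed reading of the threshold).
    group_2 = {
        pid for pid, refs in broad.items()
        if refs and len(refs & core_refs) / len(refs) >= min_ref_share
    }
    # The set union removes duplicates automatically.
    final_set = core_ids | group_1 | group_2
    return group_1, group_2, final_set
```

Working with sets keeps the deduplication step implicit: a paper that belongs to the core set and to either group is counted once in the final set.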

Sex/gender assignment

To identify authors' genders, we first standardized author names by removing abbreviations and symbols and by expanding abbreviated first names to full names. We retrieved full author names from bibliographic data obtained from WoS databases. Using the Gender API (available from https://gender-api.com/en/), a platform analyzing names to ascertain gender, we assigned genders to authors based on their full names. We also noted the country assigned to each author from the publications, which increased result accuracy. The final dataset included 3628 author names, of which 2415 full author names were identified. Among these, genders were assigned to 2346 authors (97%) with an average accuracy of 93.83%. A detailed description of sex identification is provided by Kiss et al. (2022).

Data analysis

The effect of gender differences in authorship on scholarly impact

To construct an explanatory (regression) model of citation impact in anti-doping research, two sets of explanatory variables were defined and calculated for our sample. The unit of analysis was the individual publication; therefore, factors of citedness were also defined at this level. Most importantly, first, the indicators that accounted for female authorship were developed. This set was then complemented with the group of those factors that have been demonstrated to have a significant effect on citation counts. In the subsequent section, we outline the gender-related indicators, followed by the list and definition of variables that constituted the second group we refer to as “main citation factors”.

Indicators conveying female authorship

The operationalization of the concept of female authorship at the publication level is not an unproblematic task. The main difficulty comes from the fact that papers are typically multiauthor with a mixed gender composition, so the direct attribution of a paper to any gender category is mostly unfeasible. In related studies, a heuristic is most often applied, in which the gender of the 1st (or 1st and last, or the corresponding) author is used as the decisive rule to classify a paper as a female- or male-authored publication (Zhang et al., 2021). Although this heuristic is a reasonable choice on the common assumption regarding distinguished roles within author listings, we attempted to take a step further and formulate women’s contributions from further, different angles. To that end, the following indicators were employed (short names for each variable are indicated in parentheses; the measures are defined for individual papers).

  • Gender of the corresponding author (c.au). A binary variable indicating whether the corresponding author of the paper is male or female (0 = male, 1 = female).

  • The proportion of female authors (weight). The ratio of the number of female authors to the number of all authors of a paper. The measure represents the weight of female authorship at the paper level, and its value ranges between 0 and 1 (for single-authored papers, this measure, therefore, takes its maximum or minimum value depending on the gender of the author).

  • Relative position (rel. position). The indicator is a generalization of those measures that account for the (assumed) role of female authors based on the byline, i.e. the order of authors. It is defined as the rank of the first appearance of a female author in the author list, normalized to the number of authors. For example, if a paper has n = 4 authors, of which the 2nd and 4th are female, then the value of this indicator equals 2/4 = 0.5. It can also be conceived as the highest (relative) rank held by a female author in the author list. The 1st author position then becomes a special case: it is the highest relative rank attainable for a single paper, and it becomes more pronounced as the size of the author group grows. Technically, the value decreases in this case, since under this definition the best value is the one closest to 0, while the worst value is always 1. For single-authored papers, a correction was applied, setting the value to 0 or 1 for female- and male-authored papers, respectively.
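The three indicators listed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `"F"`/`"M"` coding of genders and the list representation of the byline are assumptions, and the text does not specify the value of rel. position for multi-author papers without any female author, so the sketch returns `None` in that case.

```python
from typing import List, Optional


def c_au(genders: List[str], corresponding_index: int) -> int:
    """Gender of the corresponding author: 0 = male, 1 = female."""
    return 1 if genders[corresponding_index] == "F" else 0


def weight(genders: List[str], target: str = "F") -> float:
    """Share of authors of the target gender among all authors of a paper."""
    return sum(1 for g in genders if g == target) / len(genders)


def rel_position(genders: List[str], target: str = "F") -> Optional[float]:
    """Rank of the first author of the target gender, divided by byline length."""
    n = len(genders)
    if n == 1:
        # Correction for single-authored papers described in the text.
        return 0.0 if genders[0] == target else 1.0
    for rank, gender in enumerate(genders, start=1):
        if gender == target:
            return rank / n
    # Not specified in the text: no author of the target gender on the byline.
    return None


# Byline from the worked example: 4 authors, the 2nd and 4th are female.
byline = ["M", "F", "M", "F"]
print(c_au(byline, corresponding_index=3), weight(byline), rel_position(byline))
```

The `target` parameter is a convenience so that the same functions can produce the male counterparts of weight and rel. position.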

Variables for the main factors of citation impact. The remaining variables introduced in the explanatory model of citation impact all served to control for those factors that have been shown to exert a considerable effect on citation counts (Tahamtan et al., 2016). Since these factors constitute a relatively wide spectrum, we included the most important ones based on the bibliometrics research tradition. This consideration left us with the following indicators.

  • Research field and area. As has long been evidenced in bibliometrics, research fields and areas show differing citation behavior, so the impact of papers from different fields is not directly comparable. To account for this component of citedness, control variables were applied conveying the research area to which a paper belongs. To support a more robust model, we used a relatively high-level taxonomy distinguishing between six broad categories. Papers were preassigned to these categories in the data source (Clarivate InCites Analytics©, providing the so-called GIPP research categorization scheme). The six categories were (1) Clinical, Preclinical and Health, (2) Life sciences, (3) Physical sciences, (4) Engineering and technology, (5) Social sciences and (6) Arts and Humanities. Each of these areas entered the model as a dummy variable with values {0,1}. For example, a paper took on the value 1 in the Life sciences variable if it belonged to that area and 0 otherwise. Note that multidisciplinarity, which is clearly present in ADS, could be represented this way since any paper could belong to more than one category.

  • Journal Quality. Since publications in top-tier journals have been shown to attract more citations than lower-ranked journals, a journal metric is applied in the model as an explanatory variable. For this role, the well-known Journal Impact Factor (JIF) has been chosen. JIF values for our dataset were retrieved from the InCites database (as provided in Clarivate’s Journal Citation Reports).

  • Further indicators of authorship. The author composition of papers exhibits several characteristics that influence the extent of their recognition in the scientific community. Given this insight, we included further variables to complement those of female authorship, thereby disentangling the effect of different author-related characteristics of papers. In particular, three indicators were implemented: (1) the number of authors for a paper (co.au), (2) Author productivity (au.prod), and (3) Country of affiliation (cou.freq). The motivation for co.au is that the number of co-authors and citedness typically show a positive correlation. Author productivity for paper P is the total number of papers within our sample that pertain to the authors of P. This serves as a proxy for the visibility of the authors of a paper within the area of ADS. Similarly, Country of affiliation accounts for another aspect of author visibility and reputation by summing for a paper P how many times the affiliated countries (of the authors) of P appear within the affiliation data in our sample. In other words, this variable is a score of country-level affiliations for a paper, where each country contributes with its own weight (frequency) in the sample.

  • Type of document. A distinct set of factors in relation to citation impact is the so-called document type. It is also a well-confirmed hypothesis that, in general, research (or target) articles, reviews, and other bibliographic types, such as editorials, proceedings papers, book chapters, etc., tend to have differing potentials to attract recognition and hence citation counts. To control for this, we used a categorical variable with three levels: (1) Article, (2) Review, and (3) Other.

  • Open Access. Related to the aspect of document type is the open access status of a paper. There is a vast amount of literature confirming the so-called “OA citation advantage” assumption, namely that a publication available to the reader free of charge, that is, published under some form of OA, can collect more citations than the same publication behind a paywall. To address the intricacies of OA publishing, we introduced a two-level dummy variable into our model that simply recorded whether a paper is available only through subscription (non-OA, value = 0) or whether it is published through any open access channel (OA, value = 1). The latter case encompassed all the various types of OA publishing (“gold”, “green”, “hybrid”, etc.). Data were retrieved from the Web of Science databases.

The base regression model

Beyond the explanatory variables outlined above, an appropriate measure for citation impact had to be selected for the target variable. Instead of pure citation counts, we have chosen a measure that is designed to yield directly comparable impact scores between different types of publications. In particular, the so-called field-normalized citation score (FNCS or NCS) was applied as the outcome variable, which is defined for a publication as the ratio of its number of citations to the average citation number within the same research field and the same publication cohort (i.e. papers with an identical publication year). The FNCS indicator hence normalizes citation counts in two ways, to the field of research and to the age of the paper. Although we control for these factors already in the model, as research fields and the year of publication are featured among our explanatory variables, model performance can still be improved by controlling for these on “both sides” of the equation. The main reason for this choice, however, is that this citation measure can be applied as a continuous outcome variable suited to a simple linear regression model (as opposed to raw citation counts): its values are real numbers. In particular, its value is FNCS = 1 (< 1 or > 1) if a publication reaches (remains below or exceeds) the citation average within the field: in fact, its value conveys how many times the field’s average a paper is cited.

In sum, our base regression model was a multiple linear regression of the following form:

$$b_{0}+\sum_{i=1}^{n}b_{G(i)}G(i)+\sum_{i=1}^{m}b_{F(i)}F(i)+\sum_{i=1}^{u}b_{C(i)}C(i)=\ln \text{FNCS}$$

where

  • \(G(i,\dots ,n)\) is the series of n variables on gender and female authorship (n = 3),

  • \(F(i,\dots ,m)\) is the group of m variables controlling for the research field (m = 6 field variables),

  • \(C\left(i,\dots ,u\right)\) is the set of u variables accounting for the main factors on citation impact (journal quality, further authorship patterns, document type, year of publication, open access status).

FNCS, the normalized impact measure, is included after a natural logarithmic transformation (log-normalization). This was motivated by FNCS being a variable with a highly skewed distribution (as is characteristic of citation distributions), so a transformation was needed to achieve an approximately normally distributed target variable for the linear regression model. Regarding data collection, FNCS values were obtained from the InCites service of Clarivate, measured at the Subject Category level.
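To make the outcome variable concrete, the following Python sketch computes a field- and year-normalized citation score and its natural logarithm for a toy set of records. It is an illustration under stated assumptions, not the procedure used in the study: the FNCS values analyzed here were taken from InCites, which relies on its own baselines, whereas the sketch derives the field-year averages from the toy records themselves. The function names and record keys are hypothetical, and since the text does not state how uncited papers (FNCS = 0) were handled under the logarithm, the sketch simply returns `None` for them.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional


def field_normalized_scores(papers: List[dict]) -> Dict[str, Optional[float]]:
    """Citations divided by the mean citations of the same field and year.

    Each record needs the keys 'id', 'field', 'year' and 'citations'.
    The baseline is computed from the given records (toy assumption).
    """
    totals = defaultdict(lambda: [0, 0])  # (field, year) -> [citation sum, paper count]
    for p in papers:
        cell = totals[(p["field"], p["year"])]
        cell[0] += p["citations"]
        cell[1] += 1
    scores: Dict[str, Optional[float]] = {}
    for p in papers:
        citation_sum, count = totals[(p["field"], p["year"])]
        mean = citation_sum / count
        # Undefined if no paper in the field-year cell has been cited.
        scores[p["id"]] = p["citations"] / mean if mean > 0 else None
    return scores


def log_fncs(score: Optional[float]) -> Optional[float]:
    """Natural log of FNCS; None where the logarithm is undefined."""
    return math.log(score) if score is not None and score > 0 else None


toy = [
    {"id": "a", "field": "X", "year": 2015, "citations": 10},
    {"id": "b", "field": "X", "year": 2015, "citations": 30},
    {"id": "c", "field": "Y", "year": 2015, "citations": 5},
]
scores = field_normalized_scores(toy)
print(scores, {pid: log_fncs(s) for pid, s in scores.items()})
```

A paper at the field average has FNCS = 1 and therefore ln FNCS = 0, so positive and negative values of the transformed outcome correspond to above- and below-average citedness.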

Women’s impact on knowledge integration

Another of our main research questions concerned a special but equally important aspect of the scientific impact of women in ADS. This aspect can be termed the impact on knowledge integration. The methodology of measuring to what extent female authors are positioned as “mediators” between different fields was as follows. In the first step, we constructed a network representation of our publication sample. Given the pre-existing assignment of papers to the six research areas of the GIPP category system, we extracted the author-paper and the paper-area relations from the data, from which the interconnections between authors and their research areas (within the document sample) were recorded. Based on these interconnections, we built a bipartite graph or network with authors appearing as one type of network node and the six broad research areas as the other type. The edges of the graph, each connecting an author to a research area, conveyed the relation that the author has been active in the related field (i.e. at least one of their publications belonged to that area). The network was also a weighted graph in that edges were assigned a numeric value equal to the number of papers of the author in the corresponding research area. This representation allowed us to investigate to what extent authors, especially female authors, participate in “integrative” research that connects different areas. For this purpose, we introduced a positional network measure based on the author-field graph outlined above. We refer to this as LinkScore, alluding to the idea of quantifying the capacity of an author’s work to link research fields. The LinkScore was calculated for each author in the following steps:

  • Counting the number of connected fields per publication. In the first step, each publication was given a raw score by counting the number of research areas assigned to it. Hence, for “monodisciplinary” papers, the score equaled 1; for papers sorted into two GIPP research categories, it was 2, and so on.

  • Weighting connections based on the distance between fields. Another consideration for the LinkScore measure was that linking distant research areas represents a larger extent of knowledge integration than linking closer areas. Therefore, we assigned higher scores to cases of “long-distance” interdisciplinarity than to cases of “short-distance” interdisciplinarity (Larivière et al., 2015). In particular, the raw LinkScores for publications were weighted according to the following scheme: a weight of 3 was applied to raw LinkScores if there was a minimum of 1 “natural science” field plus Social Sciences and/or Humanities, whereas LinkScores within “natural” or within “social” science fields were not amplified (weight = 1). As a consequence, a paper linking, e.g., the Life Sciences and the Social Sciences, will have a LinkScore = 6 (2 fields with a link weight = 3), while the same paper would gain a LinkScore = 2 if it was assigned, e.g., to the Life Sciences and the Clinical Sciences (2 fields again, but with a link weight = 1). The reason for tripling the weight for the “social + natural sciences” combination was to generate sufficient discriminatory power for the measure, since lower values would not allow us to distinguish between “long-distance” and “short-distance” interdisciplinarity in all cases (e.g., with a weight of 2, the LinkScore for a combination of the four hard science fields would be the same as that of a two-field “social + natural science” combination).

  • Aggregating publication-level scores for authors. The final step was to relate these publication scores to authors, i.e. to aggregate the LinkScores of each author's papers. Since we intended to create a size-independent measure (so that the number of publications should not influence the score), the maximal score by author was taken. In particular, the LinkScore for an author with n papers was calculated as follows: \(\max_{i=1,\dots,n}\text{LinkScore}(i).\)

With the measure specified in the previous three steps, we calculated the LinkScore values for all authors in our sample and used the results to characterize female authors’ contribution to knowledge integration as well as to compare female and male authors along this scale.
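The three steps above translate into a short Python sketch. The function and constant names are hypothetical, and the grouping of the six GIPP areas into “natural” and “social” sets follows the description in the text (four hard science fields versus Social sciences and Arts and Humanities); the area labels are written as they appear in the list of research areas.

```python
from typing import Iterable, List, Set

# Assumed grouping of the six GIPP areas, following the text.
NATURAL: Set[str] = {
    "Clinical, Preclinical and Health",
    "Life sciences",
    "Physical sciences",
    "Engineering and technology",
}
SOCIAL: Set[str] = {"Social sciences", "Arts and Humanities"}


def paper_link_score(areas: Iterable[str]) -> int:
    """Number of areas of a paper, tripled for natural + social combinations."""
    area_set = set(areas)
    raw_score = len(area_set)  # step 1: count connected fields
    long_distance = bool(area_set & NATURAL) and bool(area_set & SOCIAL)
    weight = 3 if long_distance else 1  # step 2: distance-based weight
    return raw_score * weight


def author_link_score(papers: List[Iterable[str]]) -> int:
    """Step 3: the maximum paper-level LinkScore over an author's papers."""
    return max(paper_link_score(areas) for areas in papers)


# The two worked examples from the text.
print(paper_link_score(["Life sciences", "Social sciences"]))  # long-distance link
print(paper_link_score(["Life sciences", "Clinical, Preclinical and Health"]))
```

Taking the maximum rather than the sum keeps the author-level score independent of the number of papers, as intended in the aggregation step.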

Research goal orientation of dominantly female vs. dominantly male papers

To detect potential differences in the research goal orientation of dominantly female- vs. non-female-authored papers, content analysis was carried out. For the analysis of whether dominant research goals are associated with gendered authorship, we had to sort publications into at least two categories: one that is dominated by female authorship and the other that is characterized by male authorship. We referred to these two as “dominantly female publications” and “dominantly male publications”, respectively. In the definition of the two categories, we relied on our indicators of female authorship, namely, c.au, recording the gender of the corresponding author; weight, the share of female authors among the author group per paper; and rel.position, the rank of female authors within author lists. In particular, we applied the following rules:

  • Dominantly female publications. For a publication to be deemed primarily female-authored, at least one of the following conditions had to be met: (1) the corresponding author is female, (2) at least half of the authors of the paper are female, or (3) female authors (at least partially) appear in the first 50% of the author list. Formally, \((\text{c.au}=1)\vee(\text{weight}_{\text{female}}\ge 0.5)\vee(\text{rel.position}_{\text{female}}\le 0.5)\).

  • Dominantly male publications. Consequently, for a publication to be deemed primarily male-authored, each of the following conditions had to be met: (1) the corresponding author is male, (2) more than half of the authors of the paper are male, and (3) male authors (at least partially) appear in the first 50% of the author list. Formally, \((\text{c.au}=0)\wedge(\text{weight}_{\text{male}}>0.5)\wedge(\text{rel.position}_{\text{male}}<0.5)\).

Although logically sound, the above definitions imply some asymmetry between the two groups: at the cost of accounting for various roles of female authors in the “female-dominated group”, the “male-dominated” set becomes more restrictive. To counterbalance this effect, we relaxed the second definition (to retain the sensitivity of the first) and considered publications that met at least two criteria out of the three (corresponding authorship, weight and rank of male authors) as eligible for the male-dominated category. Based on this consideration, a certain set of papers was reassigned to the male-dominated category in a post hoc manner.
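The classification rules, including the relaxed two-out-of-three criterion for the male-dominated category, can be sketched as follows. This is one possible reading rather than the authors' procedure: the function name and signature are hypothetical, the male weight is derived as one minus the female weight (which assumes that every author has an assigned gender), and the relaxed male rule is given precedence over the female rule, reflecting the statement that some papers were reassigned to the male-dominated category post hoc.

```python
from typing import Optional


def classify_paper(
    c_au: int,
    weight_female: float,
    relpos_female: Optional[float],
    relpos_male: Optional[float],
) -> str:
    """Return 'male', 'female' or 'unclassified' for a single paper."""
    # Assumption: all authors are gendered, so the two weights sum to 1.
    weight_male = 1.0 - weight_female
    male_criteria = [
        c_au == 0,
        weight_male > 0.5,
        relpos_male is not None and relpos_male < 0.5,
    ]
    female_criteria = [
        c_au == 1,
        weight_female >= 0.5,
        relpos_female is not None and relpos_female <= 0.5,
    ]
    # Relaxed male rule: at least two of the three male criteria hold.
    # Giving it precedence is an assumption about the post hoc reassignment.
    if sum(male_criteria) >= 2:
        return "male"
    # Female rule: at least one of the three female criteria holds.
    if any(female_criteria):
        return "female"
    # Fallback, e.g. when c_au is neither 0 nor 1.
    return "unclassified"


# Female corresponding author listed last among four authors, three of them male:
print(classify_paper(c_au=1, weight_female=0.25, relpos_female=1.0, relpos_male=0.25))
```

Under the original definitions this last example would count as dominantly female because of the corresponding author alone; under the relaxed rule it meets two male criteria and is assigned to the male-dominated category, which illustrates the kind of reassignment described above.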

In our attempt to capture gender dominance based on the authorship patterns of papers, it would be natural to incorporate the existing evidence on author credit allocation practices. However, in doing so, a basic difficulty would be that the ADS domain under study has a broad multidisciplinary scope, incorporating health/medical sciences, natural sciences, engineering, social sciences and humanities as well. Since these disciplines substantially differ in their practices of allocating author credit, we decided to overcome this difficulty by using the most general considerations on author credit indication (taking the first author, the corresponding author and the gender ratio in our case) and using them in tandem so that they can compensate for each other’s biases when applied to the diverse subfields within ADS.

Coding criteria

After categorizing the papers according to gender, a random sample of abstracts of 105 dominantly female and 105 dominantly male publications was selected from the final set of anti-doping publications. This dataset included the WOS ID, title, abstract, and authors' keywords of each randomly selected paper. The reading and coding of the selected abstracts were performed by two of the authors. They conducted the evaluation simultaneously but independently to avoid influencing each other's scoring, which reduces the risk of bias in the evaluation. Following the coding of the abstracts, the two authors discussed their differences until a consensus was reached. When coding the selected publications, only the content of the abstract was considered for categorization, as the content of the article is often not discernible from the publication title alone. During the coding of the abstracts, the two authors adhered to the definitions given in the working paper of Zhang et al. (2021). To classify publications by their aims, Zhang et al. replaced the widely used basic/applied distinction with a distinction between three variations of the aims of the research: aiming at scientific progress, aiming at societal progress, or both. The criteria for classifying publications into the three categories are shown in Table 2.

Table 2 Coding criteria of the content analysis

It should be pointed out that the operationalization of the notions of “basic science” and “scientific orientation” vs. “applied science” and “social/societal orientation” is context-dependent, i.e., it cannot be detached from the domain of investigation in our research. In our case, where the context is the domain of ADS, “applied” is considered to cover the approaches aimed at directly contributing to Anti-Doping applications (regardless of whether this concerns the medical, social or ethical aspect of AD), while “basic” or “scientific” is considered a more general research aim. Therefore, although, for example, Clinical Studies are usually considered applied research (not being Basic Medicine), in this context they would also be (and have been) categorized as “basic” or as having a “scientific aim” on an individual basis if their subject went beyond direct applications, that is, if external utilization of knowledge was addressed and the study hence aimed at contributing to the clinical knowledge base beyond ADS.

Results

Gender differences in the proportion of authors and authorship position

Female authorship throughout the timespan of the publication records in anti-doping is best described by the dynamics of three gender-related indicators. Figures 3 and 4 depict trends in these indicators (corresponding authorship, mean relative position of female authors on papers, and the share of female authors in author lists, i.e., mean weight) for the 17 years between 2005 and 2021.

Fig. 3

The annual values of the number of papers with female corresponding authors and the fractional count of papers with regard to female authors (weight.total) between 2005 and 2021

Fig. 4

The yearly averages of the relative position of female authors on papers and the share of female authors in author lists (mean weight) between 2005 and 2021

The first indicator captures whether the role of the corresponding author for a paper is taken by a male (0) or female (1) researcher (c.au). We used the annual number of papers with a female corresponding author (no. papers with c.au = 1). What is striking in Fig. 3 is that this quantity has been increasing since 2007 in a linear and mostly monotonic fashion (with a few exceptional years of “recession”). The curve is also rather steep, as the number of such papers doubled approximately every five years during this period. This tendency clearly signals a growing representation of women in ADS.

A similar tendency is also present in the dynamics of the other two indicators. For the proportion of female authors per single paper (weight), we used the annual sum of this quantity (weight.total in Fig. 3), which is the total weight of female authors in a particular year. In bibliometric terms, this is identical to the “fractional counting” of papers with regard to authors, where each publication is counted according to the weight of its female authors (meaning that a paper with 40% women on its author list is counted as 0.4 papers instead of 1). In this way, we can calculate the total amount of output (per year) that can be assigned to female authors only. Just as with the number of papers with female corresponding authors, the total weight of women-authored papers has also increased since 2005, in fact in much the same manner as corresponding authorship. The two curves run in close proximity (almost overlapping), so the same steep slope and overall monotonicity are shown by the annual total weight of female authors, which approximately doubles every five years.
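The fractional counting described above can be made concrete with a minimal sketch. The input format (publication year, number of female authors, total number of authors) and the function name are hypothetical and chosen only for illustration; the sketch also returns the yearly mean of the per-paper weights, which is simply the annual total divided by the number of papers in that year.

```python
from collections import defaultdict


def yearly_female_weight(papers):
    """papers: iterable of (year, n_female_authors, n_authors) tuples (hypothetical format)."""
    total = defaultdict(float)  # annual sum of per-paper female weights
    count = defaultdict(int)    # number of papers per year
    for year, n_female, n_authors in papers:
        total[year] += n_female / n_authors  # e.g. 2 of 5 authors -> 0.4 paper
        count[year] += 1
    # For each year: (total weight, mean weight per paper)
    return {year: (total[year], total[year] / count[year]) for year in total}


# Toy data, not taken from the study.
toy = [(2020, 2, 5), (2020, 1, 2), (2021, 0, 3)]
for year, (weight_total, mean_weight) in sorted(yearly_female_weight(toy).items()):
    print(year, round(weight_total, 2), round(mean_weight, 2))
```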

The annual values of the size-independent version of the same indicator, the weight of female authorship within author lists, are depicted in Fig. 4. In this case, the yearly averages of the shares of female authors are reported (mean weight). For the period under study, this share has been growing slowly but clearly (monotonically): it has increased from 5% (2005) to more than 20% (2021) on average. This trend has been strengthened by the positional changes within the order of authors, as witnessed by the development of the third indicator, the relative position. In Fig. 4, just as for the weight indicator, the mean values of the relative position indicator are shown for each publication year (mean rel. position). Note that this is a “tricky” measure to interpret because lower values represent higher ranks; therefore, the slightly descending curve actually signals improving positions for women. More specifically, from 2007 on, the mean position has gradually been improving from approximately 0.6–0.5 to 0.4, meaning that female authors are moving forward from the “first 60%” to the “first 40%” of author ranks (i.e., within individual papers’ author lists). Although in certain research areas being the last author is also a distinguished position, in such a multidisciplinary domain as ADS, higher ranks (being the first or among the first authors) can still be expected to trigger broader recognition. Altogether, the three measures of female authorship mutually confirm that women’s representation has been considerably extended in the domain of ADS throughout the last two decades. Next, we turn to our main question on the relationship between this elevated representation and research impact.

Effects of women’s authorship features on scientific impact

In the regression analysis of the relationship between female authorship and citation impact, we followed the strategy of finding the best available model. Regarding variable selection, that is, the search for the relevant factors influencing citation scores (out of the indicator set described above), we preferred the “theory-based” method. That is, we prioritized theoretical considerations in setting up our collection of alternative models or indicator sets over automatic selection algorithms that seek the best-fitting model in purely quantitative terms. In particular, we decided to include in each alternative model the variable group C(i,…,u), which we called “the main factors of citation impact”, as well as F(i,…,m), i.e., the control variables for the research area, since the effect of these variables is widely demonstrated in the literature. The only source of variation between our models was the set G(i,…,n), that is, the variables of female authorship, the effect of which we wanted to test in all possible configurations. This left us with seven models for testing. Of these, models 1–3 involved only one of the three indicators (c.au, rel.position, weight), models 4–6 involved two of them, accounting for all such combinations, and model 7 involved all three variables. The most informative results are reported in Table 3, which compares the effect size of each explanatory variable across the seven alternative models using the standardized regression coefficients (and also reports the model fit measures).
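The count of seven models follows from taking every non-empty subset of the three female authorship variables. A minimal sketch of this enumeration is given below; the helper name is hypothetical, and the ordering within the one-variable and two-variable groups is an assumption that need not match the numbering used in Table 3.

```python
from itertools import combinations

GENDER_VARS = ("c.au", "rel.position", "weight")


def model_variants(variables=GENDER_VARS):
    # All non-empty subsets, ordered by size: singles, then pairs, then the full set.
    return [
        subset
        for size in range(1, len(variables) + 1)
        for subset in combinations(variables, size)
    ]


for number, subset in enumerate(model_variants(), start=1):
    # Each model would additionally contain the fixed sets C and F.
    print(f"Model {number}: C + F + {', '.join(subset)}")
```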

Table 3 Standardized coefficients (beta values) in the seven regression models under study

As reflected by the model fit measures in Table 3, the seven models performed almost equally in terms of goodness of fit. The AIC (Akaike Information Criterion) value is practically the same across all models, meaning that varying the set of gender-related variables does not affect how well the model fits the data. The R² value is also equal across models: they account for approximately 15–20% of the variation in citation scores, which makes them weak but (given such complex relationships) acceptable explanations. In fact, we also ran automatic model selection algorithms to complement the theory-based selection, but the performance of the best “automatic” model did not differ from the ones reported here. Moreover, the effect sizes of individual factors also remained quite stable across the models.

First and foremost, indicators of female authorship appear to have a weak effect on citation scores. More precisely, the corresponding author being a female (c.au) is weakly but positively correlated with citation scores (the β-values ranging from 0.04 to 0.06 across models). The relative position of female authors (rel.position) technically shows a mostly negative, weak correlation with impact (the β-values ranging from −0.04 to 0.01). However, recall that, by the definition of relative position, lower values mean a higher rank of female authors in the author list. Hence, we can say that this negative correlation signals a positive relationship between female author rank and citation scores, so that the higher the rank of female authors is, the greater the expected impact of the paper. On the other hand, the proportion of female authors (weight) came out as slightly negatively affecting citation values (the β-values ranging from −0.05 to 0.01).

Regarding the pattern of effect sizes in relation to the other variables, our models confirmed the previous findings regarding the relevant factors of citation impact. In the “main factors of citedness” set, the greatest and invariable positive effect is attributed to journal quality (Journal.Impact.Factor), which is moderate (β-value = 0.16) but the highest among all variables with a positive influence. The open access citation advantage has also been confirmed (β-value = 0.13). The next such indicator is author productivity (au.prod, β-value = 0.10), emphasizing the value of the extent of authors’ representation within the professional community. Interestingly, the number of co-authors (co.au) and authors’ nationality (cou.freq) appeared to have a small but positive effect on impact. Finally, a weak negative influence was detected for publications that did not belong to the “Article” or “Review” category (DT = Other, β-value = −0.19) and for the year of publication (PY, β-value = 0.14), the latter warning that even age-normalized citation measures such as our target variable cannot fully remove the effect of the citation window (so that older publications still get higher citation scores than more recent ones).

To gain a deeper insight into the structure of effects exerted by individual variables, especially the gender-related ones, we looked into the composition of a selected model out of our seven variants. Since all the models tended to be equivalent regarding their performance, we have chosen the one that was most equipped with the indicators of female authorship: in Model 7, all three such variables were included. The detailed regression results for this model are reported in Table 4 (primarily including the raw regression coefficients and the significance values).

Table 4 Detailed regression results from the selected model (Model 7)

According to the detailed analysis, the effect of female authorship variables is not statistically significant. Significance (at the level of p ≤ 0.01) was detected for the same factors that showed higher effect sizes (as discussed above): Journal.Impact.Factor, the year of publication (PY), author productivity (au.prod), OA status and the document category “Other” (DT = Other). Additionally, publishing papers in the Physical Sciences or, to a lesser extent, in the Clinical, Preclinical & Health area appears to have a positive and significant influence on citation scores.

Although they are not statistically significant, it is still worth considering the raw coefficient estimates of the gender-related variables, in particular that of c.au, the corresponding author being a female researcher. The reason for this is the simple fact that our “sample” can be conceived as the population itself, since our data were retrieved by delineating the field of ADS and using “all available” publications. The estimate is B(c.au) = 0.36, meaning that papers with female corresponding authors (in the sample) have, on average, a 43% higher citation score (exp(0.36) = 1.43) than those with male corresponding authors (other factors held constant). Since an impact value above FNCS = 1 already means a recognition level above the world average (within the field), this can be viewed as a considerable increment.
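The step from the raw coefficient to the percentage can be reproduced with a few lines. The sketch assumes, as the exponentiation in the text suggests, that the target variable enters the regression on a (natural) logarithmic scale, so that a coefficient B for a 0/1 variable corresponds to a multiplicative factor exp(B); the function name is hypothetical.

```python
import math


def percent_change_from_log_coefficient(b: float) -> float:
    # A coefficient b on a log-scale outcome multiplies the outcome by exp(b);
    # the relative change in percent is therefore (exp(b) - 1) * 100.
    return (math.exp(b) - 1.0) * 100.0


b_cau = 0.36  # raw estimate for c.au reported in the text
print(round(math.exp(b_cau), 2))                            # multiplicative factor
print(round(percent_change_from_log_coefficient(b_cau)))    # percentage increase
```

With B(c.au) = 0.36 this yields a factor of about 1.43, i.e., the 43% increment quoted above.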

Women’s impact on knowledge integration

The overall structure of knowledge integration present in our data is first reported via a visualization of the network of authors and research areas. Recall from the Methods section that this bipartite graph represents the relations between authors and their research fields (given the GIPP categorization scheme) by connecting each author to the research categories assigned to her/his papers. In the visualization (Fig. 5), the nodes corresponding to the six research areas are indicated by squares, while nodes corresponding to authors are shaped as small dots. To support the interpretation and increase visibility, two further modifications were applied. First, the edges of the network (each connecting an author to a field) were colored according to the research field involved in the relation (similarly, the borders of the research area nodes were colored the same way). Colors, therefore, were specific to research areas as follows: red edges signal connections to the Physical Sciences (PS), blue edges to the Life Sciences (LS), green edges to the Clinical, Preclinical and Health Sciences (CP&H), orange edges to the Social Sciences (SS), yellow edges to the Arts and Humanities (A&H), and gray edges to Engineering and Technology (E&T). Second, the graph has been simplified (or “purified”) for a better structural view: only authors attached to at least two fields are included in the visualization; hence, the reported subgraph shows the pattern of interdisciplinarity within the data.
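The construction of the author–field network and its restriction to interdisciplinary authors can be sketched as follows. This is a minimal illustration, not the pipeline used for Fig. 5: the input format (one record per paper with its authors and research-area codes), the author labels and the function names are assumptions made for the example.

```python
from collections import defaultdict


def author_field_links(papers):
    """papers: iterable of (authors, fields) pairs; returns author -> set of fields."""
    links = defaultdict(set)
    for authors, fields in papers:
        for author in authors:
            links[author].update(fields)  # connect the author to every field of the paper
    return dict(links)


def interdisciplinary_subgraph(links, min_fields=2):
    # Keep only authors attached to at least `min_fields` research areas.
    return {author: fields for author, fields in links.items() if len(fields) >= min_fields}


# Toy records with hypothetical author labels and area codes.
toy_papers = [
    (["A", "B"], ["PS", "LS"]),
    (["B", "C"], ["CP&H"]),
    (["D"], ["SS"]),
]
subgraph = interdisciplinary_subgraph(author_field_links(toy_papers))
print(sorted(subgraph))
```

In the toy data, author C (one field) and author D (one field) drop out, while A and B remain because they are linked to two or more areas.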

Fig. 5

The bipartite graph represents the relationships between authors and research areas (according to the GIPP categorization scheme). Colors are specific to research areas as follows: red edges signal connections to the Physical Sciences (PS), blue edges to the Life Sciences (LS), green edges to the Clinical, Pre-Clinical and Health Sciences (CP&H), orange edges to the Social Sciences (SS), yellow edges to the Arts and Humanities (A&H), and grey edges to Engineering and Technology (E&T). The network is presented with two layouts: the first represents the joint distribution of authors and research areas (using the Fruchterman-Reingold layout algorithm), while the second captures the basic structure of the network, where author nodes in the same position (related to the same set of fields) are collapsed into a single node by overlaying one another (using a Multidimensional Scaling-based layout).

This pattern of interdisciplinarity, as observable through the graph, is strikingly clear. The strongest connection emerges between the Physical Sciences (PS) and the Life Sciences (LS), as signalled, on the one hand, by the close proximity of these two nodes at the centre of the network (the graph layout being sensitive to the number of connections, positioning densely related nodes close to each other) and, on the other hand, by the large number of authors (dots) with blue and red edges. This part of the network is joined with the rest through the Clinical, Preclinical and Health Sciences (CP&H), which is in a typical knowledge broker position: many authors relate it to either the Physical or the Life Sciences or both, as witnessed by many authors (dots) with green, blue, and/or red edges. However, the area of CP&H also plugs the Social Sciences (SS) into the system, as the latter is attached to the network through its interconnections with the CP&H area. This becomes visible through a distinct set of authors (dots) in the “northern” part of the network, possessing green and orange edges. Finally, both the Arts and Humanities (A&H) and Engineering and Technology (E&T) behave as satellite areas. The humanities are pulled in through some connections with the social sciences, while engineering is mainly connected through the physical sciences. (In fact, as we shall see later, in some papers the latter fields were present in broader combinations, but – as a further simplification – for this overview we only kept the “backbone” of our graph, where stronger links with at least two related publications per author were preserved.)

The analysis of knowledge integration scores (the LinkScore values) among authors yielded the distribution shown in Table 5, which is a cross-tabulation of two variables, namely, gender and LinkScore. The latter was interpreted here as a categorical variable, for which some explanation is in order. Since, conceptually, the integration score is an author-level maximum over publication scores, the application of this measure to our sample resulted in five discrete values. More specifically, as expected from its design, the measurement resulted in the following values:

  • 1, for authors who have been active in a single field,

  • 2 or 3, for those who have typically been connecting two or three more closely related areas of “hard science”,

  • 6, for those who have typically linked a “hard science” field with either the social sciences or the humanities,

  • 9, for those who have been linking more than two fields, including the social sciences and/or the humanities.
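The author-level aggregation mentioned above (the maximum over the scores of an author's publications) can be sketched as follows. The publication-level scores are taken as given here, since their computation is defined in the Methods section; the identifiers, the input format and the function name are hypothetical.

```python
def author_link_scores(paper_scores, paper_authors):
    """
    paper_scores:  paper id -> publication-level LinkScore (taken as given)
    paper_authors: paper id -> list of author ids
    Returns author id -> maximum LinkScore over that author's papers.
    """
    best = {}
    for paper_id, score in paper_scores.items():
        for author in paper_authors[paper_id]:
            best[author] = max(best.get(author, 0), score)
    return best


# Toy example with hypothetical ids: one long-distance paper is enough
# to place author "a1" in category 6, regardless of the other papers.
scores = {"p1": 1, "p2": 6, "p3": 2}
authors = {"p1": ["a1", "a2"], "p2": ["a1"], "p3": ["a2"]}
print(author_link_scores(scores, authors))
```

Because the maximum is used, a single highly integrative paper determines the category of an author, independently of how many papers the author has published.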

Table 5 The joint distribution of gender and the LinkScore indicator

The level of knowledge integration increases with the score. The test for the association of gender and LinkScore did not indicate a statistically significant relationship between the author’s gender and the role in knowledge integration in general (χ² = 18.26, p = 0.14), and the association of these variables was also weak (Cramér’s V = 0.05).
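For readers less familiar with these statistics, the following sketch shows how the χ² statistic and Cramér’s V are obtained from a contingency table such as Table 5. The table used in the example is a toy one with made-up counts, not the study data, and the function names are hypothetical; the sketch assumes that no row or column total is zero.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and sample size for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


def cramers_v(table):
    statistic, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1  # smaller table dimension minus one
    return math.sqrt(statistic / (n * k))


toy_table = [[10, 20], [20, 10]]  # made-up counts for illustration only
print(chi_square(toy_table), cramers_v(toy_table))
```

Cramér’s V rescales χ² by the sample size and the table dimensions, which is why a χ² value obtained on a large author set can still correspond to a very weak association.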

Considering the distribution of LinkScores among female and male authors reported in Table 5, it is indeed striking that the percentage of female authors is quite close to that of male authors in almost every category. The largest group in both genders (45–50% of the authors) consists of “specialists”, that is, researchers who have been active in only one field. A fair number, however, belong to the “short-distance interdisciplinary” category (approximately 20–30%), connecting 2 or 3 “hard science” fields. (Nonetheless, we should note that interconnections within the “hard sciences” category can also vary in the level of knowledge integration, as the Physical Sciences–Life Sciences combination, which is rather frequent in our data as could be observed through the network visualization, potentially covers a greater distance than the Life Sciences–Clinical Sciences combination.) Authors with a value of 6, that is, those connecting an SSH field with a hard science area, are far fewer, approximately 4–5% for both genders. Finally, the share of the most integrative authors with a value of 9, linking SSH fields with multiple hard science fields, is approximately 0.8–0.9% among both female and male researchers. The distribution of female and male authors according to LinkScore categories is presented in Fig. 6.

Fig. 6

Comparison of LinkScore categories by gender

Based on our results, we also compiled a list of the most integrative researchers in the ADS field. This list is reported in Table 6, which collects the authors on outputs with a LinkScore of 9 or 6, that is, all authors participating in research that embodies “long-distance interdisciplinarity”. Within the table, researchers are ordered according to their gender so that the group of female researchers can easily be separated from the other gender group. Because LinkScore relates to a given output and because the measure is not size-dependent, authors listed in any LinkScore category in Table 6 are those who have at least one paper that fits the criteria for that category. In other words, the LinkScore measure captures the potential of an author to connect fields, not the overall performance in connecting fields.

Table 6 Authors with LinkScore values of 6–9, ordered by gender

Gender differences in research goal orientation

To explore gender differences in research goal orientation, 210 randomly selected publication abstracts were coded for three variations of the aims of the research: advancing science, societal impact or both. The key determinants of the coding were the primary purpose of the research and the consideration of external knowledge utilization. The coding criteria for the three categories of the aims of the research are presented in Table 2. The results of the two coding exercises are compared in a confusion matrix (Table 7), based on which the Cohen’s Kappa statistic of the agreement between the two codings was calculated. The results show a considerable agreement between the two coding rounds (disagreement could be detected in the Orientation type = 1, “scientific aim”, category) with a Kappa value of 0.84 (usually interpreted as an indication of a “fairly good” agreement).
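As a reminder of how the agreement statistic is derived from a confusion matrix, a minimal sketch is given below. The matrix in the example is a toy one with made-up counts, not Table 7, and the function name is hypothetical. Cohen’s Kappa compares the observed agreement (the diagonal share) with the agreement expected by chance from the two coders’ marginal distributions.

```python
def cohens_kappa(matrix):
    """matrix[i][j]: number of items coded i by the first coder and j by the second."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement: product of the coders' marginal proportions, summed over categories.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


toy_matrix = [[20, 5], [10, 15]]  # made-up counts for illustration only
print(round(cohens_kappa(toy_matrix), 2))
```

In the toy matrix the coders agree on 70% of the items while 50% agreement would be expected by chance, giving a Kappa of 0.4; the same formula extends unchanged to the three-category matrix used in the study.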

Table 7 Confusion matrix of the two codings

The general share of female- and male-authored papers was made equal. Half of the abstracts (n = 103, 49.0%) were coded for scientific progress as the main aim. The other half was split between aiming for societal progress (n = 44, 21.1%) and having dual aims (n = 63, 30.1%). Dominantly female papers were overrepresented among publications classified as aimed at scientific progress, while the share of male-authored papers was higher in publications classified as aimed at societal progress. Among publications classified as having both aims, the share of male-authored papers is higher by 8%. Contrary to other studies, female authors are engaged more in research we have classified as aimed at scientific progress, while male authors often value research we have classified as aimed at societal progress and research having both aims. Figure 7 represents the gender differences in the research goal orientation of dominantly female vs. dominantly male papers (χ2 = 12.7, p = 0.002).

Fig. 7

Gender differences in research goal orientation of dominantly female vs. dominantly male papers

We also included research areas in the analysis and examined the distribution of subject areas. Among the coded papers, the anti-doping-related articles are mainly distributed in Clinical Sciences (n = 65), followed by Social Sciences (n = 51), Life Sciences, and then Physical Sciences (n = 25). The share of dominantly female vs. non-female papers varies by research area (Table 8). The largest imbalances between the two groups are found in Social Sciences (19 dominantly female papers vs. 32 dominantly male papers) and Clinical Sciences (36 dominantly female papers vs. 29 dominantly male papers).

Table 8 The share of female vs. non-female authored papers by research area

Table 9 shows the distribution of all three categories of research goal orientation according to research areas. Within the 102 publications classified as aimed at scientific progress, anti-doping-related articles mainly occur in Clinical Sciences with 34 publications, Social Sciences with 17 and Physical Sciences; Life Sciences with 16 papers. The largest shares in publications classified as aimed at societal progress or having both aims appear also in the field of Clinical Sciences (n = 14; n = 17) and Social Sciences (n = 16; n = 18).

Table 9 The share of variations of the aims of the research by research field

Figure 8 summarizes the analyzed abstracts according to research areas, gender differences, and research goal orientation. The largest shares of dominantly female papers that aimed at scientific impact were in Clinical Sciences, Social Sciences and Physical Sciences. In the field of engineering and technology, predominantly male papers have been published in all three research goal orientation categories.

Fig. 8

Gender differences in research goal orientation by scientific fields

Discussion

The primary aim of this work was to provide a novel approach for the analysis of research impact and to identify actors within the anti-doping research landscape who play the role of connecting different research fields in their work. Specifically, we explored the characteristics and factors of the scientific impact of women’s contribution to ADS and detected potential gender differences in research goal orientation. In addition, we were interested in women’s contribution to the interconnections and combinations of the different areas and fields behind anti-doping that would lead to effective interdisciplinarity, which is a requirement for addressing such complex and multi-faceted problems as doping. To realize these aims, we used a mixed-method approach including regression models, network measures, and qualitative content analysis.

Academic impact

The results suggest that the domain of anti-doping exhibits fair-play conditions or a “healthy” ecosystem, where no gender bias against female authors can be detected. Furthermore, in terms of citation scores, the contribution of female authors with research excellence is even more impactful. This is reflected in two study outcomes. On the one hand, most gender-related variables showed a small positive, albeit not statistically significant, effect. On the other hand, the higher rank of female authors in the paper’s author list (such as corresponding authorship, being the first author or being “among the first couple of authors”) still appeared to trigger a somewhat larger impact if we consider our data as the “population” of research output within ADS. What seemed to matter more to impact was the set of publication-related variables (as opposed to author-related ones), such as journal quality or the Open Access status. One exception to this is author productivity, which should not be underestimated in scientific recognition. These results are very much in accord with recent studies on the relation between female authorship and academic impact (Thelwall, 2020b). Furthermore, Hang et al. (2022) found no substantial or statistically significant role of gender in shaping citation impact but identified gender-related author rank (namely, a female author as first author) as the only factor that exerts some effect on citation scores. Previously, collaboration between two female researchers or with a female researcher was found to be more transient (Shen et al., 2022) than collaboration between male researchers, but collaboration pairs with a female partner yielded more citations.

Impact via knowledge transfer and integration

Regarding women's knowledge integration roles, the general structure of the female author group was similar to that of the male author group, and we can infer that no substantial gender differences can be detected here. However, small differences could still be observed in particular categories. First, female authors tend to be somewhat more interdisciplinary in that fewer female researchers were shown to be active in just one area (45% compared to 51% in the male gender group). Second, this increased interdisciplinarity is manifested mostly within the hard sciences: the share of female authors in the LinkScore categories of 2 and 3 is higher than that of male authors (28% vs. 24%, and 23% vs. 19%, respectively), meaning that it is slightly more common among women to work at the intersection of two or three areas within ADS. Finally, in the most integrative category (LinkScore = 9), where social sciences and humanities research fields are connected with several of the hard sciences, it is again female authors whose percentage is slightly higher (0.9% vs. 0.8%). Although these differences are not conclusive, they still reflect a tendency towards gender equity or slight superiority (on the part of female authors) regarding the knowledge integration aspect of scholarly impact. Our findings are consistent with those of Elsevier's gender report (2021), indicating that, in general, women's scholarly output includes a slightly higher proportion of highly interdisciplinary research compared to men's.

Research orientation

The content analysis of 210 randomly selected abstracts of anti-doping papers showed significant differences in the aims of research between dominantly female and dominantly male papers. Dominantly female papers contributed more to research aimed at scientific progress, while dominantly male papers were overrepresented among publications classified as aimed at societal progress or as having both aims. This finding is surprising and novel to the anti-doping literature. Zhang et al. (2021) found that female first authors are relatively more involved in research categorized as aimed at societal progress, while male researchers more often engage in research mainly aimed at scientific progress in different research fields. The results of Shang et al. are consistent with the findings of Zhang et al., indicating that female researchers more frequently participate in research aimed at societal progress. The authors concluded that female first authors are more likely to be involved in research related to Sustainable Development Goals.

However, our results suggest that focusing on specific multidisciplinary specialties (other than SDG) may reveal different patterns of gender involvement as is the case with our result.

This may be due to anti-doping being a relatively young and still emerging research field where established researchers, men and women alike, took on the dual role of advancing the field (academic progress) and conducting research aiming for societal progress. Perhaps women have a stronger affinity to adapt to the (perceived) demands as well as the standards of the field. These results may reflect differing research priorities or interests between genders within the anti-doping community. However, Thelwall et al. (2019) demonstrated that gender disparities in research fields and topics cannot be solely attributed to inherent differences in people/thing interests or inclinations.

Relation of the study outcomes to the existing literature

Since the results of the present study apparently show a mixed relationship to the existing results, it is worth briefly summarizing its position – and considering its potential explanation – in the landscape of bibliometric research focusing on the relationship between gender and research impact.

For RQ1 (gender and citation impact), the purposive sample in Table 1 shows that although our results diverge from those of Zhang et al. (2021), which served as the direct motivation for our study, several other studies did not find a significant effect of female authorship on citation impact, similar to our case. Among these studies were research reports on multidisciplinary and large samples collected over relatively wide time intervals. In our view, this “contradictory picture” is only seemingly contradictory: the contextual differences (time period, areas of research/levels of aggregation, maturity, and traditions of research domains) may vastly influence the gender-related patterns discerned from a study. In our case, the previously emphasized peculiarities of ADS (also among the reasons for its selection as the subject area for this research) can amplify the context-dependence of these findings (even more illustrative in this respect is the case of RQ3, see below).

Beyond contextual differences such as the scope and the subject matter analyzed in these studies, it is natural to assume that methodological differences play an equally important role in accounting for discrepancies between study results. This methodological variability ranges from the overall research design to the choice of instruments for measuring academic impact, as we shall see below. We can gain sufficient insight into the role of methodology by invoking a previous study on the potential role of female authorship in academic impact in psychology in the US (Thelwall, 2020c). That study applied a methodology very similar to ours: a multivariate regression model was used to measure the effect of female authorship in various author positions on normalized citation counts. Notably, it argued for the citation measure that we adopted here, the log-normalized citation score (LNCS), as providing valid (hence fair) comparisons between papers in terms of their recognition. The results showed no (either practically or statistically) significant effect of gender on citation impact, except for a small positive effect of female first authorship, which was termed the “female citation advantage” as it could be reproduced in a series of further studies (cf. Thelwall, 2020b). What makes this study exceptionally useful for our purposes is that the authors provided a very short yet, in a sense, comprehensive list of methodology-related reasons that may have led other bibliometric approaches to opposing results. As the authors claim, “It is possible that misinterpretations of citation counts, such as use of the h-index, career citations, or un-normalised citation counts compared between fields, have incorrectly led to males gaining greater recognition” (Thelwall, 2020c, p. 693).
In other words, the authors hold the instruments used to measure citation impact responsible for the differences and consider them misleading. For example, the h-index, apart from not being an impact measure per se, is known to be unnormalized for many impact-unrelated factors affecting citation counts. Moreover, many of those measures, such as the h-index, are author-level indices that aggregate paper-level citation counts to gauge the impact of the author. Though this is a tempting option, since authors can be sorted into gender categories (male, female) much more straightforwardly, it introduces many potential biases as well. In the first place, by summing over individual papers, the opportunity to control for the diverse paper-level factors behind citation impact (journal prestige, co-authorship, international vs. domestic collaboration, etc.) is mostly lost. This methodological difference implies a different research design, whereby (gendered) authors are compared on their impact, instead of the impact of papers being tested for the potential effect of authorship features. This design was applied in a different study on psychology (González‐Álvarez & Cervera‐Crespo, 2019), along with paper-level comparisons within groups homogenized for several citation-related factors (number of authors, scope of collaboration), as a solution for controlling these factors even at aggregate levels. For the measurement of impact, however, raw citation counts were used and averaged over the female- and male- (first- and last-) authored papers. Even this setting resulted in very small differences between male and female papers in terms of effect size (according to the ANOVA test), though favoring male papers in this case.
On Thelwall’s abovementioned account, however, raw citation counts for such a broad field as psychology (encompassing “soft” and “hard” areas) and simple averages for highly skewed citation distributions profoundly distort the results, as demonstrated by his analysis of psychology at the subfield level using a robust paper-level citation measure (LNCS) to preserve commensurability. In any case, this discourse conveys the sensitivity of the results to the selected methodology; in our case, we based our choice on Thelwall’s arguments.
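
The sensitivity of group comparisons to the choice of citation measure can be illustrated with a minimal sketch. It assumes that a log-normalized score is obtained as ln(1 + c) divided by the mean of ln(1 + c) over the papers of the same field and year; this formula, the function name, and the toy citation counts are illustrative assumptions, not the exact procedure or data of the studies discussed above.

```python
import math
from statistics import mean

def log_normalized_scores(citations):
    """Return ln(1 + c) / mean(ln(1 + c)) for one field-year set of papers.

    Assumed (hypothetical) definition of a log-normalized citation score.
    """
    logs = [math.log1p(c) for c in citations]
    reference = mean(logs)
    if reference == 0:
        raise ValueError("all papers are uncited; the score is undefined")
    return [value / reference for value in logs]

# Toy, made-up citation counts for a single field and year
group_a = [0, 1, 2, 3, 4, 200]  # contains one highly cited outlier
group_b = [4, 5, 6, 7, 8, 9]

# Normalize against the whole field-year set, then split by group again
scores = log_normalized_scores(group_a + group_b)
scores_a, scores_b = scores[:len(group_a)], scores[len(group_a):]

raw_gap = mean(group_a) - mean(group_b)    # difference of raw averages
log_gap = mean(scores_a) - mean(scores_b)  # difference of log-normalized averages
print(f"raw gap: {raw_gap:.2f}, log-normalized gap: {log_gap:.2f}")
```

In this toy case the raw average favors the group containing the outlier, whereas the log-normalized average favors the other group, which is the kind of distortion by skewed distributions that the argument above refers to.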

For RQ3 (gender and research aim), the divergence from the research orientation patterns found by Zhang et al. (2021) is a more subtle issue. As an aspect of context-dependence, a specific feature of our method might have led to this difference: in the present context (the ADS domain), we used an adapted version of the coding criteria of Zhang et al. (2021) to categorize papers, with a modification specific to the research domain. Namely, most research in clinical medicine has been categorized here as having a scientific aim, the reason for which is discussed in detail in the description of coding. Given the high share of such studies in the sample, this modification is one potential source of the differences from the previous study. The alteration, however, was not arbitrary but arguably domain- or field-related, reflecting a more adequate conceptualization of “societal aim” within the confines of the specific context (ADS). In sum, a detailed comparison dispels the impression that the present results directly contradict existing evidence (we may speak of differences, but not of contradiction per se).

As with the sensitivity of authorship effects on citation impact (RQ1) to the selected methodology, further literature suggests a similar sensitivity of the relation between gender and research orientation (related to RQ3). Although this literature is very scarce, one study could be identified as directly relevant to our methodology. Santos et al. (2022) provided an approach to the relationship between academics’ research agendas and their preferences for basic research, applied research, or experimental development. In contrast to our qualitative approach, they used a questionnaire-based correlational design with a previously validated inventory for measuring the components of research agendas, and analyzed the effect of these components on the weight of basic/applied research in academic practice. The latter was also measured via an instrument recording the proportion of time allocated to each, which served as a tool for clustering them into “applied” and “basic” categories (similarly to our dichotomy of “social/societal” vs. “scientific” aims). Their sample consisted of 1160 respondents collected from various universities (no further specification was given concerning, e.g., the research fields sampled). Regarding their results, the authors explicitly pointed out that “we do not find different research focus preferences between male and female academics, which is inconsistent with other studies indicating that male academics lean toward the basic sciences and female academics lean toward the applied sciences (Zhang et al., 2021)” (Santos et al., 2022, p. 4212). In any case, we can suspect that alterations of the methodology lead to differing results: the quantitative methodology just described was associated with no difference, whereas the qualitative methodology applied in Zhang et al. and in our study showed a difference, but with opposite signs, which we partially attributed to a context-based modification of the coding criteria.
Naturally, it is difficult to assess the individual contributions of the context, the subject matter characteristics (e.g., research fields), and the methodological choices to the differences in the results; doing so is beyond the scope of the present study but would merit further investigation. Since we closely followed the qualitative approach, we can assume, as discussed above, that the inconsistency between our study and the previous one on research aims and gender can be attributed, beyond the moderate alteration of the method, primarily to the context under study.

Limitations and future directions

This paper provides insight into the relationship between gender differences and research impact in anti-doping research. During the analysis, we faced several challenges; thus, our study is limited by several factors. One is that the pool of publications we used was limited by gender identification. Although we used several methods to increase the precision of gender assignment, there remain a few authors whose gender we were unable to identify from their full names. Thus, the number of papers we could include in the analysis was restricted by gaps in gender assignment. Identifying the gender of the remaining authors would allow us to further enlarge the set of anti-doping publications suitable for analysis.

Another limitation is related to the sample of publication abstracts used for examining research goal orientation. To select the abstracts, we used stratified sampling, in which we partitioned the publications based on the gendered characteristics of the papers. Then, we selected 105 abstracts independently from dominantly female papers and dominantly male papers. Although the results demonstrated gender differences in research goal orientation by scientific field, the analyzed sample cannot be considered representative of the research field. Taking the research field into account in the sampling would give a more robust result regarding gender differences in research goal orientation in the subfields of anti-doping research.
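
The sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name, the random seed, and the pool of paper identifiers with its stratum sizes are hypothetical, and only the number of abstracts per stratum (105) is taken from the text.

```python
import random

def stratified_sample(papers, stratum_of, n_per_stratum, seed=0):
    """Draw n_per_stratum papers independently from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = {}
    for paper in papers:
        strata.setdefault(stratum_of(paper), []).append(paper)
    sample = {}
    for label, members in strata.items():
        if len(members) < n_per_stratum:
            raise ValueError(f"stratum {label!r} has only {len(members)} papers")
        # Sampling without replacement within the stratum
        sample[label] = rng.sample(members, n_per_stratum)
    return sample

# Hypothetical pool of (paper_id, dominant_gender) pairs; the sizes are made up
pool = [(i, "female") for i in range(300)] + [(i, "male") for i in range(300, 800)]
selected = stratified_sample(pool, stratum_of=lambda p: p[1], n_per_stratum=105)
print({label: len(items) for label, items in selected.items()})
```

Stratifying additionally by research field would amount to returning a (gender, field) pair from `stratum_of`, at the cost of requiring enough papers in every combined stratum.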

We should note that our study did not incorporate funding information on anti-doping scholarly output as a variable among the main factors of citation impact. Funding has been studied empirically elsewhere as an explanatory variable affecting the citation impact of individual publications (e.g., Roshani et al., 2021; Yan et al., 2018). Funding support may contribute to the research impact of scholarly communication, and funded projects can contribute to the advancement of research cooperation and collaborative publishing. However, it was also noted that a direct link between research funding and output cannot be drawn because grant-funded work often generates multiple outputs that are tailored to different audiences with different foci (Rigby, 2011). Furthermore, funding is essential in some subfields of anti-doping (e.g., experimental research), while applying for funding within the subfield of social sciences may not be vital in all cases. Since the impact of the output of individual projects on the advancement of anti-doping research can be assessed through the measurement of citation impact, mapping this impact for projects funded by major organizations operating in the anti-doping research area would be recommended in future studies. Future research could also look into the impact of gender in the peer-review process of grant applications and publications (Bianchini et al., 2022; Chubb & Derrick, 2020; Squazzoni et al., 2021; Tricco et al., 2017).

To contribute to advancing the field from a bibliometric viewpoint, we developed a new score for the assessment of authors whose work links different research fields. The LinkScore makes it possible to detect the extent to which authors participate in “integrative” research that connects different areas, as well as to quantify the capacity of an author’s work to link research fields. Applying LinkScore in future scientometric analyses is a feasible approach to detecting gender differences in knowledge integration in other research areas that appear to be highly interdisciplinary.

Additional future research questions revolve around the influence of male dominance in ADS on our understanding of the doping problem and proposed solutions to date. The results presented in this paper are not able to answer this question, but future research could build on the results we presented and employ citation path analysis to follow the influence of the ‘star’ men and women researchers identified in our previous study (Kiss et al., 2022) and via LinkScore in the current study.

Conclusion

This study provides a novel approach for the analysis of research impact and research goal orientation within the anti-doping research landscape. Specifically, we explored the role of women’s contribution to the scientific impact of ADS and detected potential gender differences in research goal orientation. The large-scale, data-driven bibliometric models we used in our study can account for the structure, dynamics, and drivers of both the global system of sciences and such specific, complex, and multidisciplinary (that is, otherwise hardly characterizable) constellations as the field of anti-doping sciences.

The results from this study showed that the gender gap observed in anti-doping scientific outputs diminishes when it comes to research impact. The characteristics of female authorship (e.g., gender of the corresponding author) do not have an effect on citation impact except for gender-related author rank (female first author), which has a positive influence on citations for women scientists. The trend-defying picture in ADS may not arise because women’s research in ADS is different; it could instead be a consequence of the field being relatively young and still emerging. Future research is needed to further elucidate the underlying factors of women’s success in ADS. Regarding research goal orientation, dominantly female papers were overrepresented among publications classified as aimed at scientific progress, while the share of male-authored papers was higher in publications classified as aimed at societal progress. A better understanding of the dynamics of women’s research priorities or interests will offer valuable insight into how anti-doping science advances. Anti-doping research is in a unique position due to its multidisciplinary nature; in essence, scholars can truly grasp the richness and complexity of this research field only by applying a wide range of methodologies.