OA fraction of the German publication output
In a first step, we analysed how the overall OA share of the German research system developed over the time period from 2010 until 2018. The following Fig. 2 displays the number of publications with addresses of German research institutions and highlights the freely accessible subset. The overall OA share was 45% considering all years collectively. This finding is in line with results from Robinson-Garcia et al. (2020), who reported 43% as the global median OA share of publications from universities in the period 2014–2017, with a slightly higher share for German universities. Piwowar et al. (2018) reported a slightly lower OA percentage of 36% for a sample of 100,000 articles registered within the Web of Science that were published between 2009 and 2015.
As Fig. 2 shows, the total number of articles as well as the number of OA articles increased steadily over time. The absolute number of toll-access articles was fairly stable, rising slowly from 52,803 in 2010 to 54,873 in 2013 and then declining to 51,430 publications in 2018. Since the number of OA articles increased continuously from 30,664 publications in 2010 to 55,649 in 2018, the relative proportion of OA articles rose from 37% in 2010 to 52% in 2018.
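The shares quoted above follow directly from the reported counts. As a minimal sketch (the function name `oa_share` is chosen here for illustration and is not part of the study's tooling), the calculation can be reproduced as follows:

```python
def oa_share(oa_count: int, toll_access_count: int) -> float:
    """Return the OA share in per cent of all articles (OA plus toll-access)."""
    total = oa_count + toll_access_count
    if total == 0:
        raise ValueError("no articles to compute a share from")
    return 100.0 * oa_count / total


# Counts reported in the text for the first and last year of the period.
counts = {
    2010: {"oa": 30_664, "toll_access": 52_803},
    2018: {"oa": 55_649, "toll_access": 51_430},
}

shares = {year: oa_share(c["oa"], c["toll_access"]) for year, c in counts.items()}
for year, share in shares.items():
    print(f"{year}: {share:.0f}% OA")
```

Rounded to whole percentages, this gives the 37% and 52% stated in the text.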
As an answer to research question RQ1, we were able to establish that the OA fraction of the publication output of German universities and non-university research institutions has been rising continuously over the observed time period from 2010 to 2018, confirming the international trend.
Differences between research sectors
In a next step, the development of the OA shares is analysed separately for the different sectors of the German research system (universities, non-university research institutes like MPS or WGL institutes, and government research agencies). The results are displayed in Fig. 3.
Two results of the cross-sector comparison are highlighted: First, the total publication output varied strongly between sectors. The differences in the publication output do not result only from the different sizes of the sectors (in terms of budget and staff) but also reflect the different missions of the sectors. The publication outputs of the sectors oriented towards basic research (like UNI, MPS, and HGF) were considerably larger than those of sectors with a practice-oriented mission like GRA and the FhS. Second, a similar trend can be found with respect to the OA shares across all the sectors. Again, sectors with an academic orientation and a mission focused on basic research outperformed the two more practice-oriented sectors regarding the adoption of OA. Of all sectors, the MPS had the highest OA share over the whole period, rising from 59% in 2010 to 77% in 2018. The HGF shows a strong rise both in the overall publication output (from 10,365 publications in 2010 to 15,996 publications in 2018) and in the OA share, which rose from about 47% in 2010 to about 63% in 2018. The example is of particular interest as it shows that an increase of the publication output does not necessarily have to happen at the expense of the OA share. Compared with these numbers, the fraction of OA publications of the two sectors with practice-oriented missions was low (41% for GRA and 29% for FhS).
In order to deepen the understanding of OA within the German research landscape, the OA shares of individual institutions, grouped by sector, were calculated. The analysis was restricted to institutions with a publication output of at least 100 publications in the period 2010–2018 and excluded administrative facilities as well as residual and aggregating categories. Of the 444 institutions in total, 320 met these conditions, while 124 institutions with a cumulated volume of 6259 articles were excluded from this step of the analysis (Footnote 20).
Figure 4 displays the results.
A comparison of the scatter plots of the different sectors suggests that the distributions are not determined by a single factor but by a combination of different factors.
For UNI, the spread around the linear trend line was very low, indicating that the OA shares were partly determined by institution size, as measured by overall publication output: universities with larger publication outputs tended to have larger OA shares. Outliers with above-average OA shares were universities that strongly support OA or that are known as OA pioneers in Germany. An example is the University of Konstanz with the highest overall OA share of 70% among all German universities. Compared with the other two basic research-oriented sectors (MPS and HGF), the OA share of UNI was low. Possible reasons might be, on the one hand, that researchers based at universities enjoy a high degree of autonomy guaranteed by the German constitution, which makes it difficult for the management to enforce compliance with OA policies. On the other hand, research at German universities covers a large variety of disciplines and fields, including those with both high and low adoption of OA.
Evidence for the influence of disciplinary publication cultures on OA shares can be drawn from the scatter plot of another sector of the research system, the MPS. Following the division into four quadrants separated by the two median lines, physics and astronomy institutes were located in the upper right corner with a high publication output and a high OA percentage. Researchers in this field traditionally tend to publish preprints on subject-specific repositories, and the journal landscape is characterised by a high level of openness (Taubert, 2019). In the upper left quadrant with similarly high OA shares but with lower publication counts, institutions with a life science profile dominated. In the lower left quadrant, humanities and social sciences institutions accumulated, having had a lower publication output in journals covered by WoS-KB and a lower OA share. Lastly, the lower right quadrant, characterised by an above-average number of publications but an OA share lower than the median, was occupied mostly by institutions with a focus on materials research.
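The quadrant reading described above can be sketched in a few lines of code. The example below assumes that the horizontal axis is publication output and the vertical axis is OA share, as the description of the upper right quadrant implies. The institution labels and values are invented for illustration and are not data from this study. The rule for values lying exactly on a median line is an arbitrary choice.

```python
from statistics import median


def assign_quadrants(institutions):
    """Map each institution name to a quadrant relative to the two medians.

    `institutions` maps a name to (publication_count, oa_share_percent).
    Values equal to a median are treated as 'lower'/'left' (arbitrary tie rule).
    """
    pub_median = median(pubs for pubs, _ in institutions.values())
    oa_median = median(share for _, share in institutions.values())
    quadrants = {}
    for name, (pubs, share) in institutions.items():
        vertical = "upper" if share > oa_median else "lower"
        horizontal = "right" if pubs > pub_median else "left"
        quadrants[name] = f"{vertical} {horizontal}"
    return quadrants


# Invented example values, not data from the study.
example = {
    "physics_inst": (4000, 85.0),
    "life_science_inst": (900, 80.0),
    "humanities_inst": (300, 35.0),
    "materials_inst": (3500, 50.0),
}
quadrants = assign_quadrants(example)
for name, quadrant in quadrants.items():
    print(name, "->", quadrant)
```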
In the case of the HGF, the distribution also seems to be influenced by the disciplinary publication culture. The majority of institutions with an OA percentage above the median value were located in the natural and life sciences. The highest OA share (84%) of all Helmholtz institutes was registered for the Deutsches Elektronen-Synchrotron (DESY), a large-scale research facility in (particle) physics. The plot showing the institutions of the HGF also suggests that disciplinary publication cultures have had a stronger influence on the OA share than institutional support. For example, the Jülich Research Centre (FZJ) and the Helmholtz-Zentrum Dresden-Rossendorf (HZDR) both support publication in fully OA journals with their publication funds and provide repositories for self-archiving, but their overall OA percentages were below the median value for this sector (52% and 41% compared to a median value of 63%).
The FhS has other major output formats aside from journal publications, like patents, technology products, and reports. However, those are not covered by this analysis based on the Web of Science. The comparatively low OA share of most of the Fraunhofer institutes may reflect the more application-oriented mission of the sector.
An interpretation of the results for the WGL and the GRA is less straightforward, since these two sectors comprise institutions that are heterogeneous in their missions and orientations. However, the disciplinary publication culture seems to play a role here as well. The two leading institutions in each sector, namely the Leibniz-Institute for Astrophysics Potsdam (AIP), the Leibniz Institute for Solar Physics (KIS), the Robert Koch Institute (RKI), and the Deutscher Wetterdienst (DWD), can all be attributed to the natural sciences, and physics in particular, or to the life sciences.
Figure 5 quantifies the observations regarding the variability of OA shares within sectors that we already made in Fig. 4. Using the non-overlapping of boxplot notches as an approximate measure of significant differences between medians, we deduce that the two research-oriented sectors, HGF and MPS, had significantly higher median values for the OA percentage than the other sectors. At the other end of the spectrum, the more practice-oriented institutes of the FhS had a much lower OA percentage than all other sectors. UNI, with a typically very diverse disciplinary profile, and WGL and GRA, with their diverse primary missions, all had intermediate median OA percentages. Furthermore, we can confirm the observation that the variation of OA percentages within the sector of UNI is very low, whereas for the WGL the diverse strategic focuses might be a key factor explaining the high spread of OA shares.
Regarding research question RQ2, we found large differences in the degrees of OA adoption for the research sectors of the German research landscape. These differences may originate from the diverse disciplinary profiles of the research institutions as well as differing key missions. Moreover, the different orientations toward basic research versus application in practice or supply of infrastructure typically imply a vastly different importance of journal publication as a research output. However, more rigorous investigations are necessary to determine the influence of the different factors.
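The notch comparison can be illustrated with a short sketch. It assumes the widely used convention that a notch spans median ± 1.57 × IQR / √n. The text does not state which plotting software or notch definition was used, so this formula is an assumption rather than the study's documented procedure. The sample values are invented.

```python
from math import sqrt
from statistics import median, quantiles


def notch_interval(values):
    """Approximate notch around the median: median +/- 1.57 * IQR / sqrt(n).

    The 1.57 factor is the common boxplot convention and is assumed here.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / sqrt(len(values))
    centre = median(values)
    return centre - half_width, centre + half_width


def notches_overlap(a, b):
    """Return True if the notch intervals of the two samples overlap."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


# Invented OA percentages per institution, not data from the study.
sector_high = [70, 72, 75, 77, 78, 80, 81, 84]
sector_low = [20, 24, 27, 29, 30, 33, 35, 38]

print("notches overlap:", notches_overlap(sector_high, sector_low))
```

Non-overlapping notches are only a rough visual indication that two medians differ. A formal test would be needed for a stronger claim.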
Prevalence of OA categories
As outlined previously, there are several ways of providing OA to publications. In this section, research question RQ3 is addressed as the prevalence of the most widespread OA routes is investigated: OA via repositories (Green OA) and via journals (Gold OA). In the case of Gold OA, we further distinguished between articles in fully OA journals and other types of OA provided by journals (e.g., delayed, hybrid and promotional OA). In the case of repositories, we distinguish between disciplinary, institutional, and other OpenDOAR-listed repositories as well as sources not registered within OpenDOAR. The OA categories are non-exclusive, that is, an article might be counted for several categories. Articles were fully counted in every category they appear in. Hence, numbers do not sum up to the total number of articles considered in this study, and percentages do not sum up to one hundred per cent.
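The effect of full counting on non-exclusive categories can be made concrete with a toy example. The article records below are invented. The category labels follow those used in the text, except `full_oa_journal`, which is a hypothetical label introduced here for articles in fully OA journals.

```python
from collections import Counter

# Invented toy records: each article maps to the set of OA categories
# in which a freely accessible copy was found.
articles = {
    "a1": {"full_oa_journal", "opendoar_subject"},  # 'full_oa_journal' is a hypothetical label
    "a2": {"opendoar_inst"},
    "a3": {"other_oa_journal", "opendoar_subject", "opendoar_inst"},
    "a4": set(),  # toll-access article without any OA copy
}


def category_shares(records):
    """Full counting: an article counts once in every category it appears in."""
    counts = Counter(cat for cats in records.values() for cat in cats)
    total = len(records)
    return {cat: 100.0 * n / total for cat, n in counts.items()}


shares = category_shares(articles)
for cat, share in sorted(shares.items()):
    print(f"{cat}: {share:.0f}%")
print("sum of category shares:", sum(shares.values()))
```

In this toy dataset only three of four articles are OA, yet the category shares add up to more than one hundred per cent, which is the behaviour described above.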
As a first step, the relevance of the two main OA types is analysed in Fig. 6.
The most striking observation is that the majority of openly accessible journal articles (51% of all OA articles over the whole observation period) were available through both types: via the journal and also via at least one repository. Moreover, this overlap also shows the strongest increase over time, from 12,136 articles in 2010 to 31,237 in 2018. Articles that were available exclusively via a journal are the minority, yet their numbers have risen strongly over time from 4860 articles in 2010 to 7668 articles in 2018. In addition, a relatively steady number of around 15,500 articles published every year were OA exclusively via a repository.
A closer inspection of the data reveals that of the articles which were OA exclusively via a journal (highlighted in blue as ‘by Host’ in the left column in Fig. 6), only 33% were published in fully OA journals, while the remaining 67% were other journal-provided OA types like delayed, hybrid and promotional OA. This distribution differs strongly from that of the second group, where OA was provided via journals and repositories (highlighted in blue as ‘by Host’ in the middle column in Fig. 6). Here, more than half of the articles (54%) were published in fully OA journals. In other words: it is more likely for an article in a fully OA journal to be archived on a repository than for an article where journal-provided OA follows a different model. Robinson-Garcia et al. (2020) suggest that this might partially be a result of indexing in PubMed Central including Europe PMC.
Turning to the repository categories, and keeping in mind that articles may be deposited in more than one repository, subject-specific repositories contributed the largest share in both cases (overlap and exclusively repositories). However, while little more than half (54%) of the articles that were OA exclusively via a repository (highlighted in blue in the right column of Fig. 6) were deposited on a subject-specific repository, this was the case for almost 80% of articles in the overlapping group. A similar observation can be made for the residual category ‘other_repo’ with 30% occurrence in the exclusive repository group, and 49% in the overlapping group. Institutional repositories (around 40%) as well as other OpenDOAR-registered repositories (around 14%) appeared equally often in both groups (Footnote 21).
Figure 7 shows that of all OA sub-categories, journal- and repository-provided, subject-specific repositories as classified by OpenDOAR were the most prevalent OA subtype in each year of the period analysed in this study. This is in contrast to findings from earlier studies that base their analyses on the ‘best OA location’ field of Unpaywall (Martín-Martín et al., 2018; Piwowar et al., 2018; Voigt et al., 2018).
Regarding the different journal OA subtypes, three findings are highlighted here: First, the percentages of both articles in fully OA journals and other OA types provided by journals (‘other_oa_journal’) grew in the observation period. Second, the growth of the percentage of articles in fully OA journals was larger, and at first glance it seems that this sub-category has become more important than other OA types provided by journals. However, these trends should be interpreted carefully as there was a notable drop in the percentage of other OA types provided by journals in the years 2016–2018. This is most likely caused by delayed OA journals, where some or all articles of a journal are made available after a certain embargo period which can extend up to several years. Articles from these publication years may therefore not have been openly accessible at the time of analysis but will become OA in the near future. Third, most articles in fully OA journals were published with Springer Nature and Public Library of Science (PLOS). However, the strongest increases over time, mirroring the overall increase in this category, were found for Springer Nature, Frontiers Media SA, and MDPI AG. Publication volumes in PLOS grew from 827 articles in 2010 to 3086 in 2013 and from then on continuously decreased, though they remained at a generally high level: in 2018, there were still 1774 articles published by German research institutions in PLOS journals. Presumably, this reflects a general trend: since 2014, a decline in the number of articles published in PLOS ONE has been observed, while the number of articles published by the competing mega journal Scientific Reports has grown (Spezi et al., 2017).
Regarding OA provided by repositories and its subtypes, we stress three main findings: First, deposition in subject-specific repositories (‘opendoar_subject’) was, in terms of OA share, by far the most important subtype. There are no hints that this situation will change in the near future as there has been a sustained growth of the OA share of this subtype. A more detailed look into the data reveals that the gain for subject-specific repositories can be attributed mostly to the arXiv and PubMed Central including Europe PMC. This suggests that a few disciplinary publication cultures account for the continuing relevance of this OA publication practice. Second, there was a notable drop in the share of articles openly accessible via residual repositories not registered with OpenDOAR (‘other_repo’) in the years from 2016 to 2018. This decrease in recent years is almost entirely caused by records found on Semantic Scholar, which account for almost 83% of all articles in this category. The slight decrease in OA publication for institutional repositories (‘opendoar_inst’) in the last year is presumably caused by delays in deposition due to self-archiving embargoes that allow deposition only after a certain period. Another reason might be that not all articles were delivered into the institutional repository by the authors themselves immediately after submission or publication. Third, the remaining category ‘opendoar_other’ shows a continuous increase, which was, however, not as steep as the growth in subject-specific repositories or in fully OA journals.
In the next step, we analysed if the sectors differ regarding the adoption of OA (see research question RQ3). To explore this, OA percentages per category were calculated for each sector. Figure 8 displays the results.
In each sector, the most prevalent type was OA provided by disciplinary repositories. Sectors with a high OA share, like the MPS, had high proportions of OA provided by subject-specific repositories, but even for the FhS, which had the lowest overall OA share, this type contributed the most. It is likely that the OA shares of subject-specific repositories reflect the extent to which disciplines with strong self-archiving practices contributed to the publication output of the different sectors.
With respect to OA provided by institutional repositories, a comparison of the sectors shows that HGF, an organisation with a comparatively strong hierarchical structure and a central unit that supports OA, had the highest respective OA shares, while the shares for UNI and FhS were both comparatively low. These findings are compatible with the assumption that the OA share of this type is at least to some extent affected by the relevance of self-archiving in a particular type of organisation and the ability of the organisation to oblige its members to self-archive their publications. In addition, the secondary publication right granted by German copyright may play a role in the higher share of self-archiving in the non-university sectors, as this right applies only to research that is mainly third-party funded. For articles in the category ‘opendoar_other’, the particularly high share for MPS is a data artefact caused by an ambiguous classification of the repository of MPS as both “institutional” and “aggregating” within OpenDOAR. Such repositories, which are registered within OpenDOAR but not unambiguously classified, were labelled as ‘opendoar_other’ in our analysis. The results for the category ‘other_repo’ are difficult to interpret as this category is dominated by a single repository, Semantic Scholar, that aggregates various content from different sources.
Regarding OA provided by journals, two findings of the cross-sectoral comparison are highlighted: First, the percentage of articles published in fully OA journals seems to be largely independent of the type of organisation, as the shares of the different sectors do not vary much from the overall percentage of that category for the German research system. The results suggest that the shares of the sectors may be influenced primarily by the extent to which journals apply a full OA publishing model and not so much by organisational factors.
Second, this finding contrasts sharply with the distribution of the OA shares of the ‘other_oa_journal’ type, as MPS had a remarkably higher share in this category compared to the overall proportion for all sectors. A more detailed look into the access conditions of the journals that contributed the most to the publication output of MPS in this category shows that the high share results to a large extent from the delayed OA model that is applied by large journals in physics, astronomy, and the life sciences. Therefore, the high OA share of MPS in this category mainly reflects the disciplinary profile of MPS with a strong publication output in these disciplines.
Overlap of OA categories
For 72% of all OA articles in our dataset, Unpaywall tracked more than one OA full text link. In our analysis, we classified each OA location according to our schema in Table 1. As noted, our categories are non-exclusive, i.e. articles that are openly accessible through different means were counted once in each of the categories. In order to quantify this overlap, Fig. 9 displays the most common combinations of OA categories as an UpSet plot (Lex & Gehlenborg, 2014).
The largest groups were articles available only through a subject-specific repository, followed by articles freely accessible exclusively via a non-fully OA journal and articles on institutional repositories only. Next, several combinations, including articles that were available via a fully or non-fully OA journal as well as through one or more types of repositories, for example on a disciplinary and an institutional repository, followed. These articles were counted fully in each of the OA categories they appeared in. Figure 9 highlights that many articles published in fully OA journals were available through repositories simultaneously, while a larger proportion of OA articles published in otherwise toll-access journals was only available through the publisher website.
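The combination counts underlying such a plot can be sketched as follows. In contrast to full counting per category, each article is assigned here to exactly one group, namely the exact set of OA categories it appears in. The records are invented, and `full_oa_journal` is a hypothetical label for articles in fully OA journals. The other labels follow those used in the text.

```python
from collections import Counter


def combination_counts(records):
    """Count articles per exact combination of OA categories.

    Articles without any OA category (toll-access) are skipped.
    """
    return Counter(frozenset(cats) for cats in records.values() if cats)


# Invented toy records, not data from the study.
articles = {
    "a1": {"opendoar_subject"},
    "a2": {"opendoar_subject"},
    "a3": {"other_oa_journal"},
    "a4": {"full_oa_journal", "opendoar_subject", "opendoar_inst"},
    "a5": {"opendoar_inst"},
    "a6": set(),  # toll-access article
}

combos = combination_counts(articles)
for combo, n in combos.most_common():
    print(sorted(combo), n)
```

Because every OA article falls into exactly one combination, these counts sum to the number of OA articles, whereas the fully counted category totals do not.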
With respect to question RQ3, we found that subject-specific repositories are the most prevalent OA type over the whole period on the national level as well as for each sector. However, the percentages for publication in fully OA journals and OA via institutional repositories show similarly steep increases over the observed period. A comparison of the development in different sectors suggests that organisational factors (like centralised or decentralised OA adoption) may influence the share of OA via institutional repositories, and disciplinary profiles may impact the prevalence of OA in subscription-based journals, whereas publication in fully OA journals seems to be affected mainly by the availability of journals offering this publishing model.