Introduction

The debate around open access is an important and complex one. Academic research outputs have traditionally been placed behind subscription paywalls, but over the past three decades the situation has started to change. Recent estimates suggest that more than half of recently published journal articles are now freely available online (Piwowar et al. 2018). The shift towards openness has progressed more rapidly in some research disciplines than in others (Archambault et al. 2014; Crawford 2017; Piwowar et al. 2018), depending on, for example, the availability of funding to cover article processing charges, the availability of well-established open access journals, or the availability of repositories in which authors can share their manuscripts. An established way to distinguish between the main channels of open access provision is to separate open access provided directly by journals (gold open access) from open access provided by authors via self-archiving (green open access). However, whilst this fairly crude division has the merit of simplicity, the underlying mechanisms and circumstances that determine how and in what form an article has been made available remain hidden behind the category label. In order to obtain usable knowledge about the mechanisms enabling open access, at any level of analysis, there is a need to look beyond the surface level.

The complexity of the debate around open access also stems from the presence of clashing stakeholder interests, where the vision for the path forward is not uniform and key actors have their own considerations and arguments for how the future of scholarly publishing should be shaped. The case for open access is sometimes made on pragmatic grounds, pointing to the increased citations that research outputs made freely available on the web have been found to receive (Tang et al. 2017; Sotudeh et al. 2015; Fukuzawa 2017), though some have also argued that the positive effect on citations does not occur in all fields (Wray 2016). However, there is also an evident ethical dimension to this debate (Piccininni 1997; van Krevelen 2005; Troll Covey 2009a). A general assumption is that academics want to have their work read, and universities pay them to write it and to provide the bulk of the expert work that journals require. And yet universities have traditionally paid again to get access to that work, and potential readers outside the universities are denied access to it. It should come as no surprise that this looks to many like an unsustainable and unfair process. At the same time, whilst many academics have seen open access publishing as a viable solution to the unfairness and unsustainability of the current situation (Bacevic and Muellerleile 2017), others have warned that the case for open access has also opened the door to research and publication practices of a lower standard (Beall 2012).

The goal of this study is to comprehensively examine the actual open access availability of journal articles compared to journal copyright policies and restrictions by considering a specific research community, namely ethics research. Previous research exploring open access in relation to copyright compliance has included approaches such as looking at the total article output published by a small number of journals within a specific discipline (Laakso and Lindman 2016), sampling random articles available through academic social networks (ASNs) such as ResearchGate (Jamali 2017), and analyzing the output produced by the faculty of a specific institution (Troll Covey 2009b). In this study, we want to assess the current status of open access within the community of ethicists and their academic production in terms of articles in scholarly journals.

We first aim to clarify the extent to which, and the ways in which, ethicists share their scholarly material online, focusing specifically on the following set of questions:

  • To what degree are ethicists’ journal publications freely available online?

  • How common is it for journal publications to be open access through journal websites within the field of ethics?

  • Which websites and platforms do ethicists use when self-archiving?

  • What versions of the journal publications do ethicists use when self-archiving?

  • Are popular ethics journals clear with regard to their self-archiving policies?

As our second aim, we will carefully examine two important aspects: (1) copyright infringement and (2) undersharing. These can be operationalised, respectively, as making copies of an article freely available online when this is not allowed by journals’ and publishers’ policies, and failing to make copies of an article freely available online when this is allowed by journals’ and publishers’ policies. In more detail, our study seeks to provide data to answer the following questions:

  • Comparing policies to web observations, are ethicists prone to copyright infringement?

  • Do ethicists undershare their research outputs?

  • What is the current role of institutional repositories in facilitating authors’ self-archiving?

  • What is the current role of ASNs for sharing research publications among ethicists?
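To make the two operational definitions of copyright infringement and undersharing concrete, they can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure followed in the study, where coding was done manually. The function name, the location and version labels, and the representation of a policy as a set of permitted combinations are all assumptions made for the example. In particular, we assume here that an article counts as undershared only when no free copy at all was found although the policy permits at least one form of sharing.

```python
# Each free copy is described by a (web location, document version) pair.
# The labels used with this sketch are illustrative, not the study's coding scheme.
Copy = tuple[str, str]


def classify_article(observed: set[Copy], permitted: set[Copy]) -> dict[str, bool]:
    """Classify one article against its journal's self-archiving policy.

    observed  -- free copies actually found on the web
    permitted -- (location, version) combinations the policy allows
    """
    # Copies that exist online but fall outside what the policy allows.
    infringing = observed - permitted
    return {
        "infringement": bool(infringing),
        # Assumption: undersharing means sharing is allowed but nothing was shared.
        "undersharing": bool(permitted) and not observed,
    }
```

Under this reading an article cannot be both infringing and undershared. A stricter reading, in which an article is undershared whenever no compliant copy exists, would replace `not observed` with `not (observed & permitted)`.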

The paper is structured as follows. The “Literature review” section provides a brief literature review in two parts: the first concerns open access and copyright compliance in the context of scholarly journals, and the second focuses on open access in the context of philosophy and ethics research specifically. The “Methods” section details the methodology used in this study and the “Results” section presents the results. The “Discussion” section answers the questions listed above in light of the results obtained. The “Conclusions” section summarizes the main conclusions of the study.

Literature review

Previous research on open access and copyright compliance

Studies addressing the degree to which scientific literature is available open access across the sciences have suggested that around half of all recently published articles in scholarly journals are freely available in some form on the open web (Archambault et al. 2014; Piwowar et al. 2018). While this figure is a substantial increase compared to the situation around eight years ago, when the share of freely accessible content was estimated at 20% (Björk et al. 2010), a considerable share of accessible content is currently being provided through mechanisms which are less than ideal from the perspective of persistent access. What undermines the possibility of reliably maintaining or sustainably increasing the current level of open access is that a large share of what is currently available on the open web in locations other than journal websites infringes journal publishers’ copyright, as evidenced by the previous research reviewed in this section. In such cases the authors (or someone else) have made an article freely available somewhere on the web under circumstances not permitted by the journal that published the original article.

The problem is not that scholarly journals permit less self-archiving than what is currently taking place; quite the contrary. Rather, freely distributed copies are not being made available in line with the instructions given by publishers regarding conditions such as the allowed article version, the embargo period, and the type of web location on which distribution is allowed. In a study of the 100 largest journal publishers indexed in Scopus, Laakso (2014) found that publishers are relatively liberal in permitting distribution of accepted manuscripts (81% of all articles permitted) while distribution of the publisher version (the final copyedited version of record) is considerably more restricted (11% of all articles permitted). Publisher policies for self-archiving also change over time, and can be made more permissive or more restrictive at the whim of the publisher, with the changes applying to future articles published in the journal. In a longitudinal study of 107 publishers listed in the SHERPA/RoMEO publisher policy database, Gadd and Troll Covey (2016) found that while publishers have in theory become more permissive during the 12-year observation span by allowing some sort of self-archiving, they have simultaneously increased the number of specific conditions governing when, where and how self-archiving may be performed by authors. The authors also observed a relationship between publishers making self-archiving conditions more specific and journals introducing optional paid open access options, commonly referred to as hybrid open access (Laakso and Björk 2016).

An increasingly important element to consider in the context of open access and copyright infringement is the role of ASNs such as ResearchGate and Academia.edu, as such services have so far not strictly enforced copyright compliance for content uploaded and made publicly available. Jamali and Nabavi (2015) studied the extent to which open access to journal articles is available through Google Scholar for articles across all major research disciplines, and ResearchGate came out as the top source of full-text articles. The dominance of ResearchGate as a source of full-text access through Google Scholar has been echoed by Laakso and Lindman (2016) and Laakso et al. (2017), who highlight that most of the content on the service is provided as the publisher’s version. Jamali (2017) further investigated the extent to which ResearchGate members, as authors of journal articles, comply with publishers’ copyright policies when they upload versions of their articles to the service, and found that about half (51%) of the user-uploaded articles that were not published as open access in a journal violated publisher copyright agreements. Major journal publishers have recently threatened legal action against ResearchGate (Chawla 2017), so the long-term future of the service is still uncertain.

While there is valid reason for concern about the persistence of current levels of access to journal articles on the open web, the degree of access would not decrease if sharing instead took place within the limits set by publisher policies. On the contrary, there is a lot of unused potential in providing access through compliant self-archiving. Laakso (2014) and Troll Covey (2009b) have pointed to a large gap between the potential for self-archiving permitted by publisher policies and the actual self-archiving practice of scholars. Zhu (2017) found that whilst most academics support the principle of making knowledge freely available to everyone, the use of open access publishing is still limited and is related to authors’ awareness of open access policy and open access repositories, and to their attitudes towards the importance of open access publishing and the related citation advantage. Lovett et al. (2017) have argued that ASNs should not be seen as a threat to open access: authors who posted articles to ResearchGate were actually more likely to have complied with open access policy, and vice versa. The complementarity of ASNs is an aspect our study will also explore.

While the impact of scientific outputs can be perceived and quantitatively studied through various metrics that have been developed and adopted over time, such as citations and social media activity, standardised metrics relating to openness have yet to become established. Based on a review of studies measuring open access prevalence, Nichols and Twidale (2017) present several suggestions for how openness metrics for authors could be designed in order to take into account author-level factors such as unused self-archiving potential for publications, self-archiving in breach of publisher policies, and the long-term archival capability of platforms used for self-archiving. While standards such as OAI-PMH and DOI, with related metadata from Crossref, are making the type of data required for calculating these types of metrics increasingly accessible, publisher policies are not yet available in a comprehensive and reliable machine-readable format.

Detached from the dissemination behaviour of individual authors is the comprehensive pirate access to content offered by the Sci-Hub website, which retrieves copies of articles from behind paywalls and distributes them for free. Sci-Hub hosts more than 50 million research articles (Machin-Mastromatteo et al. 2016). Notably, Himmelstein et al. (2017) argue that the subscription-based model is becoming unsustainable because almost the entirety of scholarly research is now freely available thanks to Sci-Hub, but recent literature has addressed limitations and problems of the Sci-Hub initiative as well (Lawson 2017; Priego 2016). Due to the illegal nature of the service and the lack of external indexing of content found in Sci-Hub (e.g. by Google Scholar), this study, like most other studies focusing on open access more broadly, does not include measurement of content available through this service.

Open access in the context of philosophy and ethics research

Whilst some claim that in the humanities journal articles are not a key medium of academic communication, as monographs represent the most significant scholarly vehicle (Eve 2014), our study assumes and corroborates the view that journal articles constitute, at least for some areas in the humanities, a fundamental type of research output. Shedding light on the features of this population and its differences from other fields is of great interest.

The general consensus among studies on the share of open access journal articles within the humanities has been that this research area has some of the lowest shares of content available open access, independent of the measurement method used. In a report for the European Commission, Archambault et al. (2014) studied the availability of journal articles indexed in Scopus for publication years 2011–2013. A custom web harvester was used to determine the shares of articles with freely accessible versions on the web: the total share of content available open access was 53.7%, while journal articles within philosophy and theology had a share of only 34.7%, the 5th lowest of the 22 discipline categories. In a study of journals included in the Web of Science, Bosman and Kramer (2018) found that the share of content freely available through the oaDOI API for all research areas combined was 20.3% for content published in 2010 and 25.5% for content published in 2015, while the respective figures for philosophy journals were 6.5% and 10.7%. A report by Science-Metrix (2018) documents a study of journals included in the Web of Science with a custom harvester similar to the one utilized in Archambault et al. (2014). For articles with the publication year 2014 the overall open access share was 55%, while the category of journals belonging to arts and humanities was found to have an open access share of only 24%. Since the methods utilized to study these shares are heterogeneous, and given the coverage bias of the two indexes with regard to the humanities in general and its open access journals more specifically, these findings can only be used in a limited capacity.

Philosophers have expressed growing interest in the free online availability of scholarly material. Among the flagship open access initiatives within philosophy is the Stanford Encyclopedia of Philosophy (Allen et al. 2002; plato.stanford.edu 2017), which publishes and regularly updates entries on key topics authored by eminent scholars in the field. Moreover, the preprint archives offered by PhilSci Archive and PhilPapers have also played an important role in facilitating the sharing of scholarly information and promoting green open access in the field. But philosophers’ interest in open access has also been reflected in the recent launch of successful open access journals. In particular, Ergo and Philosophers’ Imprint, which are ranked amongst the twenty best general philosophy journals based on a poll by Leiter Reports, are both open access journals (leiterreports.typepad.com 2015). Among specialist philosophy journals there have been important open access initiatives, including the launch of the Journal of Ethics and Social Philosophy, THEORIA, and Philosophy, Theory and Practice in Biology. The growth of open access journals in philosophy is evident when considering the number of outlets listed in the philosophy section of the Directory of Open Access Journals (doaj.org 2017), although the assessment of the quality of such journals is hindered by the lack of accepted quantitative approaches to quality analysis in the field (Polonioli 2016). Notably, a distinctive feature of open access journals in philosophy is that they typically do not charge authors any processing fees. As reported in Neuman and Laakso’s (2017) recent case study evaluating open access publishing models for a society journal within philosophy, the introduction of fees in the field would likely require pedagogical measures to convince authors that this is a promising publishing model, in addition to creating mechanisms for authors to obtain funding to cover such costs.

Ethicists’ behaviour has recently been explored empirically by a number of studies (see Schwitzgebel and Rust 2016 for a review), which have overall suggested that ethicists do not behave significantly differently from non-ethicist academics on a number of seemingly morally relevant issues. Since the open access debate also rests on ethical premises, it is especially interesting to shed light on ethicists’ behaviour in the context of scholarly information sharing.

Methods

In studying copyright infringement and depositing behaviour, this study offers a methodological contribution by combining and refining methods used in previous studies within the general topic area. As evidenced by the reviewed literature, previous research on these issues has primarily focused on populations of journals and their outputs, on open access behaviour among authors affiliated with a single institution, or on users of a specific ASN, whereas in this study the focus is on a broad population of researchers, with the bibliometric data built up from the publication records of individuals. To identify a group of ethicists we used the Philosophical Gourmet Report 2014–2015 (philosophicalgourmet.com 2017), which is a poll-based ranking of philosophy graduate programs based on the perceived quality of their faculty. Whilst not necessarily uncontroversial (Bruya 2015), it is a very widely used guide. Departments are ranked by specialty, and the 41 best departments for the subject Ethics were considered. For our purposes, the internal order of the ranking was not important since we included all 41 departments. For each department, we classed as ethicists those researchers who listed ethics or moral philosophy as one of their areas of research on their faculty website or, failing that, on their personal website or a similar information source. The resulting list included 375 ethicists, which was reduced to 297 after restricting it to currently affiliated ethicists who had published at least one journal article during 2010–2015. Their original research outputs from 2010 to 2015 were manually recorded by consulting institutional webpages, personal websites, PhilPapers profiles, and profiles on Google Scholar. Rather than sticking to only one source of information, the goal was to flexibly retrieve a curated and recently updated list of publication records for each identified author. A major benefit of this approach is independence from any particular indexing service like Scopus or Web of Science, since such services are selective in their coverage of journals. In total, 1718 journal article records were identified, of which 1682 were unique once duplicates arising from co-authorship between ethicists included in the analysis were removed.
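The step from per-author article records to a set of unique articles can be illustrated with a short sketch. The records in this study were compiled and checked manually, so the code below is only a hypothetical illustration of how co-authored duplicates could be collapsed. The field names (`title`, `year`) and the title normalisation rule are assumptions made for the example.

```python
import re


def normalise(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def unique_articles(records: list[dict]) -> list[dict]:
    """Keep the first record for each (normalised title, year) combination."""
    seen = set()
    unique = []
    for rec in records:
        key = (normalise(rec["title"]), rec["year"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

An article listed by two co-authors who are both in the population then appears once in the output, which is the sense in which the article count above is smaller than the record count.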

We then proceeded to manually query Google Scholar with the title of each identified journal article, specifically looking for freely accessible full-text versions of the article. This was done off-campus, without access to paid journal content. Further, we used a dedicated web browser installation for conducting the searches, without logging into any ASNs or Google services, since being logged in could influence which results are visible. For each journal article we recorded data for up to eight freely available copies in order to paint as comprehensive a picture of availability as possible; to our knowledge this is the widest spread included in any study so far. Limiting the data to eight separate observations rather than including even more was a trade-off between practicality and methodological strength; as can be seen from the results later on, it is rare for an article to be represented in even four separate web location categories. In addition to copying the URL, we also categorized the web location and document version according to a standardized schema (Table 1), which is an evolution of the schema used in Laakso and Lindman (2016), tailored to the mechanisms discovered during data collection for this particular study.

Table 1 Document version and web location category classification scheme.
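As a rough illustration of the data recorded per article (a URL, a web location category and a document version for each of up to eight free copies), the record structure could be sketched as follows. The class and field names are hypothetical and the category labels in the comments are examples only. The actual data were collected by hand and coded according to Table 1.

```python
from dataclasses import dataclass, field

MAX_COPIES = 8  # cap on recorded free copies per article, as described above


@dataclass(frozen=True)
class CopyObservation:
    url: str
    web_location: str      # e.g. "ASN" (illustrative label)
    document_version: str  # e.g. "publisher version" (illustrative label)


@dataclass
class ArticleRecord:
    title: str
    year: int
    copies: list[CopyObservation] = field(default_factory=list)

    def add_copy(self, copy: CopyObservation) -> bool:
        """Store a copy unless the cap is reached; return whether it was stored."""
        if len(self.copies) >= MAX_COPIES:
            return False
        self.copies.append(copy)
        return True

    def location_categories(self) -> set[str]:
        """Distinct web location categories through which the article is available."""
        return {c.web_location for c in self.copies}
```

Keeping the web location and the document version as separate fields on each observation allows the same records to be summarised either by location or by version.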

It is important to note that our estimate of the free availability of article copies is likely to be quite conservative. More precisely, we relied on Google Scholar as the sole search engine and focused on available copies detected by it. In recent years the controversial and illegally provided Sci-Hub service has allowed retrieval of free copies of articles from behind paywalls. In addition to Sci-Hub not being indexed in Google Scholar, availability of articles on Sci-Hub was not considered because such access is not enabled by the authors of the articles themselves, and because the systematic access that Sci-Hub provides is illegal and likely to be temporary.

After collecting all the access data for the identified articles, we retrieved the journal policies for the 20 most popular journals in the sample by visiting the journal websites and coding the terms according to a common framework in which each combination of web location and document version is recorded as allowed, prohibited, or unclear. Journal policies tend to distinguish between commercial and non-commercial repositories. Most repositories are non-commercial, but it can be hard for authors to ensure that they comply with this aspect: the relevant information is not always transparent, and the status of a repository might change.
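The coding framework described above, in which each combination of web location and document version is allowed, prohibited, or unclear, can be represented as a simple lookup table. The sketch below is illustrative only: the policy shown belongs to an invented journal, the labels are assumptions, and treating every combination that a policy does not mention as unclear is a design choice of this example rather than a rule stated in the study.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    ALLOWED = "allowed"
    PROHIBITED = "prohibited"
    UNCLEAR = "unclear"


# Hypothetical policy of an invented journal, keyed by (web location, document version).
EXAMPLE_POLICY = {
    ("personal webpage", "accepted manuscript"): Status.ALLOWED,
    ("institutional repository", "accepted manuscript"): Status.ALLOWED,
    ("ASN", "publisher version"): Status.PROHIBITED,
}


def policy_status(policy: dict, web_location: str, document_version: str) -> Status:
    """Look up one combination; anything the policy does not mention is UNCLEAR."""
    return policy.get((web_location, document_version), Status.UNCLEAR)


def tally(policy: dict, copies: list[tuple[str, str]]) -> Counter:
    """Count observed copies by their policy status."""
    return Counter(policy_status(policy, loc, ver) for loc, ver in copies)
```

Defaulting to `Status.UNCLEAR` avoids silently treating an unmentioned combination as either compliant or infringing.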

There are two central methodological limitations to this study which are shared with most previous studies exploring self-archiving policy-alignment across articles published during multiple years. The first limitation is the lack of accounting for changes in publisher policies over time. The second limitation is that information about when a document had been uploaded was not considered due to lack of this information on many web locations. In the following paragraph we describe how these limitations influence the study and interpretations of its results.

Publisher policies were accessed and recorded during the summer of 2017 and those policies were used to analyse the compliance of articles published in 2010–2015. As described earlier in the literature review, the most notable study on changes in publisher policies over time is Gadd and Troll Covey (2016), which found that publishers had often become more specific in listing conditions for self-archiving during the 12-year observation period. It is therefore likely that at least some of the journals in this study have modified their policies since 2010. What alleviates this limitation slightly is that publisher policies are in general retrospectively applicable (i.e. in the absence of an archived version of a potential copyright agreement, authors can consult and act on the current policy when self-archiving older articles published in the journal), which means that the current versions of policies are most likely what would be used in practice for self-archiving said articles today. However, how policies change over time is a largely unexplored area both in research and in practice, and open questions remain due to the lack of general guidelines on how changes should affect self-archiving of older material. With regard to the lack of information about when a document had been uploaded, we made the conscious choice to limit the most recent year observed to 2015, i.e. allowing a minimum of around two years for any embargoes to expire.
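The reasoning about embargo periods can be made explicit with a small date calculation. The following sketch is hypothetical: the study did not record upload dates, and the publication date, embargo lengths and observation date used with it are invented for illustration. The helper `add_months` clamps the day of the month to 28 so that the shifted date is always valid, which is a simplification.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to 28 to keep it valid."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def embargo_expired(published: date, embargo_months: int, observed: date) -> bool:
    """True if the embargo had run out by the time the copy was observed."""
    return add_months(published, embargo_months) <= observed
```

For example, with an assumed observation date of 1 July 2017, `embargo_expired(date(2015, 6, 1), 24, date(2017, 7, 1))` evaluates to `True`, whereas the same article under a hypothetical 36-month embargo would still be embargoed.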

Results

General access metrics

This section presents the results obtained from analysis of the collected web observations for the 1682 unique journal articles authored by the 297 ethicists included in the study.

The annual publication output (2010–2015) is presented in Table 2 together with the annual share of publications for which it was possible to retrieve at least one copy for free. The annual volume of journal articles ranged between 250 and 305 and the share of articles available for free between 52 and 61%. These high-level results demonstrate no consistent tendency for either more recent or older articles to be available more frequently. In total, a free copy could be retrieved for 948 of the 1682 articles, giving an overall open access share of 56%.

Table 2 Annual publication volumes and share of annual publications with at least 1 copy available online for free

The high-level results in Table 2 only outline the complexity found within the dataset. Since we collected web observations for up to 8 freely available copies per article, the observed web location types and document versions varied greatly across the 948 articles for which a copy could be found.

In order to summarise the collected data as comprehensively as possible, Table 3 provides a breakdown of every recorded observation per web location category, subdivided by document version, for all of the 948 articles for which between one and eight free copy observations were made. The three most frequent providers of access to free copies were, in descending order, ASNs, subject repositories, and personal webpages. In all three of these categories the most frequent document version was the publisher’s version.

Table 3 Breakdown of all observations (2183) of free copies, grouped by year of original article publication, web location category, and document version

The results so far have not explored the extent to which article access overlaps across multiple web locations. Figure 1 presents the distribution of article access, with particular focus on conveying the shares of articles available nowhere or, at the other extreme, across six different web location categories, which was the maximum value observed in the dataset. Note that this only presents the spread across unique categories; an article could appear multiple times within the same location category, e.g. in two different institutional repositories, but that is not conveyed here. Of the 1682 articles, 726 (43%) were not available anywhere, 454 (27%) were available through only one web location category, 280 (17%) through two different categories, 126 (7%) through three, 64 (4%) through four, 26 (2%) through five, and 6 (0%) through six categories.

Fig. 1 Distribution of article access across different web location categories
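The counting rule behind this distribution, where an article is counted once per web location category regardless of how many copies it has within that category, can be sketched as follows. The function names and the input format (pairs of article identifier and category) are assumptions made for the illustration and do not describe the tooling used in the study.

```python
from collections import Counter


def category_spread(article_ids, observations) -> Counter:
    """Count articles by their number of distinct web location categories.

    article_ids  -- every article in the population, including those with no copy
    observations -- iterable of (article_id, web_location_category) pairs
    """
    categories = {a: set() for a in article_ids}
    for article_id, category in observations:
        categories[article_id].add(category)  # repeats within a category collapse
    return Counter(len(c) for c in categories.values())


def shares(counts) -> dict[int, int]:
    """Convert counts to whole-number percentages of the population."""
    total = sum(counts.values())
    return {k: round(100 * n / total) for k, n in sorted(counts.items())}
```

Starting from the full list of article identifiers, rather than from the observations alone, is what ensures that articles without any free copy appear in the zero-category group.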

Having articles available through more than one web location arguably increases their resilience against becoming completely unavailable. However, some web locations can be assumed to be more future-proof than others in providing sustained access. Table 4 takes a closer look at the 454 articles that were only available on one type of web location. ASNs were found to be the leading category for providing unique free access to articles (98 articles), followed by publisher webpages (87 articles), and personal webpages (77 articles).

Table 4 Web locations providing unique access to one or more copies of a single article

Continuing to explore the ways in which unique access to content is provided, Table 5 gives a breakdown of the articles for which only a single document version was available, in the same way as unique web location categories were treated above. A clear majority of the 774 articles with only one document version available were publisher versions (500, or 64.6%). This result has implications for the volatility of access, as only a very small minority of publishers allow distribution of the publisher version.

Table 5 Document version distribution for copies where only one type of document version was recorded

While the institutional level is not a primary focus in this study, a high-level comparison grouped by institution can help in discovering access patterns that relate to institutional environments, and particularly the degree to which the institutional repository is used. Table 6 provides a list of institutional affiliations included in the study, sorted by the total number of ethicists identified from each institution in descending order. The higher an institution is on the list, the more useful and reliable the conclusions drawn from its numbers are, since more ethicists and articles are included in the calculations. What is apparent is that UK-based institutions have a higher share of copies available through institutional repositories, something which likely stems from the strong open access policies that have been implemented within the country. The relationship between ASNs and institutional repositories is interesting to look at from this perspective, as authors affiliated with UK-based institutions are also among those with the highest proportion of copies available through ASNs.

Table 6 List of included institutions from which the ethicists were identified

The publication activity among the ethicists included in the study varied a lot in terms of volume (1 article at the minimum, 92 at the maximum). To convey the spread, Table 7 categorizes ethicists by their publication activity during 2010–2015, placing them into one of four categories. Most ethicists published between 1 and 3 articles (120), followed by the category of 4–6 articles (92), the 7–9 article category (53), and finally the category of authors with ten or more articles published. The category comparison consistently suggests that higher publication activity is related to a higher proportion of open access. The share of ethicists without any article available open access also drops as more publications are produced, from 44% in the 1–3 article category to 0% in the highest-activity category.

Table 7 Analysis of proportion of articles available open access based on individual publication activity
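The grouping by publication activity can be illustrated with a short sketch that bins authors and computes two quantities per bin: the pooled share of articles available open access, and the share of authors with no open access article at all. The function names and the input format are hypothetical, and pooling articles within a bin (rather than averaging per-author shares) is an assumption of this example, not a description of how Table 7 was calculated.

```python
from collections import defaultdict


def activity_bin(n_articles: int) -> str:
    """Map an author's article count (at least 1) to one of four activity categories."""
    if n_articles <= 3:
        return "1-3"
    if n_articles <= 6:
        return "4-6"
    if n_articles <= 9:
        return "7-9"
    return "10+"


def summarise_by_activity(authors) -> dict[str, tuple[float, float]]:
    """authors: iterable of (n_articles, n_open_access) pairs, one per author.

    Returns {bin: (pooled open access share, share of authors with no OA article)}.
    """
    articles = defaultdict(int)
    open_access = defaultdict(int)
    n_authors = defaultdict(int)
    n_without_oa = defaultdict(int)
    for n_articles, n_oa in authors:
        b = activity_bin(n_articles)
        articles[b] += n_articles
        open_access[b] += n_oa
        n_authors[b] += 1
        n_without_oa[b] += n_oa == 0  # True counts as 1
    return {
        b: (open_access[b] / articles[b], n_without_oa[b] / n_authors[b])
        for b in articles
    }
```

Reporting both quantities separates two effects that are easy to conflate: how much of a group's output is open, and how many members of the group share nothing at all.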

Through our data collection we recorded 234 unique articles being available directly through publisher websites. In order to provide a better understanding of the exact open access mechanism through which these articles were made available, we returned to the collected URLs and manually classified these observations into more granular categories. In cases where an individual article was available through multiple web locations classified under the ‘publisher website’ category, the open access mechanism was derived from the information related to the copy available through the primary journal website. Table 8 provides a summary of the results. What stands out is that 51% of the publisher website observations were in full open access journals that make all of their content openly available on the web immediately on publication, and that 19% of the publisher website copies were provided as hybrid open access, i.e. articles individually made open access within subscription journals.

Table 8 Distribution of the 234 unique articles with copies found on publisher websites

Observations made on ASNs are another category warranting a closer look beyond the top-level web location category. Table 9 contains a breakdown of all observations made within this web location category. While both Academia.edu and ResearchGate could be considered well-represented, Academia.edu provided access to more than double the number of articles compared to ResearchGate in this population of articles (351 vs. 164). Looking at the distribution of article versions across the two platforms, Academia.edu has a higher relative representation of article versions other than the publisher version (46% non-publisher versions) than ResearchGate (29% non-publisher versions).

Table 9 Copies found on academic social networks, i.e. ResearchGate and Academia.edu

Continuing with the focus on ASNs, Table 10 provides further insight into the exclusivity and overlap in providing access to individual articles between Academia.edu and ResearchGate. This perspective suggests that Academia.edu provides access to almost three times as many articles as ResearchGate (15.4% Academia.edu vs. 5.6% ResearchGate). Something rarely explored at this level of detail is the overlap between access provided through the two services, which here amounted to 3.9% of all articles in the population.
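
The exclusivity and overlap perspective can be illustrated with a minimal sketch using set operations. The function name `exclusivity_and_overlap` and the toy article identifiers are hypothetical and are not taken from the study's dataset; the sketch only shows one way such shares could be derived from per-platform lists of articles.

```python
def exclusivity_and_overlap(site_a, site_b, population_size):
    """Return the share of the population reachable only via A, only via B, and via both.

    site_a and site_b are sets of article identifiers with at least one
    copy on the respective platform (hypothetical input format).
    """
    return {
        "only_a": len(site_a - site_b) / population_size,  # exclusive to A
        "only_b": len(site_b - site_a) / population_size,  # exclusive to B
        "both": len(site_a & site_b) / population_size,    # overlap
    }


# Toy data for illustration only, not the study's observations.
academia = {1, 2, 3, 4, 5}
researchgate = {4, 5, 6}
shares = exclusivity_and_overlap(academia, researchgate, population_size=10)
print(shares)
```

Keeping exclusive and overlapping access as separate shares avoids double counting articles that are available on both platforms.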

Table 10 Exclusivity and overlap in access provided by academic social networks

The web location category we labelled as ‘aggregators’ comprises web locations where access to copies is provided through a secondary mechanism: content is automatically cached and mirrored after first being openly available somewhere else. Aggregators provide little insight into author behaviour, since individual action is not needed. However, they contribute substantially to availability resilience should the original copy be removed. Table 11 shows a breakdown of the observations made within this category: Semantic Scholar (170 copies), CiteSeerX (47 copies), and Core (27 copies). Table 4 showed earlier that locations belonging to this category provided unique access to 8 articles, so while redundancy is high there is a handful of articles which were mirrored by these services before being removed from their original location.

Table 11 Breakdown of observations in the aggregators web location category

Table 12 provides a closer look at the breakdown of copies found in subject repositories, where PhilPapers constitutes 43% of all observations in this category (159 copies). Second, third, and fourth on the list are subject repositories belonging to the PubMed Central network, which focus on biomedical and life sciences content (168 copies in total, spread across the US, European, and Canadian platforms).

Table 12 Breakdown of observations from subject repositories

Regarding copies found within the web location category of ‘other website’, no individual domain reached even 10 observations; as such, no detailed analysis of these domains is provided.

Compliance analysis

The first part of the results section was dedicated to providing a comprehensive picture of access to all of the articles included in the population. The remainder of the results section is dedicated to investigating the degree to which copies are aligned with the distribution instructions set out by journals as part of the self-archiving instructions provided to authors. The 1682 journal articles of the sample were published by a total of 481 different journals. Since detailed information about journal self-archiving policies needs to be collected and coded on a per-journal basis, the compliancy analysis is limited to articles belonging to the twenty most frequent journal outlets in the dataset. The policies were collected during the summer of 2017, and the compliance of articles published during 2010–2015 was interpreted through that information. Please see the methodology section for more discussion about the potential implications of this methodological limitation.

Table 13 provides an overview of which journals are included together with the article count for each journal which spans from 100 articles for Philosophical Studies to 14 for Erkenntnis.

Table 13 Journals included in compliance analysis, with overview of available copies

The total number of articles included in the compliance analysis was 597, and concerning authors it included 217 of the 297 ethicists included in the full population. Since the focus of this analysis was on studying author behaviour when it comes to access provision in light of journal policies, observations belonging to copies found directly on publisher websites, through aggregators, and as JSTOR read-only copies are not included, since they are not reliant on journal self-archiving policies and provide little opportunity for authors to influence their availability. Of the 597 articles included in the analysis, our data collection had retrieved at least one copy for 293 of the articles with the previously mentioned limitations in place. Document versions where the exact version status could not be established were compared to the publisher’s policy for allowing dissemination of accepted manuscripts.

As with the earlier overview of the complete dataset, giving one single exhaustive table or visualisation of the contents is not possible without losing important information, because observations overlap. Starting with an overview of the policy alignment over all observations, Table 14 gives insight into the policy status of copies found at the five web location categories included in the analysis. Of the 487 copies observed, 258 were non-compliant, 166 compliant, and 63 had an unclear status where the combination of web location category and document version was neither prohibited nor permitted explicitly in the publisher policy. Most of the non-compliancy is due to use of the publisher version across all location categories, combined with few journals allowing copies to be distributed on commercial platforms (i.e. ASNs) in any form.
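
For readers who prefer proportions, the copy counts reported above can be converted to percentage shares with a few lines of Python. The counts are those stated in the text; the variable names are our own, and the rounding to one decimal is an assumption made for this illustration.

```python
# Copy counts by policy status, as reported in the text for Table 14.
copies = {"non-compliant": 258, "compliant": 166, "unclear": 63}

total = sum(copies.values())  # should equal the 487 copies observed

# Percentage share of each policy status, rounded to one decimal.
shares = {status: round(100 * n / total, 1) for status, n in copies.items()}

for status, share in shares.items():
    print(f"{status}: {share}% of {total} copies")
```

On these figures, slightly more than half of the observed copies fall into the non-compliant category.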

Table 14 Overview of all copies found and their policy compliancy related to the 597 articles included in the compliancy analysis

Table 14 does not shed light on overlap, where multiple combinations of version-location copies could be observed per original article in the sample, and is thus of little aid for understanding policy alignment at a deeper level. Figure 2 helps remedy this by showing the complete per-article policy distribution of the population of 597 articles included in the compliance analysis. Of the 293 articles for which at least one copy could be found, the journal policy status of 211 articles belonged to just one policy category, while the copies retrieved for the 82 remaining articles produced mixes of aligned, infringing, and unclear policy status.

Fig. 2
Diagram produced by using eulerAPE open source software (Micallef and Rodgers 2014)

Compliancy overlap of copies found. This figure visualises the policy status and overlap for the 293 articles for which one or more free copies could be found (overlap = multiple copies with differing policy statuses found for one article).
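
The per-article view underlying such an overlap diagram can be sketched as follows. This is an illustration under assumptions rather than the procedure used in the study: the `(article_id, policy_status)` record format and the toy observations are hypothetical, and only the three status labels (aligned, infringing, unclear) come from the text.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations: one (article_id, policy_status) pair per copy found.
copies = [
    ("a1", "aligned"),
    ("a1", "infringing"),
    ("a2", "infringing"),
    ("a2", "infringing"),
    ("a3", "unclear"),
    ("a4", "aligned"),
    ("a4", "unclear"),
]

# Collect the set of distinct policy statuses observed for each article.
statuses_per_article = defaultdict(set)
for article_id, status in copies:
    statuses_per_article[article_id].add(status)

# Each distinct combination of statuses corresponds to one region of the diagram.
region_counts = Counter(frozenset(s) for s in statuses_per_article.values())

single_status = sum(n for combo, n in region_counts.items() if len(combo) == 1)
mixed_status = sum(n for combo, n in region_counts.items() if len(combo) > 1)
print(single_status, mixed_status)
```

Counting per article rather than per copy is what separates articles belonging to a single policy category from those with a mix of statuses.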

Conclusions regarding the aspect of undersharing, i.e. the degree to which research that could be made open access is not, can be grounded by observing Fig. 2 in conjunction with the journal policies. All but one of the twenty journals included in the compliance analysis explicitly allow self-archiving of the accepted version on institutional and subject repositories, and that journal merely leaves those locations unclear while explicitly allowing self-archiving on a personal webpage. As such, the theoretical maximum that could be made available within reasonable effort on the author side is 100%. The current utilization of policy-compliant self-archiving is 22.1%, while disregarding the aspect of policy alignment the utilization is 49.1%.
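
The utilization figures can be checked with simple arithmetic. The sketch below uses the 597 articles and the 293 articles with at least one copy stated in the text. The number of articles with at least one policy-compliant copy is not reported in this section, so the code only searches for whole-number counts consistent with the reported 22.1%; that count is an inference for illustration, not a reported result.

```python
TOTAL_ARTICLES = 597          # articles in the compliance analysis (from the text)
ARTICLES_WITH_ANY_COPY = 293  # articles with at least one copy found (from the text)


def pct(part, whole):
    """Percentage share rounded to one decimal."""
    return round(100 * part / whole, 1)


# Utilization when disregarding policy alignment.
any_copy_utilization = pct(ARTICLES_WITH_ANY_COPY, TOTAL_ARTICLES)

# Inferred, not reported: article counts that would round to the stated 22.1%.
consistent_counts = [
    n for n in range(TOTAL_ARTICLES + 1) if pct(n, TOTAL_ARTICLES) == 22.1
]

print(any_copy_utilization)
print(consistent_counts)
```

Measured against the theoretical maximum of 100%, both figures leave a large share of articles that could be self-archived but are not.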

The final component introduced as part of the compliancy analysis is the perspective of author publication activity in relation to policy alignment. Table 15 provides similar publication activity categories to those found within the full overview of the dataset (Table 7); however, the scope is now limited to articles published in the 20 journals which were part of the compliancy analysis. The comparison between the categories shows that it is more common for authors to have at least one policy-infringing copy of one of their articles available than to have at least one policy-aligned copy available for at least one article, and this relationship was found across all publication activity categories.

Table 15 Ethicist publication activity categorization and policy alignment comparison as part of compliancy analysis

This concludes the presentation of results. In the following section we provide further interpretation of the results in terms of potential implications, and describe how they relate to previous work within this area of research.

Discussion

In order to determine the level to which the previously stated research questions could be answered based on the data and its analysis, the questions are dealt with individually.

The first question was: To what degree are ethicists’ journal publications freely available online? The short and simple answer is that slightly over half (56%) of recent journal publications are available to read for free, with Table 2 containing the main data. As the review of results demonstrated, this figure hides a lot of complexity regarding how access is distributed among authors, journals, various web locations, and document versions, which the follow-up analysis in the results section shed light on. Compared to earlier studies on open access shares within the humanities, and philosophy in particular (e.g. Bosman and Kramer 2018; Science-Metrix 2018; Archambault et al. 2014), the figure of 56% available open access is very high. There are at least three key factors facilitating the observed high share. Firstly, we did not select only journals within a specific discipline to define our population of studied articles, but rather selected authors specializing in a specific discipline and considered the full breadth of their journal article output. This leads to inclusion of articles published in journals within e.g. the health sciences and natural sciences, where open access is a more well-established practice among journals and potential co-authors. Secondly, we did not limit inclusion to articles in journals indexed in either Scopus or Web of Science; we included all journals, which means that there is likely a more extensive representation of newly started open access journals that are not yet included in any indexing service. Lastly, our methodology for collecting data was manual rather than automated through an API or web scraper.

The second question was: How common is it for journal publications to be open access through journal websites within the field of ethics? Again, the short answer of 13.9% for all articles is a convenient summary but hides behind it many different mechanisms through which journals can provide open access to their publications. The breakdown of the various mechanisms provided in Table 8 shows that roughly half (51%) of all journal articles available on journal websites were in full open access journals where all content is made open access, with the remainder split between hybrid open access articles (19%), and a mixture of promotional, delayed, and unknown open access mechanisms (30%).

The third question was: Which websites and platforms do ethicists use when self-archiving? Here there is not one simple answer that can be given, but in descending order of popularity ASNs, subject repositories, and personal websites of authors are the top three web location categories used for providing free access to publications. Table 3 provides the complete breakdown for found copies across web locations. Here a few remarks are in order. While institutional and subject repository access can be assumed to be the most resilient, due to their monitoring and enforcement of publisher self-archiving policies as well as their provision of persistent identifiers and URLs, access provided through ASNs can be considered volatile due to platform ownership structures being in flux and publisher legal action being targeted at such platforms specifically. But ASNs are not the only platforms susceptible to volatility; even publisher webpage access is not guaranteed to persist. In a recent large-scale study by Piwowar et al. (2018) publisher webpages were found to provide access to a large proportion of freely available content, without licensing such content as open and free to distribute. This can be the case when publishers provide e.g. the first issue of the most recent volume free to read on a moving wall basis for promotional purposes, or when a journal provides free access after a set delay of e.g. 12 months since original publication (Laakso and Björk 2013) but can withdraw such access at any moment.

The fourth question was: What versions of the publications do ethicists use when self-archiving? Across all locations other than institutional repositories the most frequent article version was the publisher’s version. This breakdown was provided in Table 3, while Table 5 further highlighted that for 500 of the 1682 articles included in the study the only freely available version was the publisher’s version. This is a large concern for the sustainability of the current level of access, since very few publishers allow distribution of the publisher’s version of the article. Accepted manuscripts are the second largest category, followed by unlabeled versions where the status of the manuscripts was unknown, and lastly preprints, which had a minimal presence in the dataset.

Moving towards the compliancy aspect of the study, the fifth question was: Are popular ethics journals clear with regard to their self-archiving policies? The answer is yes for the most part, though there is some room for improvement. This aspect was explored by collecting and manually coding the allowed and prohibited web location/document version categories as per the categorizations utilized in this study. A study conducted over 10 years ago on journals within Library and Information Science by Coleman (2007) demonstrated that publisher policies for self-archiving were sometimes not publicly available on journal websites or were ambiguous in their formulation. While the situation has improved since then, with most journals having information on display, the policies can still leave room for interpretation. While all 20 journals included in this part of the study provided clear instructions regarding distribution of accepted manuscripts on institutional and subject repositories, five journals did not have a clear policy regarding ASNs, and one journal failed to give a clear policy regarding dissemination on personal webpages. With regard to preprints, five journals had unclear status for some web locations, and all non-open access journals were very clear in prohibiting distribution of the publisher’s version unless specifically paid for through hybrid open access.

The sixth question was: Comparing policies to web observations, are ethicists prone to copyright infringement? Here the answer is yes, based on interpretation through the publisher policies collected in 2017. However, there is no reason to believe that ethicists would be more or less prone than other researchers to making content available through infringing web locations and document versions. Table 14 and Fig. 2 provide more detail on the policy distribution of articles included in the policy analysis, and whichever way one looks at the data the majority of copies made available are not provided in compliance with the journal publisher policies.

The seventh question was: Do ethicists undershare their research outputs? The answer is yes, and particularly when considering the current proportion of policy-compliant sharing. Based on the publisher policies the theoretical maximum that could be made available with reasonable effort on the author side is 100%. The current utilization of policy-compliant self-archiving is 22.1%, while disregarding the aspect of policy-alignment the utilization is 49.1%.

The eighth question was: What is the current role of institutional repositories in facilitating authors’ self-archiving? Since many institutions were only represented by a few ethicists it is impossible to draw reliable conclusions comparing the institutions to each other. However, the overall position of institutional repositories in relation to other web locations is weak. As Table 4 demonstrates, institutional repositories provided unique access to only 38 articles, which was the second lowest figure for any web location category, ahead only of aggregators, which by definition should not provide any unique access. As such, the current role of institutional repositories outside of the United Kingdom seems to be relatively weak, and they are often used in tandem with other channels of distribution.

The ninth and last question was: What is the current role of ASNs and subject repositories in facilitating authors’ self-archiving? ASNs are a dominant presence and a very popular venue through which ethicists provide free access to their research. Since alignment with publisher policies has not been rigorously enforced on such platforms as of yet, a large share of content available through Academia.edu and ResearchGate is infringing on such policies. Overall this is nothing new; the rise in popularity of ASNs for disseminating full-text copies of research has been highlighted through multiple previous studies. PhilPapers was clearly the most popular subject repository, and there is reason to believe that the popularity will increase in the future. In October 2017, i.e. after the data collection for this study had been completed, PhilPapers announced the rebranding and relaunch of PhilArchive, a sister website focusing on the repository functions of PhilPapers, in an effort to make more authors aware of the possibility to disseminate their works through the service (philpapers.org 2017). As of the 7th of December PhilArchive hosts over 28 000 open access works. Ethicists conduct multi-disciplinary research and publish their research across a broad spectrum of journals. The 1682 journal articles were spread across 431 different journals, many of which do not have philosophy or ethics as their disciplinary focus. This is also seen in the prevalence of content in PubMed Central subject repositories, which took up the three spots following PhilPapers in the listing of most popular subject repositories.

Conclusions

The study discovered a high proportion of articles available open access, 56%, which is among the higher open access percentages observed in any study for any discipline. That this figure stems from ethics researchers is even more surprising since previous studies have measured very low open access shares for articles published by journals within the humanities and philosophy. Open access to 27% of total articles, i.e. close to half of the 56% total open access observed, was provided through a single copy available on the web. A cause for concern for the long-term availability of content is that ASNs were found to be the most frequent provider of open access to articles which were not available anywhere else on the web. However, we could also observe that ASNs often have a complementary role as a parallel avenue through which researchers choose to make their works available. Academia.edu, ResearchGate and PhilPapers were all observed to have a strong presence among the dissemination channels used by ethicists, while institutional repositories were found to have low use outside of a few universities that seem to have stronger support for ensuring that content is self-archived in the institutional repository. We found that ethicists are at the same time prone to copyright infringement and undersharing their scholarly work, i.e. articles are made available on the open web in ways incompatible with publisher policies (mainly publisher’s PDFs distributed on ASNs) while these and a much larger proportion of articles could be self-archived in compliance with the policies but are not.

The main contribution of this study has been to provide a test case and template for conducting near-exhaustive mapping of accessibility within a research discipline, with high granularity in web location and document version classifications. The study is one of the few that have explored authors’ copyright infringement and undersharing as part of the same study. With the exception of Troll Covey (2009b), previous works have tended to focus on either undersharing (Borrego 2017) or copyright infringement (Jamali 2017) in isolation, but doing so offers only partial understanding of authors’ compliance with open access policies. The study has observed that the phenomenon of widespread use of ASNs seems to extend with similar prevalence to ethicists, whose field, as part of philosophy and the humanities, has been observed to be lagging in comparison to other disciplines when it comes to open access prevalence through established journal publishing and repository usage. The discrepancy between the use of ResearchGate and Academia.edu is also an interesting finding; most previous studies have found ResearchGate to be the more popular platform, while the reverse was true for this population of authors.

We encourage future research mapping and assessing access and policy alignment of web distribution to utilize methodologies that capture the complex and overlapping nature of access provision. In order for studies in this area to improve in accuracy, reliability, and replicability there would be great benefit in having longitudinal datasets of publisher policies to utilize. Studies inquiring into aspects of open access need to carefully weigh the benefits and drawbacks of different sampling strategies against each other. The approach can be author-centric like this study, or it can be journal-based, depending on the fit with the posed research questions. If the goal is to draw conclusions at both the author and journal level, the sample of included articles needs to be sufficiently large, and the level of detail for the observations needs to be high and inclusive of overlapping ways of providing access.