US researchers will soon face requirements to publicly share both the peer-reviewed publications and the digital data that result from their federally funded work. In mid-February, the White House Office of Science and Technology Policy (OSTP) issued a memorandum that gave federal agencies with more than $100 million in annual research and development expenditures six months to develop plans to increase public access to federally funded research results. The memo argues that these public access policies will accelerate scientific breakthroughs and enhance economic growth and job creation.

The US government has been developing its public access objectives for several years, gathering input from the National Science and Technology Council and through public consultation. The OSTP memo follows from a requirement in the America COMPETES Reauthorization Act of 2010 (P.L. 111-358) for policies to improve the quality, access, dissemination, and long-term stewardship of federally supported research results. The United States is not the only country making the move to public access. The United Kingdom implemented new open access (OA) policies for peer-reviewed publications on April 1, 2013, and the European Commission has plans to expand its OA requirements as well.

For publications, the memo requires agencies to “ensure that the public can read, download, and analyze in digital form final peer-reviewed manuscripts or final published documents.” It recommends a 12-month post-publication embargo period before research papers must be made publicly available, but allows agencies to tailor the time frame to accommodate the needs of different disciplines. The National Institutes of Health already mandates that publications resulting from its funding be deposited in the PubMed Central database within 12 months of publication, and some journals, such as Science, make all of their articles freely available after 12 months.

Most journals already offer at least the option of OA publication under either the Green or the Gold OA model. The Green OA model allows authors to deposit articles in non-commercial repositories like arXiv.org, or to post an article to their own website, while the Gold OA model allows free access directly from the publisher’s website, with publication costs covered by the author through grant funds or by means other than traditional subscriptions.

However, several publishers that depend heavily on subscription revenue to cover their expenses are concerned that a 12-month embargo would jeopardize their financial sustainability. The Institute of Electrical and Electronics Engineers, for example, has noted that 85% of articles retrieved from its digital library are older than 12 months. Stakeholders therefore advocate that agency procedures account for the economic implications of the various public access models, so that publishing remains sustainable.

The memo leaves open the question of how public access will be achieved, requiring only that agency procedures “facilitate easy public search, analysis of, and access to” the publications. If a repository approach is chosen, the management of this infrastructure could be centralized within the federal government or distributed in separate repositories managed by institutions, publishers, professional societies, or other third parties. Ideal functionalities of a repository include capabilities for bulk downloading, text mining, and computational analysis, as well as interoperability with other repositories and long-term stewardship and preservation of the work.

The OSTP memo states that data sets used to support scholarly publications “should be stored and publicly accessible to search, retrieve, and analyze,” so long as they are non-confidential and non-proprietary. The memo does not specify the time period before data must be made publicly available, but stakeholder recommendations have ranged from immediately after the data have been gathered, to after an embargoed publication period, to upon completion of the relevant supporting grant.

Currently, data access and analysis are hampered by inconsistencies in formatting, nomenclature, metadata standards, and approaches for storage. Implementation of any plan will need to address these issues. As with publications, an infrastructure for housing data contributions has not been determined and may require considerable development to include full functionality, such as search functions, mining, and interoperability with other repositories. Additionally, stakeholders recommend a standardized data attribution system along the lines of the Digital Object Identifier System, ideally linked to the relevant publication.
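
To make the attribution idea concrete, the following is a minimal sketch of what a standardized data set record linked to its publication might look like. It is purely illustrative: the field names, the `DatasetRecord` class, and the identifiers shown are assumptions made for this example, not a schema proposed by the OSTP memo or by any agency.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical minimal metadata for one deposited data set."""
    dataset_doi: str        # persistent identifier for the data set itself
    title: str
    creators: tuple         # names of the people to credit
    publication_doi: str    # identifier of the paper the data support
    file_format: str        # e.g. "text/csv"

    def missing_fields(self) -> list:
        """Return the names of any fields that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def citation(self) -> str:
        """Build a human-readable attribution line for the data set."""
        names = "; ".join(self.creators)
        return f"{names}. {self.title} [data set]. https://doi.org/{self.dataset_doi}"


# Placeholder identifiers and names, invented for illustration only.
record = DatasetRecord(
    dataset_doi="10.5555/example.dataset.1",
    title="Example measurement series",
    creators=("A. Researcher", "B. Researcher"),
    publication_doi="10.5555/example.article.1",
    file_format="text/csv",
)

print(record.missing_fields())
print(record.citation())
```

A repository could run a completeness check of this kind at deposit time, so that every data set carries both its own identifier and a link to the paper it supports.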

A significant and as-yet unanswered question is how these policies will be funded. The OSTP memo recommends leveraging existing archives and fostering public-private partnerships to maximize efficiencies. It also calls for agencies to identify resources within their existing budgets to implement the plans and ensure compliance with their policies. However, it seems likely that full implementation and compliance will require more resources than are available in existing budgets.

The cost of making data publicly accessible is unknown. The OSTP memo requires that agencies allow the inclusion of appropriate costs for data management and access in grant proposals, but for the terabyte-scale data sets common among some computational scientists, these costs could consume a significant portion of a grant budget. The short-term costs associated with databases include storage and distribution bandwidth, but agency plans should also provide contingencies for long-term costs such as preservation and potential migration. Some aspects of the infrastructure and expenses should be more manageable in the near term, as the memo stipulates that these public access requirements will apply only to manuscripts submitted for publication and data generated after the effective date of any implemented agency policy.

As the August 22, 2013, deadline for agencies to deliver their draft plans to OSTP fast approaches, the policies are being discussed in many forums, including National Research Council public comment sessions that took place in May. While many details associated with carrying out the OSTP memo’s objectives are still in development, one thing seems certain: proper implementation of increased public access to publications and data will require significant resources in a time of constrained budgets.