Ensuring data access, transparency, and preservation: mandatory data deposition for Behavioral Ecology and Sociobiology

Universal data sharing allows access to data sets supporting published research and enables the use of data for additional applications. It also serves quality control by helping to ensure compliance with best practices for publication and scientific ethics and the reliability of data storage. Data deposition in repositories is obligatory for publications involving molecular data (e.g., nucleotide sequence data, protein structural data) and for all data in publications in the life sciences and earth sciences in the high-impact journals Nature, Science, Proceedings of the National Academy of Sciences USA, and Proceedings of the Royal Society B (see https://www.nature.com/sdata/policies/repositories for recommended repositories for Nature). Where ethical, data sharing allows the preservation and reuse of data (Duke and Porter 2013; Mills et al. 2015). Data not stored in a repository may be lost in the future: in a study of more than 500 papers containing morphological data, each annual increase in article age was found to decrease the odds of the data set remaining extant by 17% (Vines et al. 2013). Data sharing also facilitates the detection of irregularities or anomalies in raw data that may lead to corrections or retractions of published papers in behavioral ecology (e.g., Keiser et al. 2020; Laskowski et al. 2020a, b; Proceedings B Editorial Team 2020). Mandatory data sharing can thus strengthen journal credibility and stature and generally advance scientific interests (e.g., Piwowar and Vision 2013). Data upload is still uncommon in the publication of behavioral research (Caetano and Aisenberg 2014), but mandatory requirements for making data available are increasing (Setchell et al. 2016; Sim et al. 2020).
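The 17% figure reported by Vines et al. (2013) refers to odds rather than probability, so the decline compounds multiplicatively with article age. A minimal Python sketch of that arithmetic follows. The function names and the starting odds of 1.0 (a 50% chance that the data are extant at year 0) are illustrative assumptions and are not values taken from the study.

```python
# Illustrative arithmetic only: the 17% annual decline in odds is from
# Vines et al. (2013); the baseline odds used below are a hypothetical example.

ANNUAL_ODDS_DECLINE = 0.17  # reported decrease in odds per year of article age


def odds_after(initial_odds: float, years: int) -> float:
    """Odds that a data set is still extant after `years`, compounding yearly."""
    return initial_odds * (1.0 - ANNUAL_ODDS_DECLINE) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds, defined as p / (1 - p), back to a probability p."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    baseline_odds = 1.0  # assumption: a 50% chance at year 0
    for years in (0, 5, 10, 20):
        odds = odds_after(baseline_odds, years)
        prob = odds_to_probability(odds)
        print(f"{years:>2} years: odds {odds:.3f}, probability {prob:.3f}")
```

Whatever the starting value, ten years of article age reduce the odds to roughly 16% of that value (0.83^10 ≈ 0.155). Converting those odds back to a probability depends on the baseline, which is assumed here.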
Behavioral Ecology and Sociobiology now requires data deposition and strongly recommends that source code be supplied for publications in the journal. This strengthens our policy of promoting the transparency and reproducibility of behavioral science by making data sets accessible (see Instructions for Authors for recommended repositories). The change in policy will ultimately benefit authors, as papers with publicly available datasets have higher citation rates than similar studies that do not post data (Piwowar and Vision 2013). Sharing long-term data may come with serious costs to the principal investigator; Mills et al. (2015) therefore suggest that “journals should have a rule that no paper is considered where the data users have not corresponded with the data owners and included appropriate acknowledgment of the source of the data within the paper.” Data for articles published in Behavioral Ecology and Sociobiology should be archived properly; recommendations are provided by Roche et al. (2015) and on the websites of upload services. Archiving errors should be minimized, or the full benefits of data availability will not be realized. For example, in a study of 100 datasets in Dryad from non-molecular evolutionary and/or ecological publications in leading journals, 56% were incomplete, and 64% were archived in formats that partially or entirely prevented reuse (Roche et al. 2015). The most common problems with these data sets were inadequate metadata, the use of inadequate file formats, and failure to archive raw data. Proper data archiving has a high benefit, as completeness and reusability scores are strongly correlated (Roche et al. 2015).
Repositories may be institutional, national, or global. Dryad, Drum, and EASY have been identified as suitable repositories for general scientific data and have a number of favorable characteristics, such as the availability of guidelines for upload and storage and long-term preservation (Banzi et al. 2019). Sequence data must be deposited in disciplinary repositories. Institutional repositories may be better able to guarantee the ethical use of shared long-term data (Mills et al. 2015).
* Theo C. M. Bakker tbakker@evolution.uni-bonn.de

Acknowledgments We thank Rebecca Grant for comments on an earlier draft.
Funding Open Access funding enabled and organized by Projekt DEAL.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.