Investment in new research and technology is made under the assumption that scientific claims are supported by solid evidence; however, recent studies have shown that this is often not the case. For example, it has been shown that some published results with major impact are difficult or impossible to replicate (e.g. Prinz et al. 2011; Begley and Ellis 2012; Fokkens et al. 2013; Anderson et al. 2015), and that exaggerated and false claims, sometimes with fabricated data and fake authors, have been accepted by and published in respectable journals (e.g., Fanelli 2009; Ioannidis 2011; Bohannon 2013; Hvistendahl 2013). As a result, there is an increasingly urgent call for validation and verification of published research results, both within the academic community and among the public at large (e.g. Naik 2011; Zimmer 2012; Begley 2012; Editorial 2013a, b; Branco 2012). The discussion surrounding the reliability of published scientific results often focuses on the Life Sciences, especially in the media, because of the immediate relevance of work in genomics, neuroscience, and other health-related areas to the public good; but the problem extends to all empirically based disciplines, including human language technology (HLT). In fact, several recent articles have reported on reproducibility and/or replication problems in the HLT field (e.g., Johnson et al. 2007; Poprat et al. 2008; Gao and Vogel 2008; Caporaso et al. 2008; Kano et al. 2009; Fokkens et al. 2013; Hagen et al. 2015), and two recent workshopsFootnote 1 have addressed the need for replication and reproduction of HLT results. However, there is no established venue for publications on the topic, and, perhaps more problematically, research that investigates existing methods rather than introducing new ones is often implicitly discouraged in the peer review process.Footnote 2

To address this need, Language Resources and Evaluation (LRE), the premier journal for publication of papers concerning resources that support HLT research as well as evaluation of both resources and results, is acting to encourage the discussion and advancement of what is commonly referred to as replicability and reproducibility in the field of Human Language Technology. Researchers have not always used these two terms consistently, as discussed in Liberman (2015); here we adopt the distinction put forward in Stodden et al. (2014):

Replication, the practice of independently implementing scientific experiments to validate specific findings, is the cornerstone of discovering scientific truth. Related to replication is reproducibility, which is the calculation of quantitative scientific results by independent scientists using the original datasets and methods. (Preface, p. vii)

It should be noted that despite efforts to distinguish reproducibility from replicability (e.g., by defining “levels” of reproducibility, as in Dalle (2012)), the line between the two is not always clear. What is clear is that, whether for the purposes of replication or reproduction of prior results, access to the resources, procedures, parameters, and test data used in the original work is critical to the exercise. It has been argued (Ince et al. 2012) that insightful reproduction can be an (almost) impossible undertaking without access to the source code, resources (lexica, corpora, tag-sets), explicit test sets (e.g., in the case of cross-validation), procedural information (e.g., tokenization rules), and configuration settings, among others;Footnote 3 and it has been shown that source code alone is not sufficient to reproduce results (Louridas and Gousios 2012). Awareness of the importance of open experiments, in which all required resources and information are provided, is evident in publications in high-profile journals such as Nature (Ince et al. 2012) and in initiatives such as myExperimentFootnote 4 and gitXiv.Footnote 5 However, as discussed in Howison and Herbsleb (2013), even though its importance is increasingly recognized, often not enough (academic) credit is given for making available the code and resources used to produce a set of results.

By establishing a special section on Replicability and Reproducibility, LRE is encouraging submission of articles providing positive or negative quantitative assessments of previously published results in the field. We also encourage submission of position papers discussing procedures for replication and reproduction, including those that may be specific to HLT or that could be adopted or adapted from neighboring areas, as well as papers addressing new challenges posed by replication studies themselves. Submissions outlining proposals for solutions to the replicability/reproducibility problem and/or describing platforms that enable and support “slow science”Footnote 6 and open, collaborative science in general are also welcome. Articles accepted for publication on the theme will be highlighted in a special section of the LRE issue in which they appear, under the heading “replicability and reproducibility”. Three members of the LRE Editorial Board (António Branco, Kevin Bretonnel Cohen, and Piek Vossen) have been appointed to oversee the reviewing process for submissions addressing the topic.

At the same time, in order to promote the availability of the resources required for adequate replication and reproduction of research results, the journal also strongly encourages authors of submissions reporting novel research results to provide full and open access to these materials where possible, by including information about where they can be obtained (e.g., a GitHub or gitXiv repository, a URL for a Jupyter NotebookFootnote 7, etc.).Footnote 8 Our review form is being modified to reflect this new emphasis, by asking reviewers whether full materials have been made openly available.

LRE accepts full papers, survey articles, and Project Notes. Any of these categories is appropriate for papers reporting on replication/reproduction experiments, as well as for papers addressing issues surrounding the topic. LRE Project Notes, in particular, provide a venue for publishing information about the availability of materials and experimental data for experiments previously reported in LRE or elsewhere, or data reflecting interim results that can be used in replication/reproduction studies and upon which others can profitably build or expand.

LRE’s fostering of submissions reporting the results of replicability and reproducibility studies, together with reports on experimental resource availability, reflects its commitment to a fundamentally collaborative (rather than competitive) mindset within the field. In addition, by providing a respected venue for publications on the topic, the journal wishes to reiterate its commitment to ensuring adequate academic credit for research and development activities, including both replicability and reproducibility studies and the publication of experimental resources, that traditionally have not been well recognized in HLT.