1 Introduction

In the beginning, there was little to distinguish science from magic.

Any effort to differentiate the two disciplines would probably have lacked meaning to practitioners of either (see Dampier 1961, p. 144). Those we recognize today as early scientists were people who, like John Dee, spent much time contemplating and trying to explain the universe by means of a curious mixture of mathematics and conjuring (Dee was imprisoned in 1555 for casting horoscopes of Queen Mary; see Forshaw 2017, p. 120). No less an icon of the Scientific Revolution than Newton was called by John Maynard Keynes “not the first of the age of reason [but] the last of the magicians” for having straddled the emerging divide between magic and science.

This divide was most keenly observed in the fruits of magical and scientific thought. Where chrysopoeia, the art of turning base metals into gold, never succeeded at its stated objective, the scientific method has helped us to understand (and change) the world, in ways that would almost certainly seem like sorcery to pre-scientific peoples (and, one might argue, to willfully sub-scientific people in the twenty-first century).

A key reason for the success of modern science is its methodical approach to observing, hypothesizing about, and predicting phenomena, if not with full certainty, then at least with bounded levels of uncertainty. But this is not all. One of the most famous scientific disputes of all time concerns the discovery of differential and integral calculus. It is now accepted that Newton and Leibniz arrived at the principles of calculus independently; slicing space and time into infinitesimal pieces appears to have been an idea that was ripe for discovery. But while Newton appears to have worked on the fundamentals of calculus earlier (giving him notional priority), for some reason he did not publish his fluxional notation for the calculus in full until 1704, approximately 20 years after Leibniz published Nova Methodus pro Maximis et Minimis, the first paper to employ his version of calculus.

Clearly, it is not enough to claim that one has harnessed infinite powers to describe the universe (see Strogatz 2019): for a discovery to count in science, the process by which those powers were harnessed must be public and clear, so that others can verify the claim and replicate it with consistent success. Unlike magicians, who jealously guard their secrets lest others steal their powers, scientists must submit their discoveries to the constant challenge of examination and falsification.

2 The opening and closing of science

Science, of course, has changed remarkably since its quasi-magical origins in the days of Dee and Newton. While always a collective enterprise (e.g., “If I have seen further it is by standing on the shoulders of Giants”), science in the twenty-first century is no longer the abode of mighty minds making astounding discoveries powered mostly by incredible imaginations. The enterprise of science has become increasingly complex, with more, and more refined, moving parts that try to uncover the secrets of subtler and perhaps more elusive aspects of natural and social phenomena.

As science has become a more democratic activity with growing numbers of practitioners, each with their very own human motivations for doing it (not to mention their individual foibles and limitations), peer review has come to be seen as an almost essential mechanism to weed out error and to better approximate scientific truth. Indeed, peer review was already seen toward the end of the twentieth century as “the warp that holds the complex fabric of science together” (Jukes 1977, who incidentally went on to add that “the woof is the noise made by scientists who complain about it”, anticipating by decades the memetic complaint about Reviewer #2 in the age of Twitter). And although questions abound about how efficiently the mechanisms of peer review achieve this (see Smith 1997; Tennant and Ross-Hellauer 2020), in my mind at least there is little doubt that evaluation of research by peers, even when it happens after publication, remains an effective mechanism to discover error and fraud: snake oil has never become a panacea, and not for lack of testing.

Calls for opening up the black box of peer review are not new. Smith (1997), for instance, was concerned with editorial peer review; I tend to think of peer review in a more general sense: the examination of research results at any stage in the life of an idea, including after its publication. A challenge, in either case, is that in contemporary practice the results of published research often fail to be reproducible, let alone replicable (Gustot 2020; Iqbal et al. 2016). There are many reasons why research may not be reproducible or replicable, including poorly documented data collection processes, use of proprietary software, failure to share data and code, and so on (see Stodden et al. 2018). In such circumstances, replication is an expensive proposition in need of institutional support (e.g., research grants), human resources and time (e.g., training scientists), and appropriate incentives (e.g., discovery, publications, tenure). Needless to say, in the current “publish or perish” environment, the deck is often stacked against reproducibility and replication, as “new” results tend to be prized above confirmation by funding agencies, reviewers, researchers, and editors. Paradoxically, however, when research is not reproducible or replicable, the box of science becomes almost as black and closed as that of magic, and peers and users of scientific information are asked to accept the results on faith.

The present issue on Open Spatial Sciences was born of my growing interest in reproducibility and replicability in my own little corner of science over the past few years. This issue draws inspiration from an earlier thematic collection published in Geographical Analysis (Rey and Anselin 2006) that concentrated on software for spatial analysis. According to Rey and Anselin (2006), early developments in software for spatial analysis emerged in response to the specialized needs of working with geographical data (e.g., creating maps, computing matrices of spatial relationships, etc.). These developments in turn led to the dissemination of numerous tools, thanks to pioneering work such as that of Luc Anselin, first with SpaceStat and then with GeoDa (e.g., Anselin 2000; Anselin et al. 2006), Roger Bivand and his contributions to spatial analysis in ‘R’ (e.g., Bivand 2006; Bivand et al. 2013), and Jim LeSage and his MATLAB toolboxes (e.g., Liu and LeSage 2010), to name just a few. Many of these developments were explicitly open, in the sense that the code, far from being a black box, was available for examination by any interested party. In a subsequent paper, Rey (2009) argued that the adoption of open source practices in computational geography would be essential for developing next-generation software tools, but, perhaps more importantly, would prove influential for scientific practice and education.

This special issue is, in a way, an exploration of the trends anticipated by Rey more than a decade ago. In inviting contributors to this issue, I wished to learn from my colleagues about recent developments and whether embracing openness is changing the practice of spatial analysis, both as a research field and in educating a new generation of experts. An advantage of being a smaller journal is that as editor I can pay close attention to the papers I manage. A disadvantage is that I could not possibly invite all those researchers and thinkers who I thought had something valuable to say on the subject matter. The selection of papers is therefore to some extent biased by my own idiosyncratic interests. Despite this, it is my hope that as a collection these papers will provide food for thought and will continue to inspire new ideas for making the spatial sciences increasingly transparent and reliable. Not with magic and faith, but with openness and reason.

So, without further ado, it is my pleasure to introduce the papers that make up this special issue.

3 Opening the spatial sciences

The first two papers in the issue discuss two important aspects of opening the spatial sciences.

Brunsdon and Comber (2020) reflect on recent trends in the spatial sciences, including the practice of coding as a core competency and a turn away from proprietary “black-box” software tools. The high tide of Big Data shows no signs of receding, and data are increasingly also spatial. A corollary is that someone is going to analyze them: geographers and spatial data scientists may not have planned to be in the spotlight, but this is where we are. Brunsdon and Comber (2020) make an earnest call for spatial data scientists to take a critical approach to their craft, with particular emphasis on “the need to explicitly account for and articulate the inherent assumptions and biases present in all data”, and on “openness, transparency, code and data sharing, and reproduction.” These are important recommendations that we should all heed: the reward, I believe, will be a more exciting, vibrant, lively discipline, one that can also enjoy greater legitimacy and impact.

Arribas-Bel et al. (2021), in their paper, turn their attention to an issue that is critical for reproducible research but seldom as glamorous as other scientific endeavors: data preparation and packaging. Data collection, cleaning, and processing to make data ready for analysis are often a series of tedious, time-consuming chores. And yet, data are among the fundamental building blocks of spatial research. Typically, data-related tasks are reported only briefly in a journal article. Even when data come from public sources, the way they are prepared for analysis often remains obscure; recreating the steps, assuming one were willing to try, might not even be possible for lack of documentation. Arribas-Bel et al. (2021) propose the concept of Open Data Products (ODP) as a complement to open software, and as a compromise for sensitive data that cannot be shared directly. A framework for the creation of ODPs would ensure that analysis-ready data are properly documented in an open and transparent fashion. This would provide enhanced support for the kind of critical spatial data science advocated by Brunsdon and Comber in their paper. ODPs are, in my mind, one of the most exciting recent developments to keep an eye on in the spatial sciences, both for their potential to accelerate discovery and to make science more immediate and tangible to students and trainees.

4 Trends in tools and applications

In his paper for this issue, Bivand (2020) provides an update on progress in the ‘R’ ecosystem for working with spatial data. Bivand has been, along with Pebesma and Gómez-Rubio (Bivand et al. 2013), Brunsdon and Comber (2015), Lovelace et al. (2019), and Baddeley et al. (2015), among many others, a driving force for making the computational spatial sciences more open and accessible through their contributions to the ‘R’ Project for Statistical Computing. The paper reflects on the evolution of spatial analysis in ‘R’, from the development of basic spatial functions in the early days of ‘R’ (Bivand and Gebhardt 2000), through a painstaking process of maturation (Bivand 2006), to the emergence of newer and improved approaches to the representation of spatial data, including the simple features format in the ‘sf’ package (Pebesma 2018). Bivand concludes by noting the opportunities and challenges presented by newer sources of data, in particular the treatment of spatio-temporal data such as trajectories, of interest in fields such as ecology and transportation.
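To give a flavor of the representation Bivand discusses, below is a minimal sketch (mine, not code from Bivand’s paper) of the simple features model in ‘sf’: an ordinary data frame whose geometry column carries the spatial information, so that spatial operations compose naturally with everyday data manipulation. The coordinates and attributes are invented for illustration.

```r
library(sf)

# Hypothetical point data: three locations with an attribute
stations <- data.frame(
  name   = c("A", "B", "C"),
  lon    = c(-79.92, -79.87, -79.80),
  lat    = c(43.26, 43.25, 43.22),
  riders = c(120, 85, 240)
)

# st_as_sf() converts the data frame to a simple features object;
# crs = 4326 declares geographic coordinates (WGS 84)
stations_sf <- st_as_sf(stations, coords = c("lon", "lat"), crs = 4326)

# Project to a metric CRS (UTM zone 17N) and buffer each point by 500 m
buffers <- st_buffer(st_transform(stations_sf, 32617), dist = 500)
```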

The latter field, transportation, is an inherently spatial discipline, whether space is represented explicitly or not. It is also a field where dedicated software can be expensive to acquire and learn. In his contribution to the special issue, Lovelace (2021) surveys the landscape of open source software tools for transportation planning. Lovelace, with numerous collaborators, has been a very active contributor to the development of open source tools for the analysis of transportation data (Lovelace and Ellison 2018), particularly active travel (Lovelace et al. 2017), as well as an advocate for greater openness in transportation modeling (Lovelace et al. 2020). Developing open source tools for transportation planning, in addition to the usual advantages of openness, involves for developers a degree of rebellion against conventional tools, many of which were developed for motorized traffic. As the paper shows, there are many open source alternatives that offer options not available even a decade ago. Their application, in conjunction with traditional and emerging sources of data, can greatly improve the way we research, teach, and practice transportation planning in the twenty-first century, as recent examples in the literature demonstrate (e.g., Boeing 2021; Desjardins et al. 2021).
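As a small illustration of the kind of workflow these tools support, the sketch below uses the example data shipped with Lovelace’s stplanr package (an origin-destination table, flow, and zone centroids, cents_sf, per the package documentation) to build the “desire lines” that are a staple of transportation analysis. This is one possible workflow under those assumptions, not code from the paper.

```r
library(stplanr)

# Convert the built-in origin-destination table into straight
# "desire lines" between zone centroids; in a fuller workflow these
# would next be routed along a street network and aggregated
desire_lines <- od2line(flow = flow, zones = cents_sf)

# Map total flows between zones
plot(desire_lines["All"])
```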

5 And on the education front

“Everyone does need to learn to code” is, according to Brunsdon and Comber (2020), one of the conditions for a more open spatial data science. But teaching how to code remains a challenge in disciplines that reacted strongly against quantitative methods in the past, including geography (e.g., Hamnett 2003a, 2003b; Johnston et al. 2003) and planning (e.g., Batty 1994; Lee 1994). One possible way forward is to adopt literate programming as a delivery tool for courses that introduce coding to students. Literate programming is a style of coding that prioritizes natural language to communicate with humans, interspersed with code to implement computational tasks. Computational notebooks, increasingly an important part of the Python and ‘R’ language ecosystems, provide an excellent medium for literate programming (Nüst and Pebesma 2021; Rowe et al. 2020). A challenge is the creation of relevant, high-quality content for use in education. With GeoPyTeR, Reades and Rey (2021) conceive of a hub where creators can contribute and borrow content, and to which anyone can turn for materials to use in their teaching practice. Instead of the standardized and frankly uniform Massive Open Online Courses dreamt up by business-oriented minds, a framework such as GeoPyTeR represents an invaluable opportunity to create vibrant, locally relevant, lively teaching communities based on the principles of openness.
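To make the idea concrete, the fragment below is a minimal sketch of a literate document in R Markdown, one common notebook format in the ‘R’ ecosystem (the file name and variables are hypothetical): narrative text carries the argument, while the executable chunk carries the computation.

````markdown
## Computing population density

The zone boundaries are read first; density is then population over
area. The text explains the reasoning, and the chunk that follows
carries out the computation.

```{r density}
library(sf)
zones <- st_read("zones.gpkg")   # hypothetical input file
zones$density <- zones$pop / st_area(zones)
plot(zones["density"])
```
````

When such a document is rendered, the code runs and the resulting map is embedded in place, keeping prose, code, and results synchronized.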

The special issue closes with a flourish with the contribution of Solís et al. (2020a, 2020b). As other papers in the issue recount, open geospatial data and tools such as those developed in Python and ‘R’ have been essential to increase openness in research and to create exciting opportunities to improve teaching practice as well. In their paper, Solís et al. (2020a, 2020b) draw from their experience working with YouthMappers and OpenStreetMap to illustrate the real-life impacts of open geospatial resources. YouthMappers is a student-led, faculty-mentored organization with hundreds of chapters in dozens of countries. YouthMappers is described as a “hybrid movement” that empowers and mobilizes youth to engage with global discourses on issues that matter to them, such as sustainability (see Solís et al. 2020b). The effectiveness of this movement when working with open geospatial tools has already been put to the test, for instance in the fight against malaria (Solís et al. 2018). In their paper in this issue, Solís et al. (2020a, 2020b) report that there appears to be a gendered confidence gap when working with geospatial tools, something instructors should pay close attention to. With growing proficiency (often paired with internship opportunities, reflecting the demand for geospatial skills), work with open geospatial tools appears to be key to supporting not only workforce capacity, but also meaningful opportunities for YouthMappers to grow as engaged global citizens.

6 CODECHECK

For this special issue, we had the pleasure of partnering with Daniel Nüst and the CODECHECK initiative (Nüst and Eglen 2021). CODECHECK is an open science project that aims to address some of the limitations of traditional journal articles when communicating computational research. By providing independent verification of computational processes, CODECHECK represents an effort to break the barrier identified by Buckheit and Donoho (1995) in reproducible scientific publishing: the article is not the scholarship itself; it is merely an advertisement of the scholarship. CODECHECK involves the author(s) of a scientific publication, the editor(s) at the candidate journal, and the code-checker(s). As presently configured, the author(s) must provide the computational materials to the code-checker(s), perhaps through the mediation of the editor(s). The process calls for a collaborative approach to reviewing the research: the role of the code-checker(s) is not to accept or reject a submission, but rather to ensure that all computational processes run as advertised. Checking the code can help to find issues with the reproducibility of a process, as well as to suggest fixes for any issues identified. A certificate is issued by CODECHECK reporting the status of the computational processes as they appear in the final version of the scientific paper.
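In practice, a check is anchored by a small machine-readable metadata file, codecheck.yml, in the repository under review. The sketch below is illustrative only: the values are invented, and the field names reflect my reading of the CODECHECK project’s published configuration conventions, not materials from this issue’s checks.

```yaml
# codecheck.yml (illustrative): links the paper, the computational
# materials to be verified, and the resulting certificate
manifest:
  - file: figures/figure1.png
    comment: reproduces Fig. 1 of the paper
  - file: results/table2.csv
    comment: reproduces Table 2
paper:
  title: "Title of the candidate paper"
  authors:
    - name: First Author
codechecker:
  - name: Independent Checker
report: https://doi.org/...   # DOI of the issued certificate (placeholder)
```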

Code-checking is not appropriate for every paper, since not every paper presents a computational process. In the case of the present issue, two papers were code-checked: Brunsdon and Comber (2020) and Bivand (2020). An example of the first page of a CODECHECK certificate is shown in Fig. 1. In my experience working with authors and a code-checker, CODECHECK does not currently sit comfortably within the traditional editorial process of peer review, but it is a very promising avenue for further exploration. And if CODECHECK improves on current editorial practice, which does not prioritize reproducibility, then making us uncomfortable is perhaps the least of our issues. One thing I particularly liked about working with CODECHECK, even in these experiments, was the collaborative approach to identifying and fixing issues prior to publication, and the fact that at least one independent reviewer can vouch for the reproducibility of the computational aspects of the research. There are numerous questions about the scalability of this system (should checks of code be crowdsourced? should journals train and hire in-house code-checkers?), and we are continuing our conversations with the CODECHECK initiative and our Editorial team as we search for ways to increase the reproducibility of published research.

Fig. 1 Example of the first page of a CODECHECK certificate

7 Some final remarks

It has been my incredible pleasure to work on this special issue. I would like to thank all authors and reviewers for their amazing contributions. Throughout its development, the issue provided much food for thought and inspiration. Two decades or so after open source principles were wholeheartedly adopted by the spatial analysis community, it is clear that there has been much progress to date, and that plenty of challenges and opportunities remain to achieve better standards for scientific publishing and educational practice. Judging from the emergence of a cohort of scholars who have embraced openness as part of their scientific and teaching practice (e.g., Arribas-Bel et al. 2017; Leonelli et al. 2015; Raimbault et al. 2021; Singleton et al. 2016; Sui 2014; Trojan et al. 2019), Rey’s words in his 2009 paper have proved prescient. The evolution of my own attempts at adopting reproducibility in my research can be tracked from an early effort, which resulted in a rather cluttered research repository (Paez et al. 2019), to more streamlined reproducible research workflows, such as Paez et al. (2021) and Paez and Higgins (2021). The same principles have informed my efforts to become a more effective teacher (see https://github.com/paezha/envsocty3LT3). My objective going forward is to instill the principles of openness, reproducibility, and replicability in my students, so that openness becomes a natural frame of mind for them: in the end, my hope is that they will not have to remember, because they will always have been truthful. I look forward to the next series of developments in open spatial sciences, and I remain optimistic about my own obsolescence as emerging scholars and teachers achieve even higher standards of openness than are possible today.