1 Introduction

Scale is a central concept in the geographical sciences and is an intrinsic property of many spatial systems. For decades, the term scale has been used across a diverse set of literature to capture a wide array of phenomena. For instance, scale is used to demarcate or link physical processes that are expressed across landscapes to those that occur at lower levels (e.g., constituent soil patches) or at higher levels (e.g., broader climatic regions). Alternatively, scale is used to refer to the level at which data are collected (e.g., individuals, census tracts, counties) or the range over which spatial processes vary (e.g., local, regional, global). There is also a persistent focus on the identification and quantification of representative scales in an effort to alleviate issues generated by the misspecification of scale (e.g., the modifiable areal unit problem (MAUP), uncertainty). These diverse and often diverging usages of related terminologies have resulted in confusion as to what is meant by scale. Several attempts have been made to bring clarity to the concept of scale, though ambiguity remains (Quattrochi and Goodchild 1997; Sheppard and McMaster 2004; Dabiri and Blaschke 2019). In particular, the concepts and vocabulary used when simultaneously referring to more than one scale are diverse and inconsistent. Therefore, this paper aims to shed light on this important issue by conducting a scoping review and drawing connections between different conceptualizations and deployments of multiple spatial scales.

By exploring the range of scale concepts employed within the geographical sciences, specifically with regard to the terms and methods typically used to operationalize spatial scale, it is possible to uncover novel insights into the nature of scale as used in practice. As a result, rather than adopting any single existing template for organizing knowledge about spatial scale, of which there are many (e.g., Meentemeyer 1989; Lam and Quattrochi 1992; Marceau 1999; Gibson et al. 2000; Atkinson and Tate 2000; Goodchild 2001; Dungan et al. 2002; Wu and Li 2006; Manson 2008; Ruddell and Wentz 2009; Wu and Li 2009; Dabiri and Blaschke 2019), we propose using a scoping review of publications from a collection of leading geographical science journals to provide an alternative lens to understand the many concepts of scale employed in quantitative research. The construction of a scoping review has gained popularity, particularly in the medical and health fields, as a formal mechanism for mapping the evolution of knowledge and identifying broad trends in contemporary research (Arksey and O’Malley 2005; Khalil et al. 2016; Levac et al. 2010; Munn et al. 2018; Peters et al. 2015; Pham et al. 2014; Tricco et al. 2018). As such, this review attempts to identify what is meant by “multiscale” in scale-centric research by: (1) reviewing key terms and popular expressions of spatial scale; (2) proposing a typology for underscoring the role of scale in various spatial analytical techniques; (3) bringing attention to the lack of consensus surrounding multiscale notions and proposing characteristics to differentiate these notions; and (4) suggesting avenues for future work on measuring process scale and related uncertainties. The review also highlights recent contributions and provides another useful entry point into the vast literature on scale.

In particular, this review focuses on the role of scale multiplicity (i.e., more than one scale) in the pursuit of making inferences about spatial processes. In this context, scale multiplicity has two interpretations. It can refer to what we label type I scale multiplicity or the fact that there are multiple definitions of spatial scale. Alternatively, it can refer to type II scale multiplicity or the idea that the spatial analytical tradition often leverages data at multiple realizations of scales of the same type to learn about spatial processes. Therefore, we explore the proposition that the inferences that can be made about spatial processes depend on the type of scales employed (i.e., type I scale multiplicity), how they are abstracted, and how they are integrated to combine or differentiate information across multiple realizations of scale (type II scale multiplicity). This proposition is similar to that put forward by Dungan et al. (2002) focusing on type I scale multiplicity, though it is extended here to explicitly consider the role of type II scale multiplicity across a variety of contexts. In order to effectively develop this proposition, the first goal of this paper is to distinguish between "data" and "process" notions of spatial scale and discuss how traditionally the spatial scale of data has governed inferences about spatial processes. The second goal of this paper is to clarify the various terms for referring to more than one spatial scale (e.g., multiscale versus cross-scale), the types of tasks associated with the manipulation of multiple scales, and how some techniques permit inferences on the scale of spatial processes. A third goal is to bring to the fore the importance of scale in the geographical sciences and motivate the development and use of methods that leverage scale to learn about spatial processes. 
Achieving these goals will expand our understanding of spatial scale and inform researchers from different traditions of collective opportunities and challenges when conducting spatial analyses and developing new tools.

In pursuit of the above goals, we start with a guided background that provides an overview of previous reviews on spatial scale, examining foundational ideas, and bridging related but sometimes disparate nomenclature. Next, a workflow to gather and analyze an illustrative sample of the literature is outlined. The focus then shifts to typifying facets of scale in empirical research and discussing trends in how scale is both quantified and utilized as a quantitative lens. Finally, some future steps are proposed and some concluding remarks are offered.

2 A synthesis of terminology and concepts

Researchers from both the natural and social sciences often analyze phenomena at one or more spatial scales. Indeed, there is by now a large volume of literature describing the concept(s) of spatial scale and detailing developments across various disciplines. On the one hand, domain-specific scale terminologies have arisen in conjunction with substantive research in areas such as landscape ecology (Turner 1989; Stuber and Gruber 2020), remote-sensing (Marceau and Hay 1999; Wu and Li 2009), and segregation (Fowler 2016; Johnston et al. 2018). Together, this has perhaps led to the idea that scale is an ambiguous concept and is dependent upon application context (Turner 1989; Goodchild 2001; Dabiri and Blaschke 2019). On the other hand, there now exist several general treatments of the topic that attempt to bridge subjects and forge a common understanding (e.g., Harvey 1968; Lam and Quattrochi 1992; Marceau 1999; Atkinson and Tate 2000; Dabiri and Blaschke 2019). From these efforts, several definitions of spatial scale are frequently encountered across the geographical sciences.

Dabiri and Blaschke (2019) provide a high-level synthesis of some previous conceptualizations of spatial scale that yield several common categorizations of the notion of scale: (1) cartographic scale; (2) geographic scale; (3) process (or operational) scale; and (4) observation (or measurement) scale. [Footnote 1] Cartographic scale, which refers to the ratio or proportion of the size of map features to their true size, is perhaps the most traditional and self-consistent [Footnote 2] definition. Here, “small scale” refers to objects that are rendered on the map with less detail and “large scale” refers to objects that are rendered on the map with more detail. The growth of spatial data and technology in the past few decades has encouraged researchers to seek alternative and sometimes contradictory definitions to accommodate additional aspects of spatial phenomena. For instance, Goodchild (2001) argues that there is a disconnect between the pre-digital era definition of large-scale and adaptations to fit the modern era, where large-scale could also refer to the extent of a study area or the physical size of the features involved in a given process. Another example is provided from the field of ecology through a series of short exchanges highlighting the need for both consistency and versatility when referring to scale in ecology (Silbernagel 1997; Jenerette and Wu 2000; Csillag et al. 2000). As a result, Dungan et al. (2002) recommend avoiding the singular term ‘scale’ and instead specifically describing the concepts relevant to a particular analysis. They also suggest differentiating between scale concepts used for collecting data, conducting statistical analysis, and describing phenomena. These calls for more flexible conceptualizations have led to the three contemporary notions of geographic scale, observation scale, and process scale that are often described in the literature, though a large variety of terminologies are used to describe these notions and the nuances amongst them.
For example, Table 1 attempts to compare various terms used for similar conceptualizations of scale [Footnote 3] and it becomes apparent that with the exception of cartographic scale, there is not a strong consensus within or between notions. Though potentially imperfect, we elect to summarize the many notions identified in the literature under the auspices of geographic scale, observation scale, and process scale (bottom row of Table 1) to simplify further discussion. Generally, geographic scale refers to the extent of an area of interest; observation (or measurement) scale refers to the resolution of spatial units across an area of interest; and process scale (or operational scale) is the dimension over which particular processes operate and may refer to their theoretical description or empirical measurement. This latter concept of process scale is of primary interest here and is further differentiated and examined throughout this review.

Table 1 Examples of the diverse terminology used in the literature for different notions of scale

A more general definition of scale is provided by Marceau (1999), who states that “scale refers to the spatial dimensions at which entities, patterns, and processes can be observed and characterized”. Following this definition, an alternative conceptualization of the latter three definitions of spatial scale mentioned above can be provided in terms of how each modulates spatial entities, patterns, and processes. Often in the geographical sciences, the focus is on collecting and analyzing georeferenced data in order to measure patterns and ultimately inform about spatial processes. In this context, geographic scale and observation scale are the dimensions that modulate spatial patterns. Geographic scale can be thought of as the macro-attribute governing spatial patterns, whereas observation scale can be thought of as the micro-attribute governing such patterns (Goodchild 2001). The former controls the amount of area over which a pattern can vary, whereas the latter controls the number and nature of the spatial units over which a pattern can be expressed. In contrast, process scale is the dimension that modulates the relationships that generate the data and patterns we ultimately observe. Since we typically cannot directly observe processes, geographic scale or observation scale measurements are often used to characterize the scale at which processes occur.
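To make the distinction between the two pattern-modulating dimensions concrete, the following minimal sketch holds geographic scale (extent) fixed while coarsening observation scale (resolution). It is an illustration only and not a method drawn from the reviewed literature; the 4 × 4 checkerboard surface, the 1 km cell size, and the helper names `aggregate` and `describe` are all hypothetical.

```python
from statistics import pvariance


def aggregate(grid, block):
    """Average non-overlapping block x block cells, coarsening the observation scale."""
    coarse = []
    for r in range(0, len(grid), block):
        row = []
        for c in range(0, len(grid[0]), block):
            cells = [grid[i][j]
                     for i in range(r, r + block)
                     for j in range(c, c + block)]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse


def describe(grid, cell_size):
    """Summarize extent (geographic scale) and resolution (observation scale)."""
    values = [v for row in grid for v in row]
    return {
        "extent": (len(grid) * cell_size, len(grid[0]) * cell_size),
        "resolution": cell_size,
        "n_units": len(values),
        "variance": pvariance(values),
    }


# Hypothetical surface: a checkerboard of 0s and 10s on 1 km cells.
fine = [[10.0 * ((i + j) % 2) for j in range(4)] for i in range(4)]
coarse = aggregate(fine, 2)  # same area, now described with 2 km cells

fine_summary = describe(fine, 1.0)
coarse_summary = describe(coarse, 2.0)
print(fine_summary)
print(coarse_summary)
```

In this toy case the extent is identical at both resolutions, yet the variance among units falls to zero at the coarser resolution because the fine-grained alternation is averaged away. This is one simple way in which the observation scale of data can limit what is inferred about process scale.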

Atkinson and Tate (2000) provide another framework based on a geostatistical perspective to conceptualize spatial scale. In their view, the dominant scale of spatial variation (i.e. inferred process scale) measured across spatial data is at least partially determined by the scales of measurement (i.e., observation scale) used to obtain the data. Atkinson and Tate also define “multiscale” as the multiple scales of measurement at which data may be observed. This raises an important issue because multiscale could equally refer to multiple geographic scales, multiple observation scales, or multiple process scales, with an emphasis on empirical data or theoretical processes, each of which may refer to different types of connections between and within spatial systems. For example, Gibson et al. (2000) describe the scale-related term of hierarchy and how multiple levels are connected across constitutive hierarchies, such as individuals to families or cities to regions, since phenomena at any one level are affected by levels above and below. In contrast, in contemporary geographically-weighted models, multiscale does not refer to multiple levels of measurement. Instead, relationships between variables and an outcome can change quickly or slowly with distance while the level of measurement remains constant (Fotheringham et al. 2017). This means that some relationships in a multivariate model may be ‘local’ relationships that change within a single city or neighborhood with many inflection points across the study area, whereas others may be ‘regional’ relationships that vary slowly with only one or two inflection points. Others still may be ‘global’ relationships which hold for all locations. Thus, “multiscale” in the former sense refers to the scale of a set of separate connected entities, whereas the latter sense refers to the scale of process variation. 
These examples are further distinguishable from multiscale technologies that allow data to be efficiently stored, accessed, and visualized. Further confusion may also arise from related expressions for more than one scale, such as multiscale, cross-scale, multiple scales, multiscalar, and scale-invariant.

Given the diversity that exists in the literature regarding the conceptualization and operationalization of scale, in what follows we attempt a comprehensive yet bounded review of the use of scale by focusing on a selection of core geographic journals. The overall goals and organizational logic of this review are summarized in Fig. 1, which illustrates the use of the literature to extract examples on the many uses of scale and distill them into a typology. From left to right, we focus on identifying how the different types of scale are employed (type I scale multiplicity), how multiple realizations of scale are integrated or differentiated (type II scale multiplicity), and the core tasks informed by these aspects of scale.

Fig. 1 Flowchart outlining goals of the study

3 Selecting a ‘scale’ corpus from the geographic literature

To examine the use of ‘scale’ in the geographic literature, five journals were selected as a representative sample of the much larger corpus of geographic knowledge: (1) The Professional Geographer (TPG); (2) The Annals of the American Association of Geographers (AAAG); (3) Applied Geography (APGEO); (4) Geographical Analysis (GEAN); and (5) The International Journal of Geographical Information Science (IJGIS). Collectively, these journals represent a wide breadth of contributions from across the discipline and a diversity of application areas. Although literature from other disciplines contains important developments on the conceptualization and quantification of spatial scale, such outlets tend to be narrower in scope than the geographical journals selected here, and placing meaningful bounds across an array of discipline-specific outlets would likely prove to be a very difficult task that is outside the scope of this work.

The inclusion of each selected journal toward the goal of this review is now briefly justified. As the premier journal of the American Association of Geographers and with over 100 years of contributions, the AAAG is a core outlet for essential geography research in four categories: Geographic Methods; Human Geography; Nature and Society; and Physical Geography, Earth and Environmental Sciences. Hence, it features leading research that privileges traditional and contemporary geographic thinking across the entire spectrum of sub-disciplines and specializations. A similar breadth is covered by TPG; however, it favors shorter submissions that prioritize fresh approaches and therefore it serves as a platform for making poignant statements and presenting straightforward applications and issues. With a slogan of “Putting the World's Human and Physical Resource Problems in a Geographical Perspective”, and an eponymous title, the central aim of APGEO is to contribute towards solving practical problems through the application of geographic theories and methods. As such, it is particularly suited to offer a glimpse at how geographic methodologies are deployed in the wild and to observe connections between geography and other fields. Meanwhile, GEAN provides another point of view as the first journal (since 1969) in its specialty area of the analytical traditions, such as spatial data analysis and spatial statistics, in a world now awash in forums focused on data science and applied computing. It is an excellent resource for tracing the lineage of quantitative thought and techniques exported from within the discipline. Finally, the IJGIS serves as an encompassing exchange for all facets of the rapidly growing and maturing field of Geographic Information Science (GIScience). 
Given its emphasis on all aspects of geographic tools and techniques for handling spatial data and quantifying spatial patterns and processes, it is ideal for capturing the pulse of GIScience as it develops and diffuses across domains. The scope of each journal is further described in Table 2 of the Appendix based on excerpts obtained directly from their official web pages. Though there is some overlap, these five journals represent sufficiently different perspectives to justify their separate inclusion and, collectively, they provide a strong base upon which to build an initial understanding of the quantitative use and development of spatial scale that can be subsequently expanded to become more holistic and comprehensive.

Table 2 Additional descriptions of the five selected journals

Manuscripts published through 2021 were obtained from each of the five selected journals by querying each for the keywords “scale” or “multiscale” using the search utility on their respective web pages. Though this returned hundreds of results, the 100 most relevant [Footnote 4] hits returned by each journal (500 total) were initially selected for further screening; beyond this threshold, there were no papers that explicitly or implicitly made any reference to scale in their title or abstract. Next, 194 manuscripts were selected from the initial 500 by excluding those that: (i) only described the size of the study area (e.g., large-scale or small-scale, continental-scale, street-scale); (ii) were primarily about qualitative scale (e.g., scales of power); or (iii) focused exclusively on non-spatial scales (e.g., economies of scale or temporal scales). [Footnote 5]

4 A typology of how ‘scale’ is used in practice

The selected corpus was reviewed and categorized based on the types of spatial scale (cartographic, geographic, observation or process) considered as the primary focus (red box in Fig. 1). Almost half of the manuscripts (42%) focused on observation scale and measuring the effects of varying resolutions of spatial units on the results of spatial analyses. The next largest focus (32%) was on geographic scale, followed by process scale (18%) and finally by cartographic scale (8%). Given the minimal use of cartographic scale, which was sometimes used in conjunction with or as a synonym for other types of scale (e.g., Rendenieks et al. 2017), it is not a central focus moving forward. Furthermore, an in-depth exploration of process scale is reserved for the following section. The rest of this section is therefore centered on highlighting popular themes detected in the corpus and the role of observation and geographic scales.

Manuscripts in the corpus were reviewed by the research team and categorized based upon the team’s expertise and experience into a set of primary themes and topics. Specifically, each manuscript was assigned at least one topic label, though up to three topic labels were permitted to accommodate both interdisciplinary research and the relatively subjective nature of the labeling process. Labels were identified by the team on a rolling basis as the literature was reviewed, with topics occasionally being merged or split to maintain a minimally sufficient subset able to represent the themes within each manuscript and across the entire corpus. This ultimately resulted in 18 topics, which are presented in Appendix Table 3 along with a tally of the number of times each topic was observed in each journal and across all five journals. The topic of primary interest moving forward is Data Structures and Analytics.

Table 3 Tally of topics detected in each journal and across all journals

A similar deductive approach was used to disaggregate the Data Structures and Analytics topic into sub-topics, which are tallied in Appendix Table 4. Contributions were categorized based on different types of methods and actions that depend on similar scale concepts, again allowing manuscripts to belong to more than one category. Additionally, a distinction was made between “cross-scale” methods, where multiple spatial scales are manipulated independently or compared, and “multiscale” methods, where different spatial scales (of any kind) are measured or captured simultaneously. The focus was to group contributions to identify general patterns rather than sort them into mutually exclusive and exhaustive categories. In the remainder of this section, all in-text citations come only from the assembled corpus, though some related references from outside the corpus are provided in footnotes for those interested in further details.

Table 4 Disaggregation of the sub-topics in the data structures and analytics topic

As one of the most frequent sub-topics within the theme of Data Structures and Analytics, Cartography and GIS encapsulates methods for computer cartography and geographic information systems. It has two prevailing themes. First, these contributions were almost exclusively about integrating information obtained at different scales (e.g., Li and Zhou 2012; Yue et al. 2015; Zhang et al. 2015, 2021a, b). Second, the most frequent task associated with these efforts was to generalize geographic features (e.g., topography or road network) or maintain consistent relationships between them for efficient and consistent viewing, storage, and access across scales (e.g., van Oosterom 1995; Jones 1996; Du et al. 2010a, b; Jiang et al. 2013; van Oosterom and Meijers 2014; Jiang 2015; Clarke 2016; Liu et al. 2020). Other tasks included capturing space–time change (Plumejeaud et al. 2011) and feature identification (Deng and Wilson 2008). Though cartography and cartographic systems were at the center of this category, it is interesting that the cartographic definition of scale (i.e., representative fraction) was not. Rather, there were examples where geographic scale (e.g., Deng and Wilson 2008; Hoover et al. 2019) and observation scale (e.g., Du et al. 2010a, b; van Oosterom and Meijers 2014) were used, as well as examples where the cartographic definition was employed (e.g., Stoter et al. 2011; Chen and Zhou 2013; Peng et al. 2021). In one case, the scale definition was not explicit despite the research focusing on multiscale data (Sinha and Silavisesrith 2012). Surprisingly, there was only one instance of a data model for working across both spatial and temporal scales (Van de Weghe et al. 2014).

Two tasks differentiate the Remote Sensing and Image Processing sub-topic from the Cartography and GIS sub-topic. The first task focuses on the classification of image pixels or point clouds (e.g., Dekavalla and Argialas 2017; Zhao et al. 2018; Guo and Feng 2018). For example, different geographic scales (i.e., distance-based neighborhoods) can be used to create 2D images from 3D point clouds that are each fed into a convolutional neural network to extract a set of features that are assembled into a multiscale feature set for subsequent classification (Zhao et al. 2018). Alternatively, a 3D point cloud can be resampled at different observation scales (i.e., number of cubes) that nest into a hierarchy of geographic scales (i.e., cube size) to form a multiscale point cloud pyramid that is able to reduce the effects of noise and varying point densities when classifying points (Guo and Feng 2018). Meanwhile, Dekavalla and Argialas (2017) enhance the automatic classification of land surface features by using an adaptive geographic scale (i.e., pixel-specific radius) that is efficient across input observation scales (i.e., resolution). In contrast, the second task focuses on the identification of objects from imagery and does not appear to explicitly define scale, instead using the notion of scale to refer to the size of objects (number of pixels) and multiscale to refer either to the comparison of analyses for objects of different sizes (i.e., cross-scale) or to the bottom-up formation of larger objects from smaller ones (Stefanidis et al. 2002; Drǎguţ et al. 2010; Chen et al. 2011; Argyridis and Argialas 2019). Overall, the tasks described under the banner of Cartography and GIS and Remote Sensing and Image Processing typically focus on multiscale data fusion and integration.

Profiling is an umbrella term used here to refer to any method that computes a statistic or measure as a function of geographic scale or observation scale. Perhaps the most well-known example is the variogram or semivariogram, which measures spatial variation as a function of different geographic scales (i.e., ranges) (e.g., Phillips 1988; De Cola 1994; Liu and Jezek 1999; Goovaerts et al. 2005; Zhang and Zhang 2011; Lloyd 2012, 2016). Other profiles may alternatively be based on spatial autocorrelation statistics (Zhang and Zhang 2011), entropy (Appleby 1996), diversity indices (Zhang et al. 2013a, b), isolation statistics (Östh et al. 2015), fractal dimension (Lam and Quattrochi 1992; De Cola 1994), percentages (Petrović et al. 2018), or cumulative probability distributions (Wong 2001). These values are often computed as a function of geographic scale metrics, most commonly ‘global’ distance lags between all observations (Phillips 1988; Lam and Quattrochi 1992; Liu and Jezek 1999; Goovaerts et al. 2005; Zhang and Zhang 2011) or ‘local’ aggregates for each observation across distance bands, within a moving window, or based on a population-based number of nearest neighbors for an individual-contextual approach (Wong 2001; Lloyd 2012; Östh et al. 2015; Petrović et al. 2018). However, values are also sometimes computed as a function of observation scale (i.e., resolution or spatial unit size) (De Cola 1994; Appleby 1996; Zhang and Zhang 2011; Zhang et al. 2013a, b). All of these profiling variants involve cross-scale analysis, though Petrović et al. (2018) carry out an explicitly multiscale analysis by measuring the location-specific entropy across profiles for 101 geographic scales (i.e., extents). Furthermore, while most of the profiling examples here focused on comparing values across scales, a few contributions focused more explicitly on selecting an ‘optimal’ scale (e.g., Zhang and Zhang 2011; Zhang et al. 2013a, b). [Footnote 6]
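As a hedged illustration of profiling, the sketch below computes an empirical semivariogram, i.e., half the mean squared difference between pairs of observations grouped by distance lag. It is a bare-bones version written for this discussion and is not taken from any of the cited studies; the function name `empirical_semivariogram`, the transect coordinates, and the trend-like values are hypothetical.

```python
from math import ceil, hypot


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return one semivariance per distance lag (None where a lag has no pairs).

    Lag k collects pairs separated by a distance in
    (k * lag_width, (k + 1) * lag_width].
    """
    sq_diff_sums = [0.0] * n_lags
    pair_counts = [0] * n_lags
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = hypot(coords[i][0] - coords[j][0],
                         coords[i][1] - coords[j][1])
            if dist == 0:
                continue  # co-located observations are not assigned to a lag
            k = ceil(dist / lag_width) - 1
            if k < n_lags:
                sq_diff_sums[k] += (values[i] - values[j]) ** 2
                pair_counts[k] += 1
    return [
        sq_diff_sums[k] / (2 * pair_counts[k]) if pair_counts[k] else None
        for k in range(n_lags)
    ]


# Hypothetical transect: ten unit-spaced observations following a linear trend.
coords = [(float(x), 0.0) for x in range(10)]
values = [float(x) for x in range(10)]

profile = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=3)
print(profile)
```

The returned list is the "profile": one value of the statistic per geographic scale (here, distance lag). Comparing values along the list is a cross-scale analysis in the sense used above, whereas choosing a single lag from it would correspond to selecting an ‘optimal’ scale.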

The Clustering sub-topic pertains to a variety of analytical tasks that include: (i) detecting whether spatial units are clustered; (ii) determining which spatial units are clustered; and (iii) deciding how spatial units should be clustered into larger aggregate spatial units or groups. Task (i) is most frequently achieved using traditional ‘global’ point pattern analysis techniques when clustering tests are based solely on location information (e.g., Smith 2004) and spatial autocorrelation statistics when clustering is based on both location and observed attribute information (e.g., Chou 1991). Task (ii) usually employs focused or ‘local’ versions of point pattern analysis or spatial autocorrelation statistics [Footnote 7], as well as scan statistics, that evaluate each spatial unit at a fixed observation scale while conditioning on or exploring a series of geographic scales (i.e., extents) to distinguish clusters from randomness (e.g., Wong 2001; Shiode and Shiode 2009; Rogerson and Kedron 2012; Rogerson 2015; Westerholt et al. 2015; Carr et al. 2019; Li et al. 2019; Liu et al. 2019). In this case, cross-scale analyses are typically used for comparing results across observation scales or for selecting an ‘optimal’ geographic scale. However, more explicitly multiscale techniques in this context are those that incorporate information found at several geographic scales into a metric (e.g., Shiode and Shiode 2009; Westerholt et al. 2015; Liu et al. 2019; Griffith 2021; Yu and Fotheringham 2021). Finally, task (iii) is often referred to as regionalization and entails creating larger (i.e., coarser) observation scales by combining or grouping units from smaller (i.e., finer) observation scales (e.g., Mu and Wang 2008; Meng et al. 2021).
At the root of regionalization methods is a tradeoff between the level of detail (i.e., fewer regions means less detail) and noise (i.e., fewer regions averages out extreme values) in spatial patterns and processes that is often driven by minimizing variation within regions and maximizing variation between them. [Footnote 8] In this context, one example used multiscale to refer to an algorithm for searching across geographic scales (i.e., cross-scale) rather than the explicit integration of entities or relationships at different scales (Meng et al. 2021).
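Task (ii) above, in which each unit is evaluated across a series of geographic scales, can be loosely sketched as follows. The example only counts neighbors within increasing radii and is deliberately simpler than the local statistics and scan statistics cited in this section: it includes no significance test against randomness. The function name `neighbor_counts`, the point coordinates, and the radii are hypothetical.

```python
from math import hypot


def neighbor_counts(points, radii):
    """For each point, count the other points that fall within each radius."""
    profiles = []
    for i, (xi, yi) in enumerate(points):
        dists = [hypot(xi - xj, yi - yj)
                 for j, (xj, yj) in enumerate(points) if j != i]
        profiles.append([sum(d <= r for d in dists) for r in radii])
    return profiles


# Hypothetical pattern: four tightly grouped points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
radii = [1.5, 20.0]  # a 'local' extent and an extent covering the whole pattern

profiles = neighbor_counts(points, radii)
for point, counts in zip(points, profiles):
    print(point, dict(zip(radii, counts)))
```

At the smaller extent the grouped points each have three neighbors and the isolated point has none, while at the larger extent every point has four neighbors and the distinction disappears. Which units appear clustered therefore depends on the geographic scale at which they are evaluated.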

Decomposition Models include an array of techniques that incorporate univariate statistics, multiple regression, multilevel models, and geographically weighted models. Since many of these techniques overlap, they are discussed collectively in terms of their shared qualities and defining characteristics. By Decomposition, we refer to how a process is partitioned into contributions from different but interrelated components, either at different scales and/or between different variates in a multivariable process. Multilevel models are one type of Decomposition method that extracts the contribution of information from individual ‘levels’ towards a statistical pattern or measure, where levels are different observation scales (i.e., resolution or grain). Since the levels are typically hierarchically nested and linearly additive, multilevel models are intrinsically multiscale as they always incorporate more than one spatial scale into a single analysis of a variable of interest. [Footnote 9] Furthermore, the main goal of these methods is the expression and comparison of contributions at a handful of prespecified scales rather than the selection of a single optimal scale across a large range. It is often implied, though, that the level with the largest variation is the observation scale at which further study is required. Multilevel models are further differentiated by their focus on either a univariate statistic or a multiple regression context. The former focuses on decomposing a single variable or measure as a function of itself, such as spatial variability (Oliver and Webster 1986; Collins and Woodcock 2000), moving window averages (Pigozzi 2004), the statistical likelihood (Kolaczyk and Huang 2001), diversity and dissimilarity indices (Wong 2003; Manley et al. 2019), or entropy (Phillips 2005; Batty 2010; Leibovici and Birkin 2015). In contrast, the latter focuses on decomposing a variable as a function of other variables (e.g., Duncan and Jones 2000).
More recent work found in this review corpus extended these types of multilevel multiple regression models to examine contributions from different groups (i.e., categories) across scales (Manley et al. 2015), to model spatially clustered survey data based on attributes of individuals, neighborhoods, wider regions, and heterogeneities across them (Ma et al. 2018), to capture dependencies at each level through hierarchical spatial autoregressive models (Dong and Harris 2015), and to provide a locally adaptive extension (Dong et al. 2020).
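The univariate form of decomposition described above can be illustrated with a simple partition of variance across two nested levels. This is an analysis-of-variance style identity (total = between + within) rather than a fitted multilevel model, and it is offered only as a sketch; the function name `decompose_variance`, the county and tract labels, and the values are hypothetical.

```python
def decompose_variance(groups):
    """Split total variance into between-group and within-group components.

    `groups` maps each higher-level unit to the values of its lower-level units.
    """
    all_values = [v for vals in groups.values() for v in vals]
    n = len(all_values)
    grand_mean = sum(all_values) / n

    between = 0.0
    within = 0.0
    for vals in groups.values():
        group_mean = sum(vals) / len(vals)
        between += len(vals) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in vals)

    total = sum((v - grand_mean) ** 2 for v in all_values)
    return {
        "between": between / n,
        "within": within / n,
        "total": total / n,
        "share_between": between / total,
    }


# Hypothetical tract-level values nested within two counties.
tracts_by_county = {
    "county_a": [1.0, 3.0],
    "county_b": [7.0, 9.0],
}

parts = decompose_variance(tracts_by_county)
print(parts)
```

Here most of the variation sits between counties rather than among tracts within them. Under the common reading noted above, the level carrying the largest share of variation would be flagged as the observation scale deserving further study.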

It is also possible for Decomposition Models to focus on the contribution of information from different geographic scales (i.e., extents) rather than different observation scales (i.e., resolution or grain). For example, Goovaerts et al. (2005) use decomposed semivariograms (i.e., variation across distance lags) to identify and model local and regional components of cancer risk and find that risks have qualitatively different associations at different scales that could otherwise go undetected. Another example by Johnston et al. (2004) conducts factor analysis and partial regression coefficient reconstitution to compare potential neighborhood effects among individual voting patterns measured at several k-nearest neighbors aggregates (i.e., population-based extents). These cross-scale analyses both depend upon the analyst to select the ‘interesting’ scales for further investigation. In contrast, geographically weighted regression (GWR) models are able to computationally search a continuous range of distances or an exhaustive set of k-nearest neighbors for an optimal geographical scale at which to find local associations based on a model fit criterion such as cross validation or the Akaike Information Criterion (AIC).Footnote 10 This optimal geographical scale is then often interpreted as an indicator of process scale. A recent multiscale extension to GWR allows the identification of a unique indicator of scale for each response-covariate relationship in the model (Fotheringham et al. 2017). This means that within a single model some associations may be indicated as ‘global’ and having no smaller (geographic) scale effect, while others might be treated as regional or local.
The importance of accounting for these types of multiscale effects for accurately capturing spatial processes has been highlighted, and the approach has also been shown to be similar to multilevel models with a global level (i.e., no spatial variation) and a local level with an observation scale equivalent to the coordinates of sample locations or centroids (Murakami et al. 2019; Wolf et al. 2018).
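The bandwidth search described above can be sketched in a deliberately reduced form. The example below is not a full GWR: it fits only a kernel-weighted local mean (an intercept-only local model) along a one-dimensional transect and picks the bandwidth with the lowest leave-one-out cross-validation score. The Gaussian kernel, the candidate bandwidths, the function names, and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel (one of several possible kernels)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def loo_cv_score(coords, values, bandwidth):
    """Sum of squared leave-one-out errors of a kernel-weighted local mean."""
    score = 0.0
    for i, (ci, yi) in enumerate(zip(coords, values)):
        num = 0.0
        den = 0.0
        for j, (cj, yj) in enumerate(zip(coords, values)):
            if i == j:
                continue  # leave the focal observation out
            w = gaussian_weight(abs(ci - cj), bandwidth)
            num += w * yj
            den += w
        score += (yi - num / den) ** 2
    return score


def select_bandwidth(coords, values, candidates):
    """Return the candidate bandwidth with the lowest CV score, plus all scores."""
    scores = {bw: loo_cv_score(coords, values, bw) for bw in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Toy transect (assumed): an abrupt local shift halfway along the line.
coords = list(range(20))
values = [0.0] * 10 + [10.0] * 10
best, scores = select_bandwidth(coords, values, [1.0, 5.0, 50.0])
print(best, scores)
```

Because the toy pattern changes abruptly, the smallest candidate bandwidth fits best, which is the sense in which a selected bandwidth is read as an indicator of process scale. A multiscale variant would repeat such a search separately for each covariate relationship.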

The Scaling and Fractals sub-topic focuses on linking the distribution of an attribute or the calculation of a metric across levels of detail based on the assumption that there is some pattern or process that remains consistent across them, with levels implicitly or explicitly corresponding to different spatial scales. As such, there is some overlap between this sub-topic and that of Profiling. Scaling inquiries also often entail comparing observed distributions to theoretical distributions to investigate the degree of scaling in an attribute (i.e., size, length, connectivity, number of events, configuration). In particular, fractals have influenced the concept of scaling and Goodchild and Mark (1987) discuss three notions that incorporate fractals into spatial analysis.Footnote 11 The first notion concerns the response of a measure to explicitly defined spatial scales. Since fractal dimension is used to characterize the complexity, irregularity, or roughness of a geographic feature, this intuition can be extended to examine how different values of fractal dimension or domains of consistent fractal dimension across spatial scales may be associated with different spatial processes (i.e., Profiling). An example of fractal dimension as a function of geographic scale is provided by Lam and Quattrochi (1992) while an example of fractal dimension as a function of observation scale is provided by Appleby (1996). The second notion is that of self-similarity or the repetition of statistical patterns at different scales (e.g., Ovando-Montejo et al. 2021), which has been used to simulate terrain and geomorphological processes, as well as develop alternative null models to compare against them. The third notion pertains to the recursive subdivision of space that leverages self-similarity to produce space-filling patterns, inspiring the development of efficient spatial data structures.
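As an illustration of examining a measure's response to explicitly defined scales, the sketch below estimates a box-counting dimension, one common estimator of fractal dimension. It counts occupied boxes at several box sizes and takes the slope of log(count) against log(1/size). The function names and the synthetic point sets are assumptions; the studies cited above may use other estimators.

```python
import math


def box_count(points, box_size):
    """Number of boxes of side `box_size` containing at least one point."""
    return len({(x // box_size, y // box_size) for x, y in points})


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log(count) versus log(1 / box size)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic features (assumed): a straight line and a filled area on a 64 x 64 lattice.
line = [(i, i) for i in range(64)]
area = [(i, j) for i in range(64) for j in range(64)]
sizes = [1, 2, 4, 8, 16]
print(box_counting_dimension(line, sizes))
print(box_counting_dimension(area, sizes))
```

Fitting the slope separately over sub-ranges of `sizes` would be one simple way to look for domains of consistent fractal dimension across scales.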

More recently, Jiang and colleagues (Jiang et al. 2013; Jiang 2013; 2015) demonstrate that the Pareto-like distribution of many geospatial phenomena provides the basis for cognitive mapping and cartographic generalization across intrinsic hierarchical observation scales based on recursive 80–20 splits of an attribute. Jiang and Ren (2019) further suggest a multiscale topological representation of space that captures the relationships within and between intrinsic hierarchical geographic scales. Underlying this latter technique is the Ht-index, which expresses hierarchical levels of scales, in comparison to the fractal dimension, which expresses the degree of heterogeneity for a level (Jiang and Yin 2014). Another recent direction develops the concept of multifractals, which acknowledges that fractal properties of an attribute may vary across space, generating multiscale patterns (Tan et al. 2021).
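The recursive splitting of a heavy-tailed attribute can be sketched as follows, in the spirit of the head/tail breaks procedure. Values are split at their mean, the minority "head" above the mean is retained, and the split is repeated; the number of levels is then one more than the number of splits, which is how the Ht-index is commonly described. The 40% stopping threshold is a frequently used convention treated here as an assumption, and the function name and toy data are hypothetical.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Recursively split values at the mean while the head stays a minority.

    Returns the list of break values (means) and the resulting number of
    hierarchical levels (breaks + 1). `head_limit` is an assumed threshold.
    """
    breaks = []
    current = list(values)
    while len(current) > 1:
        mean_value = sum(current) / len(current)
        head = [v for v in current if v > mean_value]
        # Stop when nothing lies above the mean or the head is no longer a minority.
        if not head or len(head) / len(current) > head_limit:
            break
        breaks.append(mean_value)
        current = head
    return breaks, len(breaks) + 1


# Toy heavy-tailed attribute (assumed): sizes doubling from 1 to 512.
sizes = [2 ** k for k in range(10)]
breaks, levels = head_tail_breaks(sizes)
print(breaks, levels)
```

For an attribute with no heavy tail, such as identical values, no split occurs and a single level is returned.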

Changing the support size (i.e., observation scale) of data or models is referred to as Rescaling, or more specifically, upscaling (i.e., aggregating) and downscaling (i.e., disaggregating) depending on whether the units are becoming coarser or finer, respectively.Footnote 12 The focus is often on rescaling data that are then analyzed or modeled at different observation scales where observations are translated into a new scale through averaging, smoothing, extrapolating, or interpolating (e.g., Bednarz and Ralston 1982; De Cola 1994; Atkinson and Tate 2000; Yoo and Trgovac 2011; Buck 2017; Zhang et al. 2019). An alternative approach proposes to calibrate a geostatistical model (i.e., variogram) for one observation scale and then directly upscale or downscale the modeled variation across geographic scales (i.e., distance lags) using regularization rather than rescaling the data and modeling it at the new scale (Atkinson and Tate 2000). Rescaling may be seen as a contrast to multilevel models that incorporate data across multiple scales rather than harmonizing data to a single scale (e.g., Wilson et al. 2011). In addition, the smoothing introduced by some upscaling methods typically causes information loss, and evidence suggests that analytical results may be more sensitive to rescaling than to using data directly measured at a coarser observation scale (Atkinson and Tate 2000; Zhang et al. 2019).
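A minimal sketch of the two directions of rescaling is given below, assuming a regular grid whose dimensions are divisible by the rescaling factor. Upscaling is done by block averaging and downscaling by simple replication, which are only two of the many possible operators; the function names and grid values are assumptions. The final comparison illustrates the information loss that averaging introduces.

```python
from statistics import mean, pvariance


def upscale(grid, factor):
    """Aggregate to coarser cells by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            mean(
                grid[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def downscale(grid, factor):
    """Disaggregate to finer cells by replicating each value (adds no new information)."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


def flatten(grid):
    return [value for row in grid for value in row]


# Toy fine-resolution grid (assumed values).
fine = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
coarse = upscale(fine, 2)
restored = downscale(coarse, 2)
print(coarse)
print(pvariance(flatten(fine)), pvariance(flatten(restored)))
```

The restored grid has the same mean as the original but a smaller variance, since within-block variation is removed by the averaging step and cannot be recovered by replication.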

The modifiable areal unit problem (MAUP) refers to the sensitivity of data and analytics to the spatial units or support upon which they are measured.Footnote 13 These effects are traditionally grouped into two categories with one focusing on the sensitivity to changes in zonal boundaries (e.g., Burden and Steel 2016) and another focusing on the sensitivity to changes in scale, which could include either varying observation scales within a single geographic scale (e.g., Chou 1991; Mu and Wang 2008; Burden and Steel 2016) or varying geographic scales for a fixed observation scale (e.g., Wong 2001). However, in practice, it appears that the scale aspect of the MAUP is more often investigated by the former task of varying observation scales within a single geographic scale (e.g., Kwan and Weber 2008; Houston 2014). The latter task of varying geographic scales for a fixed observation scale may perhaps be less frequently investigated under the guise of the MAUP because qualitative shifts in patterns and processes at different geographic scales start to become more related to hierarchy theory,Footnote 14 which is less apparent in this corpus. Only a few manuscripts sought to specifically examine the MAUP effects of a particular method,Footnote 15 which is closely related to the tasks highlighted in the Profiling and Clustering sub-topics, with many more contributions indirectly incorporating an analysis of the MAUP by making cross-scale comparisons. Fowler et al. (2020) demonstrate the potential uncertainty and contextual fallacy of using relatively aggregate geographies (e.g., census delineations) rather than individual or ego-centricFootnote 16 neighborhood definitions. While there are no general solutions to mitigate the MAUP, strategies to account for the related sensitivities typically suggest more explicitly multiscale approaches that integrate a distribution of results across scales (e.g., Wong 2001; Burden and Steel 2016).
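The two categories of MAUP effects can be illustrated with a small sketch on a one-dimensional sequence of units. The same pair of variables is correlated at the individual level, after aggregation into zones of two units (a change in scale), and after shifting the zone boundaries by one unit (a change in zoning). The helper names, the zoning rule, and the data are assumptions chosen only to make the sensitivity visible.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def aggregate(values, zone_size, offset=0):
    """Average consecutive units into zones; incomplete zones at the ends are dropped."""
    trimmed = values[offset:]
    n_zones = len(trimmed) // zone_size
    return [
        sum(trimmed[k * zone_size:(k + 1) * zone_size]) / zone_size
        for k in range(n_zones)
    ]


# Toy attribute pairs for eight adjacent units (assumed values).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 3, 2, 6, 4, 5, 8, 7]

r_units = pearson(x, y)
r_zones = pearson(aggregate(x, 2), aggregate(y, 2))
r_shifted = pearson(aggregate(x, 2, offset=1), aggregate(y, 2, offset=1))
print(r_units, r_zones, r_shifted)
```

Collecting such coefficients over many zonings and zone sizes, rather than reporting one of them, is in the spirit of the distribution-based strategies noted above.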

There were two flavors of contributions centered on simulation models using agents or cellular automata. The first focused explicitly on the sensitivity of results to spatial scales. These examples present cross-scale comparisons, focusing on sensitivities associated with different observation scales (Jantz and Goetz 2005; Bonnell et al. 2016), different geographic scales (Kang and Aldstadt 2019), or both (Wu et al. 2019a). The second approach focused on incorporating different types of agents for entities that exist at different scales and interact across scales, providing a multiscale analysis of complex systems (An et al. 2005; Tang and Bennett 2010; Xu et al. 2020).

Rather than developing or applying methods, a few manuscripts focused instead on developing Conceptual models of interrelated notions of spatial scale. For example, Pereira (2002) put forth a typology of scale relationsFootnote 17 based on the comparison of the grain (i.e., observation scales) and extent (i.e., geographic scales) of two different hypothetical scales. The typology provides seven possible relationships, though there is limited discussion on how they are used practically, which relations are most common, or the appropriateness of the relations for describing (empirical) data versus (theoretical) processes. Another example proposes The Scale Matcher, which provides an ontology for describing the relationships between issues of precision, accuracy, and geographic scale between the available data, the expected input to a model, and the phenomena being modeled (Lilburne et al. 2004). Similarly, Zhang et al. (2014) develop a scale compatibility framework that considers different scale types, dimensions, and measurements. The goal of these conceptual models is to make the limits of spatial modeling more explicit to analysts and provide a mechanism to validate the scales used in an analysis, though their limited adoption hints at a tendency to rely instead on intuition, often allowing limitations to remain implicit.

Finally, it is worth noting some Other areas that either did not form coherent sub-topics or that seem to be under-represented in the corpus. First, there is a surprising lack of manuscripts pertaining to spatial networks, including both planar networks, where the nodes and the edges are geometric entities (e.g., street system or power grid), and non-planar networks, where the nodes are georeferenced but the edges are not (e.g., migration or retail expenditure). Planar network methods were observed in the corpus in the context of multiscale hierarchical data structures for street networks and the detection of multiscale clusters along a network (Shiode and Shiode 2009; Li and Zhou 2012). Since the observations are physical entities abstracted as nodes and edges, this removes the need for an exogenously defined observation scale, which means these examples leverage multiple geographic scales. In contrast, non-planar networks were not observed in the corpus despite the rich tradition in the geographical sciences of modeling spatial interaction and the use of scale concepts to define their spatial structure.Footnote 18 Second, machine learning and neural networks were only central in two manuscripts (Zhao et al. 2018; Guo and Feng 2018), which is counter to the rise of geospatial artificial intelligence (i.e., geoAI) and the recent explosion in deep learning.Footnote 19 This suggests more work is needed to develop spatially explicit AI methods that leverage spatial scale. Third, the presence of only a single example exploring the long-standing task of location-allocation (Cromley et al. 2012) may imply that spatial scale is a relatively peripheral concern for spatial optimization compared to the other tasks described here, perhaps because the focus is instead on an exogenously defined objective function and optimality criterion rather than on a particular process.Footnote 20

5 Scale multiplicity and the inference of spatial processes

In this section, the trends from the scoping review are further distilled in order to formalize a typology of how scale is used for different tasks (blue box in Fig. 1). This predominantly entails categorizing tasks and specific methods in terms of the notions of scale that are employed, whether they are cross-scale or multiscale, and if the focus is on data, measurement, or inference. In addition, methods are distinguished as endogenous or exogenous depending on whether the scales were explicitly predefined (exogenous) or extracted from a system based on a criterion (endogenous).

Most quantitative geographic research relies upon the specification of at least one geographic scale (i.e., extent) and one observation scale (i.e., units or objects), either implicitly or explicitly, to facilitate data collection and analysis. However, at a single scale, the amount of information that can be obtained about process scale is limited. It is the shift from one scale to multiple scales that enables the scale of processes to be brought into focus. Furthermore, though it is straightforward to theorize about the scale of a process, such a scale often cannot be directly observed. This means that to learn about process scale, it is necessary to rely on inference, which requires data at multiple measurable scales (i.e., geographic scale or observation scale). Figure 2 illustrates the mode of inquiry whereby data are observed and collected in order to measure associations that ultimately allow inferences to be made. This diagram was previously used by Fotheringham (2020) to discuss how data are used to inform on the properties of spatial processes (i.e., quality and magnitude), which are dependent on scale. Therefore, the diagram is modified here to incorporate the role of scale at each stage of the diagram and to describe how data at multiple scales allow inferences to be made about the scale of processes (i.e., local versus global or higher-level versus lower-level) in addition to the quality and magnitude of processes.

Fig. 2 The relationship between the multiple scales of spatial data and spatial processes and the tasks and methods that are most closely related with them

The box in the top left of Fig. 2 represents the sub-topics and tasks identified through the scoping review that are predominately about spatial data handling (i.e., storage, integration, viewing, accessing, and feature engineering), which were central to the Cartography and GIS and Remote Sensing and Image processing categories. Here, scale plays the role of moderating how observations are semantically related to one another. Data observed at similar cartographic, geographic or observation scales are anticipated to be related (possibly to the point of being redundant) whereas data observed at different scales are more likely to be unrelated. Moreover, the focus in this context is typically on multiscale methods with the objective of facilitating the integration of spatial information for downstream consumption (i.e., classification, regression, simulation).

In contrast, the boxes on the right focus on using data measured at multiple geographic or observation scales to inform about process scale(s). The top box can be further differentiated by tasks that compare associations at multiple scales using exogenous cross-scale techniques. This covers most profiling techniques, clustering methods, explorations of the MAUP, or verifications of scaling properties, and includes cases where either multiple geographic scales or multiple observation scales are used. As a result, the common label of k-comparison methods is appropriate for any task looking to informally quantify process scale through the juxtaposition of results obtained at k pre-specified realizations of scale. The largest share of examples (~1/3) from the corpus of literature used a k-comparison approach to analyze changes in metrics, such as correlation coefficients, indices, spatial autocorrelation statistics, simulations or regression parameters (e.g., Parker et al. 2001; Nelson et al. 2007; Southworth et al. 2006; Elliott and Kipfmueller 2011; Kim et al. 2012; Perveen and James 2011; Patterson and Doyle 2011; Wright et al. 2013; Zhang et al. 2013a, b; Jacobs-Crisioni et al. 2014; Bao and Tong 2017; Fernandez and Wu 2016; Liu et al. 2017; Carr et al. 2019; Li et al. 2019 and many moreFootnote 21). It is important to recognize that even though k-comparison methods may include inferences on processes (i.e., their quality and magnitude) they do not typically provide explicit inferences on the scale(s) of processes. That is, they do not inform us about how appropriate a particular scale is nor do they express uncertainty about the scales under consideration.
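A k-comparison can be sketched as the same statistic computed at k pre-specified scales and then laid side by side. The example below uses Moran's I with binary distance-band weights along a one-dimensional transect, where the band width plays the role of the exogenously chosen geographic scale. The function names, the choice of statistic, the lag values, and the toy series are assumptions for illustration.

```python
def morans_i(values, max_lag):
    """Moran's I with binary weights: units within `max_lag` steps are neighbors."""
    n = len(values)
    mean_value = sum(values) / n
    dev = [v - mean_value for v in values]
    cross = 0.0
    weight_total = 0
    for i in range(n):
        for j in range(n):
            if i != j and abs(i - j) <= max_lag:
                cross += dev[i] * dev[j]
                weight_total += 1
    return (n / weight_total) * (cross / sum(d * d for d in dev))


def k_comparison(values, lags):
    """Evaluate the statistic at each pre-specified scale and return the results together."""
    return {lag: morans_i(values, lag) for lag in lags}


# Toy transects (assumed): a smooth gradient and a strictly alternating pattern.
gradient = [0, 1, 2, 3, 4, 5, 6, 7]
alternating = [0, 1, 0, 1, 0, 1, 0, 1]
print(k_comparison(gradient, [1, 2, 3]))
print(k_comparison(alternating, [1, 2, 3]))
```

The output is a set of k values for the analyst to compare; nothing in the procedure selects a scale or attaches uncertainty to one, which is the limitation noted above.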

The final box at the bottom of Fig. 2 contains the tasks and methods that provide explicit information about process scale(s) through more formal inference. The first example in this group is GWR from the decomposition sub-topic, which searches across geographic scales to make cross-scale comparisons and endogenously selects a bandwidth that is an indicator (i.e., extent) of process scale. As already mentioned, GWR has been extended to MGWR which provides an indicator (i.e., extent) of process scale for each relationship in a model. MGWR achieves this by making a series of simultaneous cross-scale searches across geographic scales. Therefore, MGWR is a multiscale method in that it combines and extracts information regarding multiple process scales. There were many examples of GWR in the corpus (e.g., Gao and Li 2011; Miller and Hanham 2011; Propastin 2011; Su et al. 2012; Pearsall and Christman 2012; Rennermalm et al. 2012; Brown 2017; Li et al. 2017; Jendryke and McClure 2019; Hazell and Rinner 2019) and because of its novelty, there were fewer examples using MGWR (Fotheringham et al. 2017; Wolf et al. 2018; Murakami et al. 2019; Bilgel 2020; Shabrina et al. 2021; Forati and Gose 2021; Fotheringham et al. 2021), though the number was increasing in recent years, along with methodological enhancements and computational improvements (Wu et al. 2019c, 2021a, 2021b; Yu et al. 2020; Li and Fotheringham 2020; Zhang et al. 2021a, b; Hagenauer and Helbich 2021). A second example of a model form that falls into this box is that of multilevel models, which are also found in the decomposition sub-topic. Since traditional multilevel models always incorporate multiple exogenously defined observation scales, they allow inferences to be made about the levels where one or more processes can be explained. There were several examples of traditional multilevel models in the corpus (e.g., Barnett 1973; Duncan and Jones 2000; Kolaczyk and Huang 2001; Dong and Harris 2015; Manley et al. 2015; Tian et al. 2015; Johnston et al. 2016; Malanson et al. 2017; Ma et al. 2018; Greene and Kedron 2018; Sun and Yin 2018; Manley et al. 2019), but few instances of multilevel models extended to become analogous to GWR and MGWR (i.e., spatially varying coefficient models) by searching across geographic scales (i.e., ranges) to allow process scale(s) to be expressed endogenously. The final examples in this box include techniques for calculating summaries across exogenously defined scales. Recall that Petrović et al. (2018) computed the entropy for values measured for a series of geographic scales and that Burden and Steel (2016) computed distributions of regression coefficients resulting from the MAUP. Though these both present straightforward means of moving from cross-scale techniques to multiscale techniques to learn about the characteristics of processes (i.e., complexity or uncertainty) they were not featured prominently in the corpus nor do they explicitly measure the complexity or uncertainty of process scale, suggesting important directions for future work.

6 Summary

This paper highlights how the concept of spatial scale is used in the geographical sciences by providing an overview of previous conceptualizations of scale and then undertaking a scoping review on the topic. From this, we describe and classify tasks and methods that use scale in the literature, leading to an expanded understanding of the role of different notions of scales (type I scale multiplicity) and the simultaneous use of multiple scales (type II scale multiplicity). Previous reviews on the topic of spatial scale do not differentiate between different types of scale multiplicity and typically focus on type I scale multiplicity rather than type II scale multiplicity. In contrast, this review included a substantial focus on type II scale multiplicity, illuminating the need to distinguish the characteristics of both types of scale multiplicity when discussing spatial analytical methods. The main outcome is a categorization of the primary modes of inquiry through the lens of multiple scales for handling spatial data, measuring associations, and inferring unmeasurable processes. Another outcome is the finding that methods capable of making explicitly multiscale inferences about the scale of processes were less abundant in the literature than methods that informally examined process scale. The results of this study therefore suggest that more work is needed to formalize multiscale inferential techniques and reasoning associated with process scale when developing and applying quantitative geographic workflows. For example, initial efforts have been made to explore the uncertainty of process scale measurement (Stuber et al. 2017; Wolf et al. 2018; Li et al. 2020), to understand the MAUP in terms of the properties of spatial processes rather than the properties of spatial data (Fotheringham and Sachdeva 2022), and to provide location-specific inferences on process scale (Oshan et al. 2020; Fotheringham 2020).
The development of frameworks for incorporating multiscale representations into GeoAI and deep learning algorithms (e.g., Guo and Feng 2018; Zhao et al. 2018; Janowicz et al. 2019) is another burgeoning area of research that is important for the development of the field. These advances will allow us to better understand the spatial processes that generate the patterns we experience and to more accurately and efficiently characterize spatial relationships.

There are also a number of steps that can help accelerate the development of a more unified theory of spatial scale and the scientific inquiries that depend on such a theoretical framework. One step would be to adopt a standardized reporting guideline for geographic research, such as the STROBE initiative for reporting observational studies in epidemiology (Elm et al. 2007; Vandenbroucke et al. 2014), which aims to promote the adequate dissemination of research through collaboratively compiled checklists. A similar mechanism could be developed by geographical scientists in conjunction with domain specialists to suggest a minimal set of criteria for adequately differentiating the many types and uses of spatial scale. This could help further alleviate existing ambiguities about scale and stem future ones, as well as make similar types of research more discoverable. It could also have positive ramifications for reproducibility and replicability in the geographical sciences, given the acknowledgement that findings may not be generalizable at different scales and spatial contexts (Kedron et al. 2021; Goodchild and Li 2021). Another step forward could entail a series of more systematic reviews that complement this scoping review in order to further explore individual categories identified here or how particular spatial multiscale analytical methods are used across disciplines. Finally, this review could be expanded to increase the number of manuscripts in the corpus by including additional journals, such as those targeting specific disciplines. These steps would likely stimulate more work that directs attention toward addressing the feasibility and effectiveness of techniques for quantitatively measuring aspects of spatial scale and its role in understanding our world.