Categorization of green and grey infrastructure complexity in the rural–urban interface of Bengaluru, India: an unsupervised volumetric approach with relevance for urban quality

Trees are key elements of urban green infrastructure and provide multiple ecosystem services that are essential for the quality of life of people in urban environments. Grey infrastructure is made up of buildings or built-up area, generally characterized by imperviousness of the surface. The complexity of urban green and grey infrastructure and their interactions co-define the quality of urban life and the ecological value of urban areas. Conventional dichotomies that separate “urban” from “rural” contexts hardly allow a comprehensive assessment of the situation in rapidly urbanizing environments of the Global South. We present an unsupervised remote sensing-based approach that integrates 3D information to objectively categorize the complexity of green and grey infrastructure. Using the rural–urban interface of Bengaluru, India, as a case example, we distinguished five categories that describe the composition and configuration of green and grey infrastructure, with three variables serving as indicators for the categorization into five clusters. We argue that such integrated 3D assessment of green and grey infrastructure is particularly useful for understanding and classifying “rurban” environments, where a distinction between urban and rural is often no longer possible. Our final map allows such rurban configurations to be characterized quantitatively.


Introduction
The Earth's land surface is undergoing rapid and manifold changes. Among the most drastic land use changes is the sealing of land surface by infrastructure or building constructions, generating so-called impervious surfaces (Seto et al. 2011). Such areas are core elements of built-up areas, settlements, villages, and urban areas. Estimated at about 0.45% of the habitable land area in 2010 (Liu et al. 2014), the global total of settlement and urban area is relatively small in comparison to agricultural land cover (about 50%) and forests (about 35%), but urban areas are constantly and rapidly expanding. A clear and unambiguous definition of urban areas is therefore essential for producing reliable land use maps, not least because of the multiple and substantial sustainability impacts of urban areas (Elmqvist et al. 2021). There is a multitude of possible definitions for the degree of urbanization: the Statistical Office of the European Union (Eurostat) uses, for example, a criterion of geographical contiguity in combination with a minimum population threshold within 1 km² grid cells (European Commission 2021). For remote sensing image interpretation, however, one needs in the first place to resort to observable biophysical features, since social and economic variables cannot be identified directly from the imagery. The most typical biophysical feature of settlements and urban areas is the presence of built-up elements, including houses, industrial areas, roads, and parking lots. In a hierarchical classification framework, it appears straightforward to start out with an identification of impervious surfaces and to proceed, on the second level, with a classification into classes characterizing the degree of urbanity, either in categories (such as "urban", "transition", "rural") or on a continuous scale of, for example, percent of "urbanity".
Research on quantification of urban impervious surface and its spatial pattern is commonly based on remote sensing analysis (Elvidge et al. 2007; Ma 2016). Mapping the class "urban" is the central element in urban studies, and research has considered various spatial resolutions as well as different sensors and classification methods (Weng 2012). High-resolution satellite imagery such as IKONOS, QuickBird or WorldView is the premier data source for detailed thematic mapping of urbanity (Goetz et al. 2003; Hu and Weng 2011; Cablk and Minor 2003). Impervious surface is used as a proxy variable because the class "urban" cannot be directly classified from remote sensing.
Within the assessment of urbanization, urban green spaces play an important role. Urban green spaces, also referred to as green infrastructure (Hansen et al. 2019), are composed of parks, gardens, roadside trees or alleys, and other "green" landscape elements and have a crucial importance for regulating air temperature (Herath et al. 2018) and air quality (Nowak et al. 2006) and also for providing habitats to animals and plants (Lepczyk et al. 2017). Various studies have shown that the occurrence of green spaces in the neighbourhood has a positive effect on health and well-being (Groenewegen 2006; Maas et al. 2009; Troy and Grove 2008), in particular where trees form the main element of green infrastructure.
Conventional methods for assessing and analysing urbanization and its complexity are usually based on gradient analysis or on the use of concentric circles but may fail to assess a new type of environment, which is found around many of the rapidly urbanising megacities of the Global South. Such environments are characterised by both urban and rural features, thus showing "rurban" characteristics. For example, despite intensive development of traffic and settlement infrastructure, they are often characterised by a high presence of green spaces that may either have a long historical continuity or be newly created. The megacity of Bengaluru in the South of India is one example of such a new environment, with, for example, old-growth tree structures overrun by urbanization, so that historic green spaces or remnants of agricultural use are frequently found next to new and very dense settlements (Nagendra and Gopal 2011). An enormous complexity of grey and green infrastructure emerges here, both in terms of (two-dimensional) area and the (three-dimensional) space-filling built-up volume.
High-resolution satellite imagery allows assessing and categorizing the complex pattern of grey and green infrastructure at the rural-urban interface through quantitative analyses. Previous studies have commonly used data derived in a two-dimensional domain in combination with hard thresholding, frequently ignoring the important fact that urbanization also takes place in three-dimensional space. It is therefore time to integrate the third dimension into the classification of urbanity. The aim of this study is to fill this gap by developing a novel, transparent, and unambiguous approach to extract quantitative and comparable 2D/3D information on the degree of urbanity as a basis for a categorization of grey and green infrastructure. The approach we propose here uses a small set of biophysical variables, extracted from high-resolution remote sensing imagery, as indicators.

Study area
Bengaluru is the capital of the Indian State of Karnataka, located at 12°58'N, 77°35'E. It is situated on Southern India's Deccan plateau at an altitude of about 920 m above MSL (Sudhira and Nagendra 2013). The topography is relatively flat in Bengaluru North, which is one of four urban districts, while Bengaluru South is slightly undulating, with a central ridge running in North-East and South-West direction. The city is nowadays a centre of IT, biotechnology, aerospace technology, and other advanced knowledge-based industries and research centers (Hiremath et al. 2013). Bengaluru used to be known as the "garden city" of India owing to its widespread parks, green spaces, and many alleys with old and huge trees. Even though many of these natural areas and elements have been lost due to infrastructure development, trees and green spaces remain abundant as compared to other megacities (Nagendra 2016).
Our analysis uses data from a 50 km x 5 km rectangular research transect (Fig. 1) in the Northern part of Bengaluru that has been defined in the framework of an Indian-German collaborative research project (Hoffmann et al. 2017). This transect covers a wide range from densely built-up urban environments to areas that have a purely rural character. The unusually complex pattern of green and grey infrastructure is what makes Bengaluru an interesting and challenging study site for developing the approach presented here.

Terminology and definition
For any classification of land cover and land use, clear definitions are crucial. A meaningful interpretation of the results can only be warranted when a comprehensive, transparent, and unambiguous definition of all elements is available and consistently used in the classification process. In our context, we use a hierarchical "three-level framework" of definitions of classes and terminology, where the first level is assigned to individual pixels and further levels include an evaluation of the neighborhood inside a defined reference area:
• 1st level-"Impervious": Impervious is a characteristic of the land surface and describes mainly artificial structures which are usually water-impermeable. Imperviousness can be determined at each (dimensionless) point; from bird's eye view either a point falls on an impervious surface or not. In order to make an accurate mapping from remote sensing, very high-resolution imagery is required if one wishes to follow this definitional element. In our case, this is the WorldView-3 imagery with a nominal ground resolution of about 0.3 m. The identification of "Impervious" from high-resolution remote sensing imagery is quite straightforward in many cases.
• 2nd level-"Urban": Determining the status of "urban" or "rural" is more complex, since the classification is based on a mix of biophysical and socio-economic characteristics. In this study, we consider percent impervious cover (PIC), a two-dimensional biophysical characteristic, exclusively. When assigning the mean PIC to a location (or to a single pixel), the definition of a "reference area" of different sizes around that point is required. The relevance of the definition of a reference area when dealing with percent cover has been addressed early by Kleinn (2001), and in the remote sensing context by Magdon et al. (2014).
• 3rd level-"Green and grey infrastructure complexity": While in level 2 the grey infrastructure is represented by the mean PIC in the 2D space, in the third level, the complexity is extended to the 3D space to jointly categorize the configuration and complexity of green and grey infrastructure, derived from biophysical characteristics of remote sensing imagery. The biophysical variables include building volume, tree or green vegetation volume, and the mean impervious cover percent in the surrounding area from the 2nd level.
Fig. 1 Location of the study area, a transect of 50 km x 5 km in the Northern part of Bengaluru, India. The transect is enlarged here as a WorldView-3 false colour composite
For the 1st and 2nd level, the class is assigned to individual pixels. For "impervious surface" (level 1) this decision is made exclusively from information of the pixel itself: the pixel is on impervious surface or not. Given the very high spatial resolution, mixed pixels can be ignored as they occur in a relatively small proportion only. For "degree of urbanity" (level 2) a support area needs to be evaluated in the immediate surroundings of the target pixel to derive the relative proportion of impervious surface. It is then a matter of definition how large this support area shall be and which shape it should have. The 3rd level is assigned to a reference area of one hectare, a common size in urban studies (Schoepfer et al. 2005).

U-Net classification (1st level)
The land cover classification system had three classes and is detailed in Table 1.
A deep learning multi-class classification approach was used to classify the WorldView-3 image covering the 5 × 50 km transect. We used a convolutional neural network called U-Net (Ronneberger et al. 2015) with a network structure very similar to the one implemented by Iglovikov et al. (2017), along with the published joint loss function. The network receives images with a size of 112 × 112 pixels as input and produces a probability map of 72 × 72 pixels as output. The implementation was done using the Keras (Chollet 2017) framework with TensorFlow as backend (Abadi et al., 2005). Training and validation were performed on random subsets of the 330 tiles covering the 1 ha sample plots. The split into training and test data was 70% to 30%. Training was done for 100 epochs, where in each epoch 135 batches of 16 images were presented to the network. The total image area fed to the network in each epoch was equal to half of the image area of the training set. Four (blue, green, red, near-infrared) of the eight WorldView-3 bands were used, as they proved to be sufficient for this classification task. The study was conducted within the same study area by Freudenberg et al. (2019). In order to prevent over-fitting, the input images were augmented by random flipping and 90° rotations.
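The augmentation step described above (random flipping and 90° rotations) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `augment`, the flip probability of 0.5, and the uniform choice among the four rotations are assumptions made for illustration. The sketch applies the same transform to a 112 × 112 input tile and its 72 × 72 centred label map, so the two stay aligned.

```python
import numpy as np

def augment(image, mask, rng):
    """Randomly flip and rotate a tile and its label map with the same transform.

    image: (H, W, bands) array, e.g. 112 x 112 x 4
    mask:  (h, w) array centred on the tile, e.g. 72 x 72
    The flip probability (0.5) and uniform rotation choice are assumptions.
    """
    if rng.random() < 0.5:               # left-right flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:               # up-down flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))          # number of 90 degree rotations (0..3)
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

# Toy tile: four bands of random values; the label map is the centre crop of band 0.
rng = np.random.default_rng(0)
tile = rng.random((112, 112, 4))
labels = tile[20:92, 20:92, 0].copy()
aug_tile, aug_labels = augment(tile, labels, rng)
```

Because flips and 90° rotations keep the tile centre fixed, the centre crop of the augmented tile still matches the augmented label map.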

Mean impervious cover % (2nd level)
To produce a continuous map of the degree of urbanity, a moving window approach with pixelwise step size (a continuous sliding window with overlap) was used. For every single pixel in the image the relative proportion of impervious surface (Percent Impervious Cover = PIC) was calculated for three window sizes of 100 m x 100 m, 400 m x 400 m and 800 m x 800 m. The window sizes were chosen to describe the neighborhood on different spatial scales, from neighboring houses to ward or village level. The workflow was automated and implemented in Python 3.8 using the NumPy library. The classified map of impervious surface from the 1st level was used as the input. The PIC was assigned to the central pixel of the moving window. The output is a multi-layer image with three bands, one for each window size. To reduce the multi-layer image to a single-band product with 0.3 m spatial resolution we averaged the three bands with equal weights (Eq. (1)):

\[ \text{Weighted average} = \frac{\mathrm{PIC}_{100} + \mathrm{PIC}_{400} + \mathrm{PIC}_{800}}{3} \tag{1} \]

The derived weighted average image was normalized to a scale of 0 to 1 to retrieve the degree of urbanity. A high value corresponds to high urbanization, which is characterized by a high percent impervious cover for the three window sizes, and a low value describes less urbanization. The map allows classifying any point within the study area into a level of urbanity based on its environment using the 2D metric alone.
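The moving-window computation can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the summed-area table is one convenient way to obtain per-pixel window sums, windows are clipped at the image border, and the final rescaling uses min-max normalization (the text only states that the result was normalized to 0 to 1). Window sizes are given in pixels; at 0.3 m resolution a 100 m window corresponds to roughly 333 pixels.

```python
import numpy as np

def percent_impervious_cover(mask, window_px):
    """PIC (0..100) in a square window centred on every pixel.

    mask: 2D array with 1 = impervious, 0 = not impervious.
    The effective window is 2 * (window_px // 2) + 1 pixels wide and is
    clipped at the image border (an assumption of this sketch).
    """
    mask = np.asarray(mask, dtype=np.float64)
    rows, cols = mask.shape
    half = window_px // 2
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((rows + 1, cols + 1))
    sat[1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
    r0 = np.clip(np.arange(rows) - half, 0, rows)
    r1 = np.clip(np.arange(rows) + half + 1, 0, rows)
    c0 = np.clip(np.arange(cols) - half, 0, cols)
    c1 = np.clip(np.arange(cols) + half + 1, 0, cols)
    total = sat[r1][:, c1] - sat[r0][:, c1] - sat[r1][:, c0] + sat[r0][:, c0]
    area = (r1 - r0)[:, None] * (c1 - c0)[None, :]
    return 100.0 * total / area

def degree_of_urbanity(mask, windows_px):
    """Equal-weight mean of the PIC layers (Eq. 1), rescaled to 0..1."""
    layers = np.stack([percent_impervious_cover(mask, w) for w in windows_px])
    mean_pic = layers.mean(axis=0)
    lo, hi = mean_pic.min(), mean_pic.max()
    if hi == lo:                       # constant image: nothing to rescale
        return np.zeros_like(mean_pic)
    return (mean_pic - lo) / (hi - lo)

# Toy example: a single impervious pixel in a 5 x 5 image, two small windows.
toy = np.zeros((5, 5))
toy[2, 2] = 1
pic3 = percent_impervious_cover(toy, 3)
urbanity = degree_of_urbanity(toy, [3, 5])
```

For the study's setting one would pass the level-1 impervious map and three window sizes corresponding to 100 m, 400 m and 800 m.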

Green and grey infrastructure complexity (3rd level)
The third level combines 2D information with 3D information on buildings and tree or green volumes. The first step was to extract an accurate digital surface model (DSM) from WorldView-3 satellite stereo pairs. For this purpose, we used the OrthoEngine SE from Geomatica Banff (PCI Geomatics, Richmond Hill, Ontario, Canada). A non-overlapping sliding window of 100 m × 100 m was applied over the DSM to extract the lowest 1% quantile representing the ground elevation within each square. A normalized height model (NHM) for each square was derived by subtracting the ground elevation value from the DSM. The space filling volume of buildings was estimated from the average of the NHM per building polygon, where the building polygon was extracted from the classified built-up area map. The volume of buildings was then calculated from the building polygon and its mean height. The tree volume was estimated accordingly and corresponds to that of a pillar above the crown projection area with tree height.
From the three indicator variables "mean impervious cover %", "building volume per ha" and "tree volume per ha" we identified clusters with similar overall variable profiles regardless of their absolute magnitudes; this is a common grouping approach taken from gene expression data analysis. First, we computed the correlation-based distance using the function get_dist() [in factoextra R package], which uses the Pearson correlation to quantify the similarity of value profiles over the three ordered indicator variables. Second, a hierarchical clustering was applied from hclust() with k = 5 clusters which was the optimal number of clusters identified using the gap statistics method fviz_nbclust() function [in factoextra R package].
Table 1 Land cover classes (class: description). Tree cover: Patches of leaf-on trees within the landscape
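The height normalization and volume estimate described above can be sketched in NumPy. The function names and the toy DSM are hypothetical, and the footprint is passed as a boolean mask rather than a polygon for simplicity; the sketch follows the stated steps (1% quantile per non-overlapping block as ground level, NHM = DSM minus ground, volume = footprint area times mean height).

```python
import numpy as np

def normalized_height_model(dsm, block_px, ground_percentile=1.0):
    """Subtract a per-block ground elevation from the DSM.

    The ground elevation of each non-overlapping block is taken as a low
    percentile (1% here) of the DSM values inside that block.
    """
    dsm = np.asarray(dsm, dtype=np.float64)
    nhm = np.empty_like(dsm)
    rows, cols = dsm.shape
    for r in range(0, rows, block_px):
        for c in range(0, cols, block_px):
            block = dsm[r:r + block_px, c:c + block_px]
            ground = np.percentile(block, ground_percentile)
            nhm[r:r + block_px, c:c + block_px] = block - ground
    return nhm

def object_volume(nhm, footprint, pixel_size_m):
    """Volume in m^3: footprint area times mean normalized height."""
    heights = nhm[footprint]
    if heights.size == 0:
        return 0.0
    area_m2 = heights.size * pixel_size_m ** 2
    return float(area_m2 * heights.mean())

# Toy DSM: flat terrain at 920 m with one 2 x 2 pixel object rising 10 m.
dsm = np.full((10, 10), 920.0)
dsm[2:4, 2:4] = 930.0
footprint = dsm > 925.0               # stands in for a building polygon
nhm = normalized_height_model(dsm, block_px=10)
volume = object_volume(nhm, footprint, pixel_size_m=0.3)
# 4 pixels x 0.09 m^2 x 10 m, i.e. about 3.6 m^3
```

The same `object_volume` call applied to a crown projection mask gives the pillar-shaped tree volume described in the text.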

Impervious surface, built-up, and tree cover mapping
The results of the classification for the three classes "tree cover", "built-up", and "impervious surface" are shown in Fig. 2 for a subset of 1 km × 1 km. The image shows a good match with the underlying WorldView-3 imagery. The overall accuracy (OA) is 89% for tree cover, 92% for impervious surface, and 87% for built-up, respectively. We used an independent validation dataset of n = 200 points that were classified based on visual interpretation. Although the classification outcomes were convincing, it is also visible in Fig. 2 that the network partly failed to separate single buildings from each other.
Figure 3 shows the averaged variable profiles of the five separated clusters. The variable Mean PIC, which contains the spatial aggregation over moving windows of different sizes, identified four separable clusters where each represents a different degree of urbanization. Clusters #1 and #2 are very close to each other and, in view of the profile, describe unsealed areas with vegetation or arable/fallow land, etc. Table 2 gives a more detailed description of the cluster characteristics.

Mapping green and grey infrastructure complexity
Our map of the complete study area (Fig. 4a) shows the spatial distribution of the five clusters from correlation-based distance clustering on a continuous scale. The red coloured area comprises built-up area, other types of impervious surfaces, and a few green spaces. It also becomes clear that there is anything but a linear urban-rural gradient within the research transect. Instead, the spatial distribution of the individual clusters shows that, although there is a clumping of cluster #4 in the area close to the city center, this cluster is also found in more distant areas that may have similar characteristics with respect to the input variables. Figure 4b shows the spatial distribution of the villages in the Northern transect and their assignment to the respective clusters. On a small scale there are two regions in which clusters #1 to #4 occur together. Fifty percent of the villages were classified into cluster 2, while 20% each fell into clusters 3 and 4. Only 10% of the villages were attributed to cluster 1 (Fig. 5).

Discussion
Considering an expected number of five billion people living in urban environments by 2030 (Venugopal et al. 2010), keeping urbanization within sustainable limits, and harnessing green infrastructure such as urban trees, are grand future challenges (Elmqvist and Maddox 2018). Decision processes in urban planning should be based on meaningful information, most notably regarding the future distribution of green infrastructures (Haaland and van den Bosch 2015).
With the aim of contributing to satisfying these information needs, the present study provides a data-driven approach that characterizes rural-urban gradients in terms of urbanity and grey/green infrastructure complexity. The produced map combines the percent impervious surface on a 2D scale together with the presence of trees and buildings and their corresponding volumes on a 3D scale as indicators to quantify the degree of urbanity, with implications for urban quality. Considering the space-filling 3D volume of buildings and trees is a new approach that helps to uncover important characteristics of urban quality that might be overlooked in a purely two-dimensional view. Such a classification might also provide a closer linkage to social-ecological complexity and help to identify social-ecological systems (SESs) as described in Pacheco-Romero et al. (2021).
Table 2 Cluster characteristics
Cluster #2: Shows the largest amount of tree volume together with a low mean impervious cover % in the surroundings. Combined with the absence of building volume, this cluster includes plantations, green spaces such as parks, and agricultural land.
Cluster #3: Characterized by a high mean impervious cover in the surroundings and a high amount of building volume, which indicates multi-floor buildings, together with moderate tree volume sourced from roadside or sparsely distributed single trees.
Cluster #4: Shows the highest mean impervious cover in the surroundings and a moderate built-up volume per hectare. The tree volume for this type of area is quite low; thus, this cluster can be categorized as a highly dense urban area.
Cluster #5: Characterized by a moderate mean impervious cover in the surroundings and less built-up volume; compared to the previous cluster, cluster #5 shows a higher tree volume.
In our analysis we discriminated five clusters of similar characteristics; each of these five clusters characterizes urban quality of life in specific ways, and these contributions can be either positive or negative.
The clustering used here is based on the similarity of variable profiles of three metrics that are informative if considered all together. Compared to a classification based on hard thresholds of single or multiple variables, this clustering approach is less influenced by the magnitude of absolute values that might vary over the study area. Clustering approaches used in other studies (e.g. Pacheco-Romero et al. 2021) often use the normed differences of several indicators in a weighted distance metric (e.g. Euclidean or Manhattan distance). Contrary to such approaches, our approach of pattern recognition is based on the similarity of variable profiles (Q-correlation), which makes it a more general metric that is applicable in different environments. In our analysis we limited the number of clusters to k = 5 based on their statistical "separability", which turned out to be well in line with the number of classes that can be clearly distinguished by visual interpretation. A larger number of clusters would lead to lower separability and ambiguous classification results.
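The difference between profile-based (correlation) and magnitude-based (Euclidean) distances can be illustrated with a small NumPy sketch. The study itself used get_dist() and hclust() in R; the code below is a Python re-expression for illustration only. The function names, the toy indicator values, and the choice of complete linkage are assumptions (the text does not state the linkage method), and the naive merge loop is only suitable for small inputs.

```python
import numpy as np

def correlation_distance(profiles):
    """1 - Pearson correlation between rows (one row per 1 ha cell).

    Rows with zero variance would give NaN and need separate handling.
    """
    return 1.0 - np.corrcoef(profiles)

def euclidean_distance(profiles):
    """Pairwise Euclidean distance between rows."""
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def complete_linkage_clusters(dist, k):
    """Naive agglomerative clustering: merge until k clusters remain."""
    clusters = [[i] for i in range(dist.shape[0])]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance of the farthest pair of members.
                d = dist[np.ix_(clusters[a], clusters[b])].max()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(dist.shape[0], dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels

# Hypothetical profiles over (mean PIC, building volume, tree volume).
profiles = np.array([
    [1.0, 2.0, 3.0],      # rising profile, small magnitude
    [10.0, 20.0, 30.0],   # same rising profile, large magnitude
    [3.0, 2.0, 1.0],      # falling profile, small magnitude
    [30.0, 20.0, 10.0],   # same falling profile, large magnitude
])
by_profile = complete_linkage_clusters(correlation_distance(profiles), k=2)
by_magnitude = complete_linkage_clusters(euclidean_distance(profiles), k=2)
```

With these toy values the correlation-based distance groups cells by the shape of their profile (rows 0 and 1 together), whereas the Euclidean distance groups them by magnitude (rows 0 and 2 together).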
Cluster #4 is characterized by a high coverage of impervious surface and only minor tree volume, and is the dominant type in urban Bengaluru. Relatively few trees with low green volume occurring in combination with a high coverage of impervious surface may have negative effects on the quality of life and a healthy urban community. In terms of air quality, McPherson et al. (1994) reported that large trees in particular, like the huge alley trees in Bengaluru, remove up to 70 times more air pollution than small trees.
Street trees play a key role for quality of life (Turner-Skoff and Cavender 2019) and therefore, there should be minimum criteria for the amount of tree cover available. For example, Konijnendijk van den Bosch (2021) recently proposed a 3-30-300 rule, according to which one should be able to see three trees from each house, the neighbourhood canopy coverage should be at least 30% to provide environmental, economic or social benefits (Mullaney et al. 2015), and lastly there should be a green space within a distance of 300 m.
Regions where neighbouring villages are categorized differently in terms of the green and grey infrastructure complexity may be an indicator of an ongoing transformation towards a more urban environment. The presented cluster approach provides a first indication of a new type of environment that could be named "rurban". Rurban environments do not necessarily represent an intermediate state along a linear transition from urban to rural, but often lead to highly dynamic, complex, and novel land cover constellations due to multiple development processes and tensions.
Tree cover (and tree volume) is a very broad and general land cover class. Cluster #2 in particular includes mango plantations as well as urban parks, although these different types of tree cover provide very different ecosystem services and thus also contribute to the quality of life in very different ways (one point is the question of public access, which is not necessarily guaranteed in the case of plantations, for example). As a future extension, we plan to implement a more detailed classification scheme which may provide more linkages to the different ecosystem services and to quality of life.
The three indicators used in the clustering process may have some shortcomings, such as misclassification. Although the classification is based on very high-resolution imagery, mainly pure pixels per land cover class and a state-of-the-art classifier, it cannot be avoided that some objects or land cover types have been incorrectly classified. This may have only a minor impact due to the coarser grid resolution but may nonetheless lead to some grid cells being assigned to the wrong cluster. The grid resolution of the presented continuous map of 100 m x 100 m is in line with Schoepfer et al. (2005), who created a 100 m grid after an exchange with city planners. In comparison, the resolution of the maps used for the analysis is very high at 0.3 m. The reference areas (sizes of overlapping moving windows) we used to derive the mean PIC per pixel were adjusted to the expected size and spatial scale of relevant objects or structures, like single houses or properties, wards and small villages. For other applications different spatial scales of neighborhoods might be more appropriate. However, for an area of 250 km², the analysis is computationally intensive. Downsampling to 1 ha during cluster analysis is a compromise between computational requirements, preservation of spatial variability, and interpretability of the map. The downsampling also offers the possibility to upscale this approach by integrating Sentinel-2 satellite imagery and to monitor changes in categorization over time.

Conclusion
Rapid urbanization creates new and complex configurations of land that are not captured by conventional dichotomies of "urban" and "rural", in particular in the Global South. At the same time there is evidence of the important role of green infrastructure in mitigating the negative impacts of urbanization on sustainability. We make use of 3D stereo satellite imagery and develop a data-driven approach to characterize a rural-urban interface regarding its green and grey infrastructure complexity. The approach is novel in that it follows an unsupervised and objective method and integrates 3D information.
Our analysis shows that the three-step procedure of 1) classifying "imperviousness" at the pixel level, 2) determining "urbanity" based on a relative proportion of impervious surface in a defined reference area in the 2D space, and 3) adding information on complexity by considering the 3D volume of green and grey infrastructure helps to distinguish different clusters representing different intensities of urbanization. Adding a third dimension of built-up and green volume, however, requires high-resolution stereo imagery or other 3D data sources (e.g. airborne LiDAR data) that can be used as a basis to derive a digital surface model (DSM).
The presented method could be considered in urban planning and in particular in green infrastructure development as it provides spatial information and a categorization of green infrastructure. For further development of the approach, an assessment of ecosystem services from green infrastructure would complement the information basis for decision making. Our study contributes to the existing recommendations from Hansen et al. (2019) for planning green infrastructure in cities and allows planning and site development by using a wall-to-wall map of green infrastructure categories. Our clustering approach may also advance integration of green infrastructure into ongoing efforts to identify and map land system archetypes, which are a hot topic in landscape and sustainability science (Pacheco-Romero et al. 2021). Using clustering approaches to identify land system archetypes facilitates the understanding of social-ecological complexity and context by categorizing systems with similar variable profiles that can be correlated or enriched by socioeconomic and ecological data (Rocha et al. 2020).
Author contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Nils Nölke. The first draft of the manuscript was written by Nils Nölke and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding Open Access funding enabled and organized by Projekt DEAL. The authors gratefully acknowledge the financial support provided by the German Research Foundation, DFG, through grant number 279374797 (Research Unit FOR2432/2).

Availability of data and material Not applicable.
Code availability Not applicable.

Competing interests
The authors have no relevant financial or nonfinancial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.