1 Introduction

1.1 Background

Natural hazards may result in negative consequences, including loss of life, property damage, and economic disruption, all of which must be considered in order to develop successful risk mitigation strategies (Eshrati et al. 2015). Exposure, vulnerability, and risk analysis require inventories of elements-at-risk (EaR), also called assets, to determine who and what is at risk, as a basis for selecting risk reduction options (Gill and Malamud 2014). Many types of EaR can be considered (e.g. buildings, population, transportation infrastructure, agriculture), depending on the aim of the risk assessment and the sector considered. Detection of existing EaR footprints and their characterisation or classification are important requirements for risk assessment (Eshrati et al. 2015). Buildings are the most frequently used EaR in risk assessments, as they are key for estimating economic losses and population losses. Important building characteristics for loss estimation are the occupancy type, structural type, number of storeys, the value of the structure and its contents, and the number of people present in different time periods (Papathoma-Köhle et al. 2007). This information, however, is difficult to obtain as it requires field surveys or access to census data, which are often restricted or outdated. Recent advances in volunteered geographical information (VGI), with platforms such as OpenStreetMap (OSM) and Mapillary, highlight the continuing emergence of citizen science (Goodchild 2007). Collaborative initiatives have aided many applications, such as land use mapping (Ribeiro and Fonte 2015), post-disaster damage mapping (Panek 2015), and community development (Panek and Netek 2019), in countries like South Africa (Panek 2015), Spain (Ariza-López et al. 2014), India (Papnoi et al. 2017; Raskar-phule and Choudhury 2015) and Malaysia (Husen et al. 2018). Barrington-Leigh and Millard-Ball (2017) estimated that over 80 per cent of the world is mapped on OSM. Raskar et al. (2015) generated flood vulnerability maps for critical infrastructure and transportation systems in Mumbai using VGI data such as OSM. Papnoi et al. (2017) also examined the hazard, risk, and severity of floods in Navi-Mumbai in a gridded system and concluded that urban areas can be easily and effectively mapped using OSM data, which are generally quite accurate. Geiß et al. (2017) estimated building and population EaR from land use/cover data derived from very high-resolution remote sensing imagery, while also commenting on the applicability of OSM data in exposure assessment. A promising area of application is the use of OSM data in data analysis processes as a substitute for conventional data sources. The quality of OSM properties and metadata has also been studied for suitability in different fields, including natural hazard risk science, and the findings suggest considerable relevance (Fan et al. 2014a, b; Poser and Dransch 2010; Schnebele and Cervone 2013). Furthermore, OSM data can also be used to train supervised algorithms to extract information relevant to the field of application.

However, the use of OSM building information for risk assessment has several problems. OSM buildings are often not updated on a regular basis, and buildings that have been destroyed by a disaster, for example, are often still present in OSM even though they no longer exist physically (Foody et al. 2015). Accurate attribute collection for OSM building data is also significantly difficult, since building characteristics cannot be seen by volunteer mappers on vertical satellite images and are often left blank (Zhang and Pfoser 2019). The assessment of vulnerability, loss, and risk requires building typological variables such as occupancy class (e.g., single-family home), construction type (e.g., reinforced concrete), and the number of storeys. Current online products and tools such as Google Street View, Google Maps, OSM maps, land use data, and other auxiliary datasets can assist in providing additional context, mostly on the occupancy type. However, the integration of such data with collaborative mapping can be difficult due to the different nature of these data.

1.2 Building mapping

Building footprint mapping through visual interpretation and manual digitisation from remote sensing images has become the standard approach. However, this is time-consuming, depends on the skill and dedication of the mapper, and may result in omission and misclassification (Mobasheri et al. 2018; Wu et al. 2020; Ghorbanzadeh et al. 2021). Therefore, semi-automated building detection methods have been developed, following two strategies: pixel-based and object-based classification algorithms (Blaschke 2010; Parker 2013; Pesaresi et al. 2008). In the past decade, artificial intelligence (AI) has been widely applied for object detection and classification with support vector machine (SVM), Random Forest (RF) and deep learning (DL) algorithms such as convolutional neural networks (CNNs) (Karpatne et al. 2016; Sur et al. 2020, 2022). A CNN is a deep learning algorithm that stems from artificial neural network research and is based on the back-propagation technique, which enables feature learning (Zhu et al. 2017; Zhou 2018). Multiple hierarchically stacked, trainable layers enable CNNs to learn characteristic features and abstractions from satellite images (Fu et al. 2019), resulting in the detection of hidden features based on common characteristics like colour, shape, and size, and of deep features such as spatial relationships. Specific capabilities of CNNs are that they maintain the spatial configuration of input images, their sparse connections enable the use of lightweight models, and their representation learning procedure helps to automatically learn features from the training data. These capabilities have resulted in high accuracies in image classification (Xie et al. 2020) and object detection (Ghorbanzadeh et al. 2019; Guirado et al. 2017; Sameen and Pradhan 2019), making CNNs suitable tools for building footprint extraction (Alidoost and Arefi 2018; Cohen et al. 2016; Stewart et al. 2020; Xie et al. 2020; Zhou et al. 2019).
CNNs like Fully Convolutional Networks (FCN) are used to accomplish pixel-by-pixel semantic segmentation (Wu et al. 2018; Pan et al. 2020), which maps image pixels to specified class labels, such as buildings and non-buildings. For semantic segmentation, FCN is one of the most essential networks in DL (Zhu et al. 2017), as it includes concepts such as end-to-end learning of the upsampling method using an encoder-decoder structure and skip connections to fuse data from different network levels. Some popular FCNs are U-Net (Ronneberger et al. 2015) and SegNet (Badrinarayanan et al. 2017). Pan et al. (2020) demonstrated the success of the U-Net architecture with very high-resolution (VHR) satellite imagery for semantic segmentation of high-density buildings. To test whether very deep networks perform better, Yi et al. (2019) combined deep residual networks with the U-Net model and reported significant improvements in the accuracy of building segmentation. Because FCNs such as U-Net and ResU-Net retain the contextual information from each layer through end-to-end learning and skip connections, the structural image integrity is preserved and distortion is greatly reduced (Ronneberger et al. 2015). Furthermore, the ResU-Net model predicts well with minimal training data (Alidoost and Arefi 2018; Qi et al. 2020). Therefore, the ResU-Net model was chosen for building detection in this study.

1.3 Building characterisation

It is mostly not possible to describe important building characteristics based only on the visual interpretation or automatic classification of vertical remote sensing images. Properties such as building occupancy types, construction types or the number of floors can sometimes be obtained from vertical satellite images (Sarabandi and Kiremidjian 2008), from aspects such as the shadow from buildings, the characteristics of rooftops, and the spatial relationships with other buildings. However, there are many building features that cannot be deduced from vertical images alone, and other data sources are required such as census data, official building databases, cadastral databases, field surveys or volunteered geographic information (VGI) (Graff et al. 2019) to provide relevant building information. Therefore, an approach to obtaining such relevant building information that describes the characteristics of the buildings via the use of open source data must be established. The first step in characterising a building is to examine its physical morphology and how it relates to neighbouring buildings and the environment. These morphological measures or metrics can provide useful information about the sort of buildings that may exist in a given location, as well as possible building functions such as occupancy types. The urban morphology Python library Momepy (Fleischmann 2019) is created for the quantitative analysis of urban form and morphometrics. Building diversity, adjacency, area coverage, and other structural factors can all be calculated using the library, which is useful for grouping buildings into physical similarities. As a result, the tool can serve as a link between detecting and characterising buildings as elements at risk. Momepy is used for semantically classifying buildings based on building attributes such as size, form, proximity to other buildings, and building compactness. 
The second step in characterising buildings is to combine the obtained morphometric information with auxiliary knowledge, for example building tags from VGI such as OSM, and land use data. Fan et al. (2014a, b) studied urban morphometrics to characterise buildings in five German cities using a complete OSM dataset with accurate building data and over 2027 buildings with occupancy type labels. The use of OSM for determining building attributes was also investigated by several other authors (Fan et al. 2014a, b; Sun et al. 2017). However, relying on OSM building tags alone is not feasible in regions where very few buildings are mapped in OSM. In addition to OSM data, other data sources can prove useful in approximating building characteristics, such as oblique images from Google Street View or Mapillary, building labels from Google Maps, built-up area classification from the Global Urban Footprint (GUF) (Esch et al. 2013; Geiß, Wurm, and Taubenböck 2017; Geiß et al. 2019) and land use/land cover maps (ISRO 2020). Geiß et al. (2017, 2019) developed a method to combine the GUF data with height information from TanDEM-X data. Cerri et al. (2021) used OSM building data together with various proxies or auxiliary data to enhance building characterisation for flood vulnerability assessment. Stewart et al. (2016) created a technique for estimating the building occupancy type from population density data using Bayesian machine learning. Hasan et al. (2018) used LiDAR data to automatically extract building footprints and heights, together with manual interpretation of building occupancy types, for landslide exposure.
Our study expands on these earlier studies by developing a building characterisation approach using open-source data such as OSM, Google Maps, and other publicly available data, focusing on the generation of homogeneous urban units manifested by the physical morphologies of the buildings and the estimation of their predominant building occupancy type. We do this at an aggregated level in homogeneous units to compensate for the lack of information on each individual building.

Therefore, the crux of the research was to detect building footprints using deep learning and to recognise the building occupancy of the detected footprints via building characterisation modules including the use of building morphological metrics and open-source auxiliary data. We employ a semi-automated workflow for the generation of Elements-at-Risk (EaR) databases of buildings.

2 Study area and datasets

2.1 Study area

The approach was tested in two areas in the state of Kerala, India (Fig. 1a, b, c). Although Kerala is one of the most developed states of India, with a good disaster management framework, it suffered from a lack of organised EaR data as a basis for risk assessment and disaster preparedness. The state witnessed severe flooding and landsliding in 2018, displacing 85,000 people and destroying numerous buildings (Dwyer 2018). The Kerala State Disaster Management Authority (KSDMA) is the main organisation at the state level, supporting district-level organisations in this state with a high level of self-government. KSDMA supported disaster risk management in Kerala through collaborative mapping projects with organisations such as the International Centre for Free and Open-Source Software (ICFOSS) to establish an elements-at-risk database through the Mapathon Kerala project (Kerala State Spatial Data Infrastructure 2021). However, owing to the time necessary to compile appropriate open data across the entire state of Kerala, the results were not available online at the time of this study. In contrast to this project, which requires considerable time and human resources, our research addresses the rapid mapping of building EaR (including both detection and characterisation) in data-scarce locations, which may be used for emergency purposes such as relief efforts.

Fig. 1

Study area of Palakkad (b) and Kollam (c) in south-western India (a). d and e are examples of the existing building data in OpenStreetMap. b and c also depict the flood exposure derived from a flood susceptibility map by the KSDMA

Palakkad is a town with a population of 130,000 people located in the district of Palakkad (Census of India 1981). It is bordered by tributaries of the Bharathapuzha River and is frequently subjected to high levels of monsoonal rainfall, with an annual average of 1216 mm. In the latest disaster of 2018, widespread floods and landslides induced by heavy rain in the hills surrounding Palakkad forced many people to relocate, while landslides claimed the lives of 9 people and destroyed 3 houses (Bennett 2018). To examine the practicality and transferability of the proposed framework, part of the city of Kollam was chosen as a second test site. Palakkad is a landlocked city, surrounded by two river channels and a dam towards the north, whereas Kollam is a coastal city surrounded by the Arabian Sea towards the west and the large Ashtamudi Lake in the north. As Kollam is surrounded by these two water bodies, it is extremely vulnerable to coastal and lake flooding during the monsoon seasons. Kollam was likewise severely flooded in 2018, which caused major property damage and loss.

2.2 Datasets

Road and building data were downloaded from OSM. As can be seen from Fig. 1d, e, even though there are many buildings (indicated by circles) mapped with their respective footprints, these are mostly limited to specific types of buildings (e.g. public, commercial, educational buildings), while the majority of the city is only marked as built-up area (with instances of tags such as “Yes”). The majority of the footprints lack attribute information.

Building tags were derived from OSM and Google Maps. The point and polygon data from OSM, which included the building tags, were used to extract information on the use of the buildings (e.g., stores, restaurants, offices, houses, residential apartments, commercial, recreational spaces, schools, and hotels). Data on urban land use were collected from the national geospatial data portal Bhuvan (ISRO 2020). The land use map was made at 1:10,000 scale by the Ministry of Urban Development, India (MoUD) as part of the National Urban Information System (NUIS). The land use map was obtained using a Web Map Service, resampled and used as a backdrop image for manually digitising the land use polygons.

The Overpass API, which delivers custom-selected sections of the OSM data, was used to extract existing building footprints from OSM (approximately 1000 buildings). Satellite RGB orthoimages of 80 cm resolution, dated 20 November 2019, were obtained for both study areas from Google Earth™.
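As an illustration of the kind of request involved, the following minimal sketch builds an Overpass QL query for all building ways and relations inside a bounding box. The function name `build_overpass_query` and the bounding box are hypothetical and chosen only for illustration; the exact query used in the study is not documented here. Overpass QL expects bounding boxes in the order south, west, north, east.

```python
def build_overpass_query(south, west, north, east, timeout=60):
    """Return an Overpass QL query for building ways and relations in a bounding box."""
    bbox = f"{south},{west},{north},{east}"  # Overpass order: south, west, north, east
    return (
        f"[out:json][timeout:{timeout}];\n"
        "(\n"
        f'  way["building"]({bbox});\n'
        f'  relation["building"]({bbox});\n'
        ");\n"
        "out geom;"  # return full geometries so footprints can be rebuilt
    )


# Rough, illustrative bounding box near Palakkad (an assumption, not the study extent).
query = build_overpass_query(10.72, 76.60, 10.82, 76.70)
print(query)
```

The resulting string would then be sent to an Overpass endpoint of the user's choice; the network request is left out so that the sketch stays self-contained.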

To train the DL models correctly and attain greater accuracy, an additional 6000 building polygons were manually digitised to increase the number of training examples in the Palakkad data set, and another 2000 were manually digitised to test the model accuracy.

A total of 15 tiles, each 8000 × 8000 pixels in size, were generated around the city of Palakkad to divide the building polygon data in each tile for training (12 tiles) and testing (3 tiles) purposes.

3 Methods

The overall approach of this research (Fig. 2) consists of three main components: mapping of building footprints, their characterisation, and their use in flood exposure assessment. The mapping of the building footprints is based on the ground truth data from OSM and the satellite imagery. This was followed by data sampling for training, validation, and testing purposes. The training data are used to train an initial model, while the validation data are a portion of the training data used to evaluate the trained model while tuning the hyper-parameters to overcome issues like overfitting. This yields a set of results for different combinations of hyper-parameters. The testing data are used to evaluate the performance of the final tuned model based on its predictions over the “unseen” data in the test set.

Fig. 2

Conceptualisation of the research methodology. The steps include the preparation of data, sampling of the data for model training and evaluation, calculation of characteristic parameters, combination of these parameters into typological attributes of the buildings at an aggregated scale, and exposure assessment of the aggregated buildings using the derived typological attributes

The characterisation of the detected building footprints is the next step. We hypothesise that the characteristics of buildings are homogeneous within a neighbourhood and that this homogeneity is manifested by the morphology of the buildings. Therefore, following the detection of the building footprints, the buildings are grouped into aggregated homogeneous areas based on parameters obtained from structural data (morphological) and proxy data (open-source data such as OSM, land use data, and Google Maps). Typological (occupancy type) properties of the buildings were assigned to the homogeneous units, which were then validated by local experts from the KSDMA and ICFOSS. Additional information such as the average number of floors, total floor space area, building density, and percentage of built-up area was also compiled with this approach. Flood exposure assessment was done by combining the 2018 flood extent maps with the output of the building characterisation at the homogeneous unit level.

3.1 Deep learning model set-up

Building footprint detection in the study areas was carried out using the ResU-Net model (Diakogiannis et al. 2020) that specialises in recognising objects with limited training samples. The ResU-Net model is a semantic segmentation model inspired by the deep residual learning network (He et al. 2016) and U-Net (Ronneberger et al. 2015) which combines the benefits of both residual network and U-Net models in order to achieve higher accuracies. The ResU-Net structure (Fig. 3) contains a very deep encoding network, followed by a bridge and decoding network. The deep encoding network enables more discriminative and hierarchical feature extraction.

Fig. 3

Schematic diagram of the ResU-Net model based on Diakogiannis et al. (2020)

The encoding network consists of three encoder blocks, each including a convolutional layer (Conv) (Zhang et al. 1988), a batch normalisation layer (BN) (Ioffe and Szegedy 2015), and a rectified linear unit (ReLU) (Agarap 2018) activation function, which help learn abstract representations of the input images. The output of each of the three encoder blocks is connected to the corresponding decoder block through skip connections (He et al. 2016) (blue dotted arrows in Fig. 3), which bypass intermediate layers in the network and feed the outputs directly to later layers. The bridge is a residual block consisting of similar BN, Conv, and ReLU layers that connects the encoder network and the decoder network.

The decoder network takes information from the bridge and the encoder network through the skip connections to produce segmentation results. The decoder network consists of decoder blocks with upsampling layers that help retain size similar to that of the features in the corresponding encoder blocks, thereby finally resulting in segmentation results of output size similar to that of the input image.

After training, the result is a binary classified image that distinguishes between building and non-building pixels. The whole process of training the ResU-Net model was conducted on Google Colab using an NVIDIA P100 GPU (16 GiB VRAM) with 25 GB of RAM. Hyper-parameter tuning is a crucial part of DL training as it controls the overall behaviour of the model. The following hyper-parameters were tuned during training: number of epochs (the number of complete training passes over the training dataset), batch size (the number of training samples processed before the model is updated), optimiser (the algorithm that updates parameters such as weights to minimise the loss), and learning rate (which regulates how strongly the model changes in response to the estimated error). The Adam optimiser was used instead of the traditional Stochastic Gradient Descent optimiser in the tests, as proposed by Bottou (2010) and Pan et al. (2020). Because of its adaptive learning capability, Adam converges faster in decreasing the loss, thus enhancing overall accuracy. To optimise training speed and avoid overfitting the network model, learning rate and weight decay settings were employed. Heat maps of probability values belonging to the classes "buildings" and "non-buildings" were generated as a result of this stage. Weighted loss functions like the Tversky loss can force the model to focus on learning the target building pixels, even when the target pixels constitute a relatively small part of the whole image (Lin et al. 2020). Therefore, this loss function was investigated to improve Precision and Recall by using the weights alpha and beta, which control the penalties on False Positives and False Negatives, respectively.
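To make the role of the two weights concrete, the following minimal sketch computes a Tversky loss on flattened pixel labels and predicted probabilities in plain Python. It uses the common formulation TI = TP / (TP + alpha·FP + beta·FN) with loss = 1 − TI. The function name, the smoothing term `eps`, and the default `alpha = 0.3` (taken as 1 − beta) are assumptions made for illustration and are not taken from the training code of this study.

```python
def tversky_loss(y_true, y_prob, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky loss for binary segmentation on flattened pixel sequences.

    y_true: ground-truth labels (1 = building, 0 = non-building)
    y_prob: predicted building probabilities in [0, 1]
    alpha:  weight on false positives (assumed to be 1 - beta here)
    beta:   weight on false negatives
    """
    pairs = list(zip(y_true, y_prob))
    tp = sum(p * y for y, p in pairs)            # soft true positives
    fp = sum(p * (1 - y) for y, p in pairs)      # soft false positives
    fn = sum((1 - p) * y for y, p in pairs)      # soft false negatives
    tversky_index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - tversky_index


# One missed building pixel versus one false alarm, each with a single true positive.
loss_missed = tversky_loss([1, 1, 0], [1.0, 0.0, 0.0])
loss_false_alarm = tversky_loss([1, 0, 0], [1.0, 1.0, 0.0])
print(loss_missed > loss_false_alarm)
```

With beta larger than alpha, a missed building pixel is penalised more heavily than a false alarm, which pushes the model towards higher Recall at some cost in Precision.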

The building detection results were evaluated by counting the pixels assigned as True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). The thematic accuracy was assessed using the Precision, Recall, and F1-score metrics. Precision (Eq. 1) is the proportion of pixels detected as buildings by the proposed method that are buildings in the labelled data. Recall (Eq. 2) is the fraction of the building pixels in the labelled data that were successfully detected. The F1-score (Eq. 3) balances Precision and Recall as their harmonic mean. The Accuracy (Eq. 4) is the share of all pixels, both True Positives and True Negatives, that the model predicted correctly.

$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(1)
$$\mathrm{Recall}= \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(2)
$$\text{F1-score}=2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}$$
(3)
$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(4)
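Eqs. 1 to 4 can be sketched in a few lines of Python. The function name and the guards against empty denominators are assumptions added for illustration; the counts in the usage example are made up and are not results from this study.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Precision, Recall, F1-score and Accuracy (Eqs. 1-4) from pixel counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up pixel counts for illustration only.
scores = segmentation_metrics(tp=80, fp=20, fn=20, tn=880)
print(scores)
```

Because non-building pixels usually dominate a tile, Accuracy can be high even when many building pixels are missed, which is why Precision, Recall, and the F1-score are reported alongside it.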

3.2 Building morphological clustering

To comprehend the physical morphology of the building structures, the next step was to use spatial urban morphological metrics through the Momepy Python package, which is a library for morphometrics and quantitative analysis of urban structures (Fleischmann 2019). Twenty-two morphological measures were chosen from the library's numerous metrics (Table 1) to determine the spatial connections and morphologies of the buildings to themselves and their surroundings. The spatial morphological analysis is based on the hypothesis that buildings that are comparable in shape and size and that are close to similar ones are also likely to have the same occupancy type (Fan et al. 2014a).

Table 1 List of urban morphological metrics used in this study (Fleischmann 2019)

After calculating the metrics of each building, the buildings were clustered, with each cluster carrying information on the physical shape of its buildings. The clustering is based upon the hypothesis made earlier that the attributes/characteristics of buildings are manifested by their physical morphology. Clustering was accomplished via unsupervised K-Means classification, which divides the buildings, described by their morphological metrics, into k clusters, with each observation belonging to the cluster with the closest mean (or cluster centroid). The appropriate number of clusters in the data set was determined based on the Silhouette score, with input from local stakeholders as a guide. The Silhouette score measures the goodness of the clustering, with values ranging from -1 to 1 (Marutho et al. 2018). It effectively measures how similar an observation/object is to its own cluster compared to the other clusters. A high score indicates an appropriate clustering configuration, whereas a low score indicates too many or too few clusters.
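The selection of k by Silhouette score can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic two-dimensional data rather than the 22 Momepy metrics; the function name, the standardisation step, and the candidate range of k are assumptions and are not taken from the study's code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def best_k_by_silhouette(features, k_values, random_state=0):
    """Fit K-Means for each candidate k and return the k with the highest Silhouette score."""
    # ASSUMPTION: metrics are standardised so that no single metric dominates the distances.
    scaled = StandardScaler().fit_transform(features)
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for per-building metrics: three well-separated groups.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centres])

best_k, scores = best_k_by_silhouette(features, range(2, 7))
print(best_k, scores)
```

In practice the score serves only as a guide: as described above, the number of clusters was settled together with local stakeholders rather than by the score alone.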

3.3 Built-up area homogenisation

After grouping the buildings into homogeneous clusters based on their physical morphologies, information on their likely occupancy types was still lacking. If, for example, four were taken as the optimal number of clusters in Palakkad based on the Silhouette score, the conclusion would be that only four distinct types of buildings exist. However, buildings with numerous spatially and morphologically distinct characteristics would then be assigned to these four types, which would be incorrect. Therefore, the next step was to refine the clustering using smaller homogeneous units (coupled with local expert validation, see Sect. 3.4). To this end, linear features such as road networks, river lines, and railway lines were used to subdivide the city of Palakkad into 62 homogeneous urban units (Fig. 4). These homogeneous units also follow the earlier hypothesis whereby buildings within a unit or neighbourhood share a common built-up morphology. Using linear features to delineate blocks and homogenise land parcels has been done before (Zeng et al. 2019; Kuffer et al. 2020) and allows the generation of units that resemble administrative units.

Fig. 4

Linear features for designing homogeneous built-up areas in Palakkad

A metric (Eq. 5) was used to analyse the homogeneity of the cluster values within each homogeneous urban unit (Fan et al. 2014a, b). It is not likely, for instance, that a homogeneous urban unit contains building clusters with the morphologies of agricultural buildings. The homogeneity score indicates the similarity of the building morphologies in each unit. A lower percentage score might aid in determining which additional types of buildings may be present in that particular unit, and in further subdividing the unit into smaller homogeneous ones.

$$\text{Homogeneity score}=\frac{\text{Frequency of first majority cluster buildings}+\text{Frequency of second majority cluster buildings}}{\text{Number of buildings per homogeneous unit}}$$
(5)
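Reading Eq. 5 as the share of buildings in a unit that belong to its two most frequent clusters, the score can be sketched as follows. The function name and the example cluster labels are hypothetical and serve only as an illustration.

```python
from collections import Counter


def homogeneity_score(cluster_labels):
    """Share of buildings in a unit that fall into its two most frequent clusters."""
    if not cluster_labels:
        return 0.0
    top_two = Counter(cluster_labels).most_common(2)  # first and second majority clusters
    return sum(count for _, count in top_two) / len(cluster_labels)


# Hypothetical K-Means cluster values of the ten buildings in one homogeneous unit.
unit_labels = [3, 3, 3, 3, 1, 1, 1, 5, 7, 7]
score = homogeneity_score(unit_labels)
print(score)
```

Multiplying the result by 100 gives the percentage form; units with a low score are candidates for further subdivision.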

3.4 Characterisation of occupancy types

The homogeneous urban units were subsequently characterised by the prevalent occupancy type using auxiliary open-source data such as building tags (from OSM and Google Maps) and land use in combination with the morphological information from the building clusters.

We first combined all the auxiliary data at the building footprint level into one shapefile. The next step was to determine the proper classification of the homogeneous urban units in terms of occupancy types based on the amalgamated data. A majority rule was used to assign the majority characteristics to each individual homogeneous urban unit. These majority calculations were done in Python. The classification system designed for this can be seen in Fig. 5 and the steps to perform it are as follows:

  1. Sort the data according to the homogeneity score.

  2. Based on the scoring and the majority cluster value, the associated building morphology is interpreted with the building tags.

    (i) If information from the building tags is not available, then skip step 2 and move to step 3 to use the land use information.

    (ii) If information from the building tags is available, then use it and then move to step 3.

  3. Next, the majority land use information is used for further interpretation.

    (i) If information from the land use map is not available, then use the building tags from step 2 as the class label instead.

    (ii) If information from the land use map is available, then use the information and move to step 4.

  4. Classify the built-up area units into occupancy types using the inferred/interpreted building type from steps 2 to 3.

  5. Sort the classified classes from mixed-built-up and then re-classify based on the distance from the central business district (CBD) or the city centre.
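Steps 2 to 4 above can be sketched as a small majority rule in Python. The function names are hypothetical, and the way the two sources are combined when both are available (agreement keeps the shared label, disagreement yields a mixed built-up label) is an assumption made for illustration, since the exact combination follows the classification system in Fig. 5. The distance-based re-classification of step 5 is not covered.

```python
from collections import Counter


def majority(values):
    """Most frequent non-missing value, or None when no value is available."""
    present = [v for v in values if v]
    return Counter(present).most_common(1)[0][0] if present else None


def classify_unit(building_tags, land_uses):
    """Assign an occupancy label to one homogeneous urban unit (steps 2 to 4)."""
    tag = majority(building_tags)    # step 2: majority building tag, if any
    land_use = majority(land_uses)   # step 3: majority land use, if any
    if land_use is None:
        return tag                   # step 3(i): fall back to the building tags
    if tag is None or tag == land_use:
        return land_use              # step 2(i), or both sources agree
    # ASSUMPTION: conflicting sources are labelled as mixed built-up.
    return "mixed built-up"


# Hypothetical per-building attributes for three units (None = missing).
print(classify_unit(["commercial", None, "commercial", "residential"], [None, None]))
print(classify_unit([None, None], ["residential", "residential", "public"]))
print(classify_unit(["commercial", "commercial"], ["residential", "residential"]))
```

Units that end up as mixed built-up would then be passed to step 5 for re-classification by distance from the CBD.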

Fig. 5

Classification system for building classification based on the typology of the occupancy type

One rationale for reclassifying the mixed-built-up class into residential or commercial classes (in step 5) is to break it down into useful information that can be combined with vulnerability curves to estimate flood vulnerability (Huizinga et al. 2017) (refer to Fig. 12b in the appendix section). The next step was to compare the obtained results with the actual situation in the city. Hence, local experts of KSDMA and ICFOSS, Kerala, collaborated to validate the cluster interpretations and the overall representation of the occupancy types of the buildings. Furthermore, the experts were asked to comment on the building classification and building occupancy types. Local validation was documented at two stages: a) the K-Means cluster interpretation, and b) the final (re-)classification of the homogeneous urban units.

3.5 Flood exposure assessment

Flood susceptibility maps (indicating only presence/absence) were acquired from the KSDMA (KSDMA 2020) (Fig. 1b, c). Although KSDMA undertook a crowdsourcing campaign to obtain flood heights during the 2018 flood event, these were not accessible at a sufficient level of detail, and therefore a flood vulnerability assessment could not be conducted.

“Flood exposure”, which is the quantification of EaR in flood-prone locations (De Moel et al. 2011; Koks et al. 2015), was computed by overlaying the flood susceptibility map with the homogeneous urban units and the individual building maps. The proportion of the area of each unit covered by the flood extent was determined, as well as the number of buildings exposed. In addition, the exposure was determined for individual building footprints and the results were aggregated at the homogeneous unit level.
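The aggregation of building-level exposure to the homogeneous unit level can be sketched as follows. The sketch assumes that the spatial overlay has already been done in a GIS, so that each building carries a unit identifier and a flag stating whether it intersects the flood-susceptible area; the function name, the record layout, and the unit identifiers are hypothetical.

```python
from collections import defaultdict


def exposure_by_unit(buildings):
    """Aggregate per-building exposure flags to counts and shares per homogeneous unit.

    buildings: iterable of (unit_id, is_exposed) pairs, one per building footprint.
    """
    totals = defaultdict(int)
    exposed = defaultdict(int)
    for unit_id, is_exposed in buildings:
        totals[unit_id] += 1
        exposed[unit_id] += int(is_exposed)
    return {
        unit_id: {
            "buildings": totals[unit_id],
            "exposed": exposed[unit_id],
            "share_exposed": exposed[unit_id] / totals[unit_id],
        }
        for unit_id in totals
    }


# Hypothetical overlay result: (unit id, building lies in the flood-susceptible area).
records = [("U01", True), ("U01", False), ("U01", True), ("U01", False), ("U02", False)]
summary = exposure_by_unit(records)
print(summary)
```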

4 Results

4.1 Building detection for the Palakkad study site

The ResU-Net model was trained on 12 tiles and its accuracy was tested on the 3 test tiles using the metrics in Eqs. 1, 2, and 3. First, the Tversky loss was investigated with varying beta weights and other hyper-parameters such as batch size and learning rate. Our experiments demonstrated that beta = 0.7 with a batch size of 12 and a learning rate of 1e-3 (Table 2) was the best hyper-parameter combination, giving the highest F1-score with the lowest loss. Batch size and learning rate affect how the loss converges towards its minimum (Kinghorn 2018); here, batch size = 12 and learning rate = 1e-3 gave the lowest loss of 0.231, whereas learning rates of 1e-4 and 1e-5 gave losses of over 0.5. Table 3 shows the outputs for different batch sizes and learning rates. Therefore, for the final training in Palakkad, batch size = 12 and learning rate = 1e-3 were chosen. The generated weights were then used to detect buildings in the entire region of Palakkad using the TensorFlow API. To minimise the influence of boundary artefacts on the predictions, a sliding window approach with a stride (the number of pixels by which the window moves between predictions) of 24 was used to produce overlapping 512 × 512 patches, and the predictions for these were averaged to obtain the final segmentation results. Post-processing was performed to remove multi-polygons (instances of building polygons coinciding with another building polygon) and false-positive predictions. Figure 6a, b depicts the detected buildings against the manually mapped buildings.
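The overlapping sliding-window prediction described above can be sketched with NumPy. The function names are hypothetical, `predict_patch` stands in for the trained model, and the extra window added at the far edge of each axis is an assumption made so that every pixel is covered; the usage example uses a small patch size to keep it quick.

```python
import numpy as np


def window_starts(length, patch, stride):
    """Start offsets of windows along one axis, including one flush with the far edge."""
    if length < patch:
        raise ValueError("image is smaller than the patch size")
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)  # ASSUMPTION: add a final window to cover the edge
    return starts


def predict_with_overlap(image, predict_patch, patch=512, stride=24):
    """Average per-pixel predictions over all overlapping patches."""
    height, width = image.shape[:2]
    prob_sum = np.zeros((height, width), dtype=np.float64)
    hits = np.zeros((height, width), dtype=np.float64)
    for top in window_starts(height, patch, stride):
        for left in window_starts(width, patch, stride):
            tile = image[top:top + patch, left:left + patch]
            prob_sum[top:top + patch, left:left + patch] += predict_patch(tile)
            hits[top:top + patch, left:left + patch] += 1.0
    return prob_sum / hits  # every pixel is covered at least once


# Stand-in "model" that returns the patch itself, so the average must equal the input.
image = np.arange(100 * 90, dtype=np.float64).reshape(100, 90)
averaged = predict_with_overlap(image, lambda tile: tile, patch=32, stride=24)
print(np.allclose(averaged, image))
```

A small stride relative to the patch size means each pixel is predicted many times, which smooths artefacts near patch borders at the cost of more model evaluations.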

Table 2 Table of Tversky loss against different batch sizes. Bold numbers are the best values
Table 3 Table of accuracies against different learning rates trained with Tversky beta weight of 0.7. Bold numbers are the best values
Fig. 6 a Detected buildings using the ResU-Net model. b Manually mapped buildings against the detected buildings. c Existing building tags from the OSM data (inside the yellow box). d Overall cluster values of each building after performing K-Means clusterisation over the morphological data of each building (see Table 4 for an explanation of the classes)

4.2 Amalgamation of building morphology with open-source data and local expert validation

After post-processing the detected building footprints, we calculated 22 spatial urban morphology metrics (Table 1) of the buildings using the Momepy Python library. These were used to cluster the buildings with the unsupervised K-Means algorithm, where each cluster value represents a group of buildings with similar morphological characteristics. Figure 6c shows the types of building tags available in the OSM data. Figure 6d and Table 4 show the classification into eight clusters (chosen based on the results and the suggestions of the local experts), describing the building morphology.
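
To make the clustering step concrete, the following is a minimal pure-Python sketch of K-Means (Lloyd's algorithm) applied to a table of per-building metrics. It is not the implementation used in the study: the two toy metrics, the z-score standardisation and the deterministic farthest-point initialisation are assumptions made so that the example is small and reproducible, whereas the study clustered 22 Momepy metrics into eight clusters.

```python
import math


def standardise(rows):
    """Z-score each column so metrics on different scales weigh equally."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each building joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy table: [footprint area in m2, compactness] for six buildings.
metrics = [[60, 0.90], [70, 0.85], [65, 0.95],
           [900, 0.30], [1000, 0.25], [950, 0.35]]
labels, centroids = kmeans(standardise(metrics), k=2)
```

The small compact buildings and the large elongated ones end up in separate clusters; the interpretation of what each cluster means still has to come from auxiliary data and expert knowledge.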

Table 4 Building cluster interpretation in Palakkad after local expert validation

The cluster data, in combination with auxiliary data such as building tags and land use information (see Table 5), were used to determine the majority urban land use type per homogeneous urban unit.

Table 5 First five examples of the merged data. Refer to Table 4 for the definition of the cluster values

The cluster interpretation and final classification were adapted to the actual setting of Palakkad based on the opinions and suggestions of the local experts, who advised employing a distance-based classification from the CBD when re-classifying the homogeneous urban units to address the Mixed-Built-Up classes. With increasing distance from the CBD, the classification of units would shift from commercial through residential urban, public, and industrial to residential rural (Appendix section, Fig. 12b).
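
A distance-based re-classification of this kind could be sketched as below. The ring distances are purely hypothetical placeholders, since the thresholds applied in Palakkad are not reported here; only the order of the classes follows the text. The function name and the use of unit centroids in projected coordinates (metres) are likewise assumptions.

```python
import math

# Hypothetical ring limits in metres from the CBD (illustrative values only).
RINGS = [
    (1000, "commercial"),
    (3000, "residential urban"),
    (5000, "public"),
    (8000, "industrial"),
]


def reclassify_mixed(unit_class, centroid_xy, cbd_xy,
                     rings=RINGS, outer="residential rural"):
    """Replace a Mixed-Built-Up label using the unit's distance to the CBD."""
    if unit_class != "Mixed-Built-Up":
        return unit_class  # only mixed units are re-classified
    distance = math.hypot(centroid_xy[0] - cbd_xy[0],
                          centroid_xy[1] - cbd_xy[1])
    for limit, label in rings:
        if distance <= limit:
            return label
    return outer  # beyond the last ring


cbd = (0.0, 0.0)
near = reclassify_mixed("Mixed-Built-Up", (500.0, 0.0), cbd)
far = reclassify_mixed("Mixed-Built-Up", (0.0, 9000.0), cbd)
```

In practice the ring limits would be set together with the local experts for each city.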

4.3 Final classification of buildings after the classification system

Table 6 First 5 examples of the final classification based on the majority information of the auxiliary data

The final product indicates the predominant occupancy type per homogeneous urban unit in Palakkad, as seen in Fig. 7. This was done by applying the majority rule-based classification system shown in Fig. 5. Overall, 13 occupancy types are recognised by this method, with additional information (Table 6) such as the number of buildings, building density, percentage of built-up area per homogeneous unit, number of floors (estimated using Google Street View and Mapillary images), and total floor space area (using the estimated number of floors with the building floor space per homogeneous unit). Such information can readily be used in the context of flood exposure, for example, and leveraged to support risk assessment and risk reduction initiatives.
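
The per-unit attributes listed above can be derived with a few lines of code. The sketch below is a simplification under stated assumptions: it applies the majority rule to building tags only, whereas the classification system in Fig. 5 also draws on cluster values and land use information, and it assumes a floor count per building. The field names and the function name are hypothetical.

```python
from collections import Counter


def summarise_unit(buildings, unit_area_m2):
    """Derive unit-level attributes from a list of building records.

    Each record is a dict with 'tag' (occupancy label or None),
    'footprint_m2' and 'floors'.
    """
    tags = [b["tag"] for b in buildings if b["tag"] is not None]
    # Majority rule; ties go to the label that was encountered first.
    occupancy = Counter(tags).most_common(1)[0][0] if tags else "unknown"
    built_up = sum(b["footprint_m2"] for b in buildings)
    floor_space = sum(b["footprint_m2"] * b["floors"] for b in buildings)
    return {
        "occupancy": occupancy,
        "n_buildings": len(buildings),
        "density_per_ha": len(buildings) / (unit_area_m2 / 10_000),
        "built_up_pct": 100.0 * built_up / unit_area_m2,
        "floor_space_m2": floor_space,
    }


# Toy unit of one hectare with four buildings, one of them untagged.
unit_buildings = [
    {"tag": "commercial", "footprint_m2": 100, "floors": 2},
    {"tag": "commercial", "footprint_m2": 200, "floors": 3},
    {"tag": "residential", "footprint_m2": 150, "floors": 1},
    {"tag": None, "footprint_m2": 50, "floors": 1},
]
unit_summary = summarise_unit(unit_buildings, unit_area_m2=10_000)
```

Untagged buildings still contribute to the counts and areas but not to the majority vote, which is one way the tag bias discussed later can influence the assigned occupancy type.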

Fig. 7 Occupancy types of the homogeneous urban units in Palakkad with the respective unit numbers

5 Transferability of the method in a different test area

Similar to Palakkad, 8 tiles, each 8000 × 8000 in size, were generated for Kollam for training (5 tiles) and testing (3 tiles) purposes. Because it reuses feature representations (building features in this case) learnt by previously trained models, transfer learning can be very effective when training data are scarce: the learnt weights of a previous model are transferred to a different location with new data (Ravishankar et al. 2016). The weights obtained from a previous model can be applied to the class of buildings in a new area of interest, where the model learns on top of the pre-trained weights and retrains an output layer using the target building data set. This method can shorten the training time of the model and improve model efficiency in the new area (Bai et al. 2012).

Therefore, to detect buildings in Kollam, transfer learning was used to address the scarcity of training data. The OSM data for Kollam contain about 1100 building polygons in the training tiles, and additional building footprints were manually digitised within the five training tiles to compensate for missing labels and to correct some erroneous footprints in the OSM data. Using transfer learning makes more sense than training from scratch with label data from Kollam alone, as rooftop configurations (such as colour and shape) are similar to those in Palakkad, thus allowing us to avoid longer training runtimes (Xu et al. 2013). Transfer learning helps accomplish faster and seamless detection of buildings in new study areas with just a few training samples, allowing for effective transferability of the model to other relatively similar regions. This ability to detect buildings in a new and completely unseen environment makes the use of such deep networks very advantageous. Using transfer learning from the weights learnt in Palakkad, the model trained over Kollam achieved an F1 score of 74.6%. The predicted buildings for Kollam can be seen in Fig. 8a (red polygons).

Fig. 8 Results of the building detection in Kollam. a Detected buildings data. b The existing tags from OSM. c The overall cluster values after K-means clusterisation

The morphological metrics from Momepy were used to group the individual buildings into eight clusters (see Table 4 and Fig. 8c). The available building tags from OSM and Google Earth were linked to the individual building footprints, as shown in Fig. 8b.

Similar to Palakkad, linear features were used to improve the homogenisation of the existing clusters. These linear features can be seen as edge boundaries of each homogeneous urban unit in Fig. 9. From the DL detection point of view, the Kollam buildings were predicted with the weights learnt in Palakkad as well as trained with new building training samples, which made the predictions far better than those of Palakkad.

Fig. 9 Occupancy types of the homogeneous urban units in Kollam with the respective unit numbers

Figure 9 shows the final classification of the occupancy types of Kollam. This result was obtained by employing the same majority classification system as in Palakkad. Validation by local experts was also performed for Kollam to examine, improve, and refine the occupancy type classification. The procedure to derive the homogeneous units in Kollam was timed to analyse how fast such an analysis can be done. The steps of downloading the auxiliary data, generating building footprints using DL, applying the Momepy metrics and classifying the clusters, and amalgamating the auxiliary data and interpreting them through the classification system together took around one day. Despite differences in the availability of auxiliary data, the final results were comparable in the two study areas. Moreover, with additional input from the local experts and stakeholders through online interviews, it was possible to refine the classification further (refer to Fig. 12a).

6 Exposure assessment

The resulting urban classification maps of this study are intended as key input for the exposure, vulnerability and risk assessment for hazardous events such as flooding. The exposure analysis aims to calculate the number of exposed buildings, their spatial distribution, and typological attributes based on the occupancy type. This information can be used in combination with hazard intensity maps (such as flood depth) and physical vulnerability curves that are linked to the occupancy types. Exposure was calculated in different ways: as the percentage of the homogeneous unit, the percentage of the buildings in the unit, the number of buildings in the unit, and the floor space in the unit. Figure 10 gives an example of the flood exposure maps for the two locations, indicating the percentage of the homogeneous units exposed and the percentage of the buildings exposed. In total, 21 homogeneous units are exposed to flooding in Palakkad and 18 in Kollam.

Fig. 10 Flood exposure maps. a and c Percentage of homogeneous units exposed in Palakkad (a) and Kollam (c); b and d percentage of buildings per homogeneous unit exposed in Palakkad (b) and Kollam (d)

The percentage exposure at the homogeneous unit level and the percentage of exposed buildings per unit show some interesting differences (see Fig. 10). The main reason is the non-uniform spatial distribution of the building footprints within the homogeneous units, which indicates that some units may be too large to be truly homogeneous and may require further subdivision. A good example of this phenomenon is given in Fig. 11: due to the uneven distribution of buildings within the unit, 19% of the buildings are exposed, whereas 37% of the entire homogeneous unit is exposed to flooding. For this reason, the final exposure assessment results are the footprint-level values aggregated at the homogeneous unit level.

Fig. 11 Flood exposure to homogeneous units against the building footprints in Palakkad

7 Discussion

As discussed in Section 4.1, the best result was given by beta = 0.7, where the accuracy peaked in terms of the highest F1 score and the lowest loss by targeting the building pixels affected by FNs. Greater beta weights prevent the model from learning the training data adequately; as a result, the loss value plateaus during training, thereby decreasing the total accuracy. Even if the Recall increases, adding greater weights does not guarantee that the class imbalance will be handled linearly, as the accompanying drop in Precision lowers the F1 score across all batch sizes. Therefore, the combination of hyper-parameters that gave the best balance between Precision and Recall was used for the final training. Furthermore, the high loss values obtained with the lower learning rates of 1e-4 and 1e-5 demonstrate that such learning rates are ineffective in updating the learned weights and are unable to appropriately optimize training with the existing data. With lower learning rates, training progresses slowly because each update to the weights in the neural network is tiny. This reduces the model's overall capacity to train optimally and obstructs its potential to achieve better accuracy.
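
The role of the beta weight can be made concrete with a small sketch of the Tversky loss on a handful of pixels. The form used here, 1 − TP / (TP + alpha·FP + beta·FN) with alpha = 1 − beta, is the common formulation and is an assumption about the exact variant used in training; the function name and the smoothing term eps are also illustrative.

```python
def tversky_loss(y_true, y_prob, beta=0.7, eps=1e-7):
    """Tversky loss for binary masks given as flat sequences.

    beta weights false negatives; alpha = 1 - beta (assumed) weights
    false positives. With beta = 0.5 this reduces to the Dice loss.
    """
    alpha = 1.0 - beta
    tp = sum(t * p for t, p in zip(y_true, y_prob))
    fp = sum((1 - t) * p for t, p in zip(y_true, y_prob))
    fn = sum(t * (1 - p) for t, p in zip(y_true, y_prob))
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)


truth = [1, 1, 0, 0]           # two building pixels, two background pixels
missed = [1, 0, 0, 0]          # one building pixel missed (a false negative)
false_alarm = [1, 1, 1, 0]     # one background pixel flagged (a false positive)

loss_missed_dice = tversky_loss(truth, missed, beta=0.5)
loss_missed_b07 = tversky_loss(truth, missed, beta=0.7)
loss_alarm_dice = tversky_loss(truth, false_alarm, beta=0.5)
loss_alarm_b07 = tversky_loss(truth, false_alarm, beta=0.7)
```

Raising beta from 0.5 to 0.7 increases the loss for the prediction with a missed building pixel and decreases it for the prediction with a false alarm. This is consistent with higher beta favouring Recall at the expense of Precision.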

There were quite a few FP predictions due to the similar spectral characteristics of building roofs and roads in Palakkad. This is also reflected in the low Precision scores seen in Table 2. However, with appropriate post-classification removal of these FPs, such non-building artefacts could be easily eliminated.

As the cluster values of each homogeneous unit depict the morphological characteristics of buildings, some units showed low homogeneity scores (e.g., units 2, 3, and 4 in Table 6). A reason for this is that some predictions made by the DL model result in irregular polygonal building features, which affect the calculation of the building morphological metrics. Fan (2014a) and Qi and Li (2008) also state that homogeneity scoring is suitable for representing similar buildings but is dependent on the detail of the polygonal geometry of the buildings.

The use of an urban land use map, which was available for the study areas, is not a requirement for the methodology. Although such a map is a very useful input, the auxiliary information from OSM and Google Maps tags, the building morphological information based on the spatial characteristics, and the homogeneity scores of the morphological clusters are decisive in determining the classification of the buildings within each homogeneous unit. The approach is therefore also applicable in areas where an urban land use map is not available. Likewise, the knowledge of local experts, although extremely useful, is not essential to the methodology. Their suggestions helped to improve the classification of the built-up area into building occupancy types, although challenges such as the subjective classification of buildings were encountered. Involving local experts in the interpretation of the automated procedure is a useful alternative to time-consuming ground surveys using VGI. As this method aims to provide fast results, expert involvement is an important add-on to the automated procedure.

The detail of characterisation of the buildings is another point for discussion. Whereas it is possible to map buildings at an individual level, the approach outlined in this research does not allow the characterisation of each individual building. Therefore, the characterisation was carried out at the homogeneous unit level to counteract this drawback. This, however, raised another issue: the rather coarse homogeneous units still show considerable variation in building density, implying that they are not so homogeneous after all and could perhaps be subdivided further.

For each of the homogeneous units, important characteristics for the evaluation of exposure, vulnerability, and risk to natural and anthropogenic disasters were obtained: occupancy types, number of buildings, average number of floors, total floor space area, and percentage of built-up area. These can be used to estimate population data per homogeneous unit (Lwin and Murayama 2009), thus also forming the basis for population exposure and risk assessment.
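
One simple way to derive population per unit from these attributes is to distribute a known population total in proportion to floor space. The sketch below illustrates that idea only; it is a simplification and not necessarily the formulation of the cited work. The function name, the unit identifiers and all numbers are hypothetical, and in practice only residential floor space would normally be counted.

```python
def allocate_population(total_population, floor_space_by_unit):
    """Split a population total across units in proportion to floor space."""
    total_floor_space = sum(floor_space_by_unit.values())
    return {
        unit: total_population * floor_space / total_floor_space
        for unit, floor_space in floor_space_by_unit.items()
    }


# Hypothetical ward of 1000 inhabitants covering three homogeneous units.
residential_floor_space_m2 = {"unit_1": 6000.0, "unit_2": 3000.0, "unit_3": 1000.0}
population = allocate_population(1000, residential_floor_space_m2)
```

Combining such estimates with the exposure percentages per unit would give a first approximation of the exposed population.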

In the use of OSM data, some well-known problems were encountered concerning positional accuracy, data quality, and lack of attribute information. Tags for residential buildings are mostly not available, resulting in a strong bias towards non-residential buildings, with an emphasis on commercial buildings. Even when building tags are available for other uses, for example schools or religious buildings, the application of the majority rule per homogeneous unit often means that these are outnumbered by other land uses. Because of the small number of individual tags present within a homogeneous unit, the majority tag might not always be the best representation of the actual building occupancy, particularly as certain units are rather large in comparison to the relatively small number of buildings they contain.

Another limitation is the interpretation of the classification system, which can change between different types of settlements and countries and hence would require local validation every time. For this reason, the methodology cannot be fully automated, as there will always be a point where validation with local knowledge is necessary to authenticate the results. A possible solution is to streamline the rules for different types of settlements and countries through organisational efforts at a meta-level. This is beyond the scope of this study, as it would require a large sample of case study cities. Nevertheless, the agreement of the local stakeholders on the various occupancy types in Palakkad spoke in favour of the methodology's overall applicability in data-scarce regions, which encouraged testing the reproducibility of the approach in Kollam.

One of the crucial aims of this methodology is to link the occupancy types of buildings to their physical vulnerability. A main requirement for calculating vulnerability is to link these occupancy types to physical vulnerability curves, such as the global flood depth-damage curves reported by Huizinga et al. (2017). The method proposed in this study can be applied for rapid initial elements-at-risk characterisation at a regional to city scale, and the results of the exposure and vulnerability assessment can subsequently be used in loss estimation, risk assessment, and the planning of measures and policies to reduce, mitigate, and avoid the risk of hazards.
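
Linking an occupancy type to a depth-damage curve amounts to a table lookup with interpolation. The sketch below shows this with piecewise-linear curves; the curve points are invented for illustration and are not the values reported by Huizinga et al. (2017), and the function and variable names are hypothetical.

```python
from bisect import bisect_right

# Illustrative curves only: (flood depth in m, damage fraction between 0 and 1).
CURVES = {
    "residential": [(0.0, 0.0), (1.0, 0.4), (3.0, 0.8), (6.0, 1.0)],
    "commercial": [(0.0, 0.0), (1.0, 0.3), (3.0, 0.7), (6.0, 1.0)],
}


def damage_fraction(depth_m, curve):
    """Linearly interpolate the damage fraction for a given flood depth."""
    depths = [d for d, _ in curve]
    if depth_m <= depths[0]:
        return curve[0][1]   # at or below the first point of the curve
    if depth_m >= depths[-1]:
        return curve[-1][1]  # at or beyond the last point of the curve
    i = bisect_right(depths, depth_m)
    (d0, f0), (d1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (depth_m - d0) / (d1 - d0)


# Damage fraction for a residential unit flooded to a depth of 2 m.
fraction = damage_fraction(2.0, CURVES["residential"])
```

Multiplying such a fraction by the exposed floor space and a unit value per occupancy type would yield a first-order loss estimate per homogeneous unit.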

8 Conclusion

The research attempted to provide a fast and preliminary approach for elements-at-risk mapping by developing a semi-automated detection and characterisation method. The research objectives were achieved by first detecting buildings in Palakkad with an F1 score of 76%, followed by homogenising the buildings into units with linear features such as road networks as boundaries. The building morphological characteristics were then assessed using the Momepy approach, and the results were used to develop a number of clusters with similar building characteristics. These were combined with auxiliary information such as building tags from OSM and Google Maps, and a classification system was applied to determine the main occupancy type of the homogeneous units. Moreover, we also tested the reproducibility of the methodology in a different city, where we achieved an F1 score of 74% in building detection and building occupancy type as the characterisation output. The building maps were then used to quantify flood exposure.

This study is one of the first attempts to show that EaR information in the form of building occupancy types can be obtained using remote sensing image data in combination with freely available geotag and OSM data, by means of state-of-the-art DL models, open-source remote sensing products, and validation by local experts and stakeholders. Such data can be extremely relevant for flood exposure assessment, providing information such as building density, the average number of floors, and total floor space area, and can be used to support risk assessment and risk reduction measures.

However, certain challenges remain, such as the availability and accessibility of quality open-source data in other countries, the need for local experts to address and refine the building occupancy type classifications, and the inclusion of AI in fully automatising the classification system. Nevertheless, the research does enable the development of databases for buildings as EaR in data-scarce regions, which is the first step for estimating hazard vulnerability, risk assessment, rescue missions, and rehabilitation.

This methodology also has implications for dasymetric mapping in developing nations or regions that lack building typological information. With our approach, it became evident that OSM labels for building tags are critical, but that such information was lacking in some areas of the cities of Palakkad and Kollam, emphasizing the need to update building tag information and make it publicly available for further research. Another significant aspect of the study was to demonstrate the rapid mapping of buildings using open-source data in real-world crises, in order to swiftly develop an EaR database for effective risk reduction and disaster relief efforts. In the future, we will experiment with better-curated data (for example the WSF-3D) and more complex characterisation algorithms to fully automatise this approach. We also aim to classify more attributes apart from the occupancy types, such as building materials, for better use in vulnerability assessment. Furthermore, efforts will be spent on scaling this approach to a larger number of hazards, including landslides, for multi-hazard exposure and vulnerability assessment.