1 Introduction

The European Floods Directive defines flood risk as “the combination of the probability of a flood event and of the potential adverse impacts on human health, the environment, cultural heritage and economic activity associated with a flood event” (European Parliament 2007). Several other definitions also agree that flood risk is the probability of a certain flood event, combined with its impact (HR Wallingford et al. 2006). Kron (2002) regarded flood risk as being determined by the nature of the hazard and its probability of occurrence, the people and assets that are potentially exposed to the hazard, and their vulnerability. When an exposed population comes into contact with a hazard, their vulnerability will determine the impacts. Sufficient understanding of the risk can help decision makers to adopt adequate measures for flood damage reduction (James and Hall 1986). Gain et al. (2015) have argued for the need to integrate a range of impacts to fully understand flood risk and did so for a part of Dhaka. The effectiveness of alternative measures can be evaluated by considering the reduction in risk that proceeds from their implementation. Therefore, flood risk evaluation requires a proper understanding of the consequences and probability of flood events.

White (1945) considered that flooding causes four types of impacts on communities: (a) damage to physical property; (b) interruption of the production of goods and services; (c) loss or impairment of human life; and (d) reoccupation and rehabilitation of flooded areas. Many studies have adopted these concepts and reclassified them as combinations of tangible, intangible, direct and indirect damage. A tangible damage is one that can be assessed in monetary terms (Smith and Ward 1998). Similarly, Messner et al. (2007) defined tangible damage as one that can be “easily specified in monetary terms.” Intangible damages are those, in contrast, which cannot be so easily specified. Tangible damage includes the cost of physical damage to property, or the loss of business due to interruptions in the economy during and after a flood event. Intangible damage might include the loss of human life, or the increased burden of disease. The distinction between the two is not entirely clear, as some authors have attempted to “monetise” intangible losses, using concepts such as willingness to pay (WTP; Lekuthai and Vongvisessomjai 2001). Parker (2000) made a subtle distinction in describing intangible losses as those where monetary estimates are considered to be undesirable or unacceptable.

A direct damage can be defined as any loss that is caused by the physical contact of flood water with humans, property and the environment. An indirect damage is induced by the direct impacts and may occur, in space or time, outside the flood event. Jonkman et al. (2008) referred to direct losses as occurring within the flooded area, and indirect damage as occurring outside of the flooded area. Messner et al. (2007) stated that direct damage is usually measured as damage to stock, whereas an indirect damage relates to interruptions to flows and linkages. There are few applications where researchers have incorporated these indirect damages to estimate the total risk, and where they do, they have often resorted to making coarse assumptions (Gain et al. 2015; Giupponi et al. 2015).

Vetere Arellano et al. (2003) and Meyer and Messner (2005) described different approaches that are used in European countries to evaluate flood damage. The differences in approach relate to (a) the objective of the evaluation, (b) the damage categories considered, (c) the level of detail, (d) the scale of the analysis and (e) the basic evaluation principles. Some hazard analyses further associate the expected damage with the probability of flooding such that the risk can be monetised (Merz et al. 2004). Messner et al. (2007) produced guidelines for practitioners on ex-ante flood damage evaluation as part of the European Commission-funded FLOODsite project. Another European research project ConHaz (Cost of Natural Hazards; Green et al. 2011) specifically aimed at synthesising current cost assessment methods and strengthening the role of cost assessment in the development of natural hazard management, including droughts, floods, storms, coastal hazards, alpine hazards and adaptation planning. Hammond et al. (2015) reviewed the state-of-the-art in urban flood impact assessment and described the strengths and weaknesses of the different techniques. These insights have been incorporated in the development of the damage assessment tools presented in this paper.

The expected flood damage will be influenced by, among other factors, the land use and the economic value of the associated human activities. White (1964) associated the land use with functions that relate property damage to flood depth. These depth-damage functions have become the standard approach in flood damage or impact assessment (Smith 1994). Although depth is the most commonly considered variable in damage functions, other flood characteristics such as the flood duration and flow velocities are also considered (McBean et al. 1998). Smith (1994) and Dutta et al. (2003) estimated the flood damage using the hazard map with depth-damage functions. In earlier applications, the lack of high-resolution data often restricted the damage evaluation to a broad scale, such as land use zoning in a city.

Damage functions can be developed through empirical approaches, which use historical data to develop the relationship between the inundation characteristics and the damage. The empirical approach has been used in multiple studies and relies on the existence of reliable historical flood damage data. Examples of such studies include work in Brazil (Nascimento et al. 2007), which used a reference flood event from 2000, work in Japan (Dutta et al. 2003) and work in Germany (Merz et al. 2004). If such empirical data exist, they should be used, even if in conjunction with a synthetic approach. Such damage data can be collected either from official agencies or from insurance companies. There is, however, a lack of reliable, consistent and comparable flood damage data (Elmer et al. 2010).

The other approach is described as synthetic, such as the UK’s Multi-coloured Manual (MCM; Penning-Rowsell et al. 2010), where expertise was used to build a database of absolute damage curves for over 100 building types. The synthetic approach does not mean that it is artificial (Messner et al. 2007); the creation of the damage curves often involves the synthesis of all data, including historical data. Some authors have differed slightly on the meaning of this term. Merz et al. (2010) have described this approach as being developed by applying “what-if” scenarios and argued that the synthetic approach and the empirical approach can be combined, whereas Penning-Rowsell et al. (2010) would see this as a synthetic approach, with the empirical and synthetic approaches being mutually exclusive.

Advances in geographic information systems (GIS) and computing power have improved access to detailed spatial data and modelling results. Hénonin (2007) developed a toolbox in ArcGIS to identify the buildings which had edges within a buffer distance to flooded areas, obtained from the results of hydraulic modelling with Danish Hydraulic Institute (DHI) MIKE Flood. Hence, the total number of affected buildings can be counted as a measure of the flood damage. Qi and Altinakar (2011) developed a GIS-based decision support system to evaluate the loss of life and flood damage due to a dam failure scenario. Alexander et al. (2011) developed a GIS tool using Visual Basic on the ESRI ArcMap platform that can present information including hazard, vulnerability, financial losses and risk to life. Two hazard models were applied to assess the damage: (a) risk to life, classified based on the combination of flood depth and velocity (Priest et al. 2008); and (b) flood damage, based on the depth-damage functions from the MCM (Penning-Rowsell et al. 2010). Balica et al. (2013) developed a GIS model to calculate flood risk using a physically based model and compared the results to the flood vulnerability index (FVI) obtained using a parametric approach. Although the FVI model provides a quick and reliable assessment, the information is not detailed enough for engineering design or project-level decisions. Much of the literature discusses the calculation of flood damage; nevertheless, very few studies have proposed a methodology that can be efficiently applied to different case studies with different building data and hydraulic modelling results. Meanwhile, some approaches combine the land use regions and the average flooded depth from hydraulic modelling results to evaluate the damage (Hsu et al. 2011; Smith 1994). We argue that since the damage functions are nonlinear, the use of an average value could lead to inaccurate estimations.
Innovyze (2014) recently announced InfoWorks ICM RiskMaster that utilises the InfoWorks ICM hydraulic modelling results to calculate the flood damage of individual events and the expected annual damage (EAD), which provides a simple solution to associate the flood hazard with the risk. The approach uses an irregular mesh as the computing unit and has the advantage that it can fit building layouts. However, it may not be directly applicable for hydraulic models using regular grids.

The Collaborative Research on Flood Risk in Urban Areas (CORFU) project was established to develop and investigate new integrative and adaptable flood management plans under different scenarios of relevant drivers: urban development, socioeconomic trends and climate change (Djordjević et al. 2011). It is hoped that the results of this project will enable more scientifically sound policies for the management of the consequences of urban flooding. To achieve this, the project aimed to quantify the cost-effectiveness of different resilience measures.

In the CORFU project, a framework for flood damage assessment has been developed, including direct tangible, indirect tangible and intangible damage, that can be applied to different Asian and European cities. All the case studies have different levels of data availability and quality, such that flexible tools that can deal with such variety are required for the applications. Furthermore, although associating flood depth, depth-damage relationships and land uses has been widely applied to flood damage assessment, to the authors’ best knowledge, none of the existing models or tools has been implemented to evaluate the flood damage for building parcels at a mega city scale (e.g. a city with a population of more than 10 million; United Nations 2004). The urban development in developing countries is so rapid that it is difficult to predict the layouts of future building parcels for areas that have not been developed yet. Hence, the assessment is more challenging for future urban development scenarios because detailed parcel information is lacking.

In this paper, we describe the challenges of dealing with various input data, the limitations of existing commercial software, and the solutions to these problems, which are implemented in the tools we developed to provide efficient assessments. The innovation of the tools is that they are developed to work with different data formats and resolutions for a broad range of applications. The methods applied, the data required, the technical issues encountered and their solutions are presented in the following sections. We have applied the tools to estimate the flood damage to buildings for the current and the future urban development conditions in Dhaka City, which has more than 1 million building parcels. The sensitivity and performance of model applications are also investigated and discussed.

2 Materials and methods

As the CORFU project has adopted an integrated framework, most of the case study cities have adopted DHI MIKE 21 for overland flow modelling, which uses a raster grid. A series of flood impact assessment tools have been developed, which can be integrated with the results of hydraulic models. These tools use Python scripts, Fortran executables and the geoprocessing functions within ESRI ArcGIS software. The tools are designed to minimise the manual input required to calculate the flood damage based on hydraulic modelling results and other supplemental information. Most GIS software packages such as QGIS (QGIS Project 2014a) and GRASS (GRASS Development Team 2014) also provide similar geoprocessing functions and support integration with Python (Glynn and Martin 2014; QGIS Project 2014b). The Python scripts can be easily adapted by changing the geoprocessing functions to the corresponding functions in other GIS software packages, which would allow the future application of the tools on different platforms (see Footnote 1).

Some algorithms applied herein are difficult or inefficient to implement in a GIS environment (e.g. damage calculation in Sect. 2.1.2 and raster data aggregation in Sect. 2.1.5); therefore, separate executable programs were developed to provide these functions. Figure 1 shows the damage assessment framework we have adopted. The numbers along the arrow lines represent the subsections below, which describe details of particular issues encountered during the analysis and the tool functions for solving the problems. In general, by combining the hazard characteristics with the location of an object (a building parcel or a zoning polygon), the object’s exposure to a hazard can be determined. The vulnerability of an object depends on its attributes. By associating the exposure and the vulnerability information with the hazard-vulnerability relationships (e.g. flood depth-damage or contamination-mortality functions) in a city, the impact (e.g. damage, health impact) of a hazard on individual objects within the city can be estimated. Considering multiple hazard events of different probabilities, the EAD can be determined, which can help the authorities understand the risk associated with a hazard, implement countermeasures, and evaluate the effectiveness of these measures.

Fig. 1 The damage assessment framework (numbers along the arrow lines represent the subsections below about relevant issues in the analysis)

Future urban growth scenarios can be projected by using urban growth models (UGMs) and represented as land use and land cover (LULC) maps, where cells represent discrete land use classes in contrast to the polygon representation of real buildings such as schools, hospitals and factories. By comparing the distribution of building objects with the land use classes for a baseline year, a relationship can be estimated and then projected into the future with the urban growth models which will feed into the LULC-based vulnerability and impact assessment. Likewise, the risk of hazard under future scenarios can be assessed such that better urban planning can be achieved to improve the city’s resilience to hazard.

The standard GIS data format is adopted for the inputs and the outputs of the standalone executable programs, to allow seamless integration with other tools and functions. In this paper, we developed a methodology that could be applied to assess different types of damage using flood maps produced by various hydraulic models. In general, the calculation of direct flood damage requires basic information including hazard characteristics, object attributes and damage functions. Once the hazard characteristics that relate to an object are known, the impact can be estimated using the damage function associated with its attribute.

The damage assessment tools can be executed in the command-line-style environment of the Python window. The dialogues use the ArcToolbox interface to collect the required input data and allow a user to change some modelling parameters. Once a user has completed the input, the toolbox calls the Python script to execute the whole algorithm, and the final result is added to the GIS window automatically.

2.1 Functions

2.1.1 Format conversion

Spatial information can be provided in either vector (polygon, polyline or point) or raster formats. The parcel objects and the zoning data are often in polygon format. Both irregular meshes (polygons) and regular grids are commonly used in hydraulic modelling. In this study, the outputs from MIKE 21 are in raster format, and the outputs from MIKE 21 FM can be converted to raster format as well, which allows the data to be processed directly in the ArcGIS environment.

Mismatches between vector and raster data make the direct comparison between different datasets difficult. For example, Fig. 2 shows building polygons that extend beyond a single raster cell. Equally, a raster grid cell can also contain more than one building object. Selecting a representative value to describe the flood depth for a building is therefore difficult. Furthermore, depth-damage functions are typically nonlinear. The damage might be very low at a shallow flood depth and increase exponentially when the depth reaches a certain range and then asymptotically reaches a maximum value. Because of this nonlinearity, calculating the damage using the average flood depth over the domain gives a different result from the one that uses the depth for each cell. Hence, it is necessary to convert the data into the same format with a sufficiently high resolution to avoid losing too much information.
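
The effect of this nonlinearity can be illustrated with a minimal sketch. The S-shaped curve, its parameters and the cell depths below are hypothetical values chosen for illustration only; they are not the damage functions used in this study.

```python
import math


def damage_fraction(depth_m):
    """Hypothetical S-shaped depth-damage curve, as a fraction of maximum damage."""
    if depth_m <= 0.0:
        return 0.0  # dry cells suffer no damage
    # Logistic shape: low damage when shallow, steep rise around 1 m, then saturation.
    return 1.0 / (1.0 + math.exp(-4.0 * (depth_m - 1.0)))


# Illustrative flood depths (m) in the four cells covered by one building.
cell_depths = [0.0, 0.2, 0.4, 2.0]

mean_depth = sum(cell_depths) / len(cell_depths)
damage_from_mean_depth = damage_fraction(mean_depth)
mean_of_cell_damages = sum(damage_fraction(d) for d in cell_depths) / len(cell_depths)

print(f"damage at the mean depth: {damage_from_mean_depth:.3f}")
print(f"mean of per-cell damage:  {mean_of_cell_damages:.3f}")
```

With these assumed values the two estimates differ noticeably, because the single deep cell contributes far more damage than its share of the mean depth suggests.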

Fig. 2 The building objects (coloured polygons) and modelling cells (black regular grid lines) do not have a one-to-one relationship

The raster grid represents data values that can vary rapidly from their neighbours. The conversion from raster to vector would result in a huge number of vectors that would be less efficient for both data storage and processing when a large case study is considered. Therefore, when the input data include both vector and raster formats, the tools automatically convert vector data into a raster format for data processing. They also perform the reverse conversion from raster to vector when the analysis is finished, to present the results for individual vectors such as buildings.

2.1.2 Damage calculation

By overlapping the hazard characteristics and object attributes, the tools will associate the damage function for each spatial location to calculate the damage. ArcGIS has a look-up table functionality, although it can only be used to retrieve a value (the damage) for a corresponding modelled value (flood depth). This built-in function cannot be used to calculate the flood damage by interpolating the values between different flood depths. As the hazard characteristics such as flood depth vary in small increments, interpolation is necessary to determine the proper damage values for the depth between two sampling points defined in the look-up tables. An external program is developed for that purpose.
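
The interpolation step can be sketched as follows. Linear interpolation between tabulated points, clamping outside the table, the function name `interpolate_damage` and the table values are all assumptions made for illustration; the text does not specify the scheme used by the external program.

```python
from bisect import bisect_right


def interpolate_damage(depth, depths, damages):
    """Linearly interpolate damage between tabulated depths; clamp outside the table."""
    if depth <= depths[0]:
        return damages[0]
    if depth >= depths[-1]:
        return damages[-1]
    i = bisect_right(depths, depth)  # depths[i - 1] <= depth < depths[i]
    d0, d1 = depths[i - 1], depths[i]
    v0, v1 = damages[i - 1], damages[i]
    return v0 + (v1 - v0) * (depth - d0) / (d1 - d0)


# Hypothetical look-up table: depth (m) against damage per unit area.
table_depths = [0.0, 0.5, 1.0, 2.0]
table_damages = [0.0, 10.0, 40.0, 60.0]

# A modelled depth of 0.75 m falls between the 0.5 m and 1.0 m sampling points.
damage_at_075 = interpolate_damage(0.75, table_depths, table_damages)
```

A plain table look-up could only return the value at 0.5 m or at 1.0 m here, whereas the interpolated value lies between the two.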

When a parcel or a zoning polygon includes multiple vulnerability groups (e.g. mixed-use building, different age groups in an area), the components of these various factors will affect the damage calculation. The combination of factors may vary from building to building (e.g. area ratios of different uses) such that it would be difficult to define unique damage functions for every individual building. To simplify the input, the damage functions per unit area are used together with the ratios of different factors at a location to estimate the total damage and the components of that damage.
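
A minimal sketch of this per-unit-area approach is given below. The function name `mixed_use_damage`, the use categories and all numbers are hypothetical, and the unit damage values are assumed to have already been evaluated at the local flood depth.

```python
def mixed_use_damage(area_m2, use_ratios, unit_damage_by_use):
    """Return (total, components) of damage for an area shared by several uses.

    use_ratios: share of the area occupied by each use (fractions of the area).
    unit_damage_by_use: damage per square metre for each use at the local depth.
    """
    components = {
        use: area_m2 * ratio * unit_damage_by_use[use]
        for use, ratio in use_ratios.items()
    }
    return sum(components.values()), components


# Illustrative mixed-use location: 100 m2, three quarters residential.
total, parts = mixed_use_damage(
    100.0,
    {"residential": 0.75, "commercial": 0.25},
    {"residential": 20.0, "commercial": 80.0},
)
```

Only one damage function per use is needed in this arrangement; the building-specific information is carried entirely by the area ratios.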

The tools can calculate the damage to all cells within an object (a building parcel or a zoning polygon) and add the values together to get the total damage for each object. In the CORFU project, the impacts of flood events are analysed, so that strategies to improve the resilience of a city can be proposed and evaluated. A number of strategies can be implemented, and cost–benefit evaluation can support the choice of an optimum solution. Normally, such evaluation uses the reduction in EAD as the main indicator to show the benefit of portfolios of measures. Hence, the damage assessment tool takes multiple events of different return periods as inputs, determines the damage produced by each event and, at the same time, calculates the EAD, weighted by the probabilities of the events, for each object. The tools can summarise the damage for different categories of the object attributes to provide the overall results of the analysis.
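
One common way to weight event damage by probability is to integrate the damage over the annual exceedance probability with the trapezoidal rule. The sketch below assumes that scheme, which the text does not confirm, and uses hypothetical return periods and damage values.

```python
def expected_annual_damage(return_periods, damages):
    """Trapezoidal integration of damage over annual exceedance probability.

    Assumption: only the probability range spanned by the given events
    contributes; nothing is extrapolated beyond the rarest or most frequent event.
    """
    pairs = sorted((1.0 / t, d) for t, d in zip(return_periods, damages))
    ead = 0.0
    for (p0, d0), (p1, d1) in zip(pairs, pairs[1:]):
        ead += 0.5 * (d0 + d1) * (p1 - p0)
    return ead


# Illustrative damage to one object for 10-, 50- and 100-year events.
ead = expected_annual_damage([10.0, 50.0, 100.0], [100.0, 500.0, 900.0])
```

Applying the same function to the damage of every object gives a per-object EAD, which can then be summed by attribute category.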

2.1.3 Flood duration analysis

Flood duration is another factor that may affect the flood damage. Although flood duration might be analysed directly during the hydraulic simulation, it requires extra processing time that will reduce the computational efficiency. A more practical solution is to export snapshots of the simulated flood information at set time-steps, and to analyse the data when the simulation is complete. The damage assessment tools are capable of reading a series of raster grid results from DHI MIKE 21 and calculating the duration of the flood condition at any location within the modelling domain. The tools flag a location as flooded when the flood characteristics (e.g. depth, velocity) increase above the given thresholds in a temporal snapshot of modelling results. The flood duration is accumulated until the flood characteristics return below the thresholds in a later temporal snapshot. The information allows the damage to be calculated using duration-related hazard-vulnerability functions.
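
The accumulation of duration from snapshots can be sketched as below. The sketch assumes regularly spaced snapshots, counts one full interval for every snapshot in which the threshold is exceeded, and sums separate flooding episodes; the function name and the values are illustrative only.

```python
def flood_duration(snapshots, time_step_h, depth_threshold_m):
    """Hours for which each cell exceeds the depth threshold.

    snapshots: list of 2D grids (lists of rows) of flood depth in metres,
    exported at a regular interval of time_step_h hours.
    """
    rows, cols = len(snapshots[0]), len(snapshots[0][0])
    duration = [[0.0] * cols for _ in range(rows)]
    for grid in snapshots:
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] > depth_threshold_m:
                    duration[r][c] += time_step_h  # cell is flagged as flooded
    return duration


# Three half-hourly snapshots of a 1 x 2 grid, with a 0.1 m threshold.
snapshots = [[[0.0, 0.3]], [[0.2, 0.5]], [[0.05, 0.4]]]
duration = flood_duration(snapshots, 0.5, 0.1)
```

The temporal resolution of the result is limited by the snapshot interval, so a shorter interval gives a finer duration estimate at the cost of more exported files.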

2.1.4 Urban growth model land cover classification

In the CORFU project, a UGM has been developed (Veerbeek et al. 2015) to project future changes in land cover, based on historic LULC data and various terrain characteristics. The UGM also considers different drivers such as alternative growth rates, growth containment and zoning plans to produce possible future scenarios. However, the UGM can only provide land cover classifications, which represent the density of built-up areas and do not directly correspond to a specific land use. The UGM therefore cannot provide detailed information on building parcels. It is not possible to link exact building polygons to damage, and a more aggregated approach is required. Flood damage has to be assessed using land cover classes, and damage functions have to be determined for those classes.

The development density of a region is related to the area and the use of buildings within its extent. Therefore, analysis of building components in each land cover class can be used to provide weights for different building uses to determine new damage functions for a land cover class. We considered 2010 as the baseline year for the urban growth scenarios, which means the initial LULC distribution for future scenarios was calculated based on the observed (and classified) 2010 Landsat 5 TM data. For the baseline year, both the existing building information and the LULC are available, and the comparison of these two datasets can establish the relationship between them, i.e. to calibrate the estimation based on LULC. Assuming that the relationship between land cover classes and proportion of certain building types remains constant over time, the newly obtained damage functions can be applied to assess the flood damage in the future based on the UGM projections. If these relationships are not anticipated to be fixed, they can be manually adjusted. In addition, land cover information is also essential for flood simulation because it determines catchment imperviousness which is a key hydrological parameter used to estimate the changes in surface run-off for the future scenario.
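
The weighting of building-use damage functions into a function for a land cover class can be sketched as follows. The class composition, the curves and the function name `class_damage_curve` are hypothetical, and the weights are assumed to be the shares of the class area occupied by each building use in the baseline year.

```python
def class_damage_curve(use_area_share, curves_by_use):
    """Damage curve of a land cover class as an area-weighted sum of use curves.

    use_area_share: share of the class area occupied by each building use;
    the shares may sum to less than 1 when part of the class is not built on.
    curves_by_use: damage per unit area for each use at the same depth samples.
    """
    n_samples = len(next(iter(curves_by_use.values())))
    return [
        sum(share * curves_by_use[use][i] for use, share in use_area_share.items())
        for i in range(n_samples)
    ]


# Illustrative class: half residential, a quarter commercial, the rest open.
shares = {"residential": 0.5, "commercial": 0.25}
curves = {
    "residential": [0.0, 10.0, 20.0],
    "commercial": [0.0, 40.0, 80.0],
}
class_curve = class_damage_curve(shares, curves)
```

If the relationship between a class and its building mix is expected to change over time, only the shares need to be adjusted.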

2.1.5 Raster masking and aggregation

Using larger cells for the analysis will result in the overestimation of parcel areas if an object only occupies a small portion of a cell and the whole cell area is used to represent the small parcel. Even when the cell size for analysis is sufficiently fine to differentiate separate objects, the conversion from polygon to raster format may still introduce errors because the parcel boundaries do not align with the cell boundaries. Using smaller grid cells will reduce the error but also increase the computing time and the data storage requirements. An alternative solution is to create a mask that represents the existence of parcels at a fine resolution. Overlaying the mask raster on the other rasters efficiently clips out the non-object areas from the coarse analysis resolution.

However, the direct use of the Geoprocessing function “PolygonToRaster” within ESRI ArcGIS software to convert parcel polygons to raster format at a coarse resolution is inappropriate, because the non-parcel areas often dominate the area of a coarse cell, and parcels which have relatively small areas within a cell will be neglected. Figure 3 shows the conversion of the building polygon layouts in Fig. 2 to raster format at different resolutions. Figure 3j shows the layout of the building polygons. The building with “Commercial Activity” use (in green) disappears with a 25-m raster, shown in Fig. 3a, because the “non-building” part occupies most of the cell area. The same is observed for most of the residential buildings in the 25-m raster. The use of 5-m resolution in Fig. 3e gives a slightly better representation of building layouts; however, large building areas are still translated as non-building cells using the built-in “PolygonToRaster” function.

Fig. 3 The representations of building attributes using different raster resolutions, aggregating methods and masking. a 25 m raster converted from building polygons using ArcGIS built-in “Polygon to Raster” function. b 25 m raster aggregated from 1 m raster that is converted from building polygons using ArcGIS built-in “Aggregate” function. c 25 m raster aggregated from 1 m raster that is converted from building polygons using the “Aggregate” function developed in Fortran code. d Masked 1 m raster using the aggregated 25 m raster in (c). e 5 m raster converted from building polygons using ArcGIS built-in “Polygon to Raster” function. f 5 m raster aggregated from 1 m raster that is converted from building polygons using ArcGIS built-in “Aggregate” function. g 5 m raster aggregated from 1 m raster that is converted from building polygons using the “Aggregate” function developed in Fortran code. h Masked 1 m raster using the aggregated 5 m raster in (g). i 1 m raster converted from building polygons using ArcGIS built-in “Polygon to Raster” function. j Original building polygons

Therefore, an object mask raster at a finer resolution is created and the object attributes are also saved at the same resolution. Figure 3i shows the building use raster at 1-m resolution converted by the built-in “PolygonToRaster” function. The information is then aggregated back to a coarser raster scale for damage assessment. However, the built-in “Aggregate” function from the ArcGIS Geoprocessing toolbox only allows the use of the maximum, minimum and mean of the fine cell values to represent the new value in the coarse cell. The 25- and 5-m rasters produced using the built-in “Aggregate” function are shown in Fig. 3b, f, respectively. The attributes are not reflected properly in some cells because the function cannot identify the majority values of subareas within them.

Hence, an external Fortran program was developed to select the majority (mode) of the fine cell values as the new coarse cell value for raster data aggregation. Figure 3c, g shows the 25- and 5-m rasters, respectively, produced using the new aggregate function in the tools, which describe the attributes better. Nevertheless, the coarse raster covers large “non-building” areas, which will result in the overestimation of damage. Therefore, the object mask is applied again to clip out the “non-building” areas for the damage assessment. Figure 3d, h shows the masked 1-m rasters from the aggregated 25- and 5-m rasters, respectively. Both masked rasters have similar layouts of building attributes to Fig. 3i, and all the rasters closely represent the reality shown in Fig. 3j. The masking procedure can minimise the estimation error introduced by the differences in spatial resolution for analysis. It also avoids the redundant computing that the simple use of finer resolution data would entail (e.g. 25 repeated calculations for the 1-m fine cells within a 5-m cell).
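
The aggregation and masking steps can be sketched in Python as below; this is not the Fortran program itself. The sketch assumes that non-building cells carry a `NODATA` code and are ignored when the mode is selected, which is an interpretation of the text rather than a confirmed detail, and the grids are illustrative.

```python
from collections import Counter

NODATA = 0  # hypothetical code for "non-building" cells


def aggregate_majority(fine, factor):
    """Mode of the building values in each factor x factor block of a fine raster."""
    coarse = []
    for big_row in range(len(fine) // factor):
        row = []
        for big_col in range(len(fine[0]) // factor):
            block = [
                fine[big_row * factor + i][big_col * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            values = [v for v in block if v != NODATA]
            row.append(Counter(values).most_common(1)[0][0] if values else NODATA)
        coarse.append(row)
    return coarse


def apply_mask(coarse, fine, factor):
    """Spread coarse values back to the fine grid, keeping only building cells."""
    return [
        [coarse[r // factor][c // factor] if fine[r][c] != NODATA else NODATA
         for c in range(len(fine[0]))]
        for r in range(len(fine))
    ]


# Fine raster of building-use codes (2 rows x 4 columns), aggregated by a factor of 2.
fine_use = [[0, 0, 2, 2],
            [0, 1, 2, 3]]
coarse_use = aggregate_majority(fine_use, 2)
masked_use = apply_mask(coarse_use, fine_use, 2)
```

In this example the small building with code 1 survives aggregation even though non-building cells dominate its block, and the mask then removes the three non-building cells again.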

2.2 Data requirements

2.2.1 Hazard characteristics

The damage assessment tools can take two-dimensional (2D) hydraulic modelling results from either regular grid models, such as MIKE 21, or unstructured mesh models, such as MIKE 21 FM and InfoWorks 2D. For a given combination of rainfall, boundary and terrain conditions, a hydraulic model normally produces detailed information on flooding, including (a) maximum flood depths; (b) maximum flood velocities; (c) maximum concentrations of contamination; and (d) snapshots of flood depths, velocities and contaminations at selected times, all of which form the basis for direct damage assessment.

2.2.2 Object attributes

Locations and layouts of objects (building parcels, vulnerable groups) are essential for determining the damage at the parcel level based on their spatial relationship with hazards. Each object needs a unique index so that the damage to individual objects can be accounted for. Each object may have an attribute such as (a) building use, (b) LULC from the UGM or (c) population density, and the flood impact may vary for the same parcel if its attribute is different. Each attribute is associated with a damage function for a particular hazard characteristic so that the damage can be calculated.

Where an object has multiple attributes (e.g. mixed building uses), the components of various attributes can be used, and the damage will be calculated according to their area ratios to the parcel area. If the object information does not include any attribute, it is necessary to overlap objects with other data such as land use zoning to determine the attribute for impact assessment.

When the object information is absent, the direct application of the zoning data is also a possible option in the damage assessment tools. The zoning data may include non-object areas, and the applicability will depend on the definition of damage functions. To consider the flood impact in future scenarios, the model can take the raster-based LULC from a general UGM as input as well.

2.2.3 Damage functions

The damage functions that relate hazard and vulnerability provide the link to calculate the damage. For various types of damage assessment, the relationships can be (a) depth and damage, (b) velocity and safety, (c) dose–response, etc. Table 1 shows the hazard-vulnerability functions required for evaluating various types of damage.

Table 1 Flooding damage types and required hazard and vulnerability information for assessment

The most common flood damage assessment is that of direct tangible damage, which includes damage to residential properties, non-residential properties, technical infrastructure, vehicles and agriculture (Messner et al. 2007). Depth-damage curves (DDCs) representing the flood damage of various land use types at different flood depths are often used to determine the damage. In this study, the damage to residential and non-residential properties is considered as building content and structural damage, assuming that the DDCs are available for both types of properties. The damage to vehicles can be reflected in the DDCs or can be assessed separately, depending on the conditions of a case study. Technical infrastructure such as the transportation network and flood defences will also require a separate model for assessment. Agricultural products are not included in the study either, but there is no barrier to their inclusion. When flood duration is taken into account, more than one DDC could be applied to the same land use type because longer flooding may exacerbate the damage.

The same tools can also be used to assess the health impact. It may, however, be more appropriate to assess health impacts at the larger block or district scale because of the lack of data at the building level. The health impacts of flooding can be evaluated using demographic information and then applying, for each block or district, functions that relate the concentration of pathogens in flood water to the likelihood of falling ill (contamination-health impact curves), or functions that relate the depth and velocity of flood waters to mortality (depth mortality curves).

The tools can also be applied to damage assessment for other types of hazards. For example, the key elements of wind damage assessment are (1) wind map (hazard); (2) building locations or road network (exposure); (3) building or vehicle types (vulnerability); and (4) the relationships between wind speed and damage to buildings or vehicle safety (hazard-vulnerability). These elements fit in the framework, so the tools can be adopted directly to estimate the wind damage to buildings or to vehicles on the road.

2.3 Referencing coordinates

Different reference coordinates in different data sources may cause problems during the assessment. In the example shown in Fig. 4, the hydraulic model and the UGM have different referencing points that are 6.5 and 4.6 m apart from each other in the x and y direction, respectively. The two models use different grid resolutions, i.e. 25 m for the hydraulic model (grey solid grid lines) and 30 m for the UGM (red dashed grid lines). Consequently, a cell in the hydraulic model grid (grey cell) may have up to four different land cover values from the UGM because of the overlap at the edges (red cells). The same condition remains even if a finer 5-m resolution grid (green dashed grid lines in Fig. 4b) is used for analysis. To overcome this problem, the two models are aligned to the same referencing coordinate (Fig. 4c) so that the values of both coarse grids can be assigned to the finer 5-m grid directly (Fig. 4d) for further assessment.
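
The overlap argument can be checked numerically along one axis. The following is a minimal sketch, not the tools' own implementation: the function names are hypothetical, and only the grid sizes (25, 30 and 5 m) and the offsets (6.5 and 4.6 m) are taken from the text.

```python
import math


def overlapped_cells(lo, hi, origin, size):
    """Indices of the 1D grid cells (width `size`, grid starting at `origin`)
    that the half-open interval [lo, hi) overlaps."""
    first = math.floor((lo - origin) / size)
    last = math.ceil((hi - origin) / size) - 1
    return list(range(first, last + 1))


def max_overlap(cell_size, n_cells, cell_origin, other_origin, other_size):
    """Largest number of cells of the other grid touched by any single cell
    of the first grid, along one axis."""
    return max(
        len(overlapped_cells(cell_origin + i * cell_size,
                             cell_origin + (i + 1) * cell_size,
                             other_origin, other_size))
        for i in range(n_cells)
    )


# Origins 6.5 m (x) and 4.6 m (y) apart, as in the example of Fig. 4.
x_overlap = max_overlap(25, 12, 0.0, 6.5, 30)  # hydraulic cell vs UGM, x axis
y_overlap = max_overlap(25, 12, 0.0, 4.6, 30)  # hydraulic cell vs UGM, y axis
print("UGM cells per hydraulic cell (2D):", x_overlap * y_overlap)

# With aligned origins, a 5 m cell lies inside exactly one cell of each grid,
# because 5 divides both 25 and 30.
print(max_overlap(5, 60, 0.0, 0.0, 25), max_overlap(5, 60, 0.0, 0.0, 30))
```

The same check shows that aligning the origins alone does not remove the problem for the 25 m cells, since 25 m and 30 m cells do not nest.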

Fig. 4

The coarse grids for hydraulic modelling (black solid lines) and UGM (red dashed lines) and the fine grid for damage assessment (green dashed lines). a The coarse grids have different referencing coordinates, causing multiple land cover values in a flood cell. b The multiple values problem remains for the fine grid if the grids use different referencing coordinates. c The coarse grids using the same referencing coordinate still have multiple land cover values in a flood cell. d Single land cover and flood depth values can be assigned to each fine grid cell

3 Results

3.1 Case study area

In this paper, Dhaka in Bangladesh was adopted as the case study for the following reasons: (1) good availability of detailed parcel information and hydraulic modelling results; (2) a large area and number of buildings to push the boundaries of the modelling capacity of the tools; (3) dense building distribution to demonstrate the issues of analysing and masking cell sizes; (4) rapid urban expansion in the past and in the foreseeable future that highlights the need for urban growth modelling for planning; and (5) potential for other types of impact assessment (e.g. health impact).

The Greater Dhaka area includes areas beyond the central Dhaka City Corporation. It is bounded by the Balu River in the east, the Tongi Khal in the north, the Turag-Buriganga Rivers in the west and the Dhaka-Demra-Chittagong Road embankment in the south. The average population density in central Dhaka city is 48,000 inhabitants per km2. With rapid urbanisation and the development of city infrastructure, combined with the reduction in water storage and percolation areas, flooding and waterlogging from local rainfall have become a serious problem. Gain and Hoque (2013) recently combined 1D hydraulic modelling results and depth-vulnerability functions to evaluate the flood risk of the Eastern Dhaka Area, considering the fluvial flooding scenario only. In this paper, we adopted a 2D hydraulic modelling approach and included the Western Dhaka Area, which is also affected by pluvial flooding.

The drainage in Dhaka depends on the operation of a storm-water drainage system (including pumps and regulators) and the water levels in peripheral rivers (IWM 2008). Thus, flooding in Dhaka may occur due to: congestion of storm-water/wastewater drainage systems inside the city area; the high water level in the peripheral rivers under which circumstance drainage is only possible through pumping; and the intrusion of floodwater from the peripheral rivers to city area through the drainage routes.

3.2 Flood modelling

The DHI MIKE Urban model was applied to simulate the urban flooding. The drainage network for Central Dhaka contains a network of underground pipes and some open channels. In this network, there are 9.7 km of box culverts, 40 km of open channels and 134 km of pipes. The city drainage system was schematised from secondary data collection and stakeholder consultation. Accurate records of the drainage infrastructure were difficult to find, so information from different sources was used to cross-verify the model.

Within MIKE Urban, the 2D overland flow is simulated by MIKE 21 using a 25-m resolution regular grid, and terrain models were built for both Eastern and Central Dhaka. MOUSE and MIKE FLOOD models were used to simulate the storm sewer and the river channel flows, respectively. Both models were coupled with MIKE 21 to reflect the flow interactions between the ground surface, the drainage networks and the rivers.

3.3 Land use data

The Institute of Water Modelling (IWM), Bangladesh, collected the Detailed Area Plan (DAP) of 2006, which included information on 1.14 million building parcels in Dhaka, from the City Development Authority (RAJUK). IWM further reviewed and updated the data for new developments or any changes in parcel information within the study area. Figure 5 shows the current, i.e. the baseline year 2010, building layouts and uses in a local region. Most buildings are used as residential properties, followed by manufacturing and processing activities, mixed use and commercial activities. The IWM investigated different sectors in Dhaka and established the damage functions for six main building use types in Dhaka: commercial activity, education and research, governmental services, mixed use, manufacturing and processing, and residential.

Fig. 5

The building layouts and land use types in a local region in Dhaka for the baseline year

3.4 Depth-damage functions

The damage functions were developed by Haque et al. (2014) from a survey of 430 properties. The survey was conducted using a systematic stratified random sampling method based on the land elevation (flood depth) and structure of the building/premise. Questionnaires were designed for different types of properties to include all types of damage items with their coping mechanisms against waterlogging situations. All flood damage categories that could reasonably be expected to occur have been covered. The curve for commercial entities ranged from 4.56 BDT (currency of Bangladesh) per square foot for 5 in. depth of flooding to 33.28 BDT per square foot for 30 in. depth of flooding. Without considering flood duration, the residential asset damage starts at 4 in. depth; up to 4 in. of flood depth, there is no damage. When the flood depth increases to 40 in., the damage becomes 28 BDT per square foot. The values were converted into SI units for damage assessment in the paper. Due to the lack of historic data on actual damage after a flood event, it was not possible to carry out any validation. More details regarding the depth-damage functions adopted in the paper can be found in Haque et al. (2014).
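
The unit conversion and curve look-up can be illustrated as follows. This is a minimal sketch under stated assumptions: only the two commercial end points quoted above are used, the curve is assumed to be linear between them, damage is assumed to be zero below the first point and constant above the last, and all names are hypothetical. The actual curve shapes are given in Haque et al. (2014).

```python
FT2_PER_M2 = 1.0 / 0.09290304  # 1 ft^2 = 0.09290304 m^2 (exact)
M_PER_IN = 0.0254              # 1 in. = 0.0254 m (exact)

# End points quoted in the text for commercial entities:
# (flood depth in inches, damage in BDT per square foot).
COMMERCIAL_IMPERIAL = [(5.0, 4.56), (30.0, 33.28)]


def to_si(curve):
    """Convert (inches, BDT/ft^2) points to (metres, BDT/m^2)."""
    return [(d * M_PER_IN, v * FT2_PER_M2) for d, v in curve]


def damage_per_m2(depth_m, curve_si):
    """Piecewise-linear look-up. Assumptions: zero damage below the first
    point, constant damage above the last point."""
    if depth_m < curve_si[0][0]:
        return 0.0
    for (d0, v0), (d1, v1) in zip(curve_si, curve_si[1:]):
        if depth_m <= d1:
            return v0 + (v1 - v0) * (depth_m - d0) / (d1 - d0)
    return curve_si[-1][1]


commercial_si = to_si(COMMERCIAL_IMPERIAL)
for depth in (0.05, 0.127, 0.5, 0.762, 1.5):
    print(f"{depth:.3f} m -> {damage_per_m2(depth, commercial_si):.1f} BDT/m2")
```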

3.5 Urban growth

To evaluate the impact of urban growth on flooding, a UGM was developed based on a 2D cellular automaton in which cells expressing specific land use characteristics change state depending on internal growth characteristics and external pressures. LULC classification is based on 30 m cell Landsat 5 TM data, using a maximum-likelihood multi-temporal land cover classification (Bruzzone and Serpico 1997), after which manual corrections were applied (Veerbeek et al. 2015). A similar approach was also adopted by Corner et al. (2014) for the urban sprawl analysis (Dewan and Corner 2014) in Dhaka. The internal growth characteristics are derived using a supervised learning algorithm, which is trained on historic growth data. This means that the generic growth model is adapted to fit the local characteristics of a case study area. Furthermore, spatial constraints and external pressures (e.g. land demand) expressed within scenarios and measures create a probability distribution of future urbanisation, which is fulfilled in a probabilistic fashion. The UGM output is primarily a land use distribution for some future point in time. This is translated into possible land cover characteristics, which provide the spatial and physical characteristics required for the flood modelling and impact assessment. It should be noted that the output can include multiple instances in time for a given scenario and response portfolio; the model expresses the urban dynamics over a range of years.

Figure 6 shows the LULC in the area shown in Fig. 5 for the baseline year 2010, predicted by the UGM using historical data. A higher class number (darker colour) represents denser urban development. Consistent with the building data shown in Fig. 5, the north-eastern part of the region has a sparser building distribution, so the LULC classes in Fig. 6 are lower in that area. The detailed relationship between building use and land cover was analysed further. Table 2 shows the building components of each land cover class. The total built-up area of the six main building use categories varies from 3.1 % (class 1) to 56.4 % (class 10) of the land cover. These area ratios of building components were used as the weighting factors to combine the DDCs for different building uses and to generate new DDCs for each land cover class.
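
The weighting step can be written as an area-weighted sum of the building-use DDCs. The sketch below assumes all curves are sampled at the same depths; the ratios and damage values are hypothetical placeholders, not the values of Table 2 or of Haque et al. (2014).

```python
def combine_curves(area_ratios, curves):
    """Area-weighted sum of per-use damage curves.

    area_ratios: building use -> fraction of the land cover cell it occupies
                 (the fractions need not sum to 1; the rest is not built up).
    curves:      building use -> damage per unit building area, sampled at a
                 common list of flood depths.
    Returns damage per unit land area at the same depths.
    """
    n_depths = len(next(iter(curves.values())))
    return [
        sum(ratio * curves[use][i] for use, ratio in area_ratios.items())
        for i in range(n_depths)
    ]


# Hypothetical example for one land cover class.
depths_m = [0.0, 0.5, 1.0]
curves = {
    "residential": [0.0, 10.0, 20.0],   # BDT/m2 of building, illustrative only
    "commercial": [0.0, 50.0, 100.0],
}
ratios = {"residential": 0.4, "commercial": 0.1}  # 50 % of the cell is built up

print(list(zip(depths_m, combine_curves(ratios, curves))))
```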

Fig. 6

LULC for area shown in Fig. 5 for the baseline year generated by UGM using historical data

Table 2 The building components for different land use classes

3.6 Flood damage assessment

3.6.1 Assessment for current baseline and sensitivity analysis

The main parameters in the tool for raster-based analysis are (1) the cell size used to calculate the damage value of each cell (the analysing cell size), and (2) the cell size used to clip out the non-building area from a coarse analysing cell (the masking cell size). We adopted five combinations of these two parameters, as shown in Table 3, for the sensitivity and performance analysis. Table 3 also lists the damage assessment results of these five cases for a 100-year flood event. The inputs (rainfall and water levels in the rivers) for flood modelling of the event were determined by statistical analysis of 50 years of historical records in the surrounding catchments of Dhaka City. The exchange rate for the Bangladeshi Taka (BDT) was €1 ≈ 104 BDT as of 5 August 2014.

Table 3 The total flood damage for 100-year event of Dhaka city with different analysing and masking cell sizes

No real data are available for model verification, so we assumed that Case 1, which had the finest cell sizes, would provide the best estimate, and its results are used as the benchmark for the other cases.

The dataset includes information on 1.14 million buildings in Dhaka City, although only 250 thousand were within the hydraulic modelling domain. The video in the electronic supplementary material, as well as on YouTube at https://www.youtube.com/watch?v=skAk3giQGrE, illustrates the hydraulic modelling and the damage assessment results for Dhaka City. Table 3 shows that the number of buildings estimated to show flood damage using a 5-m masking cell size was about 11,000 fewer than when using a 1-m masking cell size. The reason was that only a single building index can be applied to each cell, even when the cell contains more than one building (e.g. the three cells in the right column of Fig. 3, each of which contains two buildings). In the coarser cell representation, the buildings that occupied less area within a cell were filtered out, and the total flood damage was smaller than when using a fine masking cell size.
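
The filtering effect can be reproduced on a toy raster. The sketch below assumes a largest-area rule for choosing the single building index of a coarse cell, which is one reading of the description above; the function names and the tie-breaking behaviour are hypothetical and may differ from the tool.

```python
from collections import Counter


def dominant_index(block, nodata=0):
    """Building index covering the most fine cells in a block
    (ties resolved by first occurrence); `nodata` if no building is present."""
    counts = Counter(v for row in block for v in row if v != nodata)
    if not counts:
        return nodata
    return counts.most_common(1)[0][0]


def aggregate(fine, factor, nodata=0):
    """Aggregate a fine building-index raster to cells `factor` times larger."""
    coarse = []
    for r0 in range(0, len(fine), factor):
        row = []
        for c0 in range(0, len(fine[0]), factor):
            block = [line[c0:c0 + factor] for line in fine[r0:r0 + factor]]
            row.append(dominant_index(block, nodata))
        coarse.append(row)
    return coarse


# Fine raster with three buildings (1, 2, 3); 0 marks non-building cells.
fine = [
    [1, 1, 2, 2],
    [1, 3, 0, 0],
]
coarse = aggregate(fine, 2)
print(coarse)  # building 3 occupies the smaller share of its cell and is dropped
```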

Table 4 lists the statistical information for the cases, using the key residual criteria including mean, bias, mean absolute error (MAE) and root-mean-square error (RMSE). Although 250 thousand buildings were located inside the modelling domain, only about a quarter of them had flood damage for the 100-year event. The statistics only considered those with flood damage. However, the number of such buildings differed between the five cases. To make a fair comparison, we took the union of the buildings with flood damage in any case, i.e. 67,458 buildings, as the common base for the comparisons of all cases. The mean is the average flood damage of the 67,458 buildings for each case. The bias, MAE and RMSE were calculated from the damage differences of individual buildings between each case and Case 1.

Table 4 The statistics of assessment results for cases with different resolutions

The mean and bias show that the coarse analysis and masking cell sizes produced lower damage estimations. This was because some buildings were filtered out of the damage calculation, so those buildings showed no damage, which resulted in a lower mean damage. For Cases 4 and 5, the MAEs are 19 and 21 % and the RMSEs are 39 and 50 % of the mean damage for Case 1, respectively. However, the bias is only −6 %, which shows that some of the positive and negative errors cancelled out when calculating the total damage and the mean. The RMSEs are relatively large compared to the mean, which indicates that large errors exist when using coarse analysis cells.
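
A minimal sketch of these residual statistics is given below. It assumes that a building absent from a case contributes zero damage and that the error is defined as the case value minus the Case 1 value; the function name and the damage values are hypothetical.

```python
import math


def residual_stats(case, benchmark, building_ids):
    """Mean, bias, MAE and RMSE of per-building damage over a common base.

    case, benchmark: building id -> damage; buildings missing from a
                     dictionary are treated as having zero damage.
    building_ids:    the common base (union of damaged buildings in all cases).
    """
    diffs = [case.get(b, 0.0) - benchmark.get(b, 0.0) for b in building_ids]
    n = len(diffs)
    return {
        "mean": sum(case.get(b, 0.0) for b in building_ids) / n,
        "bias": sum(diffs) / n,
        "mae": sum(abs(d) for d in diffs) / n,
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
    }


# Hypothetical damages (BDT) for three buildings.
case_1 = {1: 100.0, 2: 50.0, 3: 10.0}   # benchmark
case_4 = {1: 120.0, 2: 30.0}            # building 3 was filtered out
union = set(case_1) | set(case_4)

print(residual_stats(case_4, case_1, union))
```

In this toy example the positive and negative errors partly cancel in the bias, while the MAE and RMSE remain large, which mirrors the pattern described for Cases 4 and 5.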

Figure 7 shows the distribution of errors for each case in different bands. Orders of magnitude of errors are used to classify the band of errors. O(n) represents the error between 10^(n−1) and 10^n, while −O(n) represents the error between −10^n and −10^(n−1), where n is a positive integer or 0. For n = 0, O(0) represents the positive error band less than 1 and −O(0) is the negative error band greater than −1. The majority of buildings in Cases 2 and 3, which use a 1 m masking cell size, have zero error, whereas Cases 4 and 5 have most errors spread between −O(3) and −O(1) and between O(1) and O(3). Figure 8 shows the sum of errors in each error band. The outliers in the bands −O(5) to −O(4) and O(4) to O(5) contribute a significant amount to the error for Cases 3, 4 and 5, which resulted in a large RMSE.
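
The band classification can be sketched as follows. The treatment of values lying exactly on a band boundary is not stated in the text; here the lower bound is assumed to be inclusive, and zero error is kept as its own class. The function name is hypothetical.

```python
def error_band(error):
    """Classify an error into the order-of-magnitude bands O(n) / -O(n).

    O(0): 0 < error < 1; O(n): 10^(n-1) <= error < 10^n for n >= 1;
    negative errors mirror these bands. Boundary handling is an assumption.
    """
    if error == 0:
        return "0"
    sign = "-" if error < 0 else ""
    magnitude = abs(error)
    n, bound = 0, 1
    while magnitude >= bound:  # find the smallest n with magnitude < 10^n
        bound *= 10
        n += 1
    return f"{sign}O({n})"


for e in (0, 0.5, -0.5, 5, -250, 12000):
    print(e, error_band(e))
```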

Fig. 7

The error distributions for cases with different resolutions

Fig. 8

The sum of errors of each error band for cases with different resolutions

Table 3 also shows that the 5-m analysing cell size case had slightly more buildings with flood damage than the 1 m case. One of the reasons was that some buildings that are classified as damage free in the 1-m case (due to an undefined damage curve for the Community Activity building use) used information from their neighbours in the 5- and 25-m cases. Such an example is highlighted with a yellow boundary in Fig. 9. For the 5-m case, only the north and south parts of the highlighted building, which shared the same 5-m cells with its neighbours, had a damage value. The remaining 5-m cells within the building were still considered as damage free. For the 25-m case, the building use of the two cells, where the yellow building was located, was dominated by the larger area ratios of the buildings with residential uses. Hence, the complete building was considered as having damage with manufacturing and processing use type. Further field investigation would be necessary to clarify the missing or unclear information. In Figs. 9, 10, 11 and 12, the light grey lines represent the 1-m grid lines, the medium grey lines the 5-m grid lines and the thick dark grey lines the 25-m grid lines.

Fig. 9

a The building with yellow boundary has no damage estimated when b 1 m analysing cell size was used because of lack of building use information, but gets flood damage when c 5 m and d 25 m analysing cell sizes were used

Fig. 10

a The layouts of overlapped buildings that result in large estimation differences when using b 1 m, c 5 m, and d 25 m analysing cell sizes

Fig. 11

a The building with yellow boundary got different damage estimations using b 1 m, c 5 m, and d 25 m analysing cell sizes

Fig. 12

a The building with yellow boundary has different damage estimations using b 1 m, c 5 m, and d 25 m analysing cell sizes

Figure 10 shows the building (bounded by a yellow line) with the largest estimation errors. Seven other buildings overlap this building (due to errors in the data), such that the type of the other buildings (i.e. residential), instead of the ‘manufacturing & processing’ type of the building itself, is picked up for some cells containing more than one building. Hence, it is not possible to use the same damage function and automatically assign the calculated damage back to the same building when using different analysing cell sizes. Further field investigation would be necessary to clarify the missing or unclear information.

Figure 11 shows another potential reason for the estimation error using different analysing cell sizes. The building highlighted with a yellow boundary only has 0.12 m flood depth, shown in Fig. 11a, in its south-west corner. For the 1 m analysing cell size, its own building type (mixed use) was associated with the damage function, so the calculated damage (Fig. 11b) is only 8 BDT/m2. For the 5 and 25 m analysing cell sizes, the building type of its south neighbour (manufacturing and processing activity) was associated with the damage function because the latter had the larger area ratio in the cell shared by both buildings. This resulted in a much higher damage of 118 BDT/m2 (Fig. 11c, d).

Figure 12 shows another example of such an error. The whole extent of the shown area was completely flooded. The building with a yellow boundary in Fig. 12a is used for manufacturing and processing activity. For the 1-m analysing cell size (Fig. 12b), its own building type was associated with the damage function for all fine cells. Yet, for the 5-m case (Fig. 12c), residential and commercial uses, which have lower damage values, were used for its east and south-west parts, respectively. For the 25-m case (Fig. 12d), the mixed-use type was used for the north-west part of the building that resulted in underestimation. The residential use was applied to the remaining area, which caused further underestimation, due to larger area being occupied by buildings with other types of uses further away within the corresponding coarse cells.

3.6.2 Computing performance

We used two machines, with specifications listed in Table 5, to test the computing performance of the damage assessment tools. Each of the five cases in the previous section was executed three times on each machine. The total of 1.15 million buildings is distributed over an area of 1970 km2, whereas the hydraulic modelling extent covers only 250 thousand buildings in 184.6 km2, which corresponds to 184.6, 7.4 and 0.3 million cells at 1, 5 and 25 m resolution, respectively. Tables 6 and 7 present the total and subtask processing times on Machine1 and Machine2, respectively.

Table 5 Specifications of machines used for performance testing
Table 6 Computing time of subtasks and total for the five cases tested on Machine1
Table 7 Computing time of subtasks and total for the five cases tested on Machine2
Table 8 Total flood damage for 100-year return period event estimated with the current building reality information for the baseline year

The damage assessment work consists of seven main subtasks:
  • Polygon to raster: Converting the polygons of building components to raster format. This is done using the built-in “PolygonToRaster” function of ArcGIS.

  • Shapefile copying: Copying the original shapefile of building components to a new one and adding new field(s) for saving damage assessment results. This is done using the built-in “CopyFeatures” and “AddField” functions of ArcGIS.

  • Raster to float: Converting the raster file to binary float format for damage calculation. This is done using the built-in “RasterToFloat” function of ArcGIS.

  • Aggregating: Aggregating the fine cells to the analysing cell size.

  • Damage calculating: Computing the flood damage for each cell.

  • Summing up: Summing up the damage of cells for each building.

  • Results associating: The calculated building damage is saved in a text file, and this subtask associates the results back to the shape file of building components.

  • Other time: other minor tasks.
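
The “Damage calculating” and “Summing up” subtasks can be sketched in plain Python as below. This is an illustration only, not the tools' code: rasters are represented as nested lists, the building index 0 is assumed to mean "no building", and `damage_fn` is a hypothetical stand-in for the depth-damage look-up of each building use.

```python
from collections import defaultdict


def building_damage(index_raster, use_raster, depth_raster, damage_fn, cell_area):
    """Compute damage per cell and sum it per building.

    index_raster: building index of each cell (0 = no building, assumed)
    use_raster:   building use of each cell
    depth_raster: flood depth of each cell (m)
    damage_fn:    (use, depth) -> damage per unit area (hypothetical look-up)
    cell_area:    area of one analysing cell (m2)
    """
    totals = defaultdict(float)
    for idx_row, use_row, depth_row in zip(index_raster, use_raster, depth_raster):
        for idx, use, depth in zip(idx_row, use_row, depth_row):
            if idx == 0 or depth <= 0:
                continue  # outside buildings or dry: no damage
            totals[idx] += damage_fn(use, depth) * cell_area
    return dict(totals)


# Hypothetical linear damage rates (BDT/m2 per metre of depth).
rates = {"residential": 10.0, "commercial": 20.0}
index = [[1, 1], [2, 0]]
use = [["residential", "residential"], ["commercial", None]]
depth = [[0.5, 0.0], [1.0, 2.0]]

print(building_damage(index, use, depth, lambda u, d: rates[u] * d, cell_area=25.0))
```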

The results show that the most time-consuming subtasks were “Polygon to raster”, “Results associating” and “Shapefile copying”. The “Polygon to raster” subtask converted the polygons to raster based on the analysing cell size. For the 5 m cases, some buildings were filtered out, so the processing took less time than for the 1 m cases. The “Results associating” and “Shapefile copying” subtasks had to process all building components in the original shapefile for all cases, so the processing time was almost the same. In other words, they are independent of the analysing and masking cell sizes. These three most time-consuming subtasks involve data processing that requires CPU power, so they performed better on Machine1, which has a faster CPU.

For the remaining subtasks, the results show that “Raster to float” was faster for the 5 m masking cell size cases than for the 1 m cases; the “Aggregating”, “Damage calculating” and “Summing up” subtasks were affected by both the analysing and masking cell sizes. Although the efficiency of these four subtasks varied with the cell size settings, the computing time required within the subtasks was minor compared to the time needed for input reading and output writing for cases with 1 m analysing cell sizes. Therefore, these four subtasks performed better on Machine2 than on Machine1 in Cases 1, 2 and 3 because the SSD allows much faster data reading and writing. For Cases 4 and 5, the four subtasks ran faster on Machine1 because the computing time, rather than data input and output, dominated.

3.6.3 Assessment for future urban growth scenario

Using the damage functions that IWM developed for the different Dhaka building use classes, and the derived damage functions for land cover classes, the flood damage for a 100-year event is calculated as shown in Figs. 13 and 14, respectively. Figure 13 illustrates the flood damage for each building in the region under current reality, while Fig. 14 represents the flood damage per unit area using the land cover classes for the same event and the same baseline year. No parcel information is available in the UGM prediction, so the damage cannot be integrated at the building level.

Fig. 13

Flood damage per building estimated using the parcel information reality in the baseline year

Fig. 14

Flood damage per unit area using the LULC from the UGM for the baseline year

However, the assessment can be summarised at the city level for the total damage in each building use or each land cover class, as listed in Tables 8 and 9, respectively. For Dhaka city as a whole, the approach using land cover classes overestimated the total damage by 3.95 % compared to the former approach. This could be due to the relationships between the building use and the land cover not being accurate enough to associate the current reality with the conceptual classes. In the UGM, arbitrary classification issues (e.g. overlapping built-up areas, and streets that cross multiple cells or are located within cells) always arise because of the rather coarse resolution (compared to building sizes). Hence, major roads in sparsely populated areas have been eliminated but in high-density areas often remain as artefacts. Particularly in Dhaka and its surroundings, the road acts as a main driver of urbanisation. In the suburbs, this manifests itself as ribbon development, while in the urban centres, the adjacent areas along streets are often completely filled up by structures. This might indeed give rise to overestimation of the flood damage, as can be observed in Fig. 14, where higher damage occurred along the main road. This could be improved by introducing more factors when analysing such information. For example, the change in built-up areas and building uses due to urban growth in developed and undeveloped areas could be different due to spatial limitations; commercial activities tend to grow along the road network due to the accessibility of business; and certain areas could favour growth in certain building use types compared to others as a result of economic differentiation. Therefore, more detailed relationships between building use and land cover classes could be developed to improve the flood damage assessment (Table 8).

Table 9 Total flood damage for 100-year return period event estimated with the land cover classes for the baseline year

For the future, Fig. 15 shows the land cover classes in the 2050s for the business-as-usual (BAU) high-growth scenario predicted by the UGM. Although the region is already highly developed in the baseline year, the projection shows that the density of development will increase. Consequently, the flood damage for the same 100-year event will be more severe, as shown in Fig. 16. The total damage could increase to 2.78 billion BDT in the 2050s, compared to 364 million BDT in the current situation, if the impact of urban growth on the hydraulic condition is not considered. The tools have been applied to take multiple flood events with different probabilities, for both current and future scenarios, as inputs to determine the damage of each condition and the EADs as the integral risk assessment; the details can be found in Khan et al. (2015). Meanwhile, the LULC can be replaced using different information such as population density, so that the same tool can be applied to assess the flood impacts on human health for the baseline and the future scenarios.
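
Taking EAD to mean expected annual damage, the integral over event probabilities can be approximated with the trapezoidal rule. The sketch below is a generic illustration, not the procedure of Khan et al. (2015): the return periods and damages are hypothetical, and contributions outside the range of simulated events are ignored.

```python
def expected_annual_damage(return_periods, damages):
    """Trapezoidal approximation of the integral of damage over annual
    exceedance probability (p = 1 / return period).

    Only the probability range covered by the given events is integrated.
    """
    pairs = sorted((1.0 / t, d) for t, d in zip(return_periods, damages))
    ead = 0.0
    for (p0, d0), (p1, d1) in zip(pairs, pairs[1:]):
        ead += 0.5 * (d0 + d1) * (p1 - p0)
    return ead


# Hypothetical events: return period (years) and total damage (million BDT).
periods = [10, 100]
damages = [100.0, 1000.0]
print(expected_annual_damage(periods, damages))
```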

Fig. 15

LULC for the BAU high-growth 2050s scenario projected by UGM

Fig. 16

Flood damage per unit area using the LULC of the BAU high-growth scenario in 2050s predicted by UGM

4 Discussion

The modelling results in the previous section have demonstrated the capability of the damage assessment tools to evaluate the impact of flooding for the current reality and future urban growth scenarios. The following discussion will focus on the technical issues encountered during the development of the tools.

4.1 Modelling extent

Since the damage is driven by the flood depth and other hydraulic attributes, any location outside the hydraulic modelling extent would show no damage. The calculation outside the hydraulic modelling domain would be unnecessary. Hence, the tools automatically adopt the hydraulic modelling domain as the maximum extent for the raster analysis.

4.2 Analysing resolution

The conversion of building data from a vector to a raster format raises the issue of selecting the appropriate data resolution. If the resolution is too coarse, the spatial variation between vectors or polygons is lost during the conversion, which will cause inaccurate damage estimates; if the resolution is too fine, the required computing resource and processing time would be too onerous.

Therefore, damage functions per unit area were adopted to calculate the damage for each grid cell inside a parcel of a zoning area and sum them up to obtain the total damage. Hence, the tools convert the polygon data into raster in order that the calculation can be easily implemented in the same format. The damage per unit area is calculated using the resolution of the hydraulic model results to avoid averaging of the flood depths over an analysis cell. This is important as the depth-damage functions are not linear, and averaging of the flood depths could lead to inaccuracies in the results.
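
The effect of averaging depths before applying a non-linear damage function can be seen in a small numerical example. The threshold curve and the depths below are hypothetical and chosen only to make the effect visible.

```python
def damage(depth_m):
    """Hypothetical curve: no damage up to 0.1 m, then linear up to
    300 BDT/m2 at 1.0 m, constant above."""
    if depth_m <= 0.1:
        return 0.0
    return 300.0 * (min(depth_m, 1.0) - 0.1) / 0.9


# Four fine cells inside one coarse analysis cell; only one is flooded.
fine_depths = [0.0, 0.0, 0.0, 0.3]

mean_depth = sum(fine_depths) / len(fine_depths)
damage_of_mean = damage(mean_depth)  # depths averaged first
mean_of_damage = sum(damage(d) for d in fine_depths) / len(fine_depths)

print(f"damage at mean depth: {damage_of_mean:.2f} BDT/m2")
print(f"mean of cell damages: {mean_of_damage:.2f} BDT/m2")
```

Averaging first places the mean depth below the damage threshold, so the flooded cell's damage disappears entirely, whereas evaluating each cell at its own depth retains it.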

The converted parcel raster can be at the same resolution as the hydraulic modelling. The default setting of the cell size of analysis is the resolution of the hydraulic modelling results. However, if this resolution is too coarse, the results may not reflect the spatial variations in the parcels, and it will not be possible to distinguish between different land uses of two small adjacent parcels within the same grid cell. Therefore, the analysis cells can have a finer resolution than those of the hydraulic modelling results, to provide more accurate results. If a selected cell size value is larger than that of the hydraulic model results, this value will only be used to generate the building index and building land use raster.

The value of this cell size should be an integer fraction (e.g. 1/5 or 1/4) of the hydraulic modelling results resolution. Otherwise, some of these cells would span adjacent hazard information cells. This in turn leads to the potential for inaccuracies, as a result of averaging, and the inherent loss of information. An alternative solution to this problem would be to generate a finer raster in the “overlapped” sections, but this would lead to increased computer processing times, and so this solution is not favoured.
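
A simple validity check for a candidate analysis cell size is sketched below. The function name is hypothetical; exact rational arithmetic is used to avoid floating-point rounding when testing divisibility.

```python
from fractions import Fraction


def is_integer_fraction(analysis_cell, hydraulic_cell):
    """True if the hydraulic cell size is a whole multiple of the analysis
    cell size, so that no analysis cell spans two hazard cells."""
    ratio = Fraction(str(hydraulic_cell)) / Fraction(str(analysis_cell))
    return ratio.denominator == 1 and ratio >= 1


for size in (1, 5, 6.25, 10, 30):
    print(size, is_integer_fraction(size, 25))
```

For a 25 m hydraulic grid, 1, 5 and 6.25 m pass the check (ratios of 25, 5 and 4), whereas 10 m does not nest and 30 m is coarser than the hazard grid.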

The sensitivity analysis shows that Cases 2 and 3 using 1 m masking cell sizes, compared to Case 1, can predict consistent results for most building components. For Cases 4 and 5, detailed information on building layouts was lost when the polygons were converted to raster format. However, Dhaka is very densely developed, and many buildings are close to each other; this may cause the overestimation of damage to one building, and an underestimate for its neighbour. These overestimates and underestimates should cancel each other out, and keep the total damage over the whole domain close to the value in Case 1. In general, the use of coarse analysis and masking cell sizes could produce close estimations of total damage (less than 10 % error) due to the cancellation effect, but the damage to individual buildings may be incorrect because of the missing information. Therefore, the analysis with very coarse resolution is not recommended especially when the saving in computation is insignificant.

4.3 Model efficiency

In the Dhaka case, the area of the hydraulic modelling domain is 184.6 km2. About 740 MB of memory is required to store the damage results at a 1-m cell size as a single layer of single-precision values. The minimum data requirement, including flood depth, building use and building index, is 2.2 GB of memory. The model was applied to the Barcelona, Beijing, Nice and Taipei case studies. Although it is feasible to run a 64-bit program on the latest computers with 4 GB or more of memory, some of our other project partners (Beijing and Dhaka) reported difficulties in using the tool for similarly large cases due to the limitations of the operating system and available hardware. Therefore, instead of reading all data before the calculation, the tool reads and processes the information piece by piece as a trade-off between model efficiency and flexibility. The treatment reduces the memory requirement from O(n²) to O(n) and enables the program to be executed on a standard desktop computer with 2 GB of memory and a 32-bit operating system.
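
The memory figures follow directly from the cell count, and piece-by-piece processing can be illustrated with a row-wise loop. The sketch below is an assumption-laden illustration: it takes 4 bytes per value for all three layers, uses decimal megabytes, and reads one raster row at a time, which is only one possible reading of "piece by piece". All names are hypothetical.

```python
def raster_bytes(area_km2, cell_size_m, bytes_per_value=4, layers=1):
    """Memory needed to hold `layers` rasters covering `area_km2`."""
    cells = area_km2 * 1_000_000 / cell_size_m ** 2
    return cells * bytes_per_value * layers


print(raster_bytes(184.6, 1) / 1e6, "MB for one single-precision layer")
print(raster_bytes(184.6, 1, layers=3) / 1e9, "GB for depth, use and index")


def total_damage_streaming(row_source, damage_fn, cell_area):
    """Accumulate damage while holding only one raster row in memory."""
    total = 0.0
    for depth_row in row_source:  # e.g. rows read lazily from a file
        total += sum(damage_fn(d) for d in depth_row) * cell_area
    return total


# Hypothetical 2 x 2 depth raster and linear damage rate.
rows = iter([[0.0, 1.0], [2.0, 0.0]])
print(total_damage_streaming(rows, lambda d: 10.0 * d, cell_area=1.0))
```

For an n × n raster, holding one row at a time needs memory proportional to n rather than n², which is consistent with the reduction stated above.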

We also tried to execute the same procedure using the built-in functions of the ArcGIS software on the same machines, but encountered the following issues, such that the results are not comparable.

(1) The built-in functions (e.g. look-up tables for damage calculation and data conversion, as discussed in Sect. 2) cannot interpolate the depth-damage functions and therefore do not calculate the correct values; (2) the “Polygon Join” function in ArcGIS is time-consuming and memory hungry. The memory requirement prevented us from running the procedure in a single batch. We had to divide the 250 thousand buildings into four parts to associate the damage assessment results with individual buildings. The processing time was at least an order of magnitude higher than with the proposed tools.
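The difference between a look-up table and an interpolated depth-damage function can be illustrated with a short sketch. The curve points below are hypothetical, and the step-wise behaviour of `damage_lookup` is an assumption about how a plain look-up table behaves, not a description of any specific ArcGIS function.

```python
from bisect import bisect_right

# Hypothetical depth-damage curve: (depth in m, damage fraction) pairs.
DDC_POINTS = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.4), (2.0, 0.6)]


def damage_lookup(depth, points=DDC_POINTS):
    """Step-wise look-up: value at the last tabulated depth <= depth."""
    depths = [p[0] for p in points]
    i = bisect_right(depths, depth) - 1
    return points[max(i, 0)][1]


def damage_interpolated(depth, points=DDC_POINTS):
    """Piecewise-linear interpolation, clamped at both ends of the curve."""
    depths = [p[0] for p in points]
    if depth <= depths[0]:
        return points[0][1]
    if depth >= depths[-1]:
        return points[-1][1]
    i = bisect_right(depths, depth)
    (d0, v0), (d1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (depth - d0) / (d1 - d0)
```

For a depth of 0.75 m, the look-up returns the value tabulated at 0.5 m, whereas the interpolated function returns a value midway between the 0.5-m and 1.0-m entries.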

Meanwhile, coarser analysis and masking cell sizes are expected to further improve the performance. In terms of absolute processing time, the masking cell size affected the “Polygon to Raster” subtask most significantly: the 5-m cases were 15 % faster (30–35 s less) than the 1-m cases. For the “Results associating” and “Shapefile copying” subtasks, the performance was almost identical for all cases.

In terms of relative processing time, the masking cell size influenced “Raster to float”, “Aggregating” and “Summing up”, and the analysing cell size affected “Damage calculating”. These four subtasks run less efficiently for cases that require reading and writing large files; their performance could be improved by using solid-state drives (SSDs). The results show that the settings of analysing and masking cell sizes can speed up the damage calculations. Nevertheless, the three most time-consuming subtasks use the built-in functions in ArcGIS and account for the majority of the processing time, which makes the speed-up in damage calculation insignificant for the overall time. Otherwise, we would recommend Case 2 for applications requiring a large number of simulations, such as uncertainty analysis, as it is faster and produces results close to those of Case 1.

Both the ArcGIS functions and the external programs used only a single processor. The computing efficiency may be further improved if the functions and programs include parallel processing features.

4.4 Lack of or inappropriate detailed parcel information

If the statistical information about building and land uses is only available at block or district level (e.g. 30 % of the area of a district is residential, and 20 % is commercial), a simplified approach can be used. Assuming that the building and land use distributions within a block or a district are homogeneous, the weighted damage function of each block or district can be determined. Then, the tools can be used to assess the total damage at the block or district level.
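A weighted damage function of this kind can be sketched as an area-share-weighted sum of the damage functions for each land use. The sketch below is illustrative only: the per-unit-area damage functions and their values are hypothetical, and the remaining 50 % of the district is assumed to incur no damage.

```python
def weighted_damage_function(shares, damage_functions):
    """Combine per-land-use damage functions using area shares.

    shares: mapping of land use -> fraction of the district area.
    damage_functions: mapping of land use -> function of flood depth (m).
    """
    def combined(depth):
        return sum(share * damage_functions[use](depth)
                   for use, share in shares.items())
    return combined


# Hypothetical damage functions (monetary units per m2), capped at 2 m depth.
damage_functions = {
    "residential": lambda depth: 100.0 * min(max(depth, 0.0), 2.0),
    "commercial": lambda depth: 200.0 * min(max(depth, 0.0), 2.0),
}

# Shares from the example in the text: 30 % residential, 20 % commercial.
shares = {"residential": 0.30, "commercial": 0.20}

district_ddc = weighted_damage_function(shares, damage_functions)
```

Calling `district_ddc(depth)` then gives the damage per unit area of the district, which can be multiplied by the flooded area of the district to obtain a total.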

If land use zoning information is available (typically as regional zoning polygons), it is recommended to use the resolution of the hydraulic modelling for the assessment. The land use zoning polygons are converted into raster format for the assessment. Assuming that a land zone has a homogeneous distribution of assets, the zones can be reflected in the damage functions, which implicitly account for the variation in buildings; hence, the damage assessment tools can also be applied to estimate the flood impact with less detailed information.

4.5 Building heights presented in terrain model

Where building heights are included in the topographic model, the flood depth inside a building is not available from the modelling results. A postprocessing algorithm is required to select the flood depth from the raster output data and assign it to a building. Since a building has a unique flood depth value, the flood damage can be estimated by combining the land use of the building and the appropriate DDC (per building), which has been described above.

For both modelling approaches, the selection of a representative depth for each building is an important issue, especially for a large building with several entrances at different elevations. The same problem exists when using the vector approach.

4.6 Vector approach

Many hydraulic models adopt irregular meshes for flood simulations to reduce the computing load for wide flat areas. Approaches in this category often consider the outlines of buildings when generating the computing meshes, and exclude buildings from the modelling for the following reasons: (a) to concentrate the computational load on flow paths only; (b) to reflect the blockage effect of buildings; and (c) because buildings are assumed to be well protected by solid walls. In reality, flood water may still flow into buildings and cause damage through gaps such as the building entrance, or when the stage exceeds the height of temporary protection measures.

Buildings are represented as individual polygons, so the flood damage can be easily calculated if the flood depth and the land use type of each polygon are known. However, because buildings are excluded from the modelling, no flood depths inside buildings are directly available from the hydraulic modelling results, and no damage can be calculated without further processing. Preprocessing is required to associate a flood depth with each building, by (a) selecting the value from the polygon nearest to the building entrance; (b) taking the maximum value of the polygons surrounding the building; or (c) taking the average value of the polygons surrounding the building. The local topography will affect the accuracy of such a selection procedure. For example, for a building on a slope, if the entrance is located on the uphill side and flooding occurs on the downhill side, which has no opening, selecting the maximum flood depth from the surrounding polygons may overestimate the flood depth if the elevation inside the building is at the same level as the entrance. Hence, such a procedure should be applied with caution, taking the local terrain and layout information into account.
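The three selection options can be sketched as follows. This is a hypothetical illustration rather than the tools' implementation: the data layout is invented for the example, and "nearest to the entrance" is approximated by the distance from the entrance to each polygon centroid.

```python
from math import hypot
from statistics import mean


def representative_depth(neighbours, entrance_xy, method="nearest"):
    """Pick a flood depth for a building from the surrounding mesh polygons.

    neighbours: list of dicts with 'centroid' (x, y) and 'depth' (m).
    method: 'nearest' (a), 'max' (b) or 'mean' (c), matching the options above.
    """
    if not neighbours:
        raise ValueError("no surrounding polygons supplied")
    if method == "nearest":
        ex, ey = entrance_xy
        closest = min(
            neighbours,
            key=lambda p: hypot(p["centroid"][0] - ex, p["centroid"][1] - ey),
        )
        return closest["depth"]
    if method == "max":
        return max(p["depth"] for p in neighbours)
    if method == "mean":
        return mean(p["depth"] for p in neighbours)
    raise ValueError(f"unknown method: {method}")


# Building on a slope: dry polygon near the uphill entrance,
# flooded polygon on the downhill side (illustrative values).
surrounding = [
    {"centroid": (0.0, 11.0), "depth": 0.0},   # uphill, next to the entrance
    {"centroid": (0.0, -1.0), "depth": 0.8},   # downhill, no opening
]
entrance = (0.0, 10.0)
```

With these illustrative values, option (a) selects the dry uphill polygon, whereas options (b) and (c) assign a non-zero depth to the building, which reproduces the overestimation described above.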

Since a parcel has a unique flood depth value, the flood damage can be estimated by combining the land use of the buildings and the damage functions. If each parcel only has a single land use type, the damage can be estimated using the DDC corresponding to the specific land use, and the unit for the DDC should be per building. If the DDC per unit area is used, then the input information should include the areas of buildings.

4.7 Other applications

The damage assessment tools adopt a general methodology that requires three types of information for modelling: (1) hazard characteristics; (2) object attributes; and (3) damage functions. Once the data are available, the tools can be implemented to evaluate various types of damage, as illustrated in Table 1. Meanwhile, the tools can also be applied to assess the damage caused by other hazard types with spatial variation. For example, by combining the gust intensity from a climate or weather model with building information (age, material, height, etc.) and damage functions that relate gust intensity to damage for different building attributes, the tools can be used to calculate wind storm damage. Similarly, the vulnerability of buildings during earthquakes can be assessed through the application of fragility curves that relate the probability of damage to a particular ground motion parameter (Rossetto and Elnashai 2003). The application of the damage assessment tools can improve the understanding of hazard impact for different scenarios, such that better decisions can be made based on cost-effectiveness analysis.
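As an illustration of using a fragility curve as the damage function, the sketch below evaluates a lognormal cumulative distribution. Fragility curves are often parameterised in this way, but the functional form and the median and dispersion values used here are assumptions for the example and are not taken from the cited study.

```python
from math import erf, log, sqrt


def lognormal_fragility(intensity, median, beta):
    """Probability of reaching a damage state at a given ground-motion intensity.

    median: intensity at which the probability is 0.5 (hypothetical value).
    beta: logarithmic standard deviation (hypothetical value).
    """
    if intensity <= 0.0:
        return 0.0
    z = log(intensity / median) / beta
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_damaged_buildings(intensities, median, beta):
    """Expected number of damaged buildings, one intensity value per building."""
    return sum(lognormal_fragility(x, median, beta) for x in intensities)
```

Here the hazard characteristic is the ground-motion intensity at each building, the object attribute selects the `median` and `beta` of the appropriate curve, and the fragility curve plays the role of the damage function.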

5 Conclusions

In this study, a set of GIS-based tools for flood damage assessment was developed, and a number of problems related to data resolution were resolved. The tools are capable of utilising the hydraulic modelling results from DHI MIKE URBAN directly. It is also possible to link the tools to the outputs from other hazard models that are in either raster or polygon format. By combining the hazard characteristics, parcel attributes and their corresponding damage functions, the tools can evaluate the damage efficiently.

We presented the technical issues encountered and the solutions for dealing with data with different formats, reference coordinates and cell sizes. A minor compromise in the efficiency of data reading and processing has significantly reduced the memory requirements for computing. This makes the tools flexible enough to be applied to a large case study such as Dhaka using high-resolution data.

The sensitivity analysis has shown that the masking cell resolution is the most influential factor for the assessment results. With a high-resolution masking cell, the tool can produce consistent estimates for different analysing cell resolutions. The computing performance tests indicated that data format conversion was the most time-consuming subtask, while the extra cost of analysing with finer cell sizes was marginal compared to that with coarse cell sizes. Although the cancelling effect occurred when using coarse cell sizes, the computational saving becomes irrelevant because of the loss of accuracy. Hence, the use of finer analysis and masking cell sizes is recommended to obtain correct detailed information.

Apart from the flood depth and the direct tangible damage, the relationships between hazard characteristics and other types of impact can also be used as the damage functions. Hence, the proposed tools can estimate a wide range of hazard impacts, including the monetary loss, the risk to life and the health impact due to flooding, as well as other damage types, e.g. due to storm winds or heat waves. An approach to associate the LULC from UGM with the current reality for the baseline year has also been proposed. Therefore, the damage assessment tools can be further applied to assess future hazard impact using the data from the UGM. The assessments can highlight the consequences of various disaster conditions and the vulnerable hot spots, such that better risk management strategies and urban development plans can be adopted for current hazard mitigation and future adaptation.

The damage assessment tools were developed in Python to allow integration within the ArcGIS environment, and they can be easily ported to other GIS software platforms. The Fortran executables, which process the data piece by piece, have significantly reduced the memory requirements and enhanced the computing efficiency. The improvements allow the tools to rapidly evaluate flood damage for megacities at high resolution on a standard desktop computer without being restricted by the limitations of the GIS software.