Abstract
Information on the depth of floodwater is crucial for rapid mapping of areas affected by floods. However, previous approaches for estimating floodwater depth, including field surveys, remote sensing, and machine learning techniques, can be time-consuming and resource-intensive. This paper presents an automated and rapid approach for estimating floodwater depth from on-site flood photos. A pre-trained large multimodal model, the Generative Pre-trained Transformer 4 with vision (GPT-4 Vision), was used specifically for estimating floodwater depth. The input data were flood photos that contained reference objects, such as street signs, cars, people, and buildings. Using the known heights of these common objects as references, the model returned the floodwater depth as the output. Results show that the proposed approach can rapidly provide a consistent and reliable estimation of floodwater depth from flood photos. Such rapid estimation is transformative in flood inundation mapping and in assessing the severity of a flood in near-real time, which is essential for effective flood response strategies.
1 Introduction
Flooding is one of the most common and devastating natural hazards, leading to significant human and economic losses annually (Bentivoglio et al. 2022). As climate change contributes to more frequent and intense precipitation events, flooding severity is expected to increase (Tabari, 2020). Rapid estimation of flooded areas becomes crucial in the face of such a threat. During flooding events, emergency managers need timely and accurate information about inundated areas to coordinate response operations effectively (Manjusree et al. 2012; Li et al. 2018).
Rapid flood mapping provides an immediate understanding of the extent and severity of flooding. It helps authorities and humanitarian organizations allocate and distribute essential resources like food, water, and medical supplies (Cohen et al. 2018). Identifying flooded areas quickly is also important for protecting critical infrastructure such as power plants and water treatment facilities, thereby minimizing disruption and speeding up recovery efforts (Li et al. 2021). Additionally, flood maps are essential for analyzing flood patterns and informing urban planning and mitigation strategies (Meghanadh et al. 2020).
Floodwater depth is an important factor in flood inundation mapping (Fohringer et al. 2015; Li et al. 2018). Information about the depth of floodwater is also crucial for assessing the severity of floods and evaluating flood risk mitigation measures. This information is essential for deploying rescue efforts, determining road closures, and assessing accessible areas (Cohen et al. 2018). Furthermore, it plays a crucial part in aiding emergency services, evaluating accessibility, devising suitable intervention plans, calculating water volumes, allocating resources for pumping water, and promptly calculating intervention and reconstruction expenses (Cian et al. 2018). Data on floodwater depths is helpful for immediate response and post-disaster analyses, including evaluating property damage and assessing flood risks (Nguyen et al. 2016).
Various techniques have been used to estimate floodwater depth. Conventional approaches such as field surveys have been utilized to determine floodwater depth by directly measuring high-water marks in affected areas (Chaudhary et al. 2020). Although this method is precise, it is time-consuming and labor-intensive. Additionally, it is limited to small-scale applications and can be impacted by weather conditions (Elkhrachy, 2022). Conventional methods also rely on information from stream gauges at specific locations to offer real-time flood data, such as water level. However, these approaches have constraints when the floodwater exceeds the gauge’s height and when scattered gauge placements cannot adequately cover flooded areas (Li et al. 2018). Another approach is to use hydrodynamic models to assess floodwater depth, including the Hydrologic Engineering Center’s River Analysis System (HEC-RAS) (Athira et al. 2023; Brunner, 2016), Delft-3D (Haq et al. 2020), and LISFLOOD-FP (Yin et al. 2022). These models are known for their accuracy in simulating complex flood dynamics (Elkhrachy, 2022). Still, their utilization is hindered by the requirement for extensive input datasets that involve detailed topographic, meteorological, and hydrological data. Additionally, these models require significant computational resources and rely on powerful computing systems.
Remote sensing data has been used extensively for flood management in recent decades. Large-scale knowledge about the extent of floods can be obtained through remote sensing data, such as satellite imagery. Studies have combined optical and synthetic aperture radar (SAR) with a digital elevation model (DEM) to determine floodwater depth. For example, Cian et al. (2018) introduced a semi-automatic approach to calculate floodwater depth by utilizing SAR imagery and statistical estimation of DEM from LIDAR (Light Detection and Ranging). Surampudi and Kumar (2023) utilized SAR data and Shuttle Radar Topography Mission (SRTM) DEM to generate water depth closely following surface undulations in agricultural lands. These methods appear to be efficient in estimating floodwater depth. However, satellite image acquisition is limited by the temporal resolution (Bovenga et al. 2018). Also, cloud cover can affect optical sensors during flood events (Chaudhary et al. 2020), making it impossible to have flood images for areas covered with clouds. Additionally, vertical inaccuracies are frequently produced by DEMs, especially over complicated terrain such as urban areas. As a result, they are unreliable in identifying the important topographical elements which determine how floods behave (Schumann, 2014).
Recent advancements in machine learning have significantly revolutionized floodwater depth estimation. Numerous studies have employed sophisticated computer vision algorithms to assess water levels remotely. For instance, Pan et al. (2018) employed a Convolutional Neural Network (CNN)-based methodology to monitor the length of a ruler in footage captured by a video camera strategically placed adjacent to a river. Similarly, utilizing a mask region-based convolution neural network (Mask R-CNN), Park et al. (2021) achieved floodwater depth estimation by detecting submerged vehicles in flood photos. Furthermore, the wealth of flood images on social media platforms has provided a rich source for researchers to employ computer vision algorithms in estimating floodwater depth. Feng et al. (2020) introduced a workflow focusing on retrieving images containing humans from social media to estimate water levels. In a distinct approach, Quan et al. (2020) matched water levels with human poses, categorizing flood severity into “above the knee” and “below the knee”. In another approach, Meng et al. (2019) integrated deep learning with web images to estimate floodwater depth from the images. Song and Tuo (2021) leveraged CNNs to segment stop signs and extract floodwater depth data from images featuring such signs. Additionally, Li et al. (2023) utilized an object detection model based on CNN to automatically estimate water depth from images from social media platforms. While machine learning models have showcased their effectiveness in floodwater depth estimation, it is crucial to acknowledge their dependency on substantial, annotated training datasets. Creating such datasets can be a resource-intensive and time-consuming endeavor, underscoring a notable challenge in the practical implementation of these models.
The recent advent of large multimodal models, notably the Generative Pre-trained Transformer (GPT), marks an exceptional development. These models exhibit a remarkable capability to comprehend human natural language, enabling proficient task execution across diverse domains. GPT-4 Vision (GPT-4 hereafter), a large-scale multimodal model, has demonstrated several impressive abilities in vision-language understanding and generation (OpenAI, 2023). For example, GPT-4 can generate natural language descriptions of images and even perform image processing tasks from descriptions written as text (Osco et al. 2023). These models can also offer intelligent solutions that resemble human thinking, allowing us to utilize general artificial intelligence to address problems in diverse applications (Wen et al. 2023).
In geographic information science, researchers have explored the potential of GPT-4 with applications to image generation, captioning, and analysis assistance in visuals, to name a few (Osco et al. 2023). A notable effort by Hu et al. (2023) involves the integration of geo-knowledge with GPT models for identifying location descriptions and their respective categories. This fusion results in a geo-knowledge-guided GPT model that accurately extracts location descriptions from disaster-related social media messages. Li and Ning (2023) developed a prototype of autonomous GIS (Geographic Information System) utilizing the GPT-4 API to accept tasks through natural language and autonomously solve spatial problems. Other endeavors include exploring the potential of GPT-4 in map-making (Tao & Xu, 2023) and human mobility modeling (Haydari et al. 2024).
GPT-4 has also demonstrated its potential in the urban science field. For example, Crooks and Chen (2024) examined the capability of GPT-4 in extracting information from street view photographs in urban analytics. Other endeavors include exploring the potential of GPT-4 in urban transportation system management (Zhang et al. 2023), urban planning (Wang et al. 2023), and energy management (Huang et al. 2023).
This research presents an automated, fast, and reliable approach leveraging GPT-4 to estimate floodwater depth from photographs capturing flood events. This study aims to contribute to disaster management, emergency response, and urban planning, potentially enhancing mitigation strategies, ultimately contributing to life-saving efforts, and minimizing economic losses in urban environments.
2 Method
2.1 Overview of the proposed approach
The GPT-4 model, developed by OpenAI, was trained using increasingly large amounts of data and has proven to be highly effective at extracting valuable information from images, even without requiring a separate training dataset. In this study, we propose an automated framework for estimating floodwater depth by leveraging the advanced capabilities of GPT-4. This framework, FloodDepth-GPT, uses the GPT-4 Python API to estimate the floodwater depth. The overall concept of the proposed approach is illustrated in Fig. 1. The approach begins by inputting flooding photos containing objects that can serve as consistent indicators for reference. Such street objects can include vehicles, humans, and street signs. By assessing the known height and relative submersion of these objects, FloodDepth-GPT can estimate water levels according to the visible objects within the photos. For instance, if the water reaches the knee level of a person whose height is known, FloodDepth-GPT can "deduce" the depth of water based on this comparative analysis. Besides the water depth, FloodDepth-GPT also outlines the rationale behind its estimate, which enhances the transparency, understanding, and explainability of the process.
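The interaction with the GPT-4 API can be illustrated with a minimal sketch. The function below assembles a vision request body for one flood photo; the model name, token limit, and prompt wording are illustrative assumptions rather than the study's exact configuration, and the request would be sent with the official `openai` Python client as noted in the docstring.

```python
import base64


def build_flood_request(image_bytes, prompt,
                        model="gpt-4-vision-preview", max_tokens=500):
    """Assemble a GPT-4 Vision chat request body for one flood photo.

    The model name and token limit here are illustrative assumptions.
    The request would be sent with the official `openai` client, e.g.:
        client.chat.completions.create(**build_flood_request(...))
    """
    # GPT-4 Vision accepts images as base64-encoded data URLs.
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }
```

Packaging the request construction in one function makes it straightforward to iterate over a folder of flood photos and collect the model's depth estimates.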
2.2 Design of FloodDepth-GPT
FloodDepth-GPT is a customized GPT with a set of prompts structured to guide the tool specifically toward estimating floodwater depth. These prompts include directions for identifying and measuring reference points in the image, assessing visible waterlines on objects, and applying the known heights of common objects, such as humans, vehicles, and stop signs, present in the image (Appendix 1). The standard heights of the various objects were specified. For example, the average height of a man and of different parts of the body (i.e., knee, waist, and shoulder) were included in the prompt (see Tables 1, 2 and 3 for the heights of different reference objects). Figure 2 shows samples of the objects used in this study with their corresponding heights.
Another crucial output of FloodDepth-GPT was detailed explanations of its estimations. This involves clear communication of the visual cues used in the estimation process and presentation of the depth measurements for ease of understanding and global applicability, enhancing the explainability of the AI output. Finally, the model was instructed to avoid speculation and base its analyses on the available objects within the image.
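The prompt structure described above can be sketched as follows. The reference heights below are placeholders, not the study's actual values (those are listed in Tables 1, 2 and 3), and the instruction wording paraphrases rather than reproduces the prompts in Appendix 1.

```python
# Illustrative reference heights in cm; the study's actual values appear
# in Tables 1-3, so the numbers below are placeholders, not the paper's data.
REFERENCE_HEIGHTS_CM = {
    "adult knee": 50,
    "adult waist": 100,
    "sedan door bottom edge": 30,
    "stop sign (bottom of panel)": 210,
}


def build_system_prompt(heights):
    """Compose structured depth-estimation instructions from a height table."""
    lines = [
        "You estimate floodwater depth from a single flood photo.",
        "Identify reference objects, locate the visible waterline on them,",
        "and convert it to a depth in centimeters using these heights:",
    ]
    lines += [f"- {name}: {cm} cm" for name, cm in heights.items()]
    lines += [
        "Explain which visual cues you used.",
        "Do not speculate; if no reference object is visible, say so.",
    ]
    return "\n".join(lines)
```

Generating the prompt from a height table rather than hard-coding it makes it easy to swap in region-specific object dimensions, a limitation discussed later in the paper.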
2.3 Performance evaluation
The performance of FloodDepth-GPT was examined as follows. We collected 150 flood photos from various online sources. Previous studies have utilized flood photos to estimate the depth of floodwater based on different reference objects, including stop signs, vehicles, and humans (Li et al. 2023). The experimental dataset incorporates these three components as the main reference objects (Tables 1, 2 and 3), and we ensured that each selected photo contains at least one of these objects. These photos served as input to FloodDepth-GPT, and the floodwater depth estimation for each photo was obtained from the model.
To evaluate the performance of the GPT model, this study compared the floodwater depth estimated by FloodDepth-GPT (GPT estimation) with the floodwater depth estimated manually by five individuals (manual estimation). The manual estimations were conducted independently using the average heights detailed in Tables 1, 2 and 3. The primary objective of this study is to explore the potential of GPT-4 in estimating floodwater depth, effectively positioning it as a potential human equivalent in this task; this evaluation approach was therefore considered appropriate. Furthermore, the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) were calculated to quantitatively measure the accuracy of the FloodDepth-GPT estimations against the manual estimations. The MAE and RMSE were computed using Eqs. 1 and 2, respectively, where \({m}_{i}\) is the manually estimated depth, \({gpt}_{i}\) is the FloodDepth-GPT estimation, and \(n\) is the number of images:

$$\text{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{m}_{i}-{gpt}_{i}\right|$$ (1)

$$\text{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({m}_{i}-{gpt}_{i}\right)}^{2}}$$ (2)
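The two error metrics are straightforward to compute; a minimal sketch using the paper's notation (manual depths \({m}_{i}\), model depths \({gpt}_{i}\), both in cm) is shown below. This is generic metric code, not code from the study.

```python
import math


def mae(manual, gpt):
    """Mean Absolute Error between manual and GPT depth estimates (Eq. 1)."""
    return sum(abs(m - g) for m, g in zip(manual, gpt)) / len(manual)


def rmse(manual, gpt):
    """Root Mean Square Error between manual and GPT depth estimates (Eq. 2)."""
    return math.sqrt(sum((m - g) ** 2 for m, g in zip(manual, gpt)) / len(manual))


# Example with hypothetical depths in cm:
# mae([50, 80, 120], [60, 75, 130]) -> (10 + 5 + 10) / 3, about 8.33 cm
```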
3 Results
Figure 3 presents the correlation between the floodwater depth estimations of GPT and humans. Results show a strong positive correlation between the GPT estimation and the average of the estimations from the five human observers (Pearson's correlation coefficient r = 0.8879). There is also a strong correlation between GPT and each individual human estimation (r = 0.8705, 0.8585, 0.8742, 0.8456, and 0.8834 for humans 1, 2, 3, 4, and 5, respectively). Overall, the data points cluster along the regression line, which suggests that the estimations made by GPT are consistent with human estimates derived from visually examining the images. Additionally, the consistency across the human estimations lends credibility to their use as a benchmark for evaluating the accuracy of GPT.
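For completeness, the correlation statistic can be reproduced with a short helper. This is a generic implementation of Pearson's r, not code from the study.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Covariance numerator and the two standard-deviation factors.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)
```

Applied to the paired GPT and human depth estimates for the 150 photos, this yields the r values reported above.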
Additionally, comparing the GPT-based estimations with the manual estimations yielded an MAE of 25 cm. This is consistent with previous studies, whose reported errors range from 6 cm to 32 cm (Chaudhary et al. 2019; Alizadeh Kharazi & Behzadan, 2021; Park et al. 2021; Song & Tuo, 2021; Li et al. 2023) and which used deep learning methods to estimate floodwater depth from images obtained from social media platforms (Table 4). As summarized in Table 4, the error obtained using the GPT-based method in our study falls within an acceptable range compared to previous studies that have utilized deep learning techniques.
This suggests that the GPT-based method is reliable for estimating floodwater depth with acceptable accuracy. A unique aspect of our approach is that it does not require any model training, making it applicable to most flood scenarios, regardless of the reference feature present in the flood image. This has been demonstrated using three different features (humans, vehicles, or street signs), in contrast to previous methods, which are only trained on specific visual reference features. It should be noted that the methods presented in Table 4 were applied to datasets different from those used in our study.
Furthermore, an RMSE of 30 cm was recorded when comparing the GPT and human estimations. Because RMSE weights larger deviations more heavily, this indicates a typical deviation of roughly 30 cm between the GPT-based and human estimations.
Samples of FloodDepth-GPT estimations and responses are presented in Fig. 4. These results align with manual estimations. Furthermore, Fig. 5 highlights a detailed sample response from the FloodDepth-GPT, showing its ability not only to provide reliable floodwater depth estimations but also to provide reasonable explanations behind the estimations. This process entails identifying reference objects and the water level in flood photos and then utilizing the known height of these objects to estimate the floodwater depth. FloodDepth-GPT, for example, successfully identified the truck in a flood photo and estimated that the water level of the flood was below the bottom of the truck’s door. Utilizing the identified height of the truck’s bottom level, FloodDepth-GPT accurately estimated the floodwater depth.
Figures 6, 7 and 8 present examples where FloodDepth-GPT’s estimations diverged from human assessments. The discrepancies can be attributed to variations in the estimation points, insufficient criteria, and incorrect water level identification. In Fig. 6, where a human served as the reference object, the model assessed the water level to be above the knee (approximately at the mid-thigh), while in reality, the water level on the human appears to be approximately at knee level. Future research could address this issue through enhanced prompt engineering to improve the perspective through which the model observes the water level. Additionally, discrepancies in estimations occurred from the limited criteria used for estimation by both humans and the GPT model. For instance, in Fig. 7, street signs were utilized as reference points without specifying their heights in the estimation criteria, leading to variations in the estimations made by humans and the GPT model. Figure 8 demonstrates significant variation in estimations, likely owing to divergent observation points by human observers and the GPT model, such as the center of the road or roadside. This emphasizes the challenge of precisely estimating floodwater depths due to variations in the terrain and the depth within a flood scenario.
4 Discussion and future research
This study introduces a new approach to estimate floodwater depth by leveraging the ability of GPT-4. This method utilizes structured prompts to analyze flood photos and estimate floodwater depth. In contrast to conventional computer vision and deep learning methods that depend on specific pre-trained objects, FloodDepth-GPT can automatically identify water levels based on reference objects in flood photos. Also, FloodDepth-GPT demonstrates impressive speed and efficiency, enabling floodwater depth estimations in just 10 s per photo, at a cost of approximately $1 to process 150 flood photos. This approach streamlines the estimation process and enhances the rapidity of flood inundation mapping.
Results from this study reveal that the proposed method can utilize a variety of common reference objects in flooding photos. It enhances the method’s versatility and makes this approach applicable in different flood scenarios. To the best of our knowledge, previous research has largely focused on models trained to recognize specific objects in flood photos, such as humans, stop signs, vehicles, etc. For example, Li et al. (2023) developed an object detection model that could only identify humans in flood photos used in their analyses. Analogously, some other studies only utilized photos containing stop signs (Song & Tuo, 2021; Alizadeh & Behzadan, 2023). Although previous methods showed promising results, these techniques can only be applied to photos containing objects on which their models were trained, thereby restricting the utility of such models on a broader scale.
The findings in this study are transformative. The GPT model’s ability to interpret different urban and natural elements in images opens new possibilities for automatic environmental assessments. Such detailed assessments are crucial for urban planning, disaster preparedness, and climate change studies and can be achieved through AI-driven analysis. Moreover, this study highlights FloodDepth-GPT as an example of Explainable AI, which provides the reasoning behind the model’s decision-making. These reasonings illustrate the transparency of the method, which is crucial for fostering trust and understanding in AI-driven environmental assessments.
The results of this study indicate that the proposed method is reliable in its estimations, demonstrating a mean absolute error within a fair range for estimating floodwater depths. However, further examination revealed the presence of outliers in the estimations. Some outliers do not necessarily indicate errors by the GPT model but rather reflect differences in the observation points of estimation within the photo (see Figs. 6, 7 and 8); even so, future studies could refine the method by introducing more detailed criteria for estimation. Moreover, a notable challenge with large multimodal models like GPT-4 is their inability to consistently reproduce results, as the model delivers different estimations when run multiple times. Since this model is in its early stages, subsequent versions are expected to see enhancements that address these reproducibility issues, providing a foundation for more robust future studies. Another limitation of the proposed approach is the variation in the heights, dimensions, and designs of reference objects across regions, which can limit the model's ability to provide uniform estimations across diverse geographic locations. Future studies can focus on calibrating the model to account for regional variations in reference objects.
One potential future research avenue is photo localization, as the floodwater depth is most useful when accurate geolocation of the photo is known, at least at street level. The rapid floodwater depth estimation from on-site photos combined with geolocation opens the possibility of producing flood inundation maps in real time. The most straightforward way of photo localization is to extract location information from the photo if such information is available in the metadata or from a geotagged post associated with the photo (Huang et al. 2019, 2020; Ning et al. 2020). For example, some social media platforms allow users to geotag photo locations using the phone’s built-in Global Positioning System (GPS) sensor or manually select the street name and address. A big challenge, though, is to locate the photo based on photo content only, where the location can be obtained by retrieving similar photos in a large database with localized photos such as street view images (e.g. Zhang et al. 2020). It is particularly challenging to locate flood photos as such photos often have inundated features, resulting in potential mismatching to the existing photos. Thus, innovative methods are required to match the flood and non-flood photos, such as the semantic scene graph (Yoon et al. 2021).
5 Conclusion
Floodwater depth estimation plays a pivotal role in the effective management of floods, facilitating informed disaster response and strategic planning. Utilizing the advanced capabilities of large pre-trained multimodal models (GPT-4 in this study), this paper introduces a novel method for automatically determining floodwater depths from photographs related to flooding. The GPT-4 model was customized to estimate water depths using recognized reference objects in the photo by providing specific instructions related to this task.
The findings of this study indicate that the proposed method holds promise for estimating floodwater depths from photographs. In comparison to prior research in this domain, our study demonstrates a universal pipeline for floodwater depth estimation rather than training various individual models on different reference objects, which is not only less computationally demanding but also more efficient and economical. Such information gives decision-makers and the community at risk rapid situational awareness of the extent and impact of flooding, which is essential for effective disaster management and decision-making. However, the study also acknowledges minor discrepancies between the model's estimations and those derived by humans. Future research could refine this methodology through improved prompt engineering and the introduction of additional criteria for more accurate estimations.
As we navigate the future of flood management, AI-driven insights become paramount. This approach, leveraging the power of AI and computer vision, emerges as a means for shaping resilient communities. It equips emergency responders and planners with the tools needed for rapid, data-driven decision-making and fosters innovation in disaster management and urban science.
Availability of data and materials
Data and materials are included in the manuscript.
References
Alizadeh, B., & Behzadan, A. H. (2023). Flood depth mapping in street photos with image processing and deep neural networks. Computers, Environment and Urban Systems, 88, 101628. https://doi.org/10.1016/j.compenvurbsys.2021.101628
Alizadeh Kharazi, B., & Behzadan, A. H. (2021). Flood depth mapping in street photos with image processing and deep neural networks. Computers, Environment and Urban Systems, 88, 101628. https://doi.org/10.1016/j.compenvurbsys.2021.101628
Athira, S., Katpatal, Y. B., & Londhe, D. S. (2023). Flood modelling and inundation mapping of Meenachil river using HEC-RAS and HEC-HMS software (pp. 113–130). https://doi.org/10.1007/978-3-031-26967-7_9
Bentivoglio, R., Isufi, E., Jonkman, S. N., & Taormina, R. (2022). Deep learning methods for flood mapping: a review of existing applications and future research directions. Hydrology and Earth System Sciences, 26(16), 4345–4378. https://doi.org/10.5194/hess-26-4345-2022
Bovenga, F., Belmonte, A., Refice, A., Pasquariello, G., Nutricato, R., Nitti, D., & Chiaradia, M. (2018). Performance analysis of satellite missions for multi-temporal SAR interferometry. Sensors (Basel, Switzerland), 18(5), 1359. https://doi.org/10.3390/s18051359
Brunner, G. (2016). HEC-RAS river analysis system: hydraulic reference manual, version 5.0 (p. 547). US Army Corps of Engineers-Hydrologic Engineering Center
Chaudhary, P., D’Aronco, S., Leitão, J. P., Schindler, K., & Wegner, J. D. (2020). Water level prediction from social media images with a multi-task ranking approach. ISPRS Journal of Photogrammetry and Remote Sensing, 167, 252–262. https://doi.org/10.1016/j.isprsjprs.2020.07.003
Chaudhary, P., D’Aronco, S., Moy de Vitry, M., Leitão, J. P., & Wegner, J. D. (2019). Flood-Water Level Estimation from Social Media Images (pp. 4–12)
Cian, F., Marconcini, M., Ceccato, P., & Giupponi, C. (2018). Flood depth estimation by means of high-resolution SAR images and lidar data. Natural Hazards and Earth System Sciences, 18(11), 3063–3084. https://doi.org/10.5194/nhess-18-3063-2018
Cohen, S., Brakenridge, G. R., Kettner, A., Bates, B., Nelson, J., McDonald, R., Huang, Y., Munasinghe, D., & Zhang, J. (2018). Estimating floodwater depths from flood inundation maps and topography. JAWRA Journal of the American Water Resources Association, 54(4), 847–858. https://doi.org/10.1111/1752-1688.12609
Crooks, A., & Chen, Q. (2024). Exploring the new frontier of information extraction through large language models in urban analytics. Environment and Planning B: Urban Analytics and City Science, 51(3), 565–569. https://doi.org/10.1177/23998083241235495
Elkhrachy, I. (2022). Flash flood water depth estimation using SAR images, digital elevation models, and machine learning algorithms. Remote Sensing, 14(3), 440. https://doi.org/10.3390/rs14030440
Feng, Y., Brenner, C., & Sester, M. (2020). Flood severity mapping from volunteered geographic information by interpreting water level from images containing people: a case study of hurricane harvey. ISPRS Journal of Photogrammetry and Remote Sensing, 169, 301–319. https://doi.org/10.1016/j.isprsjprs.2020.09.011
Fohringer, J., Dransch, D., Kreibich, H., & Schröter, K. (2015). Social media as an information source for rapid flood inundation mapping. Natural Hazards and Earth System Sciences, 15(12), 2725–2738. https://doi.org/10.5194/nhess-15-2725-2015
Fryar, C. D., Carroll, M. D., Gu, Q., Afful, J., & Ogden, C. L. (2021). Anthropometric reference data for children and adults: United States, 2015–2018. Vital and Health Statistics, Series 3, Analytical and Epidemiological Studies, (36), 1–44. http://www.ncbi.nlm.nih.gov/pubmed/33541517. Accessed 16 Jan 2024
Haq, T., Halik, G., & Hidayah, E. (2020). Flood routing model using integration of Delft3D and GIS (case study: Tanggul watershed, Jember). AIP Conference Proceedings, 020052. https://doi.org/10.1063/5.0014607
Haydari, A., Chen, D., Lai, Z., & Chuah, C. N. (2024). MobilityGPT: Enhanced human mobility modeling with a GPT model (pp. 1–13). https://doi.org/10.48550/arXiv.2402.03264
Hu, Y., Mai, G., Cundy, C., Choi, K., Lao, N., Liu, W., Lakhanpal, G., Zhou, R. Z., & Joseph, K. (2023). Geo-knowledge-guided GPT models improve the extraction of location descriptions from disaster-related social media messages. International Journal of Geographical Information Science, 37(11), 2289–2318. https://doi.org/10.1080/13658816.2023.2266495
Huang, C., Li, S., Liu, R., Wang, H., & Chen, Y. (2023). Large foundation models for power systems (pp. 1–10). https://doi.org/10.48550/arXiv.2312.07044
Huang, X., Li, Z., Wang, C., & Ning, H. (2020). Identifying disaster related social media for rapid response: a visual-textual fused CNN architecture. International Journal of Digital Earth, 13(9), 1017–1039. https://doi.org/10.1080/17538947.2019.1633425
Huang, X., Wang, C., Li, Z., & Ning, H. (2019). A visual–textual fused approach to automated tagging of flood-related tweets during a flood event. International Journal of Digital Earth, 12(11), 1248–1264. https://doi.org/10.1080/17538947.2018.1523956
Li, J., Cai, R., Tan, Y., Zhou, H., Sadick, A. M., Shou, W., & Wang, X. (2023). Automatic detection of actual water depth of urban floods from social media images. Measurement, 216, 112891. https://doi.org/10.1016/j.measurement.2023.112891
Li, J., Wang, J., & Ye, H. (2021). Rapid flood mapping based on remote sensing cloud computing and Sentinel-1. Journal of Physics: Conference Series, 1952(2). https://doi.org/10.1088/1742-6596/1952/2/022051
Li, Z., & Ning, H. (2023). Autonomous GIS: the next-generation AI-powered GIS. International Journal of Digital Earth, 16(2), 4668–4686. https://doi.org/10.1080/17538947.2023.2278895
Li, Z., Wang, C., Emrich, C. T., & Guo, D. (2018). A novel approach to leveraging social media for rapid flood mapping: a case study of the 2015 South Carolina floods. Cartography and Geographic Information Science, 45(2), 97–110. https://doi.org/10.1080/15230406.2016.1271356
Meghanadh, D., Jaiswal, A. K., Maurya, V. K., & Dwivedi, R. (2020). Rapid flood mapping using Sentinel-1A images: a case study of flood in Panamaram, Kerala. IGARSS 2020–2020 IEEE International Geoscience and Remote Sensing Symposium (pp. 6883–6885). https://doi.org/10.1109/IGARSS39084.2020.9324674
Meng, Z., Peng, B., & Huang, Q. (2019). Flood depth estimation from web images. Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances on Resilient and Intelligent Cities (pp. 37–40). https://doi.org/10.1145/3356395.3365542
Neighbor Storage. (2023). Average car sizes: length, width, and height. https://www.neighbor.com/storage-blog/average-car-sizes-dimensions/. Accessed 15 Feb 2024
Nguyen, N. Y., Ichikawa, Y., & Ishidaira, H. (2016). Estimation of inundation depth using flood extent information and hydrodynamic simulations. Hydrological Research Letters, 10(1), 39–44. https://doi.org/10.3178/hrl.10.39
Ning, H., Li, Z., Hodgson, M. E., & Wang, C. (2020). Prototyping a social media flooding photo screening system based on deep learning. ISPRS International Journal of Geo-Information, 9(2), 104. https://doi.org/10.3390/ijgi9020104
OpenAI. (2023). GPT-4 technical report. arXiv. https://doi.org/10.48550/arXiv.2303.08774
Osco, L. P., de Lemos, E. L., Gonçalves, W. N., Ramos, A. P. M., & Marcato Junior, J. (2023). The potential of visual ChatGPT for remote sensing. Remote Sensing, 15(13), 3232. https://doi.org/10.3390/rs15133232
Pan, J., Yin, Y., Xiong, J., Luo, W., Gui, G., & Sari, H. (2018). Deep learning-based unmanned surveillance systems for observing water levels. IEEE Access, 6, 73561–73571. https://doi.org/10.1109/ACCESS.2018.2883702
Manjusree, P., Prasanna Kumar, L., Bhatt, C. M., Rao, G. S., & Bhanumurthy, V. (2012). Optimization of threshold ranges for rapid flood inundation mapping by evaluating backscatter profiles of high incidence angle SAR images. International Journal of Disaster Risk Science, 3(2), 113–122. https://doi.org/10.1007/s13753-012-0011-5
Park, S., Baek, F., Sohn, J., & Kim, H. (2021). Computer vision–based estimation of flood depth in flooded-vehicle images. Journal of Computing in Civil Engineering, 35(2). https://doi.org/10.1061/(ASCE)CP.1943-5487.0000956
Quan, K. A. C., Nguyen, V. T., Nguyen, T. C., Nguyen, T. V., & Tran, M. T. (2020). Flood Level Prediction via Human Pose Estimation from Social Media Images. Proceedings of the 2020 International Conference on Multimedia Retrieval (pp. 479–485). https://doi.org/10.1145/3372278.3390704
Schumann, G. J. P. (2014). Fight floods on a global scale. Nature, 507(7491), 169–169. https://doi.org/10.1038/507169e
Song, Z., & Tuo, Y. (2021). Automated flood depth estimates from online traffic sign images: explorations of a convolutional neural network-based method. Sensors (Basel, Switzerland), 21(16), 5614. https://doi.org/10.3390/s21165614
Surampudi, S., & Kumar, V. (2023). Flood depth estimation in agricultural lands from L and C-band synthetic aperture radar images and digital elevation model. IEEE Access, 11, 3241–3256. https://doi.org/10.1109/ACCESS.2023.3234742
Tabari, H. (2020). Climate change impact on flood and extreme precipitation increases with water availability. Scientific Reports, 10(1), 13768. https://doi.org/10.1038/s41598-020-70816-2
Tao, R., & Xu, J. (2023). Mapping with ChatGPT. ISPRS International Journal of Geo-Information, 12(7), 284. https://doi.org/10.3390/ijgi12070284
U.S. Department of Transportation. (2023). Manual on uniform traffic control devices for streets and highways.
Wang, D., Fu, Y., Liu, K., Chen, F., Wang, P., & Lu, C. T. (2023). Towards automated urban planning: when generative and ChatGPT-like AI meets urban planning. ACM Transactions on Spatial Algorithms and Systems, 9(1), 1. https://doi.org/10.1145/3524302
Wen, C., Hu, Y., Li, X., Yuan, Z., & Zhu, X. X. (2023). Vision-language models in remote sensing: current progress and future trends. arXiv. https://doi.org/10.48550/arXiv.2305.05726
Yin, Y., Val, D. V., Zou, Q., & Yurchenko, D. (2022). Resilience of critical infrastructure systems to floods: a coupled probabilistic network flow and LISFLOOD-FP model. Water, 14(5), 683. https://doi.org/10.3390/w14050683
Yoon, S., Kang, W. Y., Jeon, S., Lee, S., Han, C., Park, J., & Kim, E. S. (2021). Image-to-image retrieval by learning similarity between scene graphs. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10718–10726. https://doi.org/10.1609/aaai.v35i12.17281
Zhang, C., Yankov, D., Wu, C. T., Shapiro, S., Hong, J., & Wu, W. (2020). What is that Building? Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2425–2433). https://doi.org/10.1145/3394486.3403292
Zhang, S., Fu, D., Zhang, Z., Yu, B., & Cai, P. (2023). TrafficGPT: viewing, processing and interacting with traffic foundation models. arXiv. http://arxiv.org/abs/2309.06719
Acknowledgements
The authors would like to thank Samrin Sauda and Faisal Elias for their valuable assistance in manually estimating the floodwater depth from the flood images.
Funding
None.
Author information
Contributions
Conceptualization: Akinboyewa, Li. Methodology, analysis, validation: Akinboyewa, Ning, Lessani, Li. Writing – original draft: Akinboyewa. Writing – review & editing: Akinboyewa, Li, Ning, Lessani. Supervision: Li.
Ethics declarations
Competing interests
None.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
1.1 Prompt used in designing FloodDepth-GPT
- Flood related photos will be an input. Estimate the floodwater depth based on visible reference points in this image. In estimating the flood water depth, consider the following height metrics for common features:
- For human, consider these height metrics: Men: Total height = 1.75m, Knee height = 0.4m, Waist height = 0.9m, Shoulder height = 1.4m; Women: Total height = 1.60m, Knee height = 0.4m, Waist height = 0.8m, Shoulder height = 1.4m.
- For sedan cars, consider these height metrics: the overall height from the ground to the roof is 1.4m, the ground clearance is approximately 0.2m, height from ground to the bottom of the door is 0.6m, height from the ground to the top of the hood is 1.0m, and the height from the ground to the bottom of the window is 0.8m.
- For a truck, consider these height metrics: the overall height from the ground to the roof is 1.8 meters, the ground clearance is approximately 0.5 meters, the height from the ground to the bottom of the door is 0.8 meters, the height from the ground to the top of the hood is 1.3 meters, the height from the ground to the bottom of the window is 1.4 meters.
- For an SUV, consider these height metrics: the overall height from the ground to the roof is 1.7 meters, the ground clearance is about 0.3 meters, the height from the ground to the bottom of the door is 0.7 meters, the height from the ground to the top of the hood is 1.0 meter.
- For a bus, consider these height metrics: the overall height from the ground to the roof is 3.2 meters, the ground clearance is approximately 0.7 meters, the height from the ground to the bottom of the door is 1.0 meter, the height from the ground to the bottom of the window is 2.0 meters.
- For street signage (including stop signs), the dimension of a stop sign (the sign is a red octagon) is 0.9m by 0.9m, while the vertical measurement from the ground to the top of the stop sign, indicating the total height of the stop sign plaque including the pole, is 2.9m. Avoid the reflection of the stop sign in the water. Also, use any other features as a secondary reference.
- Based on the water height against the different parts of each feature, and the average height metrics, estimate the depth of the water.
- Provide the estimated floodwater depth in meters.
- Give the estimation as a discrete number and not an interval.
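In practice, the prompt above is sent to the vision model together with a flood photo in a single request. The sketch below is an illustration, not the authors' code: it assembles a chat-completions payload in the OpenAI vision-input format current at the time of writing (the model name, prompt string, and image URL are placeholders) and extracts the single discrete depth value that the last prompt instruction requests from the model's reply.

```python
import re
from typing import Optional


def build_request(prompt: str, image_url: str,
                  model: str = "gpt-4-vision-preview") -> dict:
    """Assemble a chat-completions payload pairing the Appendix 1 prompt
    with a flood photo URL. Only the payload is built here; sending it
    requires an API client and key."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def parse_depth_m(response_text: str) -> Optional[float]:
    """Pull the first number followed by 'm'/'meter(s)' out of the model's
    reply; the prompt asks for one discrete value in meters."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*(?:m\b|meters?\b)", response_text)
    return float(match.group(1)) if match else None
```

Keeping payload assembly separate from parsing makes the depth extraction testable offline and lets the same parser be reused across the sample responses in Appendix 2.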
Appendix 2
2.1 More sample responses from FloodDepth-GPT
Photo source: https://vaccineimpact.com/wp-content/uploads/sites/5/2017/08/Hurricane-Harvey-flooded-street.jpg.
Photo source: https://ak.picdn.net/shutterstock/videos/32103613/thumb/1.jpg.
Photo source: https://d.newsweek.com/en/full/657216/8-30-17-hurricane-harvey-flooding.jpg.
Photo source: https://ichef.bbci.co.uk/news/976/cpsprodpb/4F47/production/_87559202_hi030779656.jpg.webp.
Photo source: https://images.hamodia.com/hamod-uploads/2018/08/14170042/30165176018_f517fbfee7_o-1024x727.jpg.
Photo source: https://www.usatoday.com/gcdn/presto/2021/09/03/PNJM/a61d84b8-e6f6-4a6e-a712-de6e9fd3627c-090321_Fairfield_WeatherTZ_1816.JPG.
Photo source: https://www.theguardian.com/us-news/2023/sep/30/fema-government-shutdown-weather-disasters#img-1.
Photo source: https://s.hdnux.com/photos/65/15/12/13947547/6/920x920.jpg.
Photo source: https://pyxis.nymag.com/v1/imgs/40f/dae/449daf31e4753f3f74733fcfc9b082efa5-30-harvey-5.rhorizontal.w700.jpg.
Photo source: https://media.zenfs.com/en/the_independent_577/a55c48cbe6dd3449181b2e412b6707c6.
Photo source: https://cdn.theatlantic.com/thumbor/7VUUiVJMgYrI19Rbl_1ltA4SlME=/0x146:3600x2171/976x549/media/img/mt/2017/08/RTX3DKUO/original.jpg.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Akinboyewa, T., Ning, H., Lessani, M. N. et al. Automated floodwater depth estimation using large multimodal model for rapid flood mapping. Computational Urban Science, 4, 12 (2024). https://doi.org/10.1007/s43762-024-00123-3