Introduction

As environmental and social threats to society change faster than in recent centuries, development banks and the countries they serve have a growing need for faster, globally scalable, and locally relevant risk information. Big Data can range from gigabytes (call detail records), to terabytes (satellite data), to petabytes (web traffic), with each order of magnitude requiring its own algorithms to extract the signal from the noise. This chapter explores how one type of sensor data, satellite imagery, can be made more useful through an application built on a cloud computing platform, Google Earth Engine, to turn data into insight for decision-makers on the ground.

The wide availability of satellite imagery, and the growing computing power available to process it, are essentially meaningless for practical development without demand-driven applications that turn the information into insights and ensure those insights are locally relevant. We need an “application revolution” in step with the data revolution (UN Data Revolution Group 2014). To take one use case, each year millions of people and billions of dollars’ worth of assets around the world are affected by floods, which cause more economic, social, and humanitarian losses worldwide than any other type of hazard (UNISDR 2015). Yet the communities most at risk from environmental and societal change, especially those in the developing world, often lack the information they need to understand, prepare for, and respond to these threats.

As one example, this chapter describes a satellite-imagery-based rapid mapping tool that distills massive datasets about flood vulnerability into useful risk information for Senegal. The approach combines machine learning, remote sensing, and census data to predict socio-physical vulnerability to floods, and dynamically delivers that information to decision-makers through a web platform. It demonstrates the potential for operationalizing remote sensing data to better understand and address information gaps about flood risk in Senegal and around the world.

More broadly, this case study helps reveal opportunities for integrating Earth Observation with other types of data. Combining data from mobile phones, social media, and other sources with satellite imagery and official statistics can add a near-real-time layer to remotely sensed land and disaster detection, enabling, for instance, deforestation alerts and disaster response tools. Analysis of Call Detail Records (CDR) provides a picture of population distribution at a finer temporal scale, showing where people are at the moment a remotely sensed flood hits an area. This can be critical in areas with high rates of seasonal migration or daily commuting. CDR data can also be used to estimate how and where affected populations migrate after a disaster (Wilson et al. 2016; Lu et al. 2016). Pairing news scraping technologies with Earth Observation data via cloud computing platforms like Earth Engine makes it possible to provide more accurate spatial detection that responds to input from observers and decision-makers on the ground (see Fig. 3). These new data mining techniques make it possible to update flood maps more quickly, or to learn of floods that are not mentioned in international disaster declarations or recorded in current databases such as the Dartmouth Flood Observatory.
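As a rough illustration of how CDR data could show who is in an area when a flood hits, the sketch below finds the subscribers whose most recent call before the flood was routed through a tower inside the flooded area. The record layout, tower identifiers, timestamps, and the six-hour window are all hypothetical assumptions made for this example; the chapter does not describe a specific CDR processing method.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (anonymized subscriber id, tower id, call time).
Record = Tuple[str, str, datetime]

def subscribers_in_flood_zone(
    records: Iterable[Record],
    flooded_towers: Set[str],
    flood_time: datetime,
    window: timedelta = timedelta(hours=6),  # assumed look-back window
) -> Set[str]:
    """Subscribers whose latest call within the window used a flooded tower."""
    last_seen: Dict[str, Tuple[datetime, str]] = {}
    for subscriber, tower, when in records:
        # Ignore calls made after the flood or too long before it.
        if not (flood_time - window <= when <= flood_time):
            continue
        if subscriber not in last_seen or when > last_seen[subscriber][0]:
            last_seen[subscriber] = (when, tower)
    return {s for s, (_, tower) in last_seen.items() if tower in flooded_towers}

# Invented example data: tower T2 lies inside the remotely sensed flood extent.
flood_time = datetime(2015, 9, 1, 18, 0)
records = [
    ("u1", "T1", datetime(2015, 9, 1, 14, 0)),
    ("u1", "T2", datetime(2015, 9, 1, 17, 30)),  # u1 moved into the flooded cell
    ("u2", "T2", datetime(2015, 9, 1, 16, 0)),
    ("u3", "T3", datetime(2015, 9, 1, 17, 0)),
    ("u4", "T2", datetime(2015, 8, 31, 9, 0)),   # outside the look-back window
]
exposed = subscribers_in_flood_zone(records, {"T2"}, flood_time)
print(sorted(exposed))
```

Using only the latest call per subscriber is one simple design choice; a production analysis would also need to account for tower coverage areas and for subscribers who make no calls in the window.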

Why Senegal? An Example of the Critical Information Gap in Disaster Management

In Senegal, flood risk is constantly shifting due to a changing climate and urban settlement. As in much of the Sahel, Senegal has a history of highly uncertain climatic conditions, varying between cycles of drought and eras of frequent and severe flooding. After several very dry decades (1968–1997), average rainfall increased 35% between 2000 and 2005 (Nicholson 2005). In addition to a changing climate, Senegal has undergone significant land use change, triggered by extreme drought events in the 1970s, 1980s, and 1990s that forced rural populations into urban areas (Goldsmith et al. 2004). The peak urbanization rate of Senegal’s capital, Dakar, was estimated at around 7–8%, and 44% of Senegalese currently live in urban areas (Mbow et al. 2008).

Despite the high degree of vulnerability and rapidly changing risk, risk data and vulnerability modeling for Senegal remain inadequate. Without this critical flood risk information, communities, governments, donors, and others in development lack clear guidance on how to prioritize resources, which limits their ability to make effective infrastructure investments, devise emergency plans, or provide relief assistance when needs arise (Hellmuth et al. 2011; Mitchell et al. 2010).

Socio-Physical Vulnerability to Flooding in Senegal

Cloud to Street’s approach to vulnerability combines new big data analysis tools with rapid assessment disaster science to fill information gaps about exposure and vulnerability to flooding in Senegal. With the support of Agence Française de Développement, social and physical vulnerability models were developed for Senegal and combined to determine the country’s exposure to flooding and estimate its social vulnerability to future hazard.

Flood detection: The method first estimates flood exposure by creating a historic inventory of major floods in Senegal. A list of past flood events in Senegal was assembled from a number of publicly available information sources (see Fig. 1).

Fig. 1

Number of times an area (pixel) flooded between 2003 and 2015 in Senegal, mapped using the Dartmouth Flood Observatory (DFO) algorithm at a resolution of 250 m per pixel (left: Ziguinchor, Senegal; right: Saint-Louis, Senegal and Senegal River)
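The per-pixel flood count mapped in Fig. 1 can be illustrated with a small sketch. It assumes each historic flood event has already been classified into a binary mask (1 = flooded, 0 = not flooded) on a common grid; the masks and the function name are hypothetical, and the chapter's actual processing ran in Google Earth Engine with the DFO algorithm rather than with this code.

```python
from typing import List

Grid = List[List[int]]

def flood_frequency(event_masks: List[Grid]) -> Grid:
    """Count, for every pixel, how many event masks mark it as flooded."""
    if not event_masks:
        raise ValueError("need at least one flood mask")
    rows, cols = len(event_masks[0]), len(event_masks[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for mask in event_masks:
        if len(mask) != rows or any(len(row) != cols for row in mask):
            raise ValueError("all masks must share the same grid shape")
        for i in range(rows):
            for j in range(cols):
                counts[i][j] += 1 if mask[i][j] else 0
    return counts

# Three invented flood events on a 2 x 3 pixel grid (1 = flooded).
masks = [
    [[1, 0, 0], [1, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 0]],
]
frequency = flood_frequency(masks)
print(frequency)  # [[3, 1, 0], [1, 3, 0]]
```

Pixels with high counts are the recurrently flooded areas, which is the kind of inventory that can then serve as training data for a predictive model.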

Machine learning hydrology: Using the inventory of past floods as training data, a machine learning model was developed in Google Earth Engine to predict which parts of the country and its population are at risk from future extreme floods in five priority watersheds. The floodplains cover 34% of the country. The Saint-Louis region, in the Senegal River Valley, was the primary testing ground for customizing the algorithm; there, the authors designed and assessed four machine learning approaches on 11 flood conditioning factors. The model’s high accuracy rate in predicting the training data (58–98%, depending on the watershed) demonstrates that machine learning algorithms can successfully predict floods using remote sensing.
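The idea of training a classifier on past floods and then scoring its accuracy on the training data can be sketched as follows. This passage does not name the four machine learning approaches or the 11 conditioning factors, so the example substitutes a simple nearest-centroid classifier and two invented factors (elevation in metres and distance to the nearest river in kilometres); every value below is hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

# One training pixel: (conditioning factors, label), 1 = flooded, 0 = not flooded.
Sample = Tuple[Sequence[float], int]

def fit_centroids(samples: List[Sample]) -> Dict[int, List[float]]:
    """Average the factor vectors of each class to get one centroid per class."""
    grouped: Dict[int, List[Sequence[float]]] = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(centroids: Dict[int, List[float]], features: Sequence[float]) -> int:
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

def accuracy(centroids: Dict[int, List[float]], samples: List[Sample]) -> float:
    """Share of samples whose predicted label matches the observed label."""
    hits = sum(predict(centroids, features) == label for features, label in samples)
    return hits / len(samples)

# Invented training pixels: (elevation_m, distance_to_river_km), flooded?
training_pixels: List[Sample] = [
    ((2.0, 0.5), 1), ((3.0, 1.0), 1), ((4.0, 0.6), 1),
    ((20.0, 5.0), 0), ((25.0, 6.0), 0), ((15.0, 4.0), 0),
]
centroids = fit_centroids(training_pixels)
train_accuracy = accuracy(centroids, training_pixels)
print(train_accuracy, predict(centroids, (5.0, 1.5)), predict(centroids, (18.0, 4.5)))
```

Accuracy on the training data, as reported in the text, tends to overstate how well a model generalizes; holding out some flood events for validation would give a more conservative estimate. The factors here are also left unscaled, so elevation dominates the distance calculation.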

Social vulnerability to flooding: Identifying the social conditions that make one community more likely than another to experience loss from a disaster (loss of life, loss of livelihood, lack of recovery) is critical to understanding the threat of and resilience to flooding in Senegal. Experts at Cloud to Street conducted a literature review and a PCA-based factor analysis to assess social vulnerability for Senegal, using anonymized data from Senegal’s 2013 census, obtained through a partnership with Data-Pop Alliance and the Agence Nationale de la Statistique et de la Démographie du Sénégal.

Results: In the five priority watersheds, the method predicts a floodplain of 5596 km². Of this area, 30% is a high-risk zone where over 97,000 people live. Additionally, approximately five million people live in the 30 arrondissements that have very high social vulnerability profiles compared to other arrondissements. Five underlying dimensions drive vulnerability in Senegal: (1) a lack of basic informational resources, (2) old age, (3) disabilities, (4) being disconnected from dense hubs, and (5) population increase from internal migration. These five factors explain about 69% of the variation in the selected census variables.
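The statement that a handful of factors "explain" a share of the variation refers to the standard PCA quantity: the sum of the eigenvalues of the retained components divided by the sum of all eigenvalues. A minimal sketch of that calculation follows. The eigenvalues are invented so the arithmetic is easy to follow; they are not the chapter's values, which is why the result differs from the 69% reported above.

```python
from typing import Sequence

def explained_variance_share(eigenvalues: Sequence[float], n_factors: int) -> float:
    """Share of total variance captured by the n_factors largest eigenvalues."""
    if not 0 < n_factors <= len(eigenvalues):
        raise ValueError("n_factors must be between 1 and the number of eigenvalues")
    ranked = sorted(eigenvalues, reverse=True)
    return sum(ranked[:n_factors]) / sum(ranked)

# Hypothetical eigenvalues for nine standardized census variables.
hypothetical_eigenvalues = [3.0, 2.0, 1.5, 1.0, 0.9, 0.6, 0.5, 0.3, 0.2]
share = explained_variance_share(hypothetical_eigenvalues, 5)
print(f"Five factors explain {share:.0%} of the variance")
```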

Combined socio-physical vulnerability of Senegal: Several of the arrondissements identified as having high biophysical risk were also found to have high or very high social vulnerability (see Fig. 2). These preliminary results show promise for cheaper and faster ways to gather the flood information critical to disaster management and risk reduction. However, because this assessment considered only five priority watersheds, a complete nationwide assessment of the biophysical risk profile would be necessary to yield insights into combined socio-physical vulnerability across the country. Because the method relies on Earth-observing satellites, new information can be added after each flood event to retrain the machine learning algorithms on the fly and improve prediction accuracy as the model responds to new training data.

Fig. 2

Combined socio-physical vulnerability map for five test watersheds of Senegal, with a mock demonstration of the Senegal Flood Risk Dashboard
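One simple way to identify arrondissements where the two assessments overlap is to flag units that rank at or above a chosen class in both. The sketch below does this for ordinal classes. The class labels, the threshold rule, and the unit names are assumptions made for illustration, since the chapter does not specify how the biophysical and social layers were combined.

```python
from typing import Dict, List

# Assumed ordinal classes, from lowest to highest.
LEVELS = ["low", "medium", "high", "very high"]

def combined_hotspots(
    biophysical: Dict[str, str],
    social: Dict[str, str],
    threshold: str = "high",
) -> List[str]:
    """Units rated at or above the threshold in BOTH assessments."""
    cutoff = LEVELS.index(threshold)
    shared = biophysical.keys() & social.keys()  # units present in both layers
    return sorted(
        unit for unit in shared
        if LEVELS.index(biophysical[unit]) >= cutoff
        and LEVELS.index(social[unit]) >= cutoff
    )

# Invented ratings for four placeholder administrative units.
biophysical_risk = {"Unit A": "high", "Unit B": "low", "Unit C": "very high", "Unit D": "high"}
social_vulnerability = {"Unit A": "very high", "Unit B": "very high", "Unit C": "medium", "Unit D": "high"}
hotspots = combined_hotspots(biophysical_risk, social_vulnerability)
print(hotspots)  # ['Unit A', 'Unit D']
```

Requiring both ratings to pass the threshold is a conservative choice; a weighted index would instead surface units that are extreme on only one dimension.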

Using Big Data Information in Disaster Management

The flood risk information from this assessment can provide valuable insights for decision-makers at all stages of the disaster management cycle, from disaster risk reduction and event prediction to response and recovery (see Fig. 3). A flood vulnerability assessment map serves as a hotspot analysis, indicating areas of high need to prioritize for preparedness funding and resources. Development banks that fund infrastructure, social programs, and other projects can better target investments and protect assets. National and state governments in flood-prone areas, which need to protect their citizens, their citizens’ livelihoods, and their economies over the long term, can prepare the areas most at risk and the communities most likely to experience loss when hazards hit. Search and rescue agencies and humanitarian NGOs, which aim to minimize human casualties and reduce immediate and long-term suffering from flooding, can use a real-time version of this assessment to mobilize support to locations in need when help is most critical, that is, in the 24 h immediately after a disaster.

Fig. 3

The flood risk information can provide valuable insights for a variety of decision-makers at all stages of the disaster management cycle—from disaster risk reduction and event prediction to response and recovery

The Future of This Approach Globally and Locally as a Practical Tool

The science and technology outlined here constitute a baseline assessment, and more research is needed to fully operationalize the approach for global use. This test model produces floodplain predictions that range from 30 to 250 m in spatial resolution because it was trained on publicly available NASA satellite data, but the same machine learning model could produce results at finer resolution if appropriate imagery were available for floods in the country. Several next steps are within reach: increasing the spatial and temporal resolution of the flood model using private data (1–5 m resolution, with a daily or weekly return period, as is possible with the Planet satellite fleet); expanding the inventory of mapped floods for Senegal (by scraping news and using crowdsourcing); predicting the floodplain for the entire country (by automating optimized machine learning models in the cloud); adding flood depth and economic damage predictions (by developing functions to extract depth from optical and radar images); and using predictive analytics and private data in the social vulnerability analysis (using cellphone data released by Orange to researchers). A web-enabled version of the map results could be set to stream satellite imagery from public and private sensors and to accept crowdsourced contributions in near real time, so that the vulnerability analysis for Senegal is updated each time satellites beam new imagery to earth, with the mere refresh of a browser page, which can be essential in developing countries.

Remote-sensing-based tools for development need to scale globally so that virtually everyone can take advantage of the data revolution, especially as risk changes faster than ever. However, this information must always be considered in context with other situational information from the ground when used for local or even larger-scale decision-making. Applications of remote sensing tools for development need to be designed with the flexibility and humanity required to protect the dignity of the community being analyzed and to engage its members in the analysis from the start. Future stages of the vulnerability assessment described in this chapter can engage local communities to verify the location of historic flooding, “ground-truth” the social vulnerability assessment indicators, and add fidelity to the machine-learning-based physical flooding analysis. For instance, digital crowdsourcing applications and on-the-ground participatory science could increase the accuracy and richness of the science and help build local resilience through risk awareness and better national disaster management.

Conclusion

Satellite imagery now produces a wealth of data that can be fused with other data and processing techniques such as CDRs and news scraping. Yet the wealth of data captured from above the earth and analyzed in the cloud does not always trickle down into information that can be used on the ground to promote development and increase resilience. There is therefore a need, and great potential, for applications that integrate new data sources and techniques with existing Earth Observation science to turn data into useful information to analyze threats to development such as flood hazards and enable managers on the ground to respond.