Overview

The global refugee population has never been as large as it is today, with at least 26 million refugees living in more than 100 countries. Refugees are international migrants who have been forcibly displaced from their home countries due to violence or persecution and who cross an international border and settle elsewhere, most often in a neighboring country. Remote sensing can help refugee leaders, humanitarian agencies, and refugee-hosting countries gain new insights into refugee settlement, population, and land cover change dynamics (Maystadt et al. 2020; Van Den Hoek et al. 2021). In this chapter, we will examine the value of using satellite imagery and satellite-derived data to map a refugee settlement in Uganda, estimate its population, and gauge land cover changes in and around the settlement.

Learning Outcomes
  • Using a range of techniques—maps, videos, and charts—to visualize and measure land cover changes before and after the establishment of a refugee settlement.

  • Understanding the considerations and limitations involved in automated detection of refugee settlement boundaries using unsupervised classification.

  • Becoming familiar with satellite-derived human settlement and population datasets and their application in a refugee settlement context.

Helps if you know how to
  • Import images and image collections, filter, and visualize (Part 1).

  • Perform basic image analysis: select bands, compute indices, create masks, classify images (Part 2).

  • Create a graph using ui.Chart (Chap. 4).

  • Use normalizedDifference to calculate vegetation indices (Chap. 5).

  • Perform pixel-based supervised or unsupervised classification (Chap. 6).

  • Use ee.Reducer functions to summarize pixels over an area (Chaps. 8 and 9).

  • Perform image morphological operations (Chap. 10).

  • Write a function and map it over an ImageCollection (Chap. 12).

  • Use reduceRegions to summarize an image with zonal statistics in irregular shapes (Chaps. 22 and 24).

  • Convert from a vector to a raster representation with reduceToImage (Chap. 23).

  • Write a function and map it over a FeatureCollection (Chaps. 23 and 24).

1 Introduction to Theory

In a humanitarian context, remote sensing data and analysis have become essential tools for monitoring refugee settlement dynamics both immediately after refugee arrival and over the long term. Nonetheless, there remain important challenges to characterizing refugee settlement conditions. First, dwellings, roadways, and agricultural plots tend to be small in size within refugee settlements and generally difficult to detect without the use of very high resolution satellite imagery. Second, dwellings and other structures within refugee settlements may be diffusely distributed and intermixed with vegetation and bare earth. Third, data on settlement features, boundaries, and refugee populations are often out of date or otherwise inappropriate for detailed geospatial analysis. In this chapter, we will examine these challenges and do our best to document refugee settlement dynamics through analysis of multi-date Landsat imagery.

2 Practicum

The study area for this chapter is Pagirinya Refugee Settlement in northwestern Uganda (Fig. 38.1a). As of 2020, Uganda was home to 1.4 million refugees, the fourth-largest refugee population in the world and the largest in Africa (UNHCR 2020). Refugees living in Uganda primarily fled violence in South Sudan and the Democratic Republic of the Congo, and most live in rural refugee settlements. Pagirinya in particular is home to 36,000 South Sudanese refugees and was established in mid-2016.

Fig. 38.1 Maps of (a) UNHCR refugee settlements in Uganda, (b) OpenStreetMap features, roadways, and the UNHCR settlement boundary for Pagirinya, and (c) European Space Agency 2020 WorldCover land cover at Pagirinya

In this practicum, we will visualize and document the land cover changes that have taken place in Pagirinya (Fig. 38.1b, c), use satellite data to estimate the settlement’s boundary and compare it to the official boundary laid out by the United Nations High Commissioner for Refugees (UNHCR), and use satellite-derived demographic products to estimate the refugee population within Pagirinya.

2.1 Section 1: Seeing Refugee Settlements from Above

In preparation for the arrival of refugees, humanitarian actors and refugee settlement planners are often interested in analyzing local land cover conditions before a refugee settlement is established. The goal of this first section is to use Landsat satellite imagery to characterize initial land cover conditions and land cover changes at Pagirinya Refugee Settlement in the years before and following the settlement’s establishment in 2016.

Let’s begin by adding the refugee settlement’s boundary to the Map by loading the FeatureCollection of refugee settlement boundaries in Uganda and filtering to Pagirinya Refugee Settlement. We will also initialize the Map to center on Pagirinya and default to showing the satellite basemap for visual reference.

7 lines of pseudo-code that load the UNHCR settlement boundary for Pagirinya Refugee Settlement.

Next, let’s create annual Landsat composites using the Landsat 8 surface reflectance ImageCollection. We will spatially filter the ImageCollection to a buffered settlement boundary and temporally filter to 2015–2020, which includes the full year before the settlement was established and the four years that followed. We will also apply a cloud filter of less than or equal to 40% to help ensure that our annual composites are cloud free.

For better legibility, we will rename the Landsat bands and add three new spectral index bands to each image in the ImageCollection using the addIndices function, which calculates the Normalized Difference Vegetation Index (NDVI), Normalized Difference Built-up Index (NDBI), and Normalized Burn Ratio (NBR) using normalizedDifference. Each of these metrics offers a different approach to characterizing land cover conditions and change over time. NDVI is commonly used for monitoring vegetation health; NDBI helps to characterize impervious and built-up surfaces; and NBR helps to identify land that has been cleared with fire, a common practice in our study region. Note that other spectral metrics or remote sensing platforms may be better suited for identifying refugee settlements in other regions.

30 lines of pseudo-code that create the buffered settlement boundary geometry (the 500-m buffer size is arbitrary but large enough to capture area outside of the boundary), buffer and convert to a geometry for spatial filtering and clipping, define the L8 SR Collection 2 band names and new names, and create the image collection.
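
All three indices share the same normalized-difference form, (a − b) / (a + b), and differ only in which bands are paired. The following is a minimal sketch of that arithmetic in plain JavaScript rather than the Earth Engine API. It assumes the usual pairings (NIR and red for NDVI, SWIR1 and NIR for NDBI, NIR and SWIR2 for NBR); the reflectance values are invented for illustration, and this addIndices is a stand-in for the chapter's function of the same name, not a copy of it.

```javascript
// Plain JavaScript illustration of the arithmetic behind normalizedDifference.
// All reflectance values below are invented for illustration only.
function normalizedDifference(a, b) {
  // Guard against division by zero for empty or no-data pixels.
  return (a + b) === 0 ? 0 : (a - b) / (a + b);
}

// Stand-in for the chapter's addIndices: returns a copy of the pixel
// with three extra index values attached.
function addIndices(pixel) {
  return Object.assign({}, pixel, {
    ndvi: normalizedDifference(pixel.nir, pixel.red),
    ndbi: normalizedDifference(pixel.swir1, pixel.nir),
    nbr: normalizedDifference(pixel.nir, pixel.swir2)
  });
}

// Two hypothetical pixels: dense vegetation and a built-up surface.
const vegetated = addIndices({red: 0.05, nir: 0.40, swir1: 0.20, swir2: 0.10});
const builtUp = addIndices({red: 0.20, nir: 0.25, swir1: 0.35, swir2: 0.30});

console.log('vegetated', vegetated.ndvi.toFixed(2), vegetated.ndbi.toFixed(2));
console.log('built-up ', builtUp.ndvi.toFixed(2), builtUp.ndbi.toFixed(2));
```

With these invented values, the vegetated pixel has the higher NDVI and a negative NDBI, while the built-up pixel has a positive NDBI, which is the kind of contrast the indices are meant to expose.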

To build annual composites from before and after Pagirinya’s establishment in 2016, let’s create two temporal subsets of the ImageCollection (one from 2015 and one from 2017) and use the median function to composite the images in each time frame (Fig. 38.2). We will also clip the composites to the buffered region around Pagirinya. To visualize the composites, we will use true-color and false-color visualizations as well as an NDVI color palette, which should help us identify and interpret features within and surrounding the settlement boundary.

Fig. 38.2 Pre-establishment (left) and post-establishment (right) true-color composites with Pagirinya Refugee Settlement boundary overlaid in blue

20 lines of pseudo-code that make annual pre- and post-establishment composites, import visualization palettes, and set up true-color and false-color visualization parameters.
43 lines of pseudo-code that display the true-color composites, the false-color composites, and the median NDVI composites; create an empty byte image into which to paint the settlement boundary; convert the settlement boundary geometry to an image for overlay; and display the Pagirinya boundary in blue.
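
To make the compositing step concrete, here is a small sketch of a per-pixel median in plain JavaScript (again, not the Earth Engine API). Each hypothetical "scene" is a flat array of NDVI values over the same four pixels, with null standing in for a masked, cloudy pixel; all values are invented.

```javascript
// Median of a list of numbers.
function median(values) {
  const sorted = values.slice().sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Per-pixel median across a stack of equally sized images.
// null marks a masked pixel and is ignored.
function medianComposite(images) {
  return images[0].map((_, i) => {
    const valid = images.map(img => img[i]).filter(v => v !== null);
    return valid.length ? median(valid) : null;
  });
}

// Three hypothetical NDVI scenes from one year over the same four pixels.
const scenes2015 = [
  [0.70, 0.65, null, 0.72],
  [0.68, 0.10, 0.66, 0.70], // 0.10 mimics a cloud-contaminated value
  [0.71, 0.64, 0.69, null]
];

const composite2015 = medianComposite(scenes2015);
console.log(composite2015);
```

The outlier at the second pixel does not survive into the composite, which is why a median is a common choice when some cloud contamination remains after filtering.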

Now that we have pre- and post-establishment composites to support a visual qualitative assessment, let’s make a complementary quantitative assessment by measuring pre- and post-establishment differences in median NDVI and plotting the distribution of NDVI from both periods.

39 lines of pseudo-code that compare pre- and post-establishment differences in NDVI and chart the NDVI distributions for the pre- and post-establishment periods.
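
The quantitative comparison boils down to two operations: a pixel-by-pixel difference and a histogram of each period's values. The sketch below shows both in plain JavaScript on a handful of invented NDVI values; the bin count and value range are arbitrary choices for illustration.

```javascript
// Difference image: post-establishment minus pre-establishment.
function difference(post, pre) {
  return post.map((v, i) => v - pre[i]);
}

// Fixed-width histogram over [min, max] with the given number of bins.
function histogram(values, min, max, bins) {
  const counts = new Array(bins).fill(0);
  const width = (max - min) / bins;
  values.forEach(v => {
    // Clamp so that out-of-range values fall into the first or last bin.
    const idx = Math.min(bins - 1, Math.max(0, Math.floor((v - min) / width)));
    counts[idx] += 1;
  });
  return counts;
}

// Invented median NDVI values for the same six pixels in each period.
const preNdvi = [0.72, 0.68, 0.70, 0.66, 0.74, 0.69];
const postNdvi = [0.35, 0.62, 0.28, 0.41, 0.70, 0.33];

const diffNdvi = difference(postNdvi, preNdvi);
const preHist = histogram(preNdvi, 0, 1, 5);
const postHist = histogram(postNdvi, 0, 1, 5);

console.log(diffNdvi.map(d => d.toFixed(2)));
console.log('pre ', preHist);
console.log('post', postHist);
```

In this toy example the pre-establishment values pile into a single high-NDVI bin, while the post-establishment values spread toward lower bins, the sort of shift that Question 4 asks you to look for in the real charts.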

In addition to the pre- and post-establishment annual composites, let’s create an annotated video time series of the full 2015–2020 Landsat 8 surface reflectance ImageCollection. We will be able to use this video to view changes at our study refugee settlement image by image.

38 lines of pseudo-code that import a package to support text annotations, define arguments for the animation function parameters, set a property called label for each image, create a new image with the label overlaid using gena’s package, and add a timestamp annotation to all images in the video.

Code Checkpoint A17a. The book’s repository contains a script that shows what your code should look like at this point.

Question 1. How would you describe the land cover type in the area in 2015, before the establishment of the refugee settlement? Is the land cover consistent within the settlement’s boundary in the pre-establishment period? Does the settlement boundary conform to land cover type or condition in any meaningful way?

Question 2. What features (dwellings, roadways, agricultural plots, etc.) present the greatest visual difference between the pre- and post-establishment periods? Comparing the visual differences in true color, false color, and NDVI with the satellite image basemap may be helpful here.

Question 3. Which of the annual composite visualizations (true color, false color, or NDVI) do you prefer for distinguishing the refugee settlement in the post-establishment period, and why?

Question 4. How do the range and mode of NDVI values change from pre- to post-establishment? How might the changes in NDVI distribution correlate to overall changes in land cover type in the post-establishment period?

Question 5. Beyond the rapid establishment of the settlement’s dwellings and roads, what changes do you observe in the time series video? Do these changes occur within or outside the settlement boundary? What kinds of changes do you see in the imagery from 2019 or 2020, well after the settlement was established in 2016?

2.2 Section 2: Mapping Features Within the Refugee Settlement

In Sect. 38.2.1, we used Landsat data to gauge changes in land cover conditions and types, but we can also draw upon data products derived from satellite imagery. For instance, satellite-derived building footprints, which represent geometries of individual structures and dwellings, are often used to estimate human populations and population density in humanitarian contexts and to support the planning and delivery of food and other kinds of aid. In this section, we will identify different features within Pagirinya, which we will use to create a satellite image-based settlement boundary map in Sect. 38.2.3.

Let’s add to our script from Sect. 38.2.1 by first loading the Open Buildings V1 Polygons dataset from the Earth Engine Data Catalog. This dataset includes satellite-derived building footprints based on very high resolution (0.5 m) satellite imagery, and each footprint has a confidence score. Let’s visualize building footprints with a confidence score above 75% as orange and building footprints with a 75% or lower confidence score as purple.

11 lines of pseudo-code that visualize the Open Buildings dataset, showing building footprints with a confidence score above 75% separately from those with a confidence score of 75% or lower.
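
The split by confidence score is a simple threshold rule. As a plain JavaScript sketch (the footprint records and their property names are hypothetical, not the dataset's schema), note that a score of exactly 75% falls into the lower group:

```javascript
// Hypothetical footprint records; only the confidence value matters here.
const footprints = [
  {id: 'a', confidence: 0.92},
  {id: 'b', confidence: 0.75},
  {id: 'c', confidence: 0.68},
  {id: 'd', confidence: 0.81}
];

const THRESHOLD = 0.75;

// Strictly above the threshold is orange; at or below it is purple.
const styledFootprints = footprints.map(f => Object.assign({}, f, {
  color: f.confidence > THRESHOLD ? 'orange' : 'purple'
}));

console.log(styledFootprints.map(f => f.id + ':' + f.color).join(' '));
```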

With a map of building footprints in place, let’s turn to examining other features of interest that we identified in Sect. 38.2.1. Let’s load a FeatureCollection of sample locations of infrastructure, forest, and agriculture visible on the satellite basemap as well as a sample of building footprint locations. Note in the print output that each feature has a value, which represents the feature type. Let’s write a function to use this value property to automatically assign a unique color to each feature as part of a style (Fig. 38.3).

23 lines of pseudo-code that load the land cover samples, create a function to set feature properties based on value, use the class as an index to look up the corresponding display color, and apply the function and view the results.

Fig. 38.3 Feature samples across Pagirinya Refugee Settlement (boundary shown in blue)
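
The idea of using a feature's value as an index into a list of colors can be sketched in plain JavaScript as follows. The class codes, colors, and the setStyle name are assumptions made for this illustration; the chapter's own samples may use different values.

```javascript
// Hypothetical class codes (0-3) and display colors, for illustration only.
const classNames = ['infrastructure', 'forest', 'agriculture', 'building'];
const palette = ['gray', 'darkgreen', 'gold', 'orange'];

// Attach a style to a feature by using its class value as a lookup index.
function setStyle(feature) {
  const value = feature.properties.value;
  return Object.assign({}, feature, {
    style: {color: palette[value], pointSize: 4}
  });
}

const samples = [
  {id: 1, properties: {value: 0}},
  {id: 2, properties: {value: 3}},
  {id: 3, properties: {value: 1}}
];

// Map the function over the whole collection of samples.
const styledSamples = samples.map(setStyle);
styledSamples.forEach(s =>
  console.log(s.id, classNames[s.properties.value], s.style.color));
```

Keeping the colors in one list means that adding a class or changing a color only requires editing the palette, not the function.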

Since we want to use these sample land cover locations to help delineate the refugee settlement boundary, these different land cover types should be spectrally distinguishable from each other. To see how the spectral values vary among different features, let’s create spectral signature plots for the post-establishment period. We first need to add the land cover class to the post-establishment composites that we made in Sect. 38.2.1 so that the class and spectral value information can be referenced together in our spectral signature plots. To do that, let’s use reduceToImage to convert our lcPts FeatureCollection to an image, lcBand, and then add that image to the post-establishment composite.

7 lines of pseudo-code that convert the land cover sample FeatureCollection to an image and add lcBand to the post-establishment composite.

Now we have a postMedian image that we can sample at specific sample locations and identify not only the spectral values but also the class type. Let’s plot the spectral values by class type. Note that since band names are sorted alphabetically on the x-axis, nir values are plotted in between green and red and are therefore out of order with respect to band wavelengths.

41 lines of pseudo-code that define the bands to be visualized in the chart and plot the median band value for each land cover type.
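
A spectral signature plot of this kind summarizes the sampled pixels by class: group the samples by land cover type, then take the median of each band within each group. Here is a minimal plain JavaScript sketch of that grouping, using invented reflectance values, only two bands, and a hypothetical lc property for the class.

```javascript
// Median of a list of numbers.
function median(values) {
  const sorted = values.slice().sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group samples by class, then compute the median of each band per class.
function medianByClass(samples, bands) {
  const groups = {};
  samples.forEach(s => {
    (groups[s.lc] = groups[s.lc] || []).push(s);
  });
  const result = {};
  Object.keys(groups).forEach(lc => {
    result[lc] = {};
    bands.forEach(b => {
      result[lc][b] = median(groups[lc].map(s => s[b]));
    });
  });
  return result;
}

// Invented sampled pixels: class label plus two band values.
const sampled = [
  {lc: 'forest', red: 0.04, nir: 0.42},
  {lc: 'forest', red: 0.05, nir: 0.38},
  {lc: 'forest', red: 0.06, nir: 0.40},
  {lc: 'building', red: 0.22, nir: 0.27},
  {lc: 'building', red: 0.18, nir: 0.25},
  {lc: 'building', red: 0.20, nir: 0.29}
];

const signatures = medianByClass(sampled, ['red', 'nir']);
console.log(signatures);
```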

Remember that we also calculated NDVI, NDBI, and NBR spectral indices in Sect. 38.2.1. Since these bands range from −1 to 1, we have to plot their values separately from the Landsat band spectral signature plots above, which use scaled reflectance values.

41 lines of pseudo-code that define the spectral indices to be visualized in the chart, plot the median index value for each land cover type, set the view window, and create an empty image into which to paint the features, cast to a byte.

Code Checkpoint A17b. The book’s repository contains a script that shows what your code should look like at this point.

Question 6. How would you describe the coverage of the footprints within the settlement? Are there sections of the settlement visible in the basemap or the post-establishment composite that are missing footprints?

Question 7. How do NDVI, NDBI, and NBR change from the pre- to post-establishment period at building footprint locations?

Question 8. Are the spectral profiles of the four feature types distinct from each other? Which profiles are the most similar overall?

Question 9. Which bands or indices provide the greatest separation between the four feature types?

2.3 Section 3: Delineating Refugee Settlement Boundaries

Now that we have become familiar with the different land cover types and the changes that can occur once a refugee settlement is established, let’s turn to formally delineating the refugee settlement from its surroundings by mapping a settlement boundary. Having information on refugee settlement boundaries is helpful for the basic accounting of refugee settlement extent and for confidently attributing land cover or land use changes to a specific refugee settlement (Friedrich and Van Den Hoek 2020; Van Den Hoek and Friedrich 2021). In this section, we will use a k-means unsupervised classifier to generate a settlement/non-settlement map that represents land that has been transformed by the refugee settlement’s establishment or subsequent use. Note that the settlement boundary that we used in Sect. 38.2.1 is a settlement planning boundary established by the UNHCR and so represents the land within the formal boundary that potentially could be accessed or used by refugees.

To start making a binary classification that separates settlement from non-settlement, let’s create a random sample of 500 NDVI values from across the post-establishment composite. Remember that the postMedian composite was clipped to the 500-m-buffered extent of the UNHCR settlement boundary geometry, so these sample sites should be dispersed inside and outside of the UNHCR boundary’s geometry. For parameterization, we only need two values output from the classifier (numClusters = 2) and can set the maximum number of iterations to a low value of 5 (maxIter = 5) and the seed value to an arbitrary value of 21. Now let’s apply the classifier to the post-establishment composite, view the coverage of settlement (pixel value of 1) and non-settlement (pixel value of 0), and visually compare the result with the UNHCR settlement boundary.

25 lines of pseudo-code that create samples to input to a k-means classifier, set up the parameters for k-means, seed the classifier using land cover samples, and apply the k-means classifier.
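
To show what a two-cluster k-means does with NDVI values, here is a simplified one-dimensional version in plain JavaScript. It is a sketch under stated assumptions, not the Earth Engine clusterer: the NDVI samples are invented, the iteration cap of 5 mirrors maxIter, and the random seed is replaced by a deterministic start (one centroid at the minimum value, one at the maximum) so that the result is repeatable. It also assumes that the low-NDVI cluster corresponds to settlement, which would need to be checked against imagery in practice.

```javascript
// One-dimensional k-means with two clusters.
function kMeans1D(values, maxIterations) {
  // Deterministic start: one centroid at the minimum, one at the maximum.
  let centroids = [Math.min(...values), Math.max(...values)];
  let labels = new Array(values.length).fill(0);
  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: each value joins its nearest centroid.
    labels = values.map(v =>
      Math.abs(v - centroids[0]) <= Math.abs(v - centroids[1]) ? 0 : 1);
    // Update step: each centroid moves to the mean of its members.
    const next = centroids.map((c, k) => {
      const members = values.filter((_, i) => labels[i] === k);
      return members.length
        ? members.reduce((s, v) => s + v, 0) / members.length
        : c;
    });
    const converged = next.every((c, k) => c === centroids[k]);
    centroids = next;
    if (converged) break;
  }
  return {centroids: centroids, labels: labels};
}

// Invented NDVI samples from inside and outside a settlement.
const ndviSamples = [0.25, 0.31, 0.28, 0.35, 0.68, 0.72, 0.66, 0.74, 0.30, 0.70];
const clustered = kMeans1D(ndviSamples, 5);

// Cluster 0 starts at the lowest NDVI, so it is treated here as settlement (1).
const settlementMask = clustered.labels.map(l => (l === 0 ? 1 : 0));
console.log(clustered.centroids, settlementMask);
```

Because k-means cluster numbers carry no meaning on their own, any real workflow needs a step like the last one to decide which cluster is settlement.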

The resulting k-means classification looks promising for separating settlement from non-settlement pixels, but it has many gaps in settlement coverage as well as isolated settlement patches and pixels. To produce a single contiguous settlement coverage, let’s apply spatial morphological operations of dilation and erosion on the k-means output. Dilation incrementally expands the boundary of a raster dataset, filling gaps and connecting patches along the way. Conversely, erosion chips away at the outermost pixels, thereby removing the surplus pixels that were added during the dilation step but still maintaining the filled-in gaps.

We will apply these in sequence, first dilation and then erosion, using focal_max and focal_min, respectively. focal_max works as a dilation because it outputs the maximum value detected within the kernel: whenever at least one settlement pixel falls inside the kernel, the output is a settlement pixel, since the settlement pixel value of 1 is greater than the non-settlement pixel value of 0. Since we just need to do some fine-tuning on the boundary of the settlement coverage, we can use a kernel with a small radius of 3. Finally, let’s convert the output of the dilation and erosion to a polygon FeatureCollection where each contiguous patch of pixels becomes its own polygon (Fig. 38.4). Feel free to map the outline of Pagirinya in blue, as above, for a helpful visual reference.

32 lines of pseudo-code that define the kernel used for morphological operations, perform a dilation followed by an erosion, convert the cleaned k-means settlement and non-settlement coverages to polygons, and map the outline of Pagirinya in blue.

Fig. 38.4 K-means output before (left) and after (right) dilation and erosion, with Pagirinya Refugee Settlement boundary overlaid in blue
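
The effect of dilation followed by erosion is easiest to see on a tiny binary grid. The sketch below implements a focal maximum and focal minimum with a square window in plain JavaScript; it uses a radius of 1 on an invented 7 × 7 grid rather than the radius of 3 used in the chapter, and it simply ignores window cells that fall outside the grid, which is an assumption of this sketch.

```javascript
// Apply a reducer (Math.max or Math.min) over a square moving window.
function focal(grid, radius, reduce) {
  const rows = grid.length;
  const cols = grid[0].length;
  return grid.map((row, r) => row.map((_, c) => {
    const windowValues = [];
    for (let dr = -radius; dr <= radius; dr++) {
      for (let dc = -radius; dc <= radius; dc++) {
        const rr = r + dr;
        const cc = c + dc;
        // Cells outside the grid are ignored.
        if (rr >= 0 && rr < rows && cc >= 0 && cc < cols) {
          windowValues.push(grid[rr][cc]);
        }
      }
    }
    return reduce(...windowValues);
  }));
}

const focalMax = (grid, radius) => focal(grid, radius, Math.max); // dilation
const focalMin = (grid, radius) => focal(grid, radius, Math.min); // erosion

// 1 = settlement, 0 = non-settlement. Note the one-pixel gap in the middle.
const kMeansOutput = [
  [0, 0, 0, 0, 0, 0, 0],
  [0, 0, 0, 0, 0, 0, 0],
  [0, 0, 1, 1, 1, 0, 0],
  [0, 0, 1, 0, 1, 0, 0],
  [0, 0, 1, 1, 1, 0, 0],
  [0, 0, 0, 0, 0, 0, 0],
  [0, 0, 0, 0, 0, 0, 0]
];

const dilated = focalMax(kMeansOutput, 1);
const cleaned = focalMin(dilated, 1);
cleaned.forEach(row => console.log(row.join(' ')));
```

After both steps the patch has its original outer extent, but the interior gap is filled: the dilation closes the hole, and the erosion removes only the pixels that the dilation added around the outside.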

We have created a usable vector map of settlement and non-settlement polygons, but we are aiming for a single polygon that represents the settlement boundary. To filter these polygons to a single polygon that represents the refugee settlement’s boundary, let’s use a simple logic rule and select the polygon that has the largest overlap (i.e., intersected area) with the UNHCR boundary (Fig. 38.5).

14 lines of pseudo-code that intersect the k-means polygons with the UNHCR settlement boundary, return the intersection area as a feature property, and sort to select the polygon with the largest overlap with the UNHCR settlement boundary.

Fig. 38.5 K-means settlement boundary (black) overlaid by UNHCR settlement boundary (blue)
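
The selection rule (compute each polygon's overlap with the reference boundary, sort, keep the largest) can be sketched in plain JavaScript. To keep the geometry trivial, this illustration uses axis-aligned rectangles with invented coordinates instead of real polygons; the names are hypothetical.

```javascript
// Area of overlap between two axis-aligned rectangles (0 if they are disjoint).
function intersectionArea(a, b) {
  const w = Math.min(a.xmax, b.xmax) - Math.max(a.xmin, b.xmin);
  const h = Math.min(a.ymax, b.ymax) - Math.max(a.ymin, b.ymin);
  return w > 0 && h > 0 ? w * h : 0;
}

// Invented stand-ins for the UNHCR boundary and the k-means polygons.
const unhcrBoundary = {xmin: 0, ymin: 0, xmax: 10, ymax: 8};
const kMeansPolygons = [
  {id: 'p1', xmin: -2, ymin: 1, xmax: 9, ymax: 7},
  {id: 'p2', xmin: 9, ymin: 6, xmax: 12, ymax: 9},
  {id: 'p3', xmin: 20, ymin: 20, xmax: 25, ymax: 25}
];

// Store the overlap as a property, then sort in descending order.
const ranked = kMeansPolygons
  .map(p => Object.assign({}, p, {overlap: intersectionArea(p, unhcrBoundary)}))
  .sort((a, b) => b.overlap - a.overlap);

const kMeansBoundary = ranked[0];
console.log(kMeansBoundary.id, kMeansBoundary.overlap);
```

Note that the selected polygon may extend beyond the reference boundary, as p1 does here; the rule ranks by overlap but does not clip the result.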

Code Checkpoint A17c. The book’s repository contains a script that shows what your code should look like at this point.

Question 10. In your opinion, does the k-means boundary accurately separate the settlement from its surroundings? Considering differences between the UNHCR boundary and the k-means boundary, comment on potential errors of commission (areas that are inaccurately included in the k-means boundary) and omission (areas that are inaccurately excluded).

Question 11. Rather than collecting samples for input to k-means based only on NDVI in the postMedian image, adjust the script above to sample from all bands in postMedian. How does the resulting settlement polygon differ? Does increasing the amount of spectral information available to the classifier improve the result?

Question 12. Rerun the k-means classifier based on the diffMedian image from Sect. 38.2.1 rather than the postMedian image while keeping the other parameters the same. How does the resulting settlement boundary polygon differ?

2.4 Section 4: Estimating Refugee Population Within the Settlement

Thus far, we have looked at land cover conditions and land cover changes at Pagirinya and used that information to help map the extent of the settlement. Let’s turn toward using satellite-derived data to estimate the size of the refugee population at Pagirinya. Knowing how many refugees are at a settlement is essential for gauging the need for food aid and for guiding sustainable development and disaster risk reduction efforts. Satellite-informed population estimates can be useful for these purposes, especially if no other data are available.

In this final section, we will work with several datasets designed to estimate the geographic distribution of human populations, each of which is based in part on remote sensing detection of buildings. We will analyze population estimates at Pagirinya Refugee Settlement from the Global Human Settlement Layer (GHSL), High Resolution Settlement Layer (HRSL), and WorldPop data products. To gauge the accuracy of these products, we will compare the population estimates with UNHCR-recorded refugee population data from September 2020.

These versions of HRSL and WorldPop are from 2020, and this version of GHSL has data for multiple years, most recently 2015. Let’s filter the GHSL ImageCollection to only the 2015 dataset. We will also rename all relevant bands to ‘population’ for consistency and visualize all population maps using the same approach to support a direct comparison. Use the Inspector tool to identify the different pixel-level values for each population dataset within and around Pagirinya. These values represent the human population estimated to be present at each pixel.

21 lines of pseudo-code that set up a visualization to be shared by all population datasets and map the population datasets.

You will notice that each dataset has a different spatial resolution (also commonly referred to as the “scale”). We will need to know these different spatial resolutions when we summarize each dataset’s population estimate across Pagirinya using reduceRegion. Once we have a population estimate, we will add it to the Pagirinya feature as a new property.

36 lines of pseudo-code that collect the spatial resolution of each dataset, summarize population totals for each population product at each settlement, and assign them as new properties of the UNHCR boundary feature.
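
Conceptually, a settlement-level total is the sum of the per-pixel population counts for pixels that fall inside the boundary, computed on each dataset's own pixel grid. The plain JavaScript sketch below mimics that with two made-up datasets; the dataset names, scales, pixel values, and the inside flag are all hypothetical and stand in for what a regional sum would handle.

```javascript
// Two hypothetical datasets with different pixel sizes (scale, in meters).
// Each pixel carries a population count and a flag for whether its center
// falls inside the settlement boundary.
const datasets = {
  coarse: {scale: 250, pixels: [
    {population: 900, inside: true},
    {population: 400, inside: false}
  ]},
  fine: {scale: 30, pixels: [
    {population: 12, inside: true},
    {population: 9, inside: true},
    {population: 7, inside: false}
  ]}
};

// Sum the population of pixels inside the boundary.
function sumPopulation(dataset) {
  return dataset.pixels
    .filter(p => p.inside)
    .reduce((total, p) => total + p.population, 0);
}

// Store each total (and the scale it was computed at) as a feature property.
const settlementFeature = {name: 'Pagirinya', properties: {}};
Object.keys(datasets).forEach(name => {
  settlementFeature.properties[name + 'Population'] = sumPopulation(datasets[name]);
  settlementFeature.properties[name + 'Scale'] = datasets[name].scale;
});

console.log(settlementFeature.properties);
```

Summing at a scale other than a dataset's native resolution would resample the pixels and can distort the totals, which is why the resolution of each product is recorded first.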

Now we have three very different population estimates for Pagirinya based on the three population datasets. Let’s see how they compare to the population recorded in 2020 by UNHCR, which is also stored as a property of the Pagirinya feature.

To do so, we will simply subtract the UNHCR population total from each dataset’s estimated population total and store each difference as a new property. A negative difference indicates an underestimation of the UNHCR-recorded population, and a positive difference indicates an overestimation.

16 lines of pseudo-code that measure the difference between each population product and the UNHCR-recorded population values and update the UNHCR boundary feature with population difference properties.
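
The sign convention can be checked with a few lines of plain JavaScript. The reference value below is the figure of 36,000 quoted earlier in the chapter, used here only as a stand-in for the September 2020 UNHCR record; the three product estimates are invented placeholders and are not the values returned by GHSL, HRSL, or WorldPop.

```javascript
// Reference population: the 36,000 figure quoted earlier in the chapter.
const unhcrPopulation = 36000;

// Invented placeholder estimates; NOT actual dataset results.
const estimates = {productA: 1200, productB: 21500, productC: 48000};

// Difference = estimate minus reference.
// Negative means underestimation; positive means overestimation.
const differences = {};
Object.keys(estimates).forEach(name => {
  differences[name + 'Diff'] = estimates[name] - unhcrPopulation;
});

Object.keys(differences).forEach(key => {
  const d = differences[key];
  console.log(key, d, d < 0 ? 'underestimate' : 'overestimate');
});
```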

Code Checkpoint A17d. The book’s repository contains a script that shows what your code should look like at this point.

Question 13. Visually interpret the coverage of each population dataset alongside the building footprint data from Sect. 38.2.2. Which population dataset seems to better capture population density at hot spots of building footprints?

Question 14. Many buildings in Pagirinya are not household dwellings but rather administrative offices, shops, food market buildings, etc., and such differences in building use are not necessarily considered in generating the population estimates. How would the inclusion of non-dwellings in population datasets bias settlement-level population estimates?

Question 15. Note that the coverage of the WorldPop population data at Pagirinya is not wholly contained within the UNHCR settlement boundary. Is this “spillover” better captured by the k-means boundary from Sect. 38.2.3?

3 Synthesis

You may have noticed that we showed a 2020 land cover map from the European Space Agency (ESA) based on Sentinel-1 and Sentinel-2 data in Fig. 38.1c but did not make use of those land cover data in the practicum. How would your settlement boundary detection approach and results change if you used Sentinel-2 instead of Landsat data and sampled land cover sites from this ESA dataset? As a homework challenge, please complete the following assignment.

Assignment 1. Use Sentinel-2 surface reflectance data collected in 2020. Collect 20 samples of each land cover class in the ESA land cover product within Pagirinya using ee.Image.stratifiedSample. Assess the spectral separability between land cover classes. Then, run a modified k-means classifier that makes use of Sentinel-2 NDVI values collected across the ESA land cover map.

4 Conclusion

This chapter introduced approaches for characterizing land cover dynamics within and surrounding Pagirinya Refugee Settlement using a range of open-access satellite data and geospatial products. We saw that satellite remote sensing approaches are effective for characterizing land cover changes before and following the establishment of Pagirinya in 2016, and for delineating a refugee settlement boundary that represents land directly affected by the settlement’s establishment and use. We also noted wide disagreement and pronounced inaccuracies in Pagirinya refugee population estimates based on satellite-informed human population datasets. This chapter shows the value of remote sensing for long-term monitoring of refugee settlements as well as the need for deeper integration of humanitarian data and scenarios in remote sensing applications.