Overview

After high blood pressure and smoking, air pollution is the third-largest risk factor for death globally (Murray et al. 2020). Air pollution can therefore be described as a global “pandemic” that should arguably be monitored and addressed with the same intensity as the COVID-19 pandemic. Remote sensing and cloud computing technologies allow us to do so.

The purpose of this chapter is to explore and analyze gridded air pollution data from Sentinel-5P in the context of changes brought about by COVID-19 lockdowns. Practical components will include analyzing changes in nitrogen dioxide (NO2) concentrations over time and quantifying population-weighted NO2 concentrations for selected administrative units.

Learning Outcomes
  • Understanding Sentinel-5P data.

  • Quantifying changes in air pollutant concentrations over time.

  • Generating a split-panel map to compare two time epochs.

  • Calculating population-weighted air pollutant concentrations.

Helps if you know how to

  • Import images and image collections, filter, and visualize (Part I).

  • Create a graph using ui.Chart (Chap. 4).

  • Perform basic image analysis: select bands, compute indices, create masks (Part II).

  • Use ee.Reducer functions to summarize pixels over an area (Chaps. 8 and 9).

  • Write a function and map it over an ImageCollection (Chap. 12).

  • Mask cloud, cloud shadow, snow/ice, and other undesired pixels (Chap. 15).

  • Design user interfaces for an Earth Engine App (Chap. 30).

1 Introduction to Theory

Air pollution can be generally defined as any chemical, physical, or biological agent that alters the natural composition of the atmosphere. Pollutants that are of primary concern for public health include particulate matter with diameter less than 2.5 μm (PM2.5), carbon monoxide (CO), ozone (O3), NO2, and sulfur dioxide (SO2). Globally, chronic exposure to air pollution results in greater loss of life than HIV/AIDS, malaria, and tuberculosis combined, and more than an order of magnitude more deaths than all forms of violence (Lelieveld et al. 2020). Exposure to PM2.5 and O3 is estimated to result in ~ 4.7 million excess deaths annually across the globe (Murray et al. 2020), although these estimates range between 3 and 10 million excess deaths per year, based on the disease categories considered and the exposure–response function used (Burnett et al. 2018; Chowdhury et al. 2022). Exposure to NO2 may result in 4 million new pediatric asthma cases annually (Achakulwisut et al. 2019).

Knowledge about the global distribution of these air pollutants and their sources has improved over the last decade, with the expansion of networks of ground-based monitors in many countries, the evolution of satellite products, and the advancement of complex atmospheric chemistry models. Studies have found that more than 70% of the global health burden from air pollution is attributable to anthropogenic emissions (Chowdhury et al. 2022; Lelieveld et al. 2019). The main anthropogenic sources of air pollution are industries, motor vehicles, power generation, agricultural activities, and household combustion, while non-anthropogenic sources include desert dust, biogenic emissions, forest fires, and even volcanoes. The reduction in transport and industrial activity during the COVID-19 lockdowns significantly reduced global air pollution levels, thereby highlighting the significance of anthropogenic emissions (Venter et al. 2020). In fact, it is estimated that the decline in air pollution during the first five months of 2020 resulted in 49,900 avoided deaths and 89,000 fewer pediatric asthma emergency room visits (Venter et al. 2021).

Despite the recent growth in monitoring networks, the air in most regions of Earth is insufficiently monitored, limiting air quality management. Given the paucity of ground-based monitoring, alternative monitoring approaches such as satellite remote sensing are gaining popularity and becoming more accurate (e.g., Griffin et al. 2019). Over the past few decades, we have had increasing access to a range of satellite sensors that monitor the contents of Earth’s atmosphere. However, it is important to note that satellites measure pollutant concentrations in the troposphere and stratosphere, which extend for many kilometers above the Earth’s surface. As a result, satellite measurements are not necessarily representative of the concentrations humans are exposed to on the ground, and consequently, relying on satellite data alone for human health applications is not advised. However, more sophisticated methods combine information from satellite remote sensing data, complex atmospheric chemistry models, and ground-based monitors to provide ground-level concentrations of pollutants with high confidence (Dey et al. 2020; van Donkelaar et al. 2021).

2 Practicum

2.1 Section 1: Data Importing and Cleaning

There is a range of satellite-based datasets on air pollution to choose from in the Earth Engine Data Catalog. The main datasets relevant to air pollution include the Moderate Resolution Imaging Spectroradiometer and Advanced Very-High-Resolution Radiometer for monitoring aerosol optical depth (a proxy for PM2.5); the Total Ozone Mapping Spectrometer and Ozone Monitoring Instrument for monitoring O3; and more recently the TROPOspheric Monitoring Instrument (TROPOMI) on board the Sentinel-5 Precursor (Sentinel-5P), which monitors a range of air pollutants. We will use Sentinel-5P in this practicum, but the methods covered here are easily transferable to the datasets mentioned above.

Now, let’s load the satellite data for this practicum. If you search “tropomi” in the Earth Engine Data Catalog, you will see a range of datasets from Sentinel-5P, which can all be of value in quantifying air quality (Fig. 35.1).

Fig. 35.1 Earth Engine Data Catalog results for the search term “tropomi”

Although Sentinel-5P was launched in October 2017, the data available for analysis in Earth Engine are from July 2018 onward. TROPOMI, the sensor on board the satellite, is a spectrometer sensing ultraviolet, visible, near-infrared, and shortwave infrared wavelengths to monitor NO2, O3, aerosol, methane (CH4), formaldehyde, CO, and SO2 in the atmosphere. The swath width of TROPOMI is approximately 2600 km on the ground, resulting in a global daily coverage with a spatial resolution of 7 × 7 km. All of the Sentinel-5P datasets, except CH4, have two versions: Near Real-Time (NRTI) and Offline (OFFL); CH4 is available as OFFL only. The NRTI assets cover a smaller area than the OFFL assets but appear more quickly after acquisition. The OFFL assets have a delayed availability, but each asset contains data from an entire orbit and is arguably easier to work with for retrospective analyses. We will use the OFFL NO2 product in this practicum.

First we need to define an area of interest. Wuhan is infamous for being the epicenter of the COVID-19 pandemic and witnessed severe lockdowns. In the next section of this practicum, we will test to see if we can detect a reduction in NO2 during the early 2020 lockdowns in the surrounding province, Hubei. To start, in the code below, we import a global dataset of administrative boundaries and filter them for intersection with an ee.Geometry.Point object, which appears under the Imports section at the top of your script. This geometry has to be drawn with the drawing tool and can be moved to a new location to rerun the analysis for that administrative boundary.

After centering the Map on Hubei Province, we will import a population dataset, which is necessary for calculating population-weighted exposures in Section 3 of this practicum. We will use the Gridded Population of the World dataset for 2020, which includes a total population count per ~1 × 1 km grid cell (Fig. 35.2).

Fig. 35.2 Population density over Hubei Province. Brighter areas have higher population counts

Code listing (29 lines): from importing a global dataset of level 1 administrative units to adding the population layer to the map to see the population distribution.

Question 1. There are two other datasets of gridded population in the Earth Engine Data Catalog, namely WorldPop and Global Human Settlement Layers. Use the search bar to find them and add them to the map to compare them with the Gridded Population of the World dataset. Which one looks more realistic in your opinion, and why?

Now it is time to import the NO2 data. As with most optical satellite data, there can be things in the atmosphere that contaminate the signal from the object or chemical you want to measure. Clouds are a common issue for land surface reflectance products (Chap. 15), and they are also an issue when trying to measure air pollutant concentrations. In the code below, we create a function to mask out pixels with a cloud fraction above 0.3 (i.e., 30% cloud cover). You can test different masking thresholds to see what suits your use case best. After masking out cloudy pixels, we create a median composite from images during March 2021. It is important to note that we are working with the band that gives measurements for the tropospheric vertical column of NO2 and not the stratospheric or total vertical column. The troposphere is the closest we can get to ground-level measurements with Sentinel-5P. The median image for March 2021 should look like the map shown in Fig. 35.3.

Fig. 35.3 Tropospheric NO2 concentrations over Hubei Province. Hotter colors have higher concentrations, while cooler colors have lower concentrations

Code listing (32 lines): from importing the Sentinel-5P NO2 offline product to visualizing the median NO2.
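To make the masking and compositing logic concrete, here is a minimal sketch in plain JavaScript rather than Earth Engine API calls. It assumes each image is a short array of per-pixel NO2 values with a matching array of cloud fractions; the function names, the data, and the arbitrary units are hypothetical and are not taken from the book's script.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.
const CLOUD_THRESHOLD = 0.3; // cloud fraction above which a pixel is masked

// Replace NO2 values in cloudy pixels with null (i.e., masked).
function maskCloudyPixels(image) {
  return image.no2.map((value, i) =>
    image.cloudFraction[i] > CLOUD_THRESHOLD ? null : value);
}

// Median of the unmasked values; null if every value is masked.
function medianOfValid(values) {
  const valid = values.filter((v) => v !== null).sort((a, b) => a - b);
  if (valid.length === 0) return null;
  const mid = Math.floor(valid.length / 2);
  return valid.length % 2 === 1 ?
    valid[mid] : (valid[mid - 1] + valid[mid]) / 2;
}

// Per-pixel median across a stack of cloud-masked images.
function medianComposite(images) {
  const masked = images.map(maskCloudyPixels);
  const composite = [];
  for (let p = 0; p < masked[0].length; p++) {
    composite.push(medianOfValid(masked.map((pixels) => pixels[p])));
  }
  return composite;
}

// Three hypothetical daily images over a three-pixel region.
const dailyImages = [
  { no2: [10, 20, 30], cloudFraction: [0.1, 0.5, 0.9] },
  { no2: [12, 22, 32], cloudFraction: [0.2, 0.1, 0.8] },
  { no2: [14, 24, 34], cloudFraction: [0.0, 0.2, 0.7] },
];
const marchComposite = medianComposite(dailyImages);
console.log(marchComposite); // [12, 23, null]
```

The third pixel is cloudy in every image, so it stays masked in the composite. A stricter threshold leaves more such gaps, while a looser one admits more cloud-contaminated values.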

Code Checkpoint A14a. The book’s repository contains a script that shows what your code should look like at this point.

2.2 Section 2: Quantifying and Visualizing Changes

Next we will test to see if we can visualize a change in NO2 concentrations during the 2020 COVID-19 lockdowns. We will compare the median NO2 concentration during March 2020 (during which Hubei Province was in lockdown) with the median value during March 2019.

Weather can significantly affect air pollutant concentrations (e.g., wind causing long-range transport of smoke), and therefore differences between 2020 and 2019 could be an artifact of differences in weather. By comparing the same month in different years, we partly control for the effects of seasonal weather patterns, but not completely. If you would like to control for weather effects more thoroughly, see Venter et al. (2020) for details. In the code below, we calculate and visualize median composite images for March 2019 and March 2020. The visualization makes use of Earth Engine’s comprehensive library of user-interface widgets (see Chap. 30 for more details). Specifically, we use the ui.SplitPanel widget to compare the two median composites side by side (Fig. 35.4). This widget can be set to have a wiping effect where maps are overlaid on top of one another, or a side-by-side comparison.

Fig. 35.4 Split-panel map showing tropospheric NO2 concentrations over Hubei Province for March 2019 (left, baseline) and March 2020 (right, lockdown). Hotter colors have higher concentrations, while cooler colors have lower concentrations

Code listing (34 lines): from defining a lockdown NO2 median composite to making a function that adds a styled label.

Question 2. Comparing the two maps in the split-panel map, do you find a reduction in NO2 concentrations during the lockdown? Where is the change in NO2 concentrations most significant?

Question 3. How are changes in NO2 concentrations related to population density? To help answer this question, you can (1) create a difference image by subtracting the no2Lockdown image from the no2Baseline image, (2) create a new ui.Map.Layer for the difference image and the population image created in Section 1, and (3) add these to the left or right map. Hint: You can change the opacity of the NO2 layers to aid interpretability.
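The difference image mentioned in Question 3 amounts to a per-pixel subtraction. The sketch below shows that arithmetic in plain JavaScript on two short arrays standing in for the baseline and lockdown composites. The array names, the values, and the extra percent-reduction output are illustrative assumptions, not part of the book's script.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.

// Per-pixel reduction (baseline minus lockdown), in absolute and percent terms.
// A pixel masked (null) in either composite stays masked in the result.
function reductionImage(baseline, lockdown) {
  return baseline.map((base, i) => {
    const lock = lockdown[i];
    if (base === null || lock === null) {
      return { absolute: null, percent: null };
    }
    const absolute = base - lock;
    return {
      absolute: absolute,
      percent: base !== 0 ? (100 * absolute) / base : null,
    };
  });
}

const no2BaselinePixels = [100, 80, null, 50];
const no2LockdownPixels = [60, 72, 40, 50];
const reduction = reductionImage(no2BaselinePixels, no2LockdownPixels);
console.log(reduction);
```

Positive values indicate that the concentration fell during the lockdown. Expressing the change as a percentage of the baseline can make pixels with very different absolute concentrations easier to compare.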

Exploring the differences in NO2 concentrations as ee.Image objects can be visually informative, but quantifying the changes for specific regions requires further work. In the code below, we calculate the mean NO2 concentrations for Hubei Province by applying a reduceRegion function to each image in the March 2019 and March 2020 collections. The resulting time series are visualized in the chart shown in Fig. 35.5.

Fig. 35.5 Time-series graph showing average NO2 concentrations for Hubei Province during March 2019 (baseline) and March 2020 (lockdown), plotted by day of year (DOY)

Code listing (24 lines): from creating the baseline map layer and adding it to the left map, together with its label, to resetting the map interface with the split-panel widget.
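The regional time series described above pairs each image's mean concentration with its day of year, so that two different years can share one chart axis. The following plain-JavaScript sketch illustrates that bookkeeping with hypothetical data and function names. It does not call the Earth Engine API, and treating masked pixels as null is an assumption of the sketch.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.

// Day of year (1 = 1 January) for an ISO date string, computed in UTC.
function dayOfYear(isoDate) {
  const date = new Date(isoDate + 'T00:00:00Z');
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  return Math.floor((date.getTime() - startOfYear) / 86400000) + 1;
}

// Mean of the unmasked (non-null) pixel values; null if none are valid.
function regionalMean(pixels) {
  const valid = pixels.filter((v) => v !== null);
  if (valid.length === 0) return null;
  return valid.reduce((sum, v) => sum + v, 0) / valid.length;
}

// One { doy, meanNo2 } record per image.
function toDoySeries(images) {
  return images.map((image) => ({
    doy: dayOfYear(image.date),
    meanNo2: regionalMean(image.no2),
  }));
}

const baselineSeries = toDoySeries([
  { date: '2019-03-01', no2: [10, null, 20] },
  { date: '2019-03-02', no2: [null, null, null] },
]);
const lockdownSeries = toDoySeries([
  { date: '2020-03-01', no2: [6, 8, 10] },
]);
console.log(baselineSeries, lockdownSeries);
```

Because 2020 is a leap year, 1 March falls on day 61 rather than day 60, so the two series are offset by one day when aligned by day of year.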

Code Checkpoint A14b. The book’s repository contains a script that shows what your code should look like at this point.

2.3 Section 3: Calculating Population-Weighted Concentrations

In Section 2, we used the ee.Reducer.mean reducer in the reduceRegion function to get the average NO2 concentration over Hubei Province. However, when aggregating pollutant concentrations to define population exposure, we need a different approach. Imagine there was a large concentration of NO2 in a rural area in the east of Hubei Province where very few people live. If we simply calculated the average of all pixels, this rural NO2 anomaly would skew our representation of population exposure. Using the population count dataset imported in Section 1, we can calculate the population-weighted exposure (\(Exp\)) aggregated across \(n\) pixels in the area of interest (in this case, Hubei Province) using Eq. 35.1 below, where \(C_{i}\) is the NO2 concentration and \(P_{i}\) is the population in pixel \(i\).

$$ \text{Exp} = \sum_{i=1}^{n} \frac{P_{i}}{\sum_{j=1}^{n} P_{j}} \cdot C_{i} $$
(35.1)
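As a worked illustration of the equation above, the sketch below computes both the simple mean and the population-weighted mean for four hypothetical pixels, one of which is a sparsely populated pixel with a high concentration. It is plain JavaScript rather than Earth Engine code. The function names and numbers are invented for illustration, and dropping masked (null) pixels from both sums is an assumption of this sketch.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.

// Simple (unweighted) mean of the unmasked pixel values.
function simpleMean(conc) {
  const valid = conc.filter((c) => c !== null);
  if (valid.length === 0) return null;
  return valid.reduce((sum, c) => sum + c, 0) / valid.length;
}

// Population-weighted mean: sum(P_i * C_i) / sum(P_i).
// Assumption: pixels with a masked (null) concentration are left out of
// both the numerator and the denominator.
function populationWeightedMean(conc, pop) {
  let weightedSum = 0;
  let populationSum = 0;
  for (let i = 0; i < conc.length; i++) {
    if (conc[i] === null) continue;
    weightedSum += pop[i] * conc[i];
    populationSum += pop[i];
  }
  return populationSum > 0 ? weightedSum / populationSum : null;
}

// Pixel 0 is a rural NO2 anomaly: high concentration, very few people.
const concentrations = [100, 20, 20, 20];
const populations = [10, 1000, 2000, 990];

const rawMean = simpleMean(concentrations);
const weightedMean = populationWeightedMean(concentrations, populations);
console.log(rawMean, weightedMean); // 40 and approximately 20.2
```

The simple mean (40) is pulled upward by the rural anomaly, whereas the population-weighted mean (about 20.2) stays close to the concentration that most residents experience.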

In the code below, we map a function to calculate population-weighted exposure over all the images in the NO2 ImageCollection. Remember that in Section 1 we masked out pixels with a cloud fraction greater than 30%. Therefore, an important step in this function is to calculate the percentage of available Sentinel-5P pixels within Hubei Province per image. We need to decide what percentage of pixel coverage is enough to calculate a representative average for the province. Here we choose 25% for illustrative purposes, but depending on your research question, you may want to calculate averages only when you have 100% cloud-free coverage from Sentinel-5P. The contrast between the simple average and population-weighted average is shown in Fig. 35.6. The difference may appear small in this case, but when aggregating over larger areas with greater variation in population density, population-weighted averages can be very different from simple averages.
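The coverage check described above can be sketched as follows, again in plain JavaScript with hypothetical data rather than Earth Engine calls. Masked pixels are represented as null, and treating the 25% threshold as inclusive is an assumption of this sketch.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.
const MIN_COVERAGE_PERCENT = 25; // illustrative threshold used in the text

// Percentage of pixels in the region that are not masked (non-null).
function coveragePercent(pixels) {
  const validCount = pixels.filter((v) => v !== null).length;
  return (100 * validCount) / pixels.length;
}

// Keep only images whose unmasked coverage reaches the threshold.
function filterByCoverage(images, minPercent) {
  return images.filter(
    (image) => coveragePercent(image.no2) >= minPercent);
}

const candidateImages = [
  { date: '2020-03-01', no2: [10, 12, 14, 16] },         // 100% coverage
  { date: '2020-03-02', no2: [null, null, null, 9] },    // 25% coverage
  { date: '2020-03-03', no2: [null, null, null, null] }, // 0% coverage
];
const usableImages = filterByCoverage(candidateImages, MIN_COVERAGE_PERCENT);
console.log(usableImages.map((image) => image.date));
```

Raising the threshold toward 100% yields fewer but more representative daily averages. With a low threshold, a single clear pixel can end up standing in for the whole province.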

Fig. 35.6 Time-series graph showing average (no2ConcRaw) and population-weighted average (no2ConcPopWeighted) NO2 concentrations for Hubei Province in March 2020

Code listing (20 lines): from creating a function to get the mean NO2 for the study region to returning a feature with NO2 concentration and day-of-year properties.
Code listing (19 lines): from getting the concentrations for the baseline and lockdown collections to printing the result to the console.

Finally, although we can plot these data in Earth Engine, they are often easier to process with other statistical software, such as R or Python. So, to conclude, let us write code to export time series of population-weighted averages for more than one area of interest (in this case, administrative units). In the code below, we map the function over two regions and then export the resulting table as a CSV file to Google Drive.

Code listing (33 lines): from defining the spatial resolution of the population data to summing the exposure (exp) over the region.
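To show the kind of table such an export produces, the sketch below serializes a few hypothetical rows to CSV text in plain JavaScript. It does not use Earth Engine's export functions. The region names and values are invented, and only the column names ADM1_NAME and no2ConcPopWeighted are borrowed from the chapter.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls); all data are hypothetical.

// Quote a value if it contains a comma, double quote, or line break.
// Null and undefined values become empty cells.
function csvEscape(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Serialize an array of row objects using the given column order.
function toCsv(rows, columns) {
  const header = columns.map(csvEscape).join(',');
  const lines = rows.map((row) =>
    columns.map((column) => csvEscape(row[column])).join(','));
  return [header].concat(lines).join('\n');
}

// One row per region and date; null marks a day without a usable average.
const exposureRows = [
  { ADM1_NAME: 'Region A', date: '2020-03-01', no2ConcPopWeighted: 20.2 },
  { ADM1_NAME: 'Region B, North', date: '2020-03-01', no2ConcPopWeighted: null },
];
const csvText = toCsv(
  exposureRows, ['ADM1_NAME', 'date', 'no2ConcPopWeighted']);
console.log(csvText);
```

A long format with one row per region and date, as here, is usually convenient to filter and reshape in R or Python.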

Code Checkpoint A14c. The book’s repository contains a script that shows what your code should look like at this point.

3 Synthesis

In this practicum, we focused on a particular pollutant (NO2), region (Hubei), and time period (March 2019 and March 2020). To reinforce your comprehension and understanding, consider the following assignments.

Assignment 1. How would you run this analysis for a different pollutant? Try substituting the NO2 collection with the Sentinel-5P NRTI SO2 collection. Hint: The main emission source for SO2 is electricity generation, for which coal is the most significant fuel. Use this information to inform your selection of a location and time period so that you can detect interesting changes.

Assignment 2. How would you run this analysis for a different geographic area? Try deleting the ee.Geometry.Point at the top of your script and using the Geometry Tools to digitize your own point on which to focus the analysis. If you are running the latter part of the script, you can also change the list of named administrative units. Hint: Add the adminUnits object from Section 1 of the code to the map. You can use the Inspector tab to click on polygons and get the name of the administrative unit under the ‘ADM1_NAME’ property.

Assignment 3. Finally, try changing the dates in the script so that you are comparing two different time periods. Remember that the Sentinel-5P data are available from July 2018 onward; defining dates before this will cause the script to throw an error.
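When experimenting with dates, a small guard can catch an out-of-range period before any processing starts. The plain-JavaScript sketch below assumes a cutoff of 1 July 2018, taken from the "July 2018 onward" statement above; the exact first available date should be checked in the Data Catalog. The function name and error messages are hypothetical.

```javascript
// Plain-JavaScript sketch (not Earth Engine API calls).
// Assumed first available date, based on "July 2018 onward" in the text.
const ASSUMED_S5P_START = '2018-07-01';

// Validate a [start, end) date range given as 'YYYY-MM-DD' strings.
// Strings in this format sort chronologically, so they can be compared directly.
function checkDateRange(start, end) {
  const isoPattern = /^\d{4}-\d{2}-\d{2}$/;
  if (!isoPattern.test(start) || !isoPattern.test(end)) {
    throw new TypeError('Dates must be formatted as YYYY-MM-DD.');
  }
  if (start >= end) {
    throw new RangeError('The start date must precede the end date.');
  }
  if (start < ASSUMED_S5P_START) {
    throw new RangeError(
      'The start date precedes the assumed start of the data record.');
  }
  return true;
}

const marchIsValid = checkDateRange('2019-03-01', '2019-04-01');

let earlyRangeRejected = false;
try {
  checkDateRange('2018-03-01', '2018-04-01');
} catch (error) {
  earlyRangeRejected = error instanceof RangeError;
}
console.log(marchIsValid, earlyRangeRejected);
```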

4 Conclusion

In this chapter, we covered the basics of importing Sentinel-5P air pollution data, comparing changes over time, and calculating population-weighted averages for spatial units. Satellite detection of air pollutants is an important tool for monitoring air quality from local to global scales, but ground-station measurements and atmospheric modeling are often necessary to draw conclusions about human health risk. The fusion of ground-level and satellite data with advanced machine learning models to map and forecast air pollution is a growing research field with important societal applications (e.g., https://www.iqair.com/). Earth Engine is a well-suited and currently underutilized resource to advance this field.