Introduction

Among the key issues exacerbated by urbanization is air pollution1,2, such as the emission of large quantities of non-methane volatile organic compounds (NMVOCs) from human activities3,4. From 1970 to 2012, global anthropogenic NMVOC emissions increased from 119 Tg to 169 Tg5. NMVOCs are crucial in atmospheric chemistry, acting as precursors to ozone6 and secondary organic aerosols7. Additionally, they pose health risks, including respiratory ailments and elevated cancer risks8. Anthropogenic NMVOC sources range from vehicular traffic, industrial processes, and biomass burning to smaller-scale yet ubiquitous activities such as cooking and the use of volatile chemical products9,10. The complexity of these sources makes anthropogenic NMVOC emissions especially hard to monitor, regulate, and control. Estimating anthropogenic NMVOC emissions in urban areas globally therefore remains a challenge.

Recent studies underscore the significant potential of satellites in offering an expansive geographical and temporal perspective for NMVOC emissions monitoring3,11,12,13,14. Because formaldehyde (HCHO) is an intermediate compound in the atmospheric oxidation of various NMVOCs15, satellite HCHO columns are effective indicators of terrestrial NMVOC emissions16,17. Satellite instruments, such as the Global Ozone Monitoring Experiment18 (GOME), the Scanning Imaging Absorption Spectrometer for Atmospheric Cartography19 (SCIAMACHY), the Ozone Monitoring Instrument20 (OMI), and the TROPOspheric Monitoring Instrument21 (TROPOMI), play pivotal roles in retrieving HCHO columns. TROPOMI, a satellite payload of the European Space Agency’s (ESA) S-5P mission, delivers daily global information on trace gas and aerosol concentrations at a resolution of 5–7 km, which is crucial for studying air quality, climate forcing, and global environmental change. Specifically, TROPOMI delivers high signal-to-noise ratio data on HCHO columns with superior spatial resolution22,23, which is pivotal for more precise NMVOC emissions monitoring. Integrating Visible Infrared Imaging Radiometer Suite24 (VIIRS) nighttime light (NTL) radiance data25, which are indicative of urbanization, enhances our comprehension of the nexus between urbanization and anthropogenic NMVOC emissions.

Urban areas stand as focal points driving environmental changes from local to global scales26,27,28. The interplay between urbanization and ecological degradation, such as atmospheric pollution, has emerged as a critical topic in urban sustainable development29. Different air pollutants and their precursors exhibit varied responses to urbanization30,31,32. For instance, NO2 emissions display an inverted U-shaped pattern in relation to urbanization, as represented by per capita income33, while PM2.5 shows a linear trend with respect to time34. Globally, the relationship between anthropogenic NMVOC emissions and urbanization remains ambiguous35, due largely to uncertainties in estimating the anthropogenic NMVOC emissions in urban areas36,37,38. While some studies have expanded our grasp of biogenic NMVOC emissions through satellite HCHO columns15,39,40, current scholarly work has yet to adequately link global HCHO columns with anthropogenic NMVOC emissions in urban areas. Global NMVOC emissions predominantly originate from plants16,41,42,43, making the precise identification of anthropogenic sources from satellites a challenge.

Here, we use TROPOMI HCHO columns, VIIRS NTL radiance, the Emission Database for Global Atmospheric Research (EDGAR)44, and the Goddard Earth Observing System Chemistry (GEOS-Chem) model45 to monitor the global levels of anthropogenic NMVOC emissions and the impact of urbanization on these emissions. We develop a satellite-based technical approach to estimate the levels of anthropogenic NMVOC emissions; furthermore, we establish linear models relating anthropogenic NMVOC emissions to urbanization from a national to a global scale and conduct sensitivity experiments. Our aim is to provide references for global air quality assessments and promote sustainable development of the environment.

Results and discussion

Global HCHO columns, HCHO emission rates, and NTL radiance

Figure 1 shows the global HCHO columns from TROPOMI, HCHO emission rates based on the EDGAR inventory, and NTL radiance from VIIRS. Our research primarily focuses on urban areas located in the northern hemisphere. We use TROPOMI HCHO column data from May to October in both 2019 and 2021 to ensure data quality, as satellite HCHO columns exhibit weaker signals during the other months (winter) in the northern hemisphere46 (Supplementary Figure 1). The 2020 outbreak of coronavirus disease (COVID-19) significantly impacted the global economy, leading to marked variations in the HCHO columns11,47. As a result, we exclude the 2020 data. TROPOMI Level 2 column pixels are filtered based on cloud fraction (<30%), solar zenith angle (<60°), and quality assurance values (>0.5). We then resample all eligible Level 2 pixels onto a 0.05° × 0.05° (~5 × 5 km2) grid for the periods mentioned, building on our previous oversampling approach36. The TROPOMI HCHO columns (Fig. 1a) provide a top-down proxy of anthropogenic NMVOC emissions.
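The pixel screening and gridding described above can be illustrated with a minimal sketch. It assumes each Level 2 pixel is available as a dictionary with hypothetical field names (`cloud_fraction`, `solar_zenith_angle`, `qa_value`, `hcho_column`, `lat`, `lon`), and it replaces the oversampling approach with a plain cell average of pixel centres, so it should be read as a simplification rather than the procedure used in this study.

```python
import math
from collections import defaultdict

# Thresholds quoted in the text; the field names used below are illustrative.
MAX_CLOUD_FRACTION = 0.30
MAX_SOLAR_ZENITH_DEG = 60.0
MIN_QA_VALUE = 0.5
GRID_STEP_DEG = 0.05


def passes_quality_filter(pixel):
    """Return True if a Level 2 pixel meets the three screening criteria."""
    return (
        pixel["cloud_fraction"] < MAX_CLOUD_FRACTION
        and pixel["solar_zenith_angle"] < MAX_SOLAR_ZENITH_DEG
        and pixel["qa_value"] > MIN_QA_VALUE
    )


def grid_index(lat, lon, step=GRID_STEP_DEG):
    """Map a pixel centre to integer (row, col) indices of a regular grid."""
    return (math.floor((lat + 90.0) / step), math.floor((lon + 180.0) / step))


def grid_mean_columns(pixels, step=GRID_STEP_DEG):
    """Average the HCHO columns of all accepted pixels falling in each cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pixel in pixels:
        if not passes_quality_filter(pixel):
            continue
        key = grid_index(pixel["lat"], pixel["lon"], step)
        sums[key] += pixel["hcho_column"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

A full oversampling scheme would additionally weight each pixel by its footprint overlap with the grid cell; the simple average here only conveys the order of operations (filter first, then aggregate).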

Fig. 1: Global TROPOMI HCHO columns, estimated HCHO emission rates with EDGAR, and VIIRS NTL radiance.

Panel (a) shows the TROPOMI HCHO columns for May through October 2019 and 2021, resampled to a grid resolution of 0.05° × 0.05° (~5 × 5 km2). Panel (b) shows HCHO emission rates in 2012 at a resolution of 0.1° × 0.1° (~10 × 10 km2) estimated from anthropogenic NMVOC emissions of five species (ethene, propene, isoprene, monoterpenes, and primary formaldehyde) from EDGAR using the methodology proposed by Palmer et al. 42. Panel (c) shows VIIRS annual NTL radiance in 2019 at a resolution of 0.05° × 0.05° (~5 × 5 km2), computed based on the median synthesis method provided by the Google Earth Engine.

Using the EDGAR inventory, we estimate the HCHO emission rates (Fig. 1b), both primary and secondary, based on five highly reactive volatile organic compounds (VOCs) with atmospheric lifetimes of less than 3 hours (Supplementary Table 1), employing the methodology proposed by Palmer et al. 17,42. Although they are not constrained by real-time observations, the HCHO emission rates estimated from the EDGAR inventory reflect the characteristics of anthropogenic NMVOC emissions more directly, from a bottom-up perspective.

Here, we employ VIIRS NTL radiance data as an indicator of urbanization levels (Fig. 1c). NTL radiance data are collected through remote sensing and capture the amount of artificial lighting during nighttime. One common application of nightlight data is to explore urbanization, for instance, by mapping urban areas48,49,50,51 and estimating economic parameters52,53. We synthesize the 2019 VIIRS NTL monthly radiance data on the Google Earth Engine54,55 platform with a previously used median composite method35,56 that can remove some interference from wildfires and ice.
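To make the idea of a median composite concrete, the following is a minimal sketch in plain Python rather than on the Google Earth Engine platform. The data layout (a mapping from a hypothetical pixel identifier to twelve monthly radiances, with `None` for missing months) is an assumption made for illustration only.

```python
from statistics import median


def median_composite(monthly_values):
    """Collapse monthly radiance values per pixel into one annual value.

    monthly_values maps a pixel id to a list of monthly radiances, with None
    marking months that have no valid observation.
    """
    composite = {}
    for pixel_id, series in monthly_values.items():
        valid = [value for value in series if value is not None]
        if valid:  # pixels without any valid month are left out
            composite[pixel_id] = median(valid)
    return composite
```

Because the median ignores the magnitude of extreme values, a single month brightened by a wildfire or by snow reflection has little effect on the annual value, which is the property the composite relies on.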

Globally, an evident relationship exists among HCHO columns, HCHO emission rates, and NTL radiance. Developed regions and major urban centers typically exhibit elevated nighttime radiance, reflecting heightened human activities and urban dynamism, often corresponding to increased anthropogenic NMVOC emissions. However, this relationship varies across geographic regions. In Asia, large regions such as eastern China and northern India, where nighttime radiance has significantly increased, also demonstrate high HCHO columns and HCHO emission rates, underscoring their rapid urbanization. In contrast, Africa and South America primarily feature natural sources of NMVOC emissions57, making anthropogenic emissions challenging to discern based on HCHO columns.

Furthermore, Supplementary Figure 2 shows the variations at the city scale in HCHO columns, HCHO emission rates, and NTL radiance for Paris (Europe), New York (North America), and Tokyo (Asia). The selected cities are developed metropolises with the highest per capita GDP58 in their respective continents. The normalized latitudinally averaged results (Supplementary Figure 2j–l) present a comprehensive comparison of these three cities. The HCHO columns, HCHO emission rates, and NTL radiance in Paris show a relatively consistent pattern of change (Supplementary Figure 2j), whereas New York and Tokyo exhibit lower consistency than Paris. Globally, using HCHO columns as indicators of anthropogenic NMVOC emissions presents certain challenges, even in developed urban areas, because the effects of factors such as NOx and biogenic NMVOC emissions must be considered.

Through spatial sampling and filtering, we have identified urban sites globally dominated by anthropogenic NMVOC emissions. Using NTL radiance with intervals of 0.1 log (nanoWatts cm-2 sr-1), we perform a linear correlation analysis. Supplementary Figure 3a shows the relationship between HCHO columns and HCHO emission rates for the selected months (May to October). It is observed that higher levels of urbanization correlate with higher anthropogenic NMVOC emissions, as evidenced by the concurrent increase in HCHO columns and HCHO emission rates. Moreover, HCHO columns and HCHO emission rates exhibit a high positive correlation, with a Pearson correlation coefficient (r) of 0.91. Supplementary Figure 3b presents the relationship between HCHO columns and HCHO emission rates for other months, showing a lower r, which indicates a weaker capability to indicate anthropogenic NMVOC emissions.

Global anthropogenic NMVOC emissions and urbanization

Figure 2 shows the linear models between TROPOMI HCHO columns (indicating anthropogenic NMVOC emissions) and VIIRS NTL radiance (indicating urbanization) for the entire globe and four continents: Asia, Africa, Europe, and North America. These representations are derived from linear models applied to selected urban sites. Considering the high skewness in NTL radiance (skewness = 3.31), the linear models are fitted in logarithmic space. VIIRS NTL radiance values are categorized into bins with a width of 0.1 log (nanoWatts cm−2 sr−1). Mean values for VIIRS NTL radiance and TROPOMI HCHO columns in each bin are calculated and then subjected to linear fitting. The global-scale linear model (Fig. 2a) exhibits an r of 0.81, with a slope of 0.42 × 10^15 molec. cm−2 nanoWatts−1 cm2 sr and an intercept of 9.26 × 10^15 molec. cm−2. The results reveal a significant positive correlation (p value ranging from 0 to 0.03), with r ranging from 0.60 to 0.93, across the four continents. We estimate the uncertainty of the models from 1000 Monte Carlo simulations (Supplementary Fig. 4). The resulting distributions of slopes and intercepts imply that the method used for the linear model is stable.
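The binning and fitting steps can be sketched as follows. This is an illustrative reading of the procedure, not the exact implementation: it assumes that only NTL radiance is log10-transformed, that bin edges are multiples of 0.1, and that an ordinary least-squares fit is applied to the per-bin means. The function names are hypothetical.

```python
import math
from collections import defaultdict


def bin_means(ntl, hcho, width=0.1):
    """Group sites into log10(NTL) bins and return per-bin mean (x, y) pairs."""
    groups = defaultdict(list)
    for radiance, column in zip(ntl, hcho):
        log_ntl = math.log10(radiance)
        groups[math.floor(log_ntl / width)].append((log_ntl, column))
    xs, ys = [], []
    for key in sorted(groups):
        members = groups[key]
        xs.append(sum(m[0] for m in members) / len(members))
        ys.append(sum(m[1] for m in members) / len(members))
    return xs, ys


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept, pearson_r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)
```

Fitting bin means rather than individual sites gives every radiance level equal weight, which keeps the many dim sites from dominating the regression over the few very bright ones.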

Fig. 2: Comparison of TROPOMI HCHO column with VIIRS NTL radiance grouped by global regions.

In panel (a), a point represents the mean HCHO columns at a specific NTL radiance bin for all urban grid cells. The blue line shows the simple linear regression line, with a blue area enveloping the 95% confidence interval of the mean response. Pearson correlation coefficients (r) and the linear model equation are also inserted. Kernel density estimates for all global urban sites are plotted in the background. Panel (b) shows the results of the linear models for the four continents, including Asia, Africa, Europe, and North America.

A particularly insightful metric is the intercept, which reflects the background HCHO columns in urban areas, primarily derived from the oxidation of methane and long-lived VOCs. After spatial sampling and filtering, NMVOC emissions at the selected urban sites are mainly from anthropogenic sources, although biogenic emissions still have some influence. Urban sites in Asia record the highest intercept, possibly indicating high emissions of methane and long-lived VOCs.

A higher r and slope suggest that urbanization drives anthropogenic NMVOC emissions more strongly, possibly due to regional differences in industrial activities and energy use patterns. Figure 3a illustrates the proportion of sector emission contributions for urban sites on the four continents, as estimated from the EDGAR inventory. Across the four continents, industrial and transportation contributions to anthropogenic NMVOC emissions exceed 75%, with other sectors contributing a smaller proportion of anthropogenic NMVOC emissions. Supplementary Table 2 shows the proportion of species emission contributions for urban sites on the four continents, as estimated from the EDGAR inventory. For Asia, Europe, and North America, ethene and propene dominate urban anthropogenic NMVOC emissions, accounting for more than 60%. In contrast, primary formaldehyde is the predominant component in Africa, constituting 63% of anthropogenic NMVOC emissions. Generally, the structure of emission proportions across continents appears similar, suggesting that industrial and transportation NMVOC emissions likely drive the linear model.

Fig. 3: Evaluating the structure of anthropogenic NMVOC emissions from various sectors.

Panel (a) shows the share of emission contributions by sector for urban sites on four continents based on EDGAR inventory statistics. The error bars indicate the standard deviation of 1000 simulations by the Monte Carlo method. Panel (b) depicts the variations (standard deviation) in the structure (contribution percentage) of anthropogenic NMVOC emissions from different sectors with increasing VIIRS NTL radiance at urban sites.

On the one hand, as the VIIRS NTL radiance values for urban sites increase, emissions of the five anthropogenic NMVOC species also rise (Supplementary Figure 5). On the other hand, the structure (or proportionate contributions) of anthropogenic NMVOC emissions across various sectors changes (Fig. 3b). The VIIRS NTL radiance values are again categorized into 0.1 log (nanoWatts cm−2 sr−1) bins, with the average percentage contribution of each sector’s anthropogenic NMVOC emissions calculated for each bin. The standard deviation of the percentages for each bin is then computed to represent structural changes. From the smallest to the largest bin, the standard deviations for the four continents vary by 10.8%, 3.3%, 14.2%, and 4.2%, respectively.
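One way to read the structural metric is sketched below: within each NTL bin, sector shares are first averaged over sites, and the standard deviation is then taken across sectors. Whether the population or the sample standard deviation is used is not stated in the text; the population form is assumed here, and the data layout and function name are hypothetical.

```python
from statistics import mean, pstdev


def sector_structure_spread(bins):
    """For each NTL bin, return the spread of the mean sector shares.

    bins maps a bin label to a list of sites; each site is a dict of
    sector name -> percentage contribution (summing to about 100).
    """
    spread = {}
    for label, sites in bins.items():
        sectors = sorted(sites[0])
        # Average share of each sector over all sites in this bin.
        mean_shares = [mean(site[s] for site in sites) for s in sectors]
        # Spread across sectors: 0 when all sectors contribute equally.
        spread[label] = pstdev(mean_shares)
    return spread
```

Under this reading, a bin in which one sector dominates yields a large value, so a change in the value between the smallest and largest bins signals a shift in the sectoral mix as radiance increases.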

We also analyze the relationship between anthropogenic NMVOC emissions (indicated by TROPOMI HCHO columns) and urbanization (indicated by VIIRS NTL radiance) in the 12 major countries (defined as those with more than 500 urban sites), as shown in Supplementary Table 4. Overall, we find significant linear relationships in 8 of the 12 major countries, with r ranging from 0.59 to 0.95. We examine the possible driving factors for the differences in response relationships between major countries by the contributions of various sectors and species (Supplementary Table 5). For example, the correlation coefficient between anthropogenic NMVOC emissions and urbanization in Brazil (r = 0.95) is much higher than that in India (r = 0.59) (Supplementary Table 4). We hypothesize that this difference is caused by the contribution of emission sectors (72% vs. 56%), including industry and ground transport. The response of NMVOC to urbanization is more likely related to industry and ground transport emissions rather than residential and agricultural sources. In short, this is consistent with the comparison of some local cities (Supplementary Fig. 2) and the global comparison result (Fig. 2 and Supplementary Table 3), indicating that there are similar positive patterns in the responses of anthropogenic NMVOC emissions to urbanization at the scales of cities, countries, continents, and even globally.

Sensitivity simulations for the linear models

Factors such as NOx and biogenic NMVOC emissions can affect the HCHO columns used to indicate anthropogenic NMVOC emissions, which in turn can undermine the reliability of the linear models. One interference factor is biogenic NMVOC emissions. Using the Olson terrestrial ecoregions59, we assess the influence of biogenic NMVOC emissions on the linear models. Each urban site is labeled according to its ecoregion, revealing that the five most prevalent biomes are tropical and subtropical moist broadleaf forests; temperate grasslands, savannas, and shrublands; deserts and xeric shrublands; temperate broadleaf and mixed forests; and boreal forests or taigas.

By excluding urban sites from each of the five ecoregions in turn and re-establishing the global-scale linear model, we compare the subsequent changes in the r, slope, and intercept, as shown in Fig. 4a. The exclusion of sites from the temperate broadleaf and mixed forest biome led to a small increase in the r of the linear model, indicating that these sites are influenced to a lesser degree by biogenic NMVOC emissions. When excluding sites from the other four ecoregions, the r decreases slightly, indicating that there are no strong biogenic NMVOC emissions. Following the exclusion of sites from the temperate broadleaf and mixed forests biome for each continent, linear models are established and are presented in Supplementary Table 3. Notably, the standard deviations (Fig. 3b) exhibit a Spearman rank correlation of 0.9 with the linear models’ r and 0.8 with the slope. The variation in the structure of sectoral anthropogenic NMVOC emissions reflects the differences in the various linear models (both the slope and r).
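The exclusion experiment follows a leave-one-group-out pattern, sketched below. For brevity the sketch computes the Pearson r directly on site-level pairs instead of refitting the binned model, so it illustrates the logic of the test rather than reproducing Fig. 4a. The tuple layout `(group, x, y)` and the function names are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def leave_one_group_out(sites, groups):
    """Return the baseline r and the change in r after dropping each group.

    sites is a list of (group, x, y) tuples, e.g. (ecoregion, log NTL, HCHO).
    """
    baseline = pearson_r([s[1] for s in sites], [s[2] for s in sites])
    changes = {}
    for group in groups:
        kept = [s for s in sites if s[0] != group]
        r = pearson_r([s[1] for s in kept], [s[2] for s in kept])
        changes[group] = r - baseline
    return baseline, changes
```

A positive change for a group means the relationship tightens once that group is removed, which is the signature used in the text to flag sites with some biogenic influence.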

Fig. 4: Sensitivity simulations for the linear model.

Panel (a) illustrates the comparison of the linear model (r, slope, and intercept) before and after excluding urban site data from five terrestrial ecoregions. Panel (b) shows the differences between the linear models established from GEOS-Chem simulated HCHO columns and VIIRS NTL radiance at different NOx levels, relative to when the NOx level is at 100%.

Subsequently, we establish relationships between EDGAR HCHO emission rates and NTL radiance for four continents (Supplementary Table 3). In Europe, the linear models relating HCHO columns and HCHO emission rates to NTL radiance display a high degree of consistency: the r values (0.91 vs. 0.90) and slopes (0.58 vs. 0.59) are nearly identical. There is also a noticeable consistency in Asia and North America. However, a significant divergence is observed in Africa. We hypothesize that this disparity stems from stronger biogenic NMVOC emissions and varied emission species contributions within the continent (Supplementary Table 2). We also establish relationships between HCHO emission rates and NTL radiance for the 12 major countries (Supplementary Table 4). The linear models relating HCHO columns and HCHO emission rates to NTL radiance display some degree of consistency. The r values in Indonesia (0.74 vs. 0.71) and Russia (0.68 vs. 0.59) are similar, and the slopes in Indonesia (0.80 vs. 1.96) and Russia (0.32 vs. 0.62) follow similar patterns. This suggests that our top-down approach for monitoring anthropogenic NMVOC emissions remains broadly reliable.

Another interference factor in the linear model is NOx emissions. As NOx is an important participant in the photochemical reactions that oxidize NMVOCs to HCHO60,61, we use five GEOS-Chem simulations (nested version, 0.5° × 0.625°) and TROPOMI NO2 column data to quantify this primary interference. Figure 4b shows the differences in the linear models established using the aforementioned methodology between GEOS-Chem simulated HCHO columns and VIIRS NTL radiances under varying NOx levels. When NOx levels are not zero, the influence of NOx on the linear models (r, slope, and intercept) is generally less than 20%, indicating the reliability of the linear model. We also use the previously proposed method35 to quantify this dependency in 4 major countries from four continents using TROPOMI NO2 columns. For the 4 countries with a significant positive correlation (p-value = 0.01) between anthropogenic NMVOC emissions and urbanization, we find that this effect exists in most countries but does not alter the positive linear relationship. For example, after considering the spatial variation in NOx emissions, the correlation coefficient of Brazil decreases by 8%, and the slope decreases by 19% (Supplementary Figure 6). Therefore, we believe that the linear pattern between HCHO columns and VIIRS NTL radiance is mainly driven by the NMVOC emission gradient within each country, but the impact of NOx emissions still needs to be noted.

Methods

TROPOMI HCHO and NO2 columns

TROPOMI is a space-based observational hyperspectral imager that measures atmospheric properties and composition21, developed jointly by the Netherlands and the European Space Agency (ESA). The instrument uses passive remote sensing technology to retrieve atmospheric trace gas columns by measuring solar radiation reflected and radiated at the top of the atmosphere22,62. TROPOMI provides daily products of HCHO and NO2 columns with global coverage and a high spatial resolution22,63 of up to 5.0 × 3.5 km2. Vigouroux et al.23 validated the accuracy of the TROPOMI HCHO column product against Fourier transform infrared (FTIR) observations and demonstrated its reliability. The TROPOMI HCHO (ref. 64) and NO2 (refs. 63,65) column products have passed an intercomparison check with the OMI product.

Quantitative analysis and spatial sampling

To quantify the response patterns of anthropogenic NMVOC emissions to urbanization, we eliminate the influences of wildfires and biogenic sources on NMVOCs. Air temperature is the main driving factor for the seasonal variations in NMVOCs from biogenic sources40,66,67. To mitigate the impact of biogenic sources and wildfires, we select grid cells based on two criteria. One criterion is that HCHO columns weakly depend on air temperature36. This is done by using the exponential dependence (Supplementary Fig. 7) of biogenic HCHO columns on air temperature from NASA Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2)55. We then exclude grid cells with strong biogenic influence (R2 > 0.5 and slope > 0.05)35. The second criterion is to eliminate grid cells (carbon monoxide emission flux > 1 × 10^−6 kg m−2 year−1) affected by wildfires based on the Global Fire Emissions Database version 4 (GFED 4) in 201968.
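The two screening criteria can be sketched as a single decision function. The sketch makes several assumptions that are not confirmed by the text: the exponential dependence is fitted as ln(HCHO) = ln(a) + b·T by least squares, R2 is evaluated in that log space, and "slope" refers to the exponent b. Function and constant names are hypothetical.

```python
import math

R2_LIMIT = 0.5               # threshold on the fit quality
SLOPE_LIMIT = 0.05           # threshold on the exponent b (assumed meaning)
FIRE_CO_FLUX_LIMIT = 1e-6    # kg m-2 year-1, from the fire emission criterion


def exponential_fit(temps, columns):
    """Fit column = a * exp(b * T) by regressing ln(column) on T.

    Returns (b, r_squared), with r_squared evaluated in log space.
    """
    logs = [math.log(c) for c in columns]
    n = len(temps)
    mean_t, mean_l = sum(temps) / n, sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in temps)
    sll = sum((v - mean_l) ** 2 for v in logs)
    stl = sum((t - mean_t) * (v - mean_l) for t, v in zip(temps, logs))
    b = stl / stt
    r_squared = 0.0 if sll == 0 else stl ** 2 / (stt * sll)
    return b, r_squared


def keep_grid_cell(temps, columns, co_fire_flux):
    """Return True if the cell is neither biogenic-dominated nor fire-affected."""
    b, r_squared = exponential_fit(temps, columns)
    biogenic = r_squared > R2_LIMIT and b > SLOPE_LIMIT
    burning = co_fire_flux > FIRE_CO_FLUX_LIMIT
    return not (biogenic or burning)
```

Requiring both a good fit and a steep exponent avoids discarding cells whose columns merely drift slightly with temperature.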

We restrict our analysis to urban sites. The main data issue with VIIRS NTL radiance is background noise, such as the high reflection of ice and snow69. To reduce its impact, we survey 119,670 global urban sites (NTL radiance > 0.5 nanoWatts cm−2 sr−1) by using information about impervious surfaces, population density, and industrial and commercial centers56. The accuracy (95%) of the defined urban sites is verified by visually inspecting 1000 randomly selected sites against corresponding high-resolution remote sensing images in Google Earth. Some impacts of biogenic sources and wildfires have been further eliminated to limit the selection of urban sites.

Sensitivity tests with the GEOS-Chem model

GEOS-Chem (http://www.geos-chem.org), a global three-dimensional (3-D) model of tropospheric chemistry, has been previously used to simulate HCHO under various NOx conditions42,67. To explore the impact of NOx emissions on HCHO columns in major countries, we perform five simulations using the nested version (0.5° × 0.625°) of the GEOS-Chem model (version 12.9.3) globally, driven by MERRA-2 meteorological fields (ref. 70). In the control run, we use anthropogenic emissions from the EDGAR inventory, biogenic VOC emissions from MEGAN v2.1 (ref. 71), and open fire emissions from the fourth-generation global fire emissions database (GFED4). In the sensitivity runs, we reduce anthropogenic NOx emissions in EDGAR by 25%, 50%, 75%, and 100%, and keep other emissions the same as in the control run. All runs are for July 2019, with a spin-up time of 1 month.

We use TROPOMI tropospheric NO2 data65 along with GEOS-Chem results to quantify the impact of NOx emissions on the spatial distribution of HCHO columns within a given country. First, we filter, regrid, and sample TROPOMI NO2 pixels following the same methods as for HCHO. Second, we compute the relative change in the tropospheric NO2 columns at grid cell i, ΔNO2(i), from the background, defined as the minimum NO2 column in that country. As tropospheric NO2 columns and NOx emissions are linearly related, such a relative change in tropospheric NO2 columns, ΔNO2(i), can be roughly regarded as the relative change in NOx emissions, ΔNOx(i), from a baseline emission of that country. Third, we build a four-segment function at grid cell i, δ(ΔNOx(i), i), to describe the relative change in the HCHO columns, ΔHCHO(i), in response to the reduction in NOx emissions, ΔNOx(i), based on localized results from GEOS-Chem sensitivity simulations, which are run with 25%, 50%, 75%, and 100% reductions in NOx emissions (Supplementary Fig. 8). Finally, we substitute TROPOMI-based ΔNO2(i) to quantify the impact of spatial variations of NOx emissions on HCHO columns at each grid cell i by interpolating δ(ΔNOx(i), i) between the two nearest reductions.
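The last three steps can be sketched for a single grid cell as follows. Two points are assumptions rather than statements from the text: the relative change ΔNO2(i) is normalized by the cell's own column, so that it is directly comparable to a fractional emission reduction, and the four segments are joined by linear interpolation through the five simulated points (0%, 25%, 50%, 75%, and 100% reduction). The function names and the response values in the usage note are hypothetical.

```python
REDUCTIONS = [0.0, 0.25, 0.50, 0.75, 1.00]  # control run plus four sensitivity runs


def relative_no2_change(no2_column, background):
    """Fractional reduction that would bring a cell down to the background.

    Assumes normalization by the cell's own column (not stated in the text).
    """
    return (no2_column - background) / no2_column


def interpolate_hcho_response(delta_nox, simulated_response):
    """Piecewise-linear delta(dNOx) through the five simulated points.

    simulated_response[k] is the relative HCHO change simulated at this grid
    cell for REDUCTIONS[k]; the first entry (no reduction) is 0 by definition.
    """
    delta_nox = min(max(delta_nox, 0.0), 1.0)  # clamp to the simulated range
    for k in range(len(REDUCTIONS) - 1):
        lower, upper = REDUCTIONS[k], REDUCTIONS[k + 1]
        if delta_nox <= upper:
            weight = (delta_nox - lower) / (upper - lower)
            return simulated_response[k] + weight * (
                simulated_response[k + 1] - simulated_response[k]
            )
    return simulated_response[-1]
```

For instance, with purely illustrative responses `[0.0, -0.02, -0.05, -0.10, -0.30]`, a cell whose NO2 column is 4.0 against a background of 2.5 (in any common unit) gives a fractional change of 0.375, which falls midway between the 25% and 50% runs and is therefore assigned the midpoint of their two responses.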