Introduction

The management and improvement of air quality are global challenges aimed at protecting human health and environmental resources. This is particularly relevant in cities, where pollutant sources and the population sensitive to harmful substances are highly concentrated (deSouza et al. 2020; Castell et al. 2017).

Globally, air pollution causes about 4.2 million deaths per year from ambient exposure and 3.8 million deaths per year from household exposure (UN 2019). Reference regulations have established guidelines and concentration limit values for selected harmful chemical species, such as particulate matter (PM10), ozone (O3), nitrogen dioxide (NO2), and sulphur dioxide (SO2), considering their effects on the health of the population (World Health Organisation 2017; Bertero et al. 2020). For this reason, local administrations also adopt reference regulations that identify the limit values for atmospheric pollutants, as well as the methods for their control and monitoring (Li et al. 2019).

To support the control and improvement of air quality, in addition to legislative and scientific indications, numerous tools are used that can be classified as methods of measuring pollutants and tools for estimating and forecasting. The first includes fixed air quality monitoring networks, mobile, and relocatable instruments that can measure air pollutants in the areas where they are positioned (Hao and Xie 2018; Liu et al. 2020). The number and positioning of these instruments must comply with legislative constraints to guarantee the quality and representativeness of the measures collected (Hao and Xie 2018; Ho et al. 2020). Monitoring techniques with low-cost and mobile instruments are an approach that is spreading recently, but which still needs to be properly developed and improved in order to be as accurate as measurements taken with fixed instruments (Chojer et al. 2020; Idrees and Zheng 2020; Weissert et al. 2020). The second category includes the numerous mathematical models for estimating and/or forecasting air pollution which mathematically simulate the physical and chemical mechanisms related to the emission, transformation, dispersion, and transport of air pollutants. These approaches, starting from input data (measured or estimated), can extend knowledge on air quality with very different spatial and temporal resolutions depending on the study needs (Cao et al. 2020; Tiwari et al. 2019; Wang et al. 2016; Beelen et al. 2010).

In this context, air pollutant emission inventories play a key role in supporting the best application of both categories of tools. They help to identify where measuring instruments are best positioned, to assess emission scenarios and emission trends over time, to locate air pollutant sources, to assess whether regulatory objectives have been achieved, to guide air pollutant emission control and management, and to develop new environmental strategies; they are also one of the main inputs for air quality and atmospheric dispersion models (Marinello et al. 2021, 2020; Azhari et al. 2021; Hua et al. 2019; Leclerc et al. 2019). As defined by the European Environment Agency (1998), “An atmospheric emission inventory can be defined as a collection of data presenting an emission of a pollutant (to air) and related parameters including: chemical identity (characterises the chemical properties of the pollutant); activity or technology (characterises the cause of the emission and relates it to -human economic- activity); location (describes both the location on the map and the height of the release -stack height-); time dependence (in general, emission inventories store emissions as annual totals). The temporal patterns are, in most cases, modelled in the air quality assessment”.

Since it is not possible to accurately measure the emissions of all the individual pollutants and for all the individual sources present in the territories, with the required time resolutions, the atmospheric emissions are estimated on the basis of measurements taken on selected or representative samples of the (main) sources and of the types of sources or estimated on the basis of representative parameters. This process takes place through two main types of approach: (i) bottom–up (microscale) which aggregates the emissions from individual sources, combining them with respect to the desired temporal and spatial resolution (Azhari et al. 2021; Marinello et al. 2020; Cheewaphongphan et al. 2019); (ii) top–down (macroscale) which disaggregates emission data from large areas to small areas (Righi et al. 2013; Nguyen and Wooster 2020).

Bottom–up approaches generally estimate emissions from the relationship between the activity data for a specific source over a given time and the specific emission factor, giving an emission rate (mass per unit of time) (EMEP/EEE 2019). Top–down approaches use surrogate (proxy) variables to disaggregate pollutant emissions spatially or temporally. As reported by Bellasio et al. (2007), “proxy variables are variables that are supposed to be strongly correlated to the emissions of a particular activity, and the correlation is assumed to remain strong passing from national to local scale”. Several proxy variables are commonly used for the spatial and temporal disaggregation of air emission inventories. The choice of variables depends on the types of sources and the territorial information available, and includes: road traffic volumes sorted by vehicle class, simplified and complete road networks, traffic and road density, industrial activities classified by production cycle length, land use and population density maps, commercial and residential heating, electricity and heat production, and agriculture activities (Gioli et al. 2015; Gómez et al. 2018; Saide et al. 2009; Li et al. 2021; Sun et al. 2021; Maes et al. 2009). Because it relies on such generally available data, the top–down approach is less time- and resource-consuming and more practical than the bottom–up approach. Several studies have compared the results of the two approaches (Pallavidino et al. 2014; López-Aparicio et al. 2017; Annadanam and Kota 2019). In addition to air pollutants, these approaches are also used for greenhouse gas (GHG) inventories, as in the study (Poupkou et al. 2007), which describes EDGAR v4.3.2 developments applied to CO2, CH4, and N2O emissions considering the land use, land-use change, and forestry (LULUCF) sector, apart from savannah burning.
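The two estimation logics described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names, the cell labels, and all numerical values are hypothetical, and the top–down step assumes the simplest possible rule, in which each cell receives a share of the regional total proportional to its proxy value.

```python
def bottom_up_emission(activity, emission_factor):
    """Bottom-up estimate for one source: activity data times emission factor."""
    return activity * emission_factor


def allocate_by_proxy(total_emission, proxy_by_cell):
    """Top-down step: split a regional total across cells in proportion to a proxy."""
    proxy_sum = sum(proxy_by_cell.values())
    if proxy_sum <= 0:
        raise ValueError("proxy values must sum to a positive number")
    return {
        cell: total_emission * value / proxy_sum
        for cell, value in proxy_by_cell.items()
    }


# Hypothetical bottom-up example: 1000 activity units at 0.5 kg per unit.
source_emission = bottom_up_emission(1000.0, 0.5)

# Hypothetical top-down example: 120 t/yr of NOx split over three cells
# using resident population as the proxy variable.
population = {"cell_a": 5000, "cell_b": 3000, "cell_c": 2000}
nox_by_cell = allocate_by_proxy(120.0, population)
```

By construction, the allocated values sum to the regional total, so the disaggregation conserves mass; the quality of the result depends entirely on how well the proxy correlates with the emitting activity.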

This paper presents a critical review of the available research concerning the application of top–down approaches for the spatial and temporal disaggregation of air pollutant inventories, as well as the identification of the main proxy variables applied by the authors. The added value that this work intends to provide is to support, in a concrete and operational way, the choice of the aspects to be considered to properly analyse emission inventories and support local administrations in properly planning environmental policies and strategies. Knowing the proxy variables described in the literature and used to disaggregate the emissions of specific types of emission sources supports the evaluator in choosing the most appropriate variables, also in relation to the information available.

Compared to other papers already published, this article provides a complete overview of the variables used for each emission macrosector, guiding the choice in future emissions disaggregation work.

Material and methods

Material collection, selection, and analysis

An effective literature review needs an appropriate approach for the collection and analysis of reference bibliographic materials. The bibliographic research on the scientific literature concerning the topic of disaggregation of emission inventories was conducted through the use of the ScienceDirect and Scopus databases with unrestricted search in journals, years of publication, and type of paper. The research was conducted on studies that were published prior to 31 January, 2021. Several keywords have been selected to search for articles dealing with the disaggregation of pollutant inventories in the atmosphere with a top–down approach: “emission inventory”, “proxy variables emissions”, “surrogate variables emissions”, “spatial disaggregation emissions”, “emission inventory land use”, “top–down emission inventories”. The collection phase returned a very large number of articles, particularly with more general keywords (Fig. 1). On the collected papers, a scaled approach for the evaluation of their contents was applied. The screening eliminated the duplicates and analysed the abstracts in order to evaluate their usefulness. Here, 87 scientific articles were considered. These were classified according to the author's name, the year of publication, the title of the paper, and the keywords used (Fig. 2).

Fig. 1

Flow chart on the methods and processes conducted in this study

Fig. 2

Schematic representation of the number of articles related to most significant keywords (the number of items found refers to a measure approximation of the results obtained from the research on the ScienceDirect database)

Furthermore, a first selection criterion was applied, identifying the papers that expressly describe approaches to disaggregating emissions with top–down or bottom–up approaches.

All the articles were analysed in terms of content and considered suitable for the research objective when they met the following inclusion criteria: they use proxy variables for disaggregation, apply a top–down approach, describe a case study, and propose an approach that can be replicated in other territorial contexts. This material collection and selection process resulted in a total of 44 papers. The critical analysis of the selected papers was conducted through a structured approach to extract the descriptive elements of the scientific literature and the parameters on which to conduct the critical analysis.

I—Descriptive analysis

  • Year

  • Journal

  • Geographical distribution

  • Paper type

II—Critical analysis

  • Emission source

  • Proxy variables

  • Pollutant

  • Spatial/temporal disaggregation

  • Case study

Results and discussion

The results for each parameter selected for descriptive and critical analysis are described in this section.

Descriptive analysis

The selected articles describe studies that were published during the period 1993‒2020. In particular, 85% of the bibliography focuses on the time period between 2010 and 2020.

Atmospheric Environment is the journal with the highest number of published articles (40%); 9% of the case studies were published in Science of the Total Environment, and 6% in the Journal of Cleaner Production. The same percentage is also found for the articles published in Atmospheric Pollution Research (6%) and the Journal of the Air and Waste Management Association (6%). The “Other” category includes a large number of journals (11) that have published one paper each.

The selected studies are concentrated in particular in China (18%), followed by Italy, Spain, and the US, which have the same percentages (12%). As for the types of paper, all are research work.

Critical analysis

Emission sources

In order to describe the types of polluting sources treated by each study, the categories of sources that make up the framework of the CORINAIR methodology, consisting of 11 emission categories, each of which has been assigned a Selected Nomenclature for sources of Air Pollution (SNAP) code, have been considered (EMEP/EEE 2019): energy production (SNAP1), residential central heating (SNAP2), industrial combustion (SNAP3), industrial processes (SNAP4), coal extraction (SNAP5), distribution of fuels (SNAP5), solvents use (SNAP6), road transport (SNAP7), off-road vehicles and machinery (consisting of construction and other industrial vehicles/machinery, agriculture, forestry, household and gardening vehicles/machinery, and railway transport) (SNAP8), aviation (SNAP8), cargo shipping (SNAP8), local ferries (SNAP8), waste treatment and disposal (SNAP9), and agriculture (SNAP10).

Figure 3 shows the distribution of the articles amongst the different emission sources, as well as the type of proxy variables used for the disaggregation of the inventories. The proxy variables are indicated, for each type of source, in descending order with respect to their frequency of use.

Fig. 3

Sources of emission and related proxy variables

The emission sources most analysed in the literature are road transport (about 64% of the authors analyse this source), non-industrial combustion plants (covered by 57% of the papers), and industrial combustion and processes, which span three different SNAP codes (52% of authors). The other sources are less analysed: extraction and distribution of fossil fuels and geothermal energy, and use of solvents and other products, are the sectors covered by the fewest papers (about 27% each). Often, the analysed authors apply a top–down approach to disaggregate numerous emission sources; this is the case for approximately 68% of the authors analysed. In particular, Bai et al. (2020) apply different spatial information to define the spatial allocation profile for different sources in 18 cities in the Henan region (China). Li et al. (2021) apply ten spatial surrogates to disaggregate area source emission inventories in California, covering numerous types of emission sources spread throughout the territory. Zhou et al. (2020), to break down VOC emissions in Qingdao city (China), analyse seven major sources divided into 46 subcategories; industrial process, on-road mobile, and solvent use sources were the most impactful emission sources (as a total annual value). Other studies, on the other hand, focus on the disaggregation of a single emission source (about 32% of the selected papers). This is the case with the studies of Gómez et al. (2018), González et al. (2020), and Plejdrup et al. (2016). The first focuses on on-road vehicle emissions, analysing the case study of an Andean city (Colombia) with a spatial resolution of 1 × 1 km and a temporal resolution of 1 h. The second study also focuses on traffic emissions, developing an algorithm for the disaggregation of inventories that uses different variables: length of road segments; the combination of length of road segments and type of roads; and the combination of length of road segments and traffic flows. The last study analyses residential wood combustion as the major contributor to atmospheric pollution, especially for particulate matter, in Denmark.
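The three road-based variables listed above (segment length alone, length combined with road type, and length combined with traffic flow) can be compared with a short sketch. This is an illustration only and not the algorithm of González et al. (2020): the segment data, the `type_factor` weighting by road class, and the choice to combine variables by multiplication are all assumptions made for the example.

```python
def segment_weights(segments, use_road_type=False, use_traffic=False):
    """Return each segment's share of a traffic emission total.

    The base weight is segment length; it is optionally multiplied by a
    road-type factor and/or a traffic flow (both hypothetical inputs).
    """
    raw = []
    for seg in segments:
        weight = seg["length_km"]
        if use_road_type:
            weight *= seg["type_factor"]
        if use_traffic:
            weight *= seg["flow"]
        raw.append(weight)
    total = sum(raw)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [w / total for w in raw]


# Hypothetical network: one main road and two local streets.
segments = [
    {"length_km": 2.0, "type_factor": 2.0, "flow": 1000.0},
    {"length_km": 1.0, "type_factor": 1.0, "flow": 500.0},
    {"length_km": 1.0, "type_factor": 1.0, "flow": 500.0},
]

by_length = segment_weights(segments)
by_length_and_type = segment_weights(segments, use_road_type=True)
by_length_and_flow = segment_weights(segments, use_traffic=True)
```

In this toy network, adding road type or traffic flow shifts a larger share of the emissions onto the main road than length alone does, which is the kind of difference such a comparison of variables is meant to expose.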

Proxy variables

Figure 3 also identifies the various proxy variables applied by the analysed authors with respect to each type of emission source. Population density and land use are widely used in the literature and support the disaggregation of numerous sources. Population density is used for all sources (except other mobile sources) and appears in 26 papers. Its popularity as a proxy variable is linked to its ability to represent services that are usually aimed at the urban population. For example, domestic heating emissions are closely linked to the presence of people inside homes, whilst for waste treatment and disposal, as for the extraction and distribution of fossil fuels, population density expresses the size of the population served. Land use, on the other hand, identifies the different land-use classes. In this way, it is easy to identify the industrial areas to which industrial emissions should be allocated, as well as the agricultural soils over which to distribute the emissions from agricultural activity.

Finally, it is important to emphasise the value of having different proxy variables that can be appropriately combined to improve the results obtained. This is, for example, the case described by Righi et al. (2013), who, by combining population density and building volume, obtained an interesting spatial representation of emissions from domestic heating.
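One simple way to combine two proxy variables is to normalise each into cell shares and then take a weighted average of the shares. The sketch below assumes this rule purely for illustration; it is not the method of Righi et al. (2013), whose combination rule is not detailed here, and the cell values, the 80 t/yr total, and the `weight_a` parameter are hypothetical.

```python
def normalise(values):
    """Convert raw proxy values into shares that sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("proxy values must sum to a positive number")
    return [v / total for v in values]


def combined_shares(proxy_a, proxy_b, weight_a=0.5):
    """Weighted average of the cell shares of two proxies (assumed rule)."""
    shares_a = normalise(proxy_a)
    shares_b = normalise(proxy_b)
    return [
        weight_a * a + (1.0 - weight_a) * b
        for a, b in zip(shares_a, shares_b)
    ]


# Hypothetical values for three cells.
population_density = [100.0, 300.0, 600.0]
building_volume = [500.0, 300.0, 200.0]

shares = combined_shares(population_density, building_volume)

# Allocate a hypothetical 80 t/yr domestic heating total.
heating_by_cell = [80.0 * s for s in shares]
```

Because each set of shares sums to 1, any weighted average of them also sums to 1, so the combined allocation still conserves the inventory total.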

Pollutant and spatial/temporal disaggregation

Table 1 reports the representative data of the case studies described by the authors, in particular, the territorial scope addressed by the case studies (and its extension), the pollutants that have been analysed, and the spatial and temporal resolution obtained by each described approach. The analysis of the case studies highlights a general balance between studies of small territorial contexts, typically urban areas, and studies of broader contexts covering an entire country or a group of countries. About 84% of the papers analyse these contexts (43% and 41% for city and country, respectively). Intermediate contexts, labelled “regions” in Table 1, are areas of intermediate extension between cities and countries and account for 16% of the collected papers.

Table 1 Representative data from each study analysed (articles are organised alphabetically with respect to the surname of the first author)

Gioli et al. (2015) apply a top–down approach in order to spatially and temporally disaggregate emission inventories at yearly, monthly, weekly, and hourly time scales. Through the eddy covariance technique, the authors estimated CO2 emissions and compared them with the official regional emission inventory. The study was carried out in the city of Florence, Italy. Another example is the study by Karvosenoja et al. (2020), who analysed the spatial distribution of PM2.5 emissions from the machinery sector in the Helsinki area, demonstrating the applicability of specific proxy variables to better detail the knowledge of the territory and of the emission sources. In addition to these cases, specific to a single city, other studies have applied the same approach to the disaggregation of emissions in different urban contexts. Ossés de Eicker et al. (2008) studied the breakdown of vehicular traffic emissions of seven mid-sized Chilean cities. Mishalani et al. (2014) use a model to study CO2 emissions for 146 urbanised areas in the US through the application of different proxy variables, such as: population density, transit share, freeway lane-miles per capita, private vehicle occupancy, and average travel time. Andreão et al. (2020) disaggregate PM emissions from vehicular sources into 1 km2 cells for four metropolitan areas of the Brazilian Southeast: Belo Horizonte (MABH), Great Vitória (MAGV), Rio de Janeiro (MARJ), and São Paulo (MASP).

The studies presented by Dios et al. (2012) and Righi et al. (2013) operate in wider territorial contexts. The first study applies a mixed top–down and bottom–up methodology for the spatial disaggregation of emissions over the Galicia region (Spain), whilst the second disaggregates the domestic emissions of the province of Ravenna (Italy).

Finally, working within the boundaries of one or more countries, Lopez-Aparicio et al. (2018) present a study dedicated to residential wood combustion in Norway through two particular methods for data collection: (i) a web crawler that extracts real estate data openly available online and (ii) image recognition and classification based on machine learning techniques. The positive results show how this approach to high-resolution emission inventories can enhance the value of the available open data. Markakis et al. (2010) compiled a PM10 emission inventory from road transport for Greece at 10-km spatial resolution. The results of this study are interesting because, in addition to the emissions on land, the study also analyses and disaggregates the total PM10 anthropogenic emissions in the sea area. Trombetti et al. (2018) present an inter-comparison of the main top–down emission inventories currently available at the European level, working on NOx, SO2, VOC, and PM2.5 from the road transport, residential combustion, and industry sectors. Ferreira et al. (2013) also compare different spatial disaggregation methodologies for atmospheric emission inventories that work with different spatial resolutions.

Case study

The distribution of the case studies (Fig. 4) is mainly concentrated in Europe and Asia (21 and 15 applications, respectively). In the Americas, the case studies are almost equally distributed between north and south (eight in the north and seven in the south). Africa closes the list with two case studies, whilst there are no specific studies in Oceania.

Fig. 4

Case study map

PM (PM10 and/or PM2.5) and NOx are the pollutants that have been analysed by the greatest number of authors (55% and 48%, respectively). This confirms that these pollutants are the most critical in terms of exceedances of the concentration limit values for population exposure. At least 30% of the authors applied top–down disaggregation approaches to the study of CO (39%), NMVOC (34%), SO2 (32%), and CO2 (30%). Kanabkaew and Oanh (2011) describe an approach to define the atmospheric emissions from the combustion of crop residues in Thailand. In particular, the study analyses the following pollutants: PM10, SO2, CO2, CO, NOx, NH3, CH4, NMVOC, elemental carbon, and organic carbon. The provincial emissions were also disaggregated on a 0.1° × 0.1° grid and into hourly profiles that can be directly used for dispersion modelling. Palacios et al. (2001) estimated the disaggregation of anthropogenic emissions in a coastal Mediterranean region of Spain. The pollutants considered were: SO2, NOx, NMVOCs, CH4, CO, CO2, N2O, and NH3.

Some authors have concentrated in particular on greenhouse gases (GHGs), intended as a set of atmospheric substances that have effects on the climate. Zhang et al. (2018) apply a disaggregation approach in the Xiamen area, a typical rapidly urbanising city in China, which was selected as a case study. Several GHGs have been studied, in particular: CO2, CH4, N2O, HFCs, PFCs, and SF6. The disaggregation was conducted according to land use: manufacturing, mining, warehouse land, service industry land, transportation land, residential land, and agriculture land. The study of CO2 was also addressed in Alam et al. (2018) and Shu and Lam Nina (2011). The first disaggregated the emissions of CO2 and PM from road transport in the Greater Dublin Area, starting from a national road transport estimate. The second study describes a disaggregation approach based on a multiple linear regression model to disaggregate traffic-related CO2 emission estimates from the parish-level scale to a 1 × 1 km grid scale. The proxy variables used in the proposed methodological approach are the following: population density, urban area, income, and road density. A further example of studies aimed at analysing GHG emissions was presented by Nguyen and Wooster (2020), who specifically analyse CH4 as an air pollutant.
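A regression-based disaggregation of this kind can be sketched in a few steps: fit a multiple linear regression between emissions and proxy variables at the coarse scale, apply the fitted coefficients to the proxy values of each grid cell, and rescale the cell predictions so that they sum to the coarse-scale total. The sketch below is a generic illustration and not the model of the cited study: it uses only two hypothetical proxies, invented data, a pure-Python least-squares fit, and a final rescaling step that is an assumption added here to conserve mass.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Evaluate the fitted linear model for one set of proxy values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical coarse units: (population density, road density) and emissions.
coarse_proxies = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (4.0, 1.0)]
coarse_emissions = [7.0, 9.0, 14.0, 15.0]
coeffs = fit_ols(coarse_proxies, coarse_emissions)

# Hypothetical grid cells inside the third coarse unit (total = 14.0).
cell_proxies = [(1.0, 1.0), (0.5, 0.5)]
raw = [predict(coeffs, row) for row in cell_proxies]

# Assumed extra step: rescale so the cells sum to the coarse-unit total.
cell_emissions = [14.0 * value / sum(raw) for value in raw]
```

A real application would need to handle negative predictions and check the fit quality before trusting the cell values; both are omitted here for brevity.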

Also, from the point of view of the resolutions (spatial and temporal), heterogeneity in the approaches can be observed: the spatial resolution varies from square cells with dimensions 100 × 100 m2 up to much coarser resolutions reaching 36 km2. In particular, higher spatial resolutions are applied above all to less extensive study areas, partly to avoid overburdening the computational resources needed to process and manage the data. In these cases, cell sizes range from 100 m2 (Righi et al. 2013) to a few km2 (e.g. Dios et al. (2012) use 3-km2 cells). The only exception is the study of Liu et al. (2019), which uses a resolution of 36 km2. For larger study areas (such as an entire country), the authors mostly choose cells with dimensions between 250 m2 (as in the study of Lopez-Aparicio et al. (2018) conducted in Norway) and 10 km2 (as in the case study of Trombetti et al. (2018), which analyses the countries of the EU). Regardless of the case study analysed, the spatial resolution most used by the collected works is 1 km2, chosen by 43% of the authors analysed (Aleksandropoulou et al. 2011; Ma et al. 2018).

Finally, the time resolution most frequently used for the disaggregation of emissions into the atmosphere is the hourly or monthly one, chosen by more than half of the authors analysed.
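Temporal disaggregation to monthly and hourly values is commonly expressed through profiles, that is, sets of shares that distribute an annual total over months and over the hours of a day. The sketch below assumes this profile-based scheme for illustration; the function name, the annual total, the January share, and the hourly profile are hypothetical and not taken from any of the reviewed studies.

```python
def hourly_emissions(annual_total, month_share, days_in_month, hour_shares):
    """Emission in each hour of a typical day of one month.

    month_share is the fraction of the annual total assigned to the month;
    hour_shares are 24 fractions of a day's emission that must sum to 1.
    """
    if len(hour_shares) != 24:
        raise ValueError("expected 24 hourly shares")
    if abs(sum(hour_shares) - 1.0) > 1e-9:
        raise ValueError("hourly shares must sum to 1")
    daily = annual_total * month_share / days_in_month
    return [daily * share for share in hour_shares]


# Hypothetical profile: low at night, higher during the day.
hour_shares = [0.02] * 6 + [0.05] * 16 + [0.04] * 2

# Hypothetical inventory: 3100 t/yr, 10% of which is emitted in January.
january_hours = hourly_emissions(3100.0, 0.10, 31, hour_shares)
```

With these inputs a typical January day carries 10 t, of which 0.5 t falls in each daytime hour; weekly profiles could be added in the same way as a further multiplicative share.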

Operational benefits from the breakdown of emissions

The analysis of the scientific literature covered by this study shows that the results of disaggregating atmospheric emission inventories with a “top–down” approach, which allocates emissions in space and time, are mainly intended to support effective mitigation policies for environmental and atmospheric remediation.

In particular, one of the purposes most cited in the collected documentation is the need to use the data collected from the experiments to support studies on air quality improvement models; this, of course, refers to the territory on which the experimentation was conducted.

Furthermore, the definition of a reliable emission inventory, spatially and temporally disaggregated, provides a set of data necessary to support the implementation of deterministic models of chemistry and transport, thus allowing the evaluation of whether the air quality limits are complied with; on this basis, effective abatement measures can be defined to meet environmental objectives.

Another goal of the researchers was to support the development of strategies aimed at reducing Greenhouse Gas (GHG) emissions at the urban level, encouraging policies aimed at the use of more efficient modes of transport. Other research groups, on the other hand, preferred to compare the emission data obtained by referring to different inventories and estimating what could be the trends relating to atmospheric quality, considering a specific geographical area over the years.

Conclusion

The study reports a critical analysis of the main results concerning the disaggregation of atmospheric pollutant emission inventories through top–down approaches that use appropriate proxy variables. The extensive bibliography available on this topic has been analysed in relation to specific aspects: (a) descriptive aspects of each paper, collection methods, and type of work; (b) type of proxy variable; (c) emission sources; (d) spatial and temporal resolution.

The conclusions of the study, including some limitations, are as follows:

  • The studies available in the literature are concentrated in particular in the period between 2010 and 2020, underlining the growing need to improve knowledge on the sources of environmental pressure, supported by the greater availability of spatial data.

  • The literature offers approaches and reflections on the disaggregation of emissions attributed to all the different possible emission sources, whether stationary or mobile, point, areal, or linear in form. Vehicular traffic, industrial activities, and domestic consumption are the sources most analysed; they are often the main contributors to emissions of some of the currently most critical pollutants (particulate matter and NOx).

  • There are numerous proxy variables applied in the literature for the disaggregation of inventories. Some are very specific to the type of emission source and are not easily found in every territorial context, whilst others are very common variables that can be easily applied (e.g. population density and land use). The use of proxy variables often requires the methodological approach to combine different variables to better represent the territorial context of reference.

  • The choice of a methodological approach and related proxy variables makes it possible to disaggregate, in the same way, the emissions of different air pollutants without changing the approach developed.

  • The choice of the spatial resolution used is heavily dependent on the extension of the study area. The smaller the case study (e.g. urban area), the higher the usable spatial resolution. The studies analysed have demonstrated the ability of top–down approaches to disaggregate emissions with a resolution of up to 100 m2.