State-of-art in modelling particulate matter (PM) concentration: a scoping review of aims and methods

Gianquintieri, Lorenzo; Oxoli, Daniele; Caiani, Enrico Gianluca; Brovelli, Maria Antonia

doi:10.1007/s10668-024-04781-5

Review
Open access
Published: 02 April 2024

Environment, Development and Sustainability (2024)

Abstract

Air pollution is one of the most significant environmental risks to health worldwide. An accurate assessment of population exposure would require a continuous distribution of measuring ground stations, which is not feasible. Therefore, significant efforts are spent in implementing air-quality models. However, a complex scenario emerges, with the spread of many different solutions and a consequent difficulty in comparison, evaluation and replication, which hinders the definition of the state of the art. Accordingly, the aim of this scoping review was to analyze the latest scientific research on air-quality modelling, focusing on particulate matter, identifying the most widespread solutions and trying to compare them. The review was mainly focused on, but not limited to, machine learning applications. An initial set of 940 results published in 2022 was returned by search engines, 142 of which were found to be relevant and were analyzed. Three main modelling scopes were identified: correlation analysis, interpolation and forecast. Most of the studies concerned east and south-east Asia. The majority of models were multivariate, including (besides ground stations) meteorological information, satellite data, land use and/or topography, and more. A total of 232 different algorithms were tested across studies (either as single blocks or within ensemble architectures), of which only 60 were tested more than once. A performance comparison showed stronger evidence towards the use of Random Forest modelling, in particular when included in ensemble architectures. However, it must be noted that results varied significantly according to the experimental set-up, indicating that no overall best solution can be identified and that a case-specific assessment is necessary.

1 Introduction

Air pollution is considered by the United Nations as one of the most significant environmental risks to health worldwide, and is consequently addressed in the United Nations Sustainable Development Goals [1]. Air quality can vary significantly across territories, even at high geographic and temporal granularity [2]: an accurate assessment of population exposure to pollutants would require an almost continuous distribution of measuring ground stations, an approach far from being feasible. Hence, scientific research has spent significant efforts in implementing air quality models, in order to increase the usability of measurements that are limited in time and space by inferring more detailed information through data processing. Many different models have been implemented over the years, applying diverse approaches [3], the most common being Kriging interpolation and land-use regression (LUR), along with more complex processing frameworks such as chemical transport models (CTM). However, recent reviews on the topic [3,4,5,6] highlighted some critical issues with these widespread approaches, namely the limited performance of the more basic statistical models (Kriging, LUR) and the high requirements in terms of data and computational capabilities of the more complex models (CTM). For these reasons, an exponential increase in the implementation of models based on machine learning (ML) algorithms has emerged in recent years, and ML is now the most widespread approach in scientific research in this field, setting a new state of the art in particular in relation to health impact assessment, where advanced data processing and geographical modelling are taking over from more traditional approaches [7]. Such models offer a performance comparable (or even superior) to CTM, while relying on less data and requiring less computational capabilities.
However, ‘machine learning’ is a macro-category, which includes many algorithms with significant differences in terms of mathematical background, applicability, complexity and interpretability; the number of different solutions is virtually infinite, as these algorithms can also be modified and adapted to specific frameworks. Furthermore, multiple ML algorithms can be used as separate ‘functional blocks’ in the different phases of a single modelling process, to build ‘ensemble’ architectures. What emerges is a complex scenario, with the spread of many different solutions and a consequent difficulty in comparison, evaluation and replication, which hinders the definition of the state of the art. As a consequence, it may be difficult to identify the best solutions to be tested when designing a new project focused on air quality.

In this context, the objective of this scoping review was to analyze the latest scientific research on the topic of ML applied to air quality modelling, focusing in particular on particulate matter (PM), known to be a serious hazard for human health [8, 9]. The intent was to identify the most widespread solutions and to try to compare them according to level of evidence, thus identifying requirements and possibly supporting the design of future projects in the field. Therefore, the research goal of this review is to verify whether machine learning has become the state-of-the-art methodology in air quality modelling (either globally or in limited areas), whether specific architectures and algorithms outperform other solutions, and what performance can be expected from such models.

The manuscript is structured in the following sections. Sect. 2, Review methodology, describes the procedure of collection and analysis of relevant scientific literature. Sect. 3, Objective of selected studies, classifies the identified studies according to their aim, distinguishing explorative correlation analysis, interpolation and forecast. Sect. 4, Geographic distribution, analyzes the geographic distribution of the studies. Sect. 5, Input data sources, assesses the input data on which models were based. Sect. 6, Used algorithms and estimated performance, provides an analysis and comparative evaluation of the different solutions implemented. Sect. 7 presents a critical discussion of the results and the consequent conclusions.

2 Review methodology

The explored database was Google Scholar; the keywords applied in the query were ‘pollution’, ‘PM’ and ‘particulate matter’, ‘interpolation’, ‘prediction’ and ‘forecast’, ‘machine learning’ and ‘ML’. Reflecting the strong attention towards this topic, the number of potentially relevant results returned by the search engine was very high, with over 7000 results (in English language). In order to capture the trend of the state-of-art evolution, the search was therefore limited to the last year (2022) only, thus reducing the number of potential results to 940. Based on the title and abstract, articles were selected as relevant if the study included the development (or at least the use) of a specific model to estimate PM concentration. The number was thus further reduced to 169. Finally, after full-text reading, 142 relevant studies (with the same criterion) were identified, including 4 literature reviews [3,4,5,6] and 138 observational studies [10–147]. These 138 studies were analyzed, collecting structured information on (1) the study objective (primary and, where present, secondary), (2) the target pollutant(s), (3) the data sources, (4) the method of attribute selection, (5) the target territory, (6) the spatial and time resolution, (7) the models tested, (8) the performance evaluation method and results, and (9) the final model selected. This information was manually recorded and structured in a relational database, with a pre-defined codified language that allowed comparison and processing.
Once the information was structured within this framework, it was possible to automate the subsequent analyses, implementing them in the Python programming language (v 3.7). With this approach, it was possible to quickly obtain statistics and graphics, as well as the list of references corresponding to each identified group of studies. A first round of analysis concerned the study objectives, resulting in a classification that allowed research sub-groups to be identified. Once this classification was performed, all subsequent analyses were repeated separately on the whole database and on the single groups. For qualitative information (such as items 1–5, 7 and 9), basic statistics were extracted, discussing absolute and relative frequencies of the different labels, generally aggregating single-spot elements (i.e. found in one study only across the database) in the ‘other’ category. For quantitative information (items 6 and especially 8), a more advanced analysis was applied, assessing the cumulated results as average and confidence interval, and evaluating the robustness of the results in terms of number of studies in relation to category-based sub-groups. It is worth noting, however, that the different aspects of the analysis were addressed one by one, adjusting the methodology where required by the specific needs.
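The frequency analysis described for qualitative information can be sketched as follows. This is only an illustration under stated assumptions: the data structure (one set of labels per study), the function name `label_frequencies` and the toy records are hypothetical and are not taken from the review's database or code.

```python
from collections import Counter

def label_frequencies(labels_per_study, min_studies=2):
    """Absolute and relative frequency of each label across studies.

    labels_per_study: list with one set of labels per study (hypothetical layout).
    Labels found in fewer than `min_studies` studies are merged into 'other',
    which counts the studies carrying at least one such rare label.
    """
    n_studies = len(labels_per_study)
    counts = Counter(label for labels in labels_per_study for label in set(labels))
    rare = {label for label, count in counts.items() if count < min_studies}

    result = {
        label: (count, round(100 * count / n_studies, 2))
        for label, count in counts.items()
        if label not in rare
    }
    other = sum(1 for labels in labels_per_study if set(labels) & rare)
    if other:
        result["other"] = (other, round(100 * other / n_studies, 2))
    return result

# Made-up toy records, not data from the review
toy_records = [
    {"meteorology", "satellite"},
    {"meteorology"},
    {"traffic"},
    {"meteorology", "wildfires"},
]
toy_frequencies = label_frequencies(toy_records)
```

In this toy case, 'meteorology' is kept as its own label, while 'satellite', 'traffic' and 'wildfires' each appear in one study only and are grouped under 'other'.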

3 Objective of selected studies

The 138 identified observational studies were classified according to their primary goal. In particular, 3 categories were identified:

A. Correlation analysis: these studies aimed at analyzing the impact of the different considered data sources on PM concentration, possibly evaluating their weight within the implemented ML models. This category included 14 (10.14%) studies [10–23].
B. Spatial and/or temporal interpolation models: these studies aimed at inferring missing data in space (missing records and/or computation of continuous mapping from discrete points) and/or in time. This category included 65 (47.1%) studies [26–90].
C. Forecast models: these studies aimed at developing and validating predictive models to forecast the future concentration of PM. This category included 56 (40.58%) studies [92–147].

On top of this classification, some additional studies should be considered, whose main scope was not PM modelling, yet in which a model for PM concentration (possibly externally developed) was used. Specifically, there were 3 such studies: two of them [24, 25] had explorative correlation analysis as a secondary purpose (category A), while the third [91] had spatial-temporal interpolation as a secondary scope (category B). Therefore, the final numbers of studies in the three categories were:

A. Correlation analysis: 14 + 2 = 16 (11.59%) studies [10–25].
B. Interpolation: 65 + 1 = 66 (47.83%) studies [26–91].
C. Forecast: 56 (40.58%) studies [92–147].

However, it is worth noting that this classification should not be interpreted as rigid, as many studies could be assigned to more than one category when secondary aims were taken into account. A graphical representation of sub-categories is reported in Fig. 1.
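The category counts and shares reported in this section can be re-derived with a few lines of arithmetic. The sketch below only uses the counts stated in the text; the variable and function names are illustrative.

```python
TOTAL_STUDIES = 138  # observational studies analyzed in the review

# (studies by primary goal, studies added for a secondary goal), as reported in the text
categories = {
    "A (correlation analysis)": (14, 2),
    "B (interpolation)": (65, 1),
    "C (forecast)": (56, 0),
}

def share(count, total=TOTAL_STUDIES):
    """Percentage of `count` over `total`, rounded to two decimals."""
    return round(100 * count / total, 2)

final_counts = {name: primary + secondary
                for name, (primary, secondary) in categories.items()}
final_shares = {name: share(count) for name, count in final_counts.items()}

for name in categories:
    print(f"{name}: {final_counts[name]} studies ({final_shares[name]}%)")
```

The three final counts add up to the 138 observational studies, which confirms that each study was assigned to exactly one category in this classification.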

4 Geographic distribution

Considering the territory under analysis, the vast majority of the studies (112, 81.16%) concerned Asia, in particular east and south-east. Among them, the largest contribution was provided by China [11,12,13, 19,20,21, 25,26,27,28,29,30, 34, 35, 37, 38, 40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60, 62, 63, 71,72,73,74,75,76,77,78,79,80, 93, 98, 100,101,102,103,104,105, 115,116,117, 123,124,125,126,127,128,129,130,131,132,133,134,135], which represented, with 73 studies (52.9%), more than half of the total. A second block included India [15, 24, 35, 61, 83,84,85, 94, 106, 120, 121, 136, 137] (13, 9.42%), South Korea [22, 26, 60, 62, 63, 87, 111, 118, 140,141,142,143] (12, 8.7%) and USA [10, 17, 18, 23, 33, 66,67,68, 89, 97, 114] (11, 7.97%), while the other countries addressed in more than one publication were Japan [60, 63, 95, 122] and Thailand [16, 64, 112, 144] (4 each, 2.9%), Taiwan [69, 96, 119] and UK [35, 113, 147] (3 each, 2.17%), Spain [88, 145], Germany [82, 91], Iran [107, 138], Malaysia [108, 109] and Canada [39, 70] (2 each, 1.45%). The complete count (subdivided by study category according to the classification defined in Sect. 3) is reported in Table 1.

Table 1 Number of studies per country relevant to machine-learning based modelling of particulate matter concentration; for references of countries with more than one occurrence, please refer to the main text

When normalizing the number of studies by the population of the different target countries (number of studies per 10 million inhabitants), the most studied country was by far South Korea (2.34), while (considering only countries with more than one publication) the second was Taiwan (1.26). Countries with the largest absolute numbers had lower values: 0.51 for China, 0.09 for India, 0.33 for USA. A graphical representation of the normalized number of studies is provided in Fig. 2, while complete results (also subdivided by study category according to the classification defined in Sect. 3) are reported in Table 2.
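As a minimal sketch of the normalization used above (studies per 10 million inhabitants), the function below divides a study count by the population expressed in tens of millions. The population value in the example is an assumption: a rounded figure of about 23.8 million for Taiwan, chosen for illustration and not taken from the article, which does not state the population data it used.

```python
def studies_per_10_million(n_studies, population):
    """Number of studies per 10 million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n_studies / (population / 10_000_000)

# Study count from the text (3 studies on Taiwan);
# the population is an assumed, rounded illustrative value.
ASSUMED_TAIWAN_POPULATION = 23_800_000
taiwan_rate = studies_per_10_million(3, ASSUMED_TAIWAN_POPULATION)
print(round(taiwan_rate, 2))
```

With this assumed population, the result rounds to the 1.26 reported for Taiwan; other population estimates would shift the value slightly.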

Table 2 Number of studies per country, relevant to machine-learning based modelling of particulate matter concentration, normalized on the country’s population

5 Input data sources

Concerning the data sources used in the modelling, a first distinction should be made between univariate and multivariate models. Univariate models were based on a single data source, represented by the time-series of PM concentration as recorded by ground stations. There were 15 such studies [31, 35, 36, 83, 85, 87,88,89, 95, 113, 116, 130, 132, 135, 147] in total (10.87%), 8 of which were interpolation models (12.12% of category B) and 7 were predictive models (12.5% of category C). No study of this kind belonged to category A, whose scope is precisely to evaluate the impact of other data sources on the target (concentration of PM).

The majority of studies were multivariate: 123 in total (89.13%), 16 for correlation analysis (100%), 58 for interpolation models (87.88%) and 49 for prediction models (87.5%). In such models, the most used data source (besides ground stations) was meteorological information, with 92 [13,14,15,16, 19,20,21, 24,25,26,27, 30, 34, 38, 40, 42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63, 65,66,67,68, 70, 71, 73, 74, 76,77,78,79,80,81,82, 84, 90, 91, 93, 94, 96,97,98,99,100,101,102,103,104,105, 107, 109,110,111,112, 114, 115, 117, 119, 122, 124,125,126,127, 129, 133, 134, 136,137,138,139,140, 142, 144, 145] studies (74.8% of all multivariate models), followed by satellite data, where 45 [15, 17, 26, 29, 30, 32, 40, 42, 43, 45,46,47,48, 50,51,52,53, 55, 57, 60,61,62,63, 65, 67, 70, 73,74,75,76,77,78,79,80,81,82, 93, 107, 112, 115, 120, 133, 134, 141, 144] studies (36.59%) used Aerosol Optical Depth (AOD), and 37 [10,11,12,13, 15, 17,18,19,20, 22, 26, 34, 37, 40, 41, 43, 45, 47, 48, 51, 52, 54, 56, 58, 60, 62,63,64,65, 68, 70,71,72,73, 79,80,81,82, 86, 93, 97, 110, 112, 115, 117, 129, 131, 139, 144] (30.08%) used other satellite imagery. Other frequently included variables were land use and/or topography, present in 51 [11,12,13, 16, 18, 22, 23, 25, 26, 30, 33, 34, 38, 39, 42, 43, 45,46,47,48,49,50,51, 53,54,55,56, 58, 59, 61, 63, 64, 66, 68, 71, 73,74,75,76,77, 80, 81, 92, 93, 101, 112, 115, 122, 133, 139, 144] studies (41.46%). Less frequently included data sources were measurements of other pollutants, other previously implemented models, demography, ad-hoc micro-sensor networks, road traffic information and wildfire localization. Furthermore, 12 [10, 11, 13, 18,19,20, 22, 34, 51, 80, 110, 129] studies (9.76%) used other specific categories of variables not used in any other study, and therefore not classified. A complete description of the data sources included in multivariate models, also subdivided by study category, is reported in Table 3.

Table 3 Data sources used in studies relevant to machine-learning based modelling of particulate matter concentration, subdivided according to study aim

6 Used algorithms and estimated performance

With regard to the type of the final algorithm chosen, a first distinction can be made between single-block models and ‘ensemble’ architectures. Single-block models are algorithms trained for the specific intended task, while ‘ensemble’ architectures are systems composed of multiple functional blocks, in series and/or in parallel, each of which is basically a single-block model performing a specific sub-task within the overall framework. This more complex approach was tested in 43 [22, 24, 25, 29,30,31, 36, 42, 44, 46, 47, 64, 67, 84, 89, 97,98,99,100,101,102, 104,105,106,107,108, 110, 117, 118, 121, 123, 126, 127, 129,130,131, 133, 135, 138, 141, 142, 145, 147] studies (31.16%), 3 of which [22, 24, 25] were correlation analyses (18.75% of category A), 12 [29,30,31, 36, 42, 44, 46, 47, 64, 67, 84, 89] were interpolation models (18.18% of category B), and 28 [97,98,99,100,101,102, 104,105,106,107,108, 110, 117, 118, 121, 123, 126, 127, 129,130,131, 133, 135, 138, 141, 142, 145, 147] were prediction models, accounting for 50% of category C. As a result, this last category represents the application field where the use of ensemble architectures was most widespread.

For the sake of comparison, in the following analysis all algorithms were considered individually, even when they were part of a more complex structure. The reason is that ensemble architectures are basically unique and implemented ad hoc for each specific project, meaning that the same architecture was never used more than once across all considered studies, thus preventing any kind of comparison.

The analysis covered both the algorithms that were tested and evaluated and those that were selected for the final implementation as the best performing. A first relevant result is the use (in the test phase) of 232 different algorithms across all the studies, of which only 60 (25.86%) were tested in more than one study. Similarly, considering the final model chosen as the best performing, only 20 solutions out of a total list of 108 (18.52%) were selected in more than one study.
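The counting approach described above, where each algorithm is considered individually even when it is one block of an ensemble, can be sketched as follows. The record layout (strings for single-block models, tuples for ensembles) and the toy studies are assumptions made for illustration; they do not reproduce the review's database.

```python
from collections import Counter

def component_algorithms(entries):
    """Flatten one study's tested models into a set of algorithm names.

    Each entry is either a string (single-block model) or a tuple of
    strings (the blocks of an ensemble architecture); hypothetical layout.
    """
    algorithms = set()
    for entry in entries:
        if isinstance(entry, str):
            algorithms.add(entry)
        else:
            algorithms.update(entry)
    return algorithms

def repetition_summary(studies):
    """Return (number of distinct algorithms, sorted list of those used in >1 study)."""
    counts = Counter(a for entries in studies for a in component_algorithms(entries))
    repeated = sorted(a for a, n in counts.items() if n > 1)
    return len(counts), repeated

# Made-up toy studies, not data from the review
toy_studies = [
    ["RF", "XGBoost"],        # two single-block models
    [("CNN", "LSTM")],        # one ensemble with two blocks
    ["RF", ("LSTM", "GRU")],  # a single block plus an ensemble
]
n_distinct, repeated = repetition_summary(toy_studies)

# Shares reported in the text, recomputed from the stated counts
share_tested_repeated = round(100 * 60 / 232, 2)
share_selected_repeated = round(100 * 20 / 108, 2)
```

Counting at the level of components rather than whole architectures is what makes any repetition visible at all, since no ensemble architecture was reused across studies.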

Considering only the repeated solutions, the most frequently tested algorithm was Random Forest (RF: 43 [10, 12, 13, 19, 20, 37, 39, 42, 45, 46, 48, 52, 54, 55, 58,59,60,61, 63, 64, 68, 70, 71, 73, 75, 89, 90, 95, 99, 101, 103, 107, 109, 110, 113, 114, 117, 120, 122, 134, 136, 140, 141], 31.16%), followed by Long Short-Term Memory (LSTM: 34 [36, 38, 46, 84, 94, 96, 98, 100, 102,103,104,105, 111, 113, 114, 116, 119, 125,126,127, 129, 130, 137, 138, 142, 143, 145], 24.64%) and by Convolutional Neural Networks (CNN: 19 [84, 86, 94, 96, 100, 108, 113, 114, 116, 126, 127, 130, 142, 145], 13.77%). With regard to the choice of the best-performing algorithm, the two most frequently selected were again RF (27 [10, 12, 13, 15, 19, 20, 37, 39, 42, 45, 48, 52, 60, 63, 64, 68, 70, 73, 89, 95, 99, 101, 109, 110, 117, 122, 141], 19.57%) and LSTM (12 [24, 36, 84, 96, 98, 103, 105, 108, 126, 129, 136, 145], 8.7%), while the third was eXtreme Gradient Boosting (XGBoost: 7 [26, 42, 51, 59, 61, 107, 110], 5.07%, tested in 12 [26, 42, 51, 55, 61, 90, 103, 107, 110, 113, 140], 8.7%).

Considering the choice of the best-performing algorithm according to the different application fields and study objectives as categorized in Sect. 3, for correlation analysis (category A) the most used algorithm was RF, with 6 [10, 12, 13, 15, 19, 20] studies out of 16 (37.5%), followed by Geographically Weighted Regression (GWR: 2 [11, 18], 12.5%); in all the other cases, ad-hoc solutions were implemented and not repeated anywhere else, while in 4 further cases [21,22,23, 25] (25%) the analysis was based on classic statistical methods, meaning that there was no actual model development. In interpolation models (category B), RF was again the most widely adopted solution (13 [37, 39, 42, 45, 48, 52, 60, 63, 64, 68, 70, 73, 89], 19.7%), followed by XGBoost (5 [26, 42, 51, 59, 61], 7.58%) and Deep Forest (DF: 4 [50, 53, 54, 58], 6.06%). Within predictive models (category C), the most used was instead LSTM (9 [96, 98, 103, 105, 108, 126, 129, 136, 145], 16.07%), followed by RF (8 [95, 99, 101, 109, 110, 117, 122, 141], 14.29%), while three different algorithms shared the third highest frequency (3, 5.36%), namely CNN [100, 126, 138], AutoRegressive Moving Average (ARMA) [110, 130, 132] and Chemical Transport Models (CTM [101, 131, 147], always included in ensemble architectures in the analyzed studies). The frequency of application of the most widespread algorithms, subdivided according to study category, is reported in Fig. 3.

With regards to performance, the most common parameter (118 studies, 85.51%) considered for comparison was root mean squared error (RMSE), expressed as µg/m³ and reported as median [1st quartile – 3rd quartile] or as 95% confidence interval (Table 4).
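For reference, RMSE is the square root of the mean squared difference between observed and predicted values, so it carries the same unit as the concentration (µg/m³). The sketch below uses the standard definition; the function name and the toy values are illustrative and are not taken from any of the reviewed studies.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up toy concentrations in µg/m³
toy_observed = [10.0, 20.0, 30.0]
toy_predicted = [12.0, 18.0, 33.0]
toy_rmse = rmse(toy_observed, toy_predicted)  # sqrt((4 + 4 + 9) / 3)
```

Because the metric is expressed in absolute units, values obtained on datasets with very different concentration ranges are not directly comparable, which is worth keeping in mind when reading the cross-study figures.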

Table 4 Statistics about the performance evaluation (through root mean squared error, RMSE) of algorithms used in studies relevant to machine-learning based modelling of particulate matter concentration, subdivided according to study aim

The following analysis considered the performance evaluation of the single algorithms applied, although the need for a common metric meant that only studies reporting RMSE could be included (118 studies, 85.51%, as per Table 4); given the nature of the metric (an error measurement), lower values correspond to higher performance. Moreover, for statistical robustness, only algorithms that were used more than once (i.e. in at least two different studies) were included, whether they were applied as a single-block framework or as one of the multiple blocks in an ensemble architecture. In the first case (single blocks), the best performance was that of LSTM (5.75 ± 5.18 µg/m³), although the evidence is quite low, as it was used in two studies only [96, 103]. XGBoost follows with 7.78 ± 4.68 µg/m³ and a higher level of evidence, being used in 5 studies [26, 51, 59, 61, 107]. The highest level of evidence was reached for RF, applied in 19 studies [10, 12, 13, 15, 19, 20, 37, 45, 48, 52, 60, 63, 68, 70, 73, 89, 95, 109, 122], with a lower but comparable performance of 12.79 ± 9.1 µg/m³, while the worst-performing solution (mainly due to a large confidence interval among the 4 cases of application [50, 53, 54, 58]) was Deep Forest (DF) with 27 ± 14.61 µg/m³.
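The aggregation described above (mean ± confidence interval per algorithm, restricted to algorithms used in at least two studies) could be sketched as follows. The article does not state how the interval was computed, so the normal approximation with z = 1.96 is an assumption, as are the function names and the toy values; with very few studies a Student's t quantile would give a wider interval.

```python
import math
import statistics

def mean_and_ci_half_width(values, z=1.96):
    """Mean and half-width of an approximate 95% confidence interval.

    Assumes a normal approximation (z = 1.96); this choice is an
    assumption, not the method documented in the review.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    mean = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width

def summarize(rmse_by_algorithm, min_studies=2):
    """Keep only algorithms reported in at least `min_studies` studies."""
    return {
        algorithm: mean_and_ci_half_width(values)
        for algorithm, values in rmse_by_algorithm.items()
        if len(values) >= min_studies
    }

# Made-up RMSE values (µg/m³), one per hypothetical study
toy_rmse_values = {"algo_a": [4.0, 6.0, 8.0], "algo_b": [10.0]}
toy_summary = summarize(toy_rmse_values)
```

Here 'algo_b' is dropped because it appears in a single study, mirroring the robustness filter described in the text.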

When considering the second group, ensemble architectures, the best performance was reached when an RF was included (10.44 ± 7.64 µg/m³, with 6 cases of application [42, 64, 99, 101, 110, 117]), followed by LSTM (13.15 ± 9.1 µg/m³, with 7 cases of application [24, 36, 84, 98, 105, 108, 126]). A comparative graphical representation of these results is reported in Fig. 4.

As previously stated, the specific aim of each study has a primary impact on the implemented algorithms and their performance. To take this into account, this performance analysis was also conducted separately for the three categories identified in Sect. 3.

For category A, correlation analysis, it must be noted that the full implementation of a model is not a requirement to fulfill the goal and, as a result, only two algorithms could be evaluated: RF, 5.05 ± 4.04 µg/m³ on 3 applications [10, 19, 20], and GWR, 30.57 ± 20.8 µg/m³ on 2 applications [11, 18]. In category B, interpolation models, the best results were obtained with LSTM within an ensemble architecture (9.9 ± 2 µg/m³), although with low evidence (only 2 cases of application [36, 84]). The most frequently applied approach was the implementation of a single-block framework based on an enhanced decision-tree-like algorithm, such as an Extremely Randomized Tree (11.22 ± 1.28 µg/m³ with 2 cases [81, 90]), Deep Forest (DF: 16.66 ± 4.5 µg/m³ with 4 applications [50, 53, 54, 58]) and XGBoost (13.95 ± 5.15 µg/m³ with 3 applications [51, 59, 61]). Such use of enhanced decision trees showed, although with lower evidence, a higher performance when compared with a basic RF (14.9 ± 8.45 µg/m³ with 10 applications [37, 45, 48, 52, 60, 63, 68, 70, 73, 89]). A graphical representation of these performance results (category B) is provided in Fig. 5. Concerning category C, predictive modelling, the best results were obtained with a single-block LSTM (4.32 ± 3.23 µg/m³ with 3 applications [96, 103, 136]), closely followed by a single RF (6.49 ± 4.91 µg/m³ with 3 applications [95, 109, 122]), while among ensemble architectures the best performing were those including a CNN (5.81 ± 5.46 µg/m³ with 3 applications [100, 126, 145]). The most frequently applied approaches were the inclusion, in the ensemble architecture, of either an RF [99, 101, 110, 117, 141] (12 ± 7.93 µg/m³) or an LSTM [98, 105, 126, 129, 145] (13.7 ± 9.1 µg/m³), both with 5 cases of application. Lower performance, again mainly due to a high range of values in the 3 cases of application [101, 131, 147], was provided by ensemble models including a CTM (18.79 ± 11.99 µg/m³).
A graphical representation of such results (category C) is provided in Fig. 6.

7 Conclusions

A scientific literature review was performed on the topic of advanced computational techniques (mainly machine learning, ML) applied to air quality models, with a specific focus on particulate matter (PM). This topic proved to be of very high interest for the international scientific community, with an impressive volume of scientific literature. As a matter of fact, by considering a single year (2022) it was already possible to identify a total of 138 relevant studies to be included and fully analyzed. While, on one side, this represents a limitation (resulting in a very short time period for the review), it is also to be considered as a relevant result in itself, showing a unique level of interest and attention from the scientific community towards this field, and its exceptional dynamism and speed of advancement.

According to the analysis, ML is the leading-edge technology in air quality modelling, and recent scientific research confirms a widespread application of this approach, thus positively answering (on a global scale) the primary research question addressed in this work. In particular, three main fields of application emerged, with the largest share of studies focused on the spatial and/or temporal interpolation of data (either filling gaps in recordings or inferring a continuous measurement from sparse samples), followed closely by prediction models for concentration forecast, and a smaller number of studies focused on a correlation analysis between explanatory factors and concentration levels.

Despite a sufficiently wide geographical distribution of the countries under examination, most of the production was focused on east and south-east Asia, in particular on China (in absolute numbers) and South Korea (in proportion to the population). The most frequent approach in PM modelling was to implement multivariate models (almost 9 cases out of 10), including additional measurements on top of ground stations, mainly meteorological data and satellite-derived information (such as AOD), but many more additional data were frequently considered (land use, demography, previous models etc.), thus confirming established knowledge in the field [4].

With regard to the implemented methodological solutions, a strong sparsity was found, with the vast majority of studies developing unique ad-hoc frameworks. While the literature [3] highlights that there is no single best solution suiting all needs, which can to some extent explain the recorded sparsity, this variety hinders replicability and therefore comparisons, thus being a potential barrier to identifying best practices for future studies on this topic. As a result, the second research question is left unanswered, as it was not possible to identify a specific architecture or model that consistently outperforms other solutions.

Despite this obstacle, it was possible to draw some significant conclusions, and to address the last research question about the expected performance. In particular, the most interesting assessment regards the relationship between the estimated performance and the level of complexity of the models. In this sense, a primary distinction can be made between classic ML (e.g. RF) and Deep Learning (DL, e.g. CNN), with the latter being, in line with the literature [4,5,6], more widespread for prediction tasks. However, the overall evidence does not point clearly to a superiority of DL over the simpler basic ML. As a matter of fact, when considering single-block frameworks, DL shows better performance, although with evidence of different robustness: for instance, it is possible to consider DL approaches such as LSTM with RMSE = 5.75 ± 5.18 µg/m³, against basic ML such as XGBoost with RMSE = 7.78 ± 4.68 µg/m³ or RF with RMSE = 12.79 ± 9.1 µg/m³. When instead considering ensemble architectures, the opposite result emerges, as recorded for RF, with RMSE = 10.44 ± 7.64 µg/m³, and LSTM, with RMSE = 13.15 ± 9.1 µg/m³. It must be noted that, considering the large overlap in confidence intervals, there is no evidence of a higher suitability of one approach over the other. However, it is still possible to partly confirm previous literature results [4, 5] in the field of DL. For instance, a primary role of LSTM and CNN, considered by the literature to solve many issues affecting older approaches (e.g. the vanishing gradient issue), was verified.
On the contrary, other parts of established knowledge were not confirmed, such as the preferability of the Gated Recurrent Unit (GRU) over other approaches, which did not emerge clearly in this analysis: only one case [24] was recorded in which it was selected as the best solution (moreover, in this case, GRU was put in series with an LSTM to build an ensemble architecture, reaching a higher performance compared to each of the two used individually). In any case, it is advisable to always base the choice on the final goal of the modelling task. As a matter of fact, ML offers the possibility to easily implement explainable-AI models [148], which is vital to generate evidence about the impact of different factors on the levels of pollution, thus generating insights for policy makers about a proper management of the territory in terms of land use [149] and human activities.
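The overlap between the reported intervals can be checked directly from the figures quoted in this section. The sketch below treats each "mean ± value" as a symmetric interval, which is an interpretation of the notation rather than something the article specifies, and interval overlap is only a rough heuristic, not a formal statistical test.

```python
from itertools import combinations

def interval(mean, half_width):
    """Symmetric interval (low, high) around a mean."""
    return (mean - half_width, mean + half_width)

def overlaps(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# RMSE figures (µg/m³) quoted in the text
single_block = {
    "LSTM": interval(5.75, 5.18),
    "XGBoost": interval(7.78, 4.68),
    "RF": interval(12.79, 9.1),
}
ensemble = {
    "RF": interval(10.44, 7.64),
    "LSTM": interval(13.15, 9.1),
}

all_single_block_overlap = all(
    overlaps(single_block[a], single_block[b])
    for a, b in combinations(single_block, 2)
)
ensemble_overlap = overlaps(ensemble["RF"], ensemble["LSTM"])
```

Every pair of intervals overlaps under this reading, which is consistent with the statement that the data give no evidence for one approach over the other.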

Secondly, in addition to the distinction between classic ML and DL, another important distinction is between single-block frameworks and the more complex ensemble architectures. Previous reviews [3,4,5,6] generally agreed in identifying ensemble architectures as the best-performing approaches. At first sight, this review seems to show a different result, with a lower average performance of ensemble architectures across the different studies. However, of the 37 studies that implemented an ensemble architecture (and were included in the comparison, being evaluated through RMSE), the vast majority (26, 70.3%) made this choice after comparing the ensemble architecture with single-block models, thus recording a performance increase when the more complex solution was tested. It is therefore possible to hypothesize that the overall higher performance of single-block models is actually due to the different experimental set-ups, rather than to the characteristics of the models themselves. This hypothesis is corroborated by the fact that the reverse scenario, in which a single-block model was preferred over an ensemble architecture when both were tested, was very rare, with only 3 cases out of 96 (3.13%). Therefore, while an ensemble architecture can help reach higher performance, the appropriateness of this approach depends strongly on the experimental set-up. An increase in complexity does not automatically result in higher performance, which partly contradicts previous literature. The trade-off between complexity and expected performance should therefore be carefully analyzed case by case, according to the needs and specifics of the context, application scenario, aim, available data sources, and characteristics of target and explanatory data, so that different choices may be suitable in different situations.
While some solutions proved generally more robust and are supported by stronger evidence than others, an extensive comparative analysis of different models is recommended when implementing a new solution. Choosing the model that is reported to have the best numerical performance can be misleading, as the quantitative evaluations proved to depend more on the initial set-up than on the developed model.
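
The recommendation above, to compare candidate models under one identical experimental set-up, can be sketched in Python as follows. This is a minimal illustration and not a method from any reviewed study: the data are synthetic, the single predictor is hypothetical, and the three toy models (a mean baseline, a one-variable linear regression, and a plain average of the two standing in for an "ensemble") are assumptions chosen only to keep the example self-contained.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares with a single predictor."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_average_ensemble(xs, ys):
    """Toy 'ensemble': unweighted average of the two models above."""
    members = [fit_mean(xs, ys), fit_linear(xs, ys)]
    return lambda x: sum(m(x) for m in members) / len(members)


def cross_validated_rmse(fit, xs, ys, folds):
    """Mean RMSE over the given folds; each fold is a list of test indices."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        scores.append(rmse([ys[i] for i in test_idx], [model(xs[i]) for i in test_idx]))
    return sum(scores) / len(scores)


# Synthetic data (assumption): a linear signal plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 50) for _ in range(200)]
ys = [5 + 2 * x + rng.gauss(0, 3) for x in xs]

# The same shuffled 5-fold split is reused for every candidate model.
indices = list(range(len(xs)))
rng.shuffle(indices)
k = 5
folds = [indices[i::k] for i in range(k)]

candidates = {
    "mean baseline": fit_mean,
    "linear regression": fit_linear,
    "average ensemble": fit_average_ensemble,
}
cv_scores = {name: cross_validated_rmse(fit, xs, ys, folds) for name, fit in candidates.items()}

for name, score in sorted(cv_scores.items(), key=lambda item: item[1]):
    print(f"{name}: cross-validated RMSE = {score:.2f}")
```

Because the synthetic data are generated by a linear rule, the single linear model scores better here than the more complex average, a toy instance of the point that added complexity does not automatically improve performance. The ranking is a property of this set-up only and says nothing about the algorithms covered by the review.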

In conclusion, this study shows that the target field is one of the fastest-evolving and most varied applications of machine learning technologies. In this scenario, a relevant application of the performed analysis is to provide researchers in this field with a reference framework for addressing the topic, having identified the most relevant features of case studies to be taken into account when defining the experimental set-up. In light of all the above, while this literature review can be considered a reference for general benchmarking, even greater relevance should be attributed to the methodological guidelines proposed.

Data availability

All original data, including a formatted table of references and the processing code, are made available on request.

References

Rafaj, P., Kiesewetter, G., Gül, T., Schöpp, W., Cofala, J., Klimont, Z., Purohit, P., Heyes, C., Amann, M., Borken-Kleefeld, J., & Cozzi, L. (2018). Outlook for clean air in the context of sustainable development goals. Global Environmental Change, 53, 1–11. https://doi.org/10.1016/j.gloenvcha.2018.08.008.
Tanzer, R., Malings, C., Hauryliuk, A., Subramanian, R., & Presto, A. A. (2019). Demonstration of a low-cost Multi-pollutant Network to quantify Intra-urban spatial variations in Air Pollutant Source impacts and to Evaluate Environmental Justice. International Journal of Environmental Research and Public Health, 16(14), 2523. https://doi.org/10.3390/ijerph16142523.
Gardner-Frolick, R., Boyd, D., & Giang, A. (2022). Selecting Data Analytic and Modeling Methods to Support Air Pollution and Environmental Justice Investigations: A Critical Review and Guidance Framework. Environmental Science and Technology, 56(5), 2843–2860. https://doi.org/10.1021/acs.est.1c01739.
Zaini, N., Ean, L. W., Ahmed, A. N., & Malek, M. A. (2022). A systematic literature review of deep learning neural network for time series air quality forecasting. Environmental Science and Pollution Research, 29(4), 4958–4990. https://doi.org/10.1007/s11356-021-17442-1.
Mehmood, K., Bao, Y., Saifullah Cheng, W., Khan, M. A., Siddique, N., Abrar, M. M., Soban, A., Fahad, S., & Naidu, R. (2022). Predicting the quality of air with machine learning approaches: Current research priorities and future perspectives. Journal of Cleaner Production, 379. https://doi.org/10.1016/j.jclepro.2022.134656.
Gugnani, V., & Singh, R. K. (2022). Analysis of deep learning approaches for air pollution prediction. Multimedia Tools and Applications, 81(4), 6031–6049. https://doi.org/10.1007/s11042-021-11734-x.
Mahakalkar, A., Gianquintieri, L., Lorenzo Amici, L., Brovelli, M. A., & Caiani, E. G. (2024). Geospatial analysis of short-term exposure to air pollution and risk of cardiovascular diseases and mortality–A systematic review. Chemosphere, 353, 141495. https://doi.org/10.1016/j.chemosphere.2024.141495.
Xing, Y. F., Xu, Y. H., Shi, M. H., & Lian, Y. X. (2016). The impact of PM2.5 on the human respiratory system. Journal of Thoracic Disease, 8(1), 69–74. https://doi.org/10.3978/j.issn.2072-1439.2016.01.19.
Gianquintieri, L., Brovelli, M. A., Pagliosa, A., Bonora, R., Sechi, G. M., & Caiani, E. G. (2021). Geospatial correlation analysis between Air Pollution indicators and estimated speed of COVID-19 diffusion in the Lombardy Region (Italy). International Journal of Environmental Research and Public Health, 18(22), 12154. https://doi.org/10.3390/ijerph182212154.
Lucas, E., Cummings, J. D., Stewart, P. K., & Kabindra, M. S. (2022). Predicting citywide distribution of air pollution using mobile monitoring and three-dimensional urban structure. Sustainable Cities and Society, 76, 103510. https://doi.org/10.1016/j.scs.2021.103510.
Ren, W., Zhao, J., & Ma, X. (2022). Analysis of the spatial characteristics of inhalable particulate matter concentrations under the influence of a three-dimensional landscape pattern in Xi’an, China. Sustainable Cities and Society, 81, 103841. https://doi.org/10.1016/j.scs.2022.103841.
Su, Z., Lin, L., Chen, Y., et al. (2022). Understanding the distribution and drivers of PM2.5 concentrations in the Yangtze River Delta from 2015 to 2020 using Random Forest Regression. Environmental Monitoring and Assessment, 194, 284. https://doi.org/10.1007/s10661-022-09934-5.
Zeng, L., Hang, J., Wang, X., & Shao, M. (2022). Influence of urban spatial and socioeconomic parameters on PM2.5 at subdistrict level: A land use regression study in Shenzhen, China. Journal of Environmental Sciences, 114, 485–502. https://doi.org/10.1016/j.jes.2021.12.002.
Aldegunde, J. A. Á., Sánchez, A. F., Saba, M., Bolaños, E. Q., & Palenque, J. Ú. (2022). Analysis of PM2.5 and Meteorological variables using enhanced geospatial techniques in developing countries: A case study of Cartagena De Indias City (Colombia). Atmosphere, 13, 506. https://doi.org/10.3390/atmos13040506.
Basu, E., & Salui, C. L. (2021). Estimating particulate matter concentrations from MODIS AOD considering Meteorological parameters using Random Forest Algorithm. In P. K. Shit, P. P. Adhikary, & D. Sengupta (Eds.), Spatial modeling and Assessment of Environmental Contaminants. Environmental challenges and solutions. Springer. https://doi.org/10.1007/978-3-030-63422-3_29.
Cheewinsiriwat, P., Duangyiwa, C., Sukitpaneenit, M., & Stettler, M. E. J. (2022). Influence of Land Use and Meteorological factors on PM2.5 and PM10 concentrations in Bangkok, Thailand. Sustainability, 14, 5367. https://doi.org/10.3390/su14095367.
Yu, X., Lary, D. J., Simmons, C. S., & Wijeratne, L. O. H. (2022). High spatial-temporal PM2.5 modeling utilizing Next Generation Weather Radar (NEXRAD) as a supplementary Weather source. Remote Sens, 14, 495. https://doi.org/10.3390/rs14030495.
Chun, B., Choi, K., & Pan, Q. (2022). Key determinants of particulate matter 2.5 concentrations in urban environments with scenario analysis. Environment and Planning B: Urban Analytics and City Science. https://doi.org/10.1177/23998083221078306.
Liu, H., Yue, F., & Xie, Z. (2022). Quantify the role of anthropogenic emission and meteorology on air pollution using machine learning approach: A case study of PM2.5 during the COVID-19 outbreak in Hubei Province, China. Environmental Pollution, 300, 118932. https://doi.org/10.1016/j.envpol.2022.118932.
Zhang, Z., Xu, B., Xu, W., Wang, W., Gao, J., Li, Y., Li, M., Feng, Y., & Shi, G. (2022). Machine learning combined with the PMF model reveal the synergistic effects of sources and meteorological factors on PM2.5 pollution. Environmental Research, 212(B), 113322. https://doi.org/10.1016/j.envres.2022.113322.
Deng, C., Qin, C., Li, Z., & Li, K. (2022). Spatiotemporal variations of PM2.5 pollution and its dynamic relationships with meteorological conditions in Beijing-Tianjin-Hebei region. Chemosphere, 301, 134640. https://doi.org/10.1016/j.chemosphere.2022.134640.
Ahn, H., Lee, J., & Hong, A. (2022). Urban form and air pollution: Clustering patterns of urban form factors related to particulate matter in Seoul, Korea. Sustainable Cities and Society, 81, 103859. https://doi.org/10.1016/j.scs.2022.103859.
Singh, S., Johnson, G., & Kavouras, I. (2022). The Effect of Transportation and wildfires on the Spatiotemporal heterogeneity of PM2.5 Mass in the New York-New Jersey Metropolitan Statistical Area. Environmental Health Insights, 16. https://doi.org/10.1177/11786302221104016.
Sarkar, N., Gupta, R., Keserwani, P. K., & Govil, M. C. (2022). Air Quality Index prediction using an effective hybrid deep learning model. Environmental Pollution, 315, 120404. https://doi.org/10.1016/j.envpol.2022.120404.
Chen, J., Song, X., Zang, L., et al. (2022). Spatio-temporal association mining of intercity PM2.5 pollution: Hubei Province in China as an example. Environmental Science and Pollution Research. https://doi.org/10.1007/s11356-022-22574-z.
Pu, Q., & Yoo, E. H. (2022). A gap-filling hybrid approach for hourly PM2.5 prediction at high spatial resolution from multi-sourced AOD data. Environmental Pollution, 315, 120419. https://doi.org/10.1016/j.envpol.2022.120419.
Ma, J., Zhang, R., Xu, J., & Yu, Z. (2022). MERRA-2 PM2.5 mass concentration reconstruction in China mainland based on LightGBM machine learning. Science of the Total Environment, 827, 154363. https://doi.org/10.1016/j.scitotenv.2022.154363.
Xu, C., Wang, J., Hu, M., & Wei Wang, W. (2022). A new method for interpolation of missing air quality data at monitor stations. Environment International, 169, 107538. https://doi.org/10.1016/j.envint.2022.107538.
Yin, S., Li, T., Cheng, X., & Wu, J. (2022). Remote sensing estimation of surface PM2.5 concentrations using a deep learning model improved by data augmentation and a particle size constraint. Atmospheric Environment, 287, 119282. https://doi.org/10.1016/j.atmosenv.2022.119282.
Yang, X., Xiao, D., Bai, H., Tang, J., Wang, W., & Wei (2022). Spatiotemporal distributions of PM2.5 concentrations in the Beijing–Tianjin–Hebei Region from 2013 to 2020. Frontiers in Environmental Science, 10, https://doi.org/10.3389/fenvs.2022.842237.
Real, E., Couvidat, F., Ung, A., Malherbe, L., Raux, B., Gressent, A., & Colette, A. (2022). Historical reconstruction of background air pollution over France for 2000–2015. Earth System Science Data, 14(5), 2419–2443. https://doi.org/10.5194/essd-14-2419-2022.
Yarivan, H. M., Salih, N. M., & Peshawa, M. N. (2022). Ambient particulate matter concentrations for difference size from MODIS Satellite images and ground measurements in Sulaimani, IRAQ. Applied Ecology and Environmental Sciences, 10(10), 622–639. https://doi.org/10.12691/aees-10-10-4.
Hofman, J., Do, T. H., Qin, X., Bonet, E. R., Philips, W., Deligiannis, N., & Panzica La Manna, V. (2022). Spatiotemporal air quality inference of low-cost sensor data: Evidence from multiple sensor testbeds. Environmental Modelling & Software, 149, 105306. https://doi.org/10.1016/j.envsoft.2022.105306.
Zhong, J., Zhang, X., Gui, K., Liao, J., Fei, Y., Jiang, L., Guo, L., Liu, L., Che, H., Wang, Y., Wang, D., & Zhou, Z. (2022). Reconstructing 6-hourly PM2.5 datasets from 1960 to 2020 in China. Earth System Science Data, 14(7), 3197–3211. https://doi.org/10.5194/essd-14-3197-2022.
Wardana, I. N. K., Gardner, J. W., & Fahmy, S. A. (2022). Estimation of missing air pollutant data using a spatiotemporal convolutional autoencoder. Neural Comput & Applic. https://doi.org/10.1007/s00521-022-07224-2.
Tan, S., Wang, Y., Yuan, Q., Zheng, L., Li, T., Shen, H., & Zhang, L. P. (2022). Reconstructing global PM2.5 monitoring dataset from OpenAQ using a two-step spatio-temporal model based on SES-IDW and LSTM. Environmental Research Letters, 17(3), 034014. https://doi.org/10.1088/1748-9326/ac52c9.
Ma, P., Tao, F., Gao, L., Leng, S., Yang, K., & Zhou, T. (2022). Retrieval of Fine-Grained PM2.5 Spatiotemporal Resolution based on multiple machine learning models. Remote Sens, 14, 599. https://doi.org/10.3390/rs14030599.
Hsieh, H. P., Wu, S., Ko, C. C., Shei, C., Yao, Z. T., & Chen, Y. W. (2022). Forecasting fine-Grained Air Quality for locations without Monitoring stations based on a hybrid predictor with spatial-temporal attention based Network. Appl Sci, 12, 4268. https://doi.org/10.3390/app12094268.
Joyce, J. Y., Zhang, Sun, L., Rainham, D., Dummer, T. J. B., Wheeler, A. J., Anastasopolos, A., Gibson, M., & Johnson, M. (2022). Predicting intraurban airborne PM1.0-trace elements in a port city: Land use regression by ordinary least squares and a machine learning algorithm. Science of the Total Environment, 806(1), 150149. https://doi.org/10.1016/j.scitotenv.2021.150149.
Chen, B., Song, Z., Pan, F., & Huang, Y. Obtaining vertical distribution of PM2.5 from CALIOP data and machine learning algorithms. Science of the Total Environment, 805, 150338. https://doi.org/10.1016/j.scitotenv.2021.150338.
Qu, Y., Zhao, M., Wang, T., Li, S., Li, M., Xie, M., & Zhuangn, B. (2022). Lidar- and UAV-Based Vertical Observation of Spring ozone and particulate matter in Nanjing, China. Remote Sens, 14, 3051. https://doi.org/10.3390/rs14133051.
Lin, L., Liang, Y., Liu, L., Zhang, Y., Xie, D., Yin, F., & Ashraf, T. (2022). Estimating PM2.5 concentrations using the machine learning RF-XGBoost model in Guanzhong Urban Agglomeration, China. Remote Sens, 14, 5239. https://doi.org/10.3390/rs14205239.
Wang, Z., Li, R., Chen, Z., Yao, Q., Gao, B., Xu, M., Yang, L., Li, M., & Zhou, C. (2022). The estimation of hourly PM2.5 concentrations across China based on a spatial and temporal weighted continuous deep neural network (STWC-DNN). ISPRS Journal of Photogrammetry and Remote Sensing, 190, 38–55. https://doi.org/10.1016/j.isprsjprs.2022.05.011.
Lyu, B., Huang, R., Wang, X., Wang, W., & Hu, Y. (2022). Deep-learning spatial principles from deterministic chemical transport models for chemical reanalysis: An application in China for PM 2.5. Geoscientific Model Development, 15(4), 1583–1594. https://doi.org/10.5194/gmd-15-1583-2022.
Liu, Y., Li, C., Liu, D., Tang, Y., Seyler, B. C., Zhou, Z., Hu, X., Yang, F., & Zhan, Y. (2022). Deriving hourly full-coverage PM2.5 concentrations across China’s Sichuan Basin by fusing multisource satellite retrievals: A machine-learning approach. Atmospheric Environment, 271, 118930. https://doi.org/10.1016/j.atmosenv.2021.118930.
Wang, Z., Hu, B., Huang, B., Ma, Z., Biswas, A., Jiang, Y., & Shi, Z. (2022). Predicting annual PM2.5 in mainland China from 2014 to 2020 using multi temporal satellite product: An improved deep learning approach with spatial generalization ability. ISPRS Journal of Photogrammetry and Remote Sensing, 187, 141–158. https://doi.org/10.1016/j.isprsjprs.2022.03.002.
Yang, N., Shi, H., Tang, H., & Yang, X. (2022). Geographical and temporal encoding for improving the estimation of PM2.5 concentrations in China using end-to-end gradient boosting. Remote Sensing of Environment, 269, 112828. https://doi.org/10.1016/j.rse.2021.112828.
Ma, R., Ban, J., Wang, Q., Zhang, Y., Yang, Y., Li, S., Shi, W., Zhou, Z., Zang, J., & Li, T. (2022). Full-coverage 1km daily ambient PM2.5 and O3 concentrations of China in 2005–2017 based on a multi-variable random forest model. Earth System Science Data, 14(2), 943–954. https://doi.org/10.5194/essd-14-943-2022.
Song, J., & Stettler, M. E. J. (2022). A novel multi-pollutant space-time learning network for air pollution inference. Science of the Total Environment, 811, 152254. https://doi.org/10.1016/j.scitotenv.2021.152254.
Chen, B., Song, Z., Shi, B., & Li, M. An interpretable deep forest model for estimating hourly PM10 concentration in China using Himawari-8 data. Atmospheric Environment, 268, 118827. https://doi.org/10.1016/j.atmosenv.2021.118827.
Wang, J., He, L., Lu, X., Zhou, L., Tang, H., Yan, Y., & Ma, W. (202) A full-coverage estimation of PM2.5 concentrations using a hybrid XGBoost-WD model and WRF-simulated meteorological fields in the Yangtze River Delta Urban Agglomeration, China. Environmental Research, 203, 111799. https://doi.org/10.1016/j.envres.2021.111799.
Bai, K., Li, K., Guo, J., & Chang, N. B. (2022). Multiscale and multisource data fusion for full-coverage PM2.5 concentration mapping: Can spatial pattern recognition come with modeling accuracy? ISPRS Journal of Photogrammetry and Remote Sensing, 184, 31–44. https://doi.org/10.1016/j.isprsjprs.2021.12.002.
Song, Z., Chen, B., & Huang, J. (2022). Combining Himawari-8 AOD and deep forest model to obtain city-level distribution of PM2.5 in China. Environmental Pollution, 297, 118826. https://doi.org/10.1016/j.envpol.2022.11882.
Song, Z., Chen, B., Zhang, P., Guan, X., Wang, X., Ge, J., Hu, X., Zhang, X., & Wang, Y. (2022). High temporal and spatial resolution PM2.5 dataset acquisition and pollution assessment based on FY-4A TOAR data and deep forest model in China. Atmospheric Research, 274, 106199. https://doi.org/10.1016/j.atmosres.2022.106199.
Dai, H., Huang, G., Wang, J., Zeng, H., & Zhou, F. (2022). Spatio-temporal characteristics of PM2.5 concentrations in China based on multiple sources of data and LUR-GBM during 2016–2021. International Journal of Environmental Research and Public Health, 19, 6292. https://doi.org/10.3390/ijerph19106292.
Wang, M., Wang, Y., Teng, F., Li, S., Lin, Y., & Cai, H. (2022). Estimation and analysis of PM2.5 concentrations with NPP-VIIRS Nighttime Light images: A Case Study in the Chang-Zhu-Tan Urban Agglomeration of China. International Journal of Environmental Research and Public Health, 19, 4306. https://doi.org/10.3390/ijerph19074306.
Gu, J., Wang, Y., Ma, J., Lu, Y., Wang, S., & Li, X. (2022). An estimation method for PM2.5 based on Aerosol Optical depth obtained from remote sensing image Processing and Meteorological factors. Remote Sens, 14, 1617. https://doi.org/10.3390/rs14071617.
Bin, C., Song, Z., Huang, J., Zhang, P., Hu, X., Zhang, X., et al. (2022). Estimation of atmospheric PM10 concentration in China using an interpretable deep learning model and top-of-the-atmosphere reflectance data from China’s new generation geostationary meteorological satellite, FY-4A. Journal of Geophysical Research: Atmospheres, 127, e2021JD036393. https://doi.org/10.1029/2021JD036393.
Li, J., An, X., Li, Q., Wang, C., Yu, H., Zhou, X., & Geng, Y. (2022). Application of XGBoost algorithm in the optimization of pollutant concentration. Atmospheric Research, 276, 106238. https://doi.org/10.1016/j.atmosres.2022.106238.
Pendergrass, D. C., Zhai, S., Kim, J., Koo, J. H., Lee, S., Bae, M., Kim, S., Liao, H., & Jacob, D. J. (2022). Continuous mapping of fine particulate matter PM 2.5 air quality in East Asia at daily 6x6 km2 resolution by application of a random forest algorithm to 2011–2019 GOCI geostationary satellite data. Atmospheric Measurement Techniques, 15(4), 1075–1091. https://doi.org/10.5194/amt-15-1075-2022.
Kulkarni, P., Sreekanth, V., Upadhya, A. R., & Gautam, H. C. (2022). Which model to choose? Performance comparison of statistical and machine learning models in predicting PM2.5 from high-resolution satellite aerosol optical depth. Atmospheric Environment, 282, 119164. https://doi.org/10.1016/j.atmosenv.2022.119164.
Pouyaei, A., Choi, Y., Jung, J., Mousavinezhad, S., Momeni, M., & Song, C. H. (2022). Investigating the long-range transport of particulate matter in East Asia: Introducing a new Lagrangian diagnostic tool. Atmospheric Environment, 278, 119096. https://doi.org/10.1016/j.atmosenv.2022.119096.
Park, S., Im, J., Kim, J., & Kim, S. M. (2022). Geostationary satellite-derived ground-level particulate matter concentrations using real-time machine learning in Northeast Asia. Environmental Pollution, 306, 119425. https://doi.org/10.1016/j.envpol.2022.119425.
Han, S., Kundhikanjana, W., Towashiraporn, P., & Stratoulias, D. (2022). Interpolation-based Fusion of Sentinel-5P, SRTM, and Regulatory-Grade Ground stations Data for Producing spatially continuous maps of PM2.5 concentrations nationwide over Thailand. Atmosphere, 13, 161. https://doi.org/10.3390/atmos13020161.
Atuhaire, C., Gidudu, A., Bainomugisha, E., & Mazimwe, A. (2022). Determination of Satellite-Derived PM2.5 for Kampala District, Uganda. Geomatics, 2, 125–143. https://doi.org/10.3390/geomatics2010008.
Ghahremanloo, M., Lops, Y., Choi, Y., Jung, J., Mousavinezhad, S., & Hammond, D. (2022). A comprehensive study of the COVID-19 impact on PM2.5 levels over the contiguous United States: A deep learning approach. Atmospheric Environment, 272, 118944. https://doi.org/10.1016/j.atmosenv.2022.118944.
Cui, Q., Zhang, F., Fu, S., Wei, X., Ma, Y., & Wu, K. (2022). High Spatiotemporal Resolution PM2.5 concentration estimation with machine learning algorithm: A Case Study for Wildfire in California. Remote Sens, 14, 1635. https://doi.org/10.3390/rs14071635.
Vu, B. N., Bi, J., Wang, W., Huff, A., Kondragunta, S., & Liu, Y. (2022). Application of geostationary satellite and high-resolution meteorology data in estimating hourly PM2.5 levels during the Camp Fire episode in California. Remote Sensing of Environment, 271, 112890. https://doi.org/10.1016/j.rse.2022.112890.
Chen, P. C., & Lin, Y. T. (2022). Exposure assessment of PM2.5 using smart spatial interpolation on regulatory air quality stations with clustering of densely-deployed microsensors. Environmental Pollution, 292(B), 118401. https://doi.org/10.1016/j.envpol.2021.118401.
Paul, N., Yao, J., McLean, K. E., Stieb, D. M., & Henderson, S. B. (2022). The Canadian optimized statistical smoke exposure model (CanOSSEM): A machine learning approach to estimate national daily fine particulate matter (PM2.5) exposure. Science of the Total Environment, 850, 157956. https://doi.org/10.1016/j.scitotenv.2022.157956.
Zhang, Y., Zhai, S., Huang, J., Li, X., Wang, W., Zhang, T., Yin, F., & Ma, Y. (2022). Estimating high-resolution PM2.5 concentration in the Sichuan Basin using a random forest model with data-driven spatial autocorrelation terms. Journal of Cleaner Production, 380(1), 134890. https://doi.org/10.1016/j.jclepro.2022.134890.
Li, T., Yang, Q., Wang, Y., & Wu, J. (2022). Joint estimation of PM2.5 and O3 over China using a knowledge-informed neural network. Geoscience Frontiers, 101499, https://doi.org/10.1016/j.gsf.2022.101499.
Jin, X., Ding, J., Ge, X., Liu, J., Xie, B., Zhao, S., & Zhao, Q. (2022). Machine learning driven by environmental covariates to estimate high-resolution PM2.5 in data-poor regions. PeerJ, 10, e13203. https://doi.org/10.7717/peerj.13203.
Han, M., Jia, S., & Zhang, C. (2022). Estimation of high-resolution PM2.5 concentrations based on gap-filling aerosol optical depth using gradient boosting model. Air Quality, Atmosphere and Health, 15, 619–631. https://doi.org/10.1007/s11869-021-01149-w.
Dong, L., Li, S., Xing, J., Lin, H., Wang, S., Zeng, X., & Qin, Y. (2022). Joint features random forest (JFRF) model for mapping hourly surface PM2.5 over China. Atmospheric Environment, 273, 118969. https://doi.org/10.1016/j.atmosenv.2022.118969.
Zeng, Q., Xie, T., Zhu, S., Fan, M., Chen, L., & Tian, Y. (2022). Estimating the Near-Ground PM2.5 concentration over China based on the CapsNet Model during 2018–2020. Remote Sens, 14, 623. https://doi.org/10.3390/rs14030623.
Bai, K., Li, K., Ma, M., Li, K., Li, Z., Guo, J., Chang, N. B., Tan, Z., & Han, D. (2022). LGHAP: The long-term gap-free high-resolution air pollutant concentration dataset, derived via tensor-flow-based multimodal data fusion. Earth System Science Data, 14(2), 907–927. https://doi.org/10.5194/essd-14-907-2022.
Yuan, S., Li, Y., Gao, J., & Bao, F. (2022). A New Coupling Method for PM2.5 concentration estimation by the Satellite-based Semiempirical Model and Numerical Model. Remote Sens, 14, 2360. https://doi.org/10.3390/rs14102360.
Hu, Y., Zeng, C., Li, T., & Shen, H. (2022). Performance comparison of Fengyun-4A and Himawari-8 in PM2.5 estimation in China. Atmospheric Environment, 271, 118898. https://doi.org/10.1016/j.atmosenv.2021.118898.
Wang, F., Yao, S., Luo, H., & Huang, B. (2022). Estimating high-resolution PM2.5 concentrations by Fusing Satellite AOD and Smartphone photographs using a Convolutional Neural Network and ensemble learning. Remote Sens, 14, 1515. https://doi.org/10.3390/rs14061515.
Ibrahim, S., Landa, M., Pešek, O., Brodský, L., & Halounová, L. (2022). Machine learning-based Approach using Open Data to Estimate PM2.5 over Europe. Remote Sens, 14, 3392. https://doi.org/10.3390/rs14143392.
Handschuh, J., Erbertseder, T., Schaap, M., & Baier, F. (2022). Estimating PM2.5 surface concentrations from AOD: A combination of SLSTR and MODIS. Remote Sensing Applications: Society and Environment, 26, 100716. https://doi.org/10.1016/j.rsase.2022.100716.
Kumar, A., Dhakhwa, S., & Dikshit, A. K. (2022). Comparative evaluation of fitness of interpolation techniques of ArcGIS using leave-one-out Scheme for Air Quality Mapping. J Geovis spat anal, 6(9). https://doi.org/10.1007/s41651-022-00102-4.
Mittal, V., Sasetty, S., Choudhary, R., & Agarwal, A. (2022). Deep-Learning Spatiotemporal Prediction Framework for Particulate Matter under dynamic monitoring. Transportation Research Record. https://doi.org/10.1177/03611981221082589.
Singh, P., Vaishya, R. C., Soni, P., & Medhi, H. (2022). A methodological comparison on Spatiotemporal Prediction of Criteria Air pollutants. Asian Journal of Atmospheric Environment, 16(1), 2021087. https://doi.org/10.5572/ajae.2021.087.
Ahmed, M., Xiao, Z., & Shen, Y. (2022). Estimation of Ground PM2.5 concentrations in Pakistan using convolutional neural network and Multi-pollutant Satellite images. Remote Sens, 14, 1735. https://doi.org/10.3390/rs14071735.
Choi, K., & Chong, K. (2022). Modified Inverse Distance Weighting Interpolation for Particulate Matter Estimation and Mapping. Atmosphere, 13, 846. https://doi.org/10.3390/atmos13050846.
Morillo, M., Martínez-Cuevas, C., García-Aranda, S., Molina, C., Querol, I., Javier, J., & Estibaliz, M. (2022). Spatial analysis of the particulate matter (PM10) an assessment of air pollution in the region of Madrid (Spain): Spatial interpolation comparisons and results. International Journal of Environmental Studies, 1, 11. https://doi.org/10.1080/00207233.2022.2072585.
Dharmalingam, S., Senthilkumar, N., D’Souza, R. R., Hu, Y., Chang, H. H., Ebelt, S., Yu, H., Kim, C. S., & Rohr, A. (2022). Developing air pollution concentration fields for health studies using multiple methods: Cross-comparison and evaluation. Environmental Research, 207, 112207. https://doi.org/10.1016/j.envres.2021.112207.
Jin, C., Wang, Y., Li, T., & Yuan, Q. (2022). Global validation and hybrid calibration of CAMS and MERRA-2 PM2.5 reanalysis products based on OpenAQ platform. Atmospheric Environment, 274, 118972. https://doi.org/10.1016/j.atmosenv.2022.118972.
Gitahi, J., & Hahn, M. (2022). Evaluation of crowd-sourced PM2.5 measurements from low-cost sensors for Air Quality Mapping in Stuttgart City. In V. Coors, D. Pietruschka, & B. Zeitler (Eds.), iCity. Transformative Research for the Livable, Intelligent, and Sustainable City. Springer. https://doi.org/10.1007/978-3-030-92096-8_14.
Wu, P., & Song, Y. (2022). Land Use Quantile Regression modeling of fine particulate matter in Australia. Remote Sens, 14, 1370. https://doi.org/10.3390/rs14061370.
Wu, H., Zhang, Y., Li, Z., Wei, Y., Peng, Z., Luo, J., & Ou, Y. (2022). Prediction of fine particulate matter concentration near the ground in North China from Multivariable Remote Sensing Data based on MIV-BP neural network. Atmosphere, 13, 825. https://doi.org/10.3390/atmos13050825.
Abirami, S., & Chitra, P. (2022). Regional spatio-temporal forecasting of particulate matter using autoencoder based generative adversarial network. Stochastic Environmental Research and Risk Assessment : Research Journal, 36, 1255–1276. https://doi.org/10.1007/s00477-021-02153-3.
Araki, S., Shimadera, H., Hasunuma, H., Yoda, Y., & Shima, M. (2022). Predicting Daily PM2.5 exposure with spatially invariant accuracy using co-existing pollutant concentrations as predictors. Atmosphere, 13, 782. https://doi.org/10.3390/atmos13050782.
Kristiani, E., Lin, H., Lin, J. R., Chuang, Y. H., Huang, C. Y., & Yang, C. T. (2022). Short-term prediction of PM2.5 using LSTM Deep Learning methods. Sustainability, 14, 2068. https://doi.org/10.3390/su14042068.
Muthukumar, P., Nagrecha, K., Comer, D., Calvert, C. F., Amini, N., Holm, J., & Pourhomayoun, M. (2022). PM2.5 Air Pollution Prediction through Deep Learning using Multisource Meteorological, Wildfire, and Heat Data. Atmosphere, 13, 822. https://doi.org/10.3390/atmos13050822.
Tsokov, S., Lazarova, M., & Aleksieva-Petrova, A. (2022). A hybrid Spatiotemporal Deep Model based on CNN and LSTM for Air Pollution Prediction. Sustainability, 14, 5104. https://doi.org/10.3390/su14095104.
Gocheva-Ilieva, S., Ivanov, A., & Stoimenova-Minova, M. (2022). Prediction of Daily Mean PM10 concentrations using Random Forest, CART Ensemble and Bagging stacked by MARS. Sustainability, 14, 798. https://doi.org/10.3390/su14020798.
Li, J., Xu, G., & Cheng, X. (2022). Combining spatial pyramid pooling and long short-term memory network to predict PM2.5 concentration. Atmospheric Pollution Research, 13(3), 101309. https://doi.org/10.1016/j.apr.2021.101309.
Bi, J., Knowland, K. E., Keller, C. A., & Liu, Y. (2022). Combining Machine Learning and Numerical Simulation for High-Resolution PM2.5 Concentration Forecast. Environmental Science and Technology, 56(3), 1544–1556. https://doi.org/10.1021/acs.est.1c05578.
Gu, Y., Li, B., & Meng, Q. (2022). Hybrid interpretable predictive machine learning model for air pollution prediction. Neurocomputing, 468, 123–136. https://doi.org/10.1016/j.neucom.2021.09.051.
Wu, Y., Lin, S., Shi, K., et al. (2022). Seasonal prediction of daily PM2.5 concentrations with interpretable machine learning: A case study of Beijing, China. Environmental Science and Pollution Research. https://doi.org/10.1007/s11356-022-18913-9.
Zhou, H., Zhang, F., Du, Z., & Liu, R. (2022). A theory-guided graph networks based PM2.5 forecasting method. Environmental Pollution, 293, 118569. https://doi.org/10.1016/j.envpol.2021.118569.
Yu, T., Wang, Y., Huang, J., Liu, X., Li, J., & Zhan, W. (2022). Study on the regional prediction model of PM2.5 concentrations based on multi-source observations. Atmospheric Pollution Research, 13(4), 101363. https://doi.org/10.1016/j.apr.2022.101363.
Saravanan, D., & Santhosh Kumar, K. (2022). IoT based improved air quality index prediction using hybrid FA-ANN-ARMA model. Materials Today: Proceedings, 56(4), 1809–1819. https://doi.org/10.1016/j.matpr.2021.10.474.
Bagheri, H. (2022). A machine learning-based framework for high resolution mapping of PM2.5 in Tehran, Iran, using MAIAC AOD data. Advances in Space Research, 69(9), 3333–3349. https://doi.org/10.1016/j.asr.2022.02.032.
Zaini, N., Ean, L. W., Ahmed, A. N., et al. (2022). PM2.5 forecasting for an urban area based on deep learning and decomposition method. Scientific Reports, 12, 17565. https://doi.org/10.1038/s41598-022-21769-1.
Shaziayani, W. N., Ul-Saufie, A. Z., Mutalib, S., Mohamad Noor, N., & Zainordin, N. S. (2022). Classification prediction of PM10 concentration using a tree-based machine Learning Approach. Atmosphere, 13, 538. https://doi.org/10.3390/atmos13040538.
Ejohwomu, O. A., Shamsideen Oshodi, O., Oladokun, M., Bukoye, O. T., Emekwuru, N., Sotunbo, A., & Adenuga, O. (2022). Modelling and forecasting temporal PM2.5 concentration using ensemble machine learning methods. Buildings, 12, 46. https://doi.org/10.3390/buildings12010046.
Mengara, A. G., Park, E., Jang, J., & Yoo, Y. (2022). Attention-based distributed Deep Learning Model for Air Quality forecasting. Sustainability, 14, 3269. https://doi.org/10.3390/su14063269.
Tongprasert, P., & Ongsomwang, S. (2022). A suitable model for Spatiotemporal Particulate Matter Concentration Prediction in Rural and Urban landscapes, Thailand. Atmosphere, 13, 904. https://doi.org/10.3390/atmos13060904.
Wood, D. A. (2022). Trend decomposition aids forecasts of air particulate matter (PM2.5) assisted by machine and deep learning without recourse to exogenous data. Atmospheric Pollution Research, 13(3), 101352. https://doi.org/10.1016/j.apr.2022.101352.
Wood, D. A. (2022). Local integrated air quality predictions from meteorology (2015 to 2020) with machine and deep learning assisted by data mining. Sustainability Analytics and Modeling, 2, 100002. https://doi.org/10.1016/j.samod.2021.100002.
Miao, L., Tang, S., Ren, Y., Kwan, M. P., & Zhang, K. (2022). Estimation of daily ground-level PM2.5 concentrations over the Pearl River Delta using 1 km resolution MODIS AOD based on multi-feature BiLSTM. Atmospheric Environment, 290, 119362. https://doi.org/10.1016/j.atmosenv.2022.119362.
Shi, L., Zhang, H., Xu, X., Han, M., & Zuo, P. (2022). A balanced social LSTM for PM2.5 concentration prediction based on local spatiotemporal correlation. Chemosphere, 291(3), 133124. https://doi.org/10.1016/j.chemosphere.2021.133124.
Hong, J., Mao, F., Gong, W., Gan, Y., Zang, L., Quan, J., & Chen, J. (2022). Assimilating Fengyun-4A observations to improve WRF-Chem PM2.5 predictions in China. Atmospheric Research, 265, 105878. https://doi.org/10.1016/j.atmosres.2021.105878.
Jang, E., Kim, M., Do, W., Park, G., & Yoo, E. (2022). Real-time estimation of PM2.5 concentrations at high spatial resolution in Busan by fusing observational data with chemical transport model outputs. Atmospheric Pollution Research, 13(1), 101277. https://doi.org/10.1016/j.apr.2021.101277.
Lin, G. Y., Chen, H. W., Chen, B. J., & Yang, Y. C. (2022). Characterization of temporal PM2.5, nitrate, and sulfate using deep learning techniques. Atmospheric Pollution Research, 13(1), 101260. https://doi.org/10.1016/j.apr.2021.101260.
Nath, P., Roy, B., Saha, P., et al. (2022). Hybrid learning model for spatio-temporal forecasting of PM2.5 using aerosol optical depth. Neural Computing and Applications, 34, 21367–21386. https://doi.org/10.1007/s00521-022-07616-4.
Iyer, S. R., Balashankar, A., Aeberhard, W. H., et al. (2022). Modeling fine-grained spatio-temporal pollution maps with low-cost sensors. npj Climate and Atmospheric Science, 5, 76. https://doi.org/10.1038/s41612-022-00293-z.
Araki, S., Shimadera, H., & Shima, M. (2022). Continuous estimations of daily PM2.5 chemical components from temporally sparse monitoring data using a machine learning approach. Atmospheric Pollution Research, 13(11), 101580. https://doi.org/10.1016/j.apr.2022.101580.
Pei, Y., Huang, C. J., Shen, Y., & Ma, Y. (2022). An ensemble model with adaptive Variational Mode decomposition and multivariate temporal graph neural network for PM2.5 concentration forecasting. Sustainability, 14, 13191. https://doi.org/10.3390/su142013191.
Li, J., Dai, Y., Zhu, Y., Tang, X., Wang, S., Xing, J., Zhao, B., Fan, S., Long, S., & Fang, T. (2022). Improvements of response surface modeling with self-adaptive machine learning method for PM2.5 and O3 predictions. Journal of Environmental Management, 303, 114210. https://doi.org/10.1016/j.jenvman.2021.114210.
Wang, D., Wang, H. W., Lu, K. F., Peng, Z. R., & Zhao, J. (2022). Regional Prediction of ozone and fine particulate matter using Diffusion Convolutional recurrent neural network. International Journal of Environmental Research and Public Health, 19, 3988. https://doi.org/10.3390/ijerph19073988.
Teng, M., Li, S., Song, G., Yang, J., Dong, L., Lin, H., & Hu, S. (2022). Including the feature of appropriate adjacent sites improves the PM2.5 concentration prediction with long short-term memory neural network model. Sustainable Cities and Society, 76, 103427. https://doi.org/10.1016/j.scs.2021.103427.
Teng, M., Li, S., Xing, J., Song, G., Yang, J., Dong, J., Zeng, X., & Qin, Y. (2022). 24-Hour prediction of PM2.5 concentrations by combining empirical mode decomposition and bidirectional long short-term memory neural network. Science of the Total Environment, 821, 153276. https://doi.org/10.1016/j.scitotenv.2022.153276.
Wang, W., An, X., Li, Q., Geng, Y., Yu, H., & Zhou, X. (2022). Optimization research on air quality numerical model forecasting effects based on deep learning methods. Atmospheric Research, 271, 106082. https://doi.org/10.1016/j.atmosres.2022.106082.
Guo, X., Wang, Y., Mei, S., Shi, C., Liu, Y., Pan, L., Li, K., Zhang, B., Wang, J., Zhong, Z., & Dong, M. (2022). Monitoring and modelling of PM2.5 concentration at subway station construction based on IoT and LSTM algorithm optimization. Journal of Cleaner Production, 360, 132179. https://doi.org/10.1016/j.jclepro.2022.132179.
Wang, Z., Chen, H., Zhu, J., & Ding, Z. (2022). Daily PM2.5 and PM10 forecasting using linear and nonlinear modeling framework based on robust local mean decomposition and moving window ensemble strategy. Applied Soft Computing, 114, 108110. https://doi.org/10.1016/j.asoc.2021.108110.
Chen, L., Mao, F., Hong, J., Zang, L., Chen, J., Zhang, Y., Gan, Y., Gong, W., & Xu, H. (2022). Improving PM2.5 predictions during COVID-19 lockdown by assimilating multi-source observations and adjusting emissions. Environmental Pollution, 297, 118783. https://doi.org/10.1016/j.envpol.2021.118783.
Bai, B., Li, L., Zeng, Z., & Huang, H. (2022). Design of a combined system based on multi-objective optimization for fine particulate matter (PM2.5) prediction. Frontiers in Environmental Science, 10. https://doi.org/10.3389/fenvs.2022.833374.
Yang, X., Xiao, D., Fan, L., Li, F., Wang, W., Bai, H., & Tang, J. (2022). Spatiotemporal estimates of daily PM2.5 concentrations based on 1-km resolution MAIAC AOD in the Beijing–Tianjin–Hebei, China. Environmental Challenges, 8, 100548. https://doi.org/10.1016/j.envc.2022.100548.
Wang, X., Liu, W., Sun, W., Peng, Y., Zhang, Y., Zhai, X., & Li, R. (2022). One day ahead prediction of PM2.5 spatial distribution using MODIS 3 km AOD and spatiotemporal model over Beijing-Tianjin-Hebei, China. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 3, 303–310. https://doi.org/10.5194/isprs-annals-V-3-2022-303-2022.
Yang, H., Zhao, J., & Li, G. (2022). A new hybrid prediction model of PM2.5 concentration based on secondary decomposition and optimized extreme learning machine. Environmental Science and Pollution Research. https://doi.org/10.1007/s11356-022-20375-y.
Masood, A., & Ahmad, K. (2022). Data-driven predictive modeling of PM2.5 concentrations using machine learning and deep learning techniques: A case study of Delhi, India. Environmental Monitoring and Assessment, 195, 60. https://doi.org/10.1007/s10661-022-10603-w.
Barot, V., & Kapadia, V. (2022). Long short term memory neural network-based Model Construction and Fine-tuning for Air Quality parameters Prediction. Cybernetics and Information Technologies, 22(1), 171–189. https://doi.org/10.2478/cait-2022-0011.
Faraji, M., Nadi, S., Ghaffarpasand, O., Homayoni, S., & Downey, K. (2022). An integrated 3D CNN-GRU deep learning method for short-term prediction of PM2.5 concentration in urban environment. Science of the Total Environment, 834, 155324. https://doi.org/10.1016/j.scitotenv.2022.155324.
Yu, W., Li, S., Ye, T., Xu, R., Song, J., & Guo, Y. (2022). Deep Ensemble Machine Learning Framework for the estimation of PM2.5 concentrations. Environmental Health Perspectives, 130, 3. https://doi.org/10.1289/EHP9752.
Kim, B. Y., Lim, Y. K., & Wan Cha, J. (2022). Short-term prediction of particulate matter (PM10 and PM2.5) in Seoul, South Korea using tree-based machine learning algorithms. Atmospheric Pollution Research, 13(10), 101547. https://doi.org/10.1016/j.apr.2022.101547.
Lee, S., Park, S., Lee, M. I., Kim, G., Im, J., & Song, C. K. (2022). Air quality forecasts improved by combining data assimilation and machine learning with satellite AOD. Geophysical Research Letters, 49, e2021GL096066. https://doi.org/10.1029/2021GL096066.
Prihatno, A. T., Utama, I. B. K. Y., & Jang, Y. M. (2022). oneM2M-Enabled prediction of high particulate Matter Data based on Multi-dense Layer BiLSTM Model. Applied Sciences, 12, 2260. https://doi.org/10.3390/app12042260.
Nurcahyanto, H., Prihatno, A. T., Alam, M., Rahman, H., Jahan, I., Shahjalal, M., & Jang, Y. M. (2022). Multilevel RNN-Based PM10 Air Quality Prediction for Industrial Internet of things Applications in Cleanroom Environment. Wireless Communications and Mobile Computing, 2022, 1874237. https://doi.org/10.1155/2022/1874237.
Kumharn, W., Sudhibrabha, S., Hanprasert, K., Janjai, S., Masiri, I., Buntoung, S., Pattarapanitchai, S., Wattan, R., Pilahome, O., Nissawan, W., & Jankondee, Y. (2022). Improved hourly and long-term PM2.5 prediction modeling based on MODIS in Bangkok. Remote Sensing Applications: Society and Environment, 28, 100864. https://doi.org/10.1016/j.rsase.2022.100864.
Gilik, A., Ogrenci, A. S., & Ozmen, A. (2022). Air quality prediction using CNN + LSTM-based hybrid deep learning architecture. Environmental Science and Pollution Research, 29, 11920–11938. https://doi.org/10.1007/s11356-021-16227-w.
Takruri, M., Abubakar, A., Jallad, A. H., Altawil, B., Marpu, P. R., & Bermak, A. (2022). Machine learning-based estimation of PM2.5 concentration using Ground Surface DoFP Polarimeters. IEEE Access, 10, 23489–23496. https://doi.org/10.1109/ACCESS.2022.3151632.
Dimakopoulou, K., Samoli, E., Analitis, A., Schwartz, J., Beevers, S., Kitwiroon, N., Beddows, A., Barratt, B., Rodopoulou, S., Zafeiratou, S., Gulliver, J., & Katsouyanni, K. (2022). Development and evaluation of spatio-temporal Air Pollution exposure models and their combinations in the Greater London Area, UK. International Journal of Environmental Research and Public Health, 19, 5401. https://doi.org/10.3390/ijerph19095401.
Gianquintieri, L., Oxoli, D., Caiani, E. G., & Brovelli, M. A. (2024). Implementation of a GEOAI model to assess the impact of agricultural land on the spatial distribution of PM2.5 concentration. Chemosphere, 352, 141438. https://doi.org/10.1016/j.chemosphere.2024.141438.
Gianquintieri, L., Oxoli, D., Caiani, E. G., & Brovelli, M. A. (2023). Land use influence on ambient PM2.5 and ammonia concentrations: Correlation analyses in the Lombardy region, Italy. AGILE: GIScience Series, 4, 26. https://doi.org/10.5194/agile-giss-4-26-2023.

Funding

This study was performed within the framework of the D-DUST project (Data-driven moDelling of particUlate with Satellite Technology aid), which received support from Fondazione Cariplo (project ID: 2020–4022).

Open access funding provided by Politecnico di Milano within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
Lorenzo Gianquintieri & Enrico Gianluca Caiani
Department of Civil and Environmental Engineering, Politecnico di Milano, Milan, Italy
Daniele Oxoli & Maria Antonia Brovelli
IRCCS, Istituto Auxologico Italiano, Milan, Italy
Enrico Gianluca Caiani

Contributions

The original idea for the article was developed by MAB. The literature search and data analysis were performed by LG. All authors drafted and/or critically revised the work.

Corresponding author

Correspondence to Lorenzo Gianquintieri.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gianquintieri, L., Oxoli, D., Caiani, E.G. et al. State-of-art in modelling particulate matter (PM) concentration: a scoping review of aims and methods. Environ Dev Sustain (2024). https://doi.org/10.1007/s10668-024-04781-5

Received: 21 June 2023
Accepted: 12 March 2024
Published: 02 April 2024
DOI: https://doi.org/10.1007/s10668-024-04781-5
