Forecasting of Air Quality Index in Delhi Using Neural Network Based on Principal Component Analysis

Pure and Applied Geophysics

Abstract

Forecasting of the air quality index (AQI) is an active topic in air quality research because it helps assess the effects of air pollutants on human health in urban areas. Experience over the last decade shows that airborne pollution has been a serious problem in Delhi and will remain a major one in the next few years. The AQI is a number, based on the comprehensive effect of the concentrations of major air pollutants, that Government agencies use to characterize the quality of the air at different locations; it is also used for local and regional air quality management in many metro cities of the world. The main objective of the present study is therefore to forecast the daily AQI through a neural network based on principal component analysis (PCA). The AQI of criteria air pollutants has been forecast using the previous day’s AQI and meteorological variables, which have been found to be nearly the same for weekends and weekdays. The principal components used by the PCA-based neural network (PCA-neural network) have been computed from the correlation matrix of the input data. The PCA-neural network model has been evaluated by comparing its results with those of the plain neural network and with observed values during 2000–2006 in four different seasons through statistical parameters, which reveal that the PCA-neural network performs better than the neural network in all four seasons.
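The correlation-matrix PCA step mentioned in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes NumPy, uses synthetic random data in place of the previous day's AQI and meteorological variables, and picks the number of retained components arbitrarily. The function name `pca_scores` is hypothetical, and the neural network that would consume the scores is not shown.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project the rows of X onto the leading principal components
    of the correlation matrix of its columns."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so that the covariance of Z equals
    # the correlation matrix of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    explained = eigvals[order] / eigvals.sum()
    return Z @ eigvecs[:, order], explained

# Synthetic stand-in for the input data (assumption): 200 days, 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]  # make two columns correlated

scores, explained = pca_scores(X, n_components=2)
print(scores.shape)  # (200, 2)
```

The resulting score columns are mutually uncorrelated, which is the property that makes them convenient inputs for a subsequent regression or neural network.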

References

  • Anfossi D., Brusasca G., Tinarelli G. (1990), Simulation of atmospheric diffusion in low wind speed meandering conditions by a Monte Carlo dispersion model, Il Nuovo Cimento 13C, 995–1006.

  • Aron R. (1984), Models for estimating current and future sulphur dioxide concentrations in Taipei, Bulletin of Geophysics 25, 47–52.

  • Aron R., Aron I. M. (1978), Statistical forecasting models: I. Carbon monoxide concentrations in the Los Angeles basin, Journal of Air Pollution Control Association 28, 681–684.

  • Boutahar J., Lacour S., Mallet V., Quélo D., Roustan Y., Sportisse B. (2004), Development and validation of a fully modular platform for numerical modelling of air pollution: POLAIR, International Journal of Environmental Pollution 22, 17–28.

  • Boznar M., Lesjak M., Mlakar P. (1993), A neural network-based method for short-term predictions of ambient SO2 concentrations in highly polluted industrial areas of complex terrain, Atmospheric Environment B 27, 221–230.

  • Byun D. W., Ching J. K. S. (Eds.) (1999), Science algorithms of the EPA Models-3 Community Multiscale Air Quality (CMAQ) modeling system, EPA/600/R-99/030, Office of Research and Development, U.S. Environmental Protection Agency, Washington, D. C.

  • Chang J.C., Hanna S.R. (2004), Air quality model performance evaluation, Meteorology and Atmospheric Physics 87, 167–196.

  • Chelani A.B., Rao C.V.C., Phadke K.M., Hasan M.Z. (2002), Prediction of sulphur dioxide concentration using artificial neural networks, Environmental Modelling & Software 17, 161–168.

  • Christensen J. H. (1997), The Danish Eulerian hemispheric model: A three dimensional air pollution model used for the arctic, Atmospheric Environment 31, 4169–4191.

  • Cogliani E. (2001), Air pollution forecast in cities by an air pollution index highly correlated with meteorological variables, Atmospheric Environment 35, 2871–2877.

  • Comrie A. C. (1997), Comparing neural networks and regression models for ozone forecasting, Journal of the Air & Waste Management Association 47, 653–663.

  • Economic Survey of Delhi (2008–2009), Planning Department, Government of NCT Delhi, June 2009.

  • EPA (1999), Air quality index Reporting Final Rule 1999-Federal Register, Part III, CFR Part 58.

  • Finzi G., Tebaldi G. (1982), A mathematical model for air pollution forecast and alarm in an urban area, Atmospheric Environment 16, 2055–2059.

  • Gardner M. W., Dorling S. R. (1998), Artificial neural networks (the multilayer perceptron) – A review of applications in the atmospheric sciences, Atmospheric Environment 32, 2627–2636.

  • Hajek P., Olej V. (2009), Air quality indices and their modelling by hierarchical fuzzy inference systems, WSEAS Transactions on Environment and Development 10, 661–672.

  • Katsoulis B. D. (1988), Some meteorological aspects of air pollution in Athens, Greece, Meteorology and Atmospheric Physics 39, 203–212.

  • Kumar A., Goyal P. (2011), Forecasting of air quality index in Delhi using principal component regression technique, Atmospheric Pollution Research 2, 436–444.

  • Kunzli N., Kaiser R., Medina S., Studnicka M., Chanel O., Filliger P., Herry M., Horak F. Jr., Puybonnieux-Texier V., Quénel P., Schneider J., Seethaler R., Vergnaud J. C., Sommer H. (2000), Public-health impact of outdoor and traffic-related air pollution: A European assessment, The Lancet 356, 795–801.

  • Kurt A., Oktay A.B. (2010), Forecasting air pollutant indicator levels with geographic models 3 days in advance using neural networks, Expert Systems with Applications 37, 7986–7992.

  • Lin G. Y. (1982), Oxidant prediction by discriminant analysis in the South coast air basin of California, Atmospheric Environment 16, 135–143.

  • Mantis H. T., Repapis C. C., Zerefos C. S., Ziomas J. C. (1992), Assessment of the potential for photochemical air pollution in Athens: a comparison of emissions and air pollutant levels in Athens with those in Los Angeles, Journal of Applied Meteorology 31, 1467–1476.

  • McCollister G. M., Wilson K. R. (1975), Linear stochastic models for forecasting daily maxima and hourly concentrations of air pollutants, Atmospheric Environment 9, 417–423.

  • Milionis A. E., Davies T. D. (1994), Regression and stochastic models for air pollution I. review, comments and suggestions, Atmospheric Environment 28, 2801–2810.

  • Nagendra S. M. S., Venugopal K., Jones S. L. (2007), Assessment of air quality near traffic intersections in Bangalore city using air quality index, Transportation Research Part D 12, 167–176.

  • Polydoras G. N., Anagnostopoulos J. S., Bergeles G. (1998), Air quality predictions: dispersion model vs Box-Jenkins stochastic models. An implementation and comparison for Athens, Greece, Applied Thermal Engineering 18, 1037–1048.

  • Robeson S. M., Steyn D. G. (1990), Evaluation and comparison of statistical forecast models for daily maximum ozone concentrations, Atmospheric Environment 24B, 303–312.

  • Sanchez M. L., Pascual D., Ramos C., Perez I. (1990), Forecasting particulate pollutant concentrations in a city from meteorological variables and regional weather patterns, Atmospheric Environment 24A, 1509–1519.

  • Schmidt H., Derognat C., Vautard R., Beekmann M. (2001), A comparison of simulated and observed ozone mixing ratios for the summer of 1998 in western Europe, Atmospheric Environment 35, 6277–6297.

  • Shi J. P., Harrison R. M. (1997), Regression modelling of hourly NOx and NO2 concentrations in urban air in London, Atmospheric Environment 31, 4081–4097.

  • Sousa S. I. V., Martins F. G., Alvim-Ferraz M. C. M., Pereira M. C. (2007), Multiple linear regression and artificial neural networks based on principal components to predict ozone concentrations, Environmental Modelling & Software 22, 97–103.

  • Van den Elshout S., Leger K., Fabio N. (2008), Comparing urban air quality in Europe in real time: A review of existing air quality indices and the proposal of a common alternative, Environment International 34, 720–726.

  • Wilks D. S. (1995), Statistical Methods in the Atmospheric Sciences, Academic Press, San Diego.

  • Willmott C. J. (1982), Some comments on the evaluation of model performance, Bulletin of American Meteorological Society 63, 1309–1313.

  • Ziomass I. C., Dimitrios M., Christos S. Z., Alkiviadis F. B. (1995), Forecasting peak pollutant levels from meteorological variables, Atmospheric Environment 29, 3703–3711.

Author information

Corresponding author

Correspondence to P. Goyal.

Appendix

The statistical measures used to evaluate the performance of the models are those given by Chang and Hanna (2004), as follows:

Coefficient of Correlation (R)

The coefficient of correlation (R) is a relative measure of the association between the observed and predicted values. It can vary from 0 (which indicates no correlation) to ±1 (which indicates perfect correlation). A value of R close to 1.0 implies good agreement between the observed and predicted values, i.e., good model performance.

$$ R = \frac{{\overline{{(C_{\text{o}} - \bar{C}_{\text{o}} )(C_{\text{p}} - \bar{C}_{\text{p}} )}} }}{{\sigma_{{C_{\text{p}} }} \sigma_{{C_{\text{o}} }} }} $$
(7)
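As a minimal sketch of Eq. (7), the function below computes R from paired series in plain Python. It takes the overbar as the arithmetic mean and σ as the population standard deviation. The function names and the sample AQI values are hypothetical, chosen only for illustration.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean, i.e. the overbar in Eq. (7)."""
    return sum(values) / len(values)

def correlation_coefficient(observed, predicted):
    """R: mean product of deviations divided by the product of
    the two standard deviations."""
    mean_o, mean_p = mean(observed), mean(predicted)
    covariance = mean([(o - mean_o) * (p - mean_p)
                       for o, p in zip(observed, predicted)])
    sigma_o = sqrt(mean([(o - mean_o) ** 2 for o in observed]))
    sigma_p = sqrt(mean([(p - mean_p) ** 2 for p in predicted]))
    return covariance / (sigma_o * sigma_p)

# Hypothetical daily AQI values, for illustration only.
observed = [100, 150, 200, 250]
predicted = [110, 140, 210, 240]
r = correlation_coefficient(observed, predicted)
print(round(r, 3))
```

The function is undefined when either series is constant, because the corresponding standard deviation is then zero.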

Root Mean Square Error (RMSE)

RMSE is a measure of the differences between the values predicted by a model and the observed values, and is expressed as follows:

$$ {\text{RMSE}} = \sqrt {\overline{{(C_{\text{o}} - C_{\text{p}} )^{2} }} } .$$
(8)

Normalized Mean Square Error (NMSE)

NMSE, as a measure of performance, emphasizes the scatter in the entire data set and is defined as follows:

$$ {\text{NMSE}} = \frac{{\overline{{(C_{\text{o}} - C_{\text{p}} )^{2} }} }}{{\bar{C}_{\text{o}} \cdot \bar{C}_{\text{p}} }} .$$
(9)

The normalization by \( \bar{C}_{\text{o}} \cdot \bar{C}_{\text{p}} \) ensures that NMSE will not be biased towards models that overpredict or underpredict. The ideal value for NMSE is zero, and smaller values of NMSE denote better model performance.
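The RMSE and NMSE definitions in Eqs. (8) and (9) can be sketched in plain Python as below. The function names and the sample values are hypothetical and serve only as an illustration.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean, i.e. the overbar in Eqs. (8) and (9)."""
    return sum(values) / len(values)

def rmse(observed, predicted):
    """Root mean square error, Eq. (8)."""
    return sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))

def nmse(observed, predicted):
    """Mean square error normalized by the product of the two means, Eq. (9)."""
    mse = mean([(o - p) ** 2 for o, p in zip(observed, predicted)])
    return mse / (mean(observed) * mean(predicted))

# Hypothetical daily AQI values, for illustration only.
observed = [100, 150, 200, 250]
predicted = [110, 140, 210, 240]

print(rmse(observed, predicted))  # every error is ±10, so this prints 10.0
print(nmse(observed, predicted))
```

Swapping the two arguments leaves `nmse` unchanged, which is the symmetry between overprediction and underprediction that the normalization provides.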

Fractional Bias (FB)

It is a performance measure known as the normalized or fractional bias of the mean concentrations:

$$ {\text{FB}} = \frac{{(\bar{C}_{\text{o}} - \bar{C}_{\text{p}} )}}{{0.5(\bar{C}_{\text{o}} + \bar{C}_{\text{p}} )}} .$$
(10)
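A short sketch of Eq. (10) in plain Python follows. The function name and the sample values are hypothetical; the predicted series is deliberately set below the observations to show the sign of FB.

```python
def mean(values):
    """Arithmetic mean, i.e. the overbar in Eq. (10)."""
    return sum(values) / len(values)

def fractional_bias(observed, predicted):
    """Difference of the means divided by the average of the means."""
    mean_o, mean_p = mean(observed), mean(predicted)
    return (mean_o - mean_p) / (0.5 * (mean_o + mean_p))

# Hypothetical values, for illustration only.
observed = [100, 150, 200, 250]        # mean 175
underpredicted = [90, 140, 180, 230]   # mean 160

fb = fractional_bias(observed, underpredicted)
print(round(fb, 4))
```

With this sign convention, a positive FB means the model underpredicts on average, a negative FB means it overpredicts, and FB is zero when the two means coincide.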

Willmott’s Index (WI)

Willmott’s index of agreement (Willmott, 1982) is given by

$$ {\text{WI}} = 1 - \left[ \frac{\overline{(C_{\text{o}} - C_{\text{p}})^{2}}}{\overline{\left(|C_{\text{o}} - \bar{C}_{\text{o}}| + |C_{\text{p}} - \bar{C}_{\text{o}}|\right)^{2}}} \right], $$
(11)

where \( C_{\text{p}} \) denotes the model predictions, \( C_{\text{o}} \) the observations, an overbar \( (\bar{C}) \) the average over the data set, and \( \sigma_{C} \) the standard deviation over the data set.
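The index of agreement can be sketched in plain Python as below. The sketch assumes the standard form of Willmott's (1982) index, WI = 1 − Σ(C_o − C_p)² / Σ(|C_p − C̄_o| + |C_o − C̄_o|)², in which both the numerator and the denominator are sums (or, equivalently, means) of squared terms. The function name and the sample values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of the observations."""
    return sum(values) / len(values)

def willmott_index(observed, predicted):
    """Index of agreement: 1 minus the ratio of the squared error
    to the potential error around the observed mean."""
    mean_o = mean(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                          for o, p in zip(observed, predicted))
    # Assumes the series are not all equal to the observed mean,
    # otherwise potential_error is zero.
    return 1.0 - squared_error / potential_error

# Hypothetical daily AQI values, for illustration only.
observed = [100, 150, 200, 250]
predicted = [110, 140, 210, 240]
wi = willmott_index(observed, predicted)
print(round(wi, 4))
```

A perfect forecast gives WI = 1, and the value decreases as the squared error grows relative to the potential error.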

Probability of Detection (POD), False Alarm Rate (FAR) and Critical Success Index (CSI):

POD, FAR and CSI (Wilks, 1995) are commonly used to verify ‘Yes/No’ forecasts.

$$ {\text{POD}} = \frac{a}{a + c} $$
$$ {\text{FAR}} = \frac{b}{a + b} $$
$$ {\text{CSI}} = \frac{a}{a + b + c} $$

where a is the number of forecast-observation pairs in which the event was forecast and occurred, called hits; b is the number of occasions on which the event was forecast to occur but did not, called false alarms; c is the number of instances of the event occurring despite not being forecast, called misses; and d is the number of instances of the event not occurring after a forecast that it would not occur, sometimes called correct rejections or correct negatives.
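The categorical scores can be sketched in plain Python as below, using the standard contingency-table definitions POD = a/(a + c), FAR = b/(a + b) and CSI = a/(a + b + c), with a hits, b false alarms, c misses and d correct negatives. The event threshold of 200, the function names and the sample values are assumptions made only for illustration.

```python
def contingency_counts(observed, predicted, threshold):
    """Count hits (a), false alarms (b), misses (c) and correct
    negatives (d) for the event 'value >= threshold'."""
    a = b = c = d = 0
    for o, p in zip(observed, predicted):
        event_observed = o >= threshold
        event_forecast = p >= threshold
        if event_forecast and event_observed:
            a += 1
        elif event_forecast:
            b += 1
        elif event_observed:
            c += 1
        else:
            d += 1
    return a, b, c, d

def categorical_scores(a, b, c):
    """POD, FAR and CSI; assumes at least one hit, so that no
    denominator is zero."""
    return {
        "POD": a / (a + c),
        "FAR": b / (a + b),
        "CSI": a / (a + b + c),
    }

# Hypothetical daily AQI values and event threshold, for illustration only.
observed = [100, 150, 200, 250, 300, 180]
predicted = [110, 140, 210, 190, 310, 220]

a, b, c, d = contingency_counts(observed, predicted, threshold=200)
scores = categorical_scores(a, b, c)
print((a, b, c, d))  # (2, 1, 1, 2)
print(scores["CSI"])  # 0.5
```

The count d does not enter any of the three scores, so they are insensitive to how often the non-event is correctly forecast.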

About this article

Cite this article

Kumar, A., Goyal, P. Forecasting of Air Quality Index in Delhi Using Neural Network Based on Principal Component Analysis. Pure Appl. Geophys. 170, 711–722 (2013). https://doi.org/10.1007/s00024-012-0583-4

  • DOI: https://doi.org/10.1007/s00024-012-0583-4
