Current Environmental Health Reports, Volume 4, Issue 4, pp 481–490

Statistical Approaches to Address Multi-Pollutant Mixtures and Multiple Exposures: the State of the Science

  • Massimo Stafoggia
  • Susanne Breitner
  • Regina Hampel
  • Xavier Basagaña
Air Pollution and Health (S Adar and B Hoffmann, Section Editors)
Part of the following topical collections:
  1. Topical Collection on Air Pollution and Health


Purpose of Review

The purpose of this review is to describe the most recent statistical approaches to estimate the effect of multi-pollutant mixtures or multiple correlated exposures on human health.

Recent Findings

The health effects of environmental chemicals or air pollutants have been widely described. Exposure often involves a complex mixture of different substances, potentially highly correlated with each other and with other environmental stressors. Single-exposure approaches cannot disentangle the effects of individual factors and fail to detect potential interactions between exposures. In recent years, sophisticated methods have been developed to investigate the joint or independent health effects of multi-pollutant mixtures or multiple environmental exposures.


Summary

A classification of the most recent methods is proposed. A non-technical description of each method is provided, together with epidemiological applications and operational details for implementation with standard software.


Keywords

Correlated variables, Environmental exposures, Epidemiology, Health, Multi-pollutant



ISGlobal is a member of the CERCA Programme, Generalitat de Catalunya.

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Massimo Stafoggia (1, 2)
  • Susanne Breitner (3)
  • Regina Hampel (3)
  • Xavier Basagaña (4, 5, 6)
  1. Department of Epidemiology, Lazio Region Health Service/ASL Roma 1, Rome, Italy
  2. Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
  3. Institute of Epidemiology II, Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Neuherberg, Germany
  4. ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
  5. Pompeu Fabra University, Barcelona, Spain
  6. Ciber on Epidemiology and Public Health (CIBERESP), Madrid, Spain
