Analysing radon accumulation in the home by flexible M-quantile mixed effect regression

Abstract

Radon is a noble gas that occurs in nature as a decay product of uranium. Radon is the principal contributor to natural background radiation and is considered to be one of the leading causes of lung cancer. The main concern revolves around indoor environments where radon accumulates and reaches high concentrations. In this paper, a semiparametric random-effect M-quantile model is introduced to model radon concentration inside a building, and a way to estimate the model within the framework of robust maximum likelihood is presented. Using data collected in a monitoring survey carried out in the Lombardy Region (Italy) in 2003–2004, we investigate the impact of a number of factors, such as geological typologies of the soil and building characteristics, on indoor concentration. The proposed methodology permits the identification of building typologies prone to a high concentration of the pollutant. We show that these effects are largely not constant across the entire distribution of indoor radon concentration, which makes the suggested approach preferable to ordinary regression techniques, since high concentrations are usually of concern. Furthermore, we demonstrate how our model provides a natural way of identifying those areas more prone to high concentration, displaying them by thematic maps. Understanding how buildings’ characteristics affect indoor concentration is fundamental both for preventing the gas from accumulating in new buildings and for mitigating those situations where the amount of radon detected inside a building is too high and has to be reduced.

References

  • Alfó M, Ranalli M, Salvati N (2017) Finite mixtures of quantiles and m-quantile models. Stat Comput 27:547–570

  • Apte M, Price P, Nero A, Revzan K (1999) Predicting New Hampshire indoor radon concentrations from geologic information and other covariates. Environ Geol 37:181–194

  • Bianchi A, Fabrizi E, Salvati N, Tzavidis N (2018) Estimation and testing in M-quantile regression with applications to small area estimation. Int Stat Rev 86(3):1–30

  • Borgoni R (2011) A quantile regression approach to evaluate factors influencing residential indoor radon concentration. Environ Model Assess 16:239–250

  • Borgoni R, Bianco PD, Salvati N, Schmid T, Tzavidis N (2018) Modelling the distribution of health-related quality of life of advanced melanoma patients in a longitudinal multi-centre clinical trial using m-quantile random effects regression. Stat Methods Med Res 27:549–563

  • Borgoni R, Quatto P, Soma G, de Bartolo D (2010) A geostatistical approach to define guidelines for radon prone area identification. Stat Methods Appl 19:255–276

  • Borgoni R, Tritto V, Bigliotto C, de Bartolo D (2011) A geostatistical approach to assess the spatial association between indoor radon concentration, geological features and building characteristics: the Lombardy case, Northern Italy. Int J Environ Res Public Health 8:1420–1440

  • Bosch RJ, Ye Y, Woodworth GG (1995) A convergent algorithm for quantile regression with smoothing splines. Comput Stat Data Anal 19(6):613–630

  • Breckling J, Chambers R (1988) M-quantiles. Biometrika 75(4):761–771

  • Cade B, Noon BR, Flather CH (2005) Quantile regression reveals hidden bias and uncertainty in habitat models. Ecology 86:786–800

  • Chaudhuri P (1991) Global nonparametric estimation of conditional quantile functions and their derivatives. J Multivar Anal 39(2):246–269

  • Cinelli G, Tondeur F, Dehandschutter B (2011) Development of an indoor radon risk map of the Walloon region of Belgium, integrating geological information. Environ Earth Sci 62:809–819

  • Darby S, Hill D, Auvinen A, Barros-Dios J, Baysson J, Bochicchio F, Deo H, Falk R, Forastiere F, Hakama M, Heid I, Kreienbrock L, Kreuzer M, Lagarde F, Mäkeläinen I, Muirhead C, Oberaigner W, Pershagen G, Ruano-Ravina A, Ruosteenoja E, Rosario AS, Tirmarche T, Tomsek L, Whitley E, Wichmann H, Doll R (2005) Radon in homes and risk of lung cancer: collaborative analysis of individual data from 13 European case–control studies. Br Med J 330(6485):223–226

  • Fellner WH (1986) Robust estimation of variance components. Technometrics 28(1):51–60

  • Fontanella L, Ippoliti L, Sarra A, Valentini P, Palermi S (2015) Hierarchical generalised latent spatial quantile regression models with applications to indoor radon concentration. Stoch Environ Res Risk Assess 29:357–367

  • Foxall R, Baddeley A (2002) Nonparametric measures of association between a spatial point process and a random set, with geological applications. J R Stat Soc Ser C 51(2):165–182

  • Gates A, Gundersen L (1992) Geologic controls on radon. Geological Society of America, Washington, DC (Special Paper 271)

  • Geraci M (2018) Additive quantile regression for clustered data with an application to children’s physical activity. arXiv:1803.05403

  • Geraci M, Bottai M (2014) Linear quantile mixed models. Stat Comput 24(3):461–479

  • Green B, Miles J, Bradley E, Rees D (2002) Radon atlas of England and Wales. Report NRPB-W26, NRPB, Chilton

  • Gunby J, Darby S, Miles J, Green B, Cox D (1993) Indoor radon concentrations in the United Kingdom. Health Phys 64:2–12

  • Huber P (1981) Robust statistics. Wiley, New York

  • Huggins RM (1993) A robust approach to the analysis of repeated measures. Biometrics 49(3):715–720

  • Huggins RM, Loesch DZ (1998) On the analysis of mixed longitudinal growth data. Biometrics 54(2):583–595

  • Hunter N, Muirhead C, Miles J, Appleton JD (2009) Uncertainties in radon related to house-specific factors and proximity to geological boundaries in England. Radiat Prot Dosim 136:17–22

  • Jacobi W (1993) The history of the radon problem in mines and homes. Ann ICRP 23(2):39–45

  • Jones M (1994) Expectiles and m-quantiles are quantiles. Stat Probab Lett 20:149–153

  • Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York

  • Kemski J, Klingel R, Siehl A, Valdivia-Manchego M (2009) From radon hazard to risk prediction-based on geological maps, soil gas and indoor measurements in Germany. Environ Geol 56:1269–1279

  • Koenker R (2005) Quantile regression. Cambridge University Press, New York

  • Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46:33–50

  • Koenker R, Mizera I (2004) Penalized triograms: total variation regularization for bivariate smoothing. J R Stat Soc Ser B 66(1):145–163

  • Koenker R, Ng P, Portnoy S (1994) Quantile smoothing splines. Biometrika 81(4):673–680

  • Kreienbrock L, Kreuzer M, Gerken M, Dingerkus M, Wellmann J, Keller G, Wichmann H (2001) Case-control study on lung cancer and residential radon in western Germany. Am J Epidemiol 89(4):339–348

  • Krewski D, Lubin MAJH, Zielinski JM, Catalan V, Field R, Klotz J, Letourneau E, Lynch C, Lyon J, Sandler D, Schoenberg D, Steck J, Stolwijk C, Weinberg C, Wilcox H (2005) Residential radon and risk of lung cancer: a combined analysis of seven North American case-control studies. Epidemiology 16(4):137–145

  • Levesque B, Gauvin D, McGregor R, Martel R, Gingras S, Dontigny A, Walker W, Lajoie P, Levesque E (1997) Radon in residences: influences of geological and housing characteristics. Health Phys 72:907–914

  • Lubin J, Boice J (1997) Lung cancer risk from residential radon: a meta-analysis of eight epidemiological studies. J Natl Cancer Inst 89(1):49–57

  • Nero A, Schwehr M, Nazaroff W, Revzan K (1986) Distribution of airborne radon-222 concentrations in US homes. Science 234:992–997

  • Newey WK, Powell JL (1987) Asymmetric least squares estimation and testing. Econometrica 55(4):819–847

  • Opsomer J, Claeskens G, Ranalli M, Kauermann G, Breidt F (2008) Nonparametric small area estimation using penalized spline regression. J R Stat Soc Ser B 70(1):265–283

  • World Health Organization (2009) WHO handbook on indoor radon: a public health perspective. WHO Library Cataloguing-in-Publication Data

  • Pratesi M, Ranalli M, Salvati N (2009) Nonparametric m-quantile regression using penalized splines. J Nonparametr Stat 21:287–304

  • Price P, Nero A, Gelman A (1996) Bayesian prediction of mean indoor radon concentrations for Minnesota counties. Health Phys 71:922–936

  • R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

  • Rowe J, Kelly M, Price L (2002) Weather system scale variation in radon-222 concentration of indoor air. Sci Total Environ 284:157–166

  • Ruppert D, Wand M, Carroll R (2003) Semiparametric regression. Cambridge University Press, Cambridge

  • Sarra A, Fontanella L, Ippoliti L, Valentini P, Palermi S (2016) Quantile regression and Bayesian cluster detection to identify radon prone areas. J Environ Radioact 164:354–364

  • Shi X, Hoftiezer D, Duell E, Onega T (2006) Spatial association between residential radon concentration and bedrock types in New Hampshire. Environ Geol 51:65–71

  • Smith B, Field R (2007) Effect of housing factor and surficial uranium on the spatial prediction of residential radon in Iowa. Environmetrics 18:481–497

  • Smith B, Zhang L, Field R (2007) Iowa radon leukemia study: a hierarchical population risk model. Stat Med 10:4619–4642

  • Sundal A, Henriksen H, Soldal O, Strand T (2004) The influence of geological factors on indoor radon concentrations in Norway. Sci Total Environ 328:41–53

  • Tiefelsdorf M (2007) Controlling for migration effects in ecological disease mapping of prostate cancer. Stoch Environ Res Risk Assess 21:615–624

  • Tzavidis N, Salvati N, Schmid T, Flouri E, Midouhas E (2016) Longitudinal analysis of the strengths and difficulties questionnaire scores of the millennium cohort study children in England using m-quantile random effects regression. J R Stat Soc Ser A 179(2):427–452

  • USEPA (1992) National residential radon survey: summary report. Technical Report EPA/402/R-92/011, United States Environmental Protection Agency, Washington, DC

  • Wang Y, Lin X, Zhu M, Bai Z (2007) Robust estimation using the Huber function with a data dependent tuning constant. J Comput Graph Stat 16(2):468–481

  • Yu K, Lu Z, Stander J (2003) Quantile regression: applications and current research areas. Statistician 52(3):331–350

Acknowledgements

The work of Nicola Salvati has been carried out with the support of the project InGRID 2 (Grant Agreement No 730998, EU) and of project PRA_2018_9 (‘From survey-based to register-based statistics: a paradigm shift using latent variable models’). The authors were further supported by the MIUR-DAAD Joint Mobility Program (57265468).

Author information

Corresponding author

Correspondence to R. Borgoni.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Preliminary data analysis

This appendix reports some preliminary data analysis that motivates the need for a robust approach when modelling indoor radon concentration (IRC) data. To this end, an ordinary random effect model for the mean IRC, reflecting the hierarchical structure of the data with buildings nested in geological classes, was fitted using the function lmer of the R package lme4. Figure 9a shows the normal QQ-plot of the individual residuals (i.e. the residuals at the building level), whereas Fig. 9b displays the normal QQ-plot of the residuals estimated from the model at the geological class level. These plots show that the normality assumptions of the ordinary mixed model are violated, which is confirmed by the Shapiro-Wilk test (p value = 0.0000078 for the geological class residuals and p value = 2.2e−16 for the building residuals). Figure 10a shows the histogram of the standardised building residuals obtained from the random effect regression model, whereas Fig. 10b displays the distribution of the standardised residuals by geological class. The histogram is strongly skewed, and some classes have many large positive residuals (larger than 2). Thus, influential observations seem to be present in the data. This is also confirmed by Fig. 11, which displays the Cook’s distance for the two sets of residuals.

Fig. 9

QQ-plot of building residuals (a) and of geological class residuals (b) estimated by the two-level random effect regression model for the mean IRC

Fig. 10

Histogram of standardised building residuals of the two-level random effect regression model for the mean IRC (a); boxplots of standardised building residuals by geological classes (b)

Fig. 11

Cook’s distance of building residuals (a) and of geological class residuals (b) estimated by the two-level random effect regression model for the mean IRC

The data may therefore contain outliers and influential points that invalidate the Gaussian assumptions. In these circumstances, estimates of the model parameters can be biased and inefficient, and the robust approach suggested in this paper is more appropriate.
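
The residual checks described in this appendix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residual values are hypothetical (they are not the survey data), the function names are invented for this sketch, and the residuals are assumed to have been extracted already from a fitted model. The analysis reported above was carried out with lmer from the R package lme4, not with this code.

```python
from statistics import NormalDist, mean, stdev


def standardise(residuals):
    """Centre and scale residuals by their sample mean and standard deviation."""
    m, s = mean(residuals), stdev(residuals)
    return [(r - m) / s for r in residuals]


def normal_qq_points(residuals):
    """Return (theoretical, observed) pairs for a normal QQ-plot."""
    z = sorted(standardise(residuals))
    n = len(z)
    # Plotting positions (i - 0.5) / n avoid probabilities of exactly 0 or 1.
    theo = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theo, z))


def sample_skewness(residuals):
    """Moment-based skewness; values well above 0 indicate a long right tail."""
    z = standardise(residuals)
    return sum(v ** 3 for v in z) / len(z)


# Hypothetical right-skewed residuals, NOT the Lombardy survey data.
residuals = [-0.9, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1,
             0.0, 0.1, 0.2, 0.4, 3.0]

z = standardise(residuals)
# Count standardised residuals larger than 2, the threshold used in the text.
n_large = sum(1 for v in z if v > 2)

print("standardised residuals > 2:", n_large)
print("sample skewness:", round(sample_skewness(residuals), 2))
for theo, obs in normal_qq_points(residuals)[-3:]:
    # Under normality the two columns would be close; a heavy right tail
    # shows up as observed values well above the theoretical ones.
    print(round(theo, 2), round(obs, 2))
```

In the upper tail of such data the observed standardised residuals lie well above the theoretical normal quantiles, which is the pattern a QQ-plot makes visible.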

Appendix B: Additional results for modelling geocoded radon data

This appendix provides a short comparison of the estimated parameters obtained from quantile and M-quantile regression models. The two approaches cannot be compared directly, since they target different location parameters. However, both model location parameters that are related to the same part of the conditional distribution of IRC. Table 5 reports the estimated regression coefficients for q = 0.5 for two approaches: (1) the proposed semiparametric M-quantile random effect regression model (semiMQRE), and (2) a semiparametric quantile regression model (semiQR). semiQR is based on an additive quantile regression model (Koenker et al. 1994) in which the spatial structure is captured by bivariate splines, but the hierarchical structure of the data is not accounted for by a random component. The results indicate that the coefficients from the M-quantile regression model have the same direction as those from quantile regression. However, the quantile regression algorithm sometimes failed to converge. Estimation with the M-quantile approach, on the other hand, was smoother, although the estimated parameters are more difficult to interpret.
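
To illustrate why quantiles and M-quantiles target different location parameters, the following Python sketch computes the sample M-quantile of a single variable and compares it with the sample median. It is a minimal sketch under explicit assumptions, not the estimation procedure of the paper: it uses the Huber influence function with tuning constant c = 1.345, a fixed scale based on the median absolute deviation, and a simple iteratively reweighted average; it has no covariates, splines or random effects. The data values and the function names are hypothetical.

```python
from statistics import median


def huber_psi(u, c=1.345):
    """Huber influence function: linear near zero, clipped at +/- c."""
    return max(-c, min(c, u))


def m_quantile(y, q, c=1.345, tol=1e-10, max_iter=500):
    """Sample M-quantile of order q by iteratively reweighted averaging."""
    med = median(y)
    # Robust scale: median absolute deviation, rescaled for the normal case.
    s = median(abs(v - med) for v in y) / 0.6745
    if s == 0:
        raise ValueError("scale estimate is zero; data are too concentrated")
    theta = med
    for _ in range(max_iter):
        weights = []
        for v in y:
            r = v - theta
            u = r / s
            # Asymmetric weighting: 2q above theta, 2(1 - q) at or below it.
            side = 2 * q if r > 0 else 2 * (1 - q)
            # psi(u) / u, using its limit 1 at u = 0.
            weights.append(side * (huber_psi(u, c) / u if u != 0 else 1.0))
        new_theta = sum(w * v for w, v in zip(weights, y)) / sum(weights)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


# Hypothetical right-skewed concentrations, NOT the Lombardy survey data.
irc = [20, 25, 30, 35, 40, 45, 50, 60, 80, 120, 400]

print("median:", median(irc))
for q in (0.1, 0.5, 0.9):
    print("M-quantile at q =", q, ":", round(m_quantile(irc, q), 2))
```

For skewed data such as these, the M-quantile at q = 0.5 falls between the sample median and the sample mean, so coefficients from the two approaches describe the same region of the distribution without being numerically comparable.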

Table 5 Results of the semiparametric M-quantile and quantile regression models for q = 0.5: point estimates with standard errors in parentheses

Finally, Fig. 12 presents, for each explanatory variable in the model, the effects estimated by the M-quantile and quantile mixed regression models as a function of the quantile. The solid line represents the proposed semiparametric M-quantile random effect regression model, and the dashed line represents an additive quantile regression model (Geraci 2018) that includes a bivariate spline to capture the spatial structure as well as random effects to account for the hierarchy of the data (fitted by the R package aqmm). Note that we plot only the point estimates (without the point-wise 95% confidence intervals) in order to avoid overloading Fig. 12. The estimates from the two models agree in direction.

Fig. 12

Estimated coefficients of quantile regressions (dashed line) and M-quantile regressions (solid line): a intercept, b fault distance, c floor material, d wall material, e year from construction/last renovation, f single building, g not in contact with the ground, h air conditioning system

About this article

Cite this article

Borgoni, R., Carcagní, A., Salvati, N. et al. Analysing radon accumulation in the home by flexible M-quantile mixed effect regression. Stoch Environ Res Risk Assess 33, 375–394 (2019). https://doi.org/10.1007/s00477-018-01643-1

  • DOI: https://doi.org/10.1007/s00477-018-01643-1

Keywords

  • Environmental radioactivity
  • Building factors
  • Radon-prone areas
  • Hierarchical mixed models
  • Penalised splines
  • Lombardy region