Applications for Quantile Regression in Epidemiology

  • Epidemiologic Methods (P Howards Section Editor)
Current Epidemiology Reports

Abstract

Purpose of Review

To illustrate the utility of quantile regression in epidemiology for continuous outcomes when exposure effects may differ across the distribution of the outcome. Linear regression methods estimate effects only at the mean, which may give an incomplete and biased summary of exposure effects for some continuous health outcomes.

Recent Findings

There are several variations of the quantile regression method, including classical linear quantile regression, nonparametric quantile regression for growth trajectories, and modified quantile regression for case–control designs. These methods support several applications: (1) testing whether the effects of exposure are similar across quantiles, (2) risk prediction, and (3) examining the effects of growth trajectories over time.

Summary

Quantile regression is an important tool for understanding continuous health outcomes, especially outcomes that are not normally distributed, as it offers insight into the relation of exposures with respect to the distribution of the outcome. Quantile regression methods have the potential to deepen and expand the existing quantitative evidence from more common mean-based analyses.

Author information

Corresponding author

Correspondence to Mary Beth Terry.

Ethics declarations

Conflict of Interest

The authors declare that they have no potential conflicts of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Epidemiologic Methods

Appendix

Table 2. Computational packages and basic syntax of quantile regression

Basic Quantile Regression Syntax in R

> install.packages("quantreg")

> library(quantreg)

> fit = rq(y ~ x1 + x2, tau = .5, data = data)

Note: tau is the quantile level(s) of interest. It can be a single value for a fixed quantile level or a vector of quantile levels, such as tau = c(0.25, 0.5, 0.75). When a vector is supplied, rq() returns the regression quantiles at each of the specified levels. If tau is smaller than 0 or larger than 1, the function returns the entire quantile process.
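The behavior described in the note can be tried on simulated data. The following is a minimal sketch, not the authors' code: the data set dat, its variables, and the simulation settings are all assumptions made up for illustration. The errors are generated so that their spread grows with x1, which makes the slope of x1 differ across quantile levels.

```r
library(quantreg)

set.seed(1)                      # simulated data, for illustration only
n  <- 200
x1 <- runif(n, 0, 2)
x2 <- rbinom(n, 1, 0.5)
# Error spread grows with x1, so the x1 slope changes with the quantile level
y   <- 1 + 2 * x1 + 0.5 * x2 + (1 + x1) * rnorm(n)
dat <- data.frame(y, x1, x2)

# A vector of quantile levels gives one set of coefficients per level
taus <- c(0.25, 0.5, 0.75)
fit  <- rq(y ~ x1 + x2, tau = taus, data = dat)
coef(fit)                        # rows: coefficients; columns: quantile levels

# A tau outside [0, 1] requests the entire quantile process
fit_all <- rq(y ~ x1 + x2, tau = -1, data = dat)
```

Comparing the x1 row of coef(fit) across columns gives a first look at whether the estimated effect changes across the outcome distribution.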

Basic Quantile Regression Syntax in SAS

PROC QUANTREG DATA = sas-data-set;

CLASS X1;

MODEL Y = X1 X2 / QUANTILE = 0.25 0.5 0.75;

RUN;

Note: with the option QUANTILE = ALL, the procedure returns the entire quantile process. As in R, the default quantile level is 0.5, corresponding to the median.

Statistical Inference of Quantile Regression in R and SAS

To obtain statistical inference for quantile regression in R, use the function summary.rq(object, se = "nid", ...), where object is the object returned by rq() and the parameter se specifies the inference method. In SAS, the inference options are specified in the PROC QUANTREG statement with the syntax "PROC QUANTREG CI = <NONE|RANK|...> ALPHA = value;", where ALPHA is the significance level and CI specifies the choice of inference method. The table below lists the available methods in R and SAS.

  

| Inference method | Subcategory | R option | SAS option |
|---|---|---|---|
| Direct | i.i.d. model | se = "iid" | CI = SPARSITY/IID |
| Direct | n.i.d. model | se = "nid" | CI = SPARSITY |
| Rank score | | se = "rank" | CI = RANK |
| Resampling | Pairwise | se = "boot", bsmethod = "xy" | Not available |
| Resampling | Parzen, Wei and Ying | se = "boot", bsmethod = "pwy" | Not available |
| Resampling | MCMB | se = "boot", bsmethod = "mcmb" | CI = RESAMPLING |
| Resampling | Wild | se = "boot", bsmethod = "wild" | Not available |
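Several of the R inference options in the table can be compared on a single fit. The sketch below uses simulated data: the data set dat, its variables, and the number of bootstrap replications R = 500 are assumptions chosen for illustration. The final call uses anova() from quantreg to carry out a Wald test of whether the slopes are equal at two quantile levels, which is one way to check whether exposure effects are similar across quantiles.

```r
library(quantreg)

set.seed(2)                      # simulated data, for illustration only
n  <- 300
x1 <- runif(n, 0, 2)
x2 <- rbinom(n, 1, 0.5)
y   <- 1 + 2 * x1 + 0.5 * x2 + (1 + x1) * rnorm(n)
dat <- data.frame(y, x1, x2)

fit_med <- rq(y ~ x1 + x2, tau = 0.5, data = dat)

summary(fit_med, se = "nid")     # direct method, n.i.d. error model
summary(fit_med, se = "rank")    # rank-score confidence intervals
summary(fit_med, se = "boot", bsmethod = "xy", R = 500)  # pairwise bootstrap

# Same model at two quantile levels, then a joint test of equal slopes
fit_lo <- rq(y ~ x1 + x2, tau = 0.25, data = dat)
fit_hi <- rq(y ~ x1 + x2, tau = 0.75, data = dat)
anova(fit_lo, fit_hi)
```

The direct and bootstrap methods report standard errors, whereas se = "rank" reports confidence bounds, so the outputs are not laid out identically.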

R script for the nonparametric quantile regression for growth trajectories with B-spline approximation

[Figure a: R script]
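The script itself is not available in this text. As a rough illustration of the general idea only, and not the authors' script, the sketch below fits quantile curves with a cubic B-spline basis in age. The simulated growth data, the variable names age and height, and the choice df = 6 are all assumptions.

```r
library(quantreg)
library(splines)

set.seed(3)                       # simulated growth data, illustration only
n   <- 400
age <- runif(n, 0, 10)            # hypothetical age in years
# Hypothetical height (cm): smooth trend with spread that widens with age
height <- 50 + 25 * sqrt(age) + (2 + 0.5 * age) * rnorm(n)
growth <- data.frame(age, height)

taus <- c(0.05, 0.25, 0.5, 0.75, 0.95)
# Cubic B-spline basis in age; df = 6 is an arbitrary choice for this sketch
fit <- rq(height ~ bs(age, df = 6), tau = taus, data = growth)

# Fitted quantile curves on an age grid inside the observed range
grid   <- data.frame(age = seq(min(age), max(age), length.out = 100))
curves <- predict(fit, newdata = grid)   # one column per quantile level

plot(height ~ age, data = growth, col = "grey", pch = 16, cex = 0.5)
matlines(grid$age, curves, lty = 1, col = "black")
```

In practice the number and placement of knots would need to be chosen for the data at hand, and curves fitted separately at each quantile level can cross where data are sparse.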

About this article

Cite this article

Wei, Y., Kehm, R.D., Goldberg, M. et al. Applications for Quantile Regression in Epidemiology. Curr Epidemiol Rep 6, 191–199 (2019). https://doi.org/10.1007/s40471-019-00204-6

  • DOI: https://doi.org/10.1007/s40471-019-00204-6
