Quantile Regression: Analyzing Changes in Distributions Instead of Means

Higher Education: Handbook of Theory and Research

Part of the book series: Higher Education: Handbook of Theory and Research (HATR, volume 30)

Abstract

While multiple regression has been a popular statistical choice for postsecondary researchers, it can only tell us the effect of an independent variable on the mean of y. In many applications, however, we would like to know the effect across the entire distribution of y, not just the mean of y. Quantile regression provides one way of telling us this effect, although the interpretation can vary depending upon whether conditional or unconditional quantile regression is used. This chapter reviews conditional and unconditional quantile regression, with an emphasis on the latter as estimated via the recentered influence function, assuming exogeneity of the independent variables, and the instrumental variables quantile treatment effect estimator, assuming endogeneity of treatment. Issues around interpretation, estimation, sensitivity analyses, and presentation of results are discussed, using the 2004 National Survey of Postsecondary Faculty to estimate male-female salary differentials and the effect of faculty unions on faculty salaries.

Notes

  1. This example is based on the discussion at http://www.ats.ucla.edu/stat/stata/faq/quantreg.htm

  2. To simplify the analysis, no survey weights or adjustments of the standard errors for the complex sampling design of the NSOPF are used, and the dependent variable is not logged.

  3. Indeed, one of the co-authors of the Firpo et al. (2009) paper has done this in their discussion papers, but omitted the sensitivity analyses from their published papers (Fortin, June 2, 2014, personal communication).

  4. While their estimator is easily programmed by hand, the ado files for this command can be found at http://faculty.arts.ubc.ca/nfortin/datahead.html

  5. Please note that for expository purposes I am assuming selection on observables, but this clearly does not hold here. There are many differences between male and female faculty that are not taken into account by the simple model estimated here, so the results should not be interpreted as the “true” male-female salary differential.

  6. Continuous instruments can be dichotomized to satisfy this requirement.
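Note 6 mentions dichotomizing a continuous instrument. A minimal sketch of one way this could be done in Stata follows. The variable names zcont and zbinary are hypothetical, and the split at the sample median is an illustrative assumption rather than a choice taken from the chapter.

```stata
* Sketch only: turn a continuous instrument into a binary one.
* zcont is a hypothetical continuous instrument, not a variable from the chapter.
quietly summarize zcont, detail

* 1 if above the sample median, 0 otherwise; missing values stay missing
generate byte zbinary = (zcont > r(p50)) if !missing(zcont)

* check the resulting split
tabulate zbinary
```

The resulting zbinary could then take the place of the instrument in a call such as the ivqte commands in the Appendix. Other cutoffs are equally possible, and the choice may affect the estimates.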

References

  • Abadie, A. (2003). Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics, 113, 231–263.

  • Abadie, A., Angrist, J., & Imbens, G. (2002). Instrumental variables estimates of the effect of subsidized training on the quantiles of trainee earnings. Econometrica, 70(1), 91–117.

  • Bielby, R. M., House, E., Flaster, A., & DesJardins, S. L. (2013). Instrumental variables: Conceptual issues and an application considering high school coursetaking. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research. New York: Springer.

  • Buchinsky, M. (1998). Recent advances in quantile regression models: A practical guideline for empirical research. Journal of Human Resources, 33(1), 88–126.

  • Budig, M. J., & Hodges, M. J. (2010). Differences in disadvantage: Variation in the motherhood penalty across white women’s earnings distribution. American Sociological Review, 75, 705–728.

  • Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and applications. New York: Cambridge University Press.

  • Castellano, K. E., & Ho, A. D. (2013a). Contrasting OLS and quantile regression approaches to student “growth” percentiles. Journal of Educational and Behavioral Statistics, 38(2), 190–215.

  • Castellano, K. E., & Ho, A. D. (2013b). A practitioner’s guide to growth models. Technical report, Council of Chief State School Officers.

  • Chen, C. L. (2005). An introduction to quantile regression and the quantreg procedure. In SUGI 30 proceedings, Philadelphia.

  • Cox, N. J. (2007). Kernel estimation as a basic tool for geomorphological data analysis. Earth Surface Processes and Landforms, 32, 1902–1912.

  • Davino, C., Furno, M., & Vistocco, D. (2014). Quantile regression: Theory and applications. Chichester, UK: Wiley.

  • Fiorio, C. V. (2004). Confidence intervals for kernel density estimation. The Stata Journal, 4(2), 168–179.

  • Firpo, S. (2007). Efficient semiparametric estimation of quantile treatment effects. Econometrica, 75(1), 259–276.

  • Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional quantile regressions. Econometrica, 77(3), 953–973.

  • Frandsen, B. R., Frölich, M., & Melly, B. (2012). Quantile treatment effects in the regression discontinuity design. Journal of Econometrics, 168, 382–395.

  • Frölich, M., & Melly, B. (2010). Estimation of quantile treatment effects with Stata. The Stata Journal, 10(3), 423–457.

  • Frölich, M., & Melly, B. (2013). Unconditional quantile treatment effects under endogeneity. Journal of Business & Economic Statistics, 31(3), 346–357.

  • Härdle, W. (1991). Smoothing techniques. New York: Springer.

  • Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15(3), 199–236.

  • Killewald, A., & Bearak, J. (2014). Is the motherhood penalty larger for low-wage women? A comment on quantile regression. American Sociological Review, 79(2), 350–357.

  • Koenker, R., & Hallock, K. F. (2001). Quantile regression. Journal of Economic Perspectives, 15(4), 143–156.

  • Maclean, J. C., Webber, D. A., & Marti, J. (2014). An application of unconditional quantile regression to cigarette taxes. Journal of Policy Analysis and Management, 33(1), 188–210.

  • Mueller, S. (2013). Teacher experience and the class size effect – Experimental evidence. Journal of Public Economics, 98, 44–52.

  • Murnane, R. J., & Willett, J. B. (2011). Methods matter: Improving causal inference in educational and social science. New York: Oxford University Press.

  • Porter, S. R. (2013). The causal effect of faculty unions on institutional decision-making. Industrial & Labor Relations Review, 66(5), 1192–1211.

  • Porter, S. R. (2014). Understanding the modern approach to instrumental variables. Raleigh, NC: North Carolina State University.

  • Salgado-Ugarte, I. H., Shimizu, M., & Taniuchi, T. (1993). Exploring the shape of univariate data using kernel density estimators. Stata Technical Bulletin, 16, 8–19.

  • Salgado-Ugarte, I. H., Shimizu, M., & Taniuchi, T. (1995). Practical rules for bandwidth selection in univariate density estimation. Stata Technical Bulletin, 27, 5–19.

  • Scott, D. (1992). Multivariate density estimation. New York: Wiley.

  • Silverman, B. W. (1992). Density estimation for statistics and data analysis. London: Chapman & Hall.

  • StataCorp LP (2013). Stata base reference manual release 13. College Station, TX: Stata Press.

  • Webber, D. A., & Ehrenberg, R. G. (2010). Do expenditures other than instructional expenditures affect graduation and persistence rates in American higher education? Economics of Education Review, 29, 947–958.

Author information

Corresponding author

Correspondence to Stephen R. Porter.

Appendix

Below is the Stata syntax used to generate the results in this chapter.

global figures directory

*** Example in Table 8.1 ***

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear

sum write, detail

sum write if female==0, detail

sum write if female==1, detail

reg write female

qreg write female

qreg write female, quantile(.25)

replace write=1000 if id==192

reg write female

qreg write female

*** Graphing densities for Fig. 8.6, panel (a) ***

kdensity write, bwidth(1) kernel(gau) legend(off) graphregion(color(white) lwidth(large)) xtitle("Writing score") title("")

graph export $figures\kernel1gau.eps, replace

!epstopdf $figures\kernel1gau.eps

*** Input faculty salary data ***

use nsopfdata.dta, clear

* Define faculty group for analysis

keep if q1==1 & q2==1 & q3==1 & q5==1 // only instr. duties, faculty status, full-time

keep if q4==1 | q4==2 // principal activity is teaching or research

keep if q10==1 | q10==2 | q10==3 // rank of prof, assoc or asst

* Code independent variables

recode q17a1 (1=1) (0 2/7=0), gen(phd)

recode q71 (2=1) (1=0), gen(female)

gen age=2003-q72

recode q10 (1=1) (2 3=0) (0 4 5 6=.), gen(full)

recode q10 (2=1) (1 3=0) (0 4 5 6=.), gen(assoc)

rename q74b asian

rename q74c black

gen native=0

replace native=1 if q74a==1 | q74d==1

rename q73 latino

rename q52ba articles

rename q52bd books

rename q16cd2 disc

xi i.disc // discipline dummy vars

* Dependent variable

rename q66a basesalary

drop if basesalary<20000 // seems odd to be FT prof and making less than 20K

* Create analytic sample

reg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32

keep if e(sample)

*** OLS-RIF results for Table 8.5 ***

reg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32

estimate store ols

foreach i in 10 25 50 75 90 {

rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i')

estimates store q`i'

}

estimates table ols q10 q25 q50 q75 q90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

*** bootstrapping SEs for Table 8.6 ***

foreach i in 10 25 50 75 90 {

bootstrap, reps(100) seed(642014): rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i')

estimates store q`i'

}

estimates table q10 q25 q50 q75 q90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

*** Testing sensitivity of results in Table 8.7 ***

* Gaussian

foreach i in 10 25 50 75 90 {

rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i')

estimates store silverq`i'

}

foreach i in 10 25 50 75 90 {

rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i') width(5192)

estimates store hardleq`i'

}

foreach i in 10 25 50 75 90 {

rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i') width(3802)

estimates store scottq`i'

}

estimates table silverq10 silverq25 silverq50 silverq75 silverq90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

estimates table hardleq10 hardleq25 hardleq50 hardleq75 hardleq90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

estimates table scottq10 scottq25 scottq50 scottq75 scottq90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

* to see results with Epanechnikov and uniform kernels, just add kernop(ep) or kernop(rec) as options

*** Conditional QR results for Table 8.8 ***

foreach i in 10 25 50 75 90 {

qreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(.`i')

estimates store q`i'

}

estimates table q10 q25 q50 q75 q90, drop(_Idisc_2-_Idisc_32) b(%9.0f) se se(%9.0f)

*** Graph unconditional QR results for gender (Fig. 8.7) ***

* This set of code can be used to create the other figures in the chapter

matrix quantiles = J(1,3,.) // create blank matrix to add model results to

matrix colnames quantiles = B SE Q

matrix identity=J(1,1,1) // to add to counter matrix per loop

matrix counter=J(1,1,0) // will save quantiles for graphing

forvalues i=.01(.01)1 {

matrix counter=counter+identity

qui: rifreg basesalary female asian black latino native full assoc articles books _Idisc_2-_Idisc_32, quantile(`i')

matrix table=r(table) // create a matrix of results for each model (have to rename matrix)

matrix b_se=table[1..2,1..1]' // grab B and SE and transpose so they are in column format rather than row

matrix temp=b_se,counter // add quantile as a column

matrix quantiles=quantiles\temp // add most recent set of model results to matrix

}

matrix quantiles2=quantiles[2..100,1..3] // drop missing first row

clear

svmat quantiles2, names(col) // converts matrix of results to dataset for graphing

gen ciplus=B+1.96*SE

gen cineg=B-1.96*SE

graph twoway connected B Q, msymbol(none) legend(off) graphregion(color(white)) yline(-5540, lpattern(longdash)) lwidth(medthick) xtitle("Quantiles of salary") ytitle("Male-female differential ($)") || connected ciplus Q, msymbol(none) lpattern(dash) || connected cineg Q, msymbol(none) lpattern(dash)

graph export $figures\gender.eps, replace

!epstopdf $figures\gender.eps

*** Finding optimal bandwidths for ivqte command (Table 8.11) ***

locreg facultyunion, logit bandwidth(.2 1.) lambda(.2 .5 .8) continuous(citi6008) dummy(gov_cons)

locreg facultyunion, logit bandwidth(.05 .1 .15 .2 .25) lambda(.05 .1 .15 .2 .25) continuous(citi6008) dummy(gov_cons)

locreg facultyunion, logit bandwidth(.06 .08 .1 .12) lambda(.01 .02 .03 .04 .05) continuous(citi6008) dummy(gov_cons)

locreg facultyunion, logit bandwidth(.1) lambda(0 .0025 .005 .0075 .01) continuous(citi6008) dummy(gov_cons)

*** IV QR estimates for Table 8.12 ***

foreach i in 10 25 50 75 90 {

ivqte basesalary (facultyunion = statelaws), variance quantiles(.`i') continuous(citi6008) dummy(gov_cons)

}

foreach i in 10 25 50 75 90 {

ivqte basesalary (facultyunion = statelaws), variance quantiles(.`i') continuous(citi6008) dummy(gov_cons) bandwidth(.1) lambda(0)

}
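Note 4 remarks that the recentered influence function (RIF) estimator is easily programmed by hand. A minimal sketch for the median follows, assuming the analytic sample built above is in memory. It uses the definition RIF(y; q) = q + (tau - 1{y <= q}) / f(q) from Firpo et al. (2009). The Gaussian kernel, the rule-of-thumb bandwidth, and the names kern and rif are assumptions made for illustration. This is not the rifreg source code and need not reproduce its output exactly.

```stata
* Sketch only: RIF-OLS at the median, programmed by hand.
local p = 50
local tau = `p'/100

* sample quantile of salary
_pctile basesalary, percentiles(`p')
local q = r(r1)

* rule-of-thumb bandwidth: 0.9 * min(sd, IQR/1.349) * N^(-1/5) (assumed choice)
quietly summarize basesalary, detail
local iqr = r(p75) - r(p25)
local h = 0.9 * min(r(sd), `iqr'/1.349) * r(N)^(-1/5)

* Gaussian kernel density estimate of salary, evaluated at the quantile
generate double kern = normalden((basesalary - `q') / `h') / `h' if !missing(basesalary)
quietly summarize kern, meanonly
local fq = r(mean)

* RIF(y; q) = q + (tau - 1{y <= q}) / f(q)
generate double rif = `q' + (`tau' - (basesalary <= `q')) / `fq' if !missing(basesalary)

* OLS of the RIF on the covariates gives the unconditional quantile regression
regress rif female asian black latino native full assoc articles books _Idisc_2-_Idisc_32
```

Changing the local p to 10, 25, 75, or 90 gives the other quantiles used in the chapter. The OLS standard errors here treat the quantile and density as known. The bootstrap loop above is one way to account for their estimation.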

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Porter, S.R. (2015). Quantile Regression: Analyzing Changes in Distributions Instead of Means. In: Paulsen, M. (ed.) Higher Education: Handbook of Theory and Research. Higher Education: Handbook of Theory and Research, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-319-12835-1_8
