
Introduction to Intermediate Statistical Techniques

Higher Education Policy Analysis Using Quantitative Techniques

Abstract

This chapter introduces intermediate statistical techniques, including pooled ordinary least squares (OLS), fixed-effects, and random-effects regression. It demonstrates how these techniques can be used to analyze panel data and shows how various tests can be conducted to determine which method is appropriate in correlational studies. The chapter also introduces how multivariate regression can be modified to infer causal effects by including difference-in-differences estimators.
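
To make the last point concrete, a regression-based difference-in-differences (DiD) model of the kind estimated in the appendix can be sketched as follows. The notation is an assumption introduced here for illustration, not the chapter's own: $T_i$ marks the treated unit, $P_t$ the post-treatment period, $\mathbf{x}_{it}$ the control variables, and $\alpha_i$ and $\lambda_t$ unit and year effects.

```latex
y_{it} = \beta_0 + \beta_1 T_i + \beta_2 P_t + \delta\,(T_i \times P_t)
       + \mathbf{x}_{it}'\boldsymbol{\gamma} + \alpha_i + \lambda_t + \varepsilon_{it}

% Without covariates or fixed effects, the OLS estimate of \delta
% reduces to a difference of differences in group means:
\hat{\delta} = \bigl(\bar{y}_{T=1,P=1} - \bar{y}_{T=1,P=0}\bigr)
             - \bigl(\bar{y}_{T=0,P=1} - \bar{y}_{T=0,P=0}\bigr)
```

When unit and year effects are included, $T_i$ and $P_t$ are absorbed by them, so only the coefficient $\delta$ on the interaction is separately identified.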


Notes

  1. For OLS regression formulas with more than one independent variable, see introductory mathematical statistics texts.

  2. It is assumed the reader is familiar with ANOVA.

  3. The t statistic is equal to the estimated beta coefficient divided by its standard error. So, for a given coefficient estimate, the smaller the standard error, the larger the absolute value of the t statistic.

  4. For a full description of the Levene test, see Levene, H. (1960). Robust tests for equality of variances. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H. B. Mann (Eds.), Contributions to probability and statistics: Essays in honor of Harold Hotelling (pp. 278–292). Stanford University Press.

  5. For an excellent comprehensive description, discussion, and example of regression-based DiD techniques, see Furquim et al. (2020).

  6. For a complete discussion of this test, see Breusch and Pagan (1980).

  7. For a technical discussion of the Hausman test, see Hausman (1978).

  8. For more on bootstrapping in Stata, see Guan (2003).
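
Note 3 can be written out as a formula with a small worked example. The numbers are hypothetical and chosen only to show the direction of the effect: holding the coefficient fixed, a smaller standard error yields a larger t statistic.

```latex
t = \frac{\hat{\beta}}{\operatorname{SE}(\hat{\beta})}

% Hypothetical values: same coefficient, two different standard errors
\hat{\beta} = 0.50,\ \operatorname{SE}(\hat{\beta}) = 0.25 \;\Rightarrow\; t = 2.0
\hat{\beta} = 0.50,\ \operatorname{SE}(\hat{\beta}) = 0.10 \;\Rightarrow\; t = 5.0
```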

References

  • Breusch, T. S., & Pagan, A. R. (1980). The Lagrange Multiplier Test and its Applications to Model Specification in Econometrics. The Review of Economic Studies, 47(1), 239–253. https://doi.org/10.2307/2297111

  • Furquim, F., Corral, D., & Hillman, N. (2020). A Primer for Interpreting and Designing Difference-in-Differences Studies in Higher Education Research. In L. W. Perna (Ed.), Higher Education: Handbook of Theory and Research: Volume 35 (pp. 667–723). Springer International Publishing. https://doi.org/10.1007/978-3-030-31365-4_5

  • Guan, W. (2003). From the help desk: Bootstrapped standard errors. The Stata Journal, 3(1), 71–80.

  • Hausman, J. A. (1978). Specification Tests in Econometrics. Econometrica, 46(6), 1251–1271. https://doi.org/10.2307/1913827

  • Hoechle, D. (2007). Robust Standard Errors for Panel Regressions with Cross-Sectional Dependence. The Stata Journal, 7(3), 281–312. https://doi.org/10.1177/1536867X0700700301

  • Hutchinson, S. R., & Lovell, C. D. (2004). A review of methodological characteristics of research published in key journals in higher education: Implications for graduate research training. Research in Higher Education, 45(4), 383–403.

  • Judge, G. G., Hill, R. C., Griffiths, W. E., Lutkepohl, H., & Lee, T.-C. (1988). Introduction to the Theory and Practice of Econometrics (2nd ed.). Wiley.

  • Kaiser, B. (2015). RHAUSMAN: Stata module to perform Robust Hausman Specification Test. In Statistical Software Components. Boston College Department of Economics. https://ideas.repec.org/c/boc/bocode/s457909.html

  • Wells, R. S., Kolek, E. A., Williams, E. A., & Saunders, D. B. (2015). “How We Know What We Know”: A Systematic Comparison of Research Methods Employed in Higher Education Journals, 1996–2000 v. 2006–2010. The Journal of Higher Education, 86(2), 171–198.


7.7 Appendix

*Chapter 7 Stata syntax

*Bivariate OLS Regression
*use dataset from the previous chapter
use "C:\Users\Marvin\Dropbox\Manuscripts\Book\Chapter 6\Stata files\Example 6.3.dta", clear
*generate bivariate (one independent variable) OLS regression output
regress netuit_fte stapr_fte if year ==2016
*create a new variable reflecting, say, the squared term
*(or quadratic) of another variable
gen stapr_fte2 = stapr_fte*stapr_fte
*include new variable in the regression model
regress netuit_fte stapr_fte stapr_fte2 pc_income if year ==2016

*Multivariate Pooled OLS Regression
reg netuit_fte stapr_fte stapr_fte2 pc_income
*include the categorical variable (region_compact) in regression model
reg netuit_fte stapr_fte stapr_fte2 pc_income i.region_compact

*Multivariate Pooled OLS Regression with Interaction Terms
reg netuit_fte stapr_fte i.region_compact##i.ugradmerit, allbaselevels
*test to see if there are interaction effects by quietly (qui) running the
*models and storing (est sto) the model results without (model1) and with
*(model2) the interaction terms
qui reg netuit_fte stapr_fte i.region_compact
est sto model1
qui reg netuit_fte stapr_fte i.region_compact##i.ugradmerit
est sto model2
lrtest model1 model2
*Using the testparm command, the statistical significance of the interaction
*terms can also be checked.
testparm i.region_compact#i.ugradmerit
*if the interaction term is composed of one continuous (c) variable and one
*categorical (i) variable
reg netuit_fte i.ugradmerit i.region_compact c.stapr_fte##i.tuitset
testparm c.stapr_fte#i.tuitset
*change working directory
cd "C:\Users\Marvin\Dropbox\Manuscripts\Book\Chapter 7\Stata files"
*use new dataset
use "Example 7.1.dta", clear
*if two continuous variables are included
reg netuit_fte i.region_compact c.stapr_fte##c.state_needFTE
*using margins (with the vsquish option)
margins, dydx(stapr_fte) at(state_needFTE=(0(3000)10000)) vsquish
qui margins, at(stapr_fte=(0 10000) state_needFTE=(0(3000)10000)) vsquish
*and marginsplot with different patterns
marginsplot, noci x(stapr_fte) recast(line) xlabel(0(3000)10000) ///
    plot1opts(lpattern("...")) plot2opts(lpattern("-..-") color(black)) ///
    plot3opts(lpattern("---") color(black)) plot4opts(color(black))
*residual-versus-fitted plot
rvfplot, mcolor(black)
*comprehensive post-estimation
estat imtest
*POLS regression model using the robust option
reg netuit_fte stapr_fte stapr_fte2 pc_income i.region_compact, robust
*Levene test of homogeneity
quietly: reg netuit_fte stapr_fte stapr_fte2 pc_income i.region_compact
predict double eps, residual
robvar eps, by(state)
*To address this particular violation of the assumption of homoscedasticity, we
*use the cluster option, with state as the cluster variable, in our POLS
*regression model.
reg netuit_fte stapr_fte stapr_fte2 pc_income i.region_compact, cluster(state)

*Fixed-Effects Regression
*Unobserved Heterogeneity and Fixed-Effects Dummy Variable (FEDV) Regression
*Estimating FEDV Multivariate POLS Regression Models
reg netuit_fte stapr_fte stapr_fte2 pc_income i.stateid, cluster(state)
*we determine if the state fixed-effects as a whole are statistically
*significant, immediately after we run the above regression
testparm i.stateid
*alternative approach to producing the same results
areg netuit_fte stapr_fte stapr_fte2 pc_income, cluster(stateid) absorb(stateid)
*open another dataset
use "Example 7.1.dta", clear
*note: areg requires the absorb() option, which is missing from the source
*line below
areg eg statea tuition totfteiarep ftfac ptfac D, cluster(opeid5_new)
*Unobserved Heterogeneity and Within-Group Estimator Fixed-Effects Regression
*using the xtreg command with the fe option
xtreg eg statea tuition totfteiarep ftfac ptfac, fe cluster(opeid5_new)

*Fixed-effects regression and difference-in-differences (DiD)
*The DiD Estimator
*Fixed-effects Regression-based DiD: An Example
use "Example 7.1.dta", clear
*We create the treatment variable (T).
gen T=0
replace T=1 if state=="CO"
*The post-treatment variable (P) is then created.
gen P=0
replace P=1 if year>=2004
*Based on every state other than the treatment state (Colorado), we create the
*first control group.
gen C1 = 0
replace C1=1 if state !="CO"
*we create a second control group.
gen C2 = 0
replace C2=1 if state !="CO" & region_compact==2
*we use the global command to create global macros holding the
*dependent variable net tuition revenue per FTE enrollment (y)
global y "netuit_fte"
*and the set of control variables state appropriations to higher education per
*FTE enrollment (stapr_fte) and state per capita income (pc_income).
global controls "stapr_fte pc_income"
*To take into account unobserved heterogeneity, we include robust (rob) as
*an option in the syntax.
reg $y i.T i.P T#P $controls i.year i.fips if year>=2000 & (C1==1 | T==1), rob
*The within-group fixed-effects DiD regression model can also be employed.
xtreg $y T##P $controls i.year if year>=2000 & (C1==1 | T==1) , fe rob
*For comparison, we run the within-group fixed-effects model with the second
*control group (states in WICHE).
xtreg $y T##P $controls i.year if year>=2000 & (C2==1 | T==1) , fe rob
*DiD Placebo Tests
gen placebo_2000 = 1 if year>=2000
recode placebo_2000 (.=0)
*note: as written, (year>1995 | year<2005) holds for every year;
*& may have been intended
xtreg $y T##placebo_2000 $controls if (year>1995 | year<2005) & (C2==1 | ///
    T==1), fe rob

*Random-Effects Regression
xtreg netuit_fte stapr_fte stapr_fte2 pc_income i.region_compact, ///
    re cluster(stateid)
*Breusch and Pagan Lagrangian multiplier test for random effects
xttest0
*Hausman test
quietly: xtreg eg statea tuition totfteiarep ftfac ptfac, fe
est sto fixed
quietly: xtreg eg statea tuition totfteiarep ftfac ptfac, re
est sto random
hausman fixed random
*log transform the variables and rerun the test
xtreg lneg lnstatea lntuition lntotfteiarep lnftfac ptfac, fe
est sto fixed
xtreg lneg lnstatea lntuition lntotfteiarep lnftfac lnptfac, re
est sto random
hausman fixed random
*install a Stata user-written Hausman routine by Kaiser (2015)
ssc install rhausman
*run rhausman
qui: xtreg lneg lnstatea lntuition lntotfteiarep lnftfac ptfac, cluster(opeid5_new) fe
est sto fixed
qui: xtreg lneg lnstatea lntuition lntotfteiarep lnftfac ptfac, cluster(opeid5_new) re
est sto random
rhausman fixed random, reps(400) cluster
*end

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this chapter

Titus, M. (2021). Introduction to Intermediate Statistical Techniques. In: Higher Education Policy Analysis Using Quantitative Techniques. Quantitative Methods in the Humanities and Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-60831-6_7
