A Simple Adaptation of Variable Selection Software for Regression Models to Select Variables in Nested Error Regression Models

Sankhya B

Abstract

Data users often apply standard regression model selection criteria to select variables in nested error regression models, which are widely used in small area estimation. We demonstrate through a Monte Carlo simulation study that this practice may lead to selection of a non-optimal or incorrect model. To assist data users who wish to use standard regression software, we propose a transformation of the data so that the transformed data follow a standard regression model. Thus, variable selection software available for the standard regression model can be applied directly to the transformed data. We illustrate our methodology using survey and satellite data for corn and soybeans in 12 Iowa counties.
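
As an illustration of the kind of transformation the abstract describes, the sketch below applies the classical transformation of Fuller and Battese (1973) to data from a nested error regression model y_ij = x_ij'β + v_i + e_ij. It is not taken from the article, whose own transformation may differ. The function name `fuller_battese_transform` is hypothetical, and the variance components `sigma_v2` (area effect) and `sigma_e2` (unit-level error) are assumed known here, whereas in practice they would have to be estimated.

```python
import math
from collections import defaultdict


def fuller_battese_transform(y, X, area, sigma_v2, sigma_e2):
    """Illustrative sketch (not the article's code).

    y        : list of responses, one per sampled unit
    X        : list of covariate rows (include the intercept column of 1s)
    area     : list of area labels, one per unit
    sigma_v2 : assumed-known variance of the area effect v_i
    sigma_e2 : assumed-known variance of the unit-level error e_ij
    """
    # Group row indices by area label.
    rows = defaultdict(list)
    for k, a in enumerate(area):
        rows[a].append(k)

    p = len(X[0])
    y_star = [0.0] * len(y)
    X_star = [[0.0] * p for _ in X]
    for idx in rows.values():
        n_i = len(idx)
        # alpha_i = 1 - sqrt(sigma_e^2 / (sigma_e^2 + n_i * sigma_v^2))
        alpha = 1.0 - math.sqrt(sigma_e2 / (sigma_e2 + n_i * sigma_v2))
        # Area means of the response and of each covariate column.
        y_bar = sum(y[k] for k in idx) / n_i
        x_bar = [sum(X[k][j] for k in idx) / n_i for j in range(p)]
        # Subtract a fraction alpha_i of the area mean from every unit.
        for k in idx:
            y_star[k] = y[k] - alpha * y_bar
            X_star[k] = [X[k][j] - alpha * x_bar[j] for j in range(p)]
    return y_star, X_star
```

Under these assumptions the transformed errors are uncorrelated with common variance `sigma_e2`, so a standard criterion for ordinary regression could be computed on `(y_star, X_star)` with ordinary regression software. The intercept column must be passed in `X` so that it is transformed along with the other covariates.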

Acknowledgements

The authors thank the editors and the anonymous referee for constructive suggestions that improved an earlier version of the article. The research of the second author was supported in part by National Science Foundation Grant Number SES-1534413.

Author information

Corresponding author

Correspondence to Yan Li.

Appendix

[Figures a, b, c and d]

Cite this article

Li, Y., Lahiri, P. A Simple Adaptation of Variable Selection Software for Regression Models to Select Variables in Nested Error Regression Models. Sankhya B 81, 302–317 (2019). https://doi.org/10.1007/s13571-018-0161-6
