Gaussian Semiparametric Analysis Using Hierarchical Predictive Models

Modeling Demographic Processes In Marked Populations

Part of the book series: Environmental and Ecological Statistics (ENES, volume 3)

Abstract

The Hierarchical Predictive Model (HPM) is a semiparametric mixed model in which the fixed effects are fit with a user-specified nonparametric component. This approach extends current spline-based semiparametric mixed model formulations, allowing for more flexible nonparametric estimation. Greater adaptability simplifies model specification, making it easier to analyze data sets with large numbers of predictors. Greater automation also extends the scope of exploratory analyses that may be performed with mixed models. Using an HPM, the analyst may select the predictive model that best suits their needs, exploiting the strengths of currently available predictive methods. A simulation study demonstrates the advantages of accounting for known hierarchical structure in predictive models and illustrates the adaptability of current decision-tree-based predictive models. An HPM of the relative abundance of the North American House Finch (Carpodacus mexicanus) demonstrates exploratory analysis with a real data set.
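
The idea in the abstract, a mixed model whose fixed-effect part is handed to an arbitrary predictive model, can be illustrated with a rough sketch. The code below is not the chapter's estimation procedure, which the abstract does not describe. It is a simple backfitting-style loop written under several assumptions: a Gaussian response, a single grouping factor with random intercepts, moment-based variance estimates, and a piecewise-constant binned-mean fit standing in for a decision-tree-based predictor. All function and variable names (`fit_binned_mean`, `fit_hpm_sketch`, `group`) are hypothetical.

```python
import numpy as np

def fit_binned_mean(x, r, n_bins=10):
    """Stand-in 'predictive model': piecewise-constant fit of r on a single x.

    Any learner with a fit/predict interface could be swapped in here.
    """
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))

    def bin_index(values):
        # Map each value to a bin in 0..n_bins-1, clipping the end points.
        return np.clip(np.searchsorted(edges, values, side="right") - 1,
                       0, n_bins - 1)

    idx = bin_index(x)
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=r, minlength=n_bins)
    means = sums / np.maximum(counts, 1)

    def predict(x_new):
        return means[bin_index(np.asarray(x_new))]

    return predict

def fit_hpm_sketch(x, y, group, n_iter=20, fit_predictor=fit_binned_mean):
    """Alternate between the predictive model and shrunken group intercepts.

    Assumed model: y = f(x) + b[group] + noise, with b and noise Gaussian.
    `group` must hold integer labels 0..G-1.
    """
    n_groups = int(group.max()) + 1
    counts = np.maximum(np.bincount(group, minlength=n_groups), 1)
    b = np.zeros(n_groups)
    for _ in range(n_iter):
        # 1. Fit the nonparametric part to the response minus group effects.
        predict = fit_predictor(x, y - b[group])
        resid = y - predict(x)
        # 2. Re-estimate group effects from the residuals.
        group_mean = np.bincount(group, weights=resid,
                                 minlength=n_groups) / counts
        # Crude moment estimates of the two variance components.
        sigma2_e = np.var(resid - group_mean[group])
        sigma2_b = max(np.var(group_mean) - sigma2_e / counts.mean(), 1e-8)
        # Shrink each group mean toward zero, more strongly for small groups.
        shrink = sigma2_b / (sigma2_b + sigma2_e / counts)
        b = shrink * group_mean
    return predict, b
```

The design point the sketch is meant to convey is the separation of concerns: the grouping structure is handled by the shrinkage step, so `fit_predictor` can be replaced by another predictive method without touching the rest of the loop.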

Corresponding author

Correspondence to Daniel Fink.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Fink, D., Hochachka, W. (2009). Gaussian Semiparametric Analysis Using Hierarchical Predictive Models. In: Thomson, D.L., Cooch, E.G., Conroy, M.J. (eds) Modeling Demographic Processes In Marked Populations. Environmental and Ecological Statistics, vol 3. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78151-8_46