Abstract
The Hierarchical Predictive Model (HPM) is a semiparametric mixed model in which the fixed effects are fit with a user-specified nonparametric component. This approach extends current spline-based semiparametric mixed model formulations, allowing for more flexible nonparametric estimation. Greater adaptability simplifies model specification, making it easier to analyze data sets with large numbers of predictors. Greater automation also extends the scope of exploratory analyses that may be performed with mixed models. Using an HPM, the analyst may select the predictive model that best suits their needs, exploiting the strengths of currently available predictive methods. A simulation study is used to demonstrate the advantages of accounting for known hierarchical structure in predictive models and to illustrate the adaptability of current decision-tree-based predictive models. An HPM of the relative abundance of the North American House Finch (Carpodacus mexicanus) is used to demonstrate exploratory analysis with a real data set.
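One way to read the model described above is as a standard Gaussian mixed model whose fixed-effects term is replaced by an unspecified function. The notation below is an illustrative assumption and is not taken from the chapter: $y$ is the response vector, $X$ the predictors, $Z$ the design matrix that encodes the known hierarchical structure, $b$ the random effects, and $f$ the user-specified nonparametric component.

```latex
\begin{align*}
  y &= f(X) + Z b + \varepsilon, \\
  b &\sim N(0, \Sigma_b), \qquad
  \varepsilon \sim N(0, \sigma^2 I).
\end{align*}
```

Under this reading, a spline-based formulation corresponds to representing $f$ with a spline basis, whereas an HPM would let $f$ be estimated by whichever predictive method the analyst chooses, for example a decision-tree-based learner.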
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Fink, D., Hochachka, W. (2009). Gaussian Semiparametric Analysis Using Hierarchical Predictive Models. In: Thomson, D.L., Cooch, E.G., Conroy, M.J. (eds) Modeling Demographic Processes In Marked Populations. Environmental and Ecological Statistics, vol 3. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78151-8_46
DOI: https://doi.org/10.1007/978-0-387-78151-8_46
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-78150-1
Online ISBN: 978-0-387-78151-8
eBook Packages: Mathematics and Statistics (R0)