Extending Ordinal Regression with a Latent Zero-Augmented Beta Distribution

Journal of Agricultural, Biological and Environmental Statistics

Abstract

Ecological abundance data are often recorded on an ordinal scale in which the lowest category represents species absence. One common example is when plant species cover is visually assessed within bounded quadrats and then assigned to pre-defined cover class categories. We present an ordinal beta hurdle model that directly models ordinal category probabilities with a biologically realistic beta-distributed latent variable. A hurdle-at-zero model allows ecologists to explore distribution (absence) and abundance processes in an integrated framework. This provides an alternative to cumulative link models when data are inconsistent with the assumption that the odds of moving into a higher category are the same for all categories (proportional odds). Graphical tools and a deviance information criterion were developed to assess whether a hurdle-at-zero model should be used for inferences rather than standard ordinal methods. Hurdle-at-zero and non-hurdle ordinal models fit to vegetation cover class data produced substantially different conclusions. The ordinal beta hurdle model yielded more precise parameter estimates than cumulative logit models, although out-of-sample predictions were similar. The ordinal beta hurdle model provides inferences directly on the latent biological variable of interest, percent cover, and supports exploration of more realistic ecological patterns and processes through the hurdle-at-zero or two-part specification. We provide JAGS code as an on-line supplement. Supplementary materials accompanying this paper appear on-line.
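To make the model structure described in the abstract concrete, the following is a minimal sketch, not the authors' JAGS implementation. It assumes a mean-precision parameterization of the beta distribution (`mu`, `phi`), a single hurdle probability `p_absent`, and illustrative Daubenmire-style cover class cutpoints; all of these names and values are hypothetical. The probability of each non-zero category is the latent beta probability mass between adjacent cutpoints, scaled by the probability of presence.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def ordinal_beta_hurdle_probs(p_absent, mu, phi, cutpoints):
    """Category probabilities: index 0 is absence, the rest are cover classes.

    Assumed (hypothetical) parameterization: latent cover ~ Beta(mu*phi, (1-mu)*phi)
    given presence; p_absent is the hurdle probability of a zero.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    edges = [0.0] + list(cutpoints) + [1.0]
    cdf = [beta_cdf(e, a, b) for e in edges]
    present = [(1.0 - p_absent) * (cdf[i + 1] - cdf[i]) for i in range(len(edges) - 1)]
    return [p_absent] + present


# Illustrative cutpoints on the proportion scale (an assumption, not taken from the paper).
DAUBENMIRE_CUTS = [0.05, 0.25, 0.50, 0.75, 0.95]

if __name__ == "__main__":
    probs = ordinal_beta_hurdle_probs(p_absent=0.3, mu=0.2, phi=5.0,
                                      cutpoints=DAUBENMIRE_CUTS)
    for k, p in enumerate(probs):
        print(f"category {k}: {p:.4f}")
```

Because the absence probability is separate from the latent beta distribution, covariates could in principle act differently on presence and on abundance, which is the two-part flexibility the abstract contrasts with proportional odds models.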

Acknowledgments

We thank Dr. Megan D. Higgs for early discussions on this work and her assistance with WinBUGS code for clipping latent distributions. We appreciate the encouragement and interest in this work shown by Dr. Brian Gray. We also thank Dr. Andrew Hoegh, two anonymous reviewers, and our associate editor for their comments and suggestions for revising our paper. The work by K. M. Irvine was funded through Interagency Agreement P12PG70586 with the National Park Service. T. J. Rodhouse was funded by the Upper Columbia Basin Network Inventory and Monitoring Program of the National Park Service. I. N. Keren’s participation was secured by an interagency agreement with Montana State’s Institute on Ecosystems, with funding from the North Central Climate Science Center. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Author information

Corresponding author

Correspondence to Kathryn M. Irvine.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 237 KB)

Supplementary material 2 (csv 99 KB)

About this article

Cite this article

Irvine, K.M., Rodhouse, T.J. & Keren, I.N. Extending Ordinal Regression with a Latent Zero-Augmented Beta Distribution. JABES 21, 619–640 (2016). https://doi.org/10.1007/s13253-016-0265-2

  • DOI: https://doi.org/10.1007/s13253-016-0265-2
