Environmental and Ecological Statistics, Volume 24, Issue 1, pp 39–68

Composite likelihood approach to the regression analysis of spatial multivariate ordinal data and spatial compositional data with exact zero values

  • Xiaoping Feng
  • Jun Zhu
  • Pei-Sheng Lin
  • Michelle M. Steen-Adams

Abstract

In many environmental and ecological studies, it is of interest to model compositional data. One approach is to consider positive random vectors that are subject to a unit-sum constraint. In landscape ecological studies, it is common that compositional data are also sampled in space with some elements of the composition absent at certain sampling sites. In this paper, we first propose a practical spatial multivariate ordered probit model for multivariate ordinal data, where the response variables can be viewed as the discretized non-negative compositions without the unit-sum constraint. We then propose a novel two-stage spatial mixture Dirichlet regression model. The first stage models the spatial dependence and the presence of exact zero values, and the second stage models all the non-zero compositional data. A maximum composite likelihood approach is developed for parameter estimation and inference in both the spatial multivariate ordered probit model and the two-stage spatial mixture Dirichlet regression model. The standard errors of the parameter estimates are computed by an estimate of the Godambe information matrix. A simulation study is conducted to evaluate the performance of the proposed models and methods. A land cover data example in landscape ecology further illustrates that accounting for spatial dependence can improve the accuracy in the prediction of presence/absence of different land covers as well as the magnitude of land cover compositions.
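The two-stage idea described in the abstract can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' specification: the function name `simulate_two_stage`, the exponential correlation function, the zero threshold on the latent field, the independence of the latent fields across components, and the rule that forces at least one component to be present are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_two_stage(coords, mean_latent, alpha, range_par=1.0, seed=0):
    """Simulate compositions with exact zeros at n sites for D components.

    coords:      (n, 2) site coordinates
    mean_latent: (n, D) mean of the latent Gaussian variables (stage 1)
    alpha:       (n, D) positive Dirichlet parameters (stage 2)
    """
    rng = np.random.default_rng(seed)
    n, D = mean_latent.shape

    # Stage 1: spatially correlated latent Gaussian variables, one field per
    # component. The exponential correlation is an assumption of this sketch.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    corr = np.exp(-dist / range_par)
    chol = np.linalg.cholesky(corr + 1e-10 * np.eye(n))
    latent = mean_latent + chol @ rng.standard_normal((n, D))

    # A component is present where its latent value exceeds zero (assumed).
    present = latent > 0.0

    # Guard (assumption): a composition needs at least one non-zero part,
    # so sites with no component present keep the largest latent one.
    for i in np.flatnonzero(~present.any(axis=1)):
        present[i, latent[i].argmax()] = True

    # Stage 2: Dirichlet draw over the components that are present;
    # absent components stay at exactly zero.
    comp = np.zeros((n, D))
    for i in range(n):
        idx = np.flatnonzero(present[i])
        if idx.size > 1:
            comp[i, idx] = rng.dirichlet(alpha[i, idx])
        else:
            comp[i, idx] = 1.0
    return present, comp


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    coords = rng.uniform(0, 5, size=(30, 2))
    present, comp = simulate_two_stage(
        coords, np.zeros((30, 3)), np.full((30, 3), 2.0)
    )
    print(present[:5])
    print(comp[:5].round(3))
```

In a regression setting, `mean_latent` and `alpha` would be driven by covariates (for example through linear predictors and a log link for `alpha`), which this sketch leaves out.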

Keywords

Dirichlet regression model; Gaussian latent variable; Godambe information; Mixture model; Multivariate ordered probit model; Spatial prediction

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Xiaoping Feng (1)
  • Jun Zhu (1, 2)
  • Pei-Sheng Lin (3)
  • Michelle M. Steen-Adams (4)

  1. Department of Statistics, University of Wisconsin, Madison, USA
  2. Department of Entomology, University of Wisconsin, Madison, USA
  3. Division of Biostatistics and Bioinformatics, National Health Research Institutes, Zhunan, Taiwan
  4. Department of Environmental Studies, University of New England, Biddeford, USA
