Abstract
Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends over time in strong resistance of stored grain insects to phosphine. The binary response variable from AGIRD, which indicates presence or absence of strong resistance, is dominated by absence observations, and the hurdle model is a two-step approach suited to analyzing such a dataset. In the first step, the proposed hurdle model uses Bayesian classification trees to identify the covariates and covariate levels associated with possible presence or absence of strong resistance. In the second step, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset that the Bayesian classification tree identifies as having possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and to a naive Bayesian approach that fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step. Compared with the frequentist model, it appears to provide enough flexibility to represent the influence of variables on strong resistance, and it also captures subtle changes in the trend that are missed by the frequentist and naive Bayesian models.
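The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a simple presence-rate rule stands in for the Bayesian classification tree, and a per-year presence proportion stands in for the GAM with spike and slab priors. The function names, the field names (`site_type`, `year`, `resistant`) and the toy records are all hypothetical and are not AGIRD data.

```python
from collections import defaultdict


def split_by_hurdle(records, covariate, min_rate=0.0):
    """Step 1 (stand-in for the classification tree).

    Return the levels of `covariate` whose observed presence rate
    exceeds `min_rate`, i.e. levels where presence appears possible.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [presences, total]
    for rec in records:
        counts[rec[covariate]][0] += rec["resistant"]
        counts[rec[covariate]][1] += 1
    return {level for level, (pos, n) in counts.items() if pos / n > min_rate}


def yearly_presence_rate(records):
    """Step 2 (stand-in for the GAM trend): presence proportion per year."""
    counts = defaultdict(lambda: [0, 0])  # year -> [presences, total]
    for rec in records:
        counts[rec["year"]][0] += rec["resistant"]
        counts[rec["year"]][1] += 1
    return {year: pos / n for year, (pos, n) in sorted(counts.items())}


def hurdle_trend(records, covariate, min_rate=0.0):
    """Apply step 1, then estimate the trend only on the retained subset."""
    levels = split_by_hurdle(records, covariate, min_rate)
    subset = [rec for rec in records if rec[covariate] in levels]
    return levels, yearly_presence_rate(subset)


# Toy, made-up records for illustration only.
toy = [
    {"site_type": "F", "year": 2000, "resistant": 0},
    {"site_type": "F", "year": 2000, "resistant": 1},
    {"site_type": "F", "year": 2001, "resistant": 1},
    {"site_type": "M", "year": 2000, "resistant": 0},
    {"site_type": "M", "year": 2001, "resistant": 0},
    {"site_type": "CS", "year": 2001, "resistant": 0},
]
levels, trend = hurdle_trend(toy, "site_type")
```

In this toy example only the hypothetical farm level passes the first step, so the trend is estimated from the three farm records rather than being diluted by the absence-only levels, which is the motivation for fitting the second-step model to the subset instead of the entire dataset.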
Acknowledgments
The authors would like to thank Dr Clair Alston for the very helpful comments on the manuscript. Drs Falk, Nayak, Low Choy and Collins would also like to acknowledge the support of the Australian Government's Cooperative Research Centres Program.
Handling Editor: Pierre Dutilleul.
Appendices
Appendix 1: Data dictionary
Commodity | Description |
---|---|
2 | Barley |
8 | Chickpeas |
15 | Feed |
23 | Maize |
25 | Mixed grain |
26 | Mung beans |
27 | Oats |
35 | Sorghum |
37 | Waste |
38 | Unknown |
39 | Wheat |
57 | Bran |
127 | Millet |
999 | Other |
Region | Description |
---|---|
C | Central Queensland |
SEBEN | South East Queensland—East and North |
SEC | South East Queensland—Central |
SES | South East Queensland—South |
SEW | South East Queensland—West |
Site type | Description |
---|---|
CS | Bulk handler |
F | Farm |
M | Merchant |
Storage type | Description |
---|---|
B | Bunker |
D | Shed |
I | Silo |
N | Unknown |
S | Sealed storage |
U | Unsealed storage |
Grain storage treatments | |||
---|---|---|---|
Actellic | Aeration | Bioresmethrin | Carbaryl |
Delta | Dichlorvos | Dryacide | Fenitrothion |
Insect growth regulator | None | Other | Phosphine |
Phosphine (siroflow) | Pyrethrins/minor pyrethroid | Reldan | Unknown |
Appendix 2: Non optimal BCARTs
Cite this article
Falk, M.G., O’Leary, R., Nayak, M. et al. A Bayesian hurdle model for analysis of an insect resistance monitoring database. Environ Ecol Stat 22, 207–226 (2015). https://doi.org/10.1007/s10651-014-0294-3
DOI: https://doi.org/10.1007/s10651-014-0294-3