A Bayesian hurdle model for analysis of an insect resistance monitoring database

Environmental and Ecological Statistics

Abstract

Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends over time in strong resistance of stored grain insects to phosphine. The binary response variable from AGIRD, which indicates presence or absence of strong resistance, consists mostly of absence observations; the hurdle model is a two-step approach suited to such a dataset. In the first step, Bayesian classification trees identify the covariates and covariate levels associated with possible presence or absence of strong resistance. In the second step, generalized additive models (GAMs) with spike-and-slab priors for variable selection are fitted to the subset of the dataset that the Bayesian classification tree identifies as having possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance. The proposed Bayesian hurdle model is compared with its frequentist counterpart and with a naive Bayesian approach that fits a GAM to the entire dataset. The Bayesian hurdle model provides a set of good trees for use in the first step and appears to be flexible enough, relative to the frequentist model, to represent the influence of variables on strong resistance. It also captures subtle changes in the trend that the frequentist and naive Bayesian models miss.
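The two-step logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a single split on one categorical covariate stands in for the Bayesian classification tree, and raw yearly proportions stand in for the GAM with spike-and-slab priors. The field names (`year`, `site_type`, `strong`) and the toy records are invented purely for illustration and are not AGIRD values.

```python
from collections import defaultdict


def split_hurdle(records, covariate):
    """Step 1 stand-in: treat each level of one covariate as a leaf node."""
    by_level = defaultdict(list)
    for rec in records:
        by_level[rec[covariate]].append(rec)
    # A level is "purely absent" if none of its records shows strong resistance.
    absent_levels = {
        level for level, recs in by_level.items()
        if not any(r["strong"] for r in recs)
    }
    # Only records outside the purely absent levels go on to step 2.
    retained = [r for r in records if r[covariate] not in absent_levels]
    return absent_levels, retained


def yearly_proportion(records):
    """Step 2 stand-in: raw proportion of strong resistance per year."""
    counts = defaultdict(lambda: [0, 0])  # year -> [presences, total]
    for rec in records:
        counts[rec["year"]][0] += int(rec["strong"])
        counts[rec["year"]][1] += 1
    return {year: pres / tot for year, (pres, tot) in sorted(counts.items())}


# Toy records, made up for this sketch only.
records = [
    {"year": 2005, "site_type": "F", "strong": False},
    {"year": 2005, "site_type": "CS", "strong": True},
    {"year": 2005, "site_type": "CS", "strong": False},
    {"year": 2006, "site_type": "F", "strong": False},
    {"year": 2006, "site_type": "CS", "strong": True},
    {"year": 2006, "site_type": "M", "strong": False},
    {"year": 2006, "site_type": "CS", "strong": True},
]

absent_levels, retained = split_hurdle(records, "site_type")
trend = yearly_proportion(retained)
```

The point of the sketch is the subsetting: observations that fall in purely absent nodes are set aside in the first step, so the second-step model is fitted only where strong resistance is possible. A naive approach would instead pass all of `records` to the second step.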


Acknowledgments

The authors would like to thank Dr Clair Alston for very helpful comments on the manuscript. Drs Falk, Nayak, Low Choy and Collins would also like to acknowledge the support of the Australian Government's Cooperative Research Centres Program.

Author information

Corresponding author

Correspondence to Matthew G. Falk.

Additional information

Handling Editor: Pierre Dutilleul.

Appendices

Appendix 1: Data dictionary

Commodity   Description
2           Barley
8           Chickpeas
15          Feed
23          Maize
25          Mixed grain
26          Mung beans
27          Oats
35          Sorghum
37          Waste
38          Unknown
39          Wheat
57          Bran
127         Millet
999         Other

Region      Description
C           Central Queensland
SEBEN       South East Queensland—East and North
SEC         South East Queensland—Central
SES         South East Queensland—South
SEW         South East Queensland—West

Site type   Description
CS          Bulk handler
F           Farm
M           Merchant

Storage type   Description
B              Bunker
D              Shed
I              Silo
N              Unknown
S              Sealed storage
U              Unsealed storage

Grain storage treatments

  • Actellic
  • Aeration
  • Bioresmethrin
  • Carbaryl
  • Delta
  • Dichlorvos
  • Dryacide
  • Fenitrothion
  • Insect growth regulator
  • None
  • Other
  • Phosphine
  • Phosphine (siroflow)
  • Pyrethrins/minor pyrethroid
  • Reldan
  • Unknown
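As an illustration of how the coded fields in this data dictionary might be used, the following Python sketch maps commodity, site type and storage type codes to their descriptions. The codes and descriptions are taken from the tables above; the record field names (`commodity`, `site_type`, `storage_type`) and the `decode` helper are assumptions made for this sketch and do not describe the actual AGIRD schema. Region codes could be handled in the same way.

```python
# Code-to-description lookups copied from the data dictionary.
COMMODITY = {
    2: "Barley", 8: "Chickpeas", 15: "Feed", 23: "Maize",
    25: "Mixed grain", 26: "Mung beans", 27: "Oats", 35: "Sorghum",
    37: "Waste", 38: "Unknown", 39: "Wheat", 57: "Bran",
    127: "Millet", 999: "Other",
}
SITE_TYPE = {"CS": "Bulk handler", "F": "Farm", "M": "Merchant"}
STORAGE_TYPE = {
    "B": "Bunker", "D": "Shed", "I": "Silo",
    "N": "Unknown", "S": "Sealed storage", "U": "Unsealed storage",
}

# Hypothetical field names; the real database columns may differ.
LOOKUPS = {
    "commodity": COMMODITY,
    "site_type": SITE_TYPE,
    "storage_type": STORAGE_TYPE,
}


def decode(record):
    """Replace coded values with descriptions; pass other fields through."""
    decoded = {}
    for field, value in record.items():
        table = LOOKUPS.get(field)
        if table is None:
            decoded[field] = value  # not a coded field
        else:
            decoded[field] = table.get(value, f"Unrecognised code: {value}")
    return decoded


example = decode(
    {"commodity": 39, "site_type": "F", "storage_type": "S", "year": 2006}
)
```

Codes that do not appear in the dictionary are flagged rather than silently dropped, which makes data-entry problems visible before modelling.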

Appendix 2: Non-optimal BCARTs

See Figs. 5 and 6.

Fig. 5 BCART with the second largest number of observations classified into purely absent nodes. The same variables and splits are included as in the best tree. The first split is the same as in the best tree; however, the order of subsequent splits has changed

Fig. 6 BCART with the third largest number of observations classified into purely absent nodes. The same variables and splits are included as in the best tree. The first split is the same as in the best tree; however, the order of subsequent splits has changed
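The ordering used in these captions, by the number of observations classified into purely absent nodes, can be sketched in Python as follows. This is an illustrative sketch only: the representation of a tree as a list of leaf counts, the tree names and all counts are hypothetical and are not taken from the paper's fitted BCARTs.

```python
def purely_absent_count(leaves):
    """Total observations in leaves that contain no presence observations.

    Each leaf is a pair (n_absent, n_present).
    """
    return sum(n_absent for n_absent, n_present in leaves if n_present == 0)


def rank_trees(trees):
    """Order tree names from largest to smallest purely absent count."""
    return sorted(
        trees,
        key=lambda name: purely_absent_count(trees[name]),
        reverse=True,
    )


# Hypothetical candidate trees over the same 202 observations.
trees = {
    "tree_a": [(120, 0), (40, 0), (30, 12)],  # 160 in purely absent leaves
    "tree_b": [(100, 0), (90, 12)],           # 100 in purely absent leaves
    "tree_c": [(150, 0), (40, 12)],           # 150 in purely absent leaves
}

ranking = rank_trees(trees)
```

Under this criterion a tree is preferred when it sets aside more observations as certain absences, which leaves a smaller and more informative subset for the second step of the hurdle model.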

Cite this article

Falk, M.G., O’Leary, R., Nayak, M. et al. A Bayesian hurdle model for analysis of an insect resistance monitoring database. Environ Ecol Stat 22, 207–226 (2015). https://doi.org/10.1007/s10651-014-0294-3

  • DOI: https://doi.org/10.1007/s10651-014-0294-3
