Methods of analysis for georeferenced sample counts of tarnished plant bugs in cotton

Precision Agriculture

Abstract

The problem of analyzing georeferenced cotton pest insect samples when a large percentage of the counts are zero is examined. The use of appropriate statistical methods for their analysis is required. To demonstrate this, georeferenced samples (n = 63) of tarnished plant bugs (TPBs; Lygus lineolaris [Palisot de Beauvois] (Heteroptera: Miridae)) were analyzed by three statistical methods and the results were compared. Correlation analysis of the sample counts with 25 classes of cotton growth derived from an unsupervized classification of multispectral imagery was followed by a complete enumeration analysis comprising three scenarios. The first scenario assumed the insect samples were unstratified. A distribution of sample averages was created by complete enumeration of all combinations of samples taken four at a time. The second scenario used imagery of the cotton fields to allocate the samples among three cotton growth categories (marginal, good or best) derived by a supervized classification of the 25 unsupervized classes. The insect samples associated with these categorical habitats were completely enumerated using allocations of 4, 6, 8 or 10 samples at a time from various sample sizes to determine how different allocations affected the results. The mean was not affected, but the standard deviation decreased with increased allocation sizes in all habitats. The third scenario used the two observers and three habitat categories to create six additional complete enumeration distributions by allocating four samples at a time from groups of varying sample sizes. These enumeration distributions are non-parametric estimators of the sampling distribution of: (1) the sample averages of a given sample size when samples are taken from the entire field, (2) the sample averages of a given size when samples are taken from each cotton habitat or (3) the sample averages of a given sample size from samples taken from each habitat by each observer. 
To support the enumeration analyses, these insect samples were analyzed further by Poisson regression models. These models showed significant differences between TPB counts by the two observers and among the habitats, whereas the observer by habitat interaction was not significant. For every combination of observer and cotton growth category, a Poisson regression model estimated the mean rate of TPB numbers. These means were similar to the corresponding modes of the complete enumeration distributions. The two non-standard methods showed that TPB numbers differed by habitat categories even though there were samples with a zero count, whereas a correlation analysis failed to identify a relationship between TPB sample counts and unsupervized habitat classes.
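The Poisson regression step described above can be illustrated with a minimal sketch. It is not the authors' analysis: the counts below are invented for illustration, the model has a single hypothetical two-level habitat indicator rather than the observer and habitat factors used in the paper, and the helper name `fit_poisson` is an assumption. The sketch fits log(mean) = b0 + b1·x by Newton-Raphson maximum likelihood and then computes a Pearson dispersion statistic as a rough check for over-dispersion.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson maximum likelihood."""
    b0 = math.log(sum(y) / len(y))  # start at the log of the overall mean
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: negative Hessian of the log-likelihood
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical TPB counts (not data from the paper), with many zeros.
# habitat: 0 = one growth category, 1 = another (illustrative labels only)
habitat = [0] * 8 + [1] * 8
counts = [0, 0, 1, 0, 0, 2, 0, 1, 1, 0, 3, 2, 0, 4, 1, 1]

b0, b1 = fit_poisson(habitat, counts)
rate_0 = math.exp(b0)       # estimated mean count where habitat == 0
rate_1 = math.exp(b0 + b1)  # estimated mean count where habitat == 1

# Pearson chi-square divided by residual degrees of freedom;
# values well above 1 would suggest over-dispersion.
fitted = [math.exp(b0 + b1 * xi) for xi in habitat]
pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(counts, fitted))
dispersion = pearson / (len(counts) - 2)

print(rate_0, rate_1, dispersion)
```

With a single categorical predictor, the fitted rates coincide with the group sample means, which makes the sketch easy to check by hand. Zero counts pose no difficulty for the fit as long as each group has a positive mean.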

References

  • Agresti, A. (1996). An introduction to categorical data analysis. New York: John Wiley & Sons.

  • Allison, P. D. (1999). Logistic regression using the SAS system: Theory and application. Cary: SAS® Institute Inc.

  • Beyer, W. H. (1968). Handbook of tables for probability and statistics (2nd ed.). Cleveland: CRC Press.

  • Blom, P. E., Fleischer, S. J., & Smilowitz, Z. (2002). Spatial and temporal dynamics of Colorado potato beetle (Coleoptera: Chrysomelidae) in fields with perimeter and spatially targeted insecticides. Environmental Entomology, 31, 149–159.

  • Boydell, B., & McBratney, A. B. (2002). Identifying potential within-field management zones from cotton-yield estimates. Precision Agriculture, 3, 9–23. doi:10.1023/A:1013318002609.

  • Bugayevskiy, L. M., & Snyder, J. P. (1995). Map projections. A reference manual. Philadelphia: Taylor and Francis.

  • Campenella, R. (2000). Testing components toward a remote-sensing-based decision support system for cotton production. Photogrammetric Engineering and Remote Sensing, 66, 1219–1227.

  • Candy, S. G. (2000). The application of generalized linear mixed models to multi-level sampling for insect population monitoring. Environmental and Ecological Statistics, 7, 217–238. doi:10.1023/A:1009662915320.

  • Confalonieri, R., Acutis, M., Bellocchi, G., & Genovese, G. (2007). Resampling-based software for estimating optimal sample size. Environmental Modelling & Software, 22, 1796–1800. doi:10.1016/j.envsoft.2007.02.006.

  • Desouhant, E., Debouzie, D., & Menu, F. (1998). Oviposition pattern of phytophagous insects: On the importance of host population heterogeneity. Oecologia, 114, 382–388. doi:10.1007/s004420050461.

  • Dupont, J. K., Campanella, R., Seal, M. R., Willers, J. L., & Hood, K. B. (2000). Spatially variable insecticide applications through remote sensing. In P. Duggar & D. Richter (Eds.), 2000 Proceedings of the Beltwide Cotton Conferences, vol. 2 (pp. 426–429). Memphis, TN: National Cotton Council.

  • Efron, B. (1981). Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods. Biometrika, 68, 589–599. doi:10.1093/biomet/68.3.589.

  • Elvidge, C. D., & Chen, Z. (1995). Comparison of broad-band and narrow-band red and near-infrared vegetation indices. Remote Sensing of Environment, 54, 38–48. doi:10.1016/0034-4257(95)00132-K.

  • Fleischer, S. J., Blom, P. E., & Weisz, R. (1999). Sampling in precision IPM: When the objective is a map. Phytopathology, 89, 1112–1118. doi:10.1094/PHYTO.1999.89.11.1112.

  • Hall, D. B. (2000). Zero-inflated Poisson and binomial regression with random effects: A case study. Biometrics, 56, 1030–1039. doi:10.1111/j.0006-341X.2000.01030.x.

  • Hinde, J., & Demétrio, C. G. B. (1988). Overdispersion: Models and estimation. Computational Statistics and Data Analysis, 27, 151–170. doi:10.1016/S0167-9473(98)00007-3.

  • Iwi, G., Millard, R. K., Palmer, A. M., Preece, A. W., & Saunders, M. (1999). Bootstrap resampling: A powerful method of assessing confidence intervals for doses from experimental data. Physics in Medicine and Biology, 44, N55–N62.

  • Jensen, J. R. (2000). Remote sensing of the environment: An earth resource perspective. Upper Saddle River: Prentice-Hall.

  • Johnston, G. (1993). SAS® software to fit the General Linear Model. In Proceedings of the Thirteenth Annual SAS Users Group International Conference. Cary, NC: SAS® Institute Inc. http://support.sas.com/rnd/app/papers/papers_da.html. Accessed 2 September 2008.

  • Karandinos, M. G. (1976). Optimum sample size and comments on some published formulae. Bulletin of the Entomological Society of America, 22, 417–421.

  • Kennedy, M. (1996). The global positioning system: An introduction. Chelsea: Ann Arbor Press, Inc.

  • Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks: Sage Publications.

  • Manly, B. F. J. (1997). Randomization, bootstrap, and Monte Carlo methods in biology (2nd ed.). London: Chapman and Hall.

  • Midgarden, D., Fleischer, S. J., Weisz, R., & Smilowitz, Z. (1997). Site-specific integrated pest management impact on development of esfenvalerate resistance in Colorado Potato Beetle (Coleoptera: Chrysomelidae) and on densities of natural enemies. Journal of Economic Entomology, 90, 855–867.

  • Musser, F., Stewart, S., Bagwell, R., Lorenz, G., Catchot, A., Burris, E., et al. (2007). Comparison of direct and indirect sampling methods for tarnished plant bug (Hemiptera: Miridae) in flowering cotton. Journal of Economic Entomology, 100, 1916–1923. doi:10.1603/0022-0493(2007)100[1916:CODAIS]2.0.CO;2.

  • Myers, R. H., & Montgomery, D. C. (1997). A tutorial on generalized linear models. Journal of Quality Technology, 29, 274–291.

  • Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society. Series A, 135, 370–384. doi:10.2307/2344614.

  • Nyrop, J. P., Foster, R. E., & Onstad, D. W. (1986). Value of sample information in pest control decision making. Journal of Economic Entomology, 79, 1421–1429.

  • Park, H. M. (2005). Regression models for event count data using SAS, STATA, and LIMDEP. http://www.indiana.edu/~statmath/stat/all/count/count.html. Accessed 2 September 2008.

  • Pedigo, L. P., & Buntin, G. D. (Eds.). (1994). Handbook of sampling methods for arthropods in agriculture. Boca Raton: CRC Press.

  • Piegorsch, W. W., & Bailer, A. J. (2005). Analyzing environmental data. Chichester: Wiley and Sons.

  • Pouncey, R., Swanson, K., & Hart, K. (Eds.). (1999). ERDAS field guide (5th ed.). Atlanta: ERDAS.

  • Richards, J. A., & Jia, X. (1999). Remote sensing digital image analysis. An introduction (3rd ed.). Berlin: Springer-Verlag.

  • Ridout, M., Demétrio, C. G. B., & Hinde, J. P. (1998). Models for count data with many zeros. In Proceedings of the XIXth International Biometric Conference (pp. 179–192). Cape Town: International Biometrics Society.

  • Rouse, J. W., Jr., Haas, R. H., Deering, D. W., Schell, J. A., & Harlan, J. C. (1974). Monitoring the vernal advancement and retrogradation (greenwave effect) of natural vegetation. NASA/GSFC Type III Final Rpt. Greenbelt, MD: Goddard Space Flight Center.

  • Schabenberger, O., & Pierce, F. J. (2002). Contemporary statistical models for the plant and soil sciences. Boca Raton: CRC Press.

  • Seavy, N. E., Quader, S., Alexander, J. D., & Ralph, C. J. (2005). Generalized linear models and point count data: Statistical considerations for the design and analysis of monitoring studies. In C. J. Ralph & T. D. Rich (Eds.), Bird conservation implementation and integration in the Americas (pp. 744–753). Albany: Pacific Southwest Research Station, Forest Service, U.S. Department of Agriculture. USDA Forest Service Gen. Tech. Rep. PSW-GTR-191.

  • Shaw, D. R., & Willers, J. L. (2006). Improving pest management with remote sensing. Outlooks on Pest Management, 17, 197–201. doi:10.1564/17oct03.

  • Sloane, N. J. A. (2007). The on-line encyclopedia of integer sequences. http://www.research.att.com/~njas/sequences/. Accessed 2 September 2008.

  • Slymen, D. J., Ayala, G. X., Arredondo, E. M., & Elder, J. P. (2006). A demonstration of modeling count data with an application to physical activity. Epidemiologic Perspectives and Innovations, 3, 3. http://www.epi-perspectives.com/content/3/1/3. Accessed 2 September 2008. doi:10.1186/1742-5573-3-3.

  • Stern, V. M., Smith, R. F., van den Bosch, R., & Hagen, K. S. (1959). The integrated control concept. Hilgardia, 29, 81–101.

  • Stokes, M. E., Davis, C. S., & Koch, G. G. (2000). Categorical data analysis using the SAS® system (2nd ed.). Cary: SAS® Institute Inc.

  • Theobald, D. M. (2003). GIS concepts and ARCGIS methods. Fort Collins: Conservation Planning Technologies.

  • Thompson, S. K. (1992). Sampling. New York: Wiley-Interscience.

  • Trumble, J. T. (1985). Implications of changes in arthropod distribution following chemical applications. Researches on Population Ecology, 27, 277–285. doi:10.1007/BF02515466.

  • Willers, J. L., & Akins, D. C. (2000). Sampling for tarnished plant bugs in cotton. Southwestern Entomologist, Suppl. no. 23, 39–57.

  • Willers, J. L., Ladner, W. L., McKinion, J. M., & Cooke, W. H. (2000). Application of computer intensive methods to evaluate the performance of a sampling design for use in cotton insect pest management. In G. A. Milliken (Ed.), Proceedings of the 2000 Kansas State University Conference on Applied Statistics in Agriculture (pp. 119–133). Manhattan, KS: Kansas State University.

  • Willers, J. L., Jenkins, J. N., Ladner, W. L., Gerard, P. D., Boykin, D. L., Hood, K. B., et al. (2005). Site-specific approaches to cotton insect control. Sampling and remote sensing analysis techniques. Precision Agriculture, 6, 431–452. doi:10.1007/s11119-005-3680-x.

  • Willers, J. L., McKinion, J. M., & Jenkins, J. N. (2006). Remote sensing, sampling and simulation applications in analyses of insect dispersion and abundance in cotton. In C. Aguirre-Bravo, P. J. Pellicane, D. P. Burns, & S. Draggan (Eds.), Monitoring Science and Technology Symposium: Unifying Knowledge for Sustainability in the Western Hemisphere (pp. 879–886). USDA Forest Service Proceedings RMRS-P-42CD. Fort Collins, CO: U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station. http://www.fs.fed.us/rm/pubs/rmrs_p042.pdf. Accessed 2 September 2008.

  • Willers, J. L., Milliken, G. A., Jenkins, J. N., O’Hara, C. G., Gerard, P. D., Reynolds, D. B., et al. (2008). Defining the experimental unit for the design and analysis of site-specific experiments in commercial cotton fields. Agricultural Systems, 96, 237–249. doi:10.1016/j.agsy.2007.09.003.

  • Willers, J. L., Seal, M. R., & Luttrell, R. G. (1999). Remote sensing, line-intercept sampling for tarnished plant bugs (Heteroptera: Miridae) in Mid-South cotton. Journal of Cotton Science, 3, 160–170.

  • Wilson, L. T. (1994). Estimating abundance, impact, and interactions among arthropods in cotton agroecosystems. In L. P. Pedigo & G. D. Buntin (Eds.), Handbook of sampling methods for arthropods in agriculture (pp. 475–514). Boca Raton: CRC Press.

  • Wilson, L. T., Sterling, W. L., Rummel, D. R., & De Vay, J. E. (1989). Quantitative sampling principles in cotton IPM. In R. E. Frisbie, K. M. El-Zik, & L. T. Wilson (Eds.), Integrated pest management systems and cotton production (pp. 85–119). New York: John Wiley & Sons.

Acknowledgements

Appreciation is expressed to Ms. Debbie Boykin and Dr. Gordon Snodgrass (ARS, USDA, Stoneville, MS), Mr. Ronald E. Britton (ARS, USDA, Mississippi State), Drs. Wes Burger and Bronson Strickland (Department of Wildlife and Fisheries, Mississippi State University), Dr. Fred Musser (Department of Plant Pathology and Entomology, Mississippi State University), Dr. Matt Kramer (ARS, USDA, Beltsville, MD), Dr. Jiaxing Wu (Department of Plant and Soil Sciences, Mississippi State University) and Yongfeng Zhao (Department of Mathematics and Statistics, Mississippi State University) for their helpful discussions or reviews during the preparation of this manuscript. We thank the editor and two anonymous reviewers for other helpful suggestions. Financial support was provided by Advanced Spatial Technologies for Agriculture (ASTA-322-298) and the USDA Area-Wide Tarnished Plant Bug Management Project (through ARS CRIS Project 6406-21610-006-00D). Approved for publication as Journal Article No. J-11245 of the Mississippi Agricultural and Forestry Experiment Station, Mississippi State University.

Disclaimer

Mention of a trademark, proprietary product, or vendor does not constitute guarantee or warranty of the product by the U.S. Department of Agriculture and does not imply its approval to the exclusion of other products or vendors that may also be suitable.

Author information

Corresponding author

Correspondence to J. L. Willers.

Appendix

To develop the analysis processes and concepts of this paper, the following definitions are given:

  1. Buffer: A GIS polygon, typically a circle of a specified radius, used to define a region within which specialized spatial processing operations are restricted.

  2. Complete enumeration: A resampling method where all possible combinations of a sample vector are derived by a combinatorial rule expressed as:

    $$ C(n,m) = \frac{n!}{m!\,(n-m)!} $$

    which defines the number of combinations (Beyer 1968) of n distinct items taken m at a time. The combinatorial rule was applied to enumerate all allocations (m) of a chosen sample size (n) of tarnished plant bug (TPB) samples without repeating the selection of sample counts contained in any prior combination.

  3. Complete enumeration distribution: The histogram that describes the frequency of events (i.e. the mean of samples taken m at a time) that result from the complete enumeration of the sample vector (a set of n samples).

  4. Count data: Integer value for the number of items of interest counted in a sample of a specified sample unit size.

  5. Habitat category: A collection of pixels whose attribute values belong to the same category and represent cotton whose phenology of growth is more similar than other categories (Willers et al. 2005). In a specialized usage, it can be a synonym for the GIS concept of a zone in raster processing (Theobald 2003).

  6. Over-dispersion: This occurs when count data show greater variability than is predicted by the assumed mean-variance relationship (Hinde and Demétrio 1988; Long 1997). If ignored, the selected statistical model may underestimate the standard errors and result in inappropriate inferences on the estimated regression parameters.

  7. Peak bloom: That period during the production season when the plants produce the maximum number of open flowers per plant.

  8. Poisson regression: A regression technique based upon the Poisson probability mass function where the estimates of parameters are derived by maximum likelihood methods.

  9. Raster format: A GIS file format where the features are represented as a collection of cells of the same size and shape for a specified number of rows and columns. Each cell is georeferenced and only one integer valued attribute is assigned to each cell (Theobald 2003).

  10. Sample: A set of observations obtained with a particular sampling method (Willers et al. 2005) at a geographic location in a farm field. The response, or dependent variable, is the number (or count) of insects found in the sample.

  11. Sample size: The number of samples (n) collected by an observer while using a specified sampling method.

  12. Sample unit size: In this study, the area of ground represented by a single sample obtained with a specified sampling method (Willers et al. 2000).

  13. Sample vector: The collection of n samples collected by an observer during a specific period of time. The sample vector can be unstratified (no ancillary data associated with the sample) or geographically stratified (ancillary data associated with the sample). Important ancillary data (Willers et al. 2005) associated with a sample are the sampler, its geographic coordinate (or label), and the cotton habitat class from which the sample was obtained. This concept must not be confused with vector data in GIS processing.

  14. Simple random sample: The simplest sampling design in which a sample site is chosen by a randomization process from the set, or collection, of all possible sites and where each site has an equally probable chance of being selected.

  15. Stratified, simple random sample: A sample design derivative where the samples are stratified according to some type of classification of remote sensing information that determines the number and geographical extent of the different strata. A simple random sampling design in each stratum determines the summary statistics of each stratum (Willers et al. 2005). This differs from a stratified sampling design where summary statistics for a population of interest are derived using samples from all the strata.

  16. Supervized classification: A classification of image pixels closely controlled by the analyst (Pouncey et al. 1999). In this process, the analyst selects pixels that represent patterns or land cover features that are recognized, or identified, with help from other sources, such as aerial photos, ground truth data, or maps. Knowledge of the data, and of the classes desired, is required before classification.

  17. Unsupervized classification: A computer-automated process that depends on the data itself for the definition of classes (Pouncey et al. 1999). It enables the analyst to specify some parameters that the computer uses to uncover statistical patterns inherent in the data.

  18. Vector format: A GIS file format where geographical features are represented by points whose x–y coordinates are used to construct points, lines and areas (Theobald 2003).

  19. Zonal average of a buffer: The average value of the attribute of interest for all pixels whose centers lie within the boundary of the buffer polygon (Theobald 2003).

  20. Zonal maximum of a buffer: The maximum value of the attribute of interest from the set of pixels whose centers lie within the boundary of the buffer polygon (Theobald 2003).
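The definitions of complete enumeration and the complete enumeration distribution can be sketched in a few lines of Python. This is an illustrative sketch only: the sample vector below is invented (it is not data from the study), and the function name `complete_enumeration` is a hypothetical helper. It enumerates every combination of m samples from a sample vector of size n, records the mean of each combination, and tabulates the resulting distribution.

```python
from collections import Counter
from itertools import combinations
from math import comb
from statistics import pstdev

def complete_enumeration(sample_vector, m):
    """Return the mean of every combination of m samples from sample_vector."""
    return [sum(c) / m for c in combinations(sample_vector, m)]

# Hypothetical sample vector of TPB counts (n = 10) with many zeros.
counts = [0, 0, 1, 0, 2, 0, 0, 3, 1, 0]

means_4 = complete_enumeration(counts, 4)
means_6 = complete_enumeration(counts, 6)

# The number of enumerated means follows the combinatorial rule C(n, m).
assert len(means_4) == comb(len(counts), 4)

# Complete enumeration distribution: frequency of each possible sample mean.
distribution_4 = Counter(means_4)

# The mean of the enumerated means equals the mean of the sample vector;
# the spread shrinks as the allocation size m grows.
print(sum(means_4) / len(means_4), pstdev(means_4), pstdev(means_6))

# For the unstratified scenario in the abstract (n = 63, m = 4),
# the rule gives the number of combinations to enumerate.
print(comb(63, 4))
```

For larger sample vectors the number of combinations grows quickly, so a generator that accumulates the histogram directly, rather than storing every mean in a list, may be preferable.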

Cite this article

Willers, J.L., Jenkins, J.N., McKinion, J.M. et al. Methods of analysis for georeferenced sample counts of tarnished plant bugs in cotton. Precision Agric 10, 189–212 (2009). https://doi.org/10.1007/s11119-008-9085-x
