Statistical Methods for the Analysis of Time–Location Sampling Data
Time–location sampling (TLS) is useful for collecting information on a hard-to-reach population (such as men who have sex with men [MSM]) by sampling locations where persons of interest can be found, and then sampling those who attend. These studies have typically been analyzed as a simple random sample (SRS) from the population of interest. If this population is the source population, as we assume here, such an analysis is likely to be biased, because it ignores possible associations between outcomes of interest and frequency of attendance at the locations sampled. It is also likely to underestimate the uncertainty in the estimates, because it ignores both the clustering within locations and the variation in the probability of sampling among members of the population who attend sampling locations. We propose that TLS data be analyzed as a two-stage sample survey, using a simple weighting procedure based on the inverse of the approximate probability that a person was sampled and using sample survey analysis software to estimate the standard errors of estimates (to account for the effects of clustering within the first stage [locations] and variation in the weights). We use data from the Young Men’s Survey Phase II, a study of MSM, to show that, compared with an analysis assuming an SRS, weighting can affect point prevalence estimates and estimates of associations, and that weighting and clustering can substantially increase estimates of standard errors. We describe data on location attendance that would yield improved estimates of weights. We comment on the advantages and disadvantages of TLS and respondent-driven sampling.
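As a rough illustration of the contrast drawn above, the following Python sketch compares an SRS-style prevalence estimate with a weighted estimate whose standard error treats locations as first-stage clusters. It is not the authors' procedure: the data, the weights, and the function names `weighted_prevalence` and `srs_prevalence` are hypothetical, the weights are simply assumed to be inverses of approximate sampling probabilities, and the variance uses a standard Taylor-linearization formula for a ratio estimator with clusters treated as sampled with replacement.

```python
import math
from collections import defaultdict


def srs_prevalence(y):
    """Unweighted prevalence and its SE under a simple-random-sample assumption."""
    n = len(y)
    p = sum(y) / n
    return p, math.sqrt(p * (1 - p) / n)


def weighted_prevalence(y, w, cluster):
    """Weighted (ratio) prevalence estimate with a cluster-level linearization SE.

    y       -- 0/1 outcomes
    w       -- weights, assumed to be inverse approximate sampling probabilities
    cluster -- first-stage unit (location) label for each respondent
    """
    total_w = sum(w)
    p_hat = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Sum the linearized residuals within each location.
    z = defaultdict(float)
    for yi, wi, ci in zip(y, w, cluster):
        z[ci] += wi * (yi - p_hat) / total_w

    k = len(z)
    if k < 2:
        raise ValueError("need at least two clusters to estimate a variance")

    # Between-cluster variance of the residual totals (with-replacement approximation).
    mean_z = sum(z.values()) / k
    var = k / (k - 1) * sum((v - mean_z) ** 2 for v in z.values())
    return p_hat, math.sqrt(var)


# Hypothetical toy data: 8 respondents recruited at 3 locations.
y = [1, 0, 0, 1, 1, 0, 0, 0]
w = [1, 1, 4, 4, 2, 2, 4, 4]  # made-up weights; infrequent attendees get larger weights
cluster = ["A", "A", "A", "B", "B", "B", "C", "C"]

p_srs, se_srs = srs_prevalence(y)
p_wt, se_wt = weighted_prevalence(y, w, cluster)
print(f"SRS:      p = {p_srs:.3f}, SE = {se_srs:.3f}")
print(f"Weighted: p = {p_wt:.3f}, SE = {se_wt:.3f}")
```

In this toy example the weighted point estimate differs from the unweighted one, and the design-based standard error is larger than the SRS standard error, which is the qualitative pattern the abstract describes. In practice, dedicated survey analysis software would be used rather than hand-written formulas.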
Keywords: Time–location sampling, HIV, Statistical methods
We thank Christopher H. Johnson, Nevin Krishna, Lillian S. Lin, Alexandra Oster, and Ryan E. Wiegand, Centers for Disease Control and Prevention (CDC), for helpful comments on the manuscript. John Karon’s work was done as a contractor for CDC.