AIDS and Behavior, Volume 14, Issue 1, pp 218–224

Comparing Internet-Based and Venue-Based Methods to Sample MSM in the San Francisco Bay Area


  • H. F. Raymond
    • San Francisco Department of Public Health, AIDS Office
  • Greg Rebchook
    • Center for AIDS Prevention Studies, University of California
  • Alberto Curotto
    • Center for AIDS Prevention Studies, University of California
  • Jason Vaudrey
    • San Francisco Department of Public Health, AIDS Office
  • Matthew Amsden
    • Reveal Communications
  • Deb Levine
    • ISIS Inc.
  • Willi McFarland
    • San Francisco Department of Public Health, AIDS Office
Original Research

DOI: 10.1007/s10461-009-9521-6

Cite this article as:
Raymond, H.F., Rebchook, G., Curotto, A. et al. AIDS Behav (2010) 14: 218. doi:10.1007/s10461-009-9521-6


Methods of collecting behavioral surveillance data, including Web-based methods, have recently been explored in the United States. Questions have arisen as to what extent Internet recruitment methods yield samples of MSM comparable to those obtained using venue-based recruitment methods. We compare three recruitment methods among MSM with respect to demographics and risk behaviors: one sample was obtained using time location sampling at venues in San Francisco, one using an analogous venue-based approach on the Internet, and one using direct-marketing advertisements to recruit participants. The physical venue approach was more successful than either Internet approach in completing interviews with approached men. Respondents recruited via the three methods reported slight differences in risk behavior. Direct-marketing Internet recruitment can obtain large samples of MSM in a short time.


Keywords: MSM, Sampling, Risk behavior, Surveillance, Internet


HIV behavioral risk surveillance has been conducted among men who have sex with men (MSM) using venue-based approaches for almost 20 years (Lemp et al. 1994; Mackellar et al. 1996; Valleroy et al. 2000). Recently other methods have been explored to reach this population, such as respondent driven sampling (RDS) (Heckathorn 1997; Magnani et al. 2005). Recruiting MSM for research via the Internet has been occurring for a number of years (Bowen et al. 2004; Mustanski 2007; Rosser et al. 2008; Ross et al. 2000). However, only recently has the Internet been used as a recruitment approach for routine behavioral surveillance surveys (Elford et al. 2004a, b; Hidaka et al. 2006; Wang and Ross 2002; Zhang et al. 2007). The literature suggests that data collected on the Internet has internal validity and is often comparable to data collected via other methods (Gosling et al. 2004; Mustanski 2001). Convenience sampling has dominated Internet research to date (Bowen et al. 2004; Mustanski 2007; Rhodes et al. 2002; Ross et al. 2000) and samples recruited online have been equated to mail-in surveys with their inherent limitations, such as response bias (Couper et al. 2001).

It is still unknown whether the external validity of data collected on the Internet is comparable to that of data collected using other methods, and whether Internet-based samples are representative of the greater target population. In the case of MSM-focused behavioral research, some researchers have found that Internet samples report higher risk-taking behavior than offline samples, but that Internet surveys may reach MSM not reached through venue-based approaches (Elford et al. 2004a; Evans et al. 2007; Ross et al. 2000; Wang and Ross 2002). Others have concluded that Internet and bar samples of MSM are similar (Rhodes et al. 2002). Many researchers contend that rigorous sampling is pivotal if Internet-based behavioral studies are to support inferences about the larger population in question (Couper 2007; Elford et al. 2004b). Key components of rigorous sampling are the ability to document the size of the population from which the study sample is drawn (enumeration), eligibility rates, participation rates, and completion rates or their inverse, drop-out rates. Studies have documented eligibility, participation and drop-out rates, but none have reported on attempts to enumerate the size of the population from which the sample was obtained (Elford et al. 2004b; Mustanski 2007). Finally, the question of whether MSM recruited on the Internet could also have been recruited in physical venues, and vice versa, is important in deciding which sampling approach might be most useful in this population. For a useful overview of issues related to Internet surveys, see Pequegnat et al. (2007).
In this paper we seek to add to this body of knowledge by presenting comparisons of sampling outcomes and exploring the differences between three samples of San Francisco Bay Area MSM, two recruited on the Internet and one in physical venues in the city of San Francisco.


The US Centers for Disease Control and Prevention (CDC) sponsors the National HIV Behavioral Surveillance (NHBS), which employs time location sampling (TLS) to survey MSM at community venues frequented by MSM. NHBS has been described in detail elsewhere (CDC 2005; MacKellar et al. 2007). In brief, NHBS is designed to be an ongoing behavioral surveillance system of high-risk populations, including MSM, intravenous drug users (IDU) and heterosexuals living in high-risk areas, in 25 sites across the United States, including the San Francisco Bay Area. To reach MSM, men aged 18 years or older were systematically sampled (i.e. staff approached all men at the venues as staff became available) from randomly selected public venues in the city and county of San Francisco where MSM were known to congregate. A formative research phase was conducted prior to recruitment to compile a universe of recruitment venues, which included bars/clubs, social organizations, churches and street locations, among others. The list of venues available for random selection each month was continuously updated so that the maximum number of venues would be available for random selection. Of a total of 164 venues identified, approximately 60 were candidates for each month’s random selection. Not all identified venues were included for random selection, owing to lack of owner permission, inclement weather, seasonality or lack of sufficient attendance. An average of 13 sampling events occurred each month for 54 weeks. At each sampling event, a count of all potential recruits was recorded. This enumeration produced the denominator for each sampling event and for the study as a whole. Men who stopped when intercepted at these venues were asked brief eligibility questions.
The eligibility criteria were: being at least 18 years old, male, and a resident in any of 10 Bay Area counties (Alameda, Contra Costa, Marin, Napa, San Francisco, San Mateo, Santa Clara, Santa Cruz, Solano and Sonoma). Having had sex with a man was not an eligibility screener question so as not to dissuade men uncomfortable disclosing that information before taking the survey. After determining eligibility, staff administered a standardized behavioral risk survey using handheld computers. Participants were also offered an HIV antibody test. Participants were reimbursed $25 for taking part in the survey and $25 for participating in HIV testing.
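The venue selection and eligibility screening described above can be sketched in code. This is a minimal illustration, not the NHBS protocol: it assumes a simple random sample without replacement from the month's candidate venue-day-time periods (the text does not specify the exact draw), and the function names, venue labels and time blocks are hypothetical.

```python
import random

# The 10 Bay Area counties named in the eligibility criteria.
BAY_AREA_COUNTIES = {
    "Alameda", "Contra Costa", "Marin", "Napa", "San Francisco",
    "San Mateo", "Santa Clara", "Santa Cruz", "Solano", "Sonoma",
}


def draw_monthly_calendar(candidate_periods, n_events, seed=None):
    """Pick n_events venue-day-time periods at random, without replacement.

    Assumption: a simple random sample; the actual selection procedure
    is not described in enough detail here to reproduce it exactly.
    """
    if n_events > len(candidate_periods):
        raise ValueError("more events requested than candidate periods")
    rng = random.Random(seed)
    return rng.sample(candidate_periods, n_events)


def is_eligible(age, sex, county):
    """Screener: at least 18 years old, male, resident of a listed county."""
    return age >= 18 and sex == "male" and county in BAY_AREA_COUNTIES


# Hypothetical frame: about 60 candidate venues, 13 events in the month.
frame = [(f"venue_{i:02d}", "Sat", "22:00-02:00") for i in range(60)]
calendar = draw_monthly_calendar(frame, n_events=13, seed=2004)
print(len(calendar), "sampling events scheduled")
print(is_eligible(age=25, sex="male", county="Marin"))
```

Fixing the seed makes a given month's calendar reproducible, which is convenient when the draw needs to be audited later.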

Web-based HIV Behavioral Surveillance (WHBS) was designed as a pilot of Web-based methodologies to obtain samples of MSM via the Internet by the CDC in conjunction with health departments in six US metropolitan areas. Two sampling methods were implemented and evaluated in San Francisco during WHBS. We present the results of a TLS-like method conducted on the Internet and a direct marketing (DM) approach implemented by placing banner advertisements in Internet venues frequented by MSM.

Using methods similar to those used for TLS, we developed an analogous online method: Internet venue-based sampling (IVBS). In IVBS, a universe of online venues where MSM physically located in the Bay Area congregate was created by conducting formative research with the target population through individual interviews and focus groups, and by reviewing data on MSM’s Internet use collected as part of NHBS. For each of these venues, attendance patterns and high-traffic times were estimated. On a monthly basis, a randomly selected list of venues and times was created to guide sampling. A total of 15 venues were identified for IVBS sampling and all were available for random selection each month. A total of 41 sampling events were conducted over an 8-week period. During each sampling event, staff entered the online venue at the randomly selected day and time, enumerated the men who were online or whose profiles were “live” at the time, and systematically approached men via an online instant message or e-mail. Men responding to the message were screened for eligibility via online messaging or e-mail. Eligibility criteria were the same as those for the other sampling approaches described here. Eligible men were invited to participate in the online survey, and a link to the survey was given to men agreeing to participate. HIV testing was not offered as part of this survey. IVBS participants did not receive any remuneration for taking part in the survey.

For the DM approach, five advertising creative concepts were developed and tested in focus groups and/or individual interviews with men who used the Internet, recruited from NHBS sampling venues in San Francisco. Of the five concepts, two were selected for further development. The selected ads were modified based on qualitative feedback and sized to fit the most common size specifications, as defined by the Internet Advertising Bureau, an industry trade association, plus three additional specialty sizes. The banner ads were also modified to take advantage of rich media animation where available.

Ads were then placed on various websites frequented by MSM, which were selected on the basis of findings from formative research with the target population and Web traffic statistics provided by ComScore and Alexa Traffic. Ads were placed on a total of 24 Web properties, including Manhunt, Friendster, MySpace and Facebook. The ads were geographically targeted to the San Francisco Bay Designated Marketing Area when possible. Some sites identified geographic location from website user registration (for example, a MySpace user registers with a ZIP code); others did so from the geographic location of the IP address. Many advertisements were served via a fully owned subsidiary of AOL/Time Warner, allowing us to leverage AOL registration data to target advertising to San Francisco (otherwise all IP addresses of AOL users appear to be located in Reston, VA).

Banner advertisements were run 24 hours a day during the 8 weeks of recruitment. All banner advertisements directed users automatically to an online screening page. Eligibility was similar to that of NHBS: 18 years of age or older, male, and a resident of the same 10 Bay Area counties. Participants completed an online survey and were not offered HIV testing. Participants recruited via DM did not receive any reimbursement for their time.

For each approach, we recorded the total number of men enumerated (physically counted by staff in NHBS and IVBS, and measured as the number of click-throughs from the banner ads to the study website in DM) and the total number screened. In TLS and IVBS, it is impossible to screen everyone enumerated because of resource and time limitations; staff can screen only so many men in the time allotted for a sampling event. In DM, it is theoretically possible for everyone enumerated (i.e. everyone who clicks through) to be screened, but men who click on the banner ads and are directed to the screener website can simply exit the page before completing the screening instrument. We also recorded the total number of eligible participants, the total enrolled and the total number of completed interviews. These data were then used to compare the sampling efficiency of the three approaches. Each completed interview represents a fraction of the total number of men who were potentially eligible for the study. A higher ratio of completed interviews to the number of eligible participants and to the total number enumerated indicates a more robust, and thus potentially more representative, sample of the population of interest.
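The two efficiency ratios described in this paragraph reduce to simple divisions. The sketch below shows one way to compute them; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
def sampling_efficiency(enumerated, eligible, completed):
    """Completed interviews as a fraction of eligible and of enumerated men."""
    if enumerated <= 0 or eligible <= 0:
        raise ValueError("denominators must be positive")
    return {
        "completed_per_eligible": completed / eligible,
        "completed_per_enumerated": completed / enumerated,
    }


# Hypothetical counts, for illustration only.
example = sampling_efficiency(enumerated=1000, eligible=200, completed=120)
print(f"{example['completed_per_eligible']:.0%} of eligible men")      # 60%
print(f"{example['completed_per_enumerated']:.1%} of enumerated men")  # 12.0%
```

Reporting both ratios matters because an approach can convert eligible men well while still reaching only a small share of the men enumerated, or the reverse.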

We also compared demographics and risk behaviors from each of the approaches. In all samples, participants self-reported on a variety of demographic variables, in addition to answering questions about sexual behavior, HIV testing history and substance use. Men were also asked to provide the result of their most recent HIV test. Sexual behavior was captured for up to five partners during the preceding 6 months, using a partner-by-partner, sexual activity matrix. When a respondent had more than five partners in the previous 6 months, the respondent was instructed to report on the five partners he had sex with the most frequently. The matrix contained questions about the demographics of sexual partners, including HIV serostatus, in addition to questions about the number of times the respondent participated in insertive and/or receptive anal intercourse, both overall and when condoms were not used. In both NHBS and WHBS, we also specifically asked about patterns of attendance at specific types of physical venues, to evaluate whether similar samples of MSM could be obtained using either sampling method. We use attendance at gay oriented bars and dance clubs as measures of physical venue attendance as these have been referenced in the literature previously as being key venue types for behavioral surveillance (Pollack et al. 2005). Men were also asked about their use of the Internet.
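The partner-by-partner matrix can be pictured as a small list of records, one per partner. The sketch below is only an illustration of that structure: the field names are hypothetical, and deriving an "any unprotected anal intercourse" indicator this way is an assumption rather than the survey's documented scoring rule.

```python
from dataclasses import dataclass

MAX_PARTNERS = 5  # the matrix captured up to five partners


@dataclass
class PartnerRecord:
    partner_id: str
    sex_acts: int                # times had sex in the preceding 6 months
    partner_hiv_status: str      # "positive", "negative" or "unknown"
    insertive_ai_no_condom: int  # insertive anal intercourse without condoms
    receptive_ai_no_condom: int  # receptive anal intercourse without condoms


def partners_to_report(partners):
    """Keep the (up to) five partners the respondent had sex with most often."""
    ranked = sorted(partners, key=lambda p: p.sex_acts, reverse=True)
    return ranked[:MAX_PARTNERS]


def any_uai(partners):
    """True if any reported partner involved anal intercourse without condoms."""
    return any(
        p.insertive_ai_no_condom > 0 or p.receptive_ai_no_condom > 0
        for p in partners
    )


# Hypothetical respondent with seven partners.
partners = [
    PartnerRecord(f"p{i}", sex_acts=i, partner_hiv_status="unknown",
                  insertive_ai_no_condom=0, receptive_ai_no_condom=0)
    for i in range(1, 8)
]
reported = partners_to_report(partners)
print([p.partner_id for p in reported])
print(any_uai(reported))
```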



Sampling for NHBS was conducted over 52 weeks, from November 2003 through December 2004. Over 44,000 men were enumerated, and 3,568 men were screened for eligibility. Of the 2,488 men who were eligible, 1,764 (71%) completed the behavioral survey. None of the enrolled participants dropped out of the survey; 1,574 (89%) reported either identifying as gay or bisexual or having had at least one male sex partner in the past 12 months and were included in final analyses. Sampling for WHBS-IVBS occurred over 8 weeks in March and April 2006, during which 15,387 men were enumerated and 134 (0.9%) were screened. All 134 men screened were eligible and agreed to participate, but only 57 (43%) completed the online survey. All 57 men reported being gay or bisexual or having had sex with another man in the past 12 months. Sampling for WHBS-DM occurred over the same 8 weeks. The advertisements registered over nine million impressions, or views. Over 17,000 potential respondents were enumerated (meaning they clicked on the banner ad and were directed to the screening website), and 10,601 were screened for eligibility. Of the 1,731 eligible men, 1,147 (66%) agreed to participate, and 720 (63% of those who agreed) completed the survey. Of the 720 men completing surveys, 666 (93%) reported identifying as gay or bisexual or having had at least one male partner in the past 12 months. NHBS had the lowest drop-out rate, with 0% of those starting the survey dropping out, compared with 57% in IVBS and 37% in DM.

The 1,574 completed MSM interviews represent 3.5% of men enumerated and 63% of eligible men in NHBS, while the 57 MSM interviews completed in IVBS represent only 0.4% of men enumerated and 43% of eligible men. The 666 completed MSM interviews in the DM approach represent 3.9% of men enumerated and 38% of eligible men in that approach (Table 1).
Table 1

Sampling outcomes of the WHBS-IVBS, WHBS-DM and NHBS samples, San Francisco

                                    WHBS-IVBS N (%)   WHBS-DM N (%)   NHBS N (%)
Enumerated                          15,387 (100)      17,000 (100)    44,000 (100)
Screened                            134 (0.9)         10,601 (62)     3,568 (9)
Eligible                            134 (100)         1,731 (16)      2,488 (70)
Agreed to participate               134 (100)         1,147 (66)      1,764 (71)
Completed survey                    57 (43)           720 (63)        1,764 (100)
Total completed MSM interviews      57 (100)          666 (93)        1,574 (89)

Percents are calculated using the denominator of the preceding category
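Because each percentage in Table 1 uses the preceding category as its denominator, the column values can be recomputed from the counts alone. The sketch below does this with the counts reported in the text and table. The enumerated counts for WHBS-DM and NHBS are reported only as "over 17,000" and "over 44,000", so the rounded values used here are approximations, and the function names are illustrative.

```python
STAGES = ["enumerated", "screened", "eligible", "agreed",
          "completed", "msm_completed"]

# Counts as reported; 17,000 and 44,000 are rounded ("over ..." in the text).
COUNTS = {
    "WHBS-IVBS": [15387, 134, 134, 134, 57, 57],
    "WHBS-DM": [17000, 10601, 1731, 1147, 720, 666],
    "NHBS": [44000, 3568, 2488, 1764, 1764, 1574],
}


def stage_percents(counts):
    """Percent of each stage relative to the stage immediately before it."""
    return [100.0 * cur / prev for prev, cur in zip(counts, counts[1:])]


def dropout_percent(agreed, completed):
    """Percent of men who agreed to participate but did not finish the survey."""
    return 100.0 * (agreed - completed) / agreed


for name, counts in COUNTS.items():
    pcts = stage_percents(counts)
    summary = ", ".join(
        f"{stage} {pct:.1f}%" for stage, pct in zip(STAGES[1:], pcts)
    )
    dropout = dropout_percent(counts[3], counts[4])
    print(f"{name}: {summary}; drop-out {dropout:.0f}%")
```

Chaining the stages this way makes it explicit that the 63% completion figure for WHBS-DM is relative to the men who agreed to participate, not to all eligible men.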


In terms of demographic characteristics, 76% of the NHBS sample resided in San Francisco compared to 63% and 58% in IVBS and DM, respectively. NHBS was more diverse racially than either of the two online samples with only 55% of the NHBS sample reporting being white compared to 79% and 70% white in IVBS and DM, respectively. In terms of age, all three approaches sampled slightly different age groups. NHBS sampled the most participants in the 31–45 year old group (52%) with a mean age of 36 (Standard Deviation (SD) 10.0) while IVBS sampled a majority of men 41 years old or older (67%) with a mean age of 43 (SD 10.8) and DM sampled younger men 18–30 years old (48%) with a mean age of 33 (SD 10.4). More men in NHBS reported incomes of less than $50,000 per year than in DM (58% vs. 45%), while 68% of IVBS men reported incomes over $49,000 per year. Men in all the samples reported similar levels of educational attainment. Men in all three samples reported sexual orientation similarly, all at about 87% homosexual (Table 2).
Table 2

Demographic characteristics of WHBS and NHBS MSM samples, San Francisco

                                    WHBS-IVBS N (%)   WHBS-DM N (%)   NHBS N (%)
Total completed MSM interviews      57                666             1,574
Residence
    San Francisco                   36 (63)           384 (58)        1,200 (76)
    South Bay                       7 (12)            112 (17)        69 (4)
    North Bay                       1 (2)             13 (2)          15 (1)
    East Bay                        13 (23)           139 (21)        115 (7)
Race/ethnicity
    [label missing]                 4 (7)             52 (8)          165 (11)
    African American                2 (4)             14 (2)          108 (7)
    American Indian                                   3 (0.5)         7 (0.5)
    Native Hawaiian/Pac I                             7 (1)           26 (2)
    White                           45 (79)           467 (70)        859 (55)
    [label missing]                                   11 (2)          25 (2)
    [label missing]                                   94 (14)         293 (19)
    Mixed Race                      2 (4)             15 (2)          84 (5)
Age
    [label missing]                 1 (2)             61 (9)          35 (2)
    [label missing]                 2 (4)             130 (20)        197 (13)
    [label missing]                 5 (9)             124 (19)        246 (16)
    [label missing]                 7 (12)            91 (14)         298 (19)
    [label missing]                 4 (7)             89 (13)         291 (19)
    [label missing]                 14 (25)           87 (13)         222 (14)
    [label missing]                 9 (16)            46 (7)          117 (7)
    [label missing]                 15 (26)           38 (6)          167 (11)
Annual income
    [label missing]                 2 (4)             50 (9)          158 (10)
    [label missing]                 1 (2)             106 (19)        351 (22)
    [label missing]                 13 (26)           99 (17)         411 (26)
    [label missing]                 10 (20)           95 (17)         285 (18)
    [label missing]                 8 (16)            78 (14)         147 (9)
    [label missing]                 16 (32)           142 (25)        203 (13)
Education
    Post grad                       19 (33)           173 (26)        300 (19)
    [label missing]                 15 (26)           256 (39)        689 (44)
    Some college                    21 (37)           170 (26)        400 (25)
    HS or less                      2 (4)             62 (9)          185 (12)
Sexual identity
    [label missing]                 1 (2)             1 (<1)          8 (1)
    Homosexual                      49 (86)           579 (87)        1,407 (89)
    [label missing]                 7 (12)            72 (11)         137 (9)
    [label missing]                                                   22 (11)
MSM venue attendance past year
    Bars/Dance clubs                50 (88)           510 (77)        1,462 (93)
    Internet venues                 57 (100)          666 (100)       1,383 (88)
Any substance use past year         33 (58)           261 (39)        944 (60)
Any speed use past 12 months        6 (11)            69 (10)         340 (22)
Unprotected anal intercourse (UAI)  22 (39)           246 (37)        664 (42)
[label missing]                     18 (32)           210 (32)        511 (32)
[label missing]                     14 (25)           173 (26)        450 (29)
Any discordant partnerships         28 (49)           148 (22)        638 (41)
Ever HIV test                       55 (96)           581 (87)        1,511 (96)
Self reported HIV status (of those ever tested)
    [label missing]                 41 (75)           503 (76)        1,211 (78)
    [label missing]                 11 (20)           62 (9)          251 (16)
    [label missing]                 3 (3)             12 (2)          49 (3)

Some categories do not add up to 100% due to missing data

HIV Risk

Risk variables differed somewhat across the three samples. NHBS and IVBS men reported a higher level of any substance use in the past 12 months (60% and 58%, respectively) than men sampled in the DM approach (39%). With regard to methamphetamine (MA) use in the past 12 months, the proportion of men reporting MA use was higher in NHBS than in either IVBS or DM (22%, 11% and 10%, respectively). NHBS men reported slightly more unprotected anal intercourse (UAI) overall (42%) than men in IVBS or DM (39% and 37%, respectively); men in all three approaches reported similar levels of unprotected insertive anal intercourse (UIAI) and unprotected receptive anal intercourse (URAI). Finally, men in IVBS reported the highest level of HIV serodiscordant partnerships (49%), compared with 41% of NHBS men and only 22% of DM men.

Venue Attendance

By definition, 100% of both IVBS and DM men attend MSM oriented venues found on the Internet while slightly fewer NHBS men (88%) reported using MSM Internet venues. Attendance at MSM oriented bars and dance clubs was reported by 93% of NHBS men, 88% of IVBS men and 77% of DM men.


The WHBS-IVBS method reported here was not successful in recruiting a large sample of MSM residing in the Bay Area and had the poorest completed interview to number enumerated ratio of the three approaches. Both the WHBS-DM and NHBS methods reported here successfully recruited a large sample of MSM residing in the San Francisco Bay Area. MSM recruited via the three approaches varied in terms of residence location, age, income, substance use, methamphetamine use, serodiscordant partnerships and self-reported HIV status. These slight differences could well be the result of particular patterns of attendance at venues and use of the Internet among MSM in the Bay Area. Furthermore, while the DM approach is attractive in that it is “set and forget” during data collection, its more passive nature may result in recruiting fewer men who engage in high-risk behaviors than could be recruited using more active recruiter based approaches such as IVBS and NHBS. IVBS efforts targeting online venues with high-risk individuals and offering some type of incentive may prove to be a successful combination in reaching this important sub-population of MSM via the Internet. This strategy may be resource-intensive, however. Overall NHBS performed better than online recruitment strategies in terms of interview completion rates.

There are a number of limitations to our analysis. First, there is no benchmark to validate the quality of our samples but both prior surveillance efforts and ongoing efforts of community-based organizations achieve samples of similar composition to those reported here (Katz et al. 2002; Truong et al. 2006; Valleroy et al. 2000). Second, the recruitment methods most likely differ in terms of their ability to recruit less motivated subjects, subjects not likely to respond to banner ads online, such as those who might have been under the influence of a psychotropic substance, and subjects who are at higher-risk for HIV infection or transmission. The WHBS-DM approach is more passive in nature than NHBS, where staff interact with potential participants, thus WHBS-DM may have failed to sample men who engage in higher risk behavior for whom participation in this type of survey may not be as appealing. Third, NHBS included the use of incentives while both WHBS approaches did not. This difference could also explain differences in participation between the samples. Fourth, temporal trends in risk behavior could explain the differences between NHBS and the WHBS approaches. Future analyses of the next NHBS sample may help to address this possibility. Finally, NHBS sampled men in physical locations in the city and county of San Francisco, whereas the location of the men sampled in WHBS (at the time of the survey or at any point in the past) may not be presumed.

Despite these limitations, our data suggest that there may be differences between MSM recruited online via DM and MSM recruited via active recruitment online or at physical venues attended by MSM. Understanding the potential differences in the samples obtained by these types of approaches can inform decisions to use one type of approach over another and can inform interpretation of data generated by these approaches. While it is not evident whether the samples are truly different or appear to be different due to the sampling method, careful consideration of potential biases must play a part in interpreting findings from Internet surveys.

With the exception of the periodic need to collect specimens for HIV testing, including HIV incidence testing, Internet-based surveys of MSM may also be well positioned to provide necessary updates to behavioral surveillance among a population that continues to be highly affected by HIV. Moreover, Internet-based behavioral surveillance may also increase the geographic reach of venue-based behavioral surveillance approaches and may successfully sample specific segments of the MSM population such as younger MSM and MSM who do not attend MSM oriented venues such as bars and dance clubs (though the vast majority of men recruited online also attend bars and dance clubs). However, passive Internet based approaches may not be ideal to sample high-risk MSM who purposefully avoid participation while the face-to-face nature of venue-based approaches may more readily sample these men. Finally, our findings suggest innovative approaches to reach and sample high-risk men via the Internet must be further explored and refined.

Copyright information

© Springer Science+Business Media, LLC 2009