Respondent-Driven Sampling of Injection Drug Users in Two U.S.–Mexico Border Cities: Recruitment Dynamics and Impact on Estimates of HIV and Syphilis Prevalence
- Cite this article as:
- Frost, S.D.W., Brouwer, K.C., Firestone Cruz, M.A. et al. J Urban Health (2006) 83(Suppl 1): 83. doi:10.1007/s11524-006-9104-z
Respondent-driven sampling (RDS), a chain referral sampling approach, is increasingly used to recruit participants from hard-to-reach populations, such as injection drug users (IDUs). Using RDS, we recruited IDUs in Tijuana and Ciudad (Cd.) Juárez, two Mexican cities bordering San Diego, CA and El Paso, TX, respectively, and compared recruitment dynamics, reported network size, and estimates of HIV and syphilis prevalence. Between February and April 2005, we used RDS to recruit IDUs in Tijuana (15 seeds, 207 recruits) and Cd. Juárez (9 seeds, 197 recruits), Mexico for a cross-sectional study of behavioral and contextual factors associated with HIV, HCV and syphilis infections. All subjects provided informed consent, an anonymous interview, and a venous blood sample for serologic testing of HIV, HCV, HBV (Cd. Juárez only) and syphilis antibody. Log-linear models were used to analyze the association between the state of the recruiter and that of the recruitee in the referral chains, and population estimates of the presence of syphilis antibody were obtained, correcting for biased sampling using RDS-based estimators. Sampling of the targeted 200 recruits per city was achieved rapidly (2 months in Tijuana, 2 weeks in Cd. Juárez). After excluding seeds and missing data, the sample prevalences of HCV, HIV and syphilis were 96.6, 1.9 and 13.5%, respectively, in Tijuana, and 95.3, 4.1, and 2.7%, respectively, in Cd. Juárez (where HBV prevalence was 84.7%). Syphilis cases were clustered in recruitment trees. RDS-corrected estimates of syphilis antibody prevalence ranged from 12.8 to 26.8% in Tijuana and from 2.9 to 15.6% in Ciudad Juárez, depending on how recruitment patterns were modeled and on assumptions about how network size affected an individual’s probability of being included in the sample. RDS was an effective method to rapidly recruit IDUs in these cities. Although the frequency of HIV was low, syphilis prevalence was high, particularly in Tijuana. RDS-corrected estimates of syphilis prevalence were sensitive to model assumptions, suggesting that further validation of RDS is necessary.
Keywords: HIV and syphilis prevalence; Injection drug users; Respondent-driven sampling.
Injection drug use is a growing problem in cities located along the U.S.–Mexico border. Approximately 70% of U.S. cocaine originating in South America passes through the Central America–Mexico corridor.1 Cities and towns positioned on drug trafficking routes often experience epidemics of injection drug use.2,3 Injection drug users (IDUs) are at high risk of blood-borne infections, such as hepatitis C virus (HCV) and human immunodeficiency virus type-1 (HIV-1) infection, and of acquiring HIV-1 and other sexually transmitted infections (STIs) through high rates of unprotected sex.4–10
Mexico is currently considered a country of low HIV/AIDS prevalence11 (180,000 adult cases in 2005, a seroprevalence in the general population of 0.3%12), and the HIV epidemic has been mainly confined to men who have sex with men.13 Although injection drug use appears to have played only a minor role in the epidemic on a country-wide level,14 injection drug use appears to be increasingly important as a risk factor for HIV infection in some Mexican cities bordering the U.S. Viani et al.15 noted that the prevalence of HIV among pregnant women giving birth at Tijuana General Hospital rose from 0.29% in 1998 to 1.02% in 2001 and, in a subsequent study, showed that pregnant HIV-infected women were more likely to either inject drugs or to have a spouse/partner who injected drugs.16 In 2002, Valdez et al.17 reported that 21% of female sex workers in Ciudad (Cd.) Juárez injected illicit drugs, whereas a study by Patterson et al.18 in 2005 showed that over half of female sex workers in Cd. Juárez injected drugs, suggesting increasing overlap between sexual and IDU networks.
Overlap between injection drug use and the trade of sex for money or drugs may contribute to elevated risk of STIs other than HIV, such as syphilis. Syphilis has been associated with higher HIV seroprevalence in a number of populations and is considered a cofactor of HIV transmission.19–22 In contrast to HIV, syphilis has been present in Mexico since at least the time of the Spanish invasion; however, the number of reported cases has decreased from 40,607 in 1945 (190.5 per 100,000) to 2,608 in 1990 (3.2 per 100,000).23
Aggregate figures for syphilis prevalence belie the sub-epidemics occurring within specific risk groups. Several studies have been conducted with female sex workers in Mexico and have found varying syphilis prevalence levels. In 1990, 23.7% of 1,386 sex workers in four Mexican states had a reactive syphilis test.24 In 1993, testing of 826 sex workers in Mexico City showed an overall prevalence of 6.4%, with different syphilis rates associated with different patterns of sex work: 1.3% for massage parlor workers, 4.4% for bar girls, and 9.6% for streetwalkers.25 The prevalence of syphilis among 3,100 female sex workers tested at an AIDS clinic during 1992 and 1993 was 8.2%.26 In contrast, syphilis prevalence was low (2.3 and 1.1%) among gynecological outpatients in two Mexican cities between the years 1994 and 1995.27 However, little is known about syphilis prevalence in IDU populations in Mexico.
In order to estimate the prevalence of HIV, HCV, and syphilis among IDUs, we conducted a cross-sectional study of IDUs in the border cities of Tijuana and Cd. Juárez, Mexico. Both cities are located on major drug trafficking routes and have large IDU populations (c. 6,000), with a similar sex ratio among the IDUs (c. 80% male).3 As stigma surrounding injection drug use makes it difficult to obtain a representative sample of injection drug users, we recruited individuals using respondent-driven sampling (RDS).28,29 By collecting data on individuals’ personal network sizes, RDS attempts to correct for biases in the sampling process, in order to obtain unbiased estimates of parameters such as the prevalence of a disease. In this study, we report on patterns of recruitment and the prevalence of HIV, HCV, and syphilis (and hepatitis B infection, for Cd. Juárez) in the context of sexual risk.
Materials and Methods
From February through April 2005, IDUs were enrolled in a cross-sectional study in Tijuana and Cd. Juárez, Mexico. Eligibility criteria for the study included: having injected illicit drugs within the past month, confirmed by inspection of injection stigmata (‘track marks’); aged 18 years or older; willing and able to provide informed consent; and not having been previously interviewed for the study. Subjects gave their written informed consent to participate in the study. Study methods were approved by the Institutional Review Board of the University of California, San Diego and the Ethics Board of the Tijuana General Hospital, which has one of the few federal-wide assurances in Mexico.
RDS methods were used to recruit participants.28,29 A diverse group of “seeds” (heterogeneous in age, gender, and geographic location) were selected to initiate the process. After providing informed consent, seeds underwent an interview, were educated on how to refer other eligible IDUs, and were given three uniquely coded coupons to refer their peers. Coupons were given to participants until approximately 150 participants were recruited in order to obtain a target sample size of approximately 200 per site.
Each coupon was printed with the study name, the locations where recruits could participate, and a brief explanation. In Cd. Juárez, interviews were conducted at a clinic run by Programa Compañeros, A.C., which is a trusted and well-respected non-governmental organization (NGO) that has been providing services to and conducting studies with IDUs in the city for decades. In Tijuana, staff from both COMUSIDA, the municipal HIV/AIDS program, and the Centro de Integración y Recuperación para Enfermos de Alcoholismo y Drogadicción “Mario Camacho Espíritu”, A.C. (CIRAD), an NGO that began working with drug users in 1991, made weekly trips to three geographically diverse ‘colonias’ (i.e., neighborhoods) in the city: Zona Norte, Grupo México, and Sepanal, using a modified recreational vehicle that operated as a mobile clinic (the ‘Prevemovihl’).
Monetary reimbursements were given to participants to cover transportation costs and to compensate them for their time. The study staff at each site proposed the incentive levels based on their experience with this population and the incentives used in previous studies. Participants in Cd. Juárez received $20 USD (U.S. dollars) for participation in the baseline visit and $5 USD when receiving laboratory test results at a one-month follow-up visit. In Tijuana, $10 USD was given at baseline and $5 USD for the follow-up visit. In addition, participants at both sites were given $5 USD for each eligible person they recruited. These levels were not regarded as high.
Upon enrollment, trained staff administered quantitative surveys eliciting information on topics such as socio-economic and demographic profiles, drug use practices, sterile syringe access, barriers to sterile syringe use, experience with drug abuse treatment and incarceration history, health status, and HIV knowledge and testing history. We also asked about sexual behaviors and condom use with regular, casual and client partners of the opposite and same sex. Questions pertained to lifetime risk behaviors and those occurring in the prior six months.
For RDS purposes, we measured network size using the question “En los últimos 6 meses, ¿cuántas personas conoce de nombre o de apodo que se han inyectado drogas?” (“In the past 6 months, how many people do you know by name or street name who have injected drugs?”). To determine the relationship between recruiter and recruitee, we asked “¿Cuál es su relación con la persona que le entregó el cupón?” (“What is your relationship to the person who gave you the coupon?”). Participants were given the choice of: “parientes” (relative); “pareja sexual” (sex partner/spouse); “amigo(a)” (friend); “conocido” (acquaintance); “desconocido” (stranger); and “otro” (other). To determine the size of individuals’ networks with respect to injection drug use, we asked “En los últimos 6 meses, ¿con cuántas personas diferentes acostumbra inyectarse?” (“In the last 6 months, on average how many different people did you usually inject with?”). After the interview, blood was drawn for antibody testing of HIV, HCV, HBV (Cd. Juárez only), and syphilis. Pre- and post-test counseling, and referral to treatment where indicated, were provided to all participants.
Blood samples were obtained by venipuncture and serum was stored at the municipal health clinic in Tijuana or Cd. Juárez before being shipped frozen to the San Diego County public health laboratory or New Mexico State Laboratory, respectively. All participants were screened on-site in Mexico for HIV with the Determine rapid test (Abbott Laboratories). For the Tijuana samples, in the event of an HIV-positive or indeterminate test, results were confirmed with a Western blot, HIV enzyme immunoassay (EIA), and HIV immunofluorescence assay. For samples from Cd. Juárez, the HIV EIA was conducted on all samples, and a confirmatory Western blot was performed on positive or indeterminate samples. Cd. Juárez samples were also tested for hepatitis B antigen (Genetic Systems HBsAg EIA 3.0, Bio-Rad Laboratories) and antibody (DiaSorin ETI-AB-COREK PLUS). All samples were tested for syphilis with the rapid plasma reagin (RPR) test (Macro-Vue, Becton Dickinson) and, if reactive, confirmed by a Treponema pallidum particle agglutination assay (TPPA; Fujirebio Diagnostics).
Obtaining estimates of population proportions of groups using RDS involves combining three kinds of data: the sample proportion of each group, the crosstabulation of groups between pairs of recruiters and recruitees, and differences in network size between groups. To estimate equilibrium proportions of different groups, and to estimate the pattern of mixing between groups, we assumed that the recruitment process followed a first order Markov process.28,29 Under this model, the relationship between the state of the recruiter and recruitee can be modeled using log-linear models applied to a two-way table of counts.30,31 We classified individuals by sex and syphilis seropositivity and fitted a series of hierarchical log-linear models of increasing complexity to the data to determine patterns of nonrandom mixing between groups along each recruitment tree, choosing the best model as that which had the lowest value of Akaike’s Information Criterion.32 For the purposes of analysis, we considered all individuals with positive syphilis tests based upon RPR, and did not classify individuals further into those with TPPA titers greater than or equal to 1:8 (who may represent infectious cases) and those with titers of 1:1 to 1:4 (who may represent past infection).
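The first-order Markov assumption can be illustrated with a minimal Python sketch. This is not the authors' code (their analyses used R and Maxima); the count table, group labels, and function names below are hypothetical and chosen only for illustration. The sketch row-normalizes a recruiter-by-recruitee table into transition probabilities and iterates to the equilibrium group proportions, assuming every recruiter group made at least one recruitment.

```python
def transition_matrix(counts):
    """Row-normalize a recruiter-by-recruitee count table into transition probabilities."""
    return [[c / sum(row) for c in row] for row in counts]


def equilibrium(P, tol=1e-12, max_iter=10_000):
    """Iterate pi <- pi P from a uniform start until successive vectors agree within tol."""
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical counts (NOT the study data): rows are recruiter groups,
# columns are recruitee groups, classified by sex and RPR status.
groups = ["male RPR-", "male RPR+", "female RPR-", "female RPR+"]
counts = [
    [120, 10, 8, 4],
    [9, 6, 1, 2],
    [7, 1, 3, 2],
    [3, 2, 2, 4],
]

P = transition_matrix(counts)
pi = equilibrium(P)
for name, share in zip(groups, pi):
    print(f"{name}: {share:.3f}")
```

The equilibrium depends only on the transition probabilities, not on the composition of the seeds, which is why the recruitment pattern rather than the seed selection drives this part of the correction.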
To derive RDS-corrected estimates of syphilis seropositivity in men and women in Tijuana and Ciudad Juárez, we estimated recruitment weights for each group (as the ratio of the equilibrium to sample proportions of each group). We estimated the equilibrium fraction as previously described.29 We used both raw counts and predicted counts based on the best fitting log-linear model. Degree weights were estimated using linear least squares.29 We used both unadjusted and adjusted estimates of personal network size.33 An overall sampling weight was derived for each group, from which population-level estimates were obtained.
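As a rough sketch of how these weights combine, the Python snippet below multiplies a recruitment weight (equilibrium proportion divided by sample proportion) by a degree weight to give a sampling weight, then renormalizes the weighted sample proportions. It assumes the degree weight is inversely proportional to each group's mean network size, which is one common form; the least-squares estimation of degree weights used in the study is not reproduced here. All input values and the function name are hypothetical.

```python
def rds_estimates(sample_props, equilibrium_props, mean_degrees):
    """Combine recruitment and degree weights into population proportion estimates.

    Assumption: degree weight is taken as 1 / (mean network size of the group).
    """
    recruitment_w = [e / s for e, s in zip(equilibrium_props, sample_props)]
    degree_w = [1.0 / d for d in mean_degrees]
    sampling_w = [r * d for r, d in zip(recruitment_w, degree_w)]
    weighted = [s * w for s, w in zip(sample_props, sampling_w)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical inputs (NOT the study data), ordered as:
# male RPR-, male RPR+, female RPR-, female RPR+
sample_props = [0.70, 0.10, 0.12, 0.08]
equilibrium_props = [0.68, 0.11, 0.12, 0.09]
mean_degrees = [20.0, 15.0, 12.0, 10.0]

est = rds_estimates(sample_props, equilibrium_props, mean_degrees)

# Estimated seroprevalence within each sex from the group-level estimates.
prev_men = est[1] / (est[0] + est[1])
prev_women = est[3] / (est[2] + est[3])
print(f"men: {prev_men:.3f}, women: {prev_women:.3f}")
```

Under this form, groups with smaller reported networks are up-weighted, because they are less likely to be reached by a coupon.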
Pre-processing of the data was performed using Stata v. 8.2 (Stata Corporation, College Station, TX). Networks and trees were generated using scripts written in Python and visualized using GraphViz (AT&T Research, Florham Park, NJ). Statistical analyses and summary statistics of the recruitment network were generated in R,34 and RDS-based corrections were calculated using Maxima (http://maxima.sourceforge.net). We chose to develop our own programs rather than use RDSAT (http://www.respondentdrivensampling.org) primarily to familiarize ourselves with the statistical theory underlying RDS-based corrections. All code is available from the first author on request.
Table 1. Summary statistics of age, parameters pertaining to risk of STI, and seroprevalence of HIV, HCV and syphilis, by city, sex, and whether individuals were seeds or recruits
Row labels, with the cell values that survive (column assignment not recoverable):
Type of recruit
Age (median, range)
Age at sexual debut (median, range): 15 (7–28) [n=191]; 15 (5–23) [n=181]
No. of lifetime male sexual partners (median, range): 0 (0–100) [n=186]; 20 (2–500) [n=11]; 0 (0–15) [n=178]; 5 (1–100) [n=11]
No. of lifetime female sexual partners (median, range): 10 (0–500) [n=191]; 5 (0–10) [n=2]; 10 (1–300) [n=175]; 0 (0–7) [n=12]
Prostitution as main source of income over the last 6 months (fraction, %)
Had sex in last 6 months (fraction, %)
Ever been given (bought) sex in the last 6 months (fraction, %)
Ever given (sold) sex in the last 6 months (fraction, %)
Injected drugs with sex partner in last 6 months
Syphilis RPR + (fraction, %)
HIV antibody (fraction, %)
HCV antibody (fraction, %)
HBV antibody (fraction, %)
Table 2. Relationship between recruiter and recruitee in terms of sex and syphilis antibody status
Table 3. Fit of 12 log-linear models to the data shown in Table 2
Symmetric(sex1,sex2) + Symmetric(rpr1,rpr2)
sex1 + rpr1 + sex2 + rpr2
sex1 + rpr1 + sex2 * rpr2
sex1 * rpr1 + sex2 * rpr2
sex1 * rpr1 + sex2 * rpr2 + sex1 * sex2
sex1 * rpr1 + sex2 * rpr2 + rpr1 * rpr2
sex1 * rpr1 + sex2 * rpr2 + sex1 * sex2 + rpr1 * rpr2
sex1 * rpr1 + sex2 * rpr2 + sex1 * sex2 + rpr1 * rpr2 + sex1 * rpr2
sex1 * rpr1 + sex2 * rpr2 + sex1 * sex2 + rpr1 * rpr2 + rpr1 * sex2
sex1 * rpr1 + sex2 * rpr2 + sex1 * sex2 + rpr1 * rpr2 + sex1 * rpr2 + rpr1 * sex2
Symmetric(sex1 * rpr1, sex2 * rpr2)
Saturated model (sex1 * rpr1 * sex2 * rpr2)
For both samples, the best-fitting model included a significant positive association between the syphilis serostatus of the recruiter and that of the recruitee, and did not fit significantly worse than a ‘saturated’ model, in which each cell in the table has its own parameter. For the Tijuana sample, there was also a positive association between the sex of the recruiter and the sex of the recruitee and, independently of this association, syphilis antibody-positive women were disproportionately less likely to recruit syphilis-positive men than syphilis-negative men.
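To make the comparison with the saturated model concrete, here is a minimal Python sketch for the simplest non-trivial case: the model "sex1 * rpr1 + sex2 * rpr2", which treats the recruiter's joint state as independent of the recruitee's joint state in a 4 x 4 table. For that model the fitted counts have a closed form (row total times column total over the grand total), so no iterative fitting is needed. The counts are hypothetical, the function name is illustrative, and the richer models in the list above would need a general log-linear fitting routine that is not shown.

```python
import math


def independence_fit(table):
    """Return (G2, df, delta_AIC) for the recruiter/recruitee independence model.

    G2 is the likelihood-ratio deviance against the saturated model, and
    delta_AIC = G2 - 2 * df is the AIC of this model minus the AIC of the
    saturated model (negative values favor the simpler model).
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    g2 = 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0  # cells with zero observed count contribute nothing
    )
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return g2, df, g2 - 2 * df


# Hypothetical recruiter (rows) by recruitee (columns) counts, NOT the study data.
counts = [
    [120, 10, 8, 4],
    [9, 6, 1, 2],
    [7, 1, 3, 2],
    [3, 2, 2, 4],
]
g2, df, delta_aic = independence_fit(counts)
print(f"G2 = {g2:.2f} on {df} df; AIC difference vs saturated = {delta_aic:.2f}")
```

The same AIC difference applies to any model nested in the saturated one: a model is preferred over the saturated model by AIC when its deviance is less than twice its residual degrees of freedom.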
Correcting for Sampling Bias
Information on network size, and on who recruited whom, collected as part of RDS, allows population estimates to be generated from the sample despite biases in the sampling process. To do so, we need to estimate the pattern of mixing between different groups along the recruitment network and to determine how much network size differs between groups. The seroprevalence of HIV was too low, and the seroprevalence of HCV too high, to obtain meaningful correction factors; hence we concentrated on obtaining population estimates of the prevalence of syphilis antibody in men and women in the two cities.
Table 4. RDS-corrected estimates of syphilis seroprevalence, using the raw transition data and adjusted degrees
Columns: Adjusted degree weight; Adjusted sampling weights; Adjusted population (%)
Samples obtained using RDS can also be biased due to differences in network size between the groups. To compensate for this effect, we calculated degree weights from the reported number of known IDUs, using the adjusted mean estimates of network size33 in each group (Figure 3f). After calculating degree weights for each group, and multiplying them by recruitment weights to generate an overall sampling weight, the RDS-corrected estimates of syphilis antibody prevalence were higher than those in the overall sample for both cities (Table 4).
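The difference between unadjusted and adjusted mean network size can be sketched as follows. This assumes the adjustment takes the harmonic-mean form, in which each respondent is down-weighted by their own reported degree because people with many contacts are more likely to be recruited; the text does not spell out the formula, so treat this as an assumption. The reported degrees and function names are hypothetical.

```python
def unadjusted_mean_degree(degrees):
    """Arithmetic mean of reported network sizes."""
    return sum(degrees) / len(degrees)


def adjusted_mean_degree(degrees):
    """Harmonic-mean form: n / sum(1 / d_i), assuming every reported degree is > 0."""
    return len(degrees) / sum(1.0 / d for d in degrees)


# Hypothetical reported network sizes for one group (NOT the study data).
reported = [2, 5, 10, 40]
raw = unadjusted_mean_degree(reported)
adj = adjusted_mean_degree(reported)
print(f"unadjusted: {raw:.2f}, adjusted: {adj:.2f}")
```

Because the adjusted mean is never larger than the unadjusted one and is pulled down most in groups with a few very large reported networks, the choice between the two can shift the degree weights and hence the prevalence estimates.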
Sensitivity of RDS Estimates
Table 5. Sensitivity of the estimated population prevalence of syphilis antibody among men and women to model assumptions
Respondent-driven sampling offers the promise of a probability sample of individuals from hidden and hard-to-reach populations. RDS was originally developed in the context of recruiting IDUs35–39 and, in our context, was an efficient method to recruit IDUs in two Mexican cities bordering the U.S. Recruitment was extremely rapid in Cd. Juárez compared to Tijuana, which may be due to greater access to the study site, higher monetary incentives, and the fact that Programa Compañeros is more established in Cd. Juárez than CIRAD is in Tijuana and had carried out studies in the past with monetary remuneration. In contrast, Mueller et al.40 report much slower recruitment of IDUs in Las Cruces, NM using RDS, despite similar methodology and the same eligibility criteria.
The sample seroprevalence of HIV was relatively low in both cities. HIV-1 seroprevalence in IDUs in Tijuana recruited through RDS was similar to that found in IDUs studied by Güereña-Burgueño et al.41 in the early 1990s and in a study by Magis-Rodriguez et al.42 in 2003 that used time-location sampling methods, although the absolute number of HIV-positive cases was too low to perform reliable RDS corrections. In contrast to HIV, syphilis prevalence was extremely high, especially in women and in Tijuana.
Unlike many adaptive sampling schemes43 in which the sampling process is controlled by the investigator, RDS enables the study subjects to control the sampling process. While this facilitates the recruitment process, it makes statistical inference more difficult. We found that estimates of syphilis seroprevalence were extremely sensitive to modeling assumptions. First, as recruitments between low-frequency groups are relatively rare, estimates of the recruitment rates may be biased. Smoothing these estimates using a statistical model can lead to different estimates. Although simulations and analytical results show that RDS-based estimates are unbiased in large populations, errors in RDS-based estimates may be so high for small populations and/or low frequencies of groups as to render the use of RDS impractical.33 Second, the estimated prevalence of syphilis was sensitive to the assumption of how inclusion probability depends on reported network size. Estimates of network size may well have been different had we asked “How many people do you currently know by name or street name that inject drugs?” Estimating group-level network sizes is compromised by high variances, the small size of some of the subpopulations, and the poor ability of individuals to estimate the size of their personal networks.44 Although RDS controls for differences in network sizes, a sampling bias long known to be inherent in chain-referral samples, it is important that this information is as accurate as possible. It might be argued that prior to the advent of RDS, there was little incentive to accurately measure relative network sizes in epidemiological studies; given that this information plays a crucial role in the post-stratification process of RDS, we encourage further research to determine how best to collect this information accurately.
RDS also has some inherent limitations in terms of inferences that can be drawn from the data45: it does not generate estimates of the absolute size of the population, only proportions, and it exploits social ties between individuals, limiting what one can conclude about sexual or drug-injecting networks from RDS data. Furthermore, without comparison of RDS to other types of sampling, we cannot conclude that obtaining a sample through RDS gives us a more representative sample than other methods. Nevertheless, RDS, or a modified version thereof, has the potential to efficiently recruit hidden populations such as IDUs, and creates avenues through which interventions can reach members of these populations. In the context of this study, prevention and treatment of syphilis are clearly important public health concerns.
The authors gratefully acknowledge support from the National Institute on Drug Abuse through grants DA09225-S11 and DA019829, donor support for the Harold Simon Chair in International Health and Cross-Cultural Medicine, the UCSD Center for AIDS Research (P30 AI36214), and training grants supporting Dr. Brouwer (T32 AI07384 and K01 DA20364). In addition, Dr. Frost is supported by the National Institute of Allergy and Infectious Diseases (grants U01 43638, R01 AI47745, and R01 AI57167), and by the National Institute on Drug Abuse, as part of the Sexual Acquisition and Transmission of HIV Cooperative Agreement Program (grant DA17934). We are extremely thankful to the staff of Proyecto El Cuete, CIRAD, COMUSIDA and Programa Compañeros, and to Dr. Peter Hartsock, Dr. Doug Heckathorn and Mr. Cyprian Wejnert.