Injection drug use is a growing problem in cities located along the U.S.–Mexico border. Approximately 70% of U.S. cocaine originating in South America passes through the Central America–Mexico corridor.1 Cities and towns positioned on drug trafficking routes often experience epidemics of injection drug use.2,3 Injection drug users (IDUs) are at high risk of blood-borne infections, such as hepatitis C virus (HCV) and human immunodeficiency virus type-1 (HIV-1) infection, and of acquiring HIV-1 and other sexually transmitted infections (STIs) through high rates of unprotected sex.4–10

Mexico is currently considered a country of low HIV/AIDS prevalence11 (180,000 adult cases in 2005, a seroprevalence in the general population of 0.3%12), and the HIV epidemic has been mainly confined to men who have sex with men.13 Although injection drug use appears to have played only a minor role in the epidemic on a country-wide level,14 injection drug use appears to be increasingly important as a risk factor for HIV infection in some Mexican cities bordering the U.S. Viani et al.15 noted that the prevalence of HIV among pregnant women giving birth at Tijuana General Hospital rose from 0.29% in 1998 to 1.02% in 2001 and, in a subsequent study, showed that pregnant HIV-infected women were more likely to either inject drugs or to have a spouse/partner who injected drugs.16 In 2002, Valdez et al.17 reported that 21% of female sex workers in Ciudad (Cd.) Juárez injected illicit drugs, whereas a study by Patterson et al.18 in 2005 showed that over half of female sex workers in Cd. Juárez injected drugs, suggesting increasing overlap between sexual and IDU networks.

Overlap between injection drug use and the trade of sex for money or drugs may contribute to elevated risk of STIs other than HIV, such as syphilis. Syphilis has been associated with higher HIV seroprevalence in a number of populations and is considered a cofactor of HIV transmission.19–22 In contrast to HIV, syphilis has been present in Mexico since at least the time of the Spanish invasion; however, the number of reported cases has decreased from 40,607 in 1945 (190.5 per 100,000) to 2,608 in 1990 (3.2 per 100,000).23

Aggregate figures for syphilis prevalence belie the sub-epidemics occurring within specific risk groups. Several studies have been conducted with female sex workers in Mexico and have found varying syphilis prevalence levels. In 1990, 23.7% of 1,386 sex workers in four Mexican states had a reactive syphilis test.24 In 1993, testing of 826 sex workers in Mexico City showed an overall prevalence of 6.4%, with different syphilis rates associated with different patterns of sex work: 1.3% for massage parlor workers, 4.4% for bar girls, and 9.6% for streetwalkers.25 The prevalence of syphilis among 3,100 female sex workers tested at an AIDS clinic during 1992 and 1993 was 8.2%.26 In contrast, syphilis prevalence was low (2.3 and 1.1%) among gynecological outpatients in two Mexican cities between the years 1994 and 1995.27 However, little is known about syphilis prevalence in IDU populations in Mexico.

To estimate the prevalence of HIV, HCV, and syphilis among IDUs, we conducted a cross-sectional study of IDUs in the border cities of Tijuana and Cd. Juárez, Mexico. Both cities are located on major drug trafficking routes and have large IDU populations (c. 6,000), with a similar sex ratio among the IDUs (c. 80% male).3 As stigma surrounding injection drug use makes it difficult to obtain a representative sample of injection drug users, we recruited individuals using respondent-driven sampling (RDS).28,29 By collecting data on individuals’ personal network sizes, RDS attempts to correct for biases in the sampling process, in order to obtain unbiased estimates of parameters such as the prevalence of a disease. In this study, we report on patterns of recruitment and the prevalence of HIV, HCV, and syphilis (and hepatitis B infection, for Cd. Juárez) in the context of sexual risk.

Materials and Methods

Study Population

From February through April 2005, IDUs were enrolled in a cross-sectional study in Tijuana and Cd. Juárez, Mexico. Eligibility criteria for the study included: having injected illicit drugs within the past month, confirmed by inspection of injection stigmata (‘track marks’); aged 18 years or older; willing and able to provide informed consent; and not having been previously interviewed for the study. Subjects gave their written informed consent to participate in the study. Study methods were approved by the Institutional Review Board of the University of California, San Diego and the Ethics Board of the Tijuana General Hospital, which has one of the few federal-wide assurances in Mexico.


RDS methods were used to recruit participants.28,29 A diverse group of “seeds” (heterogeneous in age, gender, and geographic location) was selected to initiate the process. After providing informed consent, seeds underwent an interview, were educated on how to refer other eligible IDUs, and were given three uniquely coded coupons to refer their peers. Coupons were given to participants until approximately 150 participants had been recruited, in order to obtain a target sample size of approximately 200 per site.

Each coupon was printed with the study name, the locations where recruits could participate, and a brief explanation. In Cd. Juárez, interviews were conducted at a clinic run by Programa Compañeros, A.C., which is a trusted and well-respected non-governmental organization (NGO) that has been providing services to and conducting studies with IDUs in the city for decades. In Tijuana, staff from both COMUSIDA, the municipal HIV/AIDS program, and the Centro de Integración y Recuperación para Enfermos de Alcoholismo y Drogadicción “Mario Camacho Espíritu”, A.C. (CIRAD), an NGO that began working with drug users in 1991, made weekly trips to three geographically diverse ‘colonias’ (i.e., neighborhoods) in the city: Zona Norte, Grupo México, and Sepanal, using a modified recreational vehicle that operated as a mobile clinic (the ‘Prevemovihl’).

Monetary reimbursements were given to participants to cover transportation costs and to compensate them for their time. The study staff in each site proposed the incentive levels based on their experience with this population and the incentives for previous studies. Participants in Cd. Juárez received $20 U.S. dollars (USD) for participation in the baseline visit and $5 USD when receiving laboratory test results at a one month follow-up visit. In Tijuana, $10 USD was given at baseline and $5 for the follow-up visit. In addition, participants at both sites were given $5 for each eligible person they recruited. These levels were not regarded as high.

Data Collection

Upon enrollment, trained staff administered quantitative surveys eliciting information on topics such as socio-economic and demographic profiles, drug use practices, sterile syringe access, barriers to sterile syringe use, experience with drug abuse treatment and incarceration history, health status, and HIV knowledge and testing history. We also asked about sexual behaviors and condom use with regular, casual and client partners of the opposite and same sex. Questions pertained to lifetime risk behaviors and those occurring in the prior six months.

For RDS purposes, we measured network size using the question “En los últimos 6 meses, ¿cuántas personas conoce de nombre o de apodo que se han inyectado drogas?” (“In the past 6 months, how many people do you know by name or street name who have injected drugs?”). To determine the relationship between recruiter and recruitee, we asked “¿Cuál es su relación con la persona que le entregó el cupón?” (“What is your relationship to the person who gave you the coupon?”). Participants were given the choice of: “parientes” (relative); “pareja sexual” (sex partner/spouse); “amigo(a)” (friend); “conocido” (acquaintance); “desconocido” (stranger); and “otro” (other). To determine the size of individuals’ networks with respect to injection drug use, we asked “En los últimos 6 meses, ¿con cuántas personas diferentes acostumbra inyectarse?” (“In the last 6 months, on average how many different people did you usually inject with?”). After the interview, blood was drawn for antibody testing of HIV, HCV, HBV (Cd. Juárez only), and syphilis. Pre- and post-test counseling, and referral to treatment where indicated, were provided to all participants.

Laboratory Samples

Blood samples were obtained by venipuncture and serum was stored at the municipal health clinic in Tijuana or Cd. Juárez before being shipped frozen to the San Diego County public health laboratory or New Mexico State Laboratory, respectively. All participants were screened on-site in Mexico for HIV with the Determine rapid test (Abbott Laboratories). For the Tijuana samples, in the event of an HIV-positive or indeterminate test, results were confirmed with a Western blot, HIV enzyme immunoassay (EIA), and HIV immunofluorescence assay. For samples from Cd. Juárez, the HIV EIA was conducted on all samples, and a confirmatory Western blot was performed on positive or indeterminate samples. Cd. Juárez samples were also tested for hepatitis B antigen (Genetic Systems HBsAg EIA 3.0, Bio-Rad Laboratories) and antibody (DiaSorin ETI-AB-COREK PLUS). All samples were tested for syphilis with the rapid plasma reagin (RPR) test (Macro-Vue, Becton Dickinson) and, if reactive, confirmed by a Treponema pallidum particle agglutination assay (TPPA; Fujirebio Diagnostics).

Statistical Methods

Obtaining estimates of population proportions of groups using RDS involves combining three kinds of data: the sample proportion of each group, the crosstabulation of groups between pairs of recruiters and recruitees, and differences in network size between groups. To estimate equilibrium proportions of different groups, and to estimate the pattern of mixing between groups, we assumed that the recruitment process followed a first-order Markov process.28,29 Under this model, the relationship between the state of the recruiter and recruitee can be modeled using log-linear models applied to a two-way table of counts.30,31 We classified individuals by sex and syphilis seropositivity and fitted a series of hierarchical log-linear models of increasing complexity to the data to determine patterns of nonrandom mixing between groups along each recruitment tree, choosing the best model as that which had the lowest value of Akaike’s Information Criterion.32 For the purposes of analysis, we considered all individuals with positive syphilis tests based upon RPR, and did not classify individuals further into those with TPPA titers greater than or equal to 1:8 (who may represent infectious cases) and those with titers of 1:1 to 1:4 (who may represent past infection).
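The model comparison described above can be illustrated with a minimal sketch. It compares only two models for a recruiter-by-recruitee table (independence versus saturated) using a Poisson log-likelihood and AIC; the study itself fitted a larger series of hierarchical models in R. The function names and the 2 × 2 table below are hypothetical and are not study data.

```python
import math

def independence_fit(table):
    """Expected counts under independence of recruiter and recruitee state."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]

def poisson_loglik(observed, expected):
    """Poisson log-likelihood, omitting the log(count!) term shared by all models."""
    total = 0.0
    for row_o, row_e in zip(observed, expected):
        for o, e in zip(row_o, row_e):
            if e > 0:  # cells with zero expected count contribute nothing
                total += o * math.log(e) - e
    return total

def aic(observed, expected, n_params):
    """Akaike's Information Criterion: lower values indicate better fit."""
    return -2.0 * poisson_loglik(observed, expected) + 2 * n_params

# Hypothetical counts: rows = recruiter state, columns = recruitee state.
table = [[30, 10], [20, 40]]
n_rows, n_cols = len(table), len(table[0])

expected_indep = independence_fit(table)
aic_indep = aic(table, expected_indep, 1 + (n_rows - 1) + (n_cols - 1))
aic_saturated = aic(table, table, n_rows * n_cols)  # fitted counts equal observed
```

In this illustrative table the saturated model has the lower AIC, which would point to an association between the state of the recruiter and that of the recruitee.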

To derive RDS-corrected estimates of syphilis seropositivity in men and women in Tijuana and Ciudad Juárez, we estimated recruitment weights for each group (as the ratio of the equilibrium to sample proportions of each group). We estimated the equilibrium fraction as previously described.29 We used both raw counts and predicted counts based on the best fitting log-linear model. Degree weights were estimated using linear least squares.29 We used both unadjusted and adjusted estimates of personal network size.33 An overall sampling weight was derived for each group, from which population-level estimates were obtained.
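As a rough illustration of the recruitment weights, the sketch below computes the equilibrium distribution of a first-order Markov chain from a recruiter-by-recruitee table by power iteration, then divides by the sample proportions. It assumes every group recruited at least one person; the function names, the table, and the sample proportions are hypothetical and do not reproduce the authors' Maxima code or the study data.

```python
def equilibrium(counts, iters=1000):
    """Equilibrium group proportions of a first-order Markov recruitment process.

    counts[i][j] is the number of recruiters in group i whose recruit was in group j.
    Assumes every row has at least one recruitment.
    """
    n = len(counts)
    # Row-normalise the counts into transition probabilities.
    trans = [[c / sum(row) for c in row] for row in counts]
    # Power iteration starting from a uniform distribution.
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return dist

def recruitment_weights(counts, sample_props):
    """Ratio of equilibrium to sample proportion for each group."""
    return [e / s for e, s in zip(equilibrium(counts), sample_props)]

# Hypothetical two-group example (not study data).
counts = [[30, 10], [20, 40]]
sample_props = [0.45, 0.55]

eq = equilibrium(counts)
weights = recruitment_weights(counts, sample_props)
```

A weight above 1 indicates a group that is under-represented in the sample relative to equilibrium, and a weight below 1 indicates over-representation.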

Pre-processing of the data was performed using Stata v. 8.2 (Stata Corporation, College Station, TX). Networks and trees were generated using scripts written in Python and visualized using GraphViz (AT&T Research, Florham Park, NJ). Statistical analyses and summary statistics of the recruitment network were generated in R,34 and RDS-based corrections were calculated using Maxima. We chose to develop our own programs rather than use RDSAT, primarily to familiarize ourselves with the statistical theory underlying RDS-based corrections. All code is available from the first author on request.


Results

Study Population

Table 1 summarizes some basic data relating to the Tijuana (15 seeds, 207 recruits) and Cd. Juárez (9 seeds, 197 recruits) study populations. Both populations were predominantly male, with participants in their early to mid-30s. Crude HCV seroprevalence was extremely high (>95%) in both cities. Hepatitis B seroprevalence was only determined for Cd. Juárez, where it was high (85% overall); only one individual was positive for HBV antigen. Crude HIV seroprevalence was low, but the crude prevalence of syphilis was high, especially among women.

Table 1. Summary statistics of age, parameters pertaining to risk of STI, and seroprevalence of HIV, HCV and syphilis, by city, sex, and by whether individuals were seeds, or recruits

Recruitment Dynamics

RDS was an effective means of recruiting IDUs in both cities. The number of individuals recruited increased rapidly following the first interview, especially in Cd. Juárez (Figure 1a), where many individuals were interviewed on the same day as their recruiter (Figure 1b). Apart from these differences in the tempo of recruitment, patterns of recruitment were very similar between the two cities; recruitment was highest in the fourth wave of recruitment, with some individuals being recruited after eight waves, suggesting that despite rapid recruitment, good sociometric depth was obtained (Figure 1c). After excluding individuals who were not given coupons, the number of recruits per recruiter showed a bimodal distribution, with many individuals recruiting either zero or three individuals (Figure 1d), suggesting the presence of a mixed population of ineffective and effective recruiters. In both cities, approximately one half of participants were recruited via referral trees originating from two seeds (Figure 1e). The relationship between recruiter and recruit was usually ‘friend,’ ‘acquaintance,’ or another close relationship such as a family member or a sex partner (Figure 1f), which is important as RDS-based estimates assume that these relationships are reciprocal.28,29

Figure 1

Summary of the dynamics of recruitment, by city. (a) The cumulative number of recruits over time. (b) The interval between the interview of the recruiter and that of their recruitee (omitting those individuals who did not recruit). (c) The number of recruits in each recruitment wave from the seed. (d) The number of recruits per recruiter (excluding individuals who were not given any coupons). (e) The number of recruits from each seed. (f) The relationship between recruiter and recruitee.

Recruitment Trees

Figure 2 shows a ‘forest’ of recruitment trees for each city, with the syphilis antibody status indicated by shading and the gender of each individual indicated by different symbols. This figure illustrates that the sample prevalence of syphilis was higher in Tijuana than in Cd. Juárez and that syphilis cases appeared to cluster in the recruitment trees. Although there was a low frequency of women in the sample, a disproportionate number of women also had syphilis.

Figure 2

Recruitment networks (strictly speaking, a forest of recruitment trees) for the RDS based samples of IDUs in (a) Tijuana and (b) Ciudad Juárez. Seeds are shown at the top of the figure, and arrows indicate the direction of recruitment. Syphilis serostatus is shown by shading: black- syphilis antibody positive, white- syphilis antibody negative, gray- missing data. The gender of participants is indicated by the shape of the symbol: square for female and circle for male. The size of the symbol is related to the reported network size: the larger the symbol, the larger the network size. Symbols marked with an ‘×’ denote individuals who were given coupons, but did not recruit.

To test whether syphilis prevalence differed by sex and whether cases of syphilis were clustered, we analyzed the relationship between the sex and syphilis status of the recruiter and recruitee in Tijuana and Cd. Juárez (Table 2). A set of hierarchical log-linear models was used to test whether there was an association between syphilis seropositivity and sex and between the state of the recruiter and that of the recruitee, with the best fitting model chosen using Akaike’s Information Criterion (AIC),32 for which lower values indicate better fit (Table 3). This approach is functionally equivalent to calculating the RDS homophily index, for which 1 denotes a perfect association, 0 no association, and −1 a perfect negative association. We used log-linear models because they offer a solid statistical framework for comparing different models of association between the characteristics of the recruiter and those of the recruitee.

Table 2 Relationship between recruiter and recruitee in terms of sex and syphilis antibody status
Table 3. Fit of 12 log-linear models to the data shown in Table 2

For both samples, the best fitting model included a significant (positive) association between the syphilis serostatus of the recruiter and the syphilis serostatus of the recruitee and was not significantly different from a ‘saturated’ model, in which each cell in the table is modeled with a single parameter. For the Tijuana sample, there was a positive association between the sex of the recruiter and the sex of the recruit, and, independently of this association, syphilis antibody positive women were disproportionately less likely to recruit syphilis positive men than syphilis negative men.

Network Size

There was great variation in both measures of network size, with the average number of known IDUs an order of magnitude higher than the number of injecting partners. There was no correlation in network size between recruiter and recruitee, measured using either the number of IDUs known (Spearman’s rho = 0.0328 and 0.0204 for Tijuana and Cd. Juárez, respectively) or the number of injecting partners as a measure of network size (Spearman’s rho = −0.006 and −0.0121 for Tijuana and Cd. Juárez, respectively; Figure 3a, b). Network sizes were similar when the sample was grouped by syphilis serostatus and sex (Figure 3c, d). As chain referral samples are biased towards individuals with larger network sizes, we adjusted the distribution of network size by weighting the distribution of network sizes by the inverse of the network size.33 This reweighting led to a significant drop in estimated network size, from a median of 25–50 to 4–5 in Tijuana and from 15–20 to 8–10 in Cd. Juárez (Figure 3e).
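The inverse-network-size reweighting can be sketched as follows. Each reported network size d is given weight 1/d, so the adjusted mean reduces to the harmonic mean and the adjusted median is a weighted median. Dropping reports of zero is an assumption made here to keep the weights finite; the function names and the example values are hypothetical and are not study data.

```python
def adjusted_mean_degree(degrees):
    """Mean network size after weighting each report d by 1/d (harmonic mean)."""
    positive = [d for d in degrees if d > 0]  # assumption: zero reports are dropped
    return len(positive) / sum(1.0 / d for d in positive)

def adjusted_median_degree(degrees):
    """Weighted median of network size, with weight 1/d for each report d."""
    positive = sorted(d for d in degrees if d > 0)
    weights = [1.0 / d for d in positive]
    half = sum(weights) / 2.0
    running = 0.0
    for d, w in zip(positive, weights):
        running += w
        if running >= half:
            return d
    return None  # only reached when there are no positive reports

# Hypothetical reported network sizes (not study data).
degrees = [2, 4, 4, 10, 50, 100]

raw_mean = sum(degrees) / len(degrees)
adj_mean = adjusted_mean_degree(degrees)
adj_median = adjusted_median_degree(degrees)
```

Because large network sizes receive small weights, both adjusted summaries fall well below their unweighted counterparts, which is the direction of change reported above.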

Figure 3

Summary of the network size distribution. (a) Scatterplot of the number of IDUs known by name or street name between recruiter and recruitee. (b) Scatterplot of the number of injecting partners between recruiter and recruitee. (c) Boxplot of the number of IDUs known by sex (M male, F female) and by syphilis antibody status (+ positive, − negative). (d) Boxplot of the number of injecting partners by sex (M male, F female) and by syphilis antibody status (+ positive, − negative). (e) Cumulative distribution of the number of IDUs known before (solid line) and after (dashed line) adjusting for biased sampling of individuals.

Correcting for Sampling Bias

Information on network size, and on who recruited whom, collected as part of RDS, allows population estimates to be generated from the sample, despite biases in the sampling process. To do so, we need to estimate the pattern of mixing between different groups along the recruitment network and to determine how much network size differs between groups. The seroprevalence of HIV was too low, and the seroprevalence of HCV too high, to obtain meaningful correction factors; hence we concentrated on obtaining population estimates of the prevalence of syphilis antibody in men and women in the two cities.

To obtain unbiased population estimates of the prevalence of syphilis antibody, we used a poststratification process, in which we calculated recruitment weights and degree weights, which can be combined to give an overall sampling weight.46 As RDS is a chain referral method, the prevalence of syphilis in the sample may have been different if recruitment had continued for further waves. Assuming that recruitment follows a first-order Markov process, the ‘equilibrium prevalence’ of syphilis can be calculated from the cross-tabulations of the syphilis status of the recruiter and the recruitee. Using these estimates, we calculated recruitment weights as the ratio of the equilibrium to sample frequencies of each group (Table 4); the closer these weights are to 1, the more representative the sampling of the group. For the Tijuana sample, recruitment weights were close to 1 for three of the four groups; the exception was syphilis negative women, whose recruitment weight was 0.87 (i.e., this group was oversampled). For the Cd. Juárez sample, recruitment weights were close to 1, with the exception of syphilis positive women (1.25), who were undersampled.

Table 4. RDS-corrected estimates of syphilis seroprevalence, using the raw transition data, and adjusted degrees

Samples obtained using RDS can also be biased due to differences in network size between the groups. To compensate for this effect, we calculated degree weights, based upon the reported network size of known IDUs using the adjusted mean estimates of network size33 in each group (Figure 3f). After calculating degree weights for each group, and multiplying them by recruitment weights to generate an overall sampling weight, the RDS corrected estimates of syphilis antibody prevalence were higher than those in the overall sample for both cities (Table 4).
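A minimal sketch of how the two weights combine is given below. It uses one common formulation, in which the population proportion of a group is proportional to its equilibrium proportion divided by its mean network size; this may differ in detail from the calculation used in this study. All names and numbers are hypothetical and are not study data.

```python
def population_estimates(equilibrium_props, mean_degrees):
    """Population proportions, proportional to equilibrium proportion / mean degree."""
    raw = [e / d for e, d in zip(equilibrium_props, mean_degrees)]
    total = sum(raw)
    return [r / total for r in raw]

def sampling_weights(sample_props, equilibrium_props, mean_degrees):
    """Return recruitment weights, degree weights, and overall sampling weights."""
    pop = population_estimates(equilibrium_props, mean_degrees)
    recruitment = [e / c for e, c in zip(equilibrium_props, sample_props)]
    degree = [p / e for p, e in zip(pop, equilibrium_props)]
    overall = [r * g for r, g in zip(recruitment, degree)]
    return recruitment, degree, overall

# Hypothetical two-group example (not study data).
sample_props = [0.5, 0.5]       # proportions observed in the sample
equilibrium_props = [0.6, 0.4]  # equilibrium proportions from the Markov model
mean_degrees = [10.0, 5.0]      # adjusted mean network size per group

pop = population_estimates(equilibrium_props, mean_degrees)
recruitment, degree, overall = sampling_weights(
    sample_props, equilibrium_props, mean_degrees
)
```

Multiplying each group's sample proportion by its overall sampling weight recovers the population estimate, so a group with a small mean network size is weighted upward even when it is over-represented at equilibrium.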

Sensitivity of RDS Estimates

To determine the sensitivity of point estimates of the prevalence of syphilis antibody to modeling assumptions, we also obtained estimates using unadjusted rather than adjusted network sizes and using a ‘smoothed’ transition matrix based on the best fitting log-linear model, rather than the raw counts. We found that RDS-based estimates were highly sensitive to these assumptions (Table 5). Estimated syphilis seroprevalence ranged from 12.4 to 26.8% in Tijuana and from 2.9 to 15.6% in Cd. Juárez, depending on how the pattern of recruitment was modeled and how reported network size was assumed to affect an individual’s probability of being included in the sample. However, our results suggest that syphilis seroprevalence is higher among women than men and higher in Tijuana than in Cd. Juárez and that sample proportions of syphilis using RDS in these populations may be underestimates of the true population seroprevalence.

Table 5. Sensitivity of the estimated population prevalence of syphilis antibody among men and women to model assumptions


Discussion

Respondent-driven sampling offers the promise of a probability sample of individuals from hidden and hard-to-reach populations. RDS was originally developed in the context of recruiting IDUs35–39 and, in our context, was an efficient method to recruit IDUs in two Mexican cities bordering the U.S. Recruitment was extremely rapid in Cd. Juárez compared to Tijuana, which may be due to greater access to the study site, higher monetary incentives, and the fact that Programa Compañeros is more established in Cd. Juárez than CIRAD is in Tijuana and had carried out previous studies with monetary remuneration. In contrast, Mueller et al.40 report much slower recruitment of IDUs in Las Cruces, NM using RDS, despite similar methodology and the same eligibility criteria.

The sample seroprevalence of HIV was relatively low in both cities. HIV-1 seroprevalence in IDUs in Tijuana recruited through RDS was similar to that found in IDUs studied by Güereña-Burgueño et al.41 in the early 1990s and in a study by Magis-Rodriguez et al.42 in 2003 that used time-location sampling methods, although the absolute number of HIV-positive cases was too low to perform reliable RDS corrections. In contrast to HIV, syphilis prevalence was extremely high, especially in women and in Tijuana.

Unlike many adaptive sampling schemes43 in which the sampling process is controlled by the investigator, RDS enables the study subjects to control the sampling process. While this facilitates the recruitment process, it makes statistical inference more difficult. We found that estimates of syphilis seroprevalence were extremely sensitive to modeling assumptions. First, as recruitments between low-frequency groups are relatively rare, estimates of the recruitment rates may be biased. Smoothing these estimates using a statistical model can lead to different estimates. Although simulations and analytical results show that RDS-based estimates are unbiased in large populations, errors in RDS based estimates may be so high for small populations and/or low frequencies of groups as to render the use of RDS impractical.33 Secondly, the estimated prevalence of syphilis was sensitive to the assumption of how inclusion probability depends on reported network size. Estimates of network size may well have been different had we asked “How many people do you currently know by name or street name that inject drugs?” Estimating group-level network sizes is compromised by high variances, the small size of some of the subpopulations, and the poor ability of individuals to estimate the size of their personal networks.44 Although RDS controls for differences in network sizes, a sampling bias long known to be inherent in chain-referral samples, it is important that this information is as accurate as possible. It might be argued that prior to the advent of RDS, there was little incentive to accurately measure relative network sizes in epidemiological studies; given that this information plays a crucial role in the post-stratification process of RDS, we encourage further research to determine how best to collect this information accurately.

RDS also has some inherent limitations in terms of inferences that can be drawn from the data45: it does not generate estimates of the absolute size of the population, only proportions, and it exploits social ties between individuals, limiting what one can conclude about sexual or drug-injecting networks from RDS data. Furthermore, without comparison of RDS to other types of sampling, we cannot conclude that obtaining a sample through RDS gives us a more representative sample than other methods. Nevertheless, RDS, or a modified version thereof, has the potential to efficiently recruit hidden populations such as IDUs, and creates avenues through which interventions can reach members of these populations. In the context of this study, prevention and treatment of syphilis is clearly an important public health concern.