AIDS and Behavior, Volume 10, Issue 4, pp 369–376

The Babel Effect: Community Linguistic Diversity and Extramarital Sex in Uganda


  • D. Bishai
    • Department of Population and Family Health Sciences, Johns Hopkins Bloomberg School of Public Health
  • Priya Patil
    • Futures Group
  • George Pariyo
    • Makerere University
  • Ken Hill
    • Department of Population and Family Health Sciences, Johns Hopkins Bloomberg School of Public Health

DOI: 10.1007/s10461-006-9097-3

Cite this article as:
Bishai, D., Patil, P., Pariyo, G. et al. AIDS Behav (2006) 10: 369. doi:10.1007/s10461-006-9097-3

We examine the association of community linguistic diversity with non-spousal sexual activity in Uganda. We conducted a survey on rates of sexual contact in the last 12 months among 1709 respondents aged 18–60 living in Uganda in early 2001. Households were selected at random from Demographic and Health Survey (DHS) 2000 household sampling frame listings in 12 districts and 120 clusters. Household listings described the principal language spoken by every household in the cluster. Sexual contact was reported by 26 vs. 13% of unmarried women in multilingual vs. monolingual clusters, respectively. Extramarital sexual contact was reported by 29 vs. 16% of married men in multilingual vs. monolingual clusters, respectively. These results were robust to multivariate models that included confounders such as urbanity and cluster distance to marketplaces, cinemas, and transportation. Our results suggest a robust association between residence in a multilingual community and higher rates of non-spousal sex.


Keywords: Uganda, community, extramarital sex


Campaigns in Uganda seek to prevent the sexual spread of HIV/AIDS with the slogan “ABC” (A for Abstinence, B for Being faithful, and C for Condoms). In Uganda, rates of abstinence, monogamous behavior, and condom use have risen (Hogle, 2002). Some credit a process of rational self-interest enabled by better information about HIV/AIDS due to media campaigns (Green, 2003). It is also possible that selective mortality pressure due to HIV itself could play a small role (Heuveline, 2004). Encouraging monogamous behavior has emerged as a top public health priority in Uganda and other countries.

Simply entering a monogamous union is not sufficiently protective against sexually transmitted diseases (STDs) and HIV. Either partner’s extramarital relationship exposes the couple to STDs and HIV, and opens the possibility for destructive romantic entanglements. In a serosurvey of 4507 linked couples in Rakai, Uganda, 22% were found to contain a partner with HIV (Porter, Hao, Bishai, and Gray, in press). In 10% of couples both partners were positive, in 5% only the female partner was positive, and in 7% only the male partner was positive (Porter et al., in press). Besides each partner’s jealous efforts to encourage mutual faithfulness, community members can and do assist in raising the stakes against those who pursue extramarital unions. Social sanctions against extramarital sexual relations range from acquiring a bad reputation to being killed (Vandello and Cohen, 2003).

Communities enforce standards of sexual conduct on their members with varying degrees of success. For people who have publicly expressed a desire to live in a monogamous union, the community’s efforts complement individual endeavors to discourage a partner from engaging in extramarital sex. Indeed, a notable benefit of investing in a public wedding with a community-sanctioned marriage contract is that it enlists community sanctions in strengthening the union, recruiting the eyes and ears of the community to help monitor a partner’s behavior.

In this paper we discuss our rationale for hypothesizing a link between individual sexual conduct and a community’s degree of ethnic heterogeneity. We will test this hypothesis by measuring ethnic heterogeneity as the number of different primary languages spoken in a community.1 We will use individual level survey data on sexual behavior collected in Uganda in 2001–2002 to assess whether men and women in multilingual communities are more likely to engage in premarital and extramarital sex compared to residents of monolingual communities. Following a diaspora which occurred under Idi Amin, Uganda has become typified by many small communities populated by between one and a dozen ethnic groups, each speaking a different language. This great heterogeneity within communities makes Uganda an ideal setting for us to identify which type of community is associated with lower rates of extramarital sex.

The story of the tower of Babel offers an early account of how language diversity can impede public cooperation (Genesis 11:1–5). Diversity of preferences based on ethnic, racial, religious, or cultural differences inhibits the development of consensus that is crucial for coordinating public activity that might benefit all groups. Evidence from the U.S. suggests that people who live in diverse communities are more likely to support measures to benefit their own ethnic group (Cutler, Elmendorf, and Zeckhauser, 1993). Further evidence of discriminatory preferences can be found in a study showing that a negative correlation between the percent elderly in a community and funding for public schools was worsened when the population of children contained more ethnic minority members than the population of elderly (Poterba, 1997). These effects of living in a heterogeneous community need to be distinguished from the well-established health effects of living in a community where one’s ethnic or racial group is a minority (Faris and Dunham, 1939; Halpern and Nazroo, 2000). Social epidemiologists have shown associations between markers of racial diversity and health, although the mechanism remains unclear. After controlling for socioeconomic status, African Americans residing as numerical minorities in racially mixed communities in the U.S. experience worse infant mortality rates (Fang, Madhavan, Bosworth, and Alderman, 1998) and worse cardiovascular disease rates (Franzini and Spears, 2003) than when they are numerically dominant. The authors are careful to note that the mechanism is unclear and may relate to social stress from residence in a minority community. They note that policies should emphasize the reduction of overall discrimination rather than promoting residential segregation (Franzini and Spears, 2003).

There are several theoretical reasons to hypothesize a link between ethnic diversity and higher rates of nonmarital sexual contact. The simplest reason is instrumental and related to the technical aspects of achieving the social cooperation required to enforce social norms. A community with a common language is more capable of spreading the news, and the disapproval, about so-and-so having nonmarital contact. Just as common language is instrumental in building towers, it is also instrumental in linking sexual behaviors to social censure. A second reason that ethnic diversity might weaken adherence to sexual norms is that it could erode community consensus and confidence that standards of behavior are absolute. If group A is somewhat more lenient about extramarital contact than group B, the groups’ proximity might over time weaken the resolve of individuals in either group to view their group’s standards as absolute. This adherence effect is distinct from the well-described process whereby social contact eventually changes the social norms themselves (Montgomery and Casterline, 1996). A final reason that ethnic diversity might lead to higher rates of extramarital sexual contact would be suggested if there were evidence that the sexual contact was occurring across groups. In this case one might appeal to the “Coolidge effect” in which the availability of a variety of partners increases sexual appetite (Francoeur, Perper, Scherzer, Sellmer, and Cornog, 1991; Wilson, Kuehn, and Beach, 1963). Although there are many reasons to suspect that community ethnic diversity is related to extramarital sex, this initial exploration was designed to assess whether or not there is an association. Later studies may be able to determine which of the above mechanisms are most important.

Data from Africa also suggest that diversity can impede cooperation. Cross-country data indicate that language diversity has a negative correlation with economic development indicators such as paved roads, developed electricity grids, and telephone service (Easterly and Levine, 1997). Household data from Ghana show that reciprocal financial support pacts discriminate along the lines of kinship, which are carried vertically to subsequent generations (La Ferrara, 2003).

There has been surprisingly little attention to the social and community factors that motivate some individuals to seek extramarital sexual relationships. Poverty is cited as a force driving the demand and supply of commercial sex work leading to the spread of AIDS (Basu, 1998). Other community characteristics are seldom explored. One recent exception is a study of the relationship between community physical and economic development and the prevalence of HIV in Rakai, Uganda (Patil, 2003). There have not been prior studies of how ethnic diversity in a community may be associated with sexual risk taking.


A survey on HIV-related attitudes and behavior was conducted in 1726 households in 12 districts in Uganda in early 2001. The districts surveyed span the East, North, West, and South of the country. Kampala was the only urban area surveyed. More details on the survey design are available elsewhere (Bishai, Pariyo, Ainsworth, and Hill, 2004). In 1999, the Uganda Bureau of Statistics (UBOS) prepared a sampling framework in every district of the country in preparation for the Demographic and Health Survey (DHS 2000). The sampling framework listed the principal language spoken in every household in all of the clusters where DHS interviewers would be deployed so that the proper translations would be available. We contractually obtained this sampling framework for 12 districts and computed the number of different languages spoken and the total number of households in each DHS cluster where our own original survey on sexual behavior and AIDS vaccine demand would be deployed. Our survey was carried out 12 months after the DHS survey left the field. Clusters contain an average of 200 households. UBOS forms statistical clusters for logistical convenience, not necessarily to reflect concepts like community or neighborhood. Nonetheless, most of the time a cluster turns out to conform to a village, settlement, or neighborhood. Inasmuch as community is mismeasured, the error would bias our analysis towards finding no effect.
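The cluster-level language measure described above can be illustrated with a minimal sketch. It assumes the household listing is available as (cluster, household, principal language) records; the function name, field layout, and example rows are hypothetical and are not taken from the UBOS files.

```python
from collections import defaultdict


def summarize_clusters(listing):
    """Count households and distinct principal languages per cluster.

    `listing` is an iterable of (cluster_id, household_id, language) tuples.
    This record layout is an assumption made for illustration only.
    """
    languages = defaultdict(set)
    households = defaultdict(int)
    for cluster_id, _household_id, language in listing:
        languages[cluster_id].add(language)
        households[cluster_id] += 1
    return {
        cluster_id: {
            "n_households": households[cluster_id],
            "n_languages": len(languages[cluster_id]),
            # A cluster is "monolingual" when every listed household
            # reports the same principal language.
            "monolingual": len(languages[cluster_id]) == 1,
        }
        for cluster_id in households
    }


# Made-up listing, for illustration only.
listing = [
    ("c1", 1, "Luganda"), ("c1", 2, "Luganda"),
    ("c2", 1, "Lusoga"), ("c2", 2, "Runyankole"), ("c2", 3, "Lusoga"),
]
summary = summarize_clusters(listing)
```

The resulting `monolingual` flag corresponds to the "only one language cited in cluster" indicator used in the regression tables.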

We randomly sampled 13 households from each UBOS cluster to conduct our original survey. Eligible respondents were men and women aged 18–60. All household adults in this age range (up to a maximum of three per household) were interviewed by trained interviewers of the same sex as their respondents and in the respondents’ own language. Each respondent was asked, “In the past 12 months have you had sexual intercourse with a long term (boy/girl) friend or lover who is not your spouse?” In subsequent questions the term “long term (boy/girl) friend or lover” was replaced with “short-term (male/female) friend or a casual (male/female) friend.” Men were also asked if they had had sex in the last 12 months with a sex worker.
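The selection step described above can be sketched roughly as follows. The function names and data layout are hypothetical. The rule for choosing among more than three eligible adults is not stated in the text, so taking the first three listed is purely an assumption.

```python
import random


def sample_households(cluster_households, k=13, seed=None):
    """Draw up to k households at random, without replacement, per cluster."""
    rng = random.Random(seed)
    return {
        cluster_id: rng.sample(households, min(k, len(households)))
        for cluster_id, households in cluster_households.items()
    }


def eligible_respondents(members, min_age=18, max_age=60, cap=3):
    """Keep adults aged 18-60, at most `cap` per household.

    Taking the first `cap` listed adults is an illustrative assumption;
    the source does not say how the three were chosen.
    """
    adults = [m for m in members if min_age <= m["age"] <= max_age]
    return adults[:cap]


# Made-up sampling frame: one cluster of 200 households and one small cluster.
frame = {"c1": list(range(200)), "c2": list(range(10))}
sampled = sample_households(frame, seed=0)

household = [{"age": 17}, {"age": 25}, {"age": 40}, {"age": 59}, {"age": 33}, {"age": 70}]
respondents = eligible_respondents(household)
```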

For unmarried persons the failure to be sexually abstinent is the outcome of interest. For married persons we focus on having extramarital partnerships. After tabulation of rates of these outcomes by number of languages spoken in a cluster, we ran multivariate logistic regression models of our outcomes to control for respondent education, wealth, and community variables that might confound the measure of language heterogeneity.

Behavior is measured at an individual level, but language heterogeneity is measured at the community level. However, important individual determinants of behavior such as age, education, and sex would be lost if the analysis took place entirely at the community level. Fortunately, it is possible to adjust for the multi-level nature of the data by applying generalized estimating equations that allow for a random effect at the level of each cluster. This technique ensures that the standard errors are adjusted for correlation at the level of the cluster. In addition, district level (as opposed to cluster) dummies are used in every regression model to adjust for regional effects that could be a potential confounder. For instance, if language heterogeneity is more common in district A, which also happens to have intensive social marketing of AIDS prevention strategies, the dummy variables could correct this type of confounding. The proportion of the total variance contributed by the cluster level residual can be measured by the “rho” statistic, which we tabulate for each analysis. If rho=0, one may conclude that remaining cluster level contributions to the unexplained variation in the outcome are minimal.
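For readers unfamiliar with the “rho” statistic, one common definition for a random-intercept logistic model is sketched below. The text does not state which formula was used, so the model form and the symbols $x_{ij}$, $\beta$, $u_j$, and $\sigma_u^2$ are assumptions introduced for illustration, with $i$ indexing respondents and $j$ indexing clusters.

```latex
\operatorname{logit}\Pr(y_{ij}=1) = x_{ij}'\beta + u_j, \qquad u_j \sim N(0,\sigma_u^2)

\rho = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}
```

Here $\pi^2/3$ is the variance of the standard logistic distribution, so under this definition $\rho$ approaches 0 as the cluster-level variance $\sigma_u^2$ shrinks, which matches the interpretation given above.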

Clearly urban areas are more likely to have multiple languages and to simply offer higher population density and environments that enable extramarital sex. We use multivariate methods incorporating a dummy variable for urban areas. Because a single dummy variable for urbanicity may not fully capture all of the unobservable features that could be confounding the relationship between multi-lingual community and extramarital sex, we also estimated other models incorporating a wide variety of community level variables. If the effects of multilingual community are eroded or lost as community level variables are successively added to the model, that would suggest that the relationship is confounded.


The total sample consisted of 1726 respondents of whom 1227 were married. The mean age of married and unmarried respondents was 33.5 years (SD=10.09) and 29.95 years respectively. In 57 of the 120 clusters studied only one language was spoken by all of the households listed by the 2000 DHS sampling frame—this language was Luganda or Lusoga in 26 monolinguistic clusters. The analytical sample was reduced to 1709 because 16 married and one unmarried individual refused to report on their sexual behavior.

For unmarried women, rates of sexual activity were significantly higher in multilingual clusters (26%) than in monolingual clusters (13%) (Z=3.01, p < 0.01). Unmarried men showed no significant difference in rates of sexual activity by community type, with sexual contact reported by 60% in multilingual and 53% in monolingual clusters.

For married women, rates of sexual activity showed no significant difference by cluster type with sexual contact reported by 4% in monolingual and 5% in multilingual clusters. But rates of extramarital contact were significantly higher for men with 29% in multilingual clusters compared to 16% in monolingual clusters (Z = 3.59, p < 0.01).
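Comparisons of this kind are typically made with a two-proportion z-test. The sketch below shows the pooled version under the assumption that this is the test behind the reported Z values, which the text does not confirm. The group sizes are not reported here, so the counts in the example are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts chosen only to give 26% vs. 13%; the real
# denominators are not given in the text.
z, p_value = two_proportion_z(52, 200, 26, 200)
```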

Although the elevated risk in multilingual clusters appears to affect married men and unmarried women, we did not ask specifically about the marital status of the partners of our respondents so we cannot confirm whether married men are having extramarital sex with primarily unmarried women. Our survey does show that less than 1% of married men reported having sex with sex workers. The median number of extramarital partners in the last 12 months for men who had extramarital relationships was 4 with a maximum of 20. Sexually active unmarried women were most likely to have long term partners only (15%) followed by 10% with both long and short term partners and 2% with short term partners only. Sexually active unmarried women reported a median of 3 partners in 12 months with a maximum of 5. Married women who had extramarital sex showed a higher affinity than single women for long term partners, with 24 of the 30 women who had extramarital sex reporting involvement with a long term partner. The proportion reporting that they always used a condom while having sex with someone other than a spouse never exceeded 20% whether respondents were male or female, married or unmarried.

Table I reports odds ratios of non-spousal sex from multivariate logistic models. For unmarried persons the association of monolingual communities with lower rates of sexual contact was robust to the inclusion of confounders such as cluster size, urban residence, and the proportional representation of speakers of each of eight major languages in Uganda. This effect was statistically significant only for unmarried women and married men. Income and education had variable effects on the odds of extramarital sex, which only achieved significance in the largest sample pooling married men and women. Here income reduced the odds of non-marital sex, while education increased the odds. An additional year of age lowered the odds of having sex for unmarried individuals of both sexes and lowered the odds of non-spousal sex for married males. Cluster size, which might indicate the number of potential partners and informants, had no effect on the odds of non-spousal sex. A separate analysis (not shown) repeated the regression for a sample of only rural residents and found that the association between monolingual clusters and lower rates of sexual contact remained statistically robust and of similar magnitude for unmarried women and married men.
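Because the tables report odds ratios with z-statistics rather than confidence intervals, an approximate interval can be recovered when z is the usual Wald statistic (coefficient divided by its standard error). The sketch below is an illustration under that assumption. The helper name is hypothetical, and rounding in the published figures makes the result approximate.

```python
import math


def odds_ratio_ci(odds_ratio, z, crit=1.96):
    """Approximate 95% CI for an odds ratio from its Wald z-statistic."""
    beta = math.log(odds_ratio)   # logit coefficient
    se = abs(beta / z)            # implied standard error, assuming z = beta / se
    return math.exp(beta - crit * se), math.exp(beta + crit * se)


# Monolingual-cluster estimate reported as 0.391 (z = -2.93) in Table I.
low, high = odds_ratio_ci(0.391, -2.93)
```

For this estimate the interval works out to roughly 0.21 to 0.73, which excludes 1, consistent with the significance level marked in the table.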
Table I. Random Effects Logistic Regression Presenting Odds Ratios of Extramarital Sex, Estimated Using Generalized Estimating Equations with Random Effects at Cluster Levelᵃ

Columns 1–3: unmarried persons (having sex, if unmarried). Columns 4–6: married persons (sex with non-spouse, if married).

| Variable | Unmarried: both sexes | Unmarried: men | Unmarried: women | Married: both sexes | Married: men | Married: women |
|---|---|---|---|---|---|---|
| Only one language cited in cluster | 0.619 (−1.49) | 0.536 (−0.90) | 0.454 (−1.73)* | 0.483 (−2.85)*** | 0.391 (−2.93)*** | 0.725 (−0.56) |
| Imputed log incomeᵇ | 1.136 (1.14) | 0.95 (−0.24) | 1.008 (0.04) | 0.711 (−2.77)*** | 0.844 (−1.08) | 0.65 (−1.43) |
| Education levelᶜ | 0.999 (−0.01) | 1.164 (0.49) | 0.78 (−1.10) | 1.536 (2.94)*** | 1.08 (−0.41) | 1.132 (0.33) |
| Number of children in household | 1.373 (2.16)** | 1.292 (0.79) | 0.771 (−1.24) | 1.06 (0.40) | 0.979 (−0.12) | 1.226 (0.49) |
| Age in years | 0.954 (−4.08)*** | 0.939 (−2.70)*** | 0.969 (−2.08)** | 0.997 (−0.34) | 0.947 (−3.48)*** | 0.969 (−1.17) |
| Total number of HH in cluster | 1 (−0.18) | 0.998 (−0.59) | 1.001 (0.27) | 1 (−0.11) | 0.999 (−0.46) | 0.998 (−0.70) |
| Number of people in household | 0.715 (−2.88)*** | 0.816 (−0.77) | 1.078 (0.45) | 0.902 (−0.83) | 1.1 (0.63) | 0.661 (−1.08) |

Note. z-statistics for logit coefficients in parentheses, adjusted for clustering using random effects model.

ᵃ All regressions also included district dummies, proportion speaking each of eight different languages, and urban dummy.

ᵇ Imputed log income for each household was established by regressing log household income upon asset categories, education, occupation, and district dummies in the Uganda National Household Survey (UNHS) of 2000, and then applying these out-of-sample coefficients to the same covariates in our sample. Our survey was designed to use asset categories and educational categories that exactly matched the UNHS, to enable this imputation.

ᶜ Education level coded on a 0–4 scale. 0=none, 1=primary, 2=secondary, 3=vocational, 4=university.

*Significant at 10%; **significant at 5%; ***significant at 1%.

To test the robustness of the language heterogeneity variable to possible other confounders we included a full set of all of the cluster level data on distances collected on the DHS cluster lists: distances to a primary and secondary school, post office, market, cinema, well, bus station, and urban area. This sensitivity analysis is shown in Tables II and III. Few of these other community variables were significant in the multivariate specification, but number of languages remained a significant risk factor for non-marital sex among men despite the inclusion of these additional variables. The effect of number of languages on the risk of non-spousal sex for unmarried women was robust to the inclusion of all other cluster data except distance to an urban area, although it remained relatively unchanged by the inclusion of all other cluster distance measures.
Table II. Multivariate Determinants of Extramarital Sex Among Married Men: Sensitivity of Results to Other Cluster Level Variablesᵃ

Odds ratiosᵇ of sexual activity by married men

| Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| Only one language cited in cluster | 0.407 (−2.28)** | 0.347 (−3.08)*** | 0.355 (−3.02)*** | 0.368 (−2.98)*** | 0.388 (−2.96)*** |
| Distance to primary school (km) | 1.037 (0.50) | 1.044 (0.67) | 1.048 (0.71) | 1.05 (0.74) | 1.019 (0.31) |
| Distance to secondary school (km) | 0.983 (0.59) | 0.971 (−1.00) | 0.961 (−1.43) | 0.972 (−1.04) | 0.984 (−0.66) |
| Distance to post office (km) | 1.018 (1.00) | 1.02 (1.12) | 1.008 (0.67) | 1.013 (−1.13) | |
| Distance to cinema (km) | −0.99 (−0.70) | 0.999 (−0.05) | 1.002 (0.16) | 0.998 (−0.17) | |
| Distance to well (km) | 1.01 (0.22) | 1.008 (0.18) | 0.979 (−0.49) | | |
| Distance to traditional healer (km) | 1.02 (0.28) | 1.036 (0.55) | 1.01 (0.17) | | |
| Distance to bank (km) | 1.003 (0.19) | 0.991 (−0.60) | 461 (110.00) | | |
| Distance to public transport (km) | 0.914 (−1.24) | 0.909 (−1.38) | | | |
| Distance to market (km) | 0.991 (−0.25) | 1 (−0.01) | | | |
| Distance to urban center | 0.999 (−0.13) | | | | |
| Imputed log income | 0.776 (−1.19) | 0.743 (−1.77)* | 0.755 (−1.69)* | 0.78 (−1.51) | 0.835 (−1.14) |
| Education level | 1.113 (0.43) | 1.116 (0.54) | 1.129 (0.60) | 1.074 (0.36) | 1.073 (0.37) |
| Number of children in household | 1.067 (1.11) | 1.071 (1.37) | 1.067 (1.31) | 1.072 (1.40) | 1.08 (1.59) |
| Age in years | 0.943 (−3.12)*** | 0.943 (−3.50)*** | 0.945 (−3.44)*** | 0.945 (−3.49)*** | 0.947 (−3.45)*** |
| Total number of households in cluster | 0.999 (−0.77) | 0.999 (−0.48) | 0.999 (−0.56) | 0.999 (−0.47) | 0.999 (−0.51) |
| Number of people in household | 0.498 (−0.06) | 0.175 (−0.18) | 1.955 (−0.07) | 0.032 (−0.39) | 0.032 (−0.40) |
| Rho (panel level variance component) | | | | | |

Note. z-statistics for logit coefficients in parentheses, adjusted for clustering using random effects model.

ᵃ All regressions also included district dummies, proportion speaking each of eight different languages, and urban dummy.

ᵇ The exponentiated coefficients (β’s) from the logistic regression are given. We call these exponentiated coefficients odds ratios because each unit increase in the explanatory variable would be associated with a multiplicative change in the odds of the event by exp(β).

*Significant at 10%; **significant at 5%; ***significant at 1%.

Table III. Multivariate Determinants of Sexual Activity Among Unmarried Women: Sensitivity of Results to Other Cluster Level Variablesᵃ

Odds ratiosᵇ of sexual activity by unmarried women

| Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| Only one language cited in cluster | 0.879 (−0.17) | 0.42 (−1.61) | 0.366 (1.89)* | 0.418 (1.79)* | 0.456 (−1.67)* |
| Distance to primary school (km) | 0.922 (−0.55) | 0.98 (−0.18) | 0.98 (−0.19) | 0.988 (−0.11) | 1.012 (0.12) |
| Distance to secondary school (km) | 1.117 (1.91)* | 1.062 (1.14) | 1 (0.01) | 1.004 (0.09) | 0.981 (−0.46) |
| Distance to post office (km) | 1.024 (0.83) | 0.996 (−0.16) | 0.982 (−0.99) | 0.982 (−1.15) | |
| Distance to cinema (km) | 0.959 (−1.43) | 0.981 (−0.79) | 0.991 (−0.36) | 0.985 (−0.69) | |
| Distance to well (km) | 0.989 (−0.10) | 0.983 (−0.20) | 0.958 (−0.50) | | |
| Distance to traditional healer (km) | 1.268 (1.95)* | 1.231 (1.90)* | 1.142 (1.55) | | |
| Distance to bank (km) | 0.989 (−0.38) | 1.005 (0.32) | | | |
| Distance to public transport (km) | 0.932 (−0.61) | 0.957 (−0.42) | | | |
| Distance to market (km) | 0.877 (−1.95)* | 0.862 (−2.14)** | | | |
| Distance to urban center | 0.992 (−0.21) | | | | |
| Imputed log income | 0.813 (−0.79) | 1.054 (0.30) | 1.047 (0.26) | 1.011 (0.06) | 1.042 (0.23) |
| Education level | 0.663 (−1.32) | 0.733 (−1.30) | 0.784 (−1.05) | 0.781 (−1.10) | 0.768 (−1.18) |
| Number of children in household | 0.962 (−0.49) | 0.914 (−1.45) | 0.908 (−1.55) | 0.897 (1.76)* | 0.888 (1.91)* |
| Age in years | | 0.968 (−1.95)* | 0.972 (−1.72)* | 0.974 (−1.67)* | 0.97 (−1.98)** |
| Total number of households in cluster | 1.002 (−0.70) | 1 (−0.09) | 1 (−0.16) | 1 (−0.17) | 1 (−0.10) |
| Number of people in household | 0 (−1.27) | 0.002 (−0.55) | 0.001 (−0.65) | 0.002 (−0.72) | 0.002 (−0.70) |
| Rho (panel level variance component) | | | | | |

Note. z-statistics for logit coefficients in parentheses, adjusted for clustering using random effects model.

ᵃ All regressions also included district dummies, proportion speaking each of eight different languages, and urban dummy.

ᵇ The exponentiated coefficients (β’s) from the logistic regression are given. We call these exponentiated coefficients odds ratios because each unit increase in the explanatory variable would be associated with a multiplicative change in the odds of the event by exp(β).

*Significant at 10%; **significant at 5%.


We find a robust association between residing in rural Ugandan communities where multiple languages are spoken and the odds of non-spousal sex for unmarried women and for married men. We hypothesized that communities with ethnic diversity have a weakened ability to enforce traditional cultural proscriptions against non-spousal sex. Our results on language heterogeneity are robust to the inclusion of many other community level variables, but we cannot rule out the possibility that language heterogeneity is a proxy for some other unmeasured community feature such as transiency or low social cohesion. Other work has noted differences in sexual risk taking for groups of recent migrants vs. non-migrants (Guilamo-Ramos, Jaccard, Pena, and Goldberg, 2005). We note that our findings are similar to prior studies which found that ethnically diverse communities have difficulty coordinating many other activities for the common good. In tribute to the Author/s of the Book of Genesis, where this phenomenon was first noted, we suggest calling this the “Babel Effect.”

The absence of partner data makes it difficult to exclude an alternative hypothesis that individuals who live in multilingual communities are seduced into non-spousal sex by the allure of mysterious partners from a different background. This behavior would be consistent with the infamous “Coolidge Effect” in which the availability of new or foreign partners stimulates greater interest in sexual activity (Francoeur et al., 1991). Whether a Babel Effect or Coolidge Effect is behind our results is less important than the practical importance of recognizing the heightened risk borne by linguistically diverse communities. Our results suggest that HIV/AIDS prevention programs need to especially target these communities because they have heightened risk of non-spousal sexual behaviors that can spread this disease.



1. This strategy is not perfect because in Uganda there are groups that claim differing ethnicity but share a common language. If anything, this limitation makes it less likely that we will confirm our hypothesis, because there could be ethnic variation that is not captured by a measure of language variation.



The authors wish to thank the Johns Hopkins Center for AIDS Research for supporting this research (5P30AI042855-04). Helpful comments from seminar participants at the Harvard Center for Population and Development Studies, Brown University Population Studies and Training Center, and from Michael McQuestion and Saifuddin Ahmed are gratefully acknowledged. Finally, we thank Martha Ainsworth, the International AIDS Vaccine Initiative, and the European Union, for their contributions to data collection.

Copyright information

© Springer Science+Business Media, Inc. 2006