The impact of Community Mobilisation on HIV Prevention in Middle and Low Income Countries: A Systematic Review and Critique

While community mobilisation (CM) is increasingly advocated for HIV prevention, its impact on measurable outcomes has not been established. We performed a systematic review of the impact of CM within HIV prevention interventions (N = 20), on biomedical, behavioural and social outcomes. Among most at risk groups (particularly sex workers), the evidence is somewhat consistent, indicating a tendency for positive impact, with stronger results for behavioural and social outcomes than for biomedical ones. Among youth and general communities, the evidence remains inconclusive. Success appears to be enhanced by engaging groups with a strong collective identity and by simultaneously addressing the socio-political context. We suggest that the inconclusiveness of the findings reflects problems with the evidence, rather than indicating that CM is ineffective. We discuss weaknesses in the operationalization of CM, neglect of social context, and incompatibility between context-specific CM processes and the aspiration of review methodologies to provide simple, context-transcending answers.

Electronic supplementary material: The online version of this article (doi:10.1007/s10461-014-0748-5) contains supplementary material, which is available to authorized users.

It is often the case that outsiders such as academics, multilateral agencies, or international organisations implement interventions without assessing their relevance to their particular target community. Social research has documented failures of HIV interventions to resonate with local norms, cultures and needs [1,2]. Behaviours advocated by external health professionals may not be feasible in contexts of poverty, political conflict and gender inequalities [2,3]. As a result, there is a growing emphasis on the need for community involvement in the planning, implementation and ownership of interventions. Indeed, community mobilisation (CM) is now widely considered a "critical enabler" of an effective HIV/AIDS response [4]. Despite this increasing interest in CM, there has been little systematic attention to its impacts.
One reason for the lack of systematic attention to CM is that it is used across categories of HIV intervention usually considered separately, namely, biomedical, behavioural and structural interventions. CM has been argued to be valuable in recruiting men to take up circumcision in biomedical interventions [5,6]. Similarly, in behavioural interventions, CM may be used to recruit participants through outreach, or to inform culturally-appropriate materials [7,8]. CM is often termed a 'structural intervention' when it is considered as a means of empowering marginalised communities and thus changing power relations [9][10][11][12]. For a smaller number of studies, CM is considered their main mechanism of intervention [13][14][15][16]. Although the impacts of CM have not yet been subject to a systematic review, such a review stands to contribute to research and practice across the range of HIV prevention approaches, wherever CM might be considered a 'critical enabler'.
The term 'community mobilisation' is not consistently defined, theorised, operationalized or systematically appraised [17]. In its fullest operationalization, CM seeks "to create and harness the agency of the marginalised groups most vulnerable to HIV/AIDS, enabling them to build a collective, community response, through their full participation in the design, implementation and leadership of health programmes and by forging supportive partnerships with significant groups both inside and outside of the community" [18]. This definition sets a high bar, and many operationalizations of CM are far less ambitious [19]. Although our definition preserves the conceptual distinctiveness of CM, we aimed to be relatively inclusive in this review, for very few published evaluations implement the 'maximalist' version of CM as defined above.
For the purposes of this review, we take the term 'community' to refer to collective resources that exist among a community, rather than at the individual level. We take the term 'mobilisation' to mean capitalising on those community connections and strengths to generate new possibilities of action. In keeping with the 'minimalist' definition of CM, in this paper we consider CM as a component of externally-triggered HIV interventions, rather than including indigenous CM initiated by grassroots actors with broader interests than HIV. This latter topic is addressed in the literature on social capital and HIV/AIDS [20][21][22].
Research on CM to date has tended to be qualitative, grounded in ethnographic methods, and to focus on processes rather than on outcome evaluation [13,23,24,25]. A growing body of social research has underscored the role of CM in building HIV competent communities, ensuring interventions are relevant and accessible to local people, and enabling people to work collectively to create health-enabling environments [18,26,27]. This work has also emphasised the significance of partnerships between communities and outside agencies as a key supportive condition for effective CM [18,28,29]. Furthermore, the pertinence of Western conceptualisations of CM and its constituting elements needs to be validated in specific intervention contexts [17]. Such information is crucial to the appropriate design and evaluation of CM programmes. In the context of the ascendant 'evidence-based policy and practice' movements, however, quantitative evaluations, and systematic reviews of these, are valued as sources for informing policy or funding decisions.
The first aim of the current article, therefore, is to present a systematic review of studies of the impacts of CM as a component of complex HIV prevention interventions. The scope of this review is comprehensive in that we do not restrict it to any target group, and we consider the impact on biomedical, behavioural, and social outcome variables. The review draws conclusions about whether CM 'works' or not, and delineates more nuanced lessons about the conditions under which CM is more likely to succeed.
A systematic appraisal of the CM intervention literature is challenging. Mobilisation efforts are referred to by many terms (e.g. community solidarity, social mobilisation, community participation, community engagement), which defy simple search strategies. In addition, CM almost invariably constitutes 'complex interventions' [30], entailing multiple, indirect pathways between intervention and outcomes. A wide and unresolved debate surrounds the best way to deploy and evaluate CM interventions (CMI) [31,32], with many arguing that the complex and improvisational nature of good CM defies summary in the linear 'input-output' models of change that characterise the 'gold standard' approach of randomised controlled trials (RCT). The second aim of the article is to reflect on the methodological challenges of operationalizing, evaluating and reviewing CMI.

Methods
The definition of CM cited above sets a high standard. However, many studies which use the term 'CM' do so in a very limited way, for instance using 'CM' simply to mean reaching out to the community for service or research recruitment. With this in mind, our key criterion was that CMI should seek to foster new capacities in a community by facilitating meaningful contact among community members. The reviewed studies aimed to engage communities in one or more of the following: enhancing supportive interpersonal relationships, building within-community support and solidarity (bonding social capital), and building bridges between communities and outside support partners (bridging social capital). To do so, interventions employed activities such as setting up peer support groups and clubs, fostering community-based organisations, performing dramas, rallies and awareness camps, creating community centres as 'safe spaces' for debate and conscientisation, as well as holding multi-stakeholder meetings and advocacy. On the methodological front, we included reports on the impact of CM as a component of more complex interventions, and excluded articles where the impact of a singled-out mobilisation activity (e.g. peer group membership) was measured.
The following questions guided the review:

Selection
Standard systematic review procedures were followed (Fig. 1). The bibliographic databases SCOPUS, PubMed, Cumulative Index to Nursing and Allied Health Literature and PsycInfo were interrogated using free-text terms to produce a sensitive search, adjusting terms depending on the search tools available (e.g. truncation). Searches included a combination of the following terms: "intervention" AND "hiv OR aids" AND "community mobili*" OR "community particip*" OR "community led" OR "community based" OR "community activit*" OR "community development" OR "capacity building" (full search strings by database in supplementary online information). When possible, irrelevant publication types (e.g. commentaries) were excluded using search tools. A complementary search was carried out through expert consultation and systematic reference screening of previous related reviews [12,33,34], which rendered 32 records. Records identified were systematised and, after removing duplicates, 1096 abstracts were screened. Of these, 97 were selected for full-text retrieval. To be included, articles needed to have been published in English in the last 11 years (from January 2003 to October 2013), and to report studies conducted in low-income, lower-middle-income or upper-middle-income economies [35]. Given that the nature of HIV epidemics, and understandings of CM have changed greatly historically, the timespan was chosen to reflect contemporary research and practice. We used three main inclusion criteria, including studies which:
• Reported community-based initiatives (as opposed to health facility-based medical interventions) that engaged one or more community groups in concrete participatory activities. Studies should have reported modes of CM that foster meaningful community connections.
• Evaluated the intervention in terms of at least one quantifiable biomedical (incidence and/or prevalence of HIV-1, HSV-2 and bacterial STI) or behavioural (reported condom use, reported health-service use, and HIV test-taking) outcome.
• Evaluated outcomes with reference to a comparator or control, irrespective of research design.
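The eligibility rules above (language, publication window, country income group and the three inclusion criteria) can be read as a single conjunctive filter. The following is only a minimal sketch of that logic, not the procedure the reviewers actually used: the `Record` class, its field names and the `is_eligible` function are hypothetical, and the judgement behind each boolean field would in practice be made by a human screener.

```python
from dataclasses import dataclass
from datetime import date

# Income groups eligible for inclusion, as listed in the text.
ELIGIBLE_INCOME_GROUPS = {"low", "lower-middle", "upper-middle"}

# Publication window stated in the text: January 2003 to October 2013.
WINDOW_START = date(2003, 1, 1)
WINDOW_END = date(2013, 10, 31)


@dataclass
class Record:
    """Hypothetical screening record; field names are illustrative only."""
    language: str
    published: date
    income_group: str
    community_based_participatory: bool          # inclusion criterion 1
    has_biomedical_or_behavioural_outcome: bool  # inclusion criterion 2
    has_comparator: bool                         # inclusion criterion 3


def is_eligible(r: Record) -> bool:
    """Return True only if every stated condition holds for the record."""
    return (
        r.language == "English"
        and WINDOW_START <= r.published <= WINDOW_END
        and r.income_group in ELIGIBLE_INCOME_GROUPS
        and r.community_based_participatory
        and r.has_biomedical_or_behavioural_outcome
        and r.has_comparator
    )
```

Because the conditions are combined with `and`, a record failing any single one (for example, a study without a comparator) is excluded regardless of how well it meets the others.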
The second inclusion criterion is based on the rationale that CM, and derived social outcomes, shape sexual behaviours, which in turn impact on biomedical indicators. Thus, we targeted the proximal outcomes of this logic model. Although reported condom use has been found to be poorly associated with biomedical markers [36][37][38], we included condom use as a behavioural outcome because it is a proxy measure of actual sexual risk historically applied in the prevention literature, affording some degree of comparability across studies. Behavioural outcomes were expanded ad hoc to engagement in extramarital sex for one study [39], on the assumption that this might increase a person's HIV risk [40]. This study was selected given its careful consideration of community involvement.
To enable the implementation of the selection criteria, and given the diversity of terminology employed, two steps were taken in the selection process. First, during abstract screening, records reporting the same study were clustered together, so as to gather as much information about a study as possible. Second, during full-text vetting, ''reference list checking'' [41] was performed to aid final selection. Eleven articles were selected from this search and assessed for inclusion.
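The free-text term combination listed in the Selection section leaves operator precedence implicit. The sketch below shows one way to assemble it with explicit grouping. The parentheses are our assumption about the intended logic (intervention AND HIV/AIDS AND any CM-related term); the exact strings used for each database are those in the article's supplementary material, which this sketch does not reproduce. The names `CM_TERMS` and `build_query` are hypothetical.

```python
# CM-related terms as listed in the text (asterisk marks truncation).
CM_TERMS = [
    "community mobili*",
    "community particip*",
    "community led",
    "community based",
    "community activit*",
    "community development",
    "capacity building",
]


def build_query(cm_terms=CM_TERMS):
    """Join the three concept blocks with AND, and alternatives with OR.

    The grouping with parentheses is an assumption made for illustration.
    """
    cm_block = " OR ".join(f'"{term}"' for term in cm_terms)
    return f'"intervention" AND ("hiv" OR "aids") AND ({cm_block})'


if __name__ == "__main__":
    print(build_query())
```

Individual databases differ in truncation symbols and phrase syntax, so a string built this way would still need adjusting per search tool, as the text notes.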

Data Extraction and Synthesis
Twenty unique studies were identified, documented by 28 primary and 18 supplementary articles. When possible, the unit of analysis consisted of studies rather than articles. Bibliographic data were noted for each study, as were methodological details such as sample, control or comparison and intervention components (Table 1). Outcomes identified in each study were classified into biomedical, behavioural and social. Biomedical outcomes reported in included articles addressed incidence and/or prevalence of HIV-1, HSV-2 and bacterial STI. Behavioural outcomes were limited to reported condom use, reported health-service use, and HIV test-taking. Social outcomes, such as collective efficacy or community cohesion, were included only if reported in a study that also reported at least one biomedical or behavioural outcome. Social outcomes considered any measurement of collectivisation, cohesion, partner violence and/or participation and excluded individualised accounts of personal development, individual empowerment or autonomy. None of the studies reported structural outcomes such as changes in legislation or policy implementation. Outcomes related to knowledge, attitudes, individual level perceptions (e.g. attitude towards condoms, self-efficacy or individual-level 'power-within') were not included, given that knowledge, individual skills and attitudes are not a sound reflection of behaviour [42][43][44]. Similarly, reported STI were omitted on the basis that an actual measure (incidence/prevalence), rather than a proxy, was more valid. We included results for different types of outcomes if reported in more than one article belonging to the same study. When similar outcomes for the same study were reported by more than one article, we included the results using a larger sample, given that this frequently aggregated the smaller samples reported elsewhere.
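The rule for overlapping reports (keep, for each study and outcome, the result based on the larger sample) can be sketched as follows. This is an illustration under assumed data structures, not the reviewers' actual extraction tool: the dictionary keys `study`, `outcome`, `n` and `article` and the function name are hypothetical.

```python
def pick_largest_sample(reports):
    """For each (study, outcome) pair keep the report with the largest sample.

    `reports` is an iterable of dicts with the hypothetical keys
    'study', 'outcome', 'n' (sample size) and 'article'.
    """
    best = {}
    for report in reports:
        key = (report["study"], report["outcome"])
        # Replace the stored report only when a strictly larger sample appears.
        if key not in best or report["n"] > best[key]["n"]:
            best[key] = report
    return best


if __name__ == "__main__":
    # Entirely made-up example records, for illustration only.
    example = [
        {"study": "Study A", "outcome": "condom use", "n": 400, "article": "A1"},
        {"study": "Study A", "outcome": "condom use", "n": 950, "article": "A2"},
        {"study": "Study A", "outcome": "HIV prevalence", "n": 400, "article": "A1"},
    ]
    for key, report in pick_largest_sample(example).items():
        print(key, "->", report["article"])
```

Different outcome types from the same study are kept side by side, since the key combines study and outcome; only duplicated outcomes are collapsed.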
Given the heterogeneity of studies in terms of study design, intervention participants and outcomes measurement, a meta-analysis would have been unsuitable for this review. Consequently, the narrative analysis presented below addresses our review questions and, for the second one, relies on both the direction of intervention effects and associations and their reported significance. Furthermore, although studies' risk of bias did not determine inclusion, we assessed this risk and methodological soundness by using Thomas's Quality Assessment tool for Quantitative Studies [45]. This instrument is recommended for systematic reviews of health interventions [46] and, while suitable for our review because it evaluates a range of quantitative designs [47], was adapted to accommodate the complexity of studies included (Table 2).
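A narrative synthesis that relies on the direction of effects and their reported significance can be supported by a simple tally per outcome type. The sketch below is only an aid to reading such a synthesis: the review does not state that a formal count of this kind was performed, and the function name, the tuple layout and the example values are all hypothetical.

```python
from collections import Counter, defaultdict


def tally_findings(findings):
    """Count findings by outcome type, direction and significance.

    `findings` is an iterable of (outcome_type, direction, significant)
    tuples, e.g. ("behavioural", "positive", True).
    """
    table = defaultdict(Counter)
    for outcome_type, direction, significant in findings:
        label = (direction, "significant" if significant else "non-significant")
        table[outcome_type][label] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {outcome: dict(counts) for outcome, counts in table.items()}


if __name__ == "__main__":
    # Made-up findings, not taken from the reviewed studies.
    example = [
        ("behavioural", "positive", True),
        ("behavioural", "positive", False),
        ("biomedical", "negative", False),
    ]
    print(tally_findings(example))
```

A tally of this kind ignores effect sizes and study quality, which is one reason the risk-of-bias assessment is reported alongside the narrative analysis.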

The Studies
Of the corpus of twenty studies, seven were RCT: project Accept [15,48], the Regai Dzive Shiri intervention [49], the MEMA kwa Vijana trial [50], the Stepping Stones intervention [51], the Manicaland Project [52], the IMAGE project [53], and a trial in the Masaka district, Uganda [54]. Project Accept was carried out simultaneously in Sub-Saharan Africa and South and South East Asia, while the other six trials were implemented in Sub-Saharan Africa.
The boundaries of 'community' were conceptualised in three main ways by the selected studies. First, in contexts of concentrated HIV epidemics, interventions targeted groups most at risk: ten studies focused on sex workers [14,56,59,63,64,65,67,68,69,70], four on men who have sex with men (MSM) [57,63,69,72], and one study [39] focused on local heterosexual men whose high levels of alcohol consumption were found to be putting their sexual health at risk [73]. These communities were thus assumed to share a social identity, location and concrete practices (e.g. work and leisure). Second, mainly in contexts of generalised epidemics, youth were targeted by four studies [15,49,50,51], treating them also as communities in terms of identity and sexual risks, within geographically-bound communities. Third, four further studies [52,53,54,71] conducted in the generalised epidemic of Sub-Saharan Africa were concerned with mobilising geographically-bound communities by targeting adults or a number of groups (e.g. women who applied for a loan; miners, sex workers and adults simultaneously) within the community. Outcomes were evaluated at the level of participant communities and their comparators, except for four projects (Accept, Avahan, IMAGE and Regai Dzive Shiri), which evaluated the effects of the intervention at the wider community- or population-level.

Do CM Interventions Work? It Depends on for Whom
Since the pattern of findings differs by population, we have divided our presentation of the findings into two sections, reporting first the findings for sex workers and other most at risk groups, and then findings for youth and general communities.

Sex Workers and Other Most at Risk Groups
The first group are mainly sex workers and, to a lesser extent, other most at risk groups such as men who have sex with men (MSM), who have been targeted in contexts of concentrated HIV epidemics. Results from the Avahan programme were inconsistent: for population effects at state level, "greater intensity" of the intervention was significantly associated with lower HIV prevalence in 3 Indian states, but with very small effect sizes (-0.0026 to -0.0022). The association between CMI and HIV prevalence was non-significant in 3 other states [60][61][62], although the authors acknowledge that changes in the prevalence of chronic diseases such as HIV could require long periods to become apparent. The Frontiers Prevention Project in Ecuador found no significant effects of the intervention on HIV seroprevalence among FSW and MSM [69].
With regards to other STI, a sub-study of the Avahan programme in Karnataka reported that chlamydia and/or gonorrhoea prevalence, and high-titre syphilis, were significantly reduced, while this reduction was non-significant in relation to syphilis among female sex workers (FSW) [61]. Encouragingly, the Frontiers Prevention Project in Andhra Pradesh was associated with lower likelihood of syphilis and HSV-2 among both FSW and MSM [63]. A similar project in Ecuador rendered a significant impact in the reduction of likelihood of syphilis seroprevalence among MSM, while having a borderline effect of lower HSV-2 seroprevalence among FSW in the programme [69]; non-significant programme effects were observed on HSV-2 among MSM and syphilis among FSW [69]. Active participation in the Encontros study was related to a non-significantly lower probability of incident chlamydia and/or gonorrhoea [67], while the Carletonville project rendered a non-significant increase of syphilis, gonorrhoea and chlamydia among participant sex workers [71]. Furthermore, the study in the Dominican Republic [14] found CMI to reduce prevalence of one or more STI (gonorrhoea, trichomoniasis, or chlamydia) among FSW. A further analysis shows that this effect was statistically significant when CM was combined with implementing and enforcing a government policy supporting consistent (100 %) condom use [14]. On the whole, the evidence points to CMI tending to impact on the reduction of STI among sex workers, with this effect being more likely to be significant when CMI is combined with policy interventions [14].
Regarding behavioural outcomes, among sex workers, condom use is the behaviour most addressed, and with the strongest evidence, with various degrees of effect depending on whether sexual encounters are with paying clients, casual or stable partners. Exposure to interventions was found to be significantly associated with increased likelihood of condom use [66], consistent condom use with clients [56], with new clients [14 in Santo Domingo], consistently or during last encounter with regular clients or partners [14 in Puerto Plata,61,63,67], and for oral sex with clients, in "all situations" and during last week [68]. Other studies have found non-significant or marginal increases related to the intervention, including ever using a condom [71], condom use with last client [63,70], with all clients [70], with occasional or new clients [14 in Puerto Plata,61,67], with non-paying [67] and regular partners [14 in Santo Domingo,61,69], as well as for anal and vaginal sex with clients [68]. Marginal decreases in consistent condom use with all partners [70] and casual partners [71] were reported in Rio de Janeiro, Brazil and South Africa, respectively.

Notes to the quality assessment table (Table 2). The following adaptations were performed to the instrument's grading when assessing the studies:
(a) Validity and reliability (measured under data collection method) were assigned as 'strong' for all studies using biomarkers. For studies relying on behavioural outcomes, explicit indication of the instrument's validity was sought and reliability coefficients were required.
(b) Baseline differences (assessed under confounders) were computed as 'moderate' for all cohort studies (one group pre + post (before and after)), given that they act as their own comparison group.
(c) For studies that did not involve the same participants at baseline and at follow-up, completion rate was computed by calculating the proportion of participants in the follow-up in relation to those who participated at baseline.
(d) For studies implementing the intervention in one community group but measuring effects at the population level, representativeness (under selection bias) was marked as 'somewhat likely'.
a Given the heterogeneity of individually reported designs, they were classified according to the instrument's typology. Assigned design is also reported in the analytic table.
b Data from Ng and colleagues' (2011) article were used for the assessment of this study because they encompass the Indian states considered in the other articles reviewed.
Of note are two behavioural outcomes reported in the interventions with FSW. In the Dominican Republic [14], authors recorded the observed FSW's verbal rejection of unsafe commercial sex, which increased from pre-to postintervention in both sites, being significant only for the community where both CM and policy enforcement were implemented. Similarly, in Andhra Pradesh [64], FSW who reported high collectivisation were considerably more likely to procure STI treatment from government health facilities than those who reported low collectivisation, as were those who reported high collective efficacy and collective agency [57]. Collective action, in contrast, was associated with a lower likelihood of STI treatment-seeking [57].
CMI targeting MSM were found to be related to a significant increase in reported condom use with casual and regular sexual partners for vaginal sex, anal sex and oral sex [72] and to the likelihood of condom use with last female partner [63] and last male sexual partner [69]. An association in the same direction was non-significant for condom use with last female partner [69] and last male sexual partner [63]. In one subsample of MSM and transgender people in the Avahan project [57] it was found that participation in a public event was significantly associated with higher likelihood of consistent condom use among paying and non-paying partners, with the same positive trend for collective efficacy, although significant only with paying partners. In the RISHTA intervention [39], which engaged local heterosexual males whose high levels of alcohol consumption were found to put their sexual health at risk, it was reported that changes in extramarital sex were significantly associated with changes in alcohol use, so that significant decreases in extramarital sex were observed among men who were drinkers at baseline but non-drinkers at endline.
Social outcomes tended to be positive mainly in terms of community participation and collective identity, with inconsistency in the way social outcomes were measured. Two Sonagachi-inspired programmes found significantly positive changes in social participation [67,70], but not in perceived social cohesion [67,70] after intervention in Corumbá [67] and Rio de Janeiro [70], Brazil. Similarly, another Sonagachi adaptation found significant increases in social support through organising and solidarity, but not in political participation in West Bengal, India [65]. An evaluation of the Avahan project, in turn, found that joining a meeting, belonging to a help group and being a member of a sex worker collective were significantly associated with higher perceived collective efficacy and higher perceived collective support in non-metropolitan Tamil Nadu, with varying positive effects in the other three settings of Tamil Nadu and Maharashtra, in India [59]. A different evaluation report of Avahan also found that collective identity and solidarity were significantly associated with lower odds of violence or abuse by more powerful groups in high-intensity intervention districts, but not in low-intensity ones [58]. In Andhra Pradesh, the study on project Parivartan found a positive relationship between collective identity, collective efficacy and collective agency and programme exposure, which becomes stronger as the level of exposure rises [55].
In sum, there is reasonable evidence for the effectiveness of CMI in reducing STI and increasing condom use among sex workers. Among MSM, there is some evidence that CMI results in increased condom use. Findings regarding social outcomes remain uneven, depending on the social outcome measured. The evidence of CMI effects on HIV prevalence remains limited to projects Avahan and Frontiers Prevention (Ecuador), which provided inconclusive results. It is difficult to draw broad conclusions about which programmatic elements or conditions are most effective (e.g. targeting casual vs. regular partners; working in urban vs. rural areas) because each intervention employed a different design, measured different outcomes, and was conducted in a unique setting.

Youth and General Community
Studies targeting either youth or the general community reported mainly non-significant intervention effects in terms of HIV incidence [51][52][53][54] or prevalence at the community level [49,50]. Project Accept intervention reported lower HIV incidence in intervention than in control communities with borderline statistical significance (p = 0.08) [48]. Similarly, regarding other STI, mainly non-significant effects have been reported for HSV-2 prevalence [49,50] and incidence [54 for one arm] as well as prevalence of chlamydia, syphilis and gonorrhoea [50, 54 for one arm], while the Carletonville study reported significant increases of chlamydia among miners, men and women, of syphilis among miners and women, and of gonorrhoea among miners [71]. The Masaka trial in Uganda, in turn, found dissimilar results depending on intervention arm: incidence of active syphilis and prevalence of gonorrhoea were significantly lower in one of the intervention arms than in the control group, while HSV-2 incidence was lower in the other intervention arm than in the control group [54]. Only the Stepping Stones programme was associated with a significantly lower HSV-2 incidence in comparison with controls [51]. Hence, there is little evidence that CMI succeed in reducing numbers of HIV and/or STI cases among youth and general communities, with some success limited to the Stepping Stones programme and project Accept.
AIDS Behav (2014) 18:2110-2134
In terms of behavioural markers, CMI were found to be significantly positively associated with the likelihood of condom use with casual partners [50 in females, 54 in one arm; 71 in miners, men and women] and ever using a condom [71 in miners and women]. However, for a number of interventions no significant changes in either direction were reported on condom use at last sex [49][50][51], reported ever condom use [54,71 in men] and condom use with regular [52] and non-spousal [53] partners. For the Manicaland study [52], reported condom use with casual partners was significantly more common in control than in intervention communities. Behaviours other than condom use were also addressed. In the Accept study, the proportion of people taking their first HIV test was significantly larger in community-based voluntary counselling and testing communities than in standard care areas [15]. However, the IMAGE project found non-significant differences in having had an HIV test in intervention groups relative to controls [53]. Similarly, in terms of health service use, the Regai Dzive Shiri project had no effect on clinic attendance [49] and the Manicaland project found no intervention-related differences in treatment-seeking within 3 days of STI symptoms [52]. In light of these results, CMI appear to have some effect among youth and targeted communities on condom use with casual partners and promising effects on the uptake of voluntary testing, although evidence for the latter is limited to one study.
Among programmes targeting youth and the general population, evidence of effects on social outcomes is limited to the IMAGE and the Stepping Stones projects. In the former, targeted women reported a significant reduction in intimate partner violence and were more likely than their comparison counterparts to report higher levels of participation in social groups and collective action, with no significant differences in perceived solidarity or in the perception that the community would work together to achieve common aims [53]. In the Stepping Stones programme the proportion of male participants who reported enactment of intimate partner violence was lower than among controls, with this trend maintained across the intervention's lifetime (p = 0.099, p = 0.054 at 12 and 24 months, respectively), although there was no evidence of this difference among women [51]. Therefore, while these results are promising, there still remain inconsistencies regarding their diffusion into biomedical and behavioural indicators.
In sum, for studies involving youth, targeted groups within communities and geographically-bound communities, no significant results were found for reductions of HIV incidence or prevalence, while marginal impact on the reduction of other STI was identified, mirroring the results of previous reviews [33]. The evidence suggests that while these programmes impact on reported condom use with casual partners, this improvement may not translate into significant changes in biomedical markers. The results obtained for social outcomes are fairly positive but limited to two studies, and hence their relationship with behavioural and biomedical indicators remains to be clarified.

Discussion
The present review has gathered evidence of the effectiveness of interventions with a CM component on biomedical, behavioural and social outcomes. We present our discussion in two sections. The first assesses what can be learnt from our review to inform contemporary CM programming. Given that the findings are generally inconclusive, the second section critically reflects on the literature, to explore reasons for the inconclusiveness of the evidence.
The Systematic Review: What has been Found?
Overall, this systematic review has produced a somewhat inconclusive set of findings. Among sex workers and groups most at risk, the evidence bears some degree of consistency, indicating an overall tendency of positive impact, with more consistent and stronger results for behavioural and social outcomes than for biomedical ones. Among youth and general communities, the evidence of the effects of CMI remains inconclusive. Overall, it is not possible at this point in time to come to a general conclusion as to whether CMI are effective or not, though there is suggestive evidence for sex worker groups. Our review suggests, nonetheless, two more nuanced lessons that may be drawn.
The first is that CMI appear to be more successful with groups who have a meaningful collective identity rather than with more generalised populations. One of the main characteristics of interventions engaging sex workers and groups most at risk is that they capitalise on these groups' collective identity. CMI often work through their situation of vulnerability to foster mobilisation that is cohesive and fuelled by a need not only to attain HIV-related goals, but also to increase their material and symbolic power and status in the community. Indeed collective identity could arguably have been one of the reasons for success of organic forms of CM efforts such as Sonagachi [74].
One explanation for the different approach taken in programmes for youth and general communities is that these groups do not evidently display the extreme and conspicuous disadvantages of sex workers and thus do not appear in need of tackling the social determinants of their problems. It is plausible that sex workers can mobilise against specific structural factors that particularly marginalise them (policies, structures and laws to deter sex work), which is not the case with general populations. Similarly, it is possible that mobilising a sub-set of a population is easier than mobilising entire communities. Hence, if youth were discriminated against (as MSM are in some contexts), this might foster a collective identity in the group, which would in turn facilitate tackling identifiable social determinants of health among this group.
Second, CMI seem more likely to generate favourable outcomes if accompanied by efforts for change at the structural level. For example, Kerrigan and colleagues' study in the Dominican Republic provided evidence that CMI alone yield some positive outcomes, but that they were more effective when implemented alongside structural changes such as a brothel policy of 100% condom use [14]. Similarly, researchers of the MEMA kwa Vijana trial identified the low status of young people in the community, as well as females' lower social status and financial reliance on males, as barriers to attaining better results [75]. These factors, among other sociocultural issues identified by researchers, point to the need to work not only with the 'target group' but also with other community groups, in order to tackle structural barriers to CMI effectiveness.
Critique: Why are the Findings Inconclusive?
We suggest that the evidence is inconclusive not because CMI are ineffective, but instead due to problems with operationalization, evaluation and review methodologies. In other words, the full potential of CMI has rarely been evaluated. In what follows, we discuss problems in the literature, at the level of operationalization of CM, the attention to socio-political context, and the nature of review methodologies. While these problems afflict some parts of the literature, various authors have actively sought to address the problems appropriately. Thus, following each problem we also discuss ways of pre-empting or mitigating such problems.
First, inconclusive results may be related to the operationalization of CM. For complex interventions and trials in general there are a number of programme design issues that impinge on intervention impact, such as programme length, follow-up timespan, intervention exposure and adherence, as well as ''underpowered'' designs, as has been previously pointed out [33,50,60,76]. In this review, we have identified three flaws in the operationalization which we discuss in turn: (i) understandings of CM remain underdeveloped, and often tokenistic; (ii) implementations of CMI are often characterised by inflexibility; and (iii) the evaluation of CMI tends to inadequately account for social impact.
The first point about operationalization concerns the degree to which CM interventions allow for genuine community ownership. In theory, the merit of CM lies in building sustainable community strengths and agency at the community level [77,78]. In practice, however, the concept is often used to refer to static and tokenistic activities in which researchers gather ''the community'' and establish contact with relevant stakeholders. Despite our efforts to employ appropriate inclusion criteria, limited versions of CM were employed in several of our reviewed studies. This was particularly notable in interventions with youth and general communities. Articles describing the nature of CM in the reviewed studies included statements referring to CM as ''community sensitisation…to inform the community about the study'' and to obtain authorisation [79], activities ''to reduce opposition'' to the intervention programme [75], ''the process of gaining community support for the study'' [80], and undertakings to ascertain leaders' ''views and seek their support in encouraging community participation'' [81]. In these instances, interventions draw on local knowledge and input to execute programmes planned by outsiders [82]. In such cases, at best, communities are ''mobilised'' first, to gain access to their networks and thus enable research execution and second, to participate in programme delivery [82]. They may not in fact be building and capitalising on the community connections that comprise the main rationale for CM.
Examples of projects that managed to operationalize CM in a way that fostered supportive community relations come from those targeting sex workers and groups most at risk. Overall, these interventions included activities that triggered active community engagement through de-stigmatising public events, fostering of within-community cohesion and alliances with external stakeholders. The Princesinha project in Brazil [68], for example, engaged sex workers as leaders of project activities and as data collectors. Public exposure was raised through celebrating the eldest sex worker and through participation in carnival with a ''prejudice-free'' samba group carrying prevention posters [68]. In a similar attempt in Brazil, the Encontros project [83] included activities that allowed within-community dialogue around sex work, discrimination, human rights and HIV/STI prevention. It also organised ''hot-pink'' parties and cultural performances by sex workers at the city's cultural centre, along with external partnerships with the community at large [83]. What these and other interventions [e.g. 72] have in common is the thoughtful implementation of activities that are inclusive of community members and build cohesive relationships among them, while fostering their self-presentation as an assertive 'community' in negotiations with stakeholders in the public sphere.
Stemming from this understanding of CM and in line with requirements of standardisation of intervention components in evaluation research, the second problem at the level of operationalization is inflexibility in the way the majority of the programmes included in our review responded to the needs of communities. A premise of CM is that interventions must be appropriate, and thus adapted to specific local contexts based on community ownership and leadership [2,18]. However, when studies reported changes in the planned implementation and evaluation, this was presented as a remedial measure taken by researchers, which limits meaningful engagement of the target community and therefore ownership of the project's objectives. For instance, the Manicaland project did not implement the income-generating intervention component originally planned because of country-wide economic decline during the trial [52]. While Stepping Stones programmers acknowledged that ''development of interventions is an iterative process, and interventions are generally strengthened by being more extensively tested and adapted'' [51], adaptations to the original intervention occurred before this trial was implemented and to fulfil research needs rather than community demands [51].
Among the studies included in this review, two interventions made explicit adaptations while implementing CM. Project Accept [84] made an explicit programmatic point of allowing ''site-specific adaptations'' to accommodate ''site-specific sociocultural differences'' in its varying settings. Researchers developed a thoughtful way of balancing consistency and flexibility while maintaining a ''minimum level of comparability'' [85]. Strategies used to enable consistency of themes across adaptations included engaging field staff in producing the adaptations, ensuring community acceptance, and using steering committees, ethical review boards and intervention subcommittees to approve and implement adaptations [85]. The Avahan intervention also documented the changes applied according to community demands during its implementation [86]. The remaining challenge, of course, is that such complexity, changes, and relative lack of control are at odds with the requirements of rigorous and internally-valid designs such as RCT.
The third problematic point regarding the current operationalization of CM concerns methodological issues in the measurement of impact, particularly social impact. The choice of impacts to measure, and of measurement tools, is often weak, particularly for social outcomes. There might be benefits gained by CMI participants that are not necessarily part of the programmes' evaluated outcomes (e.g. health service use) or that are intangible (e.g. increased participation in groups outside the 'target' community). Among interventions with sex workers, some programmes [55,59] limit the appraisal of social outcomes to one question per dimension (e.g. collective efficacy), restricting the power of such measurements. This indicates the need both to improve quantitative instruments, and to triangulate evidence from more open-ended data collection methods, to maximise learning from an intervention.
For example, some of the studies included in this review have used process evaluation to explain their quantitative effects [75,87], to document the challenges of implementing RCT among deprived, rural groupings [81], and to report the most successful CM approaches to engage communities [88]. Process evaluation represents a viable option to gauge the social transformations triggered by CMI because it documents the contents of the 'black box' that often seems to be present in 'input-output' models. In addition, it documents 'achievements' that are part of the intervention per se. In many contexts, the sheer implementation of the programme may in fact contest the status quo of its target population. This was the case for a number of studies in this review [67,68], but it is often missed in quantitative evaluations such as those included here.
Second, many of the interventions failed to engage with the broader social and political context and power relations that structure health in very disadvantaged communities [2]. Contemporary understandings of CM emphasise that communities alone rarely have the power to make the social changes needed to sustain healthy behaviour, and hence, that alongside CM, efforts to engage powerful stakeholders and to move towards structural changes are also required [18,27,89]. In contrast to these understandings, in programmes involving youth and general communities, there was evidence of limited efforts to engage the broader community. Where efforts were made to engage groups beyond the target group, this often had the limited aim of enabling the diffusion of health-related knowledge, to parents or other groups [53,79] rather than engaging them in transformative change.
Among our reviewed studies, it was notable that interventions with sex workers often took greater account of the socio-political context. In such studies, having a support network, altering community relationships and fostering collective action have the potential to bring much wider benefits and thus be valued in their own right, beyond their contribution specifically to HIV prevention. For instance, advocacy was conducted with the police, local government officials, community leaders, FSW's partners and clients, and other gatekeepers [62,66]. In this way, the 'community' that brings about the project is more inclusive than the interventions' target community groups [23].
Third, reflecting on the very uneven nature of the findings, we suggest that the goal of providing an over-arching statement of 'the evidence' for CM may itself be misguided [90,91]. Most obviously, we have noted that there is a different pattern of findings for sex workers and for youth and general communities. We have observed that in some studies, there appear to be impacts on condom use with some types of partners, but not others. The IMAGE study found impressive effects on intimate partner violence (and this is widely argued to be a likely contributor to HIV transmission), but no effects on HIV incidence, which by its nature is more difficult to assess. Based on earlier positive results from the Sonagachi Project [16,92], replications were implemented in Brazil [67,70] and India [65,66] but to less positive effect. Furthermore, the Avahan intervention has disaggregated CM components and their impact on a host of measures of condom use in a variety of settings, finding some significant relationships at a fine-grained level, but not much consistency across results [57-59, 61, 62].
Such inconsistent findings make it appear unrealistic to expect a singular statement about whether CM 'works' or not. More nuanced statements about the conditions under which CM is more likely to work might have greater potential (e.g. our review suggests that CM may be more likely to succeed if it is implemented in tandem with policy changes). However, it seems unlikely that a definitive set of decision rules to determine when CM should be attempted could be achieved. CM is, by its very nature, contextual and evolving. CM mobilises contextually-specific local networks, in locally-appropriate ways, and allows communities the power to create and alter objectives. Thus, CM is not simply an intervention that is equivalent across sites, but takes different forms in different sites. Although the 'evidence-based policy and practice' paradigm prioritises controlled trials and systematic reviews of these, it may be that multi-faceted and context-specific CMI are more challenging to quantify, compare and appraise.

Implications
The above critical discussion has implications for future implementation and evaluation of CM. The first is the need for operationalization of CM informed by a committed understanding of social change. We are concerned that the evaluated CMI may not in fact be a good test of the effectiveness of CM, because the interventions do not always heed the transformative objectives behind CM. Instead, they often treat it simply as an instrumental add-on to increase uptake, are inflexible to the contextual needs of the community participating in the intervention, and use simplistic measures of social outcomes. Part of the issue may be that the improvisational and responsive nature of genuine CM is not compatible with the methodological requirements of controlling variables and standardising intervention components. Another possibility is that the biomedical professionals who often lead such interventions are not equipped with the skills to facilitate an open-ended and complex social process of mobilisation [2,93]. We propose that what is needed is a clear understanding of CM, informed by a social scientific theory of change and recognising the need for specific community development skills. The more established our understanding of CM is, the less likely it will be that the concept is stripped down and depoliticised when operationalized.
Second, our discussion of context and social groups points towards the need to work with communities to address the socio-political context and to build supportive partnerships with more powerful groups, rather than with community groups in isolation. An enabling policy environment (e.g. decriminalisation and de-stigmatisation of sex work and homosexuality, governmental policies for participatory community planning of interventions) is required for communities to address socio-political issues. The reviewed studies targeting sex workers illustrate that, when such an enabling environment is absent, advocacy may be needed as part of CMI in order to negotiate power relations.
Finally, our critique of the systematic review methodology suggests that judgements about the suitability of CM may need to be made on a more local basis, and informed by a wider set of evidence than that provided by systematic reviews and/or rigorous outcome evaluations. Contemporary work in the philosophy of science questions the desirability of conceptualising social interventions in terms of 'replication' across diverse contexts, arguing that ''to draw causal inferences about a target population, which method is best depends case-by-case on what background knowledge we have'' [94]. The implication here is that a systematic review of outcome evaluations is insufficient information on which to base the choice or design of a CM intervention. Such information needs to be combined with other sources, including a plausible theory of change and knowledge of the particular context into which the intervention is being introduced.

Conclusion
Taking the evidence at face value (irrespective of our critiques of the form of this evidence), it seems too early to decide whether CM works or not, especially considering the heterogeneity of interventions. At present, at least two RCT which explicitly include CM as a component are under way, with biomedical results still awaited [95,96]. They may offer further evidence of the contribution of these quantifying approaches to the planning, implementation and evaluation of CM as currently conceptualised. However, taking our critiques seriously, we suggest that the very aspiration to provide a single statement of 'the evidence' for diverse, evolving, and multifaceted CMI in complex settings may be misguided.