Social Psychiatry and Psychiatric Epidemiology, Volume 47, Issue 2, pp 279–291

A controlled trial of problem-solving counseling for war-affected adults in Aceh, Indonesia


  • J. Bass
    • Department of Mental Health and Applied Mental Health Research Group, Johns Hopkins Bloomberg School of Public Health
  • Bhava Poudyal
    • Rehabilitation Action for Torture Victims in Aceh (RATA)
  • Wietse Tol
    • Global Health Initiative, Yale University and HealthNet TPO
  • Laura Murray
    • Department of International Health and Applied Mental Health Research Group, Johns Hopkins Bloomberg School of Public Health
  • Maya Nadison
    • Department of Mental Health and Applied Mental Health Research Group, Johns Hopkins Bloomberg School of Public Health
  • Paul Bolton
    • Center for Refugee and Disaster Response and Applied Mental Health Research Group, Johns Hopkins Bloomberg School of Public Health
Original Paper

DOI: 10.1007/s00127-011-0339-y

Cite this article as:
Bass, J., Poudyal, B., Tol, W. et al. Soc Psychiatry Psychiatr Epidemiol (2012) 47: 279. doi:10.1007/s00127-011-0339-y



War and conflict have consequences for the mental health of individuals and entire communities, and the communities in Aceh, Indonesia, which have experienced more than 30 years of armed conflict, are no exception. This study presents results from an evaluation of a non-specific mental health group counseling program among adults affected by conflict. Interventions such as these need to be evaluated to strengthen the limited empirical evidence base for efficacious community-based treatments that address mental health and psychosocial problems in humanitarian settings.


A total of 589 adults were screened using a locally validated measure of mental health and functioning. Of all, 420 (71%) met the study inclusion criteria of elevated symptom levels and functional impairment: 214 and 206 in three intervention and three control villages, respectively. Intervention participants met weekly for eight sessions in groups of eight to ten adults. Following completion of treatment, 175 (85%) controls and 158 (74%) intervention participants were re-assessed. Regression analyses compared pre- and post-intervention scale scores.


We did not find an intervention effect for reducing the burden of depression and anxiety symptoms when compared with the control sample. Impact on functioning was mixed and there was an increase in use of positive coping strategies.


The lack of mental health impact may be because the mental health problems and dysfunction were not due to disorder, but were normal responses to struggles of daily living experienced by this community and not addressed by the intervention.


Mental health · Intervention · Cross-cultural · Counseling · Low-resource country


War and conflict have consequences on the mental health of individuals and entire communities. Many studies have found evidence of effects of war and violence, with survivors of war being at an increased risk for a wide range of mental health and psychosocial problems [1–3]. In their review, Miller and Rasmussen [4] present evidence that the impact of war goes beyond the direct exposure to violence, recognizing that stressful social and economic conditions following conflict also have significant impacts on mental health and well-being. War in low-income countries often exacerbates existing social problems such as unequal distribution of resources, ethnic and gender discrimination, and fragile democratic processes [5].

Recent post-war mental health research has primarily focused on psychiatric diagnoses (e.g., PTSD) [6], with most intervention impact data coming from open trials with small numbers or case reports. For example, in Nicholl and Thompson’s [7] review of psychotherapies used among refugees, only one randomized controlled study was cited [8]. This study showed positive results from two CBT approaches: an exposure-alone treatment and exposure plus cognitive reprocessing. Older literature on treating war- and conflict-affected populations reports less positive results and the need for longer-term treatment [9]. While there is broad agreement that mental health interventions need to be community based in the aftermath of war trauma, genocide, and organized political violence [10], there is little empirical evidence on how to best treat this population.

Although there is increasing evidence that psychotherapeutic models found to be effective in the West can be adapted and implemented in low-resource settings [11–13], the majority of humanitarian and non-governmental organizations (NGOs) do not use these types of evidence-based models. Important reasons for not using these programs include the increased costs of training and implementation and the lack of available specialized human resources in contrast to the costs and resource needs of existing non-specific counseling or psychosocial programs that often employ paraprofessionals. The result is that NGOs are expending large amounts of resources on interventions for which there are consensus guidelines [14], but currently little research to support this consensus. These interventions need to be evaluated to further the limited empirical evidence base for efficacious community-based treatments for improving the mental health and psychosocial problems in humanitarian settings [15, 16].

The purpose of this study was to assess the effectiveness of a non-specific counseling model. Our intent was both to determine whether this particular model was effective among the population who were to receive it as part of a services program and to demonstrate that NGO implementers and researchers can conduct impact research as part of program implementation even in challenging environments.

The site of the study was Aceh, Indonesia. The communities in the interior of Aceh have experienced more than 30 years of armed conflict between the forces of the Republic of Indonesia and the Free Aceh Movement (Gerakan Aceh Merdeka, GAM). These populations are situated far from the sea and were not directly affected by the disastrous effects of the 2004 tsunami. They were, however, affected by the unequal distribution of resources provided to the tsunami survivors, which according to the United Nations led to local discontent and perceptions of inequality with regard to international aid distribution [17].

During the year following the signing of the August 2005 peace agreement, researchers from the International Organization for Migration (IOM) and the Department of Social Medicine, Harvard Medical School carried out a psychosocial needs assessment in high conflict communities across Aceh. The study found that mental health problems associated with war and conflict were common: 35% of the respondents ranked high on depression symptoms, 39% on anxiety symptoms, and 10% on PTSD. Of the civilians, 28% were victims of beating and 38% suffered the loss of a family member or friend [18]. The study report concludes with an urgent call for mental health services to be provided as part of the peace-building and post-conflict recovery process.

To meet the reported mental health needs of this population, in 2007 the International Catholic Migration Commission (ICMC) began a community-based counseling and support program based on a problem-solving approach. This group intervention, designed and implemented by national and international ICMC staff in Indonesia, was a component of a larger intervention package of general health services. This paper presents the results of a controlled trial of the effectiveness of only the group counseling approach in reducing the burden of psychosocial and psychological problems and improving functioning among adults affected by conflict-related violence in Aceh, Indonesia. The tools to measure the effectiveness of the intervention were developed following a qualitative assessment of the mental health and psychosocial problems of the target population [19].

The trial was conducted from August 2007 to January 2008 as a joint project between ICMC and Johns Hopkins University, with financial support from the Victims of Torture Fund at USAID. Our primary research aim was to investigate the impact of the group counseling intervention on reducing the severity of mental symptoms and associated dysfunction. We also planned to investigate potential moderating effects of gender and age because the literature in general, as well as recent cross-cultural studies, has shown some differential effects of gender and age with certain treatments [11, 20–22]. Many of these studies note that as the evidence base grows for mental health interventions, future research should address gender and age distinctions to better guide appropriate intervention selection. This study represents one of few studies on the effectiveness of a non-specific group counseling intervention in a low-resource environment affected by war and trauma.


Site and population

The study population included adults in six villages around the central town of Bireuen District. This district was considered one of the strongholds of GAM and these villages were frequently attacked by the Indonesian military throughout the 1980s and 1990s. The populations in the study villages were exposed to systematic human rights violations, with entire villages experiencing torture through direct experience, torture of family members, and/or witnessing of torture and arbitrary killings. A local torture survivor and treatment organization (RATA) helped identify potential study villages, selecting those with historically high rates of torture and where other NGOs were not currently providing services and therefore the need was greatest. Villages were paired based on distance to the urban district center. Table 1 presents the population structures of the study villages. The designation of intervention or control status was made in discussion with ICMC and RATA staff, who had to consider which villages would be more or less accessible during the rainy season in which the first round of services were provided, with more accessible villages given priority as intervention sites. The designation of intervention or control village was not shared with the interviewers until after data collection was complete, preventing any knowledge of which villages would get services first from biasing the baseline assessments.
Table 1 Description of the pairs of study villages

| | Village 1a | Village 1b | Village 2a | Village 2b | Village 3a | Village 3b |
|---|---|---|---|---|---|---|
| Intervention status | | | | | | |
| Population size | | | | | | |
|  Males, N (%) | 536 (51) | 804 (49) | 316 (45) | 565 (48) | 323 (47) | 242 (47) |
|  Females, N (%) | 515 (49) | 843 (51) | 381 (55) | 619 (52) | 359 (53) | 271 (53) |
| Age breakdown | | | | | | |
|  0–14 years, N (%) | 351 (33) | 519 (32) | 260 (37) | 387 (33) | 233 (34) | 164 (32) |
|  15–64 years, N (%) | 660 (63) | 1,074 (65) | 389 (56) | 737 (62) | 415 (61) | 315 (61) |
|  65+ years, N (%) | 40 (4) | 54 (3) | 48 (7) | 60 (5) | 34 (5) | 34 (7) |

Population data provided by community leaders

I intervention village, C control village

Measure development and scoring

The assessment instrument consisted of four sections dealing with general demographic information, psychological signs and symptoms, ability to function, and the use of specific coping mechanisms. The draft version of the instrument was based on data from an initial qualitative study conducted by local ICMC and RATA staff in the same district [19]. This study, applying free listing and key informant qualitative interviewing methods, showed a range of signs and symptoms similar to those defined within the domains of anxiety, depression, and somatoform disorders. Based on these results and in discussion with ICMC staff and the Acehnese staff who had experience working with the target population, we drafted a 44-item assessment of mental health problems consisting of an adapted version of the Hopkins Symptom Checklist (HSCL)-25 [23] assessment of depression and anxiety symptoms, a 7-item somatic symptom scale derived from the Symptom Checklist-90-R (SCL-90-R) [24] and 12 locally identified signs and symptoms of distress that covered both depression and anxiety symptoms. The HSCL was selected based on the similarity of the items in the HSCL to what emerged from the qualitative study, and our past experience in adapting this measure for use in a variety of cultural contexts. The 12 locally identified items referred to symptoms commonly mentioned in association with symptoms in the existing measures, but not already included in them. Although the target population was selected based on their past exposure to trauma and violence, the cardinal symptoms of PTSD (i.e., re-experiencing the event and numbing/avoidance symptoms) were not commonly mentioned during the assessment. As both the local workers and ICMC staff agreed that, based on their experience, PTSD did not appear to be an important issue, we did not include a PTSD assessment measure.

To assess functional impairment, we combined an adapted version of a standard measure with a locally developed scale based on specific tasks and activities identified as relevant to the local population. For the standard measure, we adapted sections of the interviewer-administered version of the World Health Organization Disability Assessment Schedule II (WHODAS II) [25], which assesses non-disorder-specific aspects of functioning (i.e., being able to walk, stand, work, interact with others, and care for self). Locally developed functionality scales were designed by utilizing elements from the WHODAS II and an approach originally described by Bolton and Tang [26] and since used by us and other groups [20, 27, 28]. These scales included items (14 for men; 16 for women) that the local population identified during the prior qualitative study as typical tasks that adults regularly do to take care of themselves and their families, and to participate in their communities.

We also developed questions on coping mechanisms for dealing with distress. Like the specific tasks in the function instrument, these were based on the data from the previous qualitative study. Nine different coping strategies were assessed (praying, reciting Koran, earning money, sitting together to chat, walking to please one’s heart, discussing, listening to advice from wise men, and engaging in sports and recreation). Respondents were asked how often they used each strategy when they felt badly (0 not at all, 1 rarely, 2 somewhat, 3 often). A coping scale score was calculated from these nine questions by summing the responses.

Scales were developed to assess depression, anxiety, and somatic symptoms. Each scale was created by summing the scores for its included symptoms, with the frequency of each sign and symptom over the preceding 2 weeks rated by the respondent on a 4-point Likert scale (from 0, not having that symptom at all, to 3, experiencing that symptom all of the time). For the function questions, the respondent was asked to rate their ability to engage in each activity on a 5-point Likert scale (from 0, having no difficulty doing the task, to 4, having so much difficulty that the task cannot be done). Three summary functional impairment scales were generated by summing the scores for the local specific tasks for men (14 items), the local specific tasks for women (16 items), and the non-gender-specific elements of functionality (11 items from the WHODAS II).
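The summed-score rule described here can be sketched in a few lines. This is a minimal illustration only: the function name `sum_scale` and the sample responses are hypothetical and are not taken from the study instrument or its data.

```python
from typing import Sequence


def sum_scale(responses: Sequence[int], max_item_score: int) -> int:
    """Sum item responses after checking each lies in 0..max_item_score."""
    for r in responses:
        if not 0 <= r <= max_item_score:
            raise ValueError(f"response {r} outside 0..{max_item_score}")
    return sum(responses)


# Hypothetical respondent: 44 symptom items, each rated 0-3 (made-up answers).
symptom_items = [2, 1, 3, 0] * 11
total_symptoms = sum_scale(symptom_items, max_item_score=3)

# Hypothetical answers to the 11 WHODAS II function items, each rated 0-4.
function_items = [0, 1, 4, 2, 0, 0, 3, 1, 0, 2, 1]
function_score = sum_scale(function_items, max_item_score=4)

print(total_symptoms, function_score)
```

With 44 items rated 0 to 3, the largest possible total symptom score is 44 × 3 = 132.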

Psychometric testing

Scale validity and reliability were evaluated using methods described previously [27, 28], using an approach to explore validity in situations where mental health professionals familiar with the local culture are not available to make clinical diagnoses. This process relies on agreement as to the presence/absence of symptoms by knowledgeable local key informants and the study participants themselves. Thus, we began the validation study by asking community leaders and others who were knowledgeable about the residents of their villages to identify a sample of local adults suffering from fear and/or thinking too much: the two local psychological problems most commonly mentioned by local people in the initial qualitative study [19]. We also asked these same leaders to identify a sample of adults not suffering from either problem. This process resulted in a sample of adults identified as suffering from problems of fear and/or thinking too much (n = 106) and a sample suffering from neither of these problems (n = 73).

Trained local interviewers then interviewed the adults from these lists while blind to their identified status, administering the assessment instrument and asking them whether they themselves and their family members thought they suffered from either or both of these problems. In this way, we generated a list of 86 respondents who were concordant for having at least one of the problems [i.e., whom both the key informant and the respondent identified as having either problem (cases)] and 23 respondents who were concordant for having neither of the problems [i.e., whom both the key informant and the respondent identified as having neither problem (non-cases)].

To evaluate discriminant validity, we compared the average scale scores among the cases and non-cases. Reliability of the subscales was also evaluated. Pearson’s correlation coefficients were used to evaluate test–retest reliability, comparing assessments done 2–3 days apart.

All three symptom scale scores were substantially and significantly greater for the cases than for the non-cases, suggesting that the scales can adequately discriminate between those with and without the problems of fear and/or thinking too much. Specifically, the depression symptom scale score was 16.6 (SE 0.9) for the cases and 6.5 (SE 1.0) for the non-cases (p < 0.0001); the anxiety symptom scale score was 16.6 (SE 0.8) and 5.3 (SE 1.2) for the two groups (p < 0.0001); and the somatic scale score was 14.1 (SE 0.5) and 7.4 (SE 1.2) for the two groups (p < 0.0001). A total symptom scale was also developed that included the 44 signs and symptoms from all three scales as well as the additional local symptoms, thereby providing a general measure of the presence and severity of mental health and psychosocial symptoms. This scale, with possible scores ranging from 0 to 132, also discriminated well, with cases having scores more than twice as high as the non-cases: 62.1 (SE 2.5) versus 23.3 (SE 3.8), respectively. With regard to test–retest reliability, correlation coefficients were adequate at 0.65 for the anxiety subscale and 0.68 for the somatic subscale, and strong at 0.91 for the depression subscale. Cronbach’s alpha scores (a measure of internal reliability) ranged from 0.81 to 0.87 for the three scales. Alpha scores of >0.70 generally indicate adequate internal consistency.
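For readers unfamiliar with the two reliability statistics mentioned above, the following sketch shows how Cronbach's alpha and a Pearson test–retest correlation are conventionally computed. It uses the standard textbook formulas and made-up responses. The function names and data are hypothetical, and the study's own software and computation are not described in the text.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """item_scores: one list of k item responses per respondent."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Made-up data: five respondents answering a three-item scale (0-3 each).
first_visit = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 3], [1, 2, 1]]
# Made-up summed scores for the same respondents a few days later.
retest_totals = [2, 4, 6, 9, 3]

alpha = cronbach_alpha(first_visit)
retest_r = pearson_r([sum(row) for row in first_visit], retest_totals)
print(f"alpha = {alpha:.2f}, test-retest r = {retest_r:.2f}")
```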

Eligibility and screening

Eligibility for inclusion in the counseling program was based on severity of mental health symptoms overall and associated functional impairment, as indicated by score on the total symptom scale (i.e., the 44 symptoms that constituted all three mental health subscales) and score on the function scale. The cutoff score for the total symptom scale was set at one standard deviation (SD 24) below the mean (62.1 points) score among the cases group, resulting in a threshold of 38 points, more than 10 points greater than the average scores for the non-cases. This same method was used in a previous trial our group conducted [7] and was found to adequately identify participants with levels of psychological distress high enough to warrant a mental health oriented intervention. Eligible cases also had to indicate at least some level of functional impairment, as assessed by a score >0 on either the local function or the adapted WHODAS II measure.
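The eligibility rule can be written out as a short check. The sketch below uses the case-group mean and SD quoted in the text (62.1 and 24) and the >38 threshold reported for screening. The function and argument names are hypothetical, and the handful of borderline inclusions described later are not modeled.

```python
CASE_MEAN = 62.1  # mean total symptom score among validation cases (from the text)
CASE_SD = 24.0    # standard deviation among cases (from the text)

# One SD below the case mean: 62.1 - 24 = 38.1, reported as a 38-point threshold.
CUTOFF = round(CASE_MEAN - CASE_SD)


def is_eligible(total_symptoms, local_function, whodas_function):
    """Return True if a screened adult meets both inclusion criteria."""
    has_symptoms = total_symptoms > CUTOFF
    # Some impairment on either the local or the adapted WHODAS II measure.
    has_impairment = local_function > 0 or whodas_function > 0
    return has_symptoms and has_impairment


# Hypothetical screened adults (made-up scores).
print(is_eligible(total_symptoms=52, local_function=0, whodas_function=3))
print(is_eligible(total_symptoms=52, local_function=0, whodas_function=0))
```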

Screening and baseline assessments were conducted in August 2007. The intervention lasted from September through December 2007 and the post-intervention assessments were conducted in January 2008. As there were no available data to suggest an expected amount of change due to participation in the intervention program, a 10-point difference in total symptom scale change scores was selected as the benchmark, based on what we considered to be the minimum change that might also be clinically significant.

A total of 592 adults were interviewed. Of these 592, 415 (70%) met the cutoff criteria of having a total symptom score of >38 points, as well as some degree of functional impairment. The high percentage of eligible adults was due in part to the study recruitment methods, which specifically focused on identifying community members with high levels of psychosocial and common mental health problems. This was done through community education activities as well as through referrals by eligible people themselves. Five additional people were also included because their functional impairment scores were high even though their problem scale scores were borderline between 37 and 37.7 points. This gave the total eligible sample as N = 420. All of the eligible adults were told to what assessment arm their village was assigned (intervention or wait control) and asked whether they consented to participate in the program. This was done because the eligible persons would find out this information later and we were concerned that those in the wait-control group might feel misled and drop out of the study at that time. Using this approach, 415 (99%) consented to participate (see Fig. 1 for sample flow).
Fig. 1 Eligibility and follow-up flow chart


The intervention consisted of a group of activities collectively referred to as the Problem-Solving Counseling (PSC) Program. This program, developed by ICMC Indonesia, exists in a manualized form available from the authors, with training covering topics such as qualities of an effective helper, confidentiality, empathy, listening and responding, questioning and problem management skills, stress and coping, and information specifically on the consequences and needs of torture survivors.

To implement this program, locally referred to as the ‘Kelompok Peugah Peugah Haba’ (talking group), ICMC worked with RATA. RATA counselors received an initial 5-day training by ICMC on the program components. They then provided individual counseling to torture survivors in non-study communities for 3 months to improve their skills. Regular supervision by the ICMC staff was provided during this initial period to ensure proper implementation and to identify areas requiring further training. After a few months of providing individual services, the counselors were provided with a second training on implementing the program in a group format, including skills for group management. The group intervention was provided by pairs of counselors working together. The process was designed so that if it became necessary to replace a counselor, new counselors would be trained and then matched with more experienced counselors to continue running the groups. This occurred with three of the counselors during the course of the study.

The group counseling program consisted of eight weekly group sessions. In the first two sessions, the intervention was introduced, expectations were discussed, and current problems related to distress were identified by the participants. The selection of focal problems was done by each group of study participants. The selected problems predominantly focused on the challenges of daily life. The third through sixth sessions consisted of discussions and sharing of individual experiences with the current problems identified in the first sessions and of how different members employed strategies to cope. The sessions were theme based, focusing on these different problems, with clients talking about their own experiences and how they coped. The goal of these sessions was for clients to be able to identify their currently disturbing emotions and stressors and learn how to manage them. The seventh session included a self-evaluation of how participants had been doing since they joined the group and included discussions on positive and negative changes. Finally, the eighth session consisted of looking toward the future, with participants encouraged to talk about their next plans, whether they wanted to continue meeting with the group (without the counselors), and how they would arrange those meetings. At the sixth session, the group was asked to choose one leader who would assist in the facilitation of the final sessions so that the group process could continue after the formal program sessions were complete. Group and individual counselor supervision was provided throughout the study.

Prior to the initiation of the intervention, the counselors conducted program introduction sessions in the community (called socialization sessions) in each study village, to introduce the community to the organization and the services that would be provided. This was done to improve the acceptance of these services within the community and inform potential participants of the dates on which the interviewers would conduct the eligibility screenings. These community presentations were open to everyone in the village.


Descriptive analyses were conducted on the baseline data to determine whether the intervention and control groups were comparable. We calculated the amount of change experienced by each study participant on each subscale (psychosocial problems, functional impairment, and use of coping strategies) by subtracting the post-intervention scores from the scores attained during the original screening interviews (i.e., baseline). The level of significance of the mean change on each subscale between intervention and control groups was calculated using post hoc t tests. Regression analyses were used to evaluate the impact of covariates (e.g., age, village, gender, counseling pair) on these outcomes. The results from the regression analyses were adjusted for potential group effects using a clustering variable for the intervention group. This was necessary because people in groups may influence each other, and this dependence must be accounted for in the analysis. Variation in outcomes by counseling pair (i.e., whether any counseling pairs had groups with, on average, different results than the others) was evaluated by systematically removing each counseling pair from the analysis and investigating whether there were changes in results and/or statistical significance.
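The change-score calculation and the leave-one-counseling-pair-out check can be illustrated as follows. This is a simplified, unadjusted sketch on made-up records. It does not reproduce the covariate-adjusted, cluster-adjusted regression described above, and the record layout and function names are hypothetical.

```python
from statistics import mean


def change_score(baseline, follow_up):
    """Baseline minus follow-up, so a positive value means symptoms fell."""
    return baseline - follow_up


def mean_change_difference(records):
    """Unadjusted mean change in the intervention arm minus the control arm."""
    by_arm = {"intervention": [], "control": []}
    for r in records:
        by_arm[r["arm"]].append(change_score(r["baseline"], r["follow_up"]))
    return mean(by_arm["intervention"]) - mean(by_arm["control"])


def leave_one_pair_out(records):
    """Recompute the difference after dropping each counseling pair in turn."""
    pairs = sorted({r["pair"] for r in records if r["arm"] == "intervention"})
    return {
        p: mean_change_difference([r for r in records if r["pair"] != p])
        for p in pairs
    }


# Made-up records; control participants have no counseling pair.
records = [
    {"arm": "intervention", "pair": "A", "baseline": 60, "follow_up": 40},
    {"arm": "intervention", "pair": "A", "baseline": 70, "follow_up": 55},
    {"arm": "intervention", "pair": "B", "baseline": 65, "follow_up": 63},
    {"arm": "intervention", "pair": "B", "baseline": 58, "follow_up": 54},
    {"arm": "control", "pair": None, "baseline": 62, "follow_up": 52},
    {"arm": "control", "pair": None, "baseline": 66, "follow_up": 58},
]

print(mean_change_difference(records))
print(leave_one_pair_out(records))
```

A large shift in the difference when one pair is dropped would flag that pair's groups as behaving differently from the rest.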

Statistical significance of the regression coefficients was set at p = 0.05, two-tailed and expressed as a 95% confidence interval (CI). Cohen’s d effect sizes were calculated for the differences in scale change scores comparing intervention with control groups. With no a priori information on expected intervention effects, post hoc power calculations were conducted to investigate our power to detect significant effects. Given the completer sample size of n = 158 intervention and n = 175 control participants, we estimate 95% power to detect a moderate (d = 0.40) effect size.
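As a rough plausibility check on the stated power, the sketch below applies a normal approximation for a two-sided, two-sample comparison of means at alpha = 0.05. This is an assumption made for illustration: the text does not say which method or software the authors used, and the approximation ignores the clustering of participants within counseling groups.

```python
from math import erf, sqrt

Z_CRIT = 1.959964  # two-sided critical value of the standard normal for alpha = 0.05


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def approx_power(d, n1, n2, z_crit=Z_CRIT):
    """Approximate power to detect standardized effect size d with two groups."""
    # Noncentrality: the effect size scaled by the effective sample size.
    ncp = d * sqrt(n1 * n2 / (n1 + n2))
    # Probability of rejecting in either tail under the alternative.
    return normal_cdf(ncp - z_crit) + normal_cdf(-ncp - z_crit)


power_moderate = approx_power(0.40, 158, 175)
power_small = approx_power(0.25, 158, 175)
print(f"d = 0.40: {power_moderate:.2f}; d = 0.25: {power_small:.2f}")
```

Under these assumptions the approximation gives roughly 95% power for d = 0.40 with 158 and 175 completers, in line with the figure stated above.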


Baseline characteristics

Table 2 compares the characteristics of the intervention and control groups among the 420 respondents eligible for participation in the counseling program and among those who actually participated. Although there was some variation in the proportion of men and women meeting eligibility criteria, across intervention and control status the demographic characteristics did not significantly differ. Most were between the ages of 30–69 years. Nearly 80% of the sample was married, with most others being widowed. The symptom scales are similar across the intervention and control groups, while the functional impairment levels differ, with the controls having higher rates of impairment among both men and women.
Table 2 Baseline demographics and scale scores of intervention and control samples

| | Eligible for participation: intervention sample (N = 214) | Eligible for participation: control sample (N = 206) | p value* | Actual participants (a): intervention sample (N = 158) | Actual participants (a): control sample (N = 175) | p value* |
|---|---|---|---|---|---|---|
|  Male, N (%) | 107 (50) | 85 (41) | | 71 (45) | 70 (40) | |
|  Female, N (%) | 107 (50) | 121 (59) | | 87 (55) | 105 (60) | |
|  <30 years, N (%) | 17 (8) | 14 (7) | | 10 (6) | 12 (7) | |
|  30–49 years, N (%) | 91 (42) | 97 (47) | | 70 (44) | 81 (46) | |
|  50–69 years, N (%) | 79 (37) | 77 (37) | | 56 (35) | 67 (38) | |
|  70 or more years, N (%) | 27 (13) | 18 (9) | | 22 (14) | 15 (9) | |
| Marital status | | | | | | |
|  Single, N (%) | 10 (5) | 6 (3) | | 4 (3) | 5 (3) | |
|  Married, N (%) | 170 (79) | 161 (78) | | 126 (80) | 136 (78) | |
|  Widow/widower, N (%) | 32 (15) | 39 (19) | | 26 (16) | 34 (19) | |
|  Divorced, N (%) | 2 (1) | | | 2 (1) | | |
| Mental health symptoms scales | | | | | | |
|  HSCL depression scale, mean (SD) | 17.4 (8.1) | 17.9 (6.5) | | 17.6 (8.1) | 18.0 (6.6) | |
|  HSCL anxiety scale, mean (SD) | 17.8 (6.5) | 17.0 (5.9) | | 18.1 (6.2) | 17.1 (5.8) | |
|  Somatic scale, mean (SD) | 15.4 (4.1) | 15.5 (3.7) | | 15.6 (4.0) | 15.4 (3.8) | |
|  Total symptoms (b), mean (SD) | 65.9 (20.5) | 65.3 (17.8) | | 66.2 (20.1) | 65.7 (18.1) | |
| Functionality scales | | | | | | |
|  Local functions, male (14 items), mean (SD) | 10.5 (8.6) | 13.7 (10.4) | | 11.9 (9.5) | 13.7 (10.8) | |
|  Local functions, female (16 items), mean (SD) | 11.5 (10.6) | 14.9 (9.7) | | 11.6 (9.8) | 15.1 (10.0) | |
|  WHO DAS items (11 items), mean (SD) | 10.0 (7.0) | 12.0 (6.7) | | 10.3 (7.0) | 12.1 (6.8) | |
|  Use of coping strategies scale (9 items), mean (SD) | 15.0 (5.3) | 15.4 (4.9) | | 15.1 (5.1) | 15.4 (4.9) | |

* p value represents the statistical comparison of the distribution of the demographic characteristics and scale scores between the intervention and control samples

(a) Those persons found eligible who agreed to participate, had a follow-up interview, and (for intervention participants) attended at least two group sessions

(b) Total symptoms includes the sum of the HSCL depression and anxiety scales, the somatic scale, plus the 12 locally identified symptoms

Those who participated (n = 333) are defined as the subset of the eligible who consented to participate, who underwent a follow-up interview, and who (for the intervention arm) attended at least two group counseling sessions. Among the participants, the intervention and control groups did not differ demographically or on the mental problem subscales. A difference in functional impairment remains, though it is statistically significant only among the females.

Intervention impact on psychological symptoms

Table 3 presents the comparison of intervention to control participants for all three of the symptom scales, as well as total symptom scores. Based on these analyses, there are similar reductions in symptom severity scales among the intervention and control study arms. None of the effect sizes was >0.25 (small effect) with most below 0.10 (no effect). Post hoc power analyses identified >95% power to detect moderate (0.40) effect sizes and >60% power to detect small (0.25) effect sizes. Given the lack of a statistically significant difference between the intervention and control arms, intent-to-treat analyses were not conducted.
Table 3 Change in scale scores comparing intervention to control participants

| | Intervention (N = 158) | Control (N = 175) |
|---|---|---|
| HSCL depression scale (possible range: 0–45) | | |
|  Baseline score, mean (SD) | 17.6 (8.1) | 18.0 (6.6) |
|  Follow-up score, mean (SD) | 13.9 (9.0) | 14.6 (7.4) |
|  Amount of change, mean (SD), % | 3.7 points (10.4), 21% | 3.5 points (7.7), 19% |
|  Difference between intervention and control groups in adjusted mean score change (95% CI) (a) | 0.4 (−2.3 to 3.0), p = 0.77 | |
|  Effect size (b) | | |
| HSCL anxiety scale (possible range: 0–30) | | |
|  Baseline score, mean (SD) | 18.1 (6.2) | 17.1 (5.8) |
|  Follow-up score, mean (SD) | 14.0 (8.2) | 13.8 (7.0) |
|  Amount of change, mean (SD), % | 4.1 points (9.5), 23% | 3.3 points (7.4), 19% |
|  Difference between intervention and control groups in adjusted mean score change (95% CI) (a) | −0.1 (−2.6 to 2.3), p = 0.91 | |
|  Effect size (b) | | |
| WHO somatic scale (possible range: 0–21) | | |
|  Baseline score, mean (SD) | 15.6 (4.0) | 15.4 (3.8) |
|  Follow-up score, mean (SD) | 12.7 (5.3) | 13.9 (4.8) |
|  Amount of change, mean (SD), % | 2.9 points (6.1), 19% | 1.5 points (5.2), 10% |
|  Difference between intervention and control groups in adjusted mean score change (95% CI) (a) | 1.2 (−0.5 to 2.8), p = 0.15 | |
|  Effect size (b) | | |
| Total symptom scale (possible range: 0–132) | | |
|  Baseline score, mean (SD) | 66.2 (18.1) | 65.7 (18.1) |
|  Follow-up score, mean (SD) | 51.6 (27.0) | 53.3 (22.2) |
|  Amount of change, mean (SD), % | 14.6 points (30.2), 22% | 12.3 points (21.3), 19% |
|  Difference between intervention and control groups in adjusted mean score change (95% CI) (a) | 1.4 (−6.6 to 9.5), p = 0.72 | |
|  Effect size (b) | | |


Participants defined as in Table 2

aAdjusted for baseline symptom score, sex, age and group clustering

bCohen’s d effect size: small (0.15–0.40); medium (0.40–0.75); large (>0.75)
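The percentage figures in Table 3 appear to be the mean change expressed as a share of the baseline mean, and the footnote gives bands for interpreting Cohen's d. The Python sketch below illustrates both. The function names are hypothetical, the label "below small" is not the paper's term, and assigning the shared boundary value 0.40 to "medium" is an assumption, since the footnote lists it in both bands.

```python
def percent_change(baseline_mean: float, change: float) -> int:
    """Mean change as a rounded percentage of the baseline mean."""
    return round(100.0 * change / baseline_mean)


def label_effect_size(d: float) -> str:
    """Label |d| using the bands given in the table footnote.

    Assumption: the boundary value 0.40 is treated as "medium".
    """
    size = abs(d)
    if size > 0.75:
        return "large"
    if size >= 0.40:
        return "medium"
    if size >= 0.15:
        return "small"
    return "below small"  # the footnote gives no name for d < 0.15


if __name__ == "__main__":
    # Baseline means and mean changes taken from the HSCL depression rows.
    print(percent_change(17.6, 3.7))  # intervention arm
    print(percent_change(18.0, 3.5))  # control arm
    print(label_effect_size(0.25))
```

Applied to the depression rows, the first function returns 21 and 19, matching the percentages shown in the table.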

Gender-specific analyses did not show any significant results. Analyses exploring an interaction of gender and age found a trend in which age mattered among men but less so among women: older men improved less than younger men, whereas among women age did not seem to be associated with the amount of change (data not presented). In exploring differential impacts by counseling pair, we found that, on average, the group participants of one pair of counselors showed consistently poorer improvement across the scales; however, removing these participants did not improve the statistical significance of the comparisons and thus did not substantially change the conclusions (data not presented).

Intervention impact on functioning

Table 4 presents results for the change in functional impairment, analyzed separately by gender, for the adapted WHODAS II scale and the locally defined functioning scale. Among the men, the scales performed similarly, with substantial improvements (in percentage terms) in both the intervention and control groups on both scales, but noticeably greater improvement in the intervention group (although falling just short of statistical significance for both scales). Among the women, the pattern was very different: both intervention and control groups showed only small changes on both scales. The largest change was actually a worsening of the function score on the local scale among the intervention group. This resulted in a statistically significant (but programmatically less relevant) difference between the intervention and control groups. None of the comparisons resulted in important effect sizes (i.e., >0.20). Age did not seem to be related to the amount of improvement on either scale for men or women (data not shown). When the intervention participants who were part of groups run by the poorest-functioning counseling pair were removed, the difference between intervention and control participants widened, with the differences moving from borderline significant to significant (p < 0.05) for the males (data not presented).
Table 4

Change in functional impairment comparing intervention to control participants

| Measure | Intervention | Control |
| --- | --- | --- |
| Local function scale, male (possible range: 0–56) | | |
| Baseline score, mean (SD) | 11.9 (9.5) | 13.7 (10.8) |
| Follow-up score, mean (SD) | 8.0 (6.2) | 11.3 (8.1) |
| Amount of change, mean (SD), % | 3.9 points (9.9), 33% | 2.5 points (9.7), 18% |
| Difference between intervention and control groups in adjusted mean score change (95% CI)ᵃ | 2.8 (−0.2 to 5.9), p = 0.07 | |
| Effect sizeᵇ | | |
| WHO DAS items, male (possible range: 0–44) | | |
| Baseline score, mean (SD) | 10.0 (7.7) | 11.7 (7.5) |
| Follow-up score, mean (SD) | 7.0 (4.9) | 10.0 (5.9) |
| Amount of change, mean (SD), % | 2.9 points (7.0), 30% | 1.7 points (6.1), 15% |
| Difference between intervention and control groups in adjusted mean score change (95% CI)ᵃ | 2.4 (−0.1 to 5.0), p = 0.06 | |
| Effect sizeᵇ | | |
| Local function scale, female (possible range: 0–64) | | |
| Baseline score, mean (SD) | 11.6 (9.8) | 15.1 (10.0) |
| Follow-up score, mean (SD) | 13.0 (10.5) | 14.6 (8.1) |
| Amount of change, mean (SD), % | −1.4 points (12.7), −12% | 0.5 points (11.4), 3% |
| Difference between intervention and control groups in adjusted mean score change (95% CI)ᵃ | 1.1 (−2.3 to 4.6), p = 0.48 | |
| Effect sizeᵇ | | |
| WHO DAS items, female (possible range: 0–44) | | |
| Baseline score, mean (SD) | 10.6 (6.5) | 12.3 (6.3) |
| Follow-up score, mean (SD) | 9.8 (6.2) | 12.3 (5.7) |
| Amount of change, mean (SD), % | 0.8 points (7.6), 8% | 0.01 points (7.2), 0% |
| Difference between intervention and control groups in adjusted mean score change (95% CI)ᵃ | 2.3 (0.4 to 4.3), p = 0.02 | |
| Effect sizeᵇ | | |

Participants defined as in Table 2

ᵃ Adjusted for baseline function score, sex, age and group clustering

ᵇ Cohen’s d effect size: small (0.15–0.40); medium (0.40–0.75); large (>0.75)
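A reader can roughly cross-check a reported p-value against its 95% confidence interval. The Python sketch below does this under a plain normal approximation. It is not the authors' method: the published values come from regression models adjusted for covariates and group clustering, and the table figures are rounded, so only approximate agreement should be expected. The function name is hypothetical.

```python
import math


def p_from_ci(estimate: float, lower: float, upper: float,
              z_crit: float = 1.959964) -> float:
    """Approximate two-sided p-value implied by a 95% confidence interval.

    Assumes a symmetric, normal-theory interval, so that the standard
    error equals the interval width divided by 2 * 1.96.
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = abs(estimate) / se
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


if __name__ == "__main__":
    # Adjusted differences and intervals taken from Table 4.
    print(f"Local function scale, male: p = {p_from_ci(2.8, -0.2, 5.9):.2f}")
    print(f"WHO DAS items, female:      p = {p_from_ci(2.3, 0.4, 4.3):.2f}")
```

For these two rows the approximation lands close to the reported p = 0.07 and p = 0.02. Rows where the rounded interval is not symmetric around the estimate will agree less closely.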

An additional area of functioning that we assessed was the use of coping strategies that local people had described as ways of helping themselves feel better. Table 5 presents the results of the analysis of use of coping strategies among intervention and control participants, separately by gender. Among both men and women, reported use of coping strategies increased among intervention participants and decreased among control participants, with the difference over time reaching statistical significance for men. For these analyses, including or excluding the participants of the poorest-functioning counseling pair did not significantly affect the results.
Table 5

Change in usage of coping strategies comparing intervention to control participants

| Measure | Intervention (N = 83) | Control (N = 103) | Intervention (N = 69) | Control (N = 68) |
| --- | --- | --- | --- | --- |
| Coping scale (possible range: 0–27) | | | | |
| Baseline score, mean (SD) | 13.7 (4.4) | 14.1 (4.5) | 16.9 (5.4) | 17.3 (5.0) |
| Follow-up score, mean (SD) | 14.7 (4.2) | 12.8 (4.7) | 17.7 (4.3) | 15.1 (4.3) |
| Amount of changeᵃ, % | 1.0 points (7%) | −1.3 points (−9%) | 0.8 points (5%) | −2.2 points (−13%) |
| Difference between intervention and control groups in adjusted mean score change (95% CI)ᵇ | 2.2 (−0.1 to 4.5), p = 0.06 | | 2.7 (0.7 to 4.8), p = 0.01 | |

10 respondents (6 female, 4 male) have missing baseline data and are not included in this analysis

ᵃ The hypothesis is that the intervention will improve usage of coping strategies, so the change scores are based on follow-up minus baseline scores

ᵇ Regression analyses adjusted for baseline coping score, age, and clustering by group


Discussion

We did not find an effect on the burden of depression and anxiety (effect sizes <0.10), but did find a small effect (effect size 0.25) on somatic symptoms, when compared with a control population who were wait-listed to receive the intervention. For the functional impairment outcomes, men showed more intervention impact than women. While it is not clear why this difference in impact by gender occurred, one explanation might be that men have more power than women to make changes in their daily lives in this context. Both men and women showed small increases in their use of coping strategies compared with controls, whose use of such strategies actually declined. The decline among controls may be spurious or may reflect unmeasured comorbid conditions that make participants less likely to use these strategies. These results are similar to those of a controlled trial among torture survivors in Nepal, which also investigated a non-specific counseling intervention using similar types of outcomes. In that study, counseling was associated with changes in functioning and somatic symptoms, but not with other mental health outcomes [29]. It is not possible to determine whether the impact on functioning will be maintained over time or whether the improvement is short-lived without accompanying changes in mental health symptomatology. Only longer-term follow-up studies will allow for this type of investigation.

One possible explanation for our results is that this counseling model is based on a client-guided problem-solving approach to the problems selected by the group of clients at the time the groups were created. These groups tended to select problems referring to the struggles of daily living and their associated feelings and emotions rather than to the symptoms of depression and anxiety identified in the qualitative study. If so, improvements in functioning and somatic complaints may have occurred because participants chose to focus on their highest priority non-psychological and physical problems and this approach was able to address these issues with some success. If this is what happened, we might still have expected some improvement in mental health symptoms as a result of improvement in non-psychological problems, since the former would likely be worse in the presence of the latter. Other local data support this view: one of the major problems discussed by villagers from all six study villages during both the assessment and intervention periods was the lack of economic opportunities and job prospects. The initial qualitative study [19] found that many of the mental health problems were thought by local people to be caused by economic problems. However, improved functioning in such an environment might not be helpful in relieving stress if there are no opportunities or resources to exploit.

A second possibility is that the mental health symptoms we assessed represent true mental disorders that did not respond to non-specific counseling methods, even though the counseling resulted in some improvement in functioning. This view is supported by the Western literature and, to a lesser degree, by the literature from low-resource settings, which shows that for mental health disorders of a certain severity, manualized evidence-based treatments (e.g., IPT, CBT) work better than non-specific and/or non-directive therapies [12, 30, 31].

Whichever explanation is correct, our conclusion is that this intervention had some limited success in improving functioning in this population, but not in improving mental health symptoms, either because the underlying causes were economic and were not effectively addressed by the program, or because the symptoms represented disorders requiring more specific treatments. This finding is highly relevant because many humanitarian organizations tend to choose non-specific counseling programs, which are generally perceived as one of the few feasible options in settings with underfunded mental health systems and no specialized mental health resources. However, the results from this study, together with a growing literature, indicate that many populations in low-resource settings suffering from specific psychiatric symptomatology may need interventions more specifically designed to treat the presenting symptoms.

An important finding was the differential outcomes by counseling pairs. The pair whose participants experienced significantly less change than the other groups was also rated, based on the supervision reports completed during the intervention, as being the weakest pair in skills of empathy and in exploration and review of changes and challenges among participants. These results reinforce the importance of monitoring performance and addressing weaknesses throughout the intervention process.


Limitations

In our power calculations, we did not take the group clustering into account, and thus we may have underestimated the sample size required to show significant differences between the program conditions. However, an analysis of the intra-class correlation coefficients indicates that the variance due to clustering was minimal, thereby not affecting the comparisons significantly.
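The usual way to gauge how much clustering inflates the required sample size is the design effect, 1 + (m − 1) × ICC, where m is the average cluster size. The Python sketch below is illustrative only: the paper reports that the intra-class correlations were minimal but gives no values, so the ICC of 0.01 is a hypothetical figure, and the cluster size of 9 is simply the midpoint of the eight-to-ten-person counseling groups. The function names are also hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations that n clustered ones are worth."""
    return n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    m = 9        # assumed average group size (groups had 8 to 10 adults)
    icc = 0.01   # hypothetical value; the paper only calls the ICC minimal
    print(f"design effect: {design_effect(m, icc):.2f}")
    print(f"effective n:   {effective_sample_size(333, m, icc):.0f}")
```

With an ICC this small the design effect stays close to 1, which is the sense in which minimal clustering variance leaves the comparisons largely unaffected. Larger ICC values would shrink the effective sample size accordingly.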

Because the villages were relatively small, and to prevent diffusion of the intervention from program participants to controls, we did not randomize at the individual level with intervention and control participants in the same community. This reduced the number of groups, since we had only six villages available. We matched the villages according to known possible confounders (i.e., distance from urban center, village size). Once the villages were paired, ideally we would have randomly assigned one village to intervention and the other to control status. However, given access problems, the villages selected to receive the intervention first were those with easier access during the rainy season. This non-randomized matched village-pair design resulted in comparability on most indicators except for functionality, with the control villages having on average higher baseline functional impairment scores. Despite this apparently successful matching on known factors that could influence the results, this method of selection may still have introduced biases due to unknown factors associated with ease of access.

Although the target population was selected on the basis of having been exposed to torture and trauma, we did not include a specific measure of PTSD because it was not identified as a predominant problem by the local population. Thus, we cannot know whether this intervention had an impact on PTSD-specific symptoms. In addition, we have a limited ability to judge the clinical symptom severity in our study sample. Two factors went into the decision to not utilize a clinical structured interview to calibrate the severity scores. The first was that no standard clinical interview has yet been validated for use with this population, so without significant additional resources and research, we could not implement this process. Second, the region in which we were working had very limited human capacity when it came to mental health professionals; thus, we would have had to import, most likely from a region with a different language, mental health professionals to implement any clinical interview that was adapted and validated.

In the absence of a clinical evaluation, we investigated the severity of our study sample using cutoff scores often used to identify clinically significant problems for the HSCL screener. Mollica et al. [32] used a cutoff of 1.75 to identify the presence of psychiatric illness among an Indochinese population, and Ventevogel et al. [33] identified cutoffs of 2.25 (women) and 1.50 (men) among a primary care sample in Afghanistan. Using these cutoffs, nearly one-quarter of the completer sample (n = 78) had average HSCL scores of 1.75 or greater and nearly 40% (n = 129) had average HSCL scores of 1.5 or greater. These comparisons should be taken as estimates only, given that these cutoff scores come from different populations with different cultures and different trauma and life experiences. The lack of local information on clinical significance therefore limits our conclusions: mental health interventions may have been provided to a sample of individuals without significant mental health disorders, in which case the finding of no effect could be an artifact of there being no significant mental health disorders to address.
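The cutoff comparison above amounts to averaging each respondent's item scores and counting how many averages reach a threshold. The Python sketch below shows that logic. The function names and the item responses are made up for illustration, and the item scoring scale is assumed to be the one on which each cutoff was originally defined. Only the counts 78, 129, and 333 come from the text.

```python
from typing import Sequence


def average_item_score(item_scores: Sequence[float]) -> float:
    """Mean of one respondent's item scores."""
    return sum(item_scores) / len(item_scores)


def share_at_or_above(avg_scores: Sequence[float], cutoff: float) -> float:
    """Fraction of respondents whose average score reaches the cutoff."""
    flagged = sum(1 for score in avg_scores if score >= cutoff)
    return flagged / len(avg_scores)


if __name__ == "__main__":
    # Hypothetical item responses for three respondents.
    respondents = [[1, 2, 2, 1], [2, 2, 3, 2], [1, 1, 2, 1]]
    averages = [average_item_score(r) for r in respondents]
    print(share_at_or_above(averages, 1.75))

    # Shares implied by the counts reported in the text.
    print(f"{78 / 333:.1%}")   # 23.4%
    print(f"{129 / 333:.1%}")  # 38.7%
```

The two shares computed from the reported counts correspond to the "nearly one-quarter" and "nearly 40%" figures in the paragraph above.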

As in previous controlled trials reported by this group [11, 12], we made use of the normal NGO program implementation process in conducting this study. Utilizing this process, limited resources normally require that some eligible persons or groups receive services before others with coverage gradually extending across the target population. By extending assessment beyond the initial group receiving services, we were able to set up a program evaluation in the form of a controlled trial. However, where the NGO program consists of a single program, the study design makes it explicit who receives the intervention and who are the wait controls. This knowledge may have affected the behavior of the interviewers and the study participants in either or both study arms and in a variety of ways. In particular, those in the intervention arm might expect (and therefore be more likely to report) changes, whereas the wait-control group would be less likely to expect (and less likely to report) changes. However, this would have increased the likelihood of positive findings in this case.

Research as capacity building

Outside technical assistance by university-based researchers was used to guide the NGO partners in all stages of the evaluation assessment, from providing the training needed for the local staff to conduct the needs assessment and the instrument development and validation process, to collaborating on the assessment design and evaluation components. Rather than having the technical support team simply conduct the evaluation, time was spent working with the collaborating NGO to ensure their understanding of, and training in, all components of the evaluation. This allowed the NGO staff to gain technical skills and also allowed for local, culturally appropriate input into all phases of the intervention and research design. Based on the results of this assessment, ICMC has since piloted several types of combined economic and mental health programs in the control villages, including impact assessments, in order to begin to learn about the interaction of these two important components of well-being. As some of us have advocated in previous studies [11, 12], we believe that NGOs and other service providers often can and should use a controlled trial format to assess the impact of their interventions, and that program design frequently lends itself to such studies. The need for trials by program implementers is particularly acute for mental health programs, where the evidence base for interventions is small and symptoms are subject to change according to environmental factors.


Acknowledgments

We are very grateful to all the Rehabilitation Action for Torture Victims in Aceh (RATA) staff who provided insight into the problems of the local population, implemented the intervention, and informed its ongoing adaptation, as well as those who contributed to the overall logistics which made possible both the intervention and the associated research. We would also particularly like to thank the United States Agency for International Development (USAID) Indonesia, and the USAID Victims of Torture Fund, who provided the necessary institutional and financial support.

Copyright information

© Springer-Verlag 2011