Mixed Methods Process Evaluation of a Sanitation Behavior Change Intervention in Rural Odisha, India

Process evaluations of public health programs are critical to understand if programs were delivered as intended and to identify improvements for future implementations. Here we present a mixed methods process evaluation of the Sundara Grama intervention, which sought to improve latrine use and safe child feces disposal among latrine-owning households in rural Odisha, India. The Sundara Grama intervention was delivered to 36 villages in Puri district by a grassroots non-governmental organization (NGO) and included eight activities: palla performance, transect walk, community meeting, community wall painting, mother’s meeting, positive deviant household recognition, household visit, and latrine repairs. The process evaluation quantitatively assessed fidelity, dose delivered, and reach, and qualitatively examined recruitment, context, and satisfaction. Quantitative data collection included an activity observation survey, activity record, and endline trial survey. Qualitative data collection included an activity observation debrief and in-depth interviews with NGO mobilizers. For the quantitative data, a ‘delivery score’ was calculated for each activity, as well as the proportion of target participants in attendance. Qualitative data were analyzed using thematic analysis. Mean delivery scores, reported as a percentage, were moderate to high. Household visit activities (97% general visit, 96% positive deviant visit) and the mother’s meeting (81%) had the highest delivery scores, followed by the palla (77%), transect walk (77%), and community meeting (60%). Activities were attended, on average, by 30% to 73% of latrine-owning households. Several factors aided delivery, including pre-intervention rapport building visits and village stakeholder support. Factors that hindered delivery included inclement weather, certain recruitment strategies, and village social dynamics. Overall, the Sundara Grama intervention was implemented as intended and achieved good reach. 
The findings suggest education-entertainment strategies, like the palla, and multi-level communication approaches are particularly beneficial. The results also showcase the importance of examining the implementer experience and broader context.


Background
Sanitation is important for physical, social, and mental health. Specifically, safe sanitation is associated with positive impacts on diarrheal disease, parasitic infections, stunting, cognitive development, and mental and social well-being (WHO, 2018). Between 2000 and 2017, there was a global increase in access to basic sanitation services, from 56% to 74% of the global population. Still, an estimated 673 million people (9% of the global population) continue to practice open defecation, with 348 million living in India (JMP, 2019).
Since the 1980s, the Government of India has instituted national sanitation campaigns to increase access to household latrines, primarily through financial subsidies (WSP, 2010). The latest campaign, known as Swachh Bharat Mission (SBM, "Clean India Mission"), began in 2014 and ended in October 2019 when the Government of India declared the country open defecation free (ODF) (DDWS, 2020). While India has seen a substantial reduction in the proportion of the population practicing open defecation, India's new status as ODF has been questioned (Chatterjee, 2019; Exum et al., 2020; JMP, 2019). Household access to a latrine does not guarantee its use; a significant proportion of latrine-owning households in India report members continue to practice open defecation (Exum et al., 2020; Gupta et al., 2019). Barriers to latrine use extend beyond access or ownership and can include poor latrine construction and design, lack of water availability, fear of pit filling and the need to empty, preference and perceived benefits of open defecation, and gender normative perceptions that latrines are only meant for women (Caruso et al., 2017; Coffey et al., 2017; Coffey et al., 2017a, b; Routray et al., 2015).
We designed and evaluated a theory-driven behavior change intervention called Sundara Grama ("Beautiful Village"), that aimed to increase latrine use and safe disposal of child feces among latrine-owning households in rural Odisha, India. The Sundara Grama intervention resulted in a 6.4% (95% CI 2.0-10.7%) increase in latrine use and a 15.2% (95% CI 7.9-22.5%) increase in safe child feces disposal (Caruso et al., 2022). The purpose of the process evaluation described herein was to determine if the Sundara Grama intervention was delivered as intended and reached its intended population, understand village members' and implementers' perceptions of the intervention, and assess the financial feasibility of the intervention.
We examined Sundara Grama delivery by applying the Saunders et al. (2005) process evaluation framework, which assesses six key process indicators: fidelity, dose delivered, reach, recruitment, context, and satisfaction. We applied this particular framework since it was developed specifically for health behavior change programs that are stand-alone, theory-based programs. Given that Sundara Grama had an impact on behavior, the process evaluation findings will enable understanding of what facets of the intervention should be replicated, adapted, or omitted in future iterations, as well as what contextual factors influenced delivery. An examination of contextual factors, meaning aspects of the physical, social, and political environment in which the intervention took place, was particularly important for understanding how the Sundara Grama intervention may need to be modified if implemented at scale or in other settings. Further, this work aligns with a call within the water, sanitation, and hygiene (WASH) sector to apply implementation science to assess the intervention delivery processes and contexts that enable intervention effectiveness (Haque & Freeman, 2021). Specific lessons from this process evaluation can inform the delivery of other community-wide behavior change interventions, especially those focused on sanitation in India.
The goals of this paper are thus two-fold: to describe the mixed methods approach used to assess delivery of the Sundara Grama intervention and to report the process evaluation results and lessons learned.

Approach
We used a convergent parallel design for this mixed methods process evaluation (Creswell & Plano Clark, 2011). In this design, quantitative and qualitative data collection are conducted concurrently but separately. Both sets of results are then combined to answer the main research question. Our main research question assessed how the Sundara Grama intervention was delivered. We then applied the Saunders et al. (2005) framework for assessing health promotion programs to break down the main research question into six key process indicators of intervention delivery (Supplemental Table 1). In our convergent mixed methods design, we quantitatively assessed fidelity, dose delivered, and reach, and qualitatively examined recruitment, context, and satisfaction. Our process evaluation thus aimed to answer the following sub-questions:
1. Was the intervention implemented as planned? (fidelity, dose delivered)
2. Who was reached and how were participants recruited? (reach, recruitment)
3. What factors, such as weather and social dynamics, impacted delivery? (context)
4. What did participants think of the intervention? (participant satisfaction)
5. What were the experiences of implementers in delivering the intervention activities? (implementer satisfaction)
We also conducted a cost analysis of intervention delivery as this has been noted to help make trial findings more actionable (Haque & Freeman, 2021). We carried out a cost analysis to examine if the intervention was financially feasible at scale and relevant for future sanitation policies, such as the Government of India's second phase of the SBM campaign.

Setting and Sample
The Sundara Grama intervention was delivered to 36 villages in Puri district, Odisha state between January 2018 and February 2019. Puri is approximately 70% rural, and government sanitation campaigns have been implemented in the area for many years (Boisson et al., 2014; Routray et al., 2017). Among the 36 villages that received the Sundara Grama intervention, 33 villages were engaged in the cluster randomized-controlled trial (CRT) and 3 villages were engaged only in qualitative research as part of a sub-study that assessed village member perceptions of the intervention and possible spillover. Findings from this qualitative sub-study are reported in  and information on the trial design, setting, and randomization procedures is reported in Caruso et al. (2019). This process evaluation leverages data from all 36 villages that received the intervention and 19 interviews with members of the intervention delivery team.

Sundara Grama Sanitation Intervention
The Sundara Grama intervention included a multi-level communication approach with activities delivered at the community, group, and household levels, reiterating the motto "Moro Swacha, Sustha, Sundara Grama" (My Clean, Healthy, Beautiful Village). Each activity was designed to target specific behavioral factors identified through formative research to influence latrine use and/or safe child feces disposal. Community-level activities included an adapted palla theater performance with sanitation skits (this is a traditional entertainment art form of Odisha that includes skits, songs, and poetry with witty elements ("Palla: The show must go on," 2014)); an early morning transect walk to re-evaluate the village's state of open defecation; a community meeting to discuss sanitation problems, create an action plan to address those problems, and identify positive deviant households (households where all members used a latrine all the time); and a community wall painting that showed both the decided-upon action plan, and a map of the village that indicated which households were positive deviants. The group-level activity was a mother's meeting with caregivers of children < 5 years old to provide action knowledge and hardware (potties and scoops) to aid safe child feces disposal. Household-level activities included either provision of a celebratory poster to positive deviant households or household visits with non-users to encourage commitment toward all members using the latrine. Based on observation of latrine conditions during the baseline trial survey, a subset of households were identified to receive a comprehensive assessment of their latrine's condition. Those in need of minor repairs (e.g., missing door, broken slab) were subsequently selected to receive repairs to ensure latrine functionality and privacy.
Rural Welfare Institute (RWI), a local grassroots NGO, was the implementer. RWI engaged four teams of five (one supervisor and four community mobilizers) to lead community meetings, transect walks, mother's meetings, and household visits (June-July 2018). The palla performances (June-July 2018), community wall paintings (completed after the rainy season in September-October 2018), and latrine repairs (completed between November 2018 and January 2019) were carried out by different local artisan groups. See Table 1 for a detailed description of intervention activities and Fig. 1 for a timeline.

Data Collection
Data collection took place between June 2018 and February 2019 during three time periods: during intervention delivery, immediately after, and four to six months post-implementation (Fig. 1). The Emory University research and enumerator teams primarily collected the data as the evaluators; however, RWI staff conducted one component of the data collection. The quantitative and qualitative data collection tools and approaches are described in Table 2.

Quantitative Data Collection
Quantitative data collection included an activity observation survey, activity record, and endline trial survey (i.e., follow-up survey). Activity observation surveys and endline trial surveys were conducted by a team of Odia-speaking Emory enumerators who were engaged in the CRT, underwent a multi-day training, and pilot tested the tools. Activity records were filled out by RWI mobilizers who led activities and were trained on the tool by author PR.

Table 1 notes. Intervention activities are listed in the order they were implemented. (2) Primary caregivers who did not have a latrine in their household were provided information on how to safely bury their child's feces.

Activity Observation Survey
An Emory enumerator completed an activity observation survey during each palla, transect walk, community meeting, and mother's meeting. This structured, checklist-style survey mirrored the activity guides used by RWI mobilizers and palla troupes. To assess fidelity and dose delivered, the survey included questions to confirm if intended components were conducted. Example fidelity items included completion of preparation steps, stakeholders in attendance, use of program materials, and components delivered in correct order. Dose delivered items included completion of each activity step and key messages delivered. The survey also included a question about issues that may have hindered the activity and two Likert questions to capture enumerator perception of the activity quality and level of participant engagement. To assess village population reach, the enumerator recorded the number of village members in attendance by age group (adult vs. under 18 years old) and sex. Attendance was taken at a specific time point during each activity and enumerators used a tally counter device to aid their counting. Surveys were completed using paper and pen to enable the enumerators to easily move through the survey. Responses were later entered into a digital version of the survey using ODK Collect (available from https://opendatakit.org/) on an Android phone.

Activity Record
RWI mobilizers filled out an activity record to confirm they completed each positive deviant recognition and household visit activity. To assess fidelity and dose delivered, the activity record confirmed whether or not key activity steps took place and if a banner or poster was given. To assess reach, the activity record documented the number of household members in attendance. RWI mobilizers also filled out an activity record for each mother's meeting, which mostly acted as an attendance sheet documenting how many caregivers were in attendance, if their child(ren) also attended, and if the caregiver was given a potty and scoop. RWI submitted all activity records to the Emory research team, and the records were double-entered into an Excel database for analysis. Emory enumerators did not complete these activity records as their involvement in household visits and attendance-taking during the mother's meetings might have disrupted those activities or caused the enumerators to be conflated with the RWI implementing team.

Endline Trial Survey
During the endline trial survey, the Emory enumerator team asked questions about the activities to respondents from latrine-owning households in the intervention villages. To assess dose delivered, respondents were asked if they received certain program materials (i.e., potty, scoop, poster). Among households that were selected to receive latrine repairs, dose delivered was assessed by asking the respondents to confirm whether or not the repairs took place. To assess household-level reach, respondents were asked which activities the household had attended, including whether or not they had seen the community wall painting. All trial households, regardless of latrine-owning status, were censused during endline data collection to determine village size and latrine coverage, and thus inform reach calculations (described in detail below). Accordingly, reach measures could not be calculated for activities conducted in the three villages engaged only in qualitative research.

Fig. 1 Timeline of process evaluation data collection (above arrow) and Sundara Grama implementation (below arrow)

Table 2 Overview of quantitative and qualitative data collection tools
* Most intervention activities were completed by July 2018 and as such, time of data collection is reported from this timepoint with implementer IDIs 1 to 4 weeks later and endline data collection 4 to 6 months later.

Qualitative Data Collection
Qualitative data collection included activity observation debriefs completed by Emory enumerators and implementer in-depth interviews (IDIs) conducted by author SU.

Activity Observation Debrief
An activity observation debrief was completed by an Emory enumerator after each palla, transect walk, community meeting, and mother's meeting. To capture contextual factors and participant satisfaction, the debrief form included open-ended sections on factors that hindered or aided delivery and how village members reacted to and participated in the activity. The debrief forms were completed using paper and pen and were subsequently transcribed and translated into English by the field supervisors and organized by topic.

Implementer Interviews
Implementer IDIs were conducted immediately post-implementation to explore mobilizers' perceptions of and satisfaction with recruitment and delivery. Interview topics included successes and challenges with recruitment and delivery approaches, perceptions of participant satisfaction, and recommended changes to the activities. Mobilizers were given the opportunity to share their reflections on all intervention activities. However, author SU explored only one of the intervention activities in depth in each interview to minimize the time burden on participants, with three to four interviews per activity. Author SU aimed to interview all 20 RWI mobilizers but, because of schedule limitations, interviewed only 19. Interviews were conducted in Odia and audio recorded. Interviews were not fully transcribed due to funding and time constraints. Instead, SU re-listened to recordings and recorded detailed summaries of the responses by topic in English. When a quote was particularly meaningful, SU translated and transcribed the specific quote as close to verbatim as possible.

Delivery Score
We calculated a 'delivery score' to assess fidelity and dose delivered based on relevant indicators from the activity observation survey or activity record. The maximum possible score was based on the number of indicators assessed for that specific activity, with 1 or 2 points possible for each indicator. A maximum delivery score meant the activity was delivered as intended (fidelity) and in its entirety (dose delivered). Common fidelity indicators across activities included: attendance by a key stakeholder, adequate length of activity, pre-activity preparations completed, and components delivered in correct order. Dose delivered indicators included completion of each activity component. For example, the palla performance included an introduction, six sanitation skits, specific messages on latrine use and open defecation, and a closing. Each of these components was assessed based on one to several questions in the activity observation survey and could receive 0, 0.5, or 1 point depending on how completely the component was delivered. For the positive deviant recognition and household visit, the delivery score was calculated based on relevant indicators from the activity record. Delivery scores were converted to percentages and calculated for each intervention village. An average delivery score for a given activity was also calculated for each of the four RWI implementing teams, by averaging scores across their assigned villages. A delivery score < 50% was interpreted as low fidelity (less than the majority of the activity happened as planned), scores between 50% and 80% indicated moderate fidelity (the majority or most of the activity happened as planned, but not all), and scores > 80% indicated high fidelity (almost all of the activity happened as planned). Supplemental Tables 2-5 outline the scoring criteria for each activity.
Fidelity and dose delivered were assessed for the community wall painting by author PR, who reviewed photos of each painting and confirmed all components were present. Lastly, for the latrine repairs, dose delivered was assessed based on household confirmation in the endline trial survey that repairs were completed.
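The scoring logic described above can be illustrated with a minimal Python sketch. It is not the authors' scoring tool: the indicator names and point values below are hypothetical, and the actual criteria are those in Supplemental Tables 2-5. The sketch sums the points awarded, expresses them as a percentage of the maximum possible points, and applies the stated interpretation bands (< 50% low, 50% to 80% moderate, > 80% high).

```python
def delivery_score(points_awarded, points_possible):
    """Return the delivery score as a percentage of the maximum possible points."""
    total_possible = sum(points_possible.values())
    if total_possible <= 0:
        raise ValueError("At least one indicator with positive points is required.")
    # Indicators missing from points_awarded are treated as not delivered (0 points).
    total_awarded = sum(points_awarded.get(name, 0.0) for name in points_possible)
    return 100.0 * total_awarded / total_possible


def fidelity_band(score_pct):
    """Map a percentage delivery score to the interpretation bands in the text."""
    if score_pct < 50:
        return "low"
    if score_pct <= 80:
        return "moderate"
    return "high"


# Hypothetical indicators for a single activity (illustrative values only).
possible = {"stakeholder_attended": 1, "adequate_length": 1,
            "introduction": 1, "skits": 1, "closing": 1}
awarded = {"stakeholder_attended": 1, "adequate_length": 0.5,
           "introduction": 1, "skits": 1, "closing": 0}

score = delivery_score(awarded, possible)
print(f"{score:.0f}% ({fidelity_band(score)} fidelity)")
```

Treating a missing indicator as 0 points is an assumption of this sketch; how unobserved components were handled in practice is governed by the supplemental scoring criteria.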

Reach
We determined village population reach and latrine-owning household reach. Village population reach was only calculated for activities implemented at the community-level and was determined by dividing the number of people in attendance at the activity (recorded in the activity observation survey) by the village population (determined from the trial endline survey data).
Latrine-owning household reach was calculated for all activities, except latrine repairs, by dividing the number of households that reported attending each activity by the total number of latrine-owning households in the village (both determined from the trial endline survey data). For reach at the mother's meeting, the denominator was total number of latrine-owning households with a child less than 6 years old in the village. For the positive deviant recognition and household visit activities, a combined reach was calculated since latrine-owning households were meant to receive one or the other of these two activities.
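As a rough illustration of the two reach measures, the Python sketch below divides attendance by the relevant denominator for a single hypothetical village. All counts are invented for illustration and are not study data; the function name `reach_pct` is likewise an assumption of this sketch.

```python
def reach_pct(number_reached, number_eligible):
    """Return reach as a percentage of the eligible population or households."""
    if number_eligible <= 0:
        raise ValueError("The denominator must be a positive count.")
    return 100.0 * number_reached / number_eligible


# Hypothetical counts for one village (not study data).
village_population = 800            # from the endline census
palla_attendance = 200              # tally from the activity observation survey
latrine_owning_households = 120     # from the endline survey
households_reporting_palla = 84     # households that reported attending the palla
eligible_mothers_households = 40    # latrine-owning households with a child < 6 years old
households_at_mothers_meeting = 18

# Village population reach (community-level activities only).
village_reach = reach_pct(palla_attendance, village_population)

# Latrine-owning household reach.
household_reach = reach_pct(households_reporting_palla, latrine_owning_households)

# Mother's meeting reach uses the narrower denominator described in the text.
mothers_meeting_reach = reach_pct(households_at_mothers_meeting, eligible_mothers_households)

print(village_reach, household_reach, mothers_meeting_reach)
```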

One-way ANOVA
To examine consistency of program delivery and reach across the four RWI implementing teams, the delivery score and reach means of each respective team were compared using one-way analysis of variance (ANOVA) in IBM SPSS 26 statistics software.
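The analysis itself was run in SPSS. For readers unfamiliar with the test, the following Python sketch shows how a one-way ANOVA F statistic is formed from between-team and within-team variability. The team scores are hypothetical, and the sketch returns only the F statistic and its degrees of freedom; obtaining a p-value requires the F distribution, which a statistics library (for example, `scipy.stats.f_oneway` in SciPy) would provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("Need at least two non-empty groups and more observations than groups.")

    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Between-group sum of squares: group size times squared deviation of each group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("No within-group variability; F is undefined.")

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical delivery scores (%) for villages assigned to four teams.
team_scores = [
    [77, 81, 70],
    [75, 79, 83],
    [68, 74, 80],
    [72, 78, 85],
]
f_stat, df_b, df_w = one_way_anova_f(team_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

This classic F test assumes roughly equal variances across teams; where that assumption fails, an alternative such as Welch's ANOVA is typically used instead.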

Qualitative Data Analysis
The activity observation debriefs and implementer IDI responses were analyzed to uncover themes related to recruitment, satisfaction, and context. The transcribed debrief notes and IDI responses were organized by these key topic areas in Microsoft Excel. A modified thematic analysis approach was then used whereby "focused memoing" was done in lieu of a formal "focused coding" step. In this approach, author GDS read through the observation debriefs several times and created memos on emerging themes, focusing on the positives and negatives of a specific process evaluation component in each read-through; for example, one read-through identified emergent themes around what went well with recruitment (positives) and what did not go well (negatives). GDS then read through the data again and synthesized the predominant themes based on common memos. This process was repeated for the implementer IDI responses, with an additional examination of satisfaction themes from the implementer perspective. Findings from both analyses were compared to identify common themes, as well as themes unique to either data source (enumerator or RWI mobilizer). With regard to reflexivity, GDS took part in the intervention design process and assisted with training the RWI mobilizers but was not present during intervention delivery. This meant her analytical interpretations were less influenced by personal observations. Nonetheless, GDS remained conscious during analysis of her different sociocultural background and ensured emergent themes were supported by multiple observations/interviews.

Cost Analysis
The Sundara Grama intervention was designed to cost an average of 20 US dollars (USD) or less per target household, a funder requirement to ensure the intervention was policy-relevant and financially feasible at scale. We report the total cost, in USD, of implementing Sundara Grama across the 33 trial intervention villages. The research team documented expenses related to intervention inputs and latrine repairs, and RWI provided their human resource costs to the research team. Training and overhead expenses are not included in the total cost.
Cost per latrine-owning household reached was calculated by dividing the total delivery cost by the number of latrine-owning households that reported attending at least one of the activities in the endline trial survey.
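The cost calculation amounts to a single division, sketched below in Python with hypothetical figures. The total cost and household count shown are placeholders, not the study's results.

```python
def cost_per_household(total_cost_usd, households):
    """Return the average cost in USD per household."""
    if households <= 0:
        raise ValueError("Number of households must be positive.")
    return total_cost_usd / households


# Hypothetical figures (not study results).
total_delivery_cost_usd = 30000.0   # inputs, latrine repairs, and human resources
households_reached = 2000           # reported attending at least one activity

cost_reached = cost_per_household(total_delivery_cost_usd, households_reached)
print(f"{cost_reached:.2f} USD per latrine-owning household reached")
```

Note that the 20 USD design target was defined per target household, whereas this measure uses households reached as the denominator, so the two figures are not directly interchangeable.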

Ethics
The Institutional Review Board at Emory University

Was the Intervention Implemented as Planned? (Fidelity, Dose Delivered)
Average delivery scores were high for household visit activities (general visit: M = 97%; range 87-100%) (positive deviant visit: M = 96%; range 78-100%) and mother's meeting (M = 81%; range 56-100%), indicating the activities were often conducted as intended (Table 3). Both the palla performance and transect walk had a lower average score of 77% (palla range 52-100%; transect walk range 50-91%), while the community meeting had the lowest average score at 60% (range 40-83%). The activity observation survey data for these three activities showed steps were often followed only 'somewhat' in the correct order. For the community meeting, none of the meetings were sex-segregated as intended, few were conducted in a private place, most did not have participants introduce themselves, and three activity steps were rarely completed in full (recognize sanitation challenges, identify positive deviants, and discuss becoming a model village). There was no statistically significant difference in average delivery score between the four implementing teams for any of the activities.
Enumerator-reported quality of the activities, rated on a 4-point Likert scale, aligned with the delivery scores, with 81% of mother's meetings, 78% of palla performances, 69% of transect walks, and 64% of community meetings being rated good or very good. Only two transect walks and one mother's meeting were rated poor.
All 36 community wall paintings included the required components. Lastly, among the 358 households selected to receive latrine repairs and surveyed at endline, 75% reported receiving the repairs.

Reach
Among latrine-owning households in the intervention villages, 93.1% (N = 1956) reported attending at least one activity. Average reach among latrine-owning households was highest for household activities (M = 73%; range 44-90%) and palla performance (M = 69%; range 39-87%), moderate for mother's meeting (M = 47%; range 17-81%) and community meeting (M = 43%; range 27-67%), and lowest for the transect walk (M = 30%; range 16-51%) (Table 3). For the community wall painting, on average, only 36% of latrine-owning households reported having seen the map and only 14% reported being able to identify their own household on the map. For latrine-owning household reach, there was a statistically significant difference between the four implementing teams for the household activities (F = 3.48, p = 0.03).
For community-level activities, palla performances reached, on average, almost a quarter (M = 24%; range 6-50%) of the village population, while far fewer were reached on average for the community meetings (M = 8%; range 3-17%) and transect walks (M = 5%; range 1-14%). For village population reach, there was a statistically significant difference between the four implementing teams for the transect walk (F = 3.08, p = 0.04).
The gender and age of those in attendance varied depending on activity. On average, about half of the audience members at the palla performances were women (M = 49% women; range 27-77%); slightly fewer women on average attended the transect walk (M = 45% women; range 0-77%), with no women in attendance in two villages; and more women on average attended the community meeting (M = 57% women; range 7-87%). The community meeting was also attended by more adults on average (M = 81% adults; range 64-100%) compared to the palla performance (M = 59% adults; range 31-80%) and transect walk (M = 57% adults; range 18-100%). Almost a third of transect walks (N = 10) had a majority of boys and girls < 18 years old in attendance. For the mother's meeting, 20% of all participants did not have a household latrine and 41% brought their child.

Table 3 Average delivery score and reach for each intervention activity by RWI mobilizer team
* Levene's statistic was not significant for any of the delivery scores or reach calculations except for village population reach of the community meeting. In this case, Welch's ANOVA was conducted instead. Degrees of freedom for delivery score F (3, 32); degrees of freedom for reach F (3, 29); degrees of freedom for Welch's F (3, 15.08)
± Missing data: In two villages, no delivery score was calculated for the transect walk activity because enumerators arrived late and could not observe the full activity. The village population reach could also not be calculated in one of these villages because the enumerator arrived too late for the attendance count. In another two villages a delivery score was not calculated for the positive deviant recognition activity because there were no positive deviant households identified
+ 'Household Activities' refers to either the positive deviant recognition activity and/or the household visit activity
** For the mother's meeting, this is specifically latrine-owning households with a child < 6 years old reached

Recruitment
Several factors were identified that aided activity recruitment, while others hindered recruitment (Table 4). RWI mobilizers described the "pre-intervention visits," which were designed to build rapport with village stakeholders and plan activity logistics, as a very successful strategy that later aided recruitment of village members to intervention activities. Additionally, mobilizers sometimes received recruitment assistance from village members and stakeholders who would help go door-to-door to invite village members to the activities. Specifically, Anganwadi workers (teachers for government-run preschool centers) often helped with mother's meeting recruitment.
In contrast, the recruitment strategy used during the transect walk, where mobilizers went around beating a bell early in the morning, sometimes led to confusion and irritation. Some thought the bell was signaling a call to prayer or that someone had died, while others objected to hearing the bell so early in the morning or felt it disturbed their morning routine. RWI mobilizers also explained it was sometimes difficult to convince people to attend activities, especially for the community and mother's meetings, since an incentive was often expected and not provided. As some community members would tell them, "If there is no eating, there is no meeting."

What Factors Impacted Delivery? (Context)
Delivery of the Sundara Grama activities was impacted by three contextual factors: stakeholder support, inclement weather, and social dynamics (Table 4).
Village stakeholders positively impacted delivery by providing the RWI mobilizers with additional assistance. According to enumerator observation debriefs, village stakeholders participated in the activities, helped prepare activity locations in advance, and even managed tensions or conflicts that arose during the community meetings, specifically around government latrine subsidies and construction quality. Based on the activity observation survey data, at least one stakeholder provided support in 92% of all activities observed. The most common stakeholders providing support included Ward members (45%), Anganwadi workers (43%), village heads (36%), and ASHA community workers (Accredited Social Health Activist) (30%).
Inclement weather negatively impacted delivery. Rain, and in a few cases severe heat, led to activities starting late, fewer participants being in attendance, participants leaving early, and the need to shift activity locations to seek better shelter. Activity observation survey data showed weather was an issue in 26% of all activities observed.
Social dynamics related to caste, gender, and age also hindered aspects of delivery and reach. According to both implementer IDIs and enumerator observation debriefs, caste divisions affected activities in three villages. In one village, caste divisions compelled RWI mobilizers to organize two separate palla performances and also led to issues organizing the community meeting. In a second village, one caste group was not able to attend the palla because it was held near the village temple from which they were prohibited. In a third village, one caste group refused to attend the transect walk in the presence of another caste group.
In interviews, RWI mobilizers described how younger mothers were sometimes not able to attend the mother's meeting and that older women from their households would attend in their place. The mother's meeting activity record data confirmed this observation; 48% of participants across all the meetings were 40 years old or older with 36% older than 45 years, indicating these participants were likely not the mother of the child < 6 years old but potentially the grandmother.

What Did Participants Think of the Intervention? (Satisfaction)
According to both implementer IDIs and enumerator observation debriefs, the palla performances were positively received by village members, but the transect walk and meetings experienced some negative reactions (Table 4).
Village members enjoyed the palla performances: they laughed at the jokes, "listened mindfully," praised the performance for bringing awareness to their village, and commented on how it was both educational and entertaining.
The transect walk elicited mixed reactions. It was well received by children in particular and participants greatly enjoyed the handwashing demonstration at the end of the walk. However, village members often had negative reactions to visiting open defecation (OD) sites in the village and marking feces with colored powder, the main component of the activity. Many village members refused to visit the OD sites, while others expressed anger, irritation, disgust, and shame toward the act. In some cases, the RWI mobilizers were scolded for leading such an activity. Despite negative reactions, RWI mobilizers explained in interviews that a few participants felt the transect walk would positively impact their village.
Both the community and mother's meetings experienced frequent upsets. Many community meetings were disrupted by participants voicing their frustration at the poor quality of their government-provided latrine or not having received their latrine subsidy; activity observation survey data showed poor latrine construction came up in 75% (N = 27) of the meetings and latrine subsidies came up in 53% (N = 19). In one meeting, participants attempted a mass exodus over these issues. Poor latrine construction was also mentioned by participants in 33% of the transect walks (N = 12).
In several mother's meetings, the distribution of potties and scoops caused upsets. RWI mobilizers were trained to provide participants with a potty and scoop at the end of the meeting once all information was covered. Some caregivers who left the meeting early or had not attended at all became upset over not receiving the hardware. Sometimes their husbands came and demanded the hardware.

Table 4
Prior engagement and rapport building +: RWI mobilizers were trained to visit their assigned villages in advance to build rapport with stakeholders and work with them to plan activity logistics. This was noted by several RWI mobilizers as a very successful strategy that aided recruitment.
Stakeholder support during activity*: Village stakeholders were sometimes observed to provide support during activities by preparing the location, participating themselves and, during some community meetings, helping calm upset participants.
Village support with recruitment: Both village members and stakeholders sometimes provided mobilizing support by going door-to-door to invite people. For the mother's group meeting in particular, Anganwadi workers often helped recruit caregivers by calling them on the phone or visiting their home.
Inclement weather: Many activities were disrupted by rain and, in a few cases, severe heat. This led to activities starting late, fewer participants in attendance, participants leaving early, and the need to shift activity locations to seek better shelter.
Challenges with recruitment (lack of incentive) +: RWI mobilizers expressed challenges with convincing people to attend the activities, especially for the meetings. They explained this was in part due to the lack of incentive being offered; as some community members would tell them, "if there is no eating, there is no meeting."
Caste divisions: In a few villages, caste issues arose. These issues either prevented village members of a given caste from attending the activity or required separate activities to be conducted, one for each caste group.
Negative reaction to bell recruitment: For the transect walk, RWI mobilizers rang a bell to alert participants about the activity. Some village members found this irritating or confusing.
Gender and social norms +: In both the palla and community meeting, RWI mobilizers explained how women and men sometimes had to sit in separate areas and younger, unmarried women were typically not allowed to attend activities. In the mother's group meeting, sometimes younger mothers were not able to attend when an older woman from their household was in attendance.

Palla
Enjoyment: Village members were often observed enjoying the palla, especially the jokes and some skits in particular, and listening "mindfully."
Successful edutainment: Village members sometimes praised the performance for bringing awareness, and perceived it as both entertainment and educational.

Transect Walk
Enjoyed handwashing: Village members enjoyed the handwashing demonstration and it often drew the most participant engagement.
Engagement of children: In several villages, children were noted as enjoying the activity and being the most active, sometimes only, participants during the walk.
Refusal to participate: During many of the walks, participants refused to visit the OD sites or protested the walk altogether.
Negative reaction: Participants expressed anger, disgust, shame, and irritation when sprinkling powder on feces and going to OD sites.

Community Meeting
Latrine upsets: Many of the meetings were interrupted by participants who were upset over not receiving their latrine subsidy yet or the poor quality of their government-provided latrine.

Mother's Group Meeting
Distracted: Caregivers were sometimes observed to be distracted, not listening or actively participating, and seeming hurried to return home.
Hardware distribution upsets: There were often upsets from individuals who did not receive the potty and scoop. In most cases, these individuals were caregivers who had left the meeting early or did not attend at all.

What Was the Experience of Implementers in Delivering the Intervention Activities?
In interviews, RWI mobilizers provided feedback on aspects of intervention delivery that were successful, such as the pre-intervention visits, palla performances, and household visits, and aspects that were challenging, such as traveling to their assigned villages and being misconstrued as government officials.
Mobilizers explained that the pre-intervention visits were critical to building rapport with village stakeholders from the start and that the palla performance was an ideal introductory activity since it was well received and helped mobilizers continue to build a positive relationship. Mobilizers also viewed the household visits as especially effective as they could directly engage with participants and reach members who were not able to attend the other activities, such as newly married and younger women.
"It [palla performance] was perceived as a form of both education and entertainment among people. Organizing palla as the first activity was an advantage to the whole program as it helped us build a rapport with villagers. After the palla the villagers were waiting for us to do the other activities." -RWI mobilizer
The biggest challenges mobilizers faced were with travel and the mistaken belief among community members that they were government officials. Mobilizers lived far from their assigned villages and some villages were not easily reached by public transportation, on which female staff in particular relied as they did not own a motorbike like male staff.
"Sometime the assigned villages are too far from our homes so we have to travel long distances to reach the villages and it also takes a lot of time. Rains and bad weather usually made this worse." -RWI mobilizer
Many mobilizers also described how they were repeatedly misidentified as government officials, with community members believing they had come to cancel ration cards for those who were not using their government latrine. This mistaken belief and the issues related to latrine construction and subsidies caused many mobilizers to experience verbal attacks during activities, which were sometimes difficult to manage.
"Before conducting the palla, community members had a lot of negative comments. They thought we were government workers and we were there without any intention of actually doing something, and that we had taken money from the government and that we weren't actually going to do anything beneficial for the villages. But their ideas changed after the palla." -RWI mobilizer
Mobilizers offered several recommended changes to the intervention: less repetitive activity messages, less prescriptive activity guides to allow for flexibility in how messages are conveyed, snacks or other incentives at the meetings since they are expected and could make recruitment easier, later starts to activities (not in the early morning) to make logistics easier, and more time for household visits.
Finally, female mobilizers often had challenging experiences when delivering the intervention. These women, many of whom were young and in their first job, reported being catcalled and shamed by community members as their presence defied social norms restricting the mobility of young women. One woman's father began accompanying her to the villages because he was worried for her safety. Another mobilizer was scolded by her parents for leaving the house so early for work, as it was not socially appropriate.
"Getting up early and leaving the house early is also a challenge. Our neighbors think bad about us. They say, 'Are you not ashamed? You are such a young girl where do you go so early in the morning?'" -RWI mobilizer
However, these same mobilizers explained that they gained the respect of community members over time and that they had become more confident by the end, no longer shy in front of others and more comfortable with public speaking.
"I was not very confident about my public speaking skills. But now I am very confident when I speak to people in the houses. I feel like a different, more confident person after the program. Even my family members have noticed that." -RWI mobilizer

Cost Analysis
Delivery of the Sundara Grama intervention in 33 villages cost a total of 36,172 USD, with an average cost of 1,096 USD per village (Table 5). Payments to the palla troupes, wall painting artisans, and latrine repair contractors, including cost of materials, accounted for 43.6% of the total delivery cost (average of 477.97 USD per village); RWI staff salaries and transportation stipends accounted for 43.5% (average of 476.58 USD per village); and consumables, such as banners, posters, potties, scoops, and other activity materials, accounted for 12.9% (average of 141.58 USD per village). Based on the endline trial survey, 1,956 latrine-owning households reported having attended at least one of the activities, making the cost per latrine-owning household reached 18.49 USD.
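The arithmetic behind these figures can be retraced with a short sketch. The Python below is illustrative only and is not the study's analysis code: the function and variable names are assumptions introduced here, and the inputs are simply the totals and per-village averages reported in the paragraph above.

```python
# Illustrative check of the reported cost figures; not the authors' analysis code.
# All names below are hypothetical; the numbers come from the cost analysis text.

TOTAL_COST_USD = 36_172      # total delivery cost
VILLAGES_COSTED = 33         # villages included in the cost total
HOUSEHOLDS_REACHED = 1_956   # latrine-owning households attending at least one activity

# Reported average cost per village, by category (USD).
PER_VILLAGE_BY_CATEGORY = {
    "troupes, artisans, and repair contractors": 477.97,
    "staff salaries and transportation stipends": 476.58,
    "consumables": 141.58,
}


def summarize_costs(total, villages, households, per_village_by_category):
    """Return per-village cost, per-household cost, and category shares (%)."""
    per_village = total / villages
    per_household = total / households
    # Share of each category relative to the overall per-village average.
    shares_pct = {
        name: 100 * amount / per_village
        for name, amount in per_village_by_category.items()
    }
    return per_village, per_household, shares_pct


per_village, per_household, shares_pct = summarize_costs(
    TOTAL_COST_USD, VILLAGES_COSTED, HOUSEHOLDS_REACHED, PER_VILLAGE_BY_CATEGORY
)

print(f"Average cost per village: {per_village:.2f} USD")
print(f"Cost per household reached: {per_household:.2f} USD")
for name, pct in shares_pct.items():
    print(f"{name}: {pct:.1f}% of total")
```

Under these inputs the sketch agrees with the reported values up to rounding: roughly 1,096 USD per village, 18.49 USD per household reached, and category shares of 43.6%, 43.5%, and 12.9%.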

Discussion
We conducted a mixed methods process evaluation of the Sundara Grama behavior change intervention that sought to improve latrine use and safe child feces disposal in 36 villages in rural Odisha, India. The intervention activities reached a substantial portion of the target population at a cost of 18.49 USD per latrine-owning household reached. Activities were implemented with moderate to high fidelity, except for the community meeting, which often had several components missed, and were delivered consistently across the four mobilizer teams. Both participants and mobilizers praised the palla performance, but had mixed reactions to other activities. Pre-intervention rapport building visits and village stakeholder support aided delivery, while inclement weather, certain recruitment strategies, and social dynamics hindered delivery. This process evaluation provides insights into what did and did not contribute to intervention success, and highlights the need for communitywide programs to identify and assess strategies that consider the social and political context, and the impacts of past programming.
Two specific components of the Sundara Grama intervention were critical and provide insights for other behavior change programs: 'edutainment' and multi-level activity delivery. Public health programs often use education-entertainment, or 'edutainment,' strategies to transfer knowledge and skills. However, a recent review of the literature on broadcast media interventions describes how these edutainment approaches can go beyond education alone and deliver messages that shift norms and attitudes to catalyze health behavior change (Grady et al., 2021). Theater performances have also been used in this way successfully. An evaluation of an exclusive breastfeeding campaign in rural Zimbabwe found exposure to an edutainment road show led to changes in social norms, beliefs, and attitudes among men and helped reduce the gender knowledge gap on this important childcare practice (Jenkins et al., 2012). The palla performance adds to this body of research; each skit was embedded with a variety of sanitation behavioral messages that touched upon motivations and social norms, as well as action knowledge and the health risks of open defecation. Our results show this kind of multifaceted folk theater performance can be delivered with quality, reach a large audience, and be positively received. Moreover, there are many benefits to using traditional entertainment art forms, like the palla, compared to mass media: audience members experience the messaging as a collective, which may bolster its acceptance; it is often better suited for hard-to-reach communities; and it can help revitalize a traditional art form ("palla: The show must go on," 2014).
We also found our multi-level approach, with activities at the community, group, and household levels, ensured all types of village members (men, women, and children) were reached. This may explain the trial results, which reported modest increases in latrine use across both sexes and different age groups (Caruso et al., 2022). The variety of activities provided multiple opportunities to communicate and reiterate behavioral messages across populations. This may be one reason why the trial results found a significant increase in safe child feces disposal despite mostly older women, likely grandmothers, attending the mother's group meeting: mothers were still receiving safe disposal messaging through other activities like the palla and household visit (Caruso et al., 2022). Other water, sanitation and hygiene (WASH) programs that seek to improve the WASH behaviors of all types of community members should consider this kind of multi-level communication approach.
We also identified aspects of Sundara Grama that did not work well. The community meeting had the lowest delivery score, likely because it requires more skillful facilitation and participatory engagement, and may need more intensive implementer training for full delivery. The wall painting had the lowest reach and thus could be omitted from any future delivery, given that impact was achieved without it being widely noticed.
Social dynamics influenced the implementation of Sundara Grama, offering lessons learned for future communitywide programs and emphasizing the importance of assessing social dynamics when evaluating delivery. In at least three intervention villages, casteism negatively impacted the ability of all village members to attend and engage in the palla performance and/or community meeting, a finding expanded upon in a separate qualitative paper on community perceptions of Sundara Grama, which discusses how social divisions hindered intervention delivery. Similarly, caste issues were documented in a qualitative process evaluation of a government sanitation program implemented in Puri between 2013 and 2014; lower-caste groups were sometimes forced to sit in a separate area during community meetings or were altogether not invited (Routray et al., 2017). In contrast, a process evaluation of a community-level handwashing behavior change program in rural Andhra Pradesh, 'SuperAmma,' quantitatively examined exposure to intervention activities and found no difference between caste groups (Rajaraman et al., 2014). Such analyses by social groups are essential and should be more commonplace. However, as this and other studies demonstrate, qualitative explorations are also needed to capture participants' perceptions of their ability to fully engage and feel a part of activities. Future delivery of community-level programs in rural India should be mindful of caste divisions and identify, enact, and assess strategies, using both quantitative and qualitative approaches, to ensure equitable reach and engagement.
Social dynamics also negatively affected the experience of RWI mobilizers in implementing Sundara Grama. In order to deliver activities, the female mobilizers had to go against gender norms that restrict young women's movement and engagement outside the home. As a result, female mobilizers were subjected to social backlash, including catcalling and public shaming. Other studies in India and Pakistan have also documented how restrictive gender norms limit the ability of women to both participate in and carry out public health programs (Gailits et al., 2019; Mistry et al., 2009; Mumtaz et al., 2013). Program evaluations do not always consider the implementer experience, but implementers must operate within the same cultural norms as participants and it is vital to understand how those norms may affect their role or lead to unintended harm. Moreover, these findings demonstrate the need for NGOs and other implementers to establish safeguards and adequately prepare staff, often young women, who will have to challenge cultural norms as part of their work. Strategies may include establishing protocols to ensure safety and well-being, helping staff mentally prepare for negative social reactions, creating opportunities for staff to comfortably share their experiences and concerns, and equipping other staff who are not at risk of undermining a norm, and thus in a position of social power, with strategies for supporting their fellow team members. We caution implementers against refraining altogether from hiring staff who may face social backlash, because doing so prevents individuals from making their own choices and taking on new opportunities; as several of the female RWI mobilizers explained, over time they gained the respect of community members and experienced a newfound self-confidence.
In addition to social dynamics, understanding and being able to respond to the political and historical context in which a program takes place is also invaluable for implementation success. The delivery of Sundara Grama was marred by the dissatisfaction and distrust village members felt from past government-led sanitation programs. During the community meetings and transect walks, village members often disrupted the activity to voice their frustration at the poor construction quality of their government latrine and the unfulfilled promise of a latrine subsidy. Several studies have documented the same issues with the various government sanitation campaigns rolled out between 2011 and 2018, indicating these issues are not new and quite persistent (Barnard et al., 2013; Gupta et al., 2019; Routray et al., 2017). The mistaken belief held by village members that RWI staff were actually government officials who had come to force them to stop defecating in the open also impeded activity delivery. While this reaction was unexpected, it is not unfounded; coercive tactics authorized by local government officials, including harassment, public humiliation, fines, and the threat or actual loss of public benefits, are well documented during the latest sanitation campaign, SBM (Doshi, 2017; Editorial Board, 2017; Gupta et al., 2019; PTI, 2019). When designing interventions, the political context and experience of past programs should be considered; community members, stakeholders, and even implementing staff can be engaged from the start on how to address these issues head-on, as they are sure to arise. As one RWI mobilizer suggested, the Sundara Grama program could have included training on the latrine subsidy reimbursement process so mobilizers could offer some form of support to village members.
This process evaluation has many strengths. The study was framework-driven, systematically assessed delivery and reach across all villages, employed both quantitative and qualitative methodologies to appropriately evaluate each process evaluation component and triangulate findings, and explored both the participant and implementer experience. We also note a few limitations. While most of the data were collected by our separate evaluation team, the RWI mobilizers documented their own delivery of the household activities, which could have led to biased data. In addition, since reach of latrine-owning households was assessed in the endline trial survey, which took place 4 to 6 months after implementation, it is possible household members had forgotten about the activities by that time, although this would lead to a more conservative reach assessment. Lastly, there are important limitations to the qualitative data collection and analysis. Since only one enumerator completed each activity observation debrief, they may have missed some relevant observations and also brought their own biases to how they reflected on what was observed. Similarly, because only author GDS conducted the qualitative data analysis, the themes identified come from her analytical interpretation alone. That said, we are not aware of other WASH intervention studies that gave such focus to the experiences and insights of implementers, and overall see this qualitative component as a strength of the study.

Conclusion
Using mixed methods and a framework-driven process evaluation, we found the Sundara Grama sanitation intervention was implemented as intended and achieved good reach. The edutainment palla performance and multi-level activity delivery were particularly salient approaches that could be applied to other WASH programs that aim for communitywide behavior change. We also uncovered lessons learned on the need for process evaluations to examine the social, political, and historical context in which a program takes place, as well as the implementer experience, to ensure successful and equitable delivery and prevent unintended harm.