Background

Background and rationale

An estimated 250 million children (43%) under age 5 in low- and middle-income country (LMIC) settings experience compromised cognitive and socioemotional development due to poverty, poor nutrition, and inadequate psychosocial stimulation [1]. Myriad parenting interventions that promote responsive stimulation and early learning have been shown to be effective at improving early childhood development (ECD) outcomes in many LMIC settings [2, 3], at least in the short term, but (a) they are still too expensive to implement at scale in low-resource settings, especially in rural areas such as rural Kenya that lack the resources and infrastructure to implement public health programs, and (b) their early impacts tend to fade over time in the absence of continued support [4]. New ways to deliver effective ECD parenting interventions in low-resource settings are sorely needed: ways that are low-cost enough to be scalable while also able to sustain impacts over time.

The increasingly widespread use and low costs of mobile phones have spurred development of myriad mobile health (mHealth) interventions as potentially scalable means to deliver healthcare services [5] and improve health outcomes [6,7,8] in LMICs. There is growing evidence that mHealth interventions can also increase parental engagement [9,10,11,12,13,14], though these studies come predominantly from high-income countries and focus primarily on enhancing parental literacy activities through simple SMS over short program durations of up to six months. To our knowledge, only one study [14] found positive effects on children’s early literacy.

While SMS-based interventions are potentially cost-effective, they often struggle to effectively communicate complex behavioral change messages, especially in areas with lower literacy or in rural settings. This limitation is highlighted by a recent study from Latin America that underscores the challenges of transitioning ECD programs to remote delivery during the COVID-19 pandemic [15]. Furthermore, whereas earlier SMS studies typically utilized passive, one-way text messages, recent recommendations advocate for two-way and group interactions to provide more continuous support in mHealth behavior change programs [8, 16]. The emerging trials in Peru and Brazil that incorporate digital tools for content delivery via two-way communications are indicative of growing efforts in this direction [17, 18]. Despite these advancements, no prior ECD study has: (1) integrated audiovisual content that extends beyond simple SMS, along with two-way and group interactions on digital platforms; (2) assessed the sustainability or cost-effectiveness of mHealth compared to in-person delivery; (3) examined impacts on a broader spectrum of children’s developmental outcomes beyond literacy; and/or (4) been tested in an LMIC setting in an effectiveness trial with a large sample.

In a previous study, we provided scientific evidence that an 8-month ECD parenting intervention, featuring fortnightly in-person group meetings delivered by Community Health Promoters (CHPs) from Kenya’s rural health care system, significantly improved child cognitive, language, and socioemotional development, as well as parenting practices [19]. The intervention’s group-based model was also the most cost-effective among the few previous ECD interventions that reported costs [20]. These results support the notion that health services — particularly through the work of CHPs — are ideal starting points for scaling effective ECD interventions [21, 22]. In a two-year follow-up assessment, we found that impacts were still positive, but smaller in magnitude [23], suggesting that a cost-effective program may still be too expensive for scaling in a rural LMIC setting such as rural Kenya, where health services are often underfunded.

Our study will address two key remaining questions in the ECD literature: (1) how to scale promising ECD programs in low-resource, rural settings, and (2) how to sustain early impacts longer-term in a cost-effective manner. We will test whether an mHealth-based intervention, which progressively substitutes in-person meetings with remote delivery over time, can simultaneously achieve the competing goals of scalability and sustainability of ECD parenting interventions in LMICs.

Objectives and research questions

Our study aims to experimentally test the relative effectiveness and costs of a traditional in-person delivery model against a hybrid model that combines in-person meetings with remote mHealth delivery. This evidence-based ECD parenting intervention, targeting mothers and their children aged 6–18 months in rural Western Kenya, originally spanned 8 months. Our primary goal is to determine the best model to maximize the intervention’s reach and sustained impacts on child outcomes. By extending the original intervention over two years, we aim to enhance the program’s ability to sustain improvements in parenting behaviors and children’s outcomes over the long term. Integrating delivery into the ongoing operations of local CHPs within Kenya’s rural health care system, utilizing mobile technology, and engaging national and local ECD policymakers and stakeholders as key collaborators from the project’s inception, our project seeks to discover scalable, sustainable solutions for resource-limited settings. Our primary outcomes will include children’s development, parental responsive stimulation, positive parenting behaviors, as well as maternal and family wellbeing.

Our research questions are:

(1) Will an ECD responsive parenting curriculum adapted to mHealth delivery and tailored to the local cultural context be accepted by program beneficiaries and CHP delivery agents?

(2) How does the short-term effectiveness of a hybrid delivery model, which progressively substitutes in-person meetings with remote mHealth delivery, compare to that of a traditional in-person delivery model?

(3) Can a hybrid delivery model sustain early impacts in the medium-term better than a traditional in-person delivery model?

(4) Are the cost-savings entailed in hybrid delivery large enough to make it more cost-effective than an in-person delivery model in the short- and medium-term?

(5) What are the key implementation processes that can make a hybrid delivery more scalable than a traditional in-person delivery model?

To answer these research questions, we will collect measures of parental behaviors, knowledge, beliefs, self-efficacy, and mental health, along with child developmental outcomes, at baseline and at 12 and 24 months after the start of the interventions to assess short- and medium-term impacts. We will also track all program costs by treatment arm, including private opportunity costs for delivery agents and participants, to estimate the relative cost-effectiveness of the two delivery models in the short- and medium-term from a societal perspective. Additionally, a planned process evaluation will collect output measures of delivery and training quality, as well as attendance at in-person meetings and engagement in remote delivery, to identify which aspects of the remote delivery model are most effective and to inform a potential transition to scale.

Methods

Study setting

This study will take place in rural areas of Kisumu and Vihiga counties in western Kenya, characterized by high rates of poverty, child mortality, and stunting (31–34%). We will select three subcounties that are collectively large enough to supply the 90 rural villages that will participate in this study: Vihiga and Hamisi subcounties from Vihiga county, and Kisumu West subcounty from Kisumu county. All areas outside Kisumu town are predominantly rural, and our local NGO implementing partner, the Safe Water and AIDS Project (SWAP), has a local Jamii (“community”) center in Vihiga county that will facilitate local monitoring and supervisory capacity. Most villagers are subsistence farmers or informal manual laborers. Despite this poverty, 94% of households in these areas reported access to a mobile phone or smartphone in a Fall 2021 survey, reflecting the vast expansion in mobile phone ownership worldwide. However, phones are most commonly owned by the household and often under the direct control of the husband or male household head.

Eligibility criteria

This research project will involve a total of 1260 Kenyan mothers or other primary caretakers (1200 randomly selected for the main trial and 60 for the pilot study) and their children aged 6–18 months from 96 total villages (90 for the main trial and 6 for the pilot study) located across rural areas of Kisumu and Vihiga counties in western Kenya. Within selected villages, eligible mother-child dyads will be defined by (1) mothers or other primary caretakers aged 18 years or older, and (2) with a child aged 6–18 months at recruitment without signs of severe mental or physical impairments. If the mother has more than one child aged 6–18 months at recruitment, we will invite the youngest to participate. If the primary caretaker of an otherwise eligible child is the father or another male relative, he will be eligible for inclusion in our study, though we expect the vast majority of primary caretakers will be women, predominantly mothers. For simplicity we refer to this group as mothers.
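The eligibility rules above can be expressed as a short screening function. This is a minimal sketch, not the study's enrolment software: the function name, the field names `age_months` and `severe_impairment`, and the choice to apply the "youngest child" rule after excluding children with severe impairments are all illustrative assumptions.

```python
def select_eligible_child(caregiver_age_years, children):
    """Return the child to invite, or None if the dyad is not eligible.

    children: list of dicts with hypothetical keys
      'age_months' (number) and 'severe_impairment' (bool).
    """
    # Criterion 1: primary caretaker aged 18 years or older.
    if caregiver_age_years < 18:
        return None
    # Criterion 2: child aged 6-18 months without signs of severe impairment.
    candidates = [
        c for c in children
        if 6 <= c["age_months"] <= 18 and not c["severe_impairment"]
    ]
    if not candidates:
        return None
    # If more than one child qualifies, invite the youngest.
    return min(candidates, key=lambda c: c["age_months"])
```

The function takes no account of the caretaker's sex, in line with the text: a father or other male relative who is the primary caretaker passes the same checks.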

Overview of the trial design

Our evaluation design is a cluster Randomized Controlled Trial (cRCT) stratified across three subcounties in rural western Kenya, encompassing 90 villages and 1,200 households. In this design, 90 CHPs and their associated villages will be randomly assigned to one of three treatment arms. Arm 1 is the in-person delivery model, where 30 CHPs will deliver a traditional in-person group-based intervention. This includes a first intensive phase of 20 fortnightly village sessions over 12 months, followed by a second, less intensive phase of monthly booster meetings for 12 additional months. Arm 2, the mHealth delivery model, involves another 30 CHPs delivering a hybrid intervention that progressively substitutes in-person meetings with remote delivery over time. Arm 3 serves as a control group, where 30 villages will continue to receive CHP services as usual. Interventions in Arms 1 and 2 will deliver the same content, based on a curriculum tested in an earlier trial [19, 24], but extended over two years to maximize its potential to sustain impacts.

In collaboration with the local NGOs, the Safe Water and AIDS Project (SWAP) and the ECD Network for Kenya (ECDNeK), we will train these 90 CHPs to implement the interventions in their respective villages. Using a Training of the Trainers (TOT) model, our core study team will initially train staff from SWAP and ECDNeK to become lead trainers. These trainers will then instruct cohorts of CHPs (defined by subcounty and treatment arm) in the local language. In addition to this initial training, this TOT approach includes ongoing support through monthly refresher trainings, designed to reinforce the skills needed for the upcoming sessions.

The initial training for the early subcounty cohort will span 5 days and cover the first 4 sessions, which are all in-person in both treatment arms, accommodating approximately 20 CHPs from Arms 1 or 2 in each subcounty. After the first training, CHPs will host the first 4 sessions in their villages. Subsequent trainings for later sessions will be organized separately for each study arm within each subcounty, because Arm 2 will start substituting in-person meetings with remote delivery. We anticipate holding a total of five 1-week training sessions, one every two months, plus five additional 1-day monthly refreshers, to cover Phase 1 of the intervention, which consists of 20 fortnightly sessions over 12 months. For Phase 2, which comprises 12 monthly boosters, we anticipate three one-week training sessions, covering boosters 1 to 4, 5 to 8, and 9 to 12, respectively. Figure 1 summarizes our study’s evaluation design, the envisioned activities, and the timeline.

Fig. 1 Msingi Bora mHealth design and timeline

Village and participant enrolment randomization strategies

We will randomly assign villages and households to the interventions in three steps. First, we will work with local administrative data to list all the potential study villages within each of the subcounties of Kisumu West (Kisumu County) and Vihiga and Hamisi (Vihiga County), estimated to have at least 8 households with children aged 6–18 months. SWAP will record the GPS coordinates of each village center (usually a church or marketplace). From this list, we will randomly sample 90 villages, stratified by subcounty and maintaining a minimum distance from each other to minimize potential cross-village contamination. Villages will comprise our study’s clusters, from which we will sample households to participate in the study.
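One way to draw a stratified village sample subject to a minimum-distance rule is a greedy pass over a shuffled list, sketched below. This is an illustration only: the text does not state the minimum distance or the number of villages per subcounty, so `min_km` and `per_stratum` are hypothetical parameters, as are the function names and the dictionary keys `id`, `subcounty`, and `gps`.

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def sample_villages(villages, per_stratum, min_km, seed=0):
    """Randomly pick up to `per_stratum` villages in each subcounty such that
    every chosen village is at least `min_km` from all other chosen villages."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    chosen = []
    for stratum in sorted({v["subcounty"] for v in villages}):
        pool = [v for v in villages if v["subcounty"] == stratum]
        rng.shuffle(pool)
        picked = 0
        for v in pool:
            if picked == per_stratum:
                break
            # Accept a village only if it is far enough from every village
            # already chosen, including those in other subcounties.
            if all(haversine_km(v["gps"], c["gps"]) >= min_km for c in chosen):
                chosen.append(v)
                picked += 1
    return chosen
```

A greedy pass does not guarantee that the target is reached in a dense stratum, so in practice the result would need to be checked and the draw repeated or the distance relaxed if a subcounty falls short.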

Second, within each sampled village, we will conduct a census to create a full listing of all eligible households and record their GPS coordinates to facilitate the collection of surveys. SWAP will train CHPs from the 90 selected villages to collect this basic information. Our previous experience with collecting census data in neighboring subcounties shows that most villages are rather small, ranging from 8 to 16 households with a child aged 6–18 months. Therefore, we will invite all eligible households from selected villages to be part of the study. Our previous experience also shows minimal rates of refusal. Eligible participants will meet the criteria outlined above. Using the list of study participants per village and the recorded GPS coordinates, the village CHP will guide a trained interviewer to visit the households to invite eligible mother-child dyads into the study and to undergo informed consent procedures for participation.

Third, after the baseline survey is completed, we will randomly assign CHPs and their associated villages to one of the two treatment arms or the control group. Each study arm is expected to have 30 CHPs and 400 households. CHPs in villages assigned to an intervention arm will attend the training program outlined above. Households assigned to an intervention arm will be contacted and invited to attend the ECD village sessions. All randomizations will be stratified by subcounty to ensure balance across treatment arms, controlling for any village-level characteristics that may influence intervention outcomes. All CHPs will receive a stipend for their participation in the census and the intervention as appropriate.
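The stratified assignment in this third step can be sketched as a shuffle within each subcounty followed by dealing villages to the three arms in turn. This is a hedged illustration rather than the trial's randomization code: the arm labels, the function name, and the seed are assumptions, and the balanced 10/10/10 split per subcounty holds only if each subcounty contributes 30 villages, which the text implies but does not state.

```python
import random

ARMS = ("in_person", "hybrid", "control")  # hypothetical labels for Arms 1-3

def assign_arms(villages, seed=0):
    """villages: list of (village_id, subcounty) pairs.
    Returns a dict mapping village_id to a study arm."""
    rng = random.Random(seed)  # fixed seed so the allocation can be audited
    by_stratum = {}
    for village_id, subcounty in villages:
        by_stratum.setdefault(subcounty, []).append(village_id)
    assignment = {}
    for subcounty in sorted(by_stratum):
        ids = sorted(by_stratum[subcounty])
        rng.shuffle(ids)
        # Deal shuffled villages to arms in rotation within the stratum.
        for i, village_id in enumerate(ids):
            assignment[village_id] = ARMS[i % len(ARMS)]
    return assignment
```

If a stratum's size is not a multiple of three, the rotation gives the extra villages to the arms listed first; a real allocation would need an explicit rule for that case.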

Interventions

The ECD parenting interventions, hereafter referred to as the Msingi Bora mHealth interventions, will build on our team’s previous work in the Msingi Bora trial [19], and its subsequent bi-monthly booster extension [23]. The original Msingi Bora structured curriculum comprised 16 biweekly sessions, centered around five key messages: love and respect within the family, responsive talk, responsive play, hygiene, and nutrition, which were summarized to participants as love, talk, play, wash, food. Every fourth session served as a review to help consolidate learning. The curriculum manuals were available in English, Swahili and Luo, so that CHPs could be trained and could deliver the program in the local language. Boosters had the same structure as earlier sessions, but focused on reinforcing language development and positive parenting strategies to manage children’s behaviors.

For the Msingi Bora mHealth trial, the basic structure remains the same but additional materials will be incorporated to enhance areas like nutrition education and maternal wellbeing. The interventions will span two years. The target population is families with children aged 6–18 months at recruitment, and mothers and their age-eligible children will be invited to participate in all sessions.

The intervention delivery will vary across treatment arms as follows:

Arm 1: In-person group sessions

In villages assigned to Arm 1, the first phase will feature 20 in-person group sessions delivered biweekly over 12 months by CHPs. Each session, lasting 60–90 minutes, will address one of the five key intervention messages. The inaugural session introduces the ECD program. Four sessions will focus on love and respect within the family and maternal wellbeing, using group discussions and role-playing to bolster maternal self-efficacy, self-esteem, and healthy family dynamics. Seven sessions will be devoted to responsive play and talk, teaching caregivers how to play games with children using play materials available at home (such as cups, bowls, bottles, and stones), and how to engage in conversations, singing and storytelling with the child to enhance language development. One session will specifically address child health care practices, including diet and hygiene, though these topics are also integrated throughout other sessions. Every group session, regardless of the session topic, will include 30 minutes of guided mother-child play and talk activities to reinforce new behaviors. Finally, every fourth session will serve as a review to consolidate learning, totaling five review sessions.

Phase 2 extends the program with 12 monthly booster sessions designed to help sustain improvements in parenting behaviors and children’s outcomes over time. These boosters maintain the structure and duration of Phase 1 sessions but shift focus to advanced strategies for responsive play and talk as children grow, positive disciplinary practices to manage children’s behaviors, as well as maternal mental health. Every third booster session will serve as a group review, also revisiting hygiene and nutrition practices. New content will be introduced through group discussions, skits, and guided mother-child interactions.

Arm 2: mHealth hybrid delivery model

In villages assigned to Arm 2, the curriculum mirrors that of Arm 1, but integrates a hybrid mHealth delivery model, where most group sessions will be adapted to remote delivery via smartphones. Mothers in this arm will receive smartphones and a small monthly data plan to enable access to video content and facilitate their engagement in WhatsApp group interactions with other mother participants and the village CHP. This setup is designed to bolster social networks of support and opportunities for social learning. Remote sessions will feature video demonstrations of play and talk activities by lead trainers, supplemented by audio recordings that summarize key points and offer guidance on implementing these activities at home. A remote package containing these videos and audios will be distributed at the start of each session period, giving mothers ample time to engage with the material.

The creation of village WhatsApp groups, including the CHP, is intended to facilitate follow-up on new behaviors and encourage mothers to share their experiences, fostering social support networks. The Q&A activity conducted at the end of each in-person session in Arm 1 will be replicated through WhatsApp group calls hosted by the CHP near the end of the session period. This adaptation ensures that barriers to behavior adoption are addressed, and discussions about homework for the next session are facilitated. Review sessions in Arm 2 will remain in-person to maintain some face-to-face interaction and ensure adherence to the program.

Figure 2 illustrates the curriculum design, dividing session contents into two phases and highlighting the progressive substitution of in-person meetings with remote delivery over time in Arm 2 (highlighted in green), with only review sessions held in-person starting from session 7 in this arm.

Fig. 2 Curriculum design and session contents, Arms 1 and 2

Outcomes

The survey measures selected for our assessment battery and how they relate to primary and secondary outcomes of interest are detailed in Table 1. Most of these measures have been validated, translated into Swahili and Luo using standard translation and back-translation methods, and previously utilized to evaluate short- and medium-term impacts in our earlier trial within the same study setting [19, 23]. With the exception of those measures applicable only to children older than 2 years, all measures will be included in the assessment battery administered at each time point. This includes the Bayley III scale for assessing children up to 42 months, and the Global Scales for Early Development short-form (GSED).

Table 1 Primary and Secondary Outcomes of Interest and Survey measures

Blinding

Our study will have separate teams for survey collection and program implementation to prevent bias, so that program implementers and evaluators maintain their respective focuses and objectivity. The interventions will be managed by the implementation team at SWAP, led by Co-I Alu. This team will participate in the TOT sessions, lead the training of CHPs in their subcounties, and monitor the quality of implementation activities. They will also coordinate three subcounty teams, each composed of a subcounty supervisor and two mentor CHPs. These teams will collect attendance and monitoring data and supervise the daily activities of the CHPs. Survey data collection will be conducted by an external team of trained enumerators and supervisors, managed by a second evaluation team at SWAP led by Mr. Odhiambo. This team will focus exclusively on evaluation activities. Our core team of investigators will train enumerators in the household survey and the child assessments. Due to the nature of the intervention, participants and delivery agents will not be blinded to their study allocation. However, survey enumerators will be blinded to the intervention allocation status of participants and villages. Baseline surveys will be collected prior to randomization to ensure the sample is balanced across the three study arms.

Compliance

We do not anticipate noncompliance with treatment status for villages and households assigned to the control arm, as our sampling frame will ensure a healthy minimum distance between villages and CHPs in the study. For households in villages assigned to a treatment arm, our power calculations, which are presented below, account for potential noncompliance with treatment by assuming an expected session attendance rate of 75%.

Retention

Once a mother-child dyad is enrolled into the study, we will make every reasonable effort to follow the dyad for the entire study period. Both the baseline and midline surveys will collect mobile phone numbers from household members to facilitate tracking for subsequent surveys and session invitations, where appropriate. The mobile number of one neighbor will additionally be collected to assist in locating the dyad in cases of non-retention. Reasons for non-retention may include migration to another village or subcounty due to separation, (re)marriage, or relocation for work. We will record these cases, including the new address and contact information, and will continue to follow up with these families at the midline and endline surveys. Each survey round will include up to four attempts to contact a household before it is considered for removal from the sample. Our power calculations account for 7% annual attrition to allow for such instances.

Data

Sample size and power calculations

This cluster RCT will involve a total of 1200 Kenyan mother-child dyads, providing sufficient power to detect impacts on our primary outcomes. We calculate our power based on the primary outcome of children’s cognitive development using the Bayley III scale, which typically has a mean of 100 with a standard deviation (SD) of 15. The original Msingi Bora intervention had an effect size of 0.52 SD on children’s cognitive scores, with an annual attrition rate of 7%, and 75% compliance among mothers throughout biweekly sessions and boosters. The intra-cluster correlation coefficient (ICC) from Vihiga county was 0.02. Assuming a more conservative ICC of 0.04, 80% power, 75% compliance, and an annual attrition rate of 7%, and after correcting for baseline covariates, with 30 villages (400 mother-child dyads) per treatment arm we would be able to detect a minimum difference of 0.22 SD in cognition between the in-person and mHealth intervention arms, or between any intervention arm and the control group, at midline. Using the step-down method of Romano and Wolf to adjust the p-values for multiple hypothesis testing [45], the minimum detectable effect would be between 0.25 and 0.27 SD. At endline, assuming 15% accumulated attrition, the detectable difference in child cognition between treatment arms would be 0.25 SD. Adjusting for multiple hypothesis testing, the minimum detectable effect would lie between 0.29 and 0.31 SD. Finally, to further enhance the robustness of our estimated impacts, we will construct indices of child and parental outcomes using latent factor models and estimate the intervention effects on these indices. We anticipate creating at least four indices representing different families of outcomes: (i) child developmental measures; (ii) parental stimulation and health behaviors; (iii) parental knowledge and beliefs; and (iv) parental wellbeing.
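To make the ingredients of such a calculation concrete, the sketch below implements a textbook minimum detectable effect (MDE) formula for a two-arm comparison with equal-sized clusters. It is not the authors' calculation and is not claimed to reproduce the figures above: it uses a normal approximation, makes no adjustment for compliance, and takes the share of outcome variance explained by baseline covariates (`r2_covariates`) as a hypothetical input, because the text does not report it. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def mde_cluster_rct(clusters_per_arm=30, dyads_per_cluster=400 / 30, icc=0.04,
                    attrition=0.07, r2_covariates=0.0, alpha=0.05, power=0.80):
    """Minimum detectable effect, in SD units, for comparing two arms."""
    # Dyads expected to remain in each cluster after attrition.
    m = dyads_per_cluster * (1 - attrition)
    # Design effect: variance inflation from sampling whole villages.
    design_effect = 1 + (m - 1) * icc
    n_per_arm = clusters_per_arm * m
    # Standard error of a difference in means between two equal-sized arms,
    # shrunk by the variance that baseline covariates are assumed to explain.
    se = math.sqrt(2 * design_effect * (1 - r2_covariates) / n_per_arm)
    # Critical values for a two-sided test at `alpha` and the target power.
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * se
```

With the defaults, which take the ICC, attrition, power, and sample sizes from the text and assume no covariate adjustment, the function returns roughly 0.25 SD. Any positive `r2_covariates` lowers that value, and passing `attrition=0.15` gives the corresponding endline figure under the same assumptions.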

Data collection

Household surveys and procedures

For the 1,200 respondent mothers and children recruited for the main trial, participation will involve a 60–90-minute baseline survey. A team of two trained enumerators, one for the household socioeconomic and maternal surveys and the other assessing the child, will visit households to invite mothers into the study and undergo informed consent procedures for participation. All households, irrespective of their village’s eventual treatment assignment, will be asked to provide written or verbal consent after the purpose and contents of the study have been explained, along with the anticipated time commitment for attending the village-based sessions and/or participating in sessions delivered remotely should their village be assigned to an intervention arm. It will be made clear to mothers that participation in the surveys is voluntary and that participation in the intervention is not guaranteed but depends on their village’s random assignment. For those households that express a willingness to continue in the study, one interviewer will conduct the maternal and socioeconomic surveys in a first visit, and another interviewer will assess the child in the second visit.

All households surveyed at baseline will be re-contacted to undergo a midline survey roughly 15 months later, immediately following the conclusion of Phase 1, to assess short-term impacts after 12 months. The duration, procedures and measures will mirror those of the baseline. Interviewers will reassess the same children and re-interview the mothers from the baseline survey. An endline survey will also be conducted at the end of the two-year interventions to evaluate medium-term impacts. Both midline and endline surveys will assess the same maternal and child outcomes as at baseline. However, the last two surveys will include new measures for assessing children’s cognitive, socioemotional, and executive functioning development that are applicable only to children older than 2 years (see Table 1). As with the baseline, a team of two enumerators will conduct the fieldwork: one for the socioeconomic and maternal surveys, and the other to assess the child. Enumerators will be masked to intervention assignment. To express gratitude, all study households will receive a hygiene pack valued at 400 Ksh upon completion of each survey wave.

Monitoring and process data

We will collect both qualitative and quantitative data on the quality and fidelity of delivery following the CARE (Consolidated Advice for Reporting ECD Implementation Research) guidelines [46]. A planned collection of monitoring data will account for the implementation differences by arm. For the in-person sessions, subcounty supervisors will collect detailed implementation data in the form of attendance sheets, monitoring checklists measuring CHPs’ quality of delivery, and CHP self-assessment forms. For the remote sessions, we will continue to collect monitoring checklists and CHP self-assessment forms, but focus on the CHPs’ performance during WhatsApp group calls. In addition, at the end of each remote session, subcounty supervisors will collect a parental remote engagement form completed by CHPs in Arm 2. This form includes individual-level measures of parental engagement with the audiovisual content and participation in the WhatsApp group calls and chats. All the data will be collected using SurveyCTO and will be transmitted to SWAP servers in Kisumu, where SWAP staff will clean and aggregate the data to be transferred to an aggregate server hosted at USC. Finally, following the end of all interventions, local research staff will be trained to conduct focus group discussions (FGDs) with a minimum of 20 mothers and 12 CHPs assigned to the 2 intervention arms. The exit FGDs will aim to understand what worked and what did not from CHPs’ and parents’ perspectives, and these qualitative data will be used to complement quantitative findings using mixed methods.

Costing data

As explained above, our cost-effectiveness analysis is a key project aim. For a comprehensive understanding of the project’s costs, we will adopt a societal perspective for this analysis that includes both the provider’s implementation costs (e.g., CHP payments, cost of SMS) and opportunity costs to the household and community (e.g., time costs of delivering and attending sessions, and interacting remotely, for CHPs and mothers, respectively), separately by intervention arm. We will track all implementation costs during Phases 1 and 2, by treatment arm, using a step-down accounting cost method based on actual incurred costs provided by SWAP’s financial statements. We will use economic costing methods to estimate opportunity costs for mothers and CHPs as appropriate. Additional opportunity costs stemming from maternal behavior changes induced by the interventions will also be included. We will collect and report all costs in accordance with the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) guidelines [47].
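The cost components described above can be combined into simple summary measures. The sketch below is illustrative only: the text does not specify which cost-effectiveness statistic will be reported, so the cost per SD gained and the incremental cost-effectiveness ratio (ICER) shown here are common choices assumed for the example, and all function and argument names are hypothetical.

```python
def societal_cost(implementation, opportunity_chps, opportunity_mothers):
    """Total societal cost of one arm: provider implementation costs plus
    the opportunity costs borne by CHPs and by mothers."""
    return implementation + opportunity_chps + opportunity_mothers

def cost_per_sd(total_cost, n_children, effect_sd):
    """Cost per child per 1 SD improvement relative to the control group."""
    if effect_sd <= 0:
        raise ValueError("effect_sd must be positive for this ratio to be meaningful")
    return total_cost / n_children / effect_sd

def icer(cost_a, effect_a, cost_b, effect_b):
    """Incremental cost per additional SD gained by arm A over arm B,
    with costs expressed per child and effects in SD units."""
    if effect_a == effect_b:
        raise ValueError("effects are equal; the ratio is undefined")
    return (cost_a - cost_b) / (effect_a - effect_b)
```

For example, with placeholder figures that are not study data, `cost_per_sd(societal_cost(1000, 200, 300), 10, 0.5)` evaluates to 300.0 currency units per child per SD.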

Data management

All data collected will include personally identifiable information (PII), but it will be coded so only a household identifier can be linked to PII. Surveys will be collected via tablets and contain personal identifiers (names), anthropometric and psychosocial measures of children and their mothers, and mobile telephone numbers. Data from tablet-based surveys will be safely stored in SurveyCTO and will be downloaded and analyzed using Stata software version 16 (College Station, TX). To increase security over paper questionnaires, these data will be encrypted in SurveyCTO. Participant names will be removed from the data and no longer stored in any tablet after the successful linking of the midline and endline surveys to baseline data using a USC-generated ID. Access to this linked file will be restricted to authorized study staff. Data transfer from SWAP to USC will be done only with encrypted, password-protected files. Survey data will be treated with the strictest standards of confidentiality, following the study protocols involving human subjects as reviewed by USC’s Institutional Review Board (IRB) and the Maseno University Ethical Review Committee (MUERC).

Statistical methods

Short-term impacts

We will use outcomes data from the midline survey and our cluster randomized design to estimate the short-term effectiveness of the two intervention arms, 1 and 2, relative to the control Arm 3. We estimate relative effectiveness in an Intention-to-Treat (ITT) framework. Let \(Y\) denote an outcome of interest at midline, and let \(D\) be a vector of dummy variables for the random allocation to one of the study arms: the in-person model \((D_1)\), the hybrid model \((D_2)\), and the control group \((C)\). The ITT parameters capturing the intervention effects relative to the control group can be estimated from the following linear regression:

$$Y = \alpha_{0} + \alpha_{1}^{ITT} D_{1} + \alpha_{2}^{ITT} D_{2} + X^{\prime}\beta + \lambda + \varepsilon$$
(1)

In Eq. (1), \(\alpha_{k}^{ITT}\) is the parameter capturing the ITT impact of intervention type \(k \in \{1,2\}\) relative to the control group on the final outcome. \(X\) is a vector of covariates that includes children’s age and sex, family socioeconomic status, and outcomes at baseline. \(\lambda\) represents the randomization strata (the subcounty), and \(\varepsilon\) is an error term, clustered at the village level. The ITT parameters are identified by the orthogonality between the error term and treatment status, which random allocation guarantees. We will correct for multiple hypothesis testing among potentially highly correlated outcomes using the Romano-Wolf procedure [45].
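As a minimal illustration of Eq. (1), assume a specification with no covariates and no strata effects. In that special case, the OLS coefficients on the arm dummies reduce to differences in mean outcomes between each intervention arm and the control arm. The sketch below uses made-up numbers, and the function name `itt_effects` and the arm labels are hypothetical. It omits the covariates, subcounty strata, village-clustered standard errors, and multiple-testing correction that the actual analysis will include.

```python
from statistics import mean


def itt_effects(outcomes, arms, control="control"):
    """Return {arm: mean(Y | arm) - mean(Y | control)} for each non-control arm.

    With only arm dummies on the right-hand side, these differences equal
    the OLS estimates of alpha_1^ITT and alpha_2^ITT in Eq. (1).
    """
    by_arm = {}
    for y, arm in zip(outcomes, arms):
        by_arm.setdefault(arm, []).append(y)
    control_mean = mean(by_arm[control])
    return {
        arm: mean(values) - control_mean
        for arm, values in by_arm.items()
        if arm != control
    }


# Illustrative (made-up) standardized child outcome scores, three per arm.
y = [0.4, 0.6, 0.5, 0.3, 0.5, 0.4, 0.1, 0.0, 0.2]
d = ["in_person"] * 3 + ["hybrid"] * 3 + ["control"] * 3
effects = itt_effects(y, d)
```

Here `effects` holds one ITT estimate per intervention arm, each measured against the control mean.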

Similarly, we can estimate the Treatment-on-the-Treated (TOT) parameter, which captures the average treatment effect on compliers with treatment arm \(k\) relative to the control group, using the following Two-Stage Least Squares procedure:

$$Y = \alpha_{0} + \alpha_{1}^{TOT} P_{1} + \alpha_{2}^{TOT} P_{2} + X^{\prime}\beta + \varepsilon$$
(2)
$$P_{1} = b_{0} + b_{1} D_{1} + b_{2} D_{2} + X^{\prime}\gamma + \eta$$
(3)
$$P_{2} = c_{0} + c_{1} D_{1} + c_{2} D_{2} + X^{\prime}\delta + \pi$$
(4)

In Eq. (2), \(\alpha_{k}^{TOT}\) is the TOT impact of intervention \(k\) on the outcome. \(P_{1}\) and \(P_{2}\) are dummy variables for observed compliance (participation) in the in-person and hybrid interventions, respectively. These can differ from the random allocation if there is imperfect compliance. Equations (3) and (4) are the first-stage equations: they model the participation decision using the random allocation as an instrumental variable. Estimating the system by Two-Stage Least Squares (2SLS) corrects for selection bias into participation.
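To build intuition for Eqs. (2)-(4), consider a simplified case that is an assumption of this sketch rather than part of the analysis plan: no covariates, and nobody participates in an arm to which they were not assigned (so \(P_{1}=0\) outside Arm 1 and \(P_{2}=0\) outside Arm 2). Under those conditions the just-identified 2SLS estimate for each arm equals the ITT effect divided by that arm's take-up rate (a Wald ratio). The function name `tot_wald` and all numbers below are hypothetical.

```python
def tot_wald(itt_effect, takeup_rate):
    """Wald ratio: ITT effect scaled by the share of assigned units that participated.

    Valid as a 2SLS shortcut only under the stated simplifications:
    no covariates and no participation outside the assigned arm.
    """
    if not 0 < takeup_rate <= 1:
        raise ValueError("take-up rate must lie in (0, 1]")
    return itt_effect / takeup_rate


# Made-up ITT effects (in SD units) and take-up rates for each arm.
tot_in_person = tot_wald(itt_effect=0.40, takeup_rate=0.80)
tot_hybrid = tot_wald(itt_effect=0.30, takeup_rate=0.60)
```

Because take-up is at most one, the TOT estimate is never smaller in magnitude than the corresponding ITT estimate in this simplified setting.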

Medium-term impacts

Assessing the medium-term impacts of the interventions at endline is straightforward and follows the same analysis plan outlined in Eqs. (1)-(4) for short-term impacts. Instead of using the outcomes from the midline survey, we will use outcomes measured at the final endline survey.

Cost-effectiveness

Following recent guidelines for cost-effectiveness analyses, we will calculate incremental cost-effectiveness ratios (ICERs), expressed in terms of incremental ITT impacts on child outcomes per $100 investment. For example, the ICER for the hybrid intervention relative to the in-person intervention can be calculated with the following formula:

$$\text{ICER} = \frac{\left(\alpha_{2}^{\text{ITT}} - \alpha_{1}^{\text{ITT}}\right) \times 100}{\mu_{\text{c}2} - \mu_{\text{c}1}}$$
(5)

where \(\mu_{\text{c}2}\) is the cost per child of the hybrid intervention in Arm 2, \(\mu_{\text{c}1}\) is the cost per child of the in-person intervention in Arm 1, \(\alpha_{2}^{\text{ITT}}\) is the ITT impact of Arm 2 on an outcome of interest, and \(\alpha_{1}^{\text{ITT}}\) is the analogous ITT impact in Arm 1.
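The arithmetic in Eq. (5) can be sketched as follows. The function name `icer_per_100` and all input values are hypothetical and purely illustrative; they are not study estimates.

```python
def icer_per_100(itt_hybrid, itt_in_person, cost_hybrid, cost_in_person):
    """Incremental effect of Arm 2 over Arm 1 per $100 of incremental cost, as in Eq. (5).

    Effects are ITT impacts on a child outcome; costs are per-child costs in dollars.
    """
    delta_cost = cost_hybrid - cost_in_person
    if delta_cost == 0:
        raise ZeroDivisionError("ICER is undefined when per-child costs are equal")
    return (itt_hybrid - itt_in_person) * 100 / delta_cost


# Made-up inputs: the hybrid arm is assumed slightly more effective and more costly.
example_icer = icer_per_100(
    itt_hybrid=0.45, itt_in_person=0.40, cost_hybrid=100.0, cost_in_person=75.0
)
```

With these inputs the ratio is about 0.2 standard deviations per additional $100 spent per child. The interpretation depends on the signs of the numerator and denominator, so both should be reported alongside the ratio.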

Mediation analysis

To examine the interventions’ pathways of change, we will conduct a mediation analysis using a Monte Carlo simulation approach [48]. In a standard mediation model where the outcome of interest is \(Y\) and the mediating factor is \(M\), the goal is to estimate the magnitude and significance of the intervention’s indirect effect \((a\times b)\), as opposed to the direct effect \((c)\), from the following model:

$$Y = b_{0} + bM + cD + \varepsilon$$
(6)
$$M = a_{0} + aD + u$$
(7)

Using this simple model, we can investigate the pathways through which one of our intervention arms influences changes in a specified outcome of interest. For example, we can explore whether intervention impacts on children’s outcomes \((Y)\) are mediated by changes in factors \((M)\) such as stimulation behaviors, disciplinary practices, nutrition practices, or other maternal intermediate outcomes including knowledge, self-efficacy, social networks, or mental health. To do this, we will perform the following steps. First, we will run regressions using Eq. (7) for each potential mediator of interest to estimate the intervention impact on the mediator, captured by the coefficient \(\widehat{a}\). Second, for each potential mediator, we will run regressions using Eq. (6), including treatment dummies and the particular mediator of interest, to estimate the coefficient \(\widehat{b}\). Third, using the estimated regression coefficients and their standard errors, we will compute the 95% Monte Carlo confidence intervals for the indirect effect \((\widehat{a}\times\widehat{b})\) based on a very large number of repetitions. An interval that does not include zero indicates a significant indirect effect of that particular mediating factor. To assess the total indirect effect including all relevant mediators, we will examine the Monte Carlo confidence intervals using the paths \(a\) and \(b\) from all mediators that proved significant individually, but now included together in the same regression model, as in Eq. (6).
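A minimal sketch of the Monte Carlo interval for a single mediator is given below. It assumes that \(\widehat{a}\) and \(\widehat{b}\) are approximately normal and independent, which is a simplification made for this illustration. The function name `monte_carlo_ci`, the number of repetitions, the seed, and the coefficient values are all hypothetical.

```python
import random


def monte_carlo_ci(a_hat, se_a, b_hat, se_b, reps=20000, level=0.95, seed=12345):
    """Percentile confidence interval for the indirect effect a*b.

    Draws a and b from independent normal distributions centered on the
    estimates, multiplies each pair, and reads off the empirical quantiles.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a_hat, se_a) * rng.gauss(b_hat, se_b) for _ in range(reps)
    )
    tail = (1 - level) / 2
    lower = products[int(tail * reps)]
    upper = products[int((1 - tail) * reps) - 1]
    return lower, upper


# Made-up path estimates: a from Eq. (7), b from Eq. (6).
ci_lower, ci_upper = monte_carlo_ci(a_hat=0.30, se_a=0.08, b_hat=0.50, se_b=0.10)
mediated = ci_lower > 0 or ci_upper < 0  # True when the interval excludes zero
```

Fixing the seed makes the interval reproducible. In practice the number of repetitions would be chosen large enough that the interval endpoints are stable across seeds.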

Heterogeneous effects

Given the complexity of our experiment and the numerous hypothesized channels through which our interventions may affect final outcomes, it is challenging to anticipate all possible heterogeneous effects in advance. However, understanding whether our interventions are more effective among disadvantaged households is crucial for designing targeted policies and addressing equity-efficiency considerations by bridging socio-economic gaps in early child development. Therefore, we plan to test for heterogeneous treatment effects by examining variables such as children’s sex and age, maternal age and education, household wealth, and child outcomes at baseline.

Missing data and attrition

In all our analyses, we will handle missing data and attrition across survey waves by fitting logistic regression models to determine whether missing observations are random. To correct for potential non-random attrition in our regressions and in calculating standard errors, we will employ Inverse Probability Weighting (IPW) methods [49]. IPW reweights our data to give larger weights to participants who are underrepresented in the midline or endline samples due to attrition. We will complement this strategy with the estimation of Lee Bounds for all our results [50]. To test the impact of outliers on our findings, we will assess the robustness of our estimates by comparing results from the full sample with those from a trimmed sample, excluding the bottom 2% and the top 2% of the data distribution, and will test for the significance of differences between these estimates.
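The reweighting idea behind IPW can be sketched as follows. For brevity, this illustration replaces the logistic attrition model with retention rates computed within baseline cells, which is an assumption of the sketch and not the planned estimator. The function names, cell labels, and data are hypothetical.

```python
from collections import defaultdict


def ipw_weights(cells, retained):
    """Weight each retained unit by 1 / (retention rate in its baseline cell).

    Units lost to follow-up receive a weight of zero.
    """
    totals, kept = defaultdict(int), defaultdict(int)
    for cell, r in zip(cells, retained):
        totals[cell] += 1
        kept[cell] += int(r)
    return [totals[c] / kept[c] if r else 0.0 for c, r in zip(cells, retained)]


def weighted_mean(values, weights):
    """Weighted mean over units with positive weight (attritors have no outcome)."""
    pairs = [(v, w) for v, w in zip(values, weights) if w > 0]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# Made-up data: half of the "low" wealth cell is lost at follow-up.
cells = ["low"] * 4 + ["high"] * 4
retained = [True, True, False, False, True, True, True, True]
outcomes = [0.0, 0.2, None, None, 1.0, 1.0, 1.0, 1.0]

weights = ipw_weights(cells, retained)
ipw_mean = weighted_mean(outcomes, weights)
```

In this example the retained "low" households each receive a weight of 2, so the weighted mean (0.55) is lower than the unweighted mean among retained households (0.70), reflecting the underrepresented group.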

Adverse effects

Interviews, surveys, and the ECD program are low-risk, and therefore adverse events (AEs) are very unlikely. Any AEs that do occur will likely be due to factors unrelated to the study. However, participation may lead to unintended or unexpected adverse consequences (e.g., giving smartphones to women might trigger intra-household conflict). In these instances, we will rely on local monitoring and reporting mechanisms established by SWAP, which has extensive experience handling fieldwork activities in community health projects. All SWAP staff have been trained to report adverse events and intervene as necessary, assessing the participant’s situation and developing a response plan. Incident reports will be written within one business day, and study investigators will inform the IRBs of all AEs. This plan has been reviewed and approved by the local IRB at Maseno University, as well as USC’s IRB.

Dissemination of results

Our dissemination plan will consist of two central strategies:

1) Engagement with local ECD Policy: Our research team includes staff based in Kisumu and Nairobi, and we have planned activities in every project year to ensure our project remains engaged with local ECD policymakers and stakeholders throughout its duration. In Year 1, Co-Investigator Mwoma and her team at the ECD Network for Kenya (ECDNeK) will host full-day sensitization workshops in Kisumu and Nairobi to launch the project, inviting key County and National policymakers, stakeholders, and partner agencies. This will ensure their input in planning project activities. Following this launch, we will establish an advisory board with representatives from partner agencies and stakeholders (e.g., Ministry of Health, Ministry of Labor and Social Protection, Africa Early Childhood Network), who will meet virtually twice per year to receive updates on project progress and provide feedback and guidance. ECDNeK will also hire a part-time policy coordinator to oversee project networking and advocacy activities and to attend meetings that keep our project connected to the local ECD policy environment.

2) Dissemination of Results: In the later years of the project, ECDNeK and SWAP will coordinate dissemination and policy engagement workshops to share project findings. Our research team plans to publish the study’s protocol and all findings in peer-reviewed journals in economics and public health, and to present results at domestic and international conferences, such as those of the Society for Research in Child Development (SRCD).

Discussion

There is an urgent need to discover the most effective and potentially scalable models of delivery for evidence-based responsive caregiving interventions that can improve children’s developmental outcomes among disadvantaged children in resource-poor settings. Additionally, it is critical to ensure these improvements are sustained to realize long-term benefits and help break the intergenerational transmission of poverty. While there is abundant evidence that ECD responsive caregiving programs can achieve short-term impacts on parenting behaviors and children’s outcomes, the challenge remains to sustain these impacts in the longer-term and scale those programs.

Our adaptation and test of the Msingi Bora program for remote delivery via smartphones, along with our strategy to extend the interventions to two years of continued program support, are designed to tackle these challenges head-on. However, our study might encounter several practical and operational issues. The first challenge is related to the measurement of children’s outcomes with reliable, direct assessments rather than parental reports. Internationally accepted “gold standard” direct assessments, such as the Bayley-III and the WPPSI-IV, have been primarily developed for high-income country settings. These assessments are time-consuming, require highly skilled assessors, and depend on the child’s mood, with the mother’s presence often needed to comfort the child. To address this, we will conduct a one-month training program for survey enumerators, including two weeks of intensive training in Kisumu, one week of supervised practice with rural families, and a final week of field testing to establish test-retest and inter-rater reliability (IRR) measures before full implementation.

The second challenge relates to the risk of intervention spillover across villages. We will adopt several strategies to mitigate this risk. First, we will ensure CHPs’ catchment areas are well mapped and maintain a minimum distance between sampled villages. Second, CHPs will collect detailed attendance data, including the village of residence, to monitor and address uninvited participation. Third, SWAP will coordinate with Community Health Units, which supervise CHPs across large groups of villages, to prevent cross-village contamination. Finally, SWAP will identify any concurrent interventions by other NGOs to avoid overlap whenever possible. When overlap is unavoidable, these interventions will be documented so that this information can be incorporated in our statistical analyses.

Finally, the remote nature of Arm 2 poses a challenge because of limited technological literacy among our study participants, which could hinder their ability to download video content or join WhatsApp group calls and chats. To mitigate this risk, we will conduct a special in-person session before the start of remote sessions in this arm to train families on smartphone usage, covering access to audio and video content, use of WhatsApp for group interactions and group calls, and downloading materials. We will also provide internet packages to CHPs and the families to facilitate access to the remote content and the virtual social network. Every few remote sessions, we will host an in-person session to maintain adherence to the program, encourage smartphone retention, and reinforce key content shared remotely. A buddy system will be implemented to encourage peer support in using smartphone technology, further enhancing program adherence and the enactment of new parenting behaviors.

Overall, despite these challenges, we believe the strengths of our study outweigh its limitations.