1 Introduction

Sub-Saharan Africa (SSA) has made substantial progress in improving maternal, neonatal and child health (MNCH) outcomes over the last 15 years, but continues to suffer the highest global burden of poor MNCH outcomes [1,2,3]. Since the start of the Millennium Development Goal era, strengthened primary care, including community health worker (CHW) networks and community-based interventions, has been among the key recommended approaches for increasing universal health coverage in this region [4,5,6], and is especially valuable in contexts where a significant proportion of births are rural and take place at home [7]. This integrated, health-systems focus continues to be a priority for the achievement of Sustainable Development Goal (SDG) targets for health [7,8,9]. In practice, however, such programmes may be difficult to sustain where primary care systems are under-resourced, and may even fail to achieve levels of coverage that translate to significant improvements in health at the population level [10]. The evaluation of innovative, pragmatic community-based MNCH trials sensitive to the characteristics of the implementing environment is therefore crucial to understanding what works, how interventions work, and in what context, so that programmes can be effective at scale.

Such evaluations must consider the potential complexities of assessing the success of integrated intervention approaches [9], which may contain multiple components or delivery mechanisms linked to outcomes at both population and health system levels. There will be unique challenges too: mortality rates, pregnancy rates, birth rates, and reproductive period-specific risks of morbidities are frequently prioritized MNCH indicators that often require specialized data collection methods or very substantial sample sizes when assessed at population level [11,12,13,14]. Where MNCH needs are highest, facility-based care systems are usually also the least supported and/or inaccessible to many, and facility-based routine outcome data for these indicators will often be incomplete or inaccurate [15]. This is even more true of data generated in areas where populations already have poor access to facility-based care. Monitoring and evaluation will therefore rely heavily on participant-reported outcomes from household surveys and other local data sources to provide the bulk of information on intervention impact. This requires careful planning and implementation to maintain high data quality and minimise bias [15], and not all potential pitfalls will be predicted before a study starts. Many of these issues are not specific to MNCH research, but can have disproportionately large negative impacts on MNCH evaluations where key outcomes like deaths or pregnancies are rare or time-bound. Some infrastructure for household surveys may be in place in sites which form part of the Demographic and Health Surveys or Multiple Indicator Cluster Surveys networks in SSA [16, 17]; however, in the context of the evaluation of specific MNCH programmes or pragmatic trials, additional surveillance systems may have to be developed from scratch outside of the routine data systems to measure study outcomes with higher levels of timeliness and accuracy, or to collect outcomes which may not be available from existing sources or which cannot be estimated reliably based solely on participant recall.

In this special series, “Lessons Learned from Operationalising Impact Evaluations of MNCH Interventions”, we consider the operational challenges in delivering quality evaluations of community-based MNCH programmes. The setting is Ghana, a West African country with lower-middle-income status [18], which for over 20 years has had a central role in hosting several large, seminal studies of MNCH through a long-term collaboration between the award-winning Kintampo Health Research Centre (KHRC) [19], a Ghana Health Service Research Centre in Brong Ahafo (BA), and the London School of Hygiene and Tropical Medicine (LSHTM) in the United Kingdom.

2 History of the LSHTM-KHRC collaboration

The LSHTM collaboration with the Ghana Health Service can be traced back to the early 1990s with the initiation of a series of randomised controlled trials of vitamin A supplementation in infants and mothers. The first, the Ghana VAST trial in Navrongo, Northern Ghana, established the concept of the safety and efficacy of supplementation in early childhood, enrolling nearly 23,000 infants and demonstrating a positive impact on infant morbidity and mortality [20]. The consequent question of whether supplementation could be successfully linked to the expanded programme of immunisation (EPI) delivery schedule prompted a multi-country study of the impact of EPI plus vitamin A (EPI-Plus) in India, Peru and Ghana [21], and with it the creation of the KHRC, the host of the Ghana trial arm. KHRC enrolled nearly 3000 mother-infant pairs into the EPI-Plus surveillance system, providing the early structure for the expansion of the survey model into one of the largest research-based surveillance systems in SSA. The expansion was necessary to address a related MNCH question of equally significant international importance at the turn of the millennium: the potential impact on the maternal mortality rate (MMR) of monthly low-dose vitamin A supplementation distributed at community level to women of reproductive age, the subject of the ObaapaVita Study [22]. An earlier study in Nepal had reported a large and significant impact on MMR; however, the study design and analysis approach were met with considerable skepticism [11, 23, 24]. With lower maternal mortality rates in Ghana, the statistical power needed to assess a similar impact in ObaapaVita required the monthly follow-up of over 200,000 unique participants across seven districts between 2000 and 2008, an unprecedented sample size at the time. After the study ended, the surveillance system went on to become the engine and long-term scaffold for a series of policy-influencing community-based randomised controlled trials, cohort studies, and secondary analyses of MNCH (Fig. 1). It is of note, however, that the Ghana vitamin A studies alone have provided a significant proportion of the entire evidence base shaping current global policy on vitamin supplementation, including reversing previous recommendations for supplementation to new mothers and infants aged < 6 months and greenlighting programmes of supplementation to children 6 months of age and older.

Fig. 1 Key completed trials and embedded studies of maternal, neonatal, and child health conducted in Brong Ahafo 1995–2016. References: Lancet [22]; Lancet [25]; Lancet [26]; Lancet Global Health 2018 [27]; Edmond et al. [28]; Dzakpasu et al. [29]; Weobong et al. [30]; Gram et al. [31]; Manu et al. [32]; Pitt et al. [33]; Nesbitt et al. [34]; Vlenterie et al. [35]

3 Series concept—addressing the gap between the design and implementation of complex evaluations

The successful delivery of a surveillance system of this size and complexity requires large amounts of effort on many fronts, from design to delivery. That the Ghana system functioned in the same form for nearly two decades is a phenomenal achievement for all involved. This series brings together the practical experiences and learnings of a range of academic, operational and policy experts, all with significant input into the design or conduct of one or more MNCH trial evaluations at KHRC during this period. Over several workshops and paper development meetings, in person and online, the authors of this commentary were joined by authors of the papers in this series, both all together and in topic sub-groups, to define an overarching aim for the series and goals for each paper. We agreed that the main goal was to provide experiences and lessons learned in conducting evaluations that are primarily practical and useful for stakeholders in MNCH research. The learnings may well be relevant for evaluators working in other areas of public health relevant to SSA, and we welcome any feedback to the editor to this effect.

As the articles in this series illustrate, the study teams were obliged to solve issues that will be recognised as relevant by many working in low- and middle-income countries (LMICs): adapting to differences in home address cultures and household enumeration, fidelity of IDs, high migration rates, low vital registration (VR) rates, engaging with cultures and communities, and maintaining good quality data collection and processing. The series highlights topics of particular resonance that cover many of these issues. Sam Newton and co-authors [36] (“Maximizing community participation and engagement: lessons learned over 2 decades of field trials in rural Ghana”) discuss a subject ideally addressed at the start of implementation of any new evaluation: optimising engagement with the target population. The authors offer many valuable insights here, key among them details of the innovative “Information, Education and Communication” (IEC) team led by senior social scientists based at the KHRC. Initially created to communicate the purpose and potential benefits of the ObaapaVita Study to maximise treatment adherence, the IEC team saw its remit evolve over time with the implementation of new trials, quickly becoming a permanent unit at the centre [36] and conducting formative and qualitative research to better understand the needs and perspectives of the study population. The authors credit this team as one of four core strategies contributing to successful long-term community engagement in the MNCH research studies at KHRC. They nicely highlight the added benefits of the IEC team over the more traditional approach of one-off sensitization or engagement activities implemented only at the start or end of studies. We note the World Health Organization (WHO) has recently endorsed permanently-embedded community engagement teams for similar settings and has provided a theoretical framework for their development [37], which will complement the excellent practical implementation advice provided by Newton et al. in this series.

In the paper “Are verbatim transcripts necessary in applied qualitative research?: experiences from two community-based intervention trials in Ghana” [38], Zelee Hill and co-authors expand on the development of qualitative techniques used by the IEC team. Qualitative research was a core component of both the ObaapaVita trial and the Newhints trial of community health worker visits to pregnant and recently delivered women [25], and featured in other studies conducted within the surveillance system, variously to promote participant engagement, to inform study design and implementation, and to evaluate determinants of programme success [28, 38, 39]. The authors focus on one of the most time-consuming aspects of qualitative data collection, the collection and transcription of narrative interview data [40]. They document the evolution of interview transcription methods used by team members where difficult trade-offs were required between comprehensiveness and efficiency. They recommend contexts in which a ‘fair notes’ approach, which supplements field notes with additional information from interview recordings or from memory, may be most valuable. Given the lack of consensus in the social science community on the value of fair notes and the uses of audio recordings to expand field notes [41,42,43], this paper is a welcome addition to the limited literature in this field.

Caitlin Shannon and co-authors have developed a checklist for the optimized design of pregnancy surveillance systems that capture maternal and child health outcomes, described in the paper “Implementing effective community-based surveillance in research studies of maternal, newborn and infant outcomes in low resource settings” [44]. Unsurprisingly, given the prioritization of and commitments to improving MNCH outcomes worldwide [8, 10], considerable efforts are dedicated to the harmonization of key indicators [27, 45] and to designing surveillance tools to collect them [46,47,48]. This paper continues the series theme by documenting many years of challenges and solutions in operationalizing and adapting recommended MNCH tools, but additionally, and crucially, takes a wide-lens perspective in outlining lessons learned in designing an ‘integrated’ surveillance system with sufficient sensitivity to collect linked data on the incidence of pregnancies, births, and deaths within the system, and related outcomes for impact evaluations. By taking this approach, the authors implicitly acknowledge the connection between maternal and child health outcomes and their often shared aetiology [45] and, like Newton et al. [36], primarily underline the essential need for good community relations for integrated surveillance to be sustainable. They pay special attention to documenting learnings around the cost–benefit of more frequent (e.g. monthly) versus less frequent survey cycles to identify new pregnancies and deliveries. There is little formal research or documentation comparing survey cycle durations that best capture MNCH outcomes; however, general recommendations from experienced implementation teams suggest that survey ‘sweeps’ every 3–6 months are usually sufficient to track pregnancies and births in population-based surveillance systems in low-resource settings [49, 50]. Shannon and co-authors argue that in settings where migration rates are relatively high and rates of pregnancy-related deaths are relatively low (377 deaths per 100,000 pregnancies in the study site in Ghana [36]), more frequent pregnancy surveillance may be considered.
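
To illustrate why rare outcomes such as pregnancy-related deaths demand very large surveillance populations, the following is a minimal sketch of a standard two-proportion sample size approximation. It uses the rate of 377 deaths per 100,000 pregnancies quoted above as the control-arm risk. The 30% relative reduction, the 5% significance level and the 80% power are illustrative assumptions only and are not taken from any of the trials discussed here. The function name `n_per_arm` is hypothetical, and the calculation ignores clustering, migration and loss to follow-up, all of which would increase the required numbers in a real community-based trial.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Approximate pregnancies needed per arm to detect a relative reduction
    in a rare binary outcome (two-sided comparison of two proportions).

    Assumes individual randomisation with equal arms and no clustering or
    loss to follow-up; these are simplifying assumptions for illustration.
    """
    p_treat = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2)


if __name__ == "__main__":
    baseline = 377 / 100_000  # pregnancy-related deaths per pregnancy (from the text)
    for reduction in (0.15, 0.30, 0.45):  # hypothetical effect sizes
        print(f"{reduction:.0%} reduction: {n_per_arm(baseline, reduction):,} per arm")
```

Under these assumptions the requirement runs to tens of thousands of pregnancies per arm, and it grows steeply as the assumed effect becomes smaller, which is consistent with the scale of surveillance described in this commentary.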

Shannon et al. [44], and in more detail Danso et al. [51] (“Population cause of death estimation using verbal autopsy methods in large-scale field trials of maternal and child health: Lessons learned from a 20-year research collaboration in Central Ghana”), discuss methods to survey and categorise maternal, neonatal and infant deaths. Facility and community-based systems for death registration are poor in Ghana and, as in most of SSA, country estimates often rely on modelling or incomplete sub-national data to estimate causes and burden of mortality [45, 52]. KHRC adopted the WHO “verbal autopsy” (VA) tool [14] to collect information related to a death from close family and friends of the deceased; forms were later reviewed by physicians and assigned a cause of death. Implementing VA surveys required special training, and field and data management protocols that optimized accuracy of results whilst respecting the mourning period and cultural practices in the study site [14]. The Danso et al. paper discusses recall periods, selection and training of VA interviewers, and special adaptations to the WHO tool to increase the probability of identifying the causes of deaths. This includes the perspectives of co-authors directly involved in the design and conduct of VA studies, who recount the value to physician review of the expanded structured narrative section, an otherwise optional section in the original WHO VA form. Contemporary methods for assigning causes of death increasingly rely on automated machine learning algorithms applied to closed-ended VA questions [53, 54], and more recently to free text [54, 55], with the aim of shortening the process and reducing cost and time. Reports of their accuracy in practice, however, have been mixed [56, 57]. Respondent narratives will likely remain a useful option as automated methods continue to evolve.

4 Conclusions

There are a number of good technical sources for guidance on conducting complex impact evaluations which focus on theory, epidemiology or quantitative solutions to problems of quality or bias [15, 45, 49, 50]. The unique approach of this series is to address practical considerations and methodological challenges in designing large population-based impact evaluations of MNCH interventions. Official manuals from DHS, WHO, and other guideline bodies with MNCH programme components may include general recommendations for survey implementation; however, the final path rightly relies on the implicit knowledge or ingenuity of implementers themselves to decide what works best for their scenario, often in a process of iterative learning.

The unusually large scale and long-lasting nature of the MNCH surveillance system in Ghana has proved a rich source of experiences and lessons in programme evaluation design. It is our hope that the experiences shared in this series provide a valuable reference for others conducting population-based evaluations in settings similarly characterised by nascent routine surveillance and vital registration systems, and where inequities in access to health care continue to exist. It is in such contexts that good quality data and sustainable approaches to MNCH programme design and evaluation are urgently required to reverse stagnating progress in the achievement of health and development goals [58]. The KHRC surveillance system would have been impossible to run effectively without the dedication and commitment of the KHRC leadership, who provided oversight and strategic direction throughout every phase of its existence, the implementing teams, and the studies' principal investigators. Finally, we dedicate this series to Dr Paul Arthur, Ghanaian physician, epidemiologist and founding director of the KHRC, who with BK (co-author of this introduction) co-designed and implemented the surveillance system in 1995. Their vision ensures that the KHRC, now under the exemplary leadership of Dr Kwaku Poku Asante, continues to be a centre of excellence not only for MNCH studies but for countless essential public health research trials relevant to Africa and other low- and middle-income settings.