Key findings

What is known and what is new about this study?

 • Neonatal resuscitation programmes are being scaled up globally, yet coverage of resuscitative interventions is not routinely tracked. Resuscitation coverage and quality measures have not yet been validated in either population-based surveys or routine facility registers.

 • Challenges exist for measurement of resuscitation coverage indicators:

  ° Numerator: Which action during clinical resuscitation (e.g. stimulation or bag-mask-ventilation [BMV]) is both measurable and valid?

  ° Denominator: What is measurable and useful (e.g. live births plus fresh stillbirths or non-breathing, or non-crying babies)?

 • EN-BIRTH is the first observational study (> 23,000 births) to assess validity of neonatal resuscitation coverage measurement, in both women’s exit-survey reports and routine register records. Using time-stamped data, we analysed coverage and quality of neonatal resuscitation in five hospitals in Bangladesh, Nepal, and Tanzania.

Survey — what did we find and what does it mean?

 • Numerator options: Survey-reported coverage of BMV (0.3–1.9%) markedly under-estimated observed coverage (0.7–7.1%). Survey-reported BMV had low sensitivity (< 21%) and high specificity (> 98%). Newborn stimulation was reported by < 3% of women, far lower than observed coverage (5.2–21.0%).

 • Denominator options: Crying at birth had low “don’t know” responses (< 3%) in exit survey. Compared to observed crying within a minute of birth, sensitivity was high (> 95%); however, specificity was low (< 22%). Survey-reported BMV coverage validity was consistently low for all denominators assessed.

Register — what did we find and what does it mean?

 • Numerator options: Stimulation and BMV were recorded by 4 of 5 labour ward registers, yet accuracy varied between hospitals even with the same register design. BMV sensitivity ranged from 12.4 to 48.4% and specificity was high (> 93%). For stimulation, sensitivity was low at 7.5–40.8% and specificity was more variable (range 66.8–99.5%).

 • Denominators: Livebirths and fresh stillbirths were recorded in all registers. The “non-crying/non-breathing” combined denominator was only in the Bangladesh registers and could not be validated. Register-recorded BMV coverage was consistent whichever denominator was applied.

Gap analysis for quality of care and measurement

 • Most newborns (71.4–94.7%) who did not respond to stimulation did receive BMV, but only 1% within the recommended 1 min after birth.

What next and research gaps?

 • Population-based surveys are not likely to be useful for measuring neonatal resuscitation coverage, given low validity of exit-survey report. Additionally, household surveys would be underpowered since resuscitation is required by a small proportion of babies.

 • Routine hospital registers have potential to track resuscitation coverage indicators, but implementation research is needed to standardise design and processes, including data flow to Health Management Information Systems. BMV is the most accurate numerator; true denominator measurement is complex and requires more research, including assessment of non-crying.

 • Data use with feedback loops and support to frontline healthcare workers could help improve data quality and quality of care. Local clinical quality improvement and special studies are important to reduce quality gaps, particularly for timely BMV, and help meet global goals to end preventable deaths.

Background

Annually, 7–14 million newborns (5–10%) are estimated to require stimulation to initiate breathing at birth and 6 million newborns require bag-mask-ventilation (BMV) [1, 2]. Intrapartum-related events (previously termed “birth asphyxia”) are a leading cause of neonatal mortality, accounting for 11% of under-five deaths [2, 3]. Such intrapartum-related events can cause stillbirths just before birth and neonatal deaths just after. The majority (> 84%) of stillbirths are in low- and middle-income countries (LMICs) and an estimated 50% are intrapartum [4, 5]. Resuscitation is recommended for all babies who do not breathe after birth since live births may be misclassified as stillbirths [6, 7]. Meeting Sustainable Development Goal (SDG) targets by 2030 for ending preventable neonatal deaths requires universal coverage of high quality care around birth for women and their babies, including resuscitation for those who do not breathe at birth [8, 9]. Globally ~ 80% of births are now in facilities [10], with many LMICs scaling up neonatal resuscitation programs [11,12,13]. However, lack of measurement for coverage and quality of neonatal resuscitation impedes tracking of progress [14].

The definition of coverage requires a numerator capturing the intervention (or a component) divided by a target denominator regarding clinical need. A good indicator may not include all of the clinical intervention but should “indicate” well and also not incentivise undesirable practices. Resuscitation coverage measurement has specific challenges. Clinical algorithms have multiple actions that could be used as numerators, notably stimulation of the baby or the action of BMV. Suction is indicated for some babies, but inappropriate suctioning can be harmful and thus should be avoided as a measurement focus [15].

Resuscitation algorithms start at birth for all babies, including fresh stillbirths, with the baby being dried and assessed for crying or breathing. WHO guidance on basic resuscitation focuses on the baby who is not breathing spontaneously or is depressed [16]. “Helping Babies Breathe” (HBB), a global partnership widely used for neonatal resuscitation training in LMICs, uses crying during thorough drying as a rapid and objective assessment, followed by evaluation of breathing (Fig. 1) [17]. In line with WHO guidelines, if the baby is not crying and not breathing, stimulation is provided to improve or initiate breathing, and the airway is cleared if it is blocked with secretions. If the baby is still not breathing after these actions, BMV should begin within 1 min of birth.

Fig. 1

Helping Babies Breathe algorithm decision points to measure neonatal resuscitation coverage

Most data on maternal and newborn health care coverage in LMICs rely on population-based surveys, notably Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS), none of which capture neonatal resuscitation. Routine facility data are currently an underutilised source of neonatal resuscitation coverage data for routine Health Management Information Systems (HMIS). Interventions around the time of birth are typically recorded in one or more facility documents: individual patient records, labour and delivery ward registers, and intervention-specific registers (e.g., neonatal resuscitation register) [18]. Previous research has demonstrated availability of some neonatal resuscitation data in routine labour ward registers [19, 20]. Use of HMIS data aggregated from registers is impeded by concerns regarding data quality [21], but to date no validation studies have been undertaken regarding either survey or routine register data for neonatal resuscitation coverage indicators.

The Every Newborn Action Plan, agreed by all 195 United Nations member states, includes an ambitious measurement improvement roadmap [9] to validate coverage indicator measurement for care and outcomes around the time of birth. The Every Newborn–Birth Indicators Research Tracking in Hospitals (EN-BIRTH) study was undertaken in three countries (Tanzania, Bangladesh, and Nepal) and aimed to assess validity of measurement of selected newborn and maternal indicators for routine facility-based tracking of coverage, quality of care, and outcomes [22].

Objectives

This paper is part of a supplement based on the EN-BIRTH multi-country validation study, ‘Informing measurement of coverage and quality of maternal and newborn care’, and focuses on neonatal resuscitation measurement with four objectives:

  1. Assess NUMERATOR accuracy/validity for neonatal resuscitation coverage indicator (stimulation and BMV) measurement by exit survey of women’s report and routine labour ward registers compared to direct observation (gold standard).

  2. Compare DENOMINATOR options for resuscitation coverage measurement: including all births (except macerated stillbirths), non-crying babies and non-breathing babies.

  3. Analyse GAPS in coverage, quality of care and measurement in relation to recommendations, notably timely initiation of BMV.

  4. Evaluate BARRIERS AND ENABLERS to routine labour ward register recording for resuscitation regarding register design, filling, and use.

Methods

EN-BIRTH was an observational, mixed methods study comparing data from clinical observers (gold standard) to survey-reported and register-recorded coverage of perinatal care and outcomes (Fig. 2). Detailed information regarding the research protocol, methods, and analysis has been published separately [22, 23]. Data were collected between July 2017 and July 2018 in five public comprehensive emergency obstetric and newborn care (CEmONC) hospitals in three high mortality burden countries: Maternal and Child Health Training Institute, Azimpur and Kushtia General Hospital in Bangladesh (BD); Pokhara Academy of Health Sciences in Nepal (NP); and Temeke Regional Hospital and Muhimbili National Referral Hospital in Tanzania (TZ) (Additional file 1). Baseline health facility assessments established that all five hospitals had capacity to resuscitate newborns. Resuscitation guidelines used in all five hospitals were based on HBB [17]. Participants were consenting women admitted in labour for care around birth. Exclusion criteria included imminent birth and no fetal heart beat heard on admission. Clinically trained researchers observed participants 24 h per day and recorded data on the baby’s condition at birth (e.g., crying/breathing) and care (e.g., stimulation and BMV). The observers received refresher training in HBB as part of their clinical observation training before the study started [22]. Data were collected with a custom-built Android tablet-based application, including timestamps for observations. Research data collectors interviewed women after discharge, before exit from hospital, regarding their baby’s condition after birth and care received. Resuscitation and outcome data were extracted from routine hospital registers. Metadata definitions of selected indicator options for validity testing are shown in Additional file 2. To determine the reliability of the observational data (gold standard), supervisors duplicated observation (and register data extraction) for a 5% subset of cases to calculate Cohen’s Kappa coefficients. Health workers and data collectors were interviewed about barriers and enablers to use of routine registers in recording of perinatal care and outcomes.
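The inter-rater check described above can be illustrated with a minimal sketch of Cohen’s Kappa for two raters. This is not the study’s code (the analysis was run in R); the function name and the ten example observations are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters who coded the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same, non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (list(rater_a).count(c) / n) * (list(rater_b).count(c) / n)
        for c in categories
    )
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical example: 1 = BMV recorded, 0 = BMV not recorded.
observer = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
supervisor = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohens_kappa(observer, supervisor)
```

Kappa is 1 for perfect agreement, 0 for agreement no better than chance, and negative when agreement is worse than chance, which is how a value such as − 0.035 can arise.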

Fig. 2

Neonatal resuscitation validation design, EN-BIRTH study. EN-BIRTH validation design comparing observation gold standard with register-recorded and women’s report on exit survey; EN-BIRTH data collection tools (observation checklist, register data extraction tool and exit survey tool) are published separately [22]

Results are reported in accordance with STROBE Statement checklists for cross-sectional studies (Additional file 3). Quantitative analysis was undertaken using R version 3.6.1 [24].

Objective 1: Numerator for indicator measurement validation

Livebirths and fresh stillbirths (hereafter referred to as “newborns”) were considered to require initial assessment for resuscitation, whilst macerated stillbirths were excluded. We explored accuracy of two possible numerator options, N1) stimulation and N2) BMV, in both survey and register data compared to observation data.

In exit surveys, where a woman reported her newborn had difficulty breathing at birth, she was asked about resuscitation practices. In line with common survey indicator reporting, where women replied, “don’t know” we considered the survey-reported stimulation/BMV response as “no”.
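The recoding rule above can be sketched as follows. The response strings and the function name are hypothetical; the only element taken from the text is that “don’t know” is counted as “no”.

```python
def recode_survey_response(response: str) -> bool:
    """Return True only for an explicit 'yes'.

    Both 'no' and "don't know" are coded as not reported.
    """
    return response.strip().lower() == "yes"


# Hypothetical responses to "Was your baby given bag-mask-ventilation?"
responses = ["Yes", "no", "don't know", "yes"]
coded = [recode_survey_response(r) for r in responses]
reported_coverage = 100 * sum(coded) / len(coded)
```

Under this rule, a woman whose baby did receive BMV but who answers “don’t know” counts as a false negative, so the rule lowers sensitivity and cannot raise it.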

We compared observed coverage (gold standard) of stimulation and BMV to survey-reported and register-recorded coverage. We calculated absolute differences between measured coverage (survey or register) and observed coverage to understand under- or over-estimation at the population level. Using two-way tables, we calculated individual-level validity statistics: sensitivity, specificity, and percent agreement ((true positive + true negative)/total) of register-recorded and survey-reported BMV coverage against observed coverage. Area under the curve, inflation factor, positive predictive value, and negative predictive value were also calculated. All calculations were stratified by hospital with 95% confidence intervals. Pooled results for validity analyses were calculated using random effects meta-analysis, presented with I², τ², and heterogeneity statistic (Q).
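As an illustration of the individual-level statistics named above, the sketch below derives them from a single two-way table. It is not the study’s code: the counts are hypothetical, and two definitions are assumptions made here rather than taken from the text, namely the inflation factor as measured positives divided by observed positives, and the area under the curve for a binary measure as the mean of sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass
class TwoWayTable:
    tp: int  # measured yes, observed yes
    fp: int  # measured yes, observed no
    fn: int  # measured no, observed yes
    tn: int  # measured no, observed no


def validity_stats(t: TwoWayTable) -> dict:
    """Validity statistics of a measured indicator against observation."""
    total = t.tp + t.fp + t.fn + t.tn
    observed_yes = t.tp + t.fn
    observed_no = t.fp + t.tn
    measured_yes = t.tp + t.fp
    measured_no = t.fn + t.tn
    sensitivity = t.tp / observed_yes
    specificity = t.tn / observed_no
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "percent_agreement": (t.tp + t.tn) / total,
        "ppv": t.tp / measured_yes,
        "npv": t.tn / measured_no,
        # Assumption: AUC of a single binary measure.
        "auc": (sensitivity + specificity) / 2,
        # Assumption: measured prevalence relative to observed prevalence.
        "inflation_factor": measured_yes / observed_yes,
    }


# Hypothetical counts for one hospital (not EN-BIRTH data).
example = TwoWayTable(tp=20, fp=10, fn=80, tn=890)
stats = validity_stats(example)
```

With these hypothetical counts, sensitivity is 20% and specificity about 99%, yet percent agreement is 91%. When an intervention is rare, agreement is dominated by true negatives, which is why sensitivity and specificity are reported alongside it. The sketch does not guard against empty margins.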

Objective 2: Denominator comparisons

We explored neonatal resuscitation coverage measurement using three possible denominator options: D1) all newborns (total births excluding macerated stillbirths), D2) newborns not crying within the first minute after birth and D3) newborns not breathing within the first minute after birth.

We compared these denominators using validity ratios (measured:observed coverage), similar to verification ratios in data quality review methods [25], for survey-reported and register-recorded BMV coverage. Validity ratios > 1 show overestimation of survey-reported or register-recorded coverage compared to observed, while ratios < 1 show underestimation. Results were heat-mapped using standard data quality review cut-offs (over/underestimate by 0–5%, 6–10%, 11–15%, 16–20%, and > 20%).
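The validity ratio and its heat-map band can be sketched as below. The function names are hypothetical, and the handling of values that fall between the published bands (for example 5.4%) is an assumption: each band is treated here as closed at its upper limit after rounding the deviation to one decimal place.

```python
def validity_ratio(measured_pct: float, observed_pct: float) -> float:
    """Measured (survey or register) coverage divided by observed coverage."""
    return measured_pct / observed_pct


def deviation_band(ratio: float) -> tuple:
    """Return (direction, band) for a validity ratio."""
    if ratio > 1:
        direction = "overestimate"
    elif ratio < 1:
        direction = "underestimate"
    else:
        direction = "exact"
    # Percentage deviation from perfect agreement (ratio of 1).
    deviation = round(abs(ratio - 1.0) * 100, 1)
    # Assumption: bands are inclusive at their upper limit.
    if deviation <= 5:
        band = "0-5%"
    elif deviation <= 10:
        band = "6-10%"
    elif deviation <= 15:
        band = "11-15%"
    elif deviation <= 20:
        band = "16-20%"
    else:
        band = ">20%"
    return direction, band


# Hypothetical coverage values: 3.5% measured against 5.0% observed.
ratio = validity_ratio(3.5, 5.0)
result = deviation_band(ratio)
```

A ratio of 0.92 falls in the 6–10% underestimate band, while ratios of 0.70 and 1.22 both fall in the > 20% band, in opposite directions.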

Objective 3: Gap analysis for coverage and quality of care, and measurement

We examined gaps in coverage and timely neonatal resuscitation amongst a subset of newborns with a clinical need for resuscitation within 1 min of birth. These newborns were not breathing in the first minute after birth and did not respond to stimulation (or suction when performed). For this (A) eligible population subset, we analysed four gaps for neonatal resuscitation: (B) coverage gap for BMV, (C) quality of care gap between any BMV coverage, and timely coverage (within 1 min), (D) measurement gap for survey-report, and (E) measurement gap for register-record.
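One way to read the four gaps is as differences between percentages calculated on the same eligible population (A). The sketch below follows that reading, which is an interpretation rather than the study’s stated formulae; all counts are hypothetical.

```python
def gap_analysis(eligible: int, any_bmv: int, timely_bmv: int,
                 survey_bmv: int, register_bmv: int) -> dict:
    """Gaps in percentage points among newborns with clinical need (A)."""
    def pct(count: int) -> float:
        return 100 * count / eligible

    observed_coverage = pct(any_bmv)
    return {
        "observed_coverage": observed_coverage,
        # (B) eligible newborns who received no BMV at all
        "coverage_gap": 100 - observed_coverage,
        # (C) received BMV, but not within 1 min of birth
        "quality_gap": observed_coverage - pct(timely_bmv),
        # (D) observed BMV not reported by women in exit survey
        "survey_measurement_gap": observed_coverage - pct(survey_bmv),
        # (E) observed BMV not recorded in the register
        "register_measurement_gap": observed_coverage - pct(register_bmv),
    }


# Hypothetical counts for one hospital (not EN-BIRTH data).
gaps = gap_analysis(eligible=40, any_bmv=34, timely_bmv=1,
                    survey_bmv=8, register_bmv=16)
```

In this hypothetical case 85% of eligible newborns received BMV, so the coverage gap is 15 percentage points, while the quality gap for timeliness is 82.5 percentage points.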

Objective 4: Barriers and enablers to routine recording

Qualitative data collection tools for focus group discussions and in-depth interviews were informed by the Performance of Routine Information System Management Series (PRISM) conceptual framework [26]. Detailed qualitative methods and overall results are available in an associated paper [27]. A purposive sample of nurses, midwives, doctors, and EN-BIRTH data collectors from each of the five hospitals participated. Analysis identified themes based on three domains: register design, filling, and use [26]. In addition, respondents were asked about the order in which resuscitation is documented in registers, patient notes, and other documents, as well as how long after resuscitation the documentation is entered in the labour ward register. This paper presents emerging themes regarding recording of neonatal resuscitation.

Results

Among 23,811 eligible women across the five participating hospitals, 23,724 consented to participate (Fig. 3). Among 23,471 observed births, 22,752 were live births (22,522) or fresh stillbirths (230). Data extraction was completed for 21,101 newborns (92.7%), and exit surveys were conducted with 20,245 women (90.7%). Reasons for women’s non-participation in exit survey included refusals and exit from facility prior to research team approach. Table 1 shows characteristics of newborns in the EN-BIRTH study sample, by hospital. Overall, 98.7% were alive at discharge from labour and delivery, 1% were fresh stillbirths, and less than 1% were born alive but died on the labour ward. Nearly one-third of births (29.5%) were by caesarean section, highest (73.6%) in Azimpur BD.

Fig. 3

Flow diagram of cases for neonatal resuscitation analysis, EN-BIRTH study (n = 22,752)

Table 1 Characteristics of babies and women, EN-BIRTH study (n = 22,752 births)

Among 22,752 newborns (denominator option D1), 3688 (16.2%) were stimulated (numerator option N1) and 998 (4.4%) received BMV (numerator option N2) (Fig. 4). Within the first minute after birth, 5330 were observed as non-crying (denominator option D2), and among these 3860 were also observed as non-breathing (denominator option D3).

Fig. 4

Neonatal resuscitation numerators and denominators, EN-BIRTH study (individually weighted, observation data, n = 22,752)
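The coverage percentages above follow directly from the reported counts, as this short check shows. The variable names are illustrative, and the shares for D2 and D3 are simple unweighted proportions derived here from the reported counts rather than figures quoted in the text.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


newborns = 22752       # D1: live births plus fresh stillbirths
stimulated = 3688      # N1
bmv = 998              # N2
non_crying = 5330      # D2: not crying within the first minute
non_breathing = 3860   # D3: non-crying and also not breathing

stimulation_coverage = pct(stimulated, newborns)     # 16.2
bmv_coverage = pct(bmv, newborns)                    # 4.4
non_crying_share = pct(non_crying, newborns)         # 23.4
non_breathing_share = pct(non_breathing, newborns)   # 17.0
```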

Assessing biases in the data

Duplicate case observation inter-rater reliability showed substantial agreement (> 0.71) for resuscitation elements (Additional file 4). Register extraction agreement was lower and varied greatly between sites, ranging from − 0.035 to 0.939.

Objective 1: Numerator for indicator measurement validation

Numerator option 1: stimulation

Observed coverage of stimulation ranged from 5.2% in Azimpur BD to 21.0% in Muhimbili TZ. Survey-report gave large underestimates for stimulation with survey-reported coverage ranging from 0.6–2.2%. Sensitivity was very low (< 14%) while specificity was high (> 98%) (Table 2; additional validity details in Additional file 5 and Additional file 6).

Table 2 Individual-level validation in exit surveys and registers for stimulation at birth indicator, EN-BIRTH study (n = 22,752)

Register-recorded coverage (0.8–34.8%) underestimated coverage in the Bangladesh hospitals and overestimated coverage in the Tanzania hospitals (Fig. 5). While sensitivity was low (< 41%), specificity was high across most sites (66.8–99.5%).

Fig. 5

Hospital register design and completeness for stimulation and bag-mask-ventilation, EN-BIRTH study (n = 22,752)

Numerator option 2: BMV

Observed BMV ranged from 0.7% in Azimpur BD to 7.1% in Muhimbili TZ. Survey-reported coverage (0.3–1.9%) underestimated observed coverage (Fig. 6). Sensitivity was < 21% while specificity was high across all hospitals (> 98%). Register-recorded coverage (0.9–7.2%) was closer to observed coverage. While sensitivity ranged from 12.4–48.4%, specificity was > 93% across all hospitals (Table 3; additional validity details in Additional files 7 and 8).

Fig. 6

Coverage (and 95% CI) of bag-mask-ventilation measured by observation, register, and exit survey, EN-BIRTH study (n = 22,752). *Random effects meta-analysis; BD = Bangladesh, NP = Nepal, TZ = Tanzania, stim. = stimulation, suct. = suction, BMV = bag-mask-ventilation, FSB = fresh stillbirth; Full denominator details presented in Additional file 14

Table 3 Individual-level validation in registers and exit surveys of bag-mask-ventilation indicator, EN-BIRTH study (n = 22,752)

Objective 2: Denominator for indicator measurement comparison

Denominator option 1: all newborns (live births and fresh stillbirths)

The validation of birth outcomes is reported separately [28]. Survey validity ratios for BMV coverage measurement using this all newborn denominator performed poorly (0.11–0.71) and register validity ratios were moderate to poor (0.70–1.22) (Fig. 7).

Fig. 7

Validity ratios for exit survey-reported and register-recorded coverage of bag-mask-ventilation, EN-BIRTH study (n = 22,752). Full denominator details presented in Additional file 14

Denominator option 2: non-crying newborns

Survey-reported prevalence of crying at birth (90.5–95.8%) was higher than observed prevalence of crying within 1 min of birth (72.0–86.7%) with very few “don’t know” responses (< 3%). While sensitivity was high (> 95%) specificity was low (< 22%) (Table 4; additional validity details in Additional files 9 and 10).

Table 4 Individual-level validation in exit survey of crying at birth indicator, EN-BIRTH study (n = 22,752 births)

Survey validity ratios for BMV using this non-crying denominator performed poorly (0.13–0.58); sensitivity was low (< 16%) while specificity was high (> 98%). Register validity ratios ranged from poor to very good (0.40–0.92); sensitivity was low (11.1–46.8%) while specificity was high (> 91%).

Denominator option 3: non-breathing newborns

Prevalence of not breathing within the first minute ranged from 11.7% in Azimpur BD to 21.0% in Pokhara NP. The survey validity ratio for BMV coverage measurement using this non-breathing denominator performed poorly (0.14–0.49). Sensitivity ranged from 0 to 20.8% while specificity was > 97% across hospitals. Register validity ratios were better, but still classified as poor (0.45–0.78). While sensitivity ranged from 11.1–51.3%, specificity was high across all hospitals (> 92%).

Objective 3: Coverage and quality gap analysis

Among the subset used as a proxy for true clinical need [newborns who did not cry/breathe in the first minute with no response to stimulation (or suction if needed)], most received BMV, ranging from 71.4% in Azimpur BD to 94.7% in Pokhara NP (Fig. 8), but timely coverage was very low (1%). Survey-reported coverage (< 28%) substantially underestimated true coverage. Register-recorded coverage also underestimated true coverage and ranged widely from 0.0% in Kushtia BD to 52.9% in Temeke TZ.

Fig. 8

Gap analysis for coverage and quality among newborns non-crying/not responding to stimulation/suction, EN-BIRTH study (n = 200). BD = Bangladesh, NP = Nepal, TZ = Tanzania; BMV = Bag-mask-ventilation; Full denominator details presented in Additional file 14

Among newborns receiving any BMV on the labour ward, the proportion receiving the first ventilation breath within 1 min of birth ranged from 0.2% in Temeke TZ to 8.0% in Pokhara NP. Across the three denominators explored, time to initiation of BMV was similar (Fig. 9).

Fig. 9

Time to bag-mask-ventilation by denominator, EN-BIRTH study (n = 991). BD = Bangladesh, NP = Nepal, TZ = Tanzania; D1: n = 991, D2: n = 672, D3: n = 454, No cry/breath/response: n = 142

Objective 4: Barriers and enablers to routine recording

Register design

Labour ward registers varied in design between the five hospitals (Fig. 5). Bangladesh labour ward registers had three specific columns for recording neonatal resuscitation: (i) “baby did not breathe/cry after birth” (tick box for ‘yes’ and tick box for ‘no’), (ii) “stimulation” (instructions to tick for ‘yes’ and leave blank for ‘no’) and (iii) “BMV” (instructions to tick for ‘yes’ and leave blank for ‘no’). The Tanzanian register captured resuscitation steps by numerical code in a column headed “Helping Babies Breathe” (suction = 1, stimulation = 2, BMV = 3) or “no”, with blanks treated as not recorded. There was no specific column in the Nepal register for resuscitation.

Documentation practices in registers

Resuscitation practices were recorded in varying order into multiple documents (Additional file 11). Reported time between care and documentation ranged from 2.5 min in Pokhara NP to 22.5 min in Temeke TZ.

Register design

Register design largely acted as a barrier to recording in Pokhara NP:

“Drying, stimulation, and bag-mask ventilation are written [in the patient’s chart], but in the main register it is not present… we do not have routine care of the newborn in the register, only in the patient’s chart.”

-Data collector, Pokhara NP

In the other hospitals health workers duplicated documentation in registers with multiple other documents (e.g. partographs, patient case notes) (Additional file 12).

Register filling

Aspects of register filling acted as both barriers and enablers. Training and support from senior nurses enabled improved accuracy of documentation, while limited time acted as a barrier. Health workers across the hospitals discussed the lack of time to document, particularly for complicated cases and resuscitation when they are focused on delivering care:

“Just after finishing [resuscitation], you must keep everything clear… time is a problem… you must estimate, there are times it is difficult and other times you ask the [senior nurse]… because in an emergency you all work together; thus, you remind each other.”

– Health worker, Temeke TZ

Health workers in Pokhara NP received specific support for documentation in neonatal resuscitation:

“We have received training on HBB and we were trained for documentation in that. We were doing documentation before, but we received direction for improving it.”

-Health worker, Pokhara NP

However, while health workers in Pokhara NP record resuscitation in other documents, it is not recorded in routine hospital registers.

Register use

While improved patient care and use of data by managers motivated documentation and was affirmation of the care health workers were giving, not all respondents could identify the use for resuscitation data in routine registers.

Feedback was lacking where documentation didn’t line up with clinical need:

“Sometimes when you look at the [APGAR] score of the baby, maybe it’s 5, you wonder why they didn’t perform resuscitation, there’s a possibility they [did] but they haven’t documented that… There’s no one to follow up on that… The person responsible for data comes and copies what’s written in the register, be it a low score… but they never ask them why they didn’t perform resuscitation if the baby had a low score”

– Health worker, Temeke TZ

Conversely, in Bangladesh, health workers were not sure what happened with resuscitation data:

“Resuscitation is an emergency subject. There remains a referral slip while resuscitating a baby on emergency that indicates the baby went to operating theatre... We write down the procedures of resuscitation in that slip... I am not sure whether this actually goes in the monthly report or not.”

-Health worker, Azimpur BD

Data culture

Data culture was both an enabler and barrier to routine documentation of resuscitation. It acted as a barrier where minor interventions were not seen as worth recording:

“Minor things like suctioning were not recorded and they only documented on a resuscitation case that took more than ten minutes.”

-Data collector, Muhimbili TZ

However, the importance of documentation was noted for organizational and personal protection:

“For instance, if a child has been born but unfortunately, let us say she had a problem, you have resuscitated her, but you did not document… and the mother/parent has become very angry and start complaining, or the whole management has become angry with you why the child had this situation, but you did not record what you have done ... You will not defend yourself, but documentation defends you.”

-Health worker, Muhimbili TZ

Discussion

The EN-BIRTH study’s large sample size (22,752 live births and fresh stillbirths) allowed the first validity assessment of measurement for neonatal resuscitation coverage in routine hospital registers and surveys, against a gold standard of clinical observation. We found that survey report poorly captured resuscitation indicators. Routine labour ward registers performed better, but variably, and have potential, especially with data quality improvement.

Measuring coverage by survey report was challenging, which is not surprising. We found most women who reported their baby had trouble breathing after birth did not know if their baby had been stimulated or received BMV. We recommend that questions on resuscitation need or BMV should not be added to existing population-based surveys. Furthermore, the sample size required for this relatively low-incidence practice would be challenging even in DHS surveys with large, nationally representative samples [29].

The numerator for neonatal resuscitation is key. Stimulation by rubbing the baby’s back is easily conflated with the similar action of drying every newborn baby and was rarely recognized by mothers (< 3% in survey report). Suction is only necessary if the airway is blocked, and a measurement focus on suction may unintentionally encourage this potentially harmful practice, which can cause bradycardia. BMV is the most distinguishable option for a clear subset of non-breathing babies and had higher accuracy than stimulation. Though BMV was underestimated in surveys, its accuracy by survey-report was still better than that of stimulation. Additionally, BMV is a more suitable intervention for which to assess quality and links to health facility assessments, where standard questions include presence and recent use of neonatal bag and masks.

Health facilities are where ~ 80% of women now deliver [10], providing an opportunity to track neonatal resuscitation coverage through routine facility data using BMV as the numerator. Four of the five routine registers assessed were already capturing BMV count data. At the population level, register-recorded coverage of BMV was within 2.1% of observed coverage, although individual-level validation metrics suggested low sensitivity. Selective register design is important in capturing what is needed while avoiding an excessive documentation burden. In Tanzania, the register column labelled “HBB” aligns measurement with scale-up programming. The design in Bangladesh instructed health workers to leave the column blank when BMV is not done; thus, calculating completeness and differentiating between truly ‘not done’ and register ‘incomplete’ was impossible. Where register instructions in Tanzania state to write “no” if BMV was not done, completeness was moderate to high (54.6–91.0%). Although data collectors rarely indicated data were not readable (< 0.5%), there were low inter-rater kappa results for register-recorded BMV in some sites [23]. Because extraction/aggregation is the first step for data flowing to higher levels in the health system, more research is needed to improve this. Capturing reliable data depends on user-friendly, appropriate recording systems; however, accuracy varied even within the same country using identical register design, highlighting the importance of information culture and supervision. Our qualitative findings suggest differences in understanding of importance and utility of resuscitation data at different hospitals.

Denominators are notably challenging for interventions such as resuscitation which are indicated based on clinical need for only a subset of babies [30]. Current WHO guidance recommends number of live births in a facility, with a footnote that this is pragmatic whilst ongoing work to test different denominators, including EN-BIRTH, is completed [31]. Here we have included live births plus fresh stillbirths, for whom resuscitation is recommended. Any newborn without maceration or major malformations, even if they appear completely lifeless, should be given the chance of resuscitation [32]. The reduction in stillbirth rates associated with resuscitation training [33,34,35] is likely a result of reduced misclassification of live births as stillbirths.

Measuring the true denominator for clinical need for resuscitation is complex. Newborns require BMV if non-breathing/gasping after initial drying/stimulation or if they suffer subsequent apnoea at any time. Breathing well may be difficult to measure as the concept excludes gasping, fast breathing and grunting. It is critical to emphasise these breathing patterns during clinical training as BMV is indicated for some (e.g. gasping) but not all of these breathing patterns. EN-BIRTH observers collected breathing or not breathing as a binary variable because formative research suggested other breathing patterns were not feasible to capture. In our study, 2/5 registers captured non-breathing but as a composite non-crying and non-breathing indicator. Consequently, accuracy of this denominator in registers could not be assessed.

Non-crying has potential utility as a denominator as it is simple for health workers to capture and is part of the process in assessing need for resuscitation. Additionally, crying at birth is a single event and thus more straightforward to record as opposed to breathing which is a process and might change over time, particularly for preterm babies. While not all non-crying babies will require further steps of resuscitation, almost all babies who do need BMV are non-crying. One study has shown babies breathing but not crying after birth have an increased risk of death [36]. We found the observed coverage of BMV ranged from 3.6–17.8% among babies not crying in the first minute. Further research is required to assess if non-crying is useful and benchmarking is feasible. However, as considerations turn towards respectful newborn care and minimal handling, further research is needed related to newborn physiological responses after birth and what is appropriate to measure.

Apgar scores are captured in all the routine hospital registers in our study, including in Pokhara NP, which captured no resuscitation interventions. Apgar scores do not capture interventions around the time of birth, but rather describe a newborn’s physical condition and response to any interventions at 1 and 5 min after birth, and are already known to have limitations, notably low inter-rater reliability. The one-minute Apgar score, which includes heart rate, does not fit well with current resuscitation algorithms, which recommend checking the baby’s heart rate after a minute of ventilation (2 min after birth). As such, the Apgar score is not a useful denominator for neonatal resuscitation. Because it is usually also written in individual patient records, we suggest exploring replacing this column in routine labour ward registers with data elements that can be used for coverage measurement, e.g. not crying after birth.

Timely resuscitation is essential and even small delays in starting resuscitation can contribute to death or disability [37]. Our assessment of quality of care focused on timeliness, defined as the start of BMV within the first minute after birth. While coverage of BMV was high (85%), only 1% of newborns received the first ventilation within 1 min of birth. When all newborns are used as the denominator, not all will require BMV within 1 min of birth, as many were crying or breathing at birth and only subsequently became distressed or apnoeic. A coverage gap for BMV of fresh stillbirths is to be expected, as it is not appropriate to resuscitate babies who are diagnosed before birth to have died in utero, e.g. confirmed by ultrasound. Measuring timing of BMV is clearly not feasible in surveys and very unlikely to be possible in routine labour ward registers. Given this major quality gap regarding timing of resuscitation initiation, local audit and special studies are important to drive quality improvement.
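As an illustration of how a timeliness measure could be derived from time-stamped observation data in an audit or special study, the sketch below computes the percentage of ventilated babies whose first ventilation started within 60 seconds of birth. The function names, the record layout and the timestamps are assumptions made for this example and do not reproduce the study's data or analysis code.

```python
from datetime import datetime


def seconds_to_first_ventilation(birth_time, bmv_start_time):
    """Delay in seconds between birth and the first ventilation."""
    return (bmv_start_time - birth_time).total_seconds()


def pct_bmv_within(records, threshold_s=60):
    """Percentage of ventilated babies whose BMV started within threshold_s of birth.

    records: iterable of (birth_time, bmv_start_time) pairs, one per baby who
    received BMV (hypothetical layout).
    """
    delays = [seconds_to_first_ventilation(b, v) for b, v in records]
    if not delays:
        return None  # undefined when no baby received BMV
    # Negative delays indicate a recording error and are not counted as timely.
    timely = sum(0 <= d <= threshold_s for d in delays)
    return 100.0 * timely / len(delays)


# Invented timestamps for four ventilated babies (delays of 45, 90, 180 and 300 s).
records = [
    (datetime(2018, 1, 1, 10, 0, 0), datetime(2018, 1, 1, 10, 0, 45)),
    (datetime(2018, 1, 1, 11, 0, 0), datetime(2018, 1, 1, 11, 1, 30)),
    (datetime(2018, 1, 1, 12, 0, 0), datetime(2018, 1, 1, 12, 3, 0)),
    (datetime(2018, 1, 1, 13, 0, 0), datetime(2018, 1, 1, 13, 5, 0)),
]

print(pct_bmv_within(records))
```

The denominator here is restricted to babies who received BMV; a different choice, such as all babies not breathing after stimulation, would require additional fields.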

Strengths and limitations

Strengths of this study include the multi-site and multi-country design and large sample size enabling the capture of multiple decision points on resuscitation algorithms. We evaluated how several possible numerators/denominators performed using clinical observation as a gold standard. We assessed possible bias in the observation data with double observation for a subset of cases. Overall, BMV had good inter-observer agreement. Whilst clinically trained observers provided gold standard data on coverage of interventions, subjectivity remains possible e.g. differentiating stimulation from immediate drying. To limit this, the tablet application was designed to capture stimulation in a specific neonatal resuscitation section separate from the immediate care practices, such as drying. The low coverage of stimulation amongst non-crying/breathing newborns (34–38%) may reflect poor quality of care or difficulty in measurement for stimulation by an observer.
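For readers less familiar with validity metrics, the following sketch shows how sensitivity and specificity of a survey report or register record can be calculated against clinical observation treated as the gold standard. The `validity` function and the toy counts are hypothetical and are not taken from the study data.

```python
def validity(observed, reported):
    """Sensitivity and specificity (%) of reported values against observed values.

    observed: list of booleans from the gold standard (e.g. observer saw BMV)
    reported: list of booleans from the source under test (survey or register)
    """
    if len(observed) != len(reported):
        raise ValueError("observed and reported must have the same length")
    pairs = list(zip(observed, reported))
    tp = sum(o and r for o, r in pairs)           # observed yes, reported yes
    fn = sum(o and not r for o, r in pairs)       # observed yes, reported no
    tn = sum(not o and not r for o, r in pairs)   # observed no, reported no
    fp = sum(not o and r for o, r in pairs)       # observed no, reported yes
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else None
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Invented example: 10 babies observed to receive BMV, 90 observed not to.
observed = [True] * 10 + [False] * 90
# Only 2 of the 10 true cases are reported; 1 of the 90 non-cases is over-reported.
reported = [True] * 2 + [False] * 8 + [True] * 1 + [False] * 89

sens, spec = validity(observed, reported)
print(f"sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

In this toy example a rare intervention yields low sensitivity alongside high specificity, because a few missed reports among a small number of true cases weigh heavily on sensitivity.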

Some other limitations should be noted. Survey-reported coverage was assessed in an exit survey, closer in time to the events in question than standard population-based surveys with 2–5-year reference periods. In the survey, only women who answered ‘yes’ to a question asking whether their baby had difficulty breathing at birth were asked further questions about resuscitation; thus, some who may have recognised newborn stimulation were not counted towards survey-reported coverage. Additionally, the EN-BIRTH study sample may be healthier than the average in these facilities, as women too sick to consent and women with no fetal heart beat heard at admission, among others, were excluded from the study. As the study sites were CEmONC hospitals, case mix, coverage, and measurement may differ at lower-level facilities.

Importantly, the true denominator of babies in need of BMV will not be captured by facility measurement, especially babies of disadvantaged women, who are more likely to give birth at home in LMICs. However, babies born at home are less likely to receive BMV in most LMICs, so facility measurement is likely to capture nearly all of the numerator in terms of newborns receiving BMV. Hence, approaches such as those used in immunisation when the denominator is missing may help to estimate coverage for the whole population in contexts with many home births.

Conclusion

Neonatal resuscitation is a high-impact, evidence-based intervention for a leading cause of under-five mortality, preventable stillbirth and disability. Yet the current lack of coverage measurement is impeding global tracking of scale-up in high-burden countries. We found bag-mask-ventilation was the most reliable numerator. Measuring the true denominator for clinical need is complex and further denominator research is required, including evaluation of non-crying as a potential alternative and attention to respectful care considerations. Based on these results, we do not recommend tracking this indicator through population-based surveys. Register measurement of neonatal resuscitation has potential and, if standardised and included in HMIS, could aid in tracking progress towards global targets across countries. An appropriate resuscitation denominator could potentially replace the Apgar score, which was recorded as a column in all five registers. Implementation research is needed on how to improve register data quality. Measuring and addressing quality of care gaps, notably for timely provision of resuscitation in the first minute, is crucial for programme improvement and impact, but is unlikely to be feasible in routine systems and will require audits and special studies. Improving data is possible and necessary, both to inform progress towards global goals and to meet every family’s aspiration that their baby will survive and thrive.