Background

Several studies have shown that mortality of high-risk infants can be reduced if these infants are treated in highly equipped neonatal intensive or intermediate care units [1]. Therefore, different levels of care have been introduced for the treatment of pregnant women and their newborns according to their medical condition. For each level, certain requirements in terms of infrastructure, staffing, equipment and qualifications are defined. If a centre does not fulfil these requirements, it is usually not allowed to provide specialized care [2, 3]. Since the experience of the care team is also likely to be an advantage, it could be assumed that infants benefit from hospitals with a high annual birth volume. That assumption is supported by our recent systematic review, which showed improved maternal and neonatal outcomes for very low birth weight infants in centers with higher birth volumes in high-risk births [4].

Other important risk factors for pregnancy and birth complications are higher maternal age, comorbidities (e.g. placenta praevia, pre-existing or gestational diabetes) and smoking. These factors are likely to increase the risk of maternal or neonatal adverse events [5,6,7,8,9,10]. Appropriate management of these risks is still being discussed [11,12,13,14,15]. In order to better study the impact of different interventions on subsequent outcomes, a homogeneous definition of birth outcomes is needed, and core outcome sets (COS) are currently being developed [5, 6]. COS are multilaterally consented and standardized sets of outcomes which should be reported in clinical trials to guarantee comparability. In recent years, COS have been increasingly developed and registered for perinatal and maternal care [16], for example for gestational diabetes [17], preterm birth [18], maternity care [19], neonatal medicine [20] or pregnancy and childbirth [21]. However, there are currently no COS available to study the impact of birth volume on the outcome of low risk pregnancies. For this reason, and because birth complications are difficult to predict in low risk pregnancies, it remains unknown whether women with a low risk pregnancy could also benefit from care in hospitals with higher birth volumes.

The aim of this systematic review was to summarize and critically appraise the impact of hospital case volume on mortality and morbidity in low risk birth cohorts.

Methods

We conducted this systematic review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Checklist [22] and registered the review protocol (CRD42018095289) in the International Prospective Register of Systematic Reviews [23]. The original search strategy (Additional file 1) and review were designed to identify studies on the effects of either perinatal regionalization or hospital birth volume on infant and maternal outcomes. Here we report the results on volume-outcome relationships.

Eligibility criteria, information sources, search strategy

Inclusion and exclusion criteria (Table 1) addressed population, intervention, comparison, outcome and study type (PICOS). Interventions/ exposures included volume effect estimates on mortality as the primary outcome and, secondarily, on caesarean sections, readmissions, birth complications and developmental delays (outcome) in all births or a pre-defined low risk birth cohort (population). In order to ensure comparability and current status of obstetric care, we included observational or interventional studies (study type) from countries with neonatal mortality rates below 5 per 1000 births (UN Child mortality report) that were published in English or German after 01/01/2000 [24].

Table 1 PICO-Scheme

Study selection

We systematically searched Medline and EMBASE on 18/04/2018 and on 26/02/2020. The search strategy combined free text words and database-specific subject headings (Additional file 1) using the Ovid interface. We used Endnote X7 to create the literature database and to remove duplicates. Two authors (FW, AB) independently screened titles/ abstracts and full texts for eligibility. Additionally, an expert panel (MR, JM, Rainer Rossi) highlighted missing relevant papers. After full-text screening, we conducted a hand search including forward (citing literature) and backward (cited literature) screening of included studies. Discrepancies during screening, extraction or quality assessment were resolved by consulting another reviewer (JS). To interpret inter-rater reliability, we applied the prevalence-adjusted bias-adjusted kappa (PABAK). In contrast to the conventional kappa value, PABAK accounts for the high class imbalance between included and excluded records [25].
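The difference between the conventional kappa and PABAK can be illustrated with a minimal sketch. It assumes two raters making binary include/exclude decisions, for which PABAK reduces to 2 × observed agreement − 1. The function name `agreement_stats` and the screening decisions are hypothetical and are not taken from this review.

```python
def agreement_stats(rater_a, rater_b):
    """Return observed agreement, Cohen's kappa and PABAK for two binary raters.

    Ratings are 1 (include) or 0 (exclude). Hypothetical helper for illustration.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of records on which both raters agree
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement expected from each rater's marginal inclusion rate
    p_a_yes = sum(rater_a) / n
    p_b_yes = sum(rater_b) / n
    p_e = p_a_yes * p_b_yes + (1 - p_a_yes) * (1 - p_b_yes)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")
    # PABAK for two categories: chance agreement is fixed at 0.5
    pabak = 2 * p_o - 1
    return p_o, kappa, pabak


# Hypothetical screening of 100 records with very few inclusions:
# 2 included by both, 2 only by rater A, 2 only by rater B, 94 excluded by both.
rater_a = [1] * 2 + [1] * 2 + [0] * 2 + [0] * 94
rater_b = [1] * 2 + [0] * 2 + [1] * 2 + [0] * 94

p_o, kappa, pabak = agreement_stats(rater_a, rater_b)
print(f"observed agreement={p_o:.2f}, kappa={kappa:.2f}, PABAK={pabak:.2f}")
```

In this made-up example the raters agree on 96% of records, yet the conventional kappa is clearly lower than PABAK because almost all records are excluded, which is typical for title/abstract screening.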

Data extraction and data synthesis

We predefined a data extraction form in MS Excel that included study characteristics (e.g. population, period, country) and outcomes (e.g. definition, exposing/ referencing annual volume, result, estimator). One reviewer (FW) extracted the data and a second (DK) verified the results; discrepancies were resolved by consensus or by consulting a third reviewer (JS). To decide whether individual studies could be pooled in a meta-analysis, we reviewed methodological quality, comparability of the study contexts (population, outcomes, volume-thresholds and risk adjustment) and statistical heterogeneity. If studies were considered not comparable, a qualitative synthesis followed.
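Statistical heterogeneity is commonly quantified with Cochran's Q and the I² statistic. The following is a minimal sketch of that calculation, assuming inverse-variance weights on the log odds ratio scale. The function name `cochran_q_i2` and all input values are hypothetical; this is not the procedure or the data used in this review.

```python
import math


def cochran_q_i2(log_effects, standard_errors):
    """Return the fixed-effect pooled log effect, Cochran's Q and I² (in %)."""
    if len(log_effects) != len(standard_errors) or len(log_effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance weights
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Q: weighted squared deviations of study effects from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I²: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical odds ratios from three studies (illustrative values only)
log_ors = [math.log(1.5), math.log(0.8), math.log(1.2)]
ses = [0.1, 0.1, 0.1]

pooled, q, i2 = cochran_q_i2(log_ors, ses)
print(f"pooled OR={math.exp(pooled):.2f}, Q={q:.1f}, I2={i2:.0f}%")
```

A high I² would argue against pooling on statistical grounds alone. Differences in populations, outcome definitions and volume-thresholds still have to be judged separately, since I² does not capture them.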

Critical appraisal process

Two independent reviewers (FW, DK) performed the quality assessment using the Methodology Checklist for Cohort studies of the Scottish Intercollegiate Guidelines Network (SIGN). This checklist contains 14 items and results in a final quality rating of each study as "high quality", "acceptable" or "unacceptable" [26]. Methodological explanations and definitions in the context of the application of the checklist are presented in Additional file 2.

Patient and public involvement

No patients were involved.

Results

Study selection

After screening 7955 records, 13 studies met our predefined eligibility criteria and were included in the systematic review (Fig. 1) [27,28,29,30,31,32,33,34,35,36,37,38,39]. Additional file 3 contains the reasons for exclusion of the remaining 30 full texts [40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69]. The high prevalence-adjusted bias-adjusted kappa (PABAK) (Fig. 1) in both title-abstract and full-text screening suggests no systematic differences between the raters.

Fig. 1

PRISMA flow-chart

Study characteristics

Table 2 shows the characteristics of included studies. The observation period varied between 29 years (1967–1996) [33] and one year [35, 39]. The earliest observation started in 1967 [33] and the latest ended in 2012 [39]. All of the included studies used cross-sectional designs to analyse retrospective cohorts in perinatal registers (Additional file 4). The studies were conducted in Finland [30, 32, 34], the United States [28, 35, 39], Sweden [27], Norway [33], Germany [29], the United Kingdom [31], Australia [36], the Netherlands [70] and Canada [37]. The analyzed populations consisted of all births [27, 28, 30, 31, 33,34,35, 37, 39] and/ or a predefined low risk population [29, 32, 34, 36, 38] excluding e.g. low birth weight or multiple births. Annual volumes and their comparators were set differently in terms of group sizes and used either births [27, 29,30,31,32,33, 36, 39] or deliveries/ pregnancies, i.e. women giving birth, [28, 34, 35, 37, 38] as the basis for the calculation. While “birth” refers to the neonate, “delivery” refers to the mother giving birth. Because of multiple pregnancies, the number of deliveries is usually lower than the number of births. Unfortunately, not all studies reported both numbers, but Table 2 shows the different annual volumes in the included studies. In addition to the different annual volumes, maximum [29, 33, 36,37,38,39], minimum [35] and mean quantities [27, 28, 34] as well as university clinics (UH) [30, 32] were used as reference volumes. The analyzed outcomes included stillbirths [31, 32, 34], perinatal/ early [29, 30, 32, 34, 37, 38] and neonatal mortality [27, 31, 33,34,35,36, 39], birth by caesarean section [30, 36] and composite outcomes like perinatal adverse outcome [38] or maternal morbidity/ mortality [37].
Six out of thirteen studies did not focus solely on the volume-outcome relationship but also analyzed the influence of geographic accessibility [37], birth at night hours [38], staffing [31], availability of facilities [31], on-call arrangements [32], or birth on a weekday/ weekend [39].

Table 2 Characteristics of included studies

Results of the critical appraisal

Table 3 shows in detail that most of the included studies (12 out of 13 studies) fulfilled the majority of the queried items leading to an “acceptable” quality [27,28,29,30,31,32, 34,35,36,37,38,39]. Quality of one study was rated as “unacceptable” due to lack of comparability (missing baseline-tables, item 1.2) of the investigated groups [33].

Table 3 Detailed results of the SIGN quality assessment for cohort studies

Due to the retrospective design and other methodological reasons, some items were not applicable:

  • number of participants (item 1.3)

  • outcome already present before start of study (item 1.4)

  • drop-out (item 1.5)

  • comparison between full and lost to follow-up (item 1.6) and

  • multiple measured exposure levels (item 1.12).

None of the studies fulfilled the criteria for blinding (item 1.8) or for critical recognition of the limited possibilities of blinding (item 1.9) in cohort studies. The items on externally demonstrated validity (item 1.11) and reliability (item 1.10) of the assessed outcomes were not applicable because mortality, caesarean sections and other clinical outcomes are not subjective measures.

We originally planned to perform a meta-analysis but were unable to conduct it due to definitional heterogeneities in the included studies. Additional file 5 provides a tabular overview of heterogeneities identified between the outcomes analyzed. Five studies were excluded from a pooled estimate due to singular report of the outcome maternal mortality, [28] maternal morbidity/ mortality, [37] neonatal complications, [38] missing adjustments [34, 35] and the singular use of risk ratios as estimator, [31] 99% confidence intervals [36] or Pearson correlation coefficients [39]. The remaining results for the outcomes stillbirth, [32, 34] perinatal/ early neonatal mortality, [29, 30, 32, 37, 38] neonatal mortality [27, 33, 39] and caesarean sections [30] were not comparable due to heterogeneously defined adjustment variables, populations (all births vs. predefined low risks), outcomes (e.g. undefined vs. defined) and volume-thresholds. Consequently, we summarized the results qualitatively.
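Why estimators and confidence levels matter for comparability can be shown with a short sketch. It computes an unadjusted odds ratio and risk ratio with Wald confidence intervals from a single 2×2 table, at both the 95% and the 99% level. The function name `effect_estimates` and the counts are hypothetical, and the included studies mostly reported adjusted estimates, which this sketch does not reproduce.

```python
import math
from statistics import NormalDist


def effect_estimates(events_exp, total_exp, events_ref, total_ref, level=0.95):
    """Return {'OR': (est, lower, upper), 'RR': (est, lower, upper)}.

    Uses Wald intervals on the log scale; all four cells must be non-zero.
    """
    a, b = events_exp, total_exp - events_exp  # exposed: events, non-events
    c, d = events_ref, total_ref - events_ref  # reference: events, non-events
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells of the 2x2 table must be positive")
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    risk_ratio = (a / total_exp) / (c / total_ref)
    se_log_rr = math.sqrt(1 / a - 1 / total_exp + 1 / c - 1 / total_ref)

    def interval(estimate, se):
        return estimate * math.exp(-z * se), estimate * math.exp(z * se)

    return {
        "OR": (odds_ratio, *interval(odds_ratio, se_log_or)),
        "RR": (risk_ratio, *interval(risk_ratio, se_log_rr)),
    }


# Hypothetical counts: 30 events in 1000 births (low volume)
# versus 20 events in 1000 births (reference volume).
res95 = effect_estimates(30, 1000, 20, 1000, level=0.95)
res99 = effect_estimates(30, 1000, 20, 1000, level=0.99)

for name in ("OR", "RR"):
    est, lo95, hi95 = res95[name]
    _, lo99, hi99 = res99[name]
    print(f"{name}={est:.3f}  95% CI {lo95:.2f}-{hi95:.2f}  99% CI {lo99:.2f}-{hi99:.2f}")
```

With the same counts, the odds ratio lies slightly further from 1 than the risk ratio, and the 99% interval is wider than the 95% interval, so results reported with different estimators or confidence levels cannot be placed side by side without conversion.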

Effects of annual volume on neonatal outcomes

Stillbirth was evaluated in three studies [31, 32, 34] and defined as fetal death prior to 22 [34] or 24 [31] weeks of gestation or remained undefined [32]. For hospitals with medium-sized birth volumes (1000–1999 p.a.), the stillbirth odds ratio was significantly higher when compared with university hospitals [32]. Similar effects were found for hospitals with birth volumes between 1000 and 2999 when compared with high birth volumes (≥ 3000 p.a.) [34]. However, taking all data together, there was no clear volume effect on the rate of stillbirths (Fig. 2).

Fig. 2

Stillbirths and early/ perinatal mortality. Legend: […]1 BW, age, parity, born outside clinic, birth planned and documented clinic, mode of delivery, born before arrival at clinic, time of birth, congenital anomaly/ malformation. […]2 age, parity, socio-economic position. […]3 age, parity, mode of delivery, ethnicity, calendar year trend. […]4 gender, Eclampsia, Premature rupture of membranes, Oligohydramnios, Abruptio placentae, Prolapsed umbilical cord, Noxious influences transmitted via placenta/ breast milk, Congenital anomalies, Hydrops fetalis, Other maternal conditions

Perinatal or early neonatal mortality was defined as death within the first 7 days of life [29, 30, 34, 38] or as a combined outcome [34, 37]. One study did not provide a specific definition [32]. Results were always adjusted, except for one study [34]. Whereas two studies did not report a significant volume effect [32, 38], four studies showed significantly higher rates of perinatal/ early neonatal mortality in hospitals with low (≤ 1000) [29, 30, 34, 37] or very low (≤ 500) [29, 37] birth volumes (Fig. 2) for either low risk births (term infants with birthweight > 2499 g) [29, 34] or all births [30, 37].

Neonatal mortality was defined as 28-day-, [31, 33,34,35,36, 39] or 27-day-mortality [27] in order to analyze all [31, 33,34,35,36, 39] and/or low risk births [27, 34,35,36]. The majority of the studies undertook adjustments [27, 31, 33, 36]. As illustrated in Fig. 3 five [27, 33, 35, 36, 39] out of seven studies reported significant volume effect estimates with neonatal mortality being higher in hospitals with lower [33] or higher annual birth volumes [27, 35, 36, 39]. The remaining two studies reported non-significant volume-outcome effects [31, 34].

Fig. 3

Neonatal complications and neonatal mortality. Legend: […]1 parity, GA, year of birth, smoking, parental cohabitation, maternal BMI. […]2 insurance status, maternal Aboriginal or Torres Strait Island status, maternal residential area. […]3 parity, mode of delivery, ethnicity, calendar year trend

The study by Moster et al. reported higher neonatal mortality rates in hospitals with low birth volumes; however, it lacked comparability between groups due to a missing baseline table, and its quality was therefore rated “unacceptable” [33]. In conclusion, methodological limitations preclude conclusive statements regarding the effect of birth volume on neonatal mortality.

Neonatal complications were reported in one study as a combined outcome (“perinatal adverse outcome”) including stillbirths, death ≤ 7 days, 5-min Apgar < 7 and transfer to a neonatal intensive care unit in singleton births. The association was non-monotonic: significantly higher odds ratios of neonatal complications were reported for units with 750–999 and 1500–1749 births (Fig. 3) compared with units with at least 1750 births per year [38].

Effects of annual birth volume on maternal outcomes

Adjusted maternal mortality was reported as failed attempts to resuscitate women with severe complications during birth [28]. The volume-outcome relationships were reported to be non-monotonic in general, with lower and higher relative risks of maternal mortality in lower (50) and higher annual birth volumes (≥ 2250–7500) [28].

Adjusted maternal complications were reported in two studies as a combined outcome consisting of maternal mortality and different morbidity outcomes in all births [28, 37]. In a Canadian study, the odds ratios were reported to be significantly higher in hospitals with ≤ 1000 births p.a. [37]. However, a study from the US reported non-monotonic results with higher risk ratios in hospitals with high (2500) and low (50) annual birth volumes. Without providing results, the authors stated that the relative risks of maternal complications remained higher with a further increase in birth volume [28]. In conclusion, no conclusive statement regarding the impact of birth volume on maternal complications is possible due to contradictory study results, as shown in Fig. 4.

Fig. 4

Maternal mortality, maternal complications and caesarean sections. Legend: […]1: race, hospital, year, comorbidity index, insurance status, household income, hospital teaching, hospital bed size, hospital region, hospital ownership, hospital location. […]2: GA, CS, Median income, Education rate, Aboriginal population, Unemployment rate, Minority, Statistical area classification, Travel Distance, Delivery hospital volume, Hospital level, HIV, Type 1/2 DM, Gestational/ other/ unspecified DM, Cystic fibrosis, Rheumatic heart disease, Hypertension, Ischemic heart disease, Pulmonary hypertension, SLE, Chronic renal disease, Twins/ multiple gestation, Previous CS. […]3: race, hospital, year, comorbidity index, insurance status, household income, hospital teaching, hospital bed size, hospital region, hospital ownership, hospital location. […]4: insurance status, maternal Aboriginal or Torres Strait Island status, maternal residential area. […]5: parity, smoking, socio-economic position

An adjusted rate of delivery via caesarean section was reported in two studies [30, 36]. Hemminki et al. reported a significantly higher rate of caesarean sections in “small-hospital-areas” with less than 750 births per year compared to “capital areas” [30]. In contrast, Tracy et al. reported a significantly lower rate of caesarean sections in hospitals with ≤ 500 births [36]. Thus, contradicting study results do not allow conclusions regarding volume-effects on mode of delivery (Fig. 4).

In summary, most studies suggested a volume-outcome relationship for perinatal / early neonatal mortality but reported either insignificant, non-monotonic or conflicting results regarding volume effects on the remaining outcomes.

Discussion

This systematic review on the effects of hospital case volume on the safety and outcomes of low risk births is of great public health relevance, because childbirth is both very frequent and an important life event. There is already evidence for high risk births and other conditions such as preterm birth [1, 23], pediatric intensive care [71] or pediatric heart surgery [72] that hospitals with more experience and higher case numbers provide better healthcare, as indicated by better health outcomes of the patients treated there. We therefore hypothesized that higher hospital birth volumes would also be related to better outcomes in low risk births or in all births. The included studies reported on mortality (stillbirths, perinatal, neonatal, maternal), morbidity (neonatal, maternal) and mode of delivery. Readmissions and developmental delays were not reported. Initially, a pooled estimate was intended. Heterogeneities within the definitions and presentations of characteristics led to the decision not to perform a pooled estimate. Therefore, the results were synthesized qualitatively, focusing on the volume-outcome relationship in general and especially on lower annual birth volumes (≤ 1000). The heterogeneous results reported by two studies in different groups were not discussed by the study authors [30, 34] but might be caused by effect modification.

While a possible effect of volume on early neonatal mortality was consistent whenever statistical significance was reached, the influence of birth volume on other outcomes was less consistent. The reasons for these inconsistencies need to be discussed. They may be explained at a systemic level, reflecting differences between national health care systems with variations in budgeting, access, and geographical and historical conditions. One study included in this review showed differences in caesarean section rates depending on hospital birth volume [36]. Several explanations are possible. This could be an effect of perinatal regionalization, in which high risk pregnancies are treated in high birth volume hospitals, leading to a greater need for surgical birth interventions. On the other hand, the appropriateness of and need for interventions such as epidural anesthesia have also been discussed with reference to hospital ownership [15]. However, to further analyze the sensitive topic of appropriateness, qualitative research with primary data is needed. Because routine data lack detailed information and vary in data quality, they must be used with caution in order to avoid over- or misinterpretation [73].

With respect to risk-appropriate care, perinatal regionalization policies vary in terms of general organization, obligation and practice [2, 3]. At the provider level, birth/delivery volume may be only one covariate among several others influencing the outcome of newborns, such as time of birth [38, 39, 70], personnel and material resources [31, 32, 74], work environment [75] or qualifications [76], as indicated by studies included in this review.

Despite lower early neonatal mortality in hospitals with high annual birth volume, closure of low volume institutions has to be considered very carefully, since results have been discussed controversially. Some studies suggest a higher rate of unplanned out-of-hospital births [77] and an increased rate of neonatal mortality and stillbirths immediately after closures [58]. Furthermore, an increased rate of adverse birth outcomes [78] and higher stress/ anxiety levels of pregnant women were reported in large rural landscapes with long distances to access perinatal care [79]. Other studies report significantly lower rates of stillbirths and neonatal mortality in both rural and urban regions after closing maternity units [41].

The heterogeneous definitions identified in this and other systematic reviews [80] support the need for a standardized terminology of outcomes, populations and volume-thresholds. The definition of core-outcome sets (COS) would help to overcome that issue. The uniform terminology enables the design of comparable studies and forms the basis for the development of an international perinatal register. A homogeneously created perinatal register would allow individual patient data meta-analyses providing promising results as it has been shown for other indications [81, 82].

Overall, most (12/13) of the included studies showed an “acceptable” quality, which is the highest rating for retrospective studies [26]. One study did not demonstrate comparability of the study groups, which led to an “unacceptable” quality rating because it strongly limits transparency. None of the studies blinded the assessors, nor did any report the absence of blinding. Nevertheless, we considered the studies meaningful for interpretation because the assessed outcomes are difficult to manipulate and the lack of blinding therefore seems to be a minor weakness.

Strengths and Limitations

This is the first systematic review explicitly assessing birth volume effects on neonatal outcome in low risk births. The review used transparent methods (independent screening, search strategy), was officially registered, is based on two major databases (combined with an extensive hand search and an expert panel for highlighting relevant literature) and followed common critical appraisal requirements of systematic reviews determined by AMSTAR 2 [83]. The high inter-rater reliability ensures comprehensibility. The time and national restrictions in the inclusion criteria could be interpreted as a limitation. However, it is well known that international comparisons must take into account the efficacy of health care systems [84, 85]. Thus, we used neonatal mortality rates as an indicator of this efficacy. With respect to the time restriction to publications from 2000 onwards, this review considered the decline of neonatal mortality and the development of perinatal care since 1990 [86]. On the other hand, some of the studies have study periods that lie far in the past (1967–2012) and long observation intervals (1 to 29 years), indicating that the publication date did not work perfectly as a delimiter to represent only current perinatal care. Almost every study showed an “acceptable” quality with retrospectively collected routine or register data.

Conclusion

The original aim of this review was to investigate volume-outcome associations in a comparatively low-risk birth cohort. With the exception of 7-day mortality, the review revealed heterogeneous results and major differences in the conception and definitions of the included studies. The qualitative synthesis of the studies indicated increased rates of early neonatal mortality (< 7d) in hospitals with birth volumes below 1000 or 500 births per year whenever statistical significance was reached. With respect to stillbirths, neonatal mortality, maternal mortality, caesarean section and neonatal and maternal complications, the included studies reported inconclusive or insignificant results. Given the heterogeneous study concepts in terms of assessed populations, volume-thresholds and outcomes, we recommend the development and use of internationally consented core-outcome sets to provide a homogeneous definitional basis for future studies. A uniform terminology would enable a homogeneously designed international birth register for individual patient data meta-analyses. Based on these data, strengths and weaknesses of different perinatal settings could be investigated using a common terminology of population, volume and outcome.