Background

Severe maternal morbidity (SMM) covers a range of conditions along the continuum to maternal death during pregnancy or within 42 days after delivery [1]. Maternal morbidity is a substantial public health concern [2] whose incidence is rising in Canada and the US [3]. These patterns are driven by multiple risk factors including delaying childbearing, use of assisted reproductive technologies, rising rates of obesity, and Caesarean delivery [4]. Because of the low prevalence of maternal mortality in many industrialized countries, data covering several years are required to compute precise estimates of prevalence and risk factors, thus complicating the use of maternal mortality as a population health indicator [2, 5]. Consequently, SMM has received increasing attention as an indicator of perinatal health and obstetric care [6, 7].

As the focus in industrialized countries such as Canada has shifted towards ‘near miss’ events as a means to improving the health and quality of care for pregnant women [5], prediction of SMM has been identified as a critical research gap in obstetrics [4]. Many maternal characteristics are known pre-conception or early in pregnancy and are strong risk factors for the development of SMM [2, 8]. Therefore, a combination of such factors may reliably predict its onset, enabling evidence-based and rational early triage of high-risk women for enhanced surveillance and subspecialty-based care.

Advances in maternal morbidity risk prediction include a US obstetric comorbidity index [9], which was externally validated within a Canadian population, resulting in modest discrimination (C-statistic of 0.66, 95% confidence interval [CI] 0.65–0.67) [10]. That index included variables that both preceded, and were simultaneous with, the onset of SMM, making it a useful research tool for identifying the burden of morbidity but less so for clinical prediction. Others have developed models focused on specific subtypes of maternal morbidity, such as cardiovascular-related conditions [7]. Models predicting maternal mortality include the Collaborative Integrated Pregnancy High-dependency Estimate of Risk (CIPHER) model (C-statistic 0.82, 95% CI 0.81–0.84) and the Maternal Severity Index (C-statistic 0.83, 95% CI 0.80–0.85) [11], both developed among women either already critically ill or hospitalized, and mostly later in gestation.

Since SMM predominantly arises around birth or early postpartum [1], the ideal timeframe for prediction is before or early in pregnancy to facilitate effective preventive strategies such as referral to high-risk centres or shared-care antenatal care pathways [12, 13]. Existing models do not enable these latter steps, nor do they account for important pre-pregnancy factors, such as maternal infertility and its treatment, which are associated with SMM [14]. Additionally, existing prediction efforts did not consider prior adverse pregnancy outcomes among parous women. We therefore undertook the current study to develop and internally validate a clinical prediction model of SMM, defined as a composite of maternal end-organ injury or death, using readily available factors ascertained pre-pregnancy and prior to 20 weeks’ gestation in a population-based study in Ontario – Canada’s most populous and multi-ethnic province.

Methods

The use of data in this project was authorized under section 45 of Ontario’s Personal Health Information Protection Act, which does not require review by a Research Ethics Board. We followed the Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guideline for reporting of prediction studies [15].

Population and data sources

All women with a pregnancy lasting beyond 20 weeks’ gestation, and who delivered within an Ontario hospital between April 1, 2006 and March 31, 2014, were identified within the Better Outcomes Registry & Network (BORN) databases [16]. Data beyond 2014 were not available in these datasets. The BORN registry captures over 99% of hospital births in the province, and has been validated for data completeness and accuracy [17, 18]. We used the Registered Persons Database, the Immigration, Refugees and Citizenship Canada’s Permanent Resident Database, the Ontario Health Insurance Plan (OHIP) outpatient claims database, and the Canadian Institute for Health Information (CIHI) Discharge Abstract Database to capture maternal demographics, pre-existing health conditions and diagnoses and procedures documented during a hospitalization (see Table S1 for variables and diagnostic codes used to develop the study cohort). The datasets were linked using unique encoded identifiers and analysed at ICES – a not-for-profit provincial research entity that houses a large network of health administrative databases (https://www.ices.on.ca/).

We excluded ectopic pregnancies and pregnancies that resulted in abortion or miscarriage or ended before 20 weeks’ gestation. We randomly sampled one birth (live- or stillbirth) per woman to avoid potential within-person correlations among women with multiple pregnancies (Table S1; Figure S1).
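The one-birth-per-woman sampling step can be illustrated with a short sketch. This is not the study's code: the record layout (the keys `woman_id` and `birth_id`), the fixed seed, and the example records are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def sample_one_birth_per_woman(births, seed=2014):
    """Return one randomly chosen birth record for each woman.

    `births` is a list of dicts with the hypothetical keys 'woman_id'
    and 'birth_id'; the study's actual variable names are not given.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_woman = defaultdict(list)
    for record in births:
        by_woman[record["woman_id"]].append(record)
    # Process women in sorted order so the result is stable across runs.
    return [rng.choice(by_woman[w]) for w in sorted(by_woman)]


# Made-up records: woman A has two births, B has one, C has three.
births = [
    {"woman_id": "A", "birth_id": 1},
    {"woman_id": "A", "birth_id": 2},
    {"woman_id": "B", "birth_id": 3},
    {"woman_id": "C", "birth_id": 4},
    {"woman_id": "C", "birth_id": 5},
    {"woman_id": "C", "birth_id": 6},
]
sampled = sample_one_birth_per_woman(births)
```

Each woman contributes exactly one record to `sampled`, so later model fitting does not need to account for clustering of births within women.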

Study outcomes

The primary composite outcome was maternal end-organ injury or death arising between 20 weeks’ gestation and up to 42 days after the index birth hospital discharge date. The list of conditions used to define maternal end-organ injury was based on the model developed by Bateman [9] and validated by Metcalfe [10], comprising 20 diagnoses and procedures, and consistent with Canadian perinatal surveillance definitions for SMM and death [19, 20, 21] (Table S1).

A secondary outcome was all-cause maternal mortality, from birth until 365 days postpartum, since previous work has shown a persistent increase in mortality risk beyond the early postpartum period [22, 23].

Candidate predictors, variable selection, and coding

Demographic, medical and obstetric factors known to be associated with an increased risk of SMM were considered as candidate predictors. These included: estimated maternal age at conception (continuous, categorical, and squared terms); residential income quintile; world region of origin (Table S2), as a proxy for both maternal birthplace and ethnicity; attendance at a first-trimester prenatal care visit; pre-pregnancy body mass index (BMI); parity; multiple gestation; infertility; infertility treatment; placental disorders (e.g., placenta praevia, placenta accreta); and pre-existing medical conditions coded within 12 months before the estimated date of conception (Table S1). Substantial missing data were noted only for the variable pre-pregnancy BMI (63.8%). We tested models in which BMI was modelled as a continuous variable and where missing values were assigned the median BMI (24.2 kg/m2). We further tested models in which BMI was divided into the following categories: < 18.5 kg/m2, 18.5–24.9 kg/m2 (reference category), 25–29.9 kg/m2, ≥ 30 kg/m2, and missing. Certain categorical variables with a low frequency in the cohort were combined with other similar variables (e.g., pre-existing cardiovascular conditions; placental conditions and anomalies). Variables were also assessed for collinearity by checking the variance inflation factor (VIF), and where collinear (VIF > 5), the most commonly reported variable was selected [24].
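The two ways of handling BMI described above can be sketched as follows. This is a minimal illustration under stated assumptions: missing values are represented by `None`, a BMI of exactly 30 kg/m2 is placed in the highest category, and the example values are made up.

```python
import statistics


def bmi_category(bmi):
    """Map a BMI value (kg/m2) to a modelling category; None means missing."""
    if bmi is None:
        return "missing"
    if bmi < 18.5:
        return "<18.5"
    if bmi < 25:
        return "18.5-24.9"  # reference category
    if bmi < 30:
        return "25-29.9"
    return ">=30"  # assumption: exactly 30 falls in the top category


def impute_median(values):
    """Replace missing values (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


bmis = [22.0, None, 31.5, 17.9, None, 27.3]  # made-up values
categories = [bmi_category(b) for b in bmis]
imputed = impute_median(bmis)
```

The categorical coding keeps women with missing BMI in the model as their own group, whereas median imputation keeps BMI continuous at the cost of assigning the same value to every woman with a missing measurement.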

In the model restricted to the sub-cohort of parous women, in addition to the above variables, we included complications coded in any previous pregnancy as predictors (Table S1).

Possible interactions between variables were assessed and included if statistically significant at alpha = 0.10 [25].

Statistical analysis

Descriptive statistics

We used standardized differences to contrast births with and without the primary composite outcome of maternal end-organ injury or death, with a value > 0.10 indicating an important difference in baseline characteristics [26].
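For illustration, a standardized difference can be computed as below. The text does not spell out the formula, so this sketch assumes the common definition in which the difference in means (or proportions) is divided by the square root of the average of the two group variances. The function names and input values are hypothetical.

```python
import math
import statistics


def std_diff_continuous(group1, group2):
    """Standardized difference for a continuous variable."""
    mean_gap = statistics.mean(group1) - statistics.mean(group2)
    pooled_sd = math.sqrt(
        (statistics.variance(group1) + statistics.variance(group2)) / 2
    )
    return mean_gap / pooled_sd


def std_diff_binary(p1, p2):
    """Standardized difference for a binary variable, given two proportions."""
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return (p1 - p2) / pooled_sd


# Made-up example: a condition present in 10% of one group and 5% of the other.
d_binary = std_diff_binary(0.10, 0.05)
# Made-up example: maternal ages in two small groups.
d_age = std_diff_continuous([30, 34, 38], [28, 30, 32])
important = abs(d_binary) > 0.10  # threshold used in the text
```

Unlike a P value, the standardized difference does not grow with sample size, which is why it suits a cohort of this magnitude.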

Model discrimination

Among the entire cohort, a logistic regression model was fit using the final selected variables to predict the primary composite outcome of maternal acute end-organ injury or death from 20 weeks’ gestation until 42 days postpartum. A backward elimination method was applied for variable selection, with predictor evaluation based on a balance of the model’s C-statistic, clinical influence, and statistical significance. For continuous predictor variables such as age in which non-linear associations with the outcome were observed, a quadratic (squared) term was added to the model. Model discrimination was expressed as a C-statistic and its 95% CI, and was assessed visually using a receiver operating characteristic (ROC) curve. We considered a C-statistic of < 0.5 to be not useful, 0.5 to 0.6 poor, 0.6 to 0.7 moderate, and ≥ 0.7 as good [27].
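To make the discrimination measure concrete, the C-statistic can be computed directly from its definition: the proportion of all event/non-event pairs in which the woman with the event received the higher predicted risk, with ties counted as one half. The sketch below is illustrative only. The function names and data are hypothetical, and the placement of boundary values (for example, 0.60 labelled as moderate) is an assumption, since the text leaves the boundaries open.

```python
def c_statistic(outcomes, risks):
    """Pairwise C-statistic: outcomes are 0/1, risks are predicted probabilities."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0  # event ranked above non-event
            elif e == n:
                concordant += 0.5  # tie counts as half
    return concordant / (len(events) * len(non_events))


def interpret(c):
    """Label a C-statistic using the bands given in the text."""
    if c < 0.5:
        return "not useful"
    if c < 0.6:
        return "poor"
    if c < 0.7:
        return "moderate"  # assumption: 0.60 itself is labelled moderate
    return "good"


# Made-up predictions for four women, two of whom had the outcome.
c = c_statistic([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
label = interpret(c)
```

The pairwise loop is quadratic in cohort size and is shown for clarity; a rank-based calculation gives the same value far more efficiently in a large cohort.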

Model internal validation

To arrive at an optimism-corrected C-statistic, we used a bootstrapping approach, with 500 bootstrap samples selected from the original cohort, with replacement [28] – an approach known to produce stable estimates with low risk of bias [29]. The optimism-corrected C-statistic was defined as the C-statistic from the original data minus the optimism value [30].
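The bootstrap optimism correction can be sketched as follows. This is a simplified illustration, not the study's implementation: to keep the example self-contained, the logistic regression model is replaced by a toy model that predicts the observed event rate within each level of a single categorical predictor, and the data are simulated. All names are hypothetical.

```python
import random


def c_statistic(outcomes, risks):
    """Rank-based C-statistic; tied predictions share an average rank."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    ranks = [0.0] * len(risks)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and risks[order[end + 1]] == risks[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1  # average 1-based rank
        start = end + 1
    n_events = sum(outcomes)
    n_non_events = len(outcomes) - n_events
    rank_sum = sum(r for r, y in zip(ranks, outcomes) if y == 1)
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_non_events)


def fit(groups, outcomes):
    """Toy stand-in for the model: risk = observed event rate per category."""
    totals, events = {}, {}
    for g, y in zip(groups, outcomes):
        totals[g] = totals.get(g, 0) + 1
        events[g] = events.get(g, 0) + y
    overall = sum(outcomes) / len(outcomes)
    rates = {g: events[g] / totals[g] for g in totals}
    return lambda g: rates.get(g, overall)  # unseen category: overall rate


def optimism_corrected_c(groups, outcomes, n_boot=500, seed=1):
    rng = random.Random(seed)
    n = len(outcomes)
    model = fit(groups, outcomes)
    apparent = c_statistic(outcomes, [model(g) for g in groups])
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bg = [groups[i] for i in idx]
        by = [outcomes[i] for i in idx]
        if sum(by) in (0, n):
            continue  # C-statistic is undefined without both outcome classes
        boot_model = fit(bg, by)
        c_boot = c_statistic(by, [boot_model(g) for g in bg])
        c_orig = c_statistic(outcomes, [boot_model(g) for g in groups])
        optimism.append(c_boot - c_orig)
    mean_optimism = sum(optimism) / n_boot
    return apparent, mean_optimism, apparent - mean_optimism


# Simulated data: four categories with increasing true risk.
rng = random.Random(0)
true_risk = {"a": 0.05, "b": 0.10, "c": 0.20, "d": 0.40}
groups = [rng.choice("abcd") for _ in range(400)]
outcomes = [1 if rng.random() < true_risk[g] else 0 for g in groups]
apparent, optimism, corrected = optimism_corrected_c(groups, outcomes, n_boot=200)
```

Each bootstrap model is evaluated twice, once on the sample it was fitted to and once on the original data; the average gap between the two estimates how much the apparent C-statistic flatters the model.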

Model calibration

Model calibration was assessed by visual inspection of calibration plots of observed vs. expected probabilities of the outcome, where a 45-degree line denotes good calibration, and a slope of 1 indicates perfect agreement between observed and expected events [31].
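The points on a calibration plot can be derived by grouping women by predicted risk and comparing the mean predicted risk with the observed event rate in each group. The sketch below assumes equally sized groups; the text does not state how the plotted groups were formed, and the function name and data are hypothetical.

```python
def calibration_table(outcomes, risks, n_groups=10):
    """Return (mean predicted risk, observed event rate, n) per risk group."""
    pairs = sorted(zip(risks, outcomes))  # order women by predicted risk
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # more groups requested than observations
        expected = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((expected, observed, len(chunk)))
    return rows


# Made-up, perfectly calibrated example: predicted risks of 0.1 and 0.5
# with matching observed event rates (1 of 10 and 5 of 10).
risks = [0.1] * 10 + [0.5] * 10
outcomes = [0] * 9 + [1] + [0] * 5 + [1] * 5
table = calibration_table(outcomes, risks, n_groups=2)
```

Plotting the observed rate against the mean predicted risk for each row gives the calibration plot; rows in which the two values agree fall on the 45-degree line.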

Risk classification

We used a risk classification table and computed likelihood ratios (LRs) [32] with associated 95% CI to assess the main model’s ability to stratify the population as low or high-risk. We divided the population into five groups of predicted probability: very low risk (< 1.5 per 1000), low risk (1.5 to 3 per 1000), intermediate risk (3 to 5 per 1000), high risk (5 to 15 per 1000) and very high risk (> 15 per 1000). These cut-offs were chosen based on the overall incidence of our primary outcome of 3.1 per 1000, which we assumed to reflect the risk among the majority of the cohort. Positive LRs of > 5 and > 10 were interpreted as moderately or very useful “rule-in” tests, while values between 0.2 and 0.5, and < 0.1 were considered moderately and very useful “rule-out” tests [33].
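The risk grouping and a stratum-specific likelihood ratio can be sketched as below. Several details are assumptions: the handling of values that fall exactly on a cut-off, the log-method confidence interval (the text does not state how the 95% CI was derived), and the example counts, which are made up and are not study data.

```python
import math


def risk_group(p):
    """Assign a predicted probability to one of the five risk groups."""
    if p < 0.0015:
        return "very low"
    if p < 0.003:  # assumption: a value on a cut-off joins the higher group
        return "low"
    if p < 0.005:
        return "intermediate"
    if p <= 0.015:  # the text defines very high risk as > 15 per 1000
        return "high"
    return "very high"


def stratum_lr(events_in_group, total_events, non_events_in_group, total_non_events):
    """Likelihood ratio for one risk group, with a log-method 95% CI.

    LR = P(group | event) / P(group | no event).
    """
    lr = (events_in_group / total_events) / (non_events_in_group / total_non_events)
    se_log = math.sqrt(
        1 / events_in_group - 1 / total_events
        + 1 / non_events_in_group - 1 / total_non_events
    )
    lower = lr * math.exp(-1.96 * se_log)
    upper = lr * math.exp(1.96 * se_log)
    return lr, lower, upper


# Made-up counts: 30 of 100 events and 100 of 1000 non-events fall in a group.
lr, lower, upper = stratum_lr(30, 100, 100, 1000)
```

A likelihood ratio above 1 means that women with the outcome are over-represented in that risk group relative to women without it; a value below 1 means they are under-represented.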

Funding

This study was supported by funding from the Canadian Institutes of Health Research (grant number 15139).

Results

After sampling one birth per woman from among 853,517 eligible births, the total cohort comprised 634,290 births (Figure S1). The primary outcome of end-organ injury or death from 20 weeks’ gestation up to 42 days postpartum occurred in 1969 women (3.1 per 1000), including 62 deaths (0.1 per 1000). Women who experienced the primary outcome were older, more likely to have a pre-existing medical condition, and to have had infertility treatment (Table 1).

Table 1 Baseline characteristics of the study population, according to whether a woman had the composite outcome of maternal end-organ injury or death between 20 weeks’ gestation and up to 42 days after birth. All data are shown as a number (%) unless otherwise stated

The most frequent factors contributing to end-organ injury or death were acute heart failure (40.6%), need for assisted ventilation (29.2%), acute renal failure (12.0%) and shock (10.1%) (Table 2).

Table 2 Occurrence of maternal end-organ injury or death between 20 weeks’ gestation and up to 42 days after birth, and the ranking of the most prevalent morbidity indicators

Model discrimination and internal validation

Overall cohort

In the overall cohort (n = 634,290), variables significantly associated with the composite outcome of maternal end-organ injury or death included maternal age, low income, world region of origin, high BMI, pre-existing medical conditions, and placental disorders (Table S3), which contributed to the final model. Attendance at a first-trimester antenatal visit and parity were inversely associated with the composite outcome. The corresponding model C-statistic was 0.68 (95% CI 0.66–0.69) (Fig. 1). There was minimal overfitting of the model, with mean optimism of 0.0055 (95% CI 0.0050–0.0061), and an optimism-corrected C-statistic of 0.67 (95% CI 0.66–0.68). Model discrimination was unchanged when BMI was included, either as a categorical variable with “missing” as a separate category, or as a continuous variable imputed with the median value for BMI. We tested 300 pairwise interactions, of which 13 were statistically significant. Adding these interaction terms to the main model yielded similar discrimination (C-statistic 0.69, 95% CI 0.68–0.70), but the resulting model included unstable estimates. Therefore, the model without interaction terms was chosen as the most balanced and efficient model.

Fig. 1

Receiver operating characteristic curve showing discrimination of the clinical prediction model for maternal end-organ injury or death. Legend: Outcomes are those arising between 20 weeks’ gestation and 42 days after birth, using variables measured pre-pregnancy, and in the index pregnancy prior to 20 weeks’ gestation. Predictor variables and adjusted odds ratios are shown in Table S3. Analysed is the entire cohort of 634,290 births. C-statistic for Area Under the Curve = 0.68 (95% CI 0.66–0.69)

All-cause mortality from birth until 365 days postpartum occurred in 194 women over the study time period (0.3 per 1000). The final multivariable model for all-cause mortality no longer retained world region of origin, parity, previous spontaneous abortion, and several medical comorbidities (Table S4). Major psychiatric conditions and alcohol and substance use newly emerged as predictors. The corresponding C-statistic was 0.70 (95% CI 0.66–0.74) (Figure S2). However, this model was slightly over-fitted, and the optimism-corrected C-statistic was 0.67 (95% CI 0.63–0.71).

The risk classification table for the main model, which divides the cohort into the five categories of predicted risk of acute end-organ injury or death (Table 3), demonstrated the capacity of this model to classify women who are at very low risk (−LR 0.41, 95% CI 0.33–0.52) and those at very high risk of the outcome (+LR 8.58, 95% CI 7.32–10.05), but the model was less useful in classifying women in intermediate risk categories.

Table 3 Risk classification comparing predicted and observed risks of the outcome using five groups of predicted probability, and associated likelihood ratios in each group. Data are from main model predicting acute end organ injury or death from 20 weeks' gestation until 42 days after birth (n = 634,290)

Sub-cohort of parous women

In the sub-cohort of 333,435 parous women, the aforementioned variables remained significantly associated with end-organ injury or death, and an unplanned Caesarean delivery and severe organ injury in a previous birth were added as predictors (Table S7). The C-statistic was 0.61 (95% CI 0.59–0.63) when limited to variables from the index pregnancy (Fig. 2a), rising to 0.69 (95% CI 0.67–0.70) after adding pre-pregnancy predictors (Fig. 2b), and 0.71 (95% CI 0.69–0.73) when including the variables from a previous pregnancy (Fig. 2c). We noted minimal overfitting for each model, with optimism-corrected C-statistics of 0.60 (95% CI 0.58–0.62), 0.68 (95% CI 0.66–0.70), and 0.70 (95% CI 0.69–0.72), respectively.

Fig. 2

Receiver operating characteristic curve showing the discrimination of the clinical prediction model for maternal end-organ injury or death. Legend: Outcomes are those arising between 20 weeks’ gestation and 42 days after birth using variables measured in the index pregnancy prior to 20 weeks’ gestation (a); the index pregnancy prior to 20 weeks’ gestation and pre-pregnancy (b); the index pregnancy prior to 20 weeks’ gestation, pre-pregnancy, and in a previous pregnancy (c). Predictor variables and odds ratios are shown in Tables S5 (a), S6 (b), and S7 (c). Analysed is the cohort of 333,435 births among parous women. a C-statistic for Area Under the Curve = 0.61 (95% CI 0.59–0.63). b C-statistic for Area Under the Curve = 0.69 (95% CI 0.67–0.70). c C-statistic for Area Under the Curve = 0.71 (95% CI 0.69–0.73)

Model fit and calibration

Visual inspection of the calibration plots in the entire cohort suggested good agreement between observed and expected events for the primary outcome, with slightly worse calibration for mortality (Figure S3a-b). Among parous women, model calibration for maternal end-organ injury or death improved from the base model to models including variables measured pre-pregnancy and in a previous pregnancy (Figure S3c-e).

Discussion

Main findings

We have shown that a model based on variables available pre-pregnancy and in early pregnancy can moderately discriminate women destined for a severely morbid event or death from those likely to have uncomplicated pregnancies. Predictive variables retained in the final models included demographic, obstetric, and other medical risk factors. Notably, attendance at the first trimester visit with a care provider – a measure of good prenatal care – was inversely associated with the risk of SMM. Inclusion of prior pregnancy factors, which have not been incorporated in previous predictive models for SMM, further enhanced model performance, in keeping with the importance of clinical obstetrical history. Our model displayed good calibration, indicating that a combination of routinely measured pre-pregnancy and early pregnancy factors can estimate the absolute risk of acute end-organ injury or death with reasonable accuracy. Using this model effectively increased the probability of identifying a very high-risk woman with this outcome by 40%, and reduced the probability in someone considered very low-risk by 20% [33], but was less useful in classifying women in intermediate risk categories. This suggests that additional clinical, laboratory, or paraclinical factors are needed to accurately predict morbidity in all women, and further, that a certain proportion of these events are truly sudden and unpredictable.

Strengths and limitations

The models in this study relied on information that is routinely known at the time of the first antenatal visit, using variables that were temporally remote from when most maternal morbid events arise – largely around the time of birth [1]. Moreover, our source population comprised all pregnancies from gestational week 20. However, our datasets had few routinely collected clinical measures, such as blood pressure and haemoglobin or glucose concentrations, or first-trimester screening biomarkers. In the prediction of preterm preeclampsia, for example, a model that contained a combination of clinical and paraclinical variables (including placental biomarkers) performed better than with either set of variables in isolation [34]. BMI was incomplete in our dataset, as is common in most administrative data sources. However, the proportion in any given BMI category and those with missing values was not appreciably different among women with and without the outcome. Furthermore, there were few substantial differences in other baseline characteristics between those with missing vs. non-missing BMI (Table S8). Thus, while the contribution of BMI to the outcome may not have been well represented in our models, this is unlikely to have changed the overall model performance. In addition, clinical practice around identification and management of SMM has evolved over time. It is plausible that the strength of different risk factors for SMM may have changed across the study period as well (e.g., use of lower-risk IVF strategies). However, we used a constant definition for the study outcomes, and any changes in clinical practice patterns would not affect the internal validity of our models.

Prediction models are often used to estimate an individual’s absolute risk of a serious adverse event that might be mitigated with the use of a particular therapy, while avoiding subjecting individuals at low predicted risk to potential harmful effects of such therapy [35]. In obstetrics, serious adverse events are rare, with limited options for targeted prevention. We acknowledge, therefore, the limitations afforded by the C-statistic to discriminate between individuals with and without a rare adverse event, in which a high false positive rate might be justified [36]. The LRs add clinical meaning to the model and serve as a foundation for what might be considered reasonable predictability of rare but catastrophic obstetric events. The high LR of the model for women with very high predicted risk despite the rarity of the outcome in this group speaks to the potential utility of the model as a screening tool.

Interpretation

The models in this study relied on information that is routinely known at the time of the first antenatal visit, and that is temporally remote from when most morbid events arise – around the time of birth [1]. Our main model shows the potential utility of harnessing data in early pregnancy to predict a variety of later adverse maternal outcomes. Consistent with previous research on postnatal mortality [23], our model for all-cause mortality showed substance use, alcohol use, and psychiatric conditions to be significant predictors of death up to 365 days postpartum.

SMM rates have risen within Western nations, yet evidence-based strategies to reduce their burden are lacking [1]. Despite the possibility for early identification and prevention of some forms of SMM, current practice guidelines do not incorporate recommendations for prediction of severe morbidity, and use narrow crude definitions to identify such events [37]. Further refinement of clinical prediction models and the eventual development of a clinical risk calculator may help to inform early triage of women for enhanced surveillance or referral to subspecialty care or shared-care antenatal pathways – decisions that at present rely principally on clinical judgment. In developing and refining our study’s model in external cohorts, investigators should consider adding first-trimester placental biomarkers and other maternal biomarkers alongside routinely measured clinical variables, such as blood pressure and weight. The incorporation of such variables may facilitate prediction of the whole of severe morbidity as well as cause-specific outcomes, and better inform individualized and targeted prevention [38].

Conclusion

In conclusion, a model developed using pre-pregnancy and early pregnancy predictors available within administrative datasets showed moderate discrimination for maternal acute end-organ injury or death, and as such shows promise for the early clinical prediction of SMM. The addition of factors from a prior pregnancy among parous women slightly improved model performance. Enhancement of these models, using direct clinical measures, external validation, or machine learning, is needed.