Background

Out-of-hospital cardiac arrest (OHCA) occurs at an incidence of 30.0 to 97.1 individuals per 100,000 people annually [1]. As OHCAs are frequently unexpected yet require immediate treatment, global rates of survival to hospital discharge after OHCA remain dismally low (approximately 8.8%) [2]. Patient characteristics such as age and co-morbidities affect the outcome [3, 4], but survival is also related to the individual components of the response to OHCA, such as bystander cardiopulmonary resuscitation (CPR) [5,6,7], early defibrillation [8], targeted temperature management [9, 10] and coronary catheterization [11].

The characteristics of OHCA differ between males and females [12]. Females are usually older [13] and have more co-morbidities (e.g., hypertension, diabetes, obesity) [14,15,16]. Females more often arrest in the privacy of their own home, and as a result their arrests tend to be witnessed less often [17]. One meta-analysis showed that 77% of females who arrest do so at home vs. 67% of males. Only 40% of OHCAs among females were witnessed vs. 47% among males [18]. In some studies females receive less bystander CPR [17, 19, 20], but this finding is inconsistent [21]. The interval from emergency medical services (EMS) dispatch to EMS–CPR or first rhythm capture may also be longer in females than in males [22]. Possibly as a result of all these factors, females present less often with shockable rhythms than do males [16,17,18, 23, 24].

It therefore seems unsurprising that, overall, females receive less advanced cardiac life support [25, 26]. Females less frequently undergo defibrillation, receive less epinephrine and are even less likely to undergo endotracheal intubation [18, 22, 27]. There is controversy as to whether females are more likely than males to have return of spontaneous circulation (ROSC) [28] or not [21, 27]. Even when ROSC occurs and the patient is brought to hospital, differences remain in post-resuscitation care, as females less frequently undergo coronary angiography [15, 27], percutaneous coronary intervention [16, 26] and other evidence-based interventions [26]. Despite these differences, studies comparing the outcomes of males and females after CPR have variously reported better, similar and worse survival for one sex compared with the other.

This systematic review was therefore conducted to investigate whether males and females with OHCA have different mortality rates or neurological outcomes after adjustment for confounding variables.

Methods

The study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations [29, 30] and was registered in the PROSPERO database prior to the study initiation (CRD 42021226050).

PICO question

Do adult females (P) after out-of-hospital cardiac arrest (I), compared to adult males after out-of-hospital cardiac arrest (C), have different survival rates and neurological outcomes (O)?

Search strategy

We systematically searched the PubMed, Embase and Web of Science databases (inception to 23-April-2022) for papers reporting outcomes at the time of hospital discharge in adult (age 16 years or older) males and females after out-of-hospital cardiac arrest. The search was performed three times to ensure a full and up-to-date review of the literature. No language restriction was applied during the search, but only studies written in English were subsequently included. In brief, we used keywords as exact phrases and subject headings according to database syntaxes with the help of an information specialist. The full search strategy is described in Additional file 1.

Eligibility and inclusion/exclusion criteria

We included papers describing randomised controlled trials, nonrandomised clinical trials, observational cohort studies or case series of adult humans with OHCA. Case reports, animal models and special populations (children, pregnant women) were excluded. Studies were included if sex differences in outcome were their primary study aim and if they evaluated at least one outcome of interest (survival to hospital discharge or 30-day survival). We used author definitions for OHCA. Only studies published after 1995 were included to account for changes after the 1993 declaration of the Council for International Organizations of Medical Sciences regarding inclusion of women in clinical trials.

Paper selection

The titles and abstracts of all records were screened independently and in duplicate by two of the authors (IL, AN) using the Covidence software tool. The papers selected were downloaded in full and they were reviewed independently and in duplicate by the same two authors to verify fulfilment of inclusion criteria. The reference lists of relevant articles were searched for additional potentially pertinent articles (i.e., snowballing method). All articles rated discrepantly by the two screening authors in the Covidence software were reviewed and discussed one by one by both authors and subsequently included only if both authors agreed on eligibility. In each stage disagreements were resolved by a third author (SE). Among papers reporting completely or partially overlapping data, we selected the paper displaying adjusted data and describing the greatest number of patients. Since many of the included studies presented data from national databases, the risk of overlapping data was high. To avoid such overlap, we excluded any study that presented data from a country whose national database was already being used in another included study on the same period (Additional file 2: Table S1). In this case, the study included was the one with adjusted data, and if more than one study from the same national database had adjusted data, the largest cohort was chosen. Although pre-planned, alternative prioritisation based on clinical sensibility due to poorer adjustment with more patients was never required.

The list of papers with overlapping populations is presented in Additional file 2. We planned to contact the corresponding authors of the screened studies in case questions arose regarding eligibility or data presentation but no such issues arose. The details of the inclusion/exclusion process are shown in the PRISMA diagram (Fig. 1).

Fig. 1

Flow chart of the included studies. *A table summarizing the overlapping databases is provided in Additional file 2: Table S1

Data extraction

Two authors (IL, AN) extracted the data in duplicate using a standardized data extraction form. Discrepancies in the extracted data were adjudicated by a third author (SE). The data extracted included study characteristics (e.g., source country, study type, single/multicentre), patient demographics (age, sex), medical background, treatments and outcomes. We also collected data on the type of adjusted analyses performed and the variables adjusted for. The final version of the database was validated by all the investigators involved in data collection (IL, SE, AN) and is available as Additional file 3.

Assessment of risk of bias

For the primary outcome, two of the authors (IL, MI) assessed the risk of bias (RoB) of the included studies independently and in duplicate using the ROBINS-I tool [31]. Disagreements over RoB were resolved by consensus or, if necessary, adjudicated by a third author (SE).

Outcomes

The main study outcome was the rate of adjusted survival to hospital discharge or, if not available, at 30 days. Secondary outcomes included the rates of (1) unadjusted survival to hospital discharge and (2) favourable neurological outcome at discharge. Favourable neurological outcome at discharge was defined according to the Cerebral Performance Category (CPC) scale, as this is the most commonly used tool for this outcome. CPC 1 and CPC 2 were considered favourable outcomes in our analysis, in line with the studies reporting neurological outcomes.

Certainty of the evidence

The certainty of the evidence (i.e., the overall effect estimates) was assessed for the primary outcome and the secondary outcomes using Grading of Recommendations Assessment, Development and Evaluation (GRADE) [32].

Statistical analysis

The population characteristics were described as weighted means and weighted standard deviation for continuous variables; and weighted means of percentages and weighted standard deviation from percentages for categorical variables. For adjusted analyses we used the generic inverse variance method to pool estimates and standard errors (SEs) as per Cochrane guidance [33, 34]. The results were reported as odds ratios (OR) with their 95% confidence intervals (CI) for dichotomous outcomes. A prediction interval (PI) was calculated for the primary outcome and for any outcome with an OR excluding the value of no difference. Meta-analyses were performed using adjusted estimates from multivariate models or propensity-matched cohorts for the mortality outcomes. ORs and CIs were transformed to natural log and SEs using standard formulas. Random effects models were used for all analyses.
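The transformation and pooling steps described above can be illustrated with a minimal sketch. The analyses in this review were run in R with the package meta; the Python below is only an illustration and not the authors' code. It assumes a DerSimonian-Laird estimate of the between-study variance (the estimator actually used is not stated here), and the function names and the three example studies are hypothetical.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_value, ci_low, ci_high):
    """Convert an odds ratio and its 95% CI to (log OR, standard error)."""
    log_or = math.log(or_value)
    # The CI spans 2 * z standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return log_or, se


def pool_random_effects(estimates):
    """Pool (log OR, SE) pairs by inverse variance.

    Assumption: tau^2 is estimated with the DerSimonian-Laird method.
    Returns (pooled log OR, its SE, tau^2).
    """
    y = [e[0] for e in estimates]
    w = [1.0 / e[1] ** 2 for e in estimates]  # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = [1.0 / (e[1] ** 2 + tau2) for e in estimates]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, se_pooled, tau2


# Hypothetical adjusted ORs (female vs. male) with 95% CIs; not taken from any study.
studies = [(0.90, 0.80, 1.01), (1.10, 0.95, 1.27), (0.98, 0.85, 1.13)]
pooled, se, tau2 = pool_random_effects([log_or_and_se(*s) for s in studies])
low = math.exp(pooled - Z_975 * se)
high = math.exp(pooled + Z_975 * se)
print(f"pooled OR {math.exp(pooled):.2f} [{low:.2f}-{high:.2f}], tau^2 = {tau2:.4f}")
```

Working on the log scale keeps the confidence interval symmetric around the estimate, which is why the standard error can be recovered from the width of the reported interval.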

We planned sensitivity analyses based on the RoB of the included studies.

Preplanned subgroup analyses were performed to study possible heterogeneity stemming from the number of centres (multi vs. single), study location (Europe, Asia, North America, other), population denominator (i.e., whether non-survivors to hospital admission were accounted for in the cohort or not), OHCA etiology, quality of adjustment variables.

We added a subgroup analysis based on the study timeframe after the literature search, owing to the broad range of years ultimately studied (31 years). The study timeframe was defined as the year of inclusion of the last patient in the cohort rather than the year of publication, since several studies were published many years after completion of patient inclusion. We also intended to perform an analysis separating before/after the 2015 guidelines, allowing for a 1-year implementation period. However, no study included patients strictly from 2016 onward; the most recent studies pooled data from both periods.

When effect size was attributable to a small number of studies in a subgroup (≤ 5) we calculated a pooled version of τ² to be used across all subgroups, thereby decreasing reliance on an imprecise estimate of between-study heterogeneity in one subgroup [35].

All P values were two-tailed. P values less than 0.05 were considered statistically significant. Statistical heterogeneity (i.e., chance variation between studies) was sought by visual inspection of forest plots and with the nonparametric Cochran’s Q test and the I² statistic [34]. Heterogeneity was considered likely if Q > df (degrees of freedom) and was considered confirmed if the P value was 0.10 or less. The possibility of small-study effects was first explored through inspection of a funnel plot. We planned to use the Harbord and Peters tests to investigate small-study effects, except in the case of significant heterogeneity between studies, where an arcsine test using Rücker's random effects was to be preferred, assuming no publication bias [36,37,38]. These tests were chosen over Egger’s test, given the dichotomous nature of the outcome of interest [39, 40].
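The heterogeneity statistics named above follow directly from the inverse-variance weights. The sketch below is illustrative only (the review used R with the package meta); the function name and example values are hypothetical. It computes Cochran's Q, its degrees of freedom and I², and applies the "Q > df" screening rule from the text. The P value for Q is not computed here because it requires a chi-square distribution function, which the Python standard library does not provide.

```python
def heterogeneity(estimates):
    """Return Cochran's Q, its degrees of freedom and I^2 (in %)
    for a list of (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2 is the share of Q in excess of its expectation under homogeneity (df).
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Hypothetical (log OR, SE) pairs; not taken from any study.
studies = [(-0.10, 0.06), (0.10, 0.07), (-0.02, 0.07), (0.25, 0.05)]
q, df, i2 = heterogeneity(studies)
likely = q > df  # screening rule described in the text
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%, heterogeneity likely: {likely}")
```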

All analyses were performed using R software (R Core Team 2013, R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/) with the package meta [41].

Results

A total of 5,423 studies were screened, of which 28 were included in the final analysis, corresponding to 1,931,123 patients (1,136,311 male and 794,812 female) (Fig. 1).

Included studies

All the studies identified were observational and all were retrospective analyses of data from cohorts tracked and documented in real time. Five studies were only published as poster presentations [42,43,44,45,46]. The data covered four continents (Europe, Asia, America and Oceania) over a period of 33 years (1988–2021). There were data from single centre studies [43, 45, 47,48,49], national registries [20, 21, 25, 44, 46, 50,51,52,53,54,55,56] or local emergency medical services registries (e.g., comprised of all OHCAs registered in a city, in several countries or in several countries over a period of time) [17, 42, 57,58,59,60,61,62,63,64,65].

All studies included only patients with OHCA. Thirteen studies included only OHCA of cardiac etiology [17, 21, 25, 42, 46, 48, 53, 56,57,58,59, 63, 65] and fifteen studies included OHCA of both cardiac and non-cardiac etiologies [43,44,45, 47, 49,50,51,52, 54, 55, 60,61,62, 64]. Among the latter, seven also included patients whose cardiac arrest was due to trauma [20, 49,50,51,52, 61, 62] and five did not specify whether traumatic cardiac arrests were excluded or not [43,44,45, 54, 55]. Twenty-two studies provided survival data from OHCA to discharge (or to day-30), while six studies excluded patients that did not survive to hospital admission, reporting data from hospital admission to discharge (or to day-30) only (Additional file 2: Table S2) [44, 45, 47, 48, 53, 55].

Baseline data

Females were older than males (weighted means, 71.6 ± 5.1 years vs. 67.4 ± 3.8 years, data from 18 studies), and their arrests were less likely to be witnessed (weighted means of percentages, 40.5 ± 8.7% vs. 46.0 ± 8.8%, data from 21 studies). Females were more likely to receive bystander CPR (weighted means of percentages, 46.6 ± 10.7% vs. 40.8 ± 8.9%, data from 18 studies) but were less likely than males to present with a shockable rhythm (weighted means of percentages, 8.4 ± 4.4% vs. 24.1 ± 12.3%, data from 23 studies).

Fewer females than males were treated with coronary angiography (weighted means of percentages, 17.8 ± 15.1% vs. 32.3 ± 20.7%, data from four studies), percutaneous coronary intervention (weighted means of percentages, 7.0 ± 7.6% vs. 14.8 ± 7.8%, data from six studies) or targeted temperature management (weighted means of percentages, 12.8 ± 12.2% vs. 17.6 ± 14.6%, data from five studies) (Table 1).

Table 1 Description of the included population

Risk of bias

High RoB was identified predominantly in two domains (Fig. 2): domain 1, which pertains to problems with the adjusted analyses (lack of adjustment and/or lack of important variables in the adjustment), and domain 2, which pertains to selective population inclusion. If only OHCA of cardiac origin or only survivors to hospital admission were included, the domain was rated as at least moderate. Nine studies were at serious RoB, while the remaining studies were at moderate RoB. The funnel plot was asymmetric, suggesting publication bias; however, this was not confirmed by Rücker’s test (p = 0.058).

Fig. 2

Risk of bias visualisation for the primary outcome: adjusted survival to discharge (or 30-day survival) after OHCA

Primary outcome

All the studies provided data on survival either to hospital discharge or to day-30 after the occurrence of OHCA. Twenty-one studies provided adjusted data on survival. The variables used for adjustment differed between studies (Table 2).

Table 2 Outcomes and variables used for adjustment

While unadjusted data showed a lower likelihood of survival in females than in males, with an OR of 0.68 [0.62–0.74], PI [0.43; 1.05] (I² = 97%), adjusted aggregated data showed no difference in survival between males and females, with an OR of 0.98 [0.92–1.05], PI [0.76; 1.35] (Fig. 3). As heterogeneity was high for our primary outcome (I² = 86%, Fig. 3), subgroup and meta-regression analyses were performed as preplanned. Subgroup meta-analyses based on the quality of the variables adjusted for, geographical location, etiology of OHCA, type of cohort, number of centres and population denominator (Additional file 4) and meta-regression analyses for the same variables (Additional file 4) did not reduce heterogeneity. The quality of the variables adjusted for was added as a post-hoc analysis, as this quality was very uneven. Sensitivity analyses omitting studies identified as outliers most affecting heterogeneity, studies that had included traumatic arrest as the cause of OHCA or studies assessed as having high RoB (Additional file 4) also did not reduce heterogeneity. We also tried omitting studies displaying data in a non-Utstein style, with no satisfactory results. Publication bias was also not found (Additional file 4). In other words, survival to hospital discharge or 30 days was not significantly associated with any of the variables that could be adjusted for when using published data (Table 3).

Fig. 3

Forest plot of adjusted data on male and female survival to discharge or to 30 days after OHCA

Table 3 Results of meta-analyses

The pre-planned strategy was to refrain from aggregating data if heterogeneity was high. After seeking clinical heterogeneity that might have explained the levels of statistical heterogeneity, we concluded that the latter stemmed from latent variables, and we decided to share the meta-analyses as supplemental data to make our reasoning and approach transparent to the reader.

Secondary outcomes

The meta-analyses performed for the secondary outcomes are reported in Additional file 4. Females had a lower likelihood of unadjusted survival than males, with an OR of 0.68 [0.62–0.74], PI [0.43; 1.05] (I² = 97%), and females also had a significantly lower likelihood of favourable neurological outcome than males, with an OR of 0.56 [0.49–0.66], PI [0.32; 0.98] (I² = 95%). This trend disappeared when the data were adjusted, with no difference between males and females in adjusted neurologically intact survival (OR 0.96 [0.83–1.10], PI [0.63; 1.46], I² = 84%).

We found high heterogeneity (ranging from I² = 83% to I² = 98%) for unadjusted survival to discharge/30 days, and both adjusted and unadjusted favourable neurological outcome, precluding any interpretation.

Certainty of the evidence

The results of the GRADE assessment with regard to primary and secondary outcomes are reported in e-Table 3. Certainty was rated as low for the estimated rates of adjusted and unadjusted survival to hospital discharge (or 30-day survival), and low for the adjusted rates of neurological outcomes.

Discussion

This analysis identified no difference between males and females in the adjusted rate of survival to hospital discharge or 30 days after OHCA. This finding is striking when compared to the major survival advantage for males in our unadjusted analysis. Some of the unadjusted difference in the outcomes of males and females may be explained by baseline factors in females that are less conducive to a successful outcome (i.e., older age, greater comorbidity, fewer witnessed arrests and fewer shockable rhythms). This finding is in line with the "gender paradox" described by Bougouin et al., wherein females have similar survival outcomes despite worse prognostic factors than males [18]. Adjustment for many of these factors seems to have corrected the initial imbalance. However, the ongoing heterogeneity in our meta-analyses for both the primary and secondary outcomes suggests prudence is still required before equal outcomes are assumed. Had some of our adjusted analyses shown less heterogeneity, this would have served as proof that ultimate survival is related to the factors studied. Our failure to eliminate or even reduce heterogeneity suggests the presence of latent factors that could still tilt the final balance in favor of better survival for either males or females. These latent factors could include disparities in post-resuscitation care, such as access to coronary angiography (CAG), percutaneous coronary intervention (PCI) and targeted temperature management (TTM), as suggested in our data. These elements were not included as variables for adjustment in many studies and may relate to upstream factors or inequities in the system of care. Post-resuscitation management has been reported to be more conservative in women than in men (less referral for cardiac interventions in particular) [12]. We found no sex differences in survival despite the fact that females have worse prognostic factors than males. Therefore, the sex-related risk–benefit ratios of specific treatments should undergo thoughtful reconsideration, as there may be unwarranted inequalities in the care we provide to our patients.

Three meta-analyses have been published on the topic of sex differences in outcomes following OHCA. One was published in 2015 and, therefore, required an update [18]. We identified and included fifteen papers published after the last search date of this study. The second was published more recently but suffers from several major flaws: its protocol was not registered, sex differences as primary and secondary outcomes were pooled, there was no division of single centre, national registry and emergency medical services data, changes over time were not studied and, most importantly, adjusted and unadjusted outcomes were not separated [66]. The third meta-analysis was performed on adjusted data, but it too was not registered and its main focus was on age as a confounder [13]. All of these meta-analyses reported a high degree of heterogeneity, as do we. The authors of one of these meta-analyses have put forward that differences in survival of males and females may be "a matter of education" (i.e., poorer outcomes in locations with greater inequality for females) [66]. We sought such effects in the subgroup analysis by geographical location and found none. Put together, all of these findings suggest the need to study variables outside those reported in the Utstein template. Our failure to elucidate the causes of heterogeneity suggests the presence of unstudied factors (e.g., in-hospital treatment and/or decision making). Ignoring this finding may perpetuate latent inequalities, perhaps even unrelated to sex.

We did find important selection bias introduced by use of a population denominator of survivors to hospital admission. While this bias seems to contribute little to the heterogeneity observed in relation to our PICO question, it may affect other studies related to survival outcomes. The omission of some cases is often understandable in emergency circumstances. However, our finding suggests there is a need to establish a reporting template that highlights inclusion bias when reporting outcomes after cardiopulmonary resuscitation.

The differences observed between males and females in this analysis may be explained by multiple factors. Females clearly have worse predisposing factors than males at the time of cardiac arrest. In some places poor female education may lead to neglect of risk factors and even late referral if symptoms occur pre-arrest. There may be intrinsic differences in physiological responses to ischemia–reperfusion mechanisms which require elucidation. Finally, post-resuscitation care may differ based on family request, professional concerns regarding the potential risk/benefit ratio and more.

Our study has several strengths. We prospectively registered the study protocol, performed analysis of adjusted data, assessed the RoB of the included studies and applied GRADE to assess the certainty of the evidence. However, any meta-analysis is only as good as the studies it includes. Most of the studies included in this review were judged to be at high RoB, and neither randomization nor blinding is an option with regard to the study question at hand. Analysis of retrospective studies (regardless of real time data collection) carries an inherent risk of confounding due to study design. Consequently, GRADE, which depends on the lowest quality of evidence for the clinically meaningful outcomes being studied, yielded only low certainty for our conclusion. This is further compounded by the fact that our strategy to handle statistical heterogeneity was unsuccessful. The internal and external validity of the studies included are low and the study populations are also very heterogeneous. Finally, although we studied adjusted data, the variables adjusted for differed between studies. Some studies did adjust their data for care management variables (speed of intervention, PCI, TTM), and there was a possibility of collinearity between these variables and sex, as differences in care can themselves be related to sex, leading to a risk of overfitting in our analysis. Sensitivity analyses excluding those studies did not lead to a significant change in results.

Conclusions

No difference was found between males and females in the adjusted rate of survival to hospital discharge or at 30 days after OHCA. Substantial heterogeneity and failure to elucidate its causes suggest the existence of undetermined factors that were not reported in the Utstein template or the underestimation of factors such as post-resuscitation care. This systematic review calls for further investigation of possible latent inequalities in some reports. Future studies should include data on post-resuscitation care (such as provision of CAG, PCI and TTM) as well as information that is currently not included in the Utstein template, such as end-of-life decisions and in-depth analysis of decision-making processes with regard to provision of organ and life support.