Background

The evaluation of medical quality and performance has become an integral part of health care systems. Hundreds of indicators have been developed to assess the quality and performance of health facilities and health systems and to make inter-hospital comparisons. Cesarean section (c-section) rate is one of the most frequently used quality indicators to evaluate or compare obstetric departments. The utilization of c-section for reasons other than medical necessity [1–3] and the associated costs [4] make this procedure particularly subject to observation by policy makers and public health experts [5].

Hospitals and health systems are often evaluated based on their cesarean delivery rates, with the implicit assumption that lower rates reflect more appropriate, as well as more efficient, clinical practice.

Cesarean section rate has high face validity and is easy to measure because its predictors are reported in administrative data. However, the apparent simplicity of calculating the cesarean section rate can be deceptive. In fact, there is little consistency across performance measurement systems in the specifications of how it is calculated. There are differences in how the population is defined (i.e. who is included and excluded) and how the risk adjustment methodologies are applied [5].

Cesarean delivery is indicated for many clinical situations such as placenta previa, HIV and other risk factors [4, 6, 7] and the failure to account for such patient-specific risk factors may lead to biased comparisons. This may be particularly problematic for making inter-hospital comparisons, given the wide variations in socio-demographic and clinical characteristics of patients at different hospitals, and the referral bias that can result from differences in the availability of clinical services for women with high-risk pregnancies [8].

Risk adjustment is one technique to identify and control for potential confounders [9]. It is increasingly used in observational studies, and has already been used to compare c-section rates between hospitals that are homogeneous regarding their "a priori" risk of cesarean delivery. Various authors have demonstrated modest to poor agreement between hospital rankings based on unadjusted and adjusted c-section rates [5, 8, 10–12]. Only one author did not observe important differences in hospital rankings, suggesting that additional resources for complex data adjustment may not be warranted [6].

In most c-section studies, rates were adjusted by controlling for a large number of potential confounders, including socio-demographic, maternal and fetal clinical conditions, even if they were not actual confounders [13]. However, collecting many variables is onerous, is associated with problems of data completeness, accuracy, and reliability, and tends to reduce the precision of the adjusted measures [14]. Moreover, no consensus has been reached regarding which clinical, demographic, and/or hospital organizational factors should be considered actual confounders.

Only a few studies [1, 10, 15] of risk-adjustment methods for inter-hospital comparison of c-section rates have been conducted in Europe. We present one carried out in Emilia Romagna (Italy) to determine whether risk adjustment is necessary for inter-hospital comparison and to identify a risk-adjustment model based on a limited number of variables, in order to increase the precision of estimates without compromising validity.

Methods

Since 1995 in the Emilia Romagna Region of Northern Italy (RER), all hospital discharge abstracts have been electronically recorded using a Hospital Information System (HIS). The data stored in the system include demographics (ID number, gender, date and place of birth, place of residence), discharge ID, admission and discharge dates, up to 9 discharge diagnoses and 9 procedures (International Classification of Diseases, 9th Revision, Clinical Modification, ICD-IX-CM), ward(s) of hospitalization, date(s) of in-hospital transfer, and the regional code of the admitting facility.

We selected all hospital discharge abstracts for women in labor and for newborns from 36 maternity units in the region from January 2003 to December 2004.

This study takes as its sample live births for whom the discharge records for the mothers and infants were linked by hospital code, mother's discharge ID and date of delivery.

To identify deliveries, we used Diagnosis-Related Groups (DRGs) 370–375 from the discharge data. DRGs 370 and 371 (cesarean section with and without complication, respectively) were used to identify cesarean deliveries. ICD-IX-CM diagnosis code 654.2x was used to identify any previous cesarean deliveries [16]. The number of births from primary cesareans was calculated as the difference between the number of births from c-section deliveries and the number of births from c-section deliveries in women with previous cesareans.
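The case definition above can be illustrated with a minimal Python sketch. The record layout (a dictionary with `drg` and `diagnoses` keys), the dotted code format, and the sample records are assumptions made for illustration only; they do not describe the actual structure of the HIS database.

```python
# Hypothetical record layout: {"drg": int, "diagnoses": [str, ...]}.
# DRG and ICD code values come from the text; everything else is illustrative.
DELIVERY_DRGS = set(range(370, 376))   # DRGs 370-375 identify deliveries
CESAREAN_DRGS = {370, 371}             # cesarean with / without complication
PREVIOUS_CESAREAN_PREFIX = "654.2"     # ICD-IX-CM 654.2x (assumed dotted format)


def is_cesarean(record):
    """True if the discharge record is a cesarean delivery (DRG 370 or 371)."""
    return record["drg"] in CESAREAN_DRGS


def has_previous_cesarean(record):
    """True if any discharge diagnosis matches 654.2x."""
    return any(code.startswith(PREVIOUS_CESAREAN_PREFIX)
               for code in record["diagnoses"])


def count_primary_cesarean_births(records):
    """All cesarean births minus cesarean births to women with a previous cesarean."""
    deliveries = [r for r in records if r["drg"] in DELIVERY_DRGS]
    cesareans = sum(1 for r in deliveries if is_cesarean(r))
    repeat_cesareans = sum(1 for r in deliveries
                           if is_cesarean(r) and has_previous_cesarean(r))
    return cesareans - repeat_cesareans


# Made-up sample records, for illustration only.
sample_records = [
    {"drg": 370, "diagnoses": ["654.21"]},  # cesarean, previous cesarean
    {"drg": 371, "diagnoses": ["652.21"]},  # cesarean, no previous cesarean
    {"drg": 373, "diagnoses": ["650"]},     # non-cesarean delivery
    {"drg": 371, "diagnoses": []},          # cesarean, no previous cesarean
]
print(count_primary_cesarean_births(sample_records))  # 2
```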

Therefore, primary cesarean rates were calculated with the formula:

$$\frac{\text{Number of births from primary cesarean deliveries}}{\text{Number of births from deliveries with no previous c-section}} \times 100$$
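As a worked instance of this formula, the short Python sketch below applies it to the regional totals reported in the Results (15,197 primary cesarean births out of 62,836 births with no previous c-section). The function name is an illustrative choice, not part of the study's software.

```python
def primary_cesarean_rate(primary_cesarean_births, births_no_previous_cesarean):
    """Primary cesarean rate as a percentage of births with no previous c-section."""
    if births_no_previous_cesarean <= 0:
        raise ValueError("denominator must be a positive number of births")
    return 100.0 * primary_cesarean_births / births_no_previous_cesarean


# Regional totals for 2003-2004 as reported in the Results section.
rate = primary_cesarean_rate(15_197, 62_836)
print(f"{rate:.1f}%")  # 24.2%
```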

The study population comprised the 62,836 births from deliveries to women with no previous c-section, after excluding the following:

  • mothers under 11 and over 50 years of age

  • mothers discharged from hospitals without an operating room

  • infants with a birth weight under 550 or over 6000 g

Hospitals with fewer than 100 deliveries per year were excluded to ensure sufficient power for comparison.

The following socio-demographic variables, considered as potential risk factors for cesarean section, were collected: maternal age (<17, 18–20, 21–24, 25–28, 29–33, 34–38, ≥39), citizenship (Italian, from developing countries, from developed countries other than Italy), residency (RER or other), and marital status (married, divorced-separated, single, widow). Maternal and neonatal clinical factors were also retrieved. These factors were defined using the primary and secondary discharge diagnoses of the delivery and newborn admission [see Additional file 1].

We did not consider dystocia and fetal distress as potential risk factors because of the poor reliability of their definition and because these diagnoses may reflect post hoc justifications of cesarean use rather than objectively assessed conditions [17, 18].

The study was conducted in collaboration with the Azienda Ospedaliera Sant'Orsola-Malpighi, the teaching hospital of the University of Bologna, Italy.

Statistical analyses

Descriptive statistics and hospital-specific crude Odds Ratios (odds of c-section for patients admitted to a specific hospital vs. odds of c-section for patients admitted to the reference category) were reported.
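A crude odds ratio of this kind can be computed from a 2×2 table, as in the minimal Python sketch below. The counts are hypothetical, and the Woolf (log-based) 95% confidence interval is an assumption made for illustration; the text does not state which interval method was used.

```python
import math


def crude_odds_ratio(cs_hosp, no_cs_hosp, cs_ref, no_cs_ref, z=1.96):
    """Crude OR of c-section (hospital vs. reference) with a Woolf-type CI.

    The CI method is an illustrative assumption, not taken from the study.
    """
    odds_ratio = (cs_hosp * no_cs_ref) / (no_cs_hosp * cs_ref)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / cs_hosp + 1 / no_cs_hosp + 1 / cs_ref + 1 / no_cs_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 300 of 1,000 births by c-section in one hospital,
# 150 of 1,000 in the reference category.
or_, lo, hi = crude_odds_ratio(300, 700, 150, 850)
print(f"OR = {or_:.2f}; 95% CI: {lo:.2f}-{hi:.2f}")
```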

To take into account the role of confounders, two different logistic regression models were adopted: a "full" and a "parsimonious" model.

The "full" model was defined applying a backward selection procedure to a list of potential confounders selected according to available scientific evidence. All previously defined factors were entered and were retained if they were significant predictors of c-section (p < .05). Because of the large size of the database, an α of .05 was chosen to minimize the number of variables in the model and to maximize the strength of the association.

The "parsimonious" model was defined by applying a "change-in-estimate" procedure [19–21]. The first step of this method included the same factors entered in the full model and the exposure of interest (a specific hospital vs. the reference category). Subsequently, all factors that did not modify, or only slightly modified, the estimated effect of exposure were excluded from the model.

The "change-in-estimate" procedure identified the actual confounders for single comparisons and was repeated for each comparison (each hospital vs. reference), defining as many risk adjustment models as there were comparisons. All factors identified by at least one comparison were included in the "parsimonious" model.
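The logic of the change-in-estimate criterion can be sketched in Python for a single candidate factor and a single hospital-versus-reference comparison. This is a deliberately simplified stand-in: it uses a Mantel-Haenszel stratified OR instead of the logistic regression models used in the study, and the 10% threshold and all counts are assumptions chosen for illustration (the text does not report the threshold applied).

```python
def odds_ratio(a, b, c, d):
    """OR from counts: a/b = c-section yes/no in hospital, c/d = yes/no in reference."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel OR across strata of a candidate confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def change_in_estimate(strata, threshold=0.10):
    """Flag the stratifying factor as a confounder if adjusting for it changes
    the OR by at least `threshold` (relative to the crude OR).

    The 10% default is a common convention, assumed here for illustration.
    """
    totals = [sum(s[i] for s in strata) for i in range(4)]
    crude = odds_ratio(*totals)
    adjusted = mantel_haenszel_or(strata)
    relative_change = abs(adjusted - crude) / crude
    return relative_change >= threshold, crude, adjusted


# Hypothetical counts (hospital c-section yes, no; reference yes, no),
# stratified by presence / absence of a strong risk factor.
strata = [
    (90, 10, 27, 3),      # factor present: OR = 1 within stratum
    (100, 800, 120, 960), # factor absent:  OR = 1 within stratum
]
flag, crude, adjusted = change_in_estimate(strata)
print(f"crude OR = {crude:.2f}, adjusted OR = {adjusted:.2f}, confounder: {flag}")
```

In this made-up example the factor is more frequent in the hospital than in the reference category, so the crude OR is inflated and the factor would be retained; a factor that left the OR essentially unchanged would be dropped.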

The model's performance was evaluated based on how closely it predicted the results actually observed, following the criteria for discrimination (C statistics) and calibration (Hosmer-Lemeshow test). The differences in the predictive value of the two models were assessed using the Akaike Information Criterion [22] to augment the log likelihood ratio χ2 test, with a penalty for differences in the number of variables in the models compared.
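For readers unfamiliar with these measures, the following Python sketch shows how a C statistic and an AIC value are obtained. The outcomes, predicted probabilities, log likelihood and parameter count are invented for illustration, and the AIC is written in its standard form, 2k − 2 ln L; the Hosmer-Lemeshow test is not reproduced here.

```python
def c_statistic(outcomes, predicted):
    """Proportion of (event, non-event) pairs in which the event has the higher
    predicted probability; ties count as one half."""
    events = [p for y, p in zip(outcomes, predicted) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                score += 1.0
            elif p_event == p_non_event:
                score += 0.5
    return score / (len(events) * len(non_events))


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: penalizes the fit by the number of parameters."""
    return 2 * n_parameters - 2 * log_likelihood


# Invented data: 1 = c-section, 0 = no c-section, with model-predicted probabilities.
outcomes = [1, 1, 0, 0, 1, 0]
predicted = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(f"C statistic = {c_statistic(outcomes, predicted):.3f}")  # 0.889
print(f"AIC = {aic(-100.0, 5):.1f}")                            # 210.0
```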

The reference category included hospitals with the lowest adjusted c-section rates based on the full model. This category was defined according to the following steps:

  1. Twenty-eight hospital dummies were added to the full model and the corresponding adjusted ORs were ranked. In this case the reference category was the hospital with the highest number of births.

  2. The four hospitals with the lowest adjusted ORs were selected as the reference category.

  3. Finally, 25 hospital dummies, representing the remaining 25 hospitals, were added to the full model and the estimated ORs were ranked. In this case the four hospitals selected as the reference category were used as the benchmark for evaluating hospital performance for c-section in this study.

The crude and adjusted ORs obtained by the two models were used to rank hospitals, and the consistency of rankings was assessed using Spearman's rank correlation coefficient.
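Spearman's coefficient is the Pearson correlation of the rank-transformed values. The Python sketch below shows the computation on hypothetical crude and adjusted ORs for five hospitals; the values are made up and do not come from the study tables.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ORs for five hospitals.
crude_ors = [1.2, 2.5, 0.9, 3.1, 1.8]
adjusted_ors = [1.5, 2.2, 1.1, 4.0, 2.6]
rho = spearman(crude_ors, adjusted_ors)
print(f"Spearman rho = {rho:.2f}")  # 0.90
```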

The statistical analysis was performed using SAS 8.2 (SAS Institute, Cary, NC, USA) and Stata 8.2 (College Station, TX, USA).

Results

Of a total of 62,836 births from deliveries with no previous c-sections in the RER during 2003–2004, 15,197 (24.2%) were births from primary cesarean deliveries.

Table 1 lists the 29 hospitals involved, and their cesarean delivery rates that ranged from 11.8% to 57.0%.

Table 1 Total number of births and of births from primary cesarean deliveries by hospital, RER, 2003–2004

Results from the multiple logistic regression models are listed in Table 2. Of the twenty-four significant variables included in the full model, the greatest adjusted ORs were found for malposition and malpresentation of fetus (OR = 155.5; 95% CI: 126.0–190.6), antepartum hemorrhage/abruptio placentae/placenta previa (OR = 75.1; 95% CI: 54.1–104.4), cord prolapse (OR = 70.3; 95% CI: 16.5–299.1) and HIV (OR = 28.3; 95% CI: 11.7–68.6). The following variables were not significant predictors after adjustment: pre-term delivery, premature rupture of membranes, Rh-isoimmunization and post-maturity or macrosomia.

Table 2 Crude primary c-section ORs by all variables and adjusted ORs by the variables included in the full and the parsimonious model

The change-in-estimate procedure identified seven variables that acted as confounders in at least one comparison: marital status, age of mother, infant birth weight, fetopelvic disproportion/excessive development of the infant, eclampsia or pre-eclampsia, antepartum hemorrhage/abruptio placentae/placenta previa, and malposition and malpresentation of fetus. These factors were included in the parsimonious model; antepartum hemorrhage/abruptio placentae/placenta previa and malposition and malpresentation of fetus had the greatest adjusted ORs.

Table 3 reports the number of comparisons in which each variable was identified as a confounder by the change-in-estimate procedure.

Table 3 Variables identified as confounders and occurrence of their confounding effect on total comparisons

The discrimination capacity was high for both models (C statistic 0.78 for the full model; 0.73 for the parsimonious model), and the Hosmer-Lemeshow statistic showed poorer calibration for the parsimonious than for the full model (H-L = 24.76, p = 0.002 and H-L = 7.32, p = 0.503, respectively). The AIC statistic was similar in the two models (AIC = 52,195.9 for the full model; AIC = 54,803.4 for the parsimonious model).

Table 4 reports crude and adjusted C-section ORs for hospitals and p-values by the full and the parsimonious adjustment model.

Table 4 C-section ORs and 95% CI by hospital, obtained by crude analyses, the full and the parsimonious adjustment model

Hospital W had the highest c-section rate, whether based on crude ORs (OR = 5.79; 95% CI: 4.83–6.94) or on adjusted ORs estimated by both models (OR = 8.77; 95% CI: 7.11–10.80 and OR = 8.55; 95% CI: 6.97–10.49, respectively). Adjusted ORs from the full model were greater than crude ORs in 24 out of 25 hospitals; adjusted ORs from the parsimonious model were likewise greater than the crude values in 24 out of 25 hospitals.

Hospital ranking using the crude and adjusted ORs is reported in Table 5.

Table 5 Hospital rank according to OR calculation methods: crude analyses, full and parsimonious model

Eighteen units had their rank change after adjustment by the full model: seven facilities moved 1 position, four moved 2 or 3 positions, seven moved 4–10 positions. The parsimonious model changed the ranking of 19 facilities: ten moved 1 position, four moved 2 or 3 positions, five moved 4–10 positions.

The correlation coefficient between the hospital ranking based on crude ORs and the ranking based on adjusted ORs was 0.87 for the full model and 0.80 for the parsimonious model. The correlation coefficient between the rankings based on the ORs adjusted by the two models was 0.92.

The ratios between upper and lower 95% confidence intervals for the hospital specific ORs estimated by the "parsimonious" model were lower than those obtained by the "full" model; the former improved the precision of the estimates.
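The precision measure used here, the ratio of the upper to the lower confidence limit, can be illustrated with the intervals reported above for Hospital W. The Python sketch below assumes that the two adjusted intervals in the text are listed in the order full model, parsimonious model.

```python
def ci_ratio(lower, upper):
    """Ratio of upper to lower confidence limit; smaller values mean higher precision."""
    if lower <= 0 or upper < lower:
        raise ValueError("expected 0 < lower <= upper")
    return upper / lower


# 95% CIs for Hospital W as reported in the Results
# (order of the two adjusted models is assumed: full, then parsimonious).
full_ratio = ci_ratio(7.11, 10.80)
parsimonious_ratio = ci_ratio(6.97, 10.49)
print(f"full: {full_ratio:.3f}, parsimonious: {parsimonious_ratio:.3f}")
# full: 1.519, parsimonious: 1.505
```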

Discussion

Our results indicate that risk adjustment, by removing the inherent bias associated with non-random allocation of deliveries, substantially changes inter-hospital comparisons.

After adjusting for heterogeneity of distribution of risk factors for c-section, the ranking of maternity units was substantially modified, with most hospitals registering higher adjusted than crude c-section rates.

In addition to adjusting for a large number of potential confounders, the specific goal of this study was to identify an efficient model that included only actual confounders of the comparison between hospitals. Including factors in a risk adjustment model that do not induce a relevant bias on the measure of association may reduce the precision of estimates. The change-in-estimate procedure is one method that improves the parsimony of the model and still results in precise estimates, by eliminating variables that are not actual confounders. To act as a confounder, a variable must be associated with the outcome of interest (i.e. c-section) and heterogeneously distributed between categories of exposure (i.e. hospitals). Among the 24 factors identified as c-section predictors by the full model, only seven were the actual confounders used in the parsimonious model. Five of the seven actual confounders were clinical conditions of the mother (antepartum hemorrhage/abruptio placentae/placenta previa, malposition and malpresentation of fetus, eclampsia or pre-eclampsia), the fetus (birth weight), or both (fetopelvic disproportion/excessive development of the infant); these factors and maternal age have already been recognized by previous studies as risk factors for c-section [7, 8, 23–28]. Marital status, the remaining actual confounder, could be a surrogate of unmeasured risk. It is beyond the scope of our study to discuss the possible reasons for the heterogeneous distribution of these risk factors across hospitals, which is likely to be related to selection factors at work in the health care system.

Two general categories of factors might explain the variation in primary cesarean section rates between hospitals: case mix and hospital performance. In Emilia Romagna, there was an increase in inter-hospital variability after adjusting for clinical case mix, confirming that differences in c-section rates are mainly due to non-clinical factors.

Although many authors [5, 11, 12, 18] advocate considering case mix when comparing c-section risk, the impact of risk adjustment on hospital comparisons and rankings differs between studies [5, 8, 28].

In addition to the mixed findings regarding the role of case mix in explaining inter-hospital variations, there are important methodological differences between studies. They are related to the source of data used (birth certificates, medical records and insurance claims), to criteria used to define c-section, to inclusion and exclusion criteria, to the final summary indicators (rates or ORs) produced, to the methods used for risk adjustment and to the variables controlled [17].

In this study we defined c-section based on DRGs. A previous study [5] showed high reliability among different methods used to calculate c-section rates.

The highest adjusted ORs obtained for the clinical variables identified as actual confounders in the parsimonious model address an important issue regarding appropriate adjustment factors. Although one would think that risk factors for cesarean section would be consistent across studies, there is inconsistency in the risk factors included in the adjustment models [8, 11, 29–32]. For example, factors like presentation other than vertical malposition or malpresentation, fetopelvic disproportion/excessive development, and placenta previa tend to indicate complicated pregnancies where c-section is often the only choice.

It is therefore possible that, instead of being adjusted for, complicated pregnancies should be evaluated separately from pregnancies without such complications [33].

This study, like many others, evaluated c-section performance using administrative discharge data. Problems in accuracy, completeness, and quality might differ from hospital to hospital. The potential for inconsistencies in the coding of discharge records challenges the accuracy of the assessment of the outcome and of the risk factors in both the study population and in other populations [12]. Errors in coding could have occurred, which would have resulted in subsequent errors in adjustment. Omissions of ICD codes identifying risk factors were more likely in the group without c-section, leading to an excess of risk adjustment. Nevertheless, discharge databases are widely available at the state and regional levels, and are already routinely used. Administrative data have proved to be an accurate source to monitor c-section rates and a reliable data source to adjust for risk factors [33–36]. Moreover, administrative data from Emilia Romagna are considered of good quality, especially when compared with those from other regions in Italy. Methods used to develop models based on administrative information have the potential to be generalized to other populations. However, any risk adjustment model should be considered time and population specific.

Another limitation of the study is the impossibility of including all possible clinical factors in the model; maternal parity, primiparity, fetal distress and dystocia, for example, are known risk factors for cesarean section [7, 37], but were not included in the model because the information was either unavailable, incomplete, or considered unreliable. As a result, c-section ORs in some hospitals, especially teaching and referral hospitals, might have been underestimated.

Conclusion

Risk adjustment is necessary to compare hospital c-section rates: it changes hospital rankings and highlights inappropriate use of the procedure in some hospitals. By adjusting only for actual confounders, valid and more precise estimates can be obtained.

Additional studies, including qualitative studies, are nevertheless recommended to identify which clinical and non-clinical factors can explain inter-hospital variability. These factors should be explored in order to address the inappropriate use of this procedure.