The use of population health data for health and health outcomes research is increasing. These routinely collected data may be administrative, surveillance, registry or vital statistics collections and have the common feature of including information on an entire population. However, concerns about the completeness of comorbidity information in the admission of interest (index record) have been raised as a limitation of using hospital discharge data for research [1]. One reason that comorbidity information is under-ascertained from hospital records is that only diagnoses affecting the current admission are required to be coded in the discharge summary, so unrelated chronic illnesses may not be recorded [2]. However, through record linkage it is possible to evaluate a patient's hospitalisation history in detail. Records belonging to the same individual can increasingly be longitudinally linked. The term that refers to identifying disease prevalence from health records that precede the record or event of interest is 'lookback' [3].

Using a longer lookback period for ascertaining a condition is likely to result in a higher proportion of subjects with the condition, but the effect of the condition may be reduced because the severity of the condition can vary depending on how recently it was identified [4]. Few studies have assessed the impacts of different lookback periods on ascertaining comorbidities, and almost all focused on the predictive performance of a comorbidity score in modelling of in-hospital or post-hospital mortality or readmission [3, 5–8]. Little is known about the most appropriate lookback period for ascertaining comorbidities with regard to disease prevalence and risk estimation, predictive ability and statistical modelling of other outcomes. This is especially true in pregnancy, which usually occurs among women who are relatively young and healthy. In Australia, 14% of female hospitalizations are related to pregnancy and childbirth. To date, lookback studies have been limited to older populations and the utility of the approach in pregnancy is unknown.

Worldwide, obstetric haemorrhage is a leading cause of maternal mortality and accounts for about 25% of all maternal deaths [9]. Increased rates of haemorrhage following childbirth have been observed in recent years in Australia, Canada, USA and Scotland [10]. Risk factors for obstetric haemorrhage include chronic diseases, advanced maternal age, obesity, cesarean section, multiple births, and induction and augmentation of labor [11–13]. Obstetric haemorrhage is therefore a suitable outcome to use for examining the effect of different lookback periods on ascertainment of risk factors and their prediction of subsequent outcome. In this study, we used longitudinally linked hospital discharge records to (1) assess impacts of different lookback periods on ascertainment of chronic disease history in pregnant women and (2) examine effects of increased ascertainment on modelling of risk factors for obstetric haemorrhage.


Study population and data sources

In the State of New South Wales (NSW), Australia, comprehensively linked perinatal population data were available from 1 July, 2000 to 31 December, 2006. Details of the record linkage were reported in a previous study [14]. For the current study we selected a population of pregnant women with five years of lookback and focused on women in their first pregnancy. Women with a previous pregnancy would have prior maternal admissions and might therefore have more opportunities for identification of chronic diseases in hospital data than women without a previous pregnancy. Study subjects included 55,002 women who had their first birth in NSW during 1 July, 2005 to 31 December, 2006. These women were identified from the NSW Midwives Data Collection ('birth data'). The birth data contain information on all births in NSW, including number of previous births, maternal health (including pre-existing hypertension), pregnancy, labour, delivery and perinatal outcomes. The birth data include information on live births or stillbirths of at least 20 weeks gestation or at least 400 grams birth weight.

The NSW Admitted Patient Data Collection ('hospital data') covers every inpatient admission in NSW, and includes demographic and episode-related data. Data from the medical records are coded according to the tenth revision of the International Classification of Diseases Australian Modification (ICD-10-AM) and the affiliated Australian Classification of Health Interventions [15]. Up to 20 diagnoses and 20 procedures were used for disease identification in this study. Figure 1 presents the selection procedure of study subjects and hospital records. This study was approved by the NSW Population and Health Services Research Ethics Committee.

Figure 1 Selection procedures for the study women and their hospital records.

Ascertainment of diseases

We selected eight chronic diseases for the study: cardiac diseases, chronic renal disease, asthma/chronic obstructive pulmonary disease (COPD), psychiatric disorders, pre-existing hypertension, pre-existing diabetes, thyroid disorders and autoimmune diseases. The autoimmune diseases include Crohn's disease, ulcerative colitis, lupus, idiopathic thrombocytopenic purpura, multiple sclerosis, psoriasis, autoimmune thyroiditis, rheumatoid diseases, Coeliac disease, vasculitis, pernicious anemia, myasthenia gravis, autoimmune hepatitis, ankylosing spondylitis, polymyositis and primary biliary cirrhosis. There is some evidence suggesting increased risk of obstetric haemorrhage associated with cardiac disease, pre-existing hypertension, asthma and thyroid disorders [11, 16–18]. Based on biological plausibility, and since it is not clear that others have investigated potential associations with obstetric haemorrhage, renal disease, psychiatric disorders, diabetes and autoimmune diseases were also included as potential risk factors for haemorrhage. Given that there are relatively few population-based studies of risk factors for obstetric haemorrhage and that the chronic diseases are relatively rare among pregnant women, our large sample of pregnancies represented an ideal opportunity to investigate potential influence of other chronic diseases on haemorrhage risk. These diseases were chosen because their chronic nature means that they would still be present at the time of the birth regardless of the lookback period chosen.

The ICD-10-AM diagnosis and affiliated procedure codes for these chronic diseases are presented in Table 1 in the Appendix. Cardiac diseases with an acute onset, such as acute myocardial infarction, other acute ischemic heart diseases, acute pericarditis, acute and subacute endocarditis, acute myocarditis, cardiac arrest and heart failure, were excluded if first identified from the birth admission records. This is because the identified diseases may be the complications of pregnancy and it is only appropriate to include diseases that are present before the birth admission. We used maternal hypertension information that was recorded in either the birth data or the hospital data at birth to improve ascertainment of this condition [19].

Table 1 ICD-10-AM diagnosis and procedure codes for the eight chronic diseases.

In this study, obstetric haemorrhage (referred to as 'haemorrhage') was identified from maternal hospital records for the birth admission and any associated transfer to another hospital prior to discharge home. A case of haemorrhage was determined if a record had any diagnosis code for postpartum haemorrhage (O72), intrapartum haemorrhage (O67), placenta previa with haemorrhage (O44.1), antepartum haemorrhage (O46.0), morbidly adherent placenta (O43.2), transfusion (Z51.3) or acute post-haemorrhage anaemia (D62); any procedure code for transfusions (13706-01, 13706-02, 13706-03, 92061-00 or 92062-00); or, in the case of vaginal birth, any procedure code for manual removal of placenta (90482-00 or 90483-00).
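
As an illustration only, the case definition above can be sketched in Python. The function and variable names are hypothetical, and matching the three-character categories (O72, O67, D62) by prefix, so that their subcodes also count, is an assumption about how the codes are stored rather than a detail reported for this study.

```python
# Diagnosis codes from the case definition; prefix matching is an assumption.
HAEMORRHAGE_DIAGNOSES = ("O72", "O67", "O44.1", "O46.0", "O43.2", "Z51.3", "D62")
TRANSFUSION_PROCEDURES = {"13706-01", "13706-02", "13706-03", "92061-00", "92062-00"}
MANUAL_REMOVAL_PROCEDURES = {"90482-00", "90483-00"}


def is_haemorrhage(diagnoses, procedures, vaginal_birth):
    """Return True if a single hospital record meets the haemorrhage definition."""
    # Any qualifying diagnosis code is sufficient.
    if any(code.startswith(HAEMORRHAGE_DIAGNOSES) for code in diagnoses):
        return True
    # Any transfusion procedure is sufficient.
    if TRANSFUSION_PROCEDURES.intersection(procedures):
        return True
    # Manual removal of placenta counts only for vaginal births.
    return vaginal_birth and bool(MANUAL_REMOVAL_PROCEDURES.intersection(procedures))


def woman_has_haemorrhage(records):
    """A woman is a case if her birth record or any transfer record qualifies.

    `records` is a list of dicts with hypothetical keys
    "diagnoses", "procedures" and "vaginal_birth".
    """
    return any(
        is_haemorrhage(r["diagnoses"], r["procedures"], r["vaginal_birth"])
        for r in records
    )
```

Under these assumptions, a record coded O72.1 is a case regardless of mode of delivery, whereas a record whose only qualifying code is 90482-00 is a case only when the birth was vaginal.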

Data analysis

The proportion of women with the selected chronic disease was calculated for different lengths of lookback, with the longer lookback periods including all conditions reported in the shorter periods: 'Birth' - at birth admission (day 0), 'Pregnancy' - from day 0 back to the estimated 1st day of pregnancy, '2 years' - from day 0 back to 2 years, '3 years' - from day 0 back to 3 years, '4 years' - from day 0 back to 4 years and '5 years' - from day 0 back to 5 years. The first day of pregnancy was estimated as the baby's date of birth minus 7 × gestational age in weeks (range: 18 to 44 weeks), as recorded in the birth record. Potential risk factors for obstetric haemorrhage such as type of hospital, baby's gender, birth weight, multiple birth, gestational age, maternal age and combination of onset of labour and mode of delivery were obtained from the birth record where they are reliably reported [20].
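
The nested lookback windows can be sketched as follows. This is a minimal illustration, not the code used in the study: all function names are hypothetical, a "year" is taken to mean the same calendar day in an earlier year (an assumption), and the 'Birth' window is approximated by admissions dated on day 0.

```python
from datetime import date, timedelta

PERIODS = ["Birth", "Pregnancy", "2 years", "3 years", "4 years", "5 years"]


def years_before(day, years):
    """Same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def window_start(period, birth_admission, baby_dob, gestation_weeks):
    """Earliest admission date covered by a named lookback period."""
    if period == "Birth":
        return birth_admission
    if period == "Pregnancy":
        # Estimated first day of pregnancy: date of birth minus 7 x gestational age.
        return baby_dob - timedelta(days=7 * gestation_weeks)
    return years_before(birth_admission, int(period.split()[0]))


def ascertained(admissions, birth_admission, baby_dob, gestation_weeks):
    """Map each period to True if the disease is recorded within that window.

    `admissions` is a list of (admission_date, has_disease) pairs for one woman.
    """
    flags = {}
    for period in PERIODS:
        start = window_start(period, birth_admission, baby_dob, gestation_weeks)
        flags[period] = any(
            has_disease and start <= admitted <= birth_admission
            for admitted, has_disease in admissions
        )
    return flags
```

Because every window ends at day 0 and each longer window starts earlier, a condition flagged in a shorter period is automatically flagged in all longer ones, which is the cumulative behaviour described above.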

Logistic regression was employed to determine the effect size (odds ratio) of a potential risk factor on haemorrhage after adjusting for maternal age. In the selection of independent risk factors, age was always retained in the model and a backwards elimination approach was used to progressively remove the least significant term until all remaining terms were significant (P < 0.05, two-sided). The capacity of a model to predict haemorrhage was evaluated using the area under the receiver-operating characteristic (ROC) curve (C-statistic), with a value of 1.0 representing perfect ability and 0.5 indicating no better ability than chance. For comparing correlated C-statistics, we used the %roc SAS® macro [21] (a nonparametric approach based on generalized U-statistics [22]).
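
To make the C-statistic concrete, the sketch below computes it directly from its pairwise definition: the proportion of (case, non-case) pairs in which the case has the higher predicted probability, with ties counted as one half. It is an illustrative pure-Python version with a hypothetical function name. It does not reproduce the %roc macro, and it does not perform the test for comparing correlated C-statistics.

```python
def c_statistic(predicted, outcome):
    """Area under the ROC curve from predicted probabilities and 0/1 outcomes."""
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    non_cases = [p for p, y in zip(predicted, outcome) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_non in non_cases:
            if p_case > p_non:
                concordant += 1.0   # case ranked above non-case
            elif p_case == p_non:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))
```

The double loop is quadratic in the number of records, so a rank-based formulation would be preferable for a cohort of this size; the pairwise form is shown only because it mirrors the interpretation given in the text.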


Of 55,002 women with a first birth from 1 July 2005 through 31 December 2006, 53,438 (97.2%) linked to a birth admission record and were included in the study (Figure 1). A total of 111,647 hospital records, comprising admissions in the five years prior to the birth admission (n = 58,209) and the birth admission records (n = 53,438), were used to ascertain chronic diseases for the 53,438 women, giving a median of two records per woman (range: 1 to 263; interquartile range: 1 to 3). Of the 53,438 women with a mean age of 28.8 (SD 5.7) years, 47.9% had one record (the birth admission), 22.2% had two records, 11.2% had three records, 6.2% had four records, 3.6% had five records and 9.0% had six or more records. Table 2 presents numbers of women and hospital records for each ascertainment period. In addition to the birth admissions, 1,831 women were transferred to another hospital prior to discharge from the hospital system, and both the birth record and the record subsequent to transfer were used for identifying haemorrhage (Figure 1). From these 55,269 records, 5,047 (9.4%) women were determined to have haemorrhage according to the definition of this study.

Table 2 Numbers of women and hospital records for each ascertainment period

The numbers of these chronic diseases ascertained from different lookback periods are presented in Table 3. In this sample, asthma/COPD was the disease with the highest prevalence (2.35%) while the thyroid disorders had the lowest prevalence (0.51%). The proportion of all cases that were ascertained from the birth admission records (= No. of cases from 'Birth' / No. of cases from '5 years') differed by disease from 17.8% for chronic renal disease to 82.0% for pre-existing diabetes. For all chronic diseases, the prevalence increased with increasing length of lookback period. However, the rate of the increase was much slower after 2 to 3 years than for the more recent periods. The additional (case) ascertainment from year 4 to year 5 (i.e. [number at year 5 - number at year 4] / number at year 4) was small, ranging from 0.5% for pre-existing hypertension to 6.8% for asthma/COPD.
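
The two ratios defined above can be written out as a short worked example. The counts below are made up for illustration and are not taken from Table 3; the function names are likewise hypothetical.

```python
def share_from_birth(cases):
    """Percentage of all 5-year cases already ascertained at the birth admission."""
    return 100.0 * cases["Birth"] / cases["5 years"]


def additional_ascertainment(cases, earlier, later):
    """Percentage increase in cumulative cases from one lookback period to the next."""
    return 100.0 * (cases[later] - cases[earlier]) / cases[earlier]


def prevalence(n_cases, n_women):
    """Prevalence as a percentage of the study population."""
    return 100.0 * n_cases / n_women


# Hypothetical cumulative case counts for one disease (NOT the study's data).
example = {"Birth": 50, "Pregnancy": 80, "2 years": 150,
           "3 years": 180, "4 years": 200, "5 years": 210}

print(round(share_from_birth(example), 1))
print(round(additional_ascertainment(example, "4 years", "5 years"), 1))
print(round(prevalence(example["5 years"], 53438), 2))
```

With these invented counts, moving from a four-year to a five-year lookback adds 10 cases to 200, an additional ascertainment of 5%.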

Table 3 Cumulative frequency and relative frequency of cases ascertained at different lookback periods and the prevalence of diseases for the 53,438 women by disease type

Based on birth admission records only and after adjusting for maternal age, significant risk factors for haemorrhage were cardiac disease (OR 1.55, 95% CI: 1.03 to 2.33; P = 0.04), chronic renal disease (OR 3.00, 95% CI: 1.71 to 5.26; P < 0.001) and psychiatric disorders (OR 1.67, 95% CI: 1.22 to 2.27; P = 0.001) (Table 4). However, cardiac disease was not statistically significant when we included cases ascertained from hospital records that were more than two years prior to delivery. For these three diseases, the effect size decreased consistently with increasing length of lookback period. For the other five diseases presented in Table 3, there was only a small change in the effect size from one lookback period to another despite the identification of more women with the condition. For example, for pre-existing diabetes age-adjusted OR based on lookback period of 'Birth', 'Pregnancy', '2 years', '3 years', '4 years' or '5 years' was 1.06, 1.12, 1.10, 1.15, 1.14 and 1.12 respectively; and for pre-existing hypertension was 1.22, 1.18, 1.17, 1.16, 1.15 and 1.14 respectively (Table 4).

Table 4 Age-adjusted odds ratio (OR) for potential predictors of obstetric haemorrhage for different lookback periods

In multivariate analysis, only chronic renal disease (adjusted OR 2.85, 95% CI: 1.58 to 5.12; P < 0.001) and psychiatric disorders (adjusted OR 1.48, 95% CI: 1.08 to 2.02; P = 0.01) were independent risk factors for haemorrhage using the birth admission records. These results were adjusted for type of hospital, baby's gender, birth weight, multiple birth, gestational age, maternal age and combination of onset of labor and mode of delivery (Table 5). This model had a C-statistic of 0.624. The predictive ability did not improve with any extension of the lookback period for ascertaining the two diseases (all P > 0.29), and the C-statistic remained about the same (0.624) for models using hospital records from lookback periods of 'Pregnancy', '2 years', '3 years', '4 years' and '5 years'. This was also the case when comparing to a model that included two variables (i.e. one for the birth admission and the other for the lookback period) for each of the two chronic diseases (C-statistic 0.624 vs 0.624, P = 0.95). Neither additional variable was statistically significant in the model (both P > 0.71).
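
The difference between the two ways of coding disease history can be sketched as below. This is an assumption-laden illustration: the names are hypothetical, and the lookback variable is assumed to indicate a diagnosis recorded in any admission before the birth admission, which the text does not state explicitly.

```python
DISEASES = ("chronic_renal_disease", "psychiatric_disorders")


def covariates(history, separate):
    """Build disease covariates for one woman.

    `history` maps a disease name to {"birth": bool, "lookback": bool}, where
    "birth" means recorded at the birth admission and "lookback" means recorded
    in an earlier admission (assumed definition).
    """
    row = {}
    for disease in DISEASES:
        seen = history.get(disease, {})
        at_birth = bool(seen.get("birth", False))
        earlier = bool(seen.get("lookback", False))
        if separate:
            # Two variables per disease: index admission and prior admissions.
            row[disease + "_birth"] = int(at_birth)
            row[disease + "_lookback"] = int(earlier)
        else:
            # One cumulative variable: disease recorded anywhere in the window.
            row[disease] = int(at_birth or earlier)
    return row
```

With `separate=False`, a woman whose renal disease was recorded only three years earlier is coded identically to one diagnosed at the birth admission; with `separate=True`, the model can assign the two situations different coefficients.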

Table 5 Independent risk factors of obstetric haemorrhage and C-statistics


This study showed that longer ascertainment periods resulted in improved identification of chronic disease history among pregnant women. Surprisingly, extension of the lookback period up to five years for chronic diseases did not increase the estimated risk effect of any predictor of haemorrhage, and contributed little to the performance of the haemorrhage predictive model. These results indicate that the effort of accessing previous hospital records to improve the completeness of comorbidity information is not always worthwhile.

As anticipated, the ascertainment rate of a chronic disease in this and other studies [6] increased progressively with increasing length of the lookback period. We hoped that a five-year ascertainment period for a chronic disease would give good estimation of the population prevalence in the study of young and generally healthy women. In this study, the population prevalence of chronic renal disease in young women in NSW was estimated to be around 0.7% based on a five-year ascertainment period. This is within the range of internationally reported prevalence (0.5 to 1.3%) [23–26]. The rate of 0.8% for cardiac diseases (mainly congenital heart disease in this population) in this study also appears to provide a good estimate of the population prevalence. Congenital heart disease occurs in approximately 1% of newborn babies worldwide [27] and about 80% of patients with such disease survive to adulthood [28]. The prevalence of 0.6% for pre-existing diabetes in this study is similar to the population prevalence of 0.7% in Australian women aged <45 years, 2004 to 2005 [29]. The rate of 0.51% for thyroid disorders in this study is similar to the estimated rate of clinical hypothyroidism or hyperthyroidism in the USA (0.43%), although the majority of thyroid disease is subclinical [30, 31].

However, our study indicated that the prevalence of some diseases (i.e. asthma and chronic hypertension) was under-estimated. This is likely to be related to the fact that hospital data only identifies diseases/conditions that require hospitalisation or that affect a hospital admission. Although lookback over 5 years increased the identification of asthma from 0.9% to 2.4%, this still represents poor identification of women with asthma. The National Health Survey 2004-05 reported that 13.5% of Australian women aged 15 to 45 years had asthma and 3% of the population had COPD (emphysema and/or bronchitis) [32]. Similarly a validation study of 1184 pregnant women in NSW reported the prevalence of asthma to be 12% in pregnancy and the sensitivity of the recording of asthma as a comorbidity during maternal birth admissions was only 12.3% [33]. The prevalence of chronic hypertension (1% with ≥2 years of lookback) is lower than the prevalence of antihypertensive drug use in 25 to 34 year olds in NSW in 1999 (1.4%), but 26.3% of pregnant women were < 25 years in our population [34]. Other limitations of using longitudinally linked hospital records included missing ascertainment periods (e.g. migration or admission to hospitals outside NSW) and outpatient data, the assumption of disease chronicity and changes in diagnostic criteria for a disease over time.

With regard to predictive ability, information from prior hospital admissions might not improve the capacity of a predictive model if it were simply used to increase the number of cases with a condition. In a study of 61,815 patients, Kim and Ahn [5] reported no significant improvement in the predictive capacity of in-hospital mortality of a model with 3-years inpatient comorbidity score (either Elixhauser or Charlson) compared to a model with 1-year inpatient comorbidity score. Zhang et al. [3] also reported that models for 1-year mortality prediction among elderly patients using 1-year inpatient Charlson score or 2-years inpatient Charlson score were almost identical. Extra cases identified from prior admissions might be less severe or at an early stage of the illness but are treated equally to the cases from the index admission in the analysis. This might explain the finding of no improvement in the statistical performance by this and other studies.

On the other hand, Zhang et al. [3] found increased predictive capacity if comorbidity information from year 1 and year 2 inpatient records for the Charlson score were entered separately into the model. Preen et al. [6] reported similar findings, and found that C-statistics for 1-year mortality prediction in medical patients and procedural patients were 0.892 and 0.917 respectively for a model with a comorbidity score of the index admission and increased to 0.900 and 0.923 respectively for a model with two comorbidity scores: one for the index admission and another for 5-year prior admissions. In another study of the contribution to model performance in predicting in-hospital mortality made by extra information from a 3-year lookback period, Stukenborg et al. [8] reported that comorbidity risk adjustment (either Deyo/Charlson or Elixhauser method) achieved the best performance in various groups of hospital patients when comorbidity information from the index and prior admissions were treated as separate covariates in a model. Nevertheless, they also concluded that ascertaining information from prior admissions provided little improvement in the explanatory power of risk adjustment methods. Using information from the index and prior admissions as independent indicators might allow the model to distinguish late-stage from early-stage cases because more severe cases were more likely to be ascertained more than once and thus produce some improvement in the statistical performance.

With regard to effect estimation, increasing the number of cases with a disease/condition by using information from prior hospital admissions could produce an effect size smaller than that estimated using only the index record (i.e. only severe or active cases). In this study we found that the more remote (in time) the hospitalisations in which a chronic disease was reported, the smaller the effect the disease had on haemorrhage. One explanation for this could be that conditions that were ascertained from previous hospital records might have been treated and well controlled or be less severe than conditions identified from the index records. The effect of a risk factor on a particular outcome is likely to be dependent not only on the risk factor but also its severity, and a more severe instance is more likely to be ascertained in recent records than in older records [35]. Little additional comorbidity information was gained in this generally young and healthy population by using a longer lookback period. Thus, this study indicates that the findings of lookback studies may not be generalisable between young and older populations.

Chronic renal disease (via anemia) and psychiatric disorders (via medication) may place women at increased risk of obstetric haemorrhage [36]. Pregnant women with chronic renal disease or treated psychiatric disorders which complicate the pregnancy or are associated with hospitalisation during the pregnancy should be considered to be at risk of haemorrhage and be treated accordingly.


A five-year ascertainment period for a chronic disease improves estimation of the population prevalence in a young and generally healthy population if the disease required treatment in hospital. On the other hand, diseases that do not require hospitalisation or cases with no obvious symptoms or in subclinical categories would usually not be picked up using longitudinally linked hospital records. In the case of haemorrhage prediction, comorbidity information from prior hospital admissions did little to improve the haemorrhage modelling. For estimating the effect size of a risk factor, the most appropriate lookback period should be determined by the study objective.