Data sources
This study used three record-linked data sources: (1) English national Hospital Episode Statistics (HES)—admitted patient care data for the whole of England, 1998–2012 [14]. These hospital record abstracts, routinely collected by the English national Health and Social Care Information Centre, contain details of every episode of admitted patient care (including day case care) occurring in English National Health Service (NHS) hospitals and NHS commissioned care in the private sector. The details in each record include date of admission and discharge, demographic information about the patient and the reasons for their admission to hospital, which include clinical diagnoses coded using the ICD (www.who.int/classifications/icd/en/). The English national HES first became linkable in 1998, with the collection of anonymised encrypted personal data items, and the most recent HES data provided by the national data provider to the Unit of Health-Care Epidemiology, Oxford University, was for 2012; (2) Maternity Hospital Episode Statistics (MHES) for the whole of England, 1998–2012 [15]. These are a subset of HES and are intended to cover every birth occurring in an NHS hospital or under NHS provision (including home deliveries). For each birth, there is a maternity record for the mother and a delivery record for the child. These are similar to regular HES records but, in addition to the usual information contained in HES, they include extra ‘tails’ of data that provide information about the mother’s characteristics during delivery and the child’s characteristics at birth. The maternity/delivery data items collected are described in detail in the HES Data Dictionary [16]; (3) National death registration data, 1998–2012. Death certification data are collected in England by the Office for National Statistics (ONS). 
Each death registration record contains demographic information about each deceased individual, date of death and diagnostic information about the cause of death, again coded using the ICD.
These three data resources have been linked together into a multipurpose mother–infant database at the Unit of Health-Care Epidemiology (UHCE), University of Oxford, such that each infant’s MHES record is linked to his or her successive records of hospitalisation and/or death in later life, as well as to the mother’s MHES record and her successive records of prior or subsequent hospitalisation and/or death. The UHCE has longstanding experience of linking routinely collected hospital admissions data to study individuals across time, specifically with the use of linked HES since its introduction and ONS death registration data, methods for which have been documented extensively elsewhere [17, 18]. The linkage of each infant’s birth record to its subsequent hospitalisation records (and any death record used in censoring on follow-up) was conducted by matching encrypted personal identifiers, which included HES-ID [19], NHS number (unique to each individual in England), date of birth and postcode. The mother–infant matching was achieved similarly using a mixture of deterministic and probabilistic methods (further information is provided in the electronic supplementary material [ESM] Methods).
The data resources were obtained for permitted use in this study and ethics approval was obtained from the Central and South Bristol Multi-Centre Research Ethics Committee (04/Q2006/176) for analysis of the record-linked data. Full access to the database was available for use in this study.
Study design and population
In total, 7,335,218 mother–infant pairs were identified through mother–infant linkage of MHES records from 1 April 1998 to 31 March 2012. These pairs were extracted from the database, along with any other HES and/or death records belonging to either the mother or the child that occurred during the same period. The ESM Table shows the number of linked pairs by financial year, referenced to birth registry data from the ONS (all references to years are financial years such that 1998 means 1 April 1998 to 31 March 1999). The linked data were analysed using a retrospective cohort study design to compare the rates of type 1 diabetes in children by birthweight, gestational age at birth and BFGA. Children born in 1998 were excluded from the analysis to allow sufficient prior history for a diagnosis of gestational diabetes to be recorded, and children born in 2011 were excluded to allow each child at least 1 year of follow-up. After restricting to live births only, the number of mother–infant pairs was reduced to 4,895,768 (97% of those excluded had unknown or unrecorded birth status). Multiples were excluded because their fetal growth patterns are known to be atypical. Children with missing values for either birthweight or gestational age at birth were excluded. Children with a recorded birthweight <500 g or >5499 g and/or gestational age at birth <30 weeks or >43 weeks were excluded because of implausibility/non-viability and because previous validation studies of MHES have revealed these values to be commonly erroneous [20]. These exclusions (see Fig. 1) brought the total number of mother–infant pairs to 3,834,405.
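The exclusion steps described above can be sketched as a single eligibility check. This is a minimal illustration only: the `BirthRecord` type and its field names are hypothetical and do not correspond to actual MHES data items, and the order in which the checks are applied here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BirthRecord:
    # Hypothetical field names, chosen for illustration only.
    live_birth: Optional[bool]      # None when birth status is unknown or unrecorded
    multiple: bool                  # True for twins, triplets, etc.
    birthweight_g: Optional[int]
    gestation_weeks: Optional[int]
    birth_year: int                 # financial year: 1998 means 1 April 1998 to 31 March 1999


def is_eligible(rec: BirthRecord) -> bool:
    """Return True if the record passes every exclusion rule described in the text."""
    if rec.birth_year in (1998, 2011):      # insufficient prior history / follow-up
        return False
    if rec.live_birth is not True:          # live births only (unknown status excluded)
        return False
    if rec.multiple:                        # multiples excluded
        return False
    if rec.birthweight_g is None or rec.gestation_weeks is None:
        return False
    if not 500 <= rec.birthweight_g <= 5499:
        return False
    if not 30 <= rec.gestation_weeks <= 43:
        return False
    return True
```

For example, `is_eligible(BirthRecord(True, False, 3400, 40, 2005))` passes, whereas the same record with a birthweight of 5500 g does not.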
Exposure variables
Birthweight (grams)
The most recent meta-analysis [8] of the association between birthweight and type 1 diabetes grouped children according to the following birthweight categories: <2500 g, 2500–2999 g, 3000–3499 g, 3500–3999 g and ≥4000 g, with 3000–3499 g taken as the reference category. The same approach was taken in the present study, although a more flexible approach to grouping was also adopted to explore the relationship.
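As an illustration of the categorisation above, the following sketch maps a birthweight in grams to one of the five groups. The function and constant names are hypothetical, and the labels use ASCII characters for convenience.

```python
from bisect import bisect_right

# Lower bounds of the second to fifth categories, in grams.
BIRTHWEIGHT_CUTS = [2500, 3000, 3500, 4000]
BIRTHWEIGHT_LABELS = ["<2500", "2500-2999", "3000-3499", "3500-3999", ">=4000"]
REFERENCE_CATEGORY = "3000-3499"


def birthweight_category(grams: int) -> str:
    """Return the birthweight category label for a weight in grams."""
    # bisect_right counts how many cut points are <= grams,
    # which is exactly the index of the matching label.
    return BIRTHWEIGHT_LABELS[bisect_right(BIRTHWEIGHT_CUTS, grams)]
```

Under these cut points a child weighing 2999 g falls in "2500-2999" and a child weighing 3000 g falls in the reference category.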
Gestational age at birth (completed weeks)
The most recent meta-analysis [11] of the association between preterm birth and type 1 diabetes defined preterm birth as less than 37 completed weeks of gestation (i.e. <259 days). This is also the internationally accepted definition of preterm (ICD-10 P07.3). Post-term pregnancy is internationally defined as pregnancy that has extended to or beyond 42 completed weeks of gestation (294 days) (ICD-10 P08.2). The definition of ‘term’ is debated but the American College of Obstetricians and Gynecologists Committee on Obstetric Practice and the Society for Maternal-Fetal Medicine recommend the following classifications [21]: preterm, <37 0/7 weeks; early term, 37 0/7 weeks through 38 6/7 weeks; full term, 39 0/7 weeks through 40 6/7 weeks; late term, 41 0/7 weeks through 41 6/7 weeks; and post-term, 42 0/7 weeks and beyond. The same groupings were used in the present study with ‘full term’ taken as the reference group.
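The relationship between days, completed weeks and the five gestational age groups can be sketched as follows. The function names are hypothetical; the thresholds are those stated above.

```python
def completed_weeks(gestation_days: int) -> int:
    """Convert gestation in days to completed weeks (e.g. 259 days -> 37 weeks)."""
    return gestation_days // 7


def gestational_age_category(weeks: int) -> str:
    """Classify completed weeks of gestation into the five groups listed in the text."""
    if weeks < 37:
        return "preterm"
    if weeks <= 38:
        return "early term"
    if weeks <= 40:
        return "full term"   # reference group
    if weeks == 41:
        return "late term"
    return "post-term"
```

With these definitions, 258 days corresponds to 36 completed weeks (preterm) and 294 days to 42 completed weeks (post-term), matching the day thresholds quoted above.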
BFGA
In previous studies BFGA has been measured in quintiles [10]. The same approach was taken in the present study. For each week of gestational age at birth, the children were grouped into quintiles of birthweight, so that each individual was coded between 1 and 5 with equal numbers of children in each quintile for each gestational week. This was done for male and female children separately, since boys are generally heavier for their gestational age than girls. A composite variable was then created, which brought together all of the data for the children in quintile 1, all of the data for the children in quintile 2 and so on. The same approach was taken to generate BFGA in deciles. While BFGA is a convenient way of summarising the effect of birthweight while simultaneously adjusting for gestational age at birth and sex, BFGA was not considered a substitute for looking at actual birthweight adjusted for gestational age in multivariable analyses.
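The grouping procedure described above might be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the record layout is hypothetical, and tied birthweights are split by sort order, which the text does not specify.

```python
from collections import defaultdict


def assign_bfga(records, n_groups=5):
    """Assign each child a BFGA group (1..n_groups) within its sex and gestational week.

    `records` is an iterable of (child_id, sex, gestation_weeks, birthweight_g)
    tuples (a hypothetical layout). Returns a dict mapping child_id to group.
    """
    strata = defaultdict(list)
    for child_id, sex, weeks, grams in records:
        strata[(sex, weeks)].append((grams, child_id))

    groups = {}
    for members in strata.values():
        members.sort()                      # lightest first within the stratum
        n = len(members)
        for rank, (_, child_id) in enumerate(members):
            # Integer arithmetic gives (near-)equal group sizes; ties are
            # broken by sort order, which is an assumption of this sketch.
            groups[child_id] = rank * n_groups // n + 1
    return groups
```

Calling `assign_bfga(records, n_groups=10)` gives the decile version. Because each child's group is computed within its own sex and gestational week, pooling all children with the same group number yields the composite variable.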
Potential confounders or effect modifiers included maternal age in years (grouped <25, 25–29, 30–34, 35–39, ≥40); maternal type 1 diabetes (ICD-10 codes E10 or O24.0); maternal obesity (E66); gestational diabetes (O24.4 or O24.9); infant sex; area deprivation based on the mother’s Index of Multiple Deprivation (IMD) rank (in quintiles); and Caesarean section (elective and emergency combined).
Follow-up and outcome measurement
Type 1 diabetes diagnoses were identified by searching each child’s subsequent HES records for ICD-10 diagnosis code E10 after the age of 9 months. Type 1 diabetes diagnosis before 9 months is extremely rare and any recorded type 1 diabetes diagnoses at this age, although coded as such, would almost certainly represent neonatal diabetes (ICD-10 P70.2) [2]. Date of entry to the study population for each infant was the 15th day of the month of their delivery discharge record (exact date of birth was not available from HES in compliance with data governance requirements). Since follow-up for all participants was measured from month of birth, cumulative follow-up time for each individual was approximately equivalent to age. Date of exit for each individual was the date of their earliest type 1 diabetes diagnosis record, if it occurred, otherwise date of death, if it occurred, otherwise the end of the follow-up period (31 March 2012).
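The entry and exit rules above can be illustrated with the sketch below. All function names are hypothetical. Two details are assumptions of this sketch rather than statements from the text: "after the age of 9 months" is interpreted as at least 9 completed months since the entry date, and follow-up in years is computed as days divided by 365.25.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

STUDY_END = date(2012, 3, 31)


def entry_date(delivery_discharge: date) -> date:
    """Entry is the 15th day of the month of the delivery discharge record."""
    return delivery_discharge.replace(day=15)


def completed_months(start: date, when: date) -> int:
    """Whole calendar months elapsed between start and when."""
    months = (when.year - start.year) * 12 + (when.month - start.month)
    if when.day < start.day:
        months -= 1
    return months


def follow_up(delivery_discharge: date,
              e10_dates: Iterable[date],
              death: Optional[date] = None) -> Tuple[date, date, bool, float]:
    """Return (entry, exit, had_event, years_of_follow_up) for one child."""
    entry = entry_date(delivery_discharge)
    # Keep only E10 records at or beyond 9 completed months of age (assumed reading).
    qualifying = [d for d in e10_dates if completed_months(entry, d) >= 9]
    if qualifying:
        exit_, event = min(qualifying), True
    elif death is not None:
        exit_, event = death, False
    else:
        exit_, event = STUDY_END, False
    years = (exit_ - entry).days / 365.25
    return entry, exit_, event, years
```

For a child discharged on 3 June 2005 with E10 records on 1 September 2005 and 15 June 2009, the first record is ignored (under 9 months of age) and follow-up ends at the second.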
Statistical analysis
The crude incidence rate (per 100,000 person-years) of type 1 diabetes was calculated for each category of birthweight, gestational age at birth, and BFGA in quintiles and deciles. Mantel–Haenszel adjusted rate ratios were calculated to control for each of the secondary independent variables in turn, and adjusted HRs were calculated using Cox’s proportional hazards models to compare the groups after multivariable adjustment. Where appropriate, trend tests across exposure groups were conducted by entering the categorical variables into the models as continuous terms and using the likelihood ratio test (LRT) to check that model fit was not compromised. The proportional hazards assumption was tested formally by splitting age–time at 4.5 years so that there were equal numbers of outcomes in each age–time period and then testing for interactions between the primary exposure variables and age–time.
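The two simplest quantities above, the crude incidence rate and the Mantel–Haenszel rate ratio for person-time data, can be illustrated with the sketch below. It uses the standard Mantel–Haenszel estimator for stratified person-time data; the function names are hypothetical, and the sketch is not the Stata code used for the analyses.

```python
def crude_rate_per_100k(events: int, person_years: float) -> float:
    """Crude incidence rate per 100,000 person-years."""
    return events / person_years * 100_000


def mantel_haenszel_rate_ratio(strata) -> float:
    """Mantel-Haenszel rate ratio across strata of a secondary variable.

    `strata` is an iterable of tuples
    (events_exposed, py_exposed, events_reference, py_reference),
    one per stratum (e.g. one per level of maternal age group).
    """
    numerator = 0.0
    denominator = 0.0
    for a, t1, b, t0 in strata:
        total = t1 + t0
        numerator += a * t0 / total     # exposed events weighted by reference person-time
        denominator += b * t1 / total   # reference events weighted by exposed person-time
    return numerator / denominator
```

With a single stratum the estimator reduces to the crude rate ratio. When the stratum-specific rate ratios are all equal, the pooled estimate equals that common value regardless of how person-time is distributed across strata.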
The strategy for building the Cox models was based on which other secondary independent variables had the strongest effect on the relationships between the exposure variables and type 1 diabetes (except for infant sex, which was considered an a priori confounder). Missing values were always dealt with in multivariable analyses by ensuring that any two models being compared contained the same observations.
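One way to ensure that two models being compared contain the same observations is to restrict both to the records with no missing values for any variable in the larger model. The sketch below assumes that approach; the function name and the dictionary-based record layout are hypothetical.

```python
def common_sample(rows, variables):
    """Keep only rows with a non-missing value (not None) for every listed variable."""
    return [row for row in rows if all(row.get(v) is not None for v in variables)]


# Hypothetical usage: both the smaller and the larger model would then be
# fitted to `sample`, so that a likelihood ratio test compares like with like.
rows = [
    {"id": 1, "sex": "M", "maternal_age": 31, "imd_quintile": 2},
    {"id": 2, "sex": "F", "maternal_age": None, "imd_quintile": 4},
    {"id": 3, "sex": "F", "maternal_age": 27},
]
sample = common_sample(rows, ["sex", "maternal_age", "imd_quintile"])
```

Here only the first record is retained, because the second has a missing maternal age and the third has no deprivation quintile.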
All analyses were performed using Stata/IC 13.1 for Windows (StataCorp, College Station, TX, USA).