Birthweight is associated with clinical characteristics in people with recently diagnosed type 2 diabetes

Aims/hypothesis Low birthweight is a risk factor for type 2 diabetes but it is unknown whether low birthweight is associated with distinct clinical characteristics at disease onset. We examined whether a lower or higher birthweight in type 2 diabetes is associated with clinically relevant characteristics at disease onset.
Methods Midwife records were traced for 6866 individuals with type 2 diabetes in the Danish Centre for Strategic Research in Type 2 Diabetes (DD2) cohort. Using a cross-sectional design, we assessed age at diagnosis, anthropometric measures, comorbidities, medications, metabolic variables and family history of type 2 diabetes in individuals with the lowest 25% of birthweight (<3000 g) and highest 25% of birthweight (>3700 g), compared with a birthweight of 3000–3700 g as reference, using log-binomial and Poisson regression. Continuous relationships across the entire birthweight spectrum were assessed with linear and restricted cubic spline regression. Weighted polygenic scores (PS) for type 2 diabetes and birthweight were calculated to assess the impact of genetic predispositions.
Results Each 1000 g decrease in birthweight was associated with a 3.3 year (95% CI 2.9, 3.8) younger age of diabetes onset, 1.5 kg/m² (95% CI 1.2, 1.7) lower BMI and 3.9 cm (95% CI 3.3, 4.5) smaller waist circumference. Compared with the reference birthweight, a birthweight of <3000 g was associated with more overall comorbidity (prevalence ratio [PR] for Charlson Comorbidity Index Score ≥3 was 1.36 [95% CI 1.07, 1.73]), having a systolic BP ≥155 mmHg (PR 1.26 [95% CI 0.99, 1.59]), lower prevalence of diabetes-associated neurological disease, less likelihood of family history of type 2 diabetes, use of three or more glucose-lowering drugs (PR 1.33 [95% CI 1.06, 1.65]) and use of three or more antihypertensive drugs (PR 1.09 [95% CI 0.99, 1.20]). Clinically defined low birthweight (<2500 g) yielded stronger associations. Most associations between birthweight and clinical characteristics appeared linear, and a higher birthweight was associated with characteristics mirroring lower birthweight in opposite directions. Results were robust to adjustments for PS representing weighted genetic predisposition for type 2 diabetes and birthweight.
Conclusion/interpretation Despite younger age at diagnosis, and fewer individuals with obesity and family history of type 2 diabetes, a birthweight <3000 g was associated with more comorbidities, including a higher systolic BP, as well as with greater use of glucose-lowering and antihypertensive medications, in individuals with recently diagnosed type 2 diabetes.
Supplementary Information The online version contains peer-reviewed but unedited supplementary material available at 10.1007/s00125-023-05936-1.


Birthweight data from original midwife records: data documentation from the Danish National Archive to the DD2 research group
The Steno Diabetes Centre partnered with the Danish National Archive to ascertain birth information from original midwife records for the DD2 cohort participants. The majority of the following text has been directly copied (without linguistic edits) from text written in English by native Danish-speaking employees at the Danish National Archive, who in the current context are regarded as third parties.
To obtain precise birth information, three different sets of original midwife records were used. The original midwife, caring for a mother and infant, filled out the birth information on the day of the birth.

The birth data consist of:
• Birthweight
• Birth length
• Born at term: yes/no
• Twin: yes/no

Description of the cohort:
For the Danish National Archive to be able to retrieve the correct birth information for the cohort participants, the following criteria had to be fulfilled:
• Being born in the period from ~1920 to 1988
• Being born in Denmark (not including the Faroe Islands or Greenland)
• Being identifiable through the biological mother's name.
After a complete examination of all 9549 cohort participants in the Danish Civil Registration System, a total of 8896 individuals matched the criteria and could potentially be documented through relevant sources in the Danish National Archives collection. Those not meeting the criteria included:
• Individuals who were born after 1988
• Individuals who were born abroad
• Individuals without an official birthplace in the Danish Civil Registration System
• Individuals not identifiable through the biological mother's name.

Sources/methods of retrieving birth data in the Archives:
The following archive series were used:
• Midwife Records (Jordemoderprotokoller) -1977 [13]
• Birth Reviews (Fødselsanmeldelser) for Copenhagen -1977, Frederiksberg and Gentofte municipalities, including the Copenhagen district, 1953-1977 [14]
• Birth Reports (Fødselsindberetninger) / Reviews 1968-1988 [14]
The above archive series differ with respect to setup and data records, and where one archive does not document a birth, another archive may contain and supply the information. Births were registered most systematically in the period 1968-1988, and the Birth Reports are therefore the most accurate source for those years. For births before 1968, the Midwife Records were primarily used, with the support of the Birth Reviews.
In the proofreading phase, there was a particular focus on alignment of the data. In the early 1900s, weight in Denmark was most commonly recorded in pounds and length in inches. Pounds were converted to grams, with one pound equal to 500 grams [17]. In the few instances where inches were used, the length was converted and registered to the nearest complete cm [17].
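The unit alignment described above can be sketched as follows. This is a minimal illustration, not the archive's actual procedure. The pound-to-gram factor is taken from the text, but the inch-to-cm factor and the rounding rule are assumptions, because the text states neither, and the function names are hypothetical.

```python
GRAMS_PER_POUND = 500  # stated in the text: one pound equals 500 grams


def pounds_to_grams(pounds: float) -> float:
    """Convert a birthweight recorded in pounds to grams."""
    return pounds * GRAMS_PER_POUND


def inches_to_cm(inches: float, cm_per_inch: float = 2.54) -> int:
    """Convert a birth length in inches to whole centimetres.

    ASSUMPTION: the conversion factor is not given in the text; 2.54 is the
    modern international inch and may differ from the historical unit used
    in the records. "Nearest complete cm" is read here as ordinary rounding.
    """
    return round(inches * cm_per_inch)
```

For example, a record of 7.5 pounds corresponds to 3750 g under the stated factor.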
The question of "born at term" in the Midwife Records is associated with some uncertainty, as the accuracy of the information depends on the midwife's own judgement, and because the way of recording whether an individual was "born at term" varied and changed over time. The midwife records have one check box for "born at term" and one for "preterm", and "yes" should be written in only one of the boxes. If the midwife for some reason wrote "yes" in both boxes, the registration was counted as missing data. For babies registered as "preterm", midwives could estimate and write how early the child had been born. However, this information was often missing and/or given as a range (e.g., "from 1 to 4 weeks early", etc.), and we therefore could not use the data for exact gestational age. Instead, we constructed an indicator variable "born-at-term: yes/no", from all available information in the records.
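The check-box rule described above can be sketched as a small function. It covers only the two check boxes; the actual indicator was constructed "from all available information in the records", which is not specified further, so this is a simplified assumption. The function name and the treatment of two empty boxes as missing are hypothetical.

```python
from typing import Optional


def born_at_term(term_box: str, preterm_box: str) -> Optional[bool]:
    """Return True (term), False (preterm) or None (missing).

    A "yes" in both boxes is treated as missing, as stated in the text.
    ASSUMPTION: two empty boxes are also treated as missing here.
    """
    term = term_box.strip().lower() == "yes"
    preterm = preterm_box.strip().lower() == "yes"
    if term and preterm:
        return None  # contradictory registration, counted as missing
    if term:
        return True
    if preterm:
        return False
    return None  # nothing recorded in either box
```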

Results:
The Danish National Archives has birth information for 8363 individual DD2 patients (about 94%), with:
• 1186 individuals whose birth data could not be retrieved, and
• 192 incomplete sets of birth data.
In total, there were 8171 DD2 participants with complete birth data.
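The reported counts can be reconciled with a short calculation. The reading below is an assumption: the 1186 unretrieved records are taken relative to all 9549 participants, and the "about 94%" relative to the 8896 individuals who met the retrieval criteria. The variable names are illustrative.

```python
total_cohort = 9549        # all DD2 participants examined
met_criteria = 8896        # matched the retrieval criteria
not_retrieved = 1186       # birth data could not be retrieved
incomplete = 192           # incomplete sets of birth data

with_birth_info = total_cohort - not_retrieved   # 8363
complete = with_birth_info - incomplete          # 8171

# Share of eligible individuals with birth information (about 94%).
share_of_eligible = 100 * with_birth_info / met_criteria
print(with_birth_info, complete, round(share_of_eligible, 1))
```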

Control population
For every birth of a DD2 patient, we additionally extracted two random individual births from the same midwife information sheet (containing birth information for, on average, 6-8 different births), with the aim of establishing a control population. Because the controls were extracted from the same midwife record, they were matched on date of birth, individual midwife, and geographical location. This resulted in a total control population of 18,210 individuals, of whose records a total of 13,180 (72.3%) were proofread.
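The control extraction described above can be illustrated with a minimal sketch that draws two random births from the same sheet as the case. The function name, the identifiers, and the handling of sheets with fewer than two other births are assumptions; the text does not describe these details.

```python
import random
from typing import List, Optional


def sample_controls(sheet_births: List[str], case_id: str,
                    n_controls: int = 2,
                    rng: Optional[random.Random] = None) -> List[str]:
    """Draw controls at random from the other births on the same sheet.

    ASSUMPTION: if the sheet holds fewer than n_controls other births,
    all remaining births are returned.
    """
    rng = rng or random.Random()
    candidates = [b for b in sheet_births if b != case_id]
    if len(candidates) <= n_controls:
        return candidates
    return rng.sample(candidates, n_controls)


# Hypothetical sheet with seven births, one of which is the DD2 case.
sheet = ["b1", "b2", "b3", "case", "b5", "b6", "b7"]
controls = sample_controls(sheet, "case", rng=random.Random(1))
```

Sampling within a sheet is what yields the matching on date of birth, midwife, and location, since all births on one sheet share those attributes.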

Exploring linear and non-linear relationships.
Spline knots were placed at fixed quantiles of the predictor's marginal distribution [18,19]. Best models were chosen by visual inspection and lowest Akaike Information Criterion (AIC). If a linear regression model had a p value <0.05 and an AIC lower than or similar to that of the other models, the linear regression model was chosen, as it allows the most straightforward interpretation of coefficients. For logarithmically transformed outcomes, results are presented as the percentage change in the outcome per 1000 g change in birthweight.
Scatterplots are stratified by sex (male, female), family history of type 2 diabetes (number of reported relatives) and age at enrolment in categories. Scatterplots have been jittered, that is, a small amount of noise has been added, to prevent identification of individuals through outliers.
Default quantiles for knot placement of restricted cubic spline models:
Scatterplots of linear regression analysis: linear (a), 2-degree polynomial (b), 3-degree polynomial (c), and restricted cubic spline (d) regression plots.
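The steps above can be sketched in plain Python: knot placement at fixed quantiles, a restricted cubic spline basis, an AIC for Gaussian models, and the rule that prefers the linear model. Several elements are assumptions rather than the study's settings: the knot quantiles are the defaults commonly attributed to Harrell, the "similar AIC" tolerance of 2 is arbitrary, and the percentage change uses the exact back-transformation 100 × (exp(β) − 1). Visual inspection, which the authors also used, is not represented.

```python
import math

# ASSUMPTION: commonly cited default knot quantiles, not taken from the paper.
DEFAULT_KNOT_QUANTILES = {
    3: (0.10, 0.50, 0.90),
    4: (0.05, 0.35, 0.65, 0.95),
    5: (0.05, 0.275, 0.50, 0.725, 0.95),
}


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def knots_at_quantiles(values, n_knots):
    """Place knots at fixed quantiles of the predictor's distribution."""
    ordered = sorted(values)
    return [quantile(ordered, q) for q in DEFAULT_KNOT_QUANTILES[n_knots]]


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one x (k knots give k-1 terms).

    The basis is constrained to be linear beyond the outer knots.
    """
    t_last, t_prev = knots[-1], knots[-2]
    d = t_last - t_prev

    def cube(u):
        return max(u, 0.0) ** 3

    row = [x]
    for t_j in knots[:-2]:
        row.append(cube(x - t_j)
                   - cube(x - t_prev) * (t_last - t_j) / d
                   + cube(x - t_last) * (t_prev - t_j) / d)
    return row


def gaussian_aic(rss, n, n_params):
    """AIC of a least-squares fit, up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params


def choose_model(aic_by_model, linear_p_value, tol=2.0):
    """Prefer "linear" if significant and its AIC is lowest or within tol."""
    best = min(aic_by_model, key=aic_by_model.get)
    if linear_p_value < 0.05 and aic_by_model["linear"] <= aic_by_model[best] + tol:
        return "linear"
    return best


def percent_change_from_log_coef(beta):
    """Percentage change in the outcome per unit of x for a log outcome."""
    return (math.exp(beta) - 1) * 100
```

With birthweight expressed in units of 1000 g, a coefficient β on a log-transformed outcome then corresponds to a percentage change per 1000 g.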
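Jittering of the plotted points can be sketched as adding uniform noise to each value before plotting. The noise distribution and its width are assumptions, as the text does not report the settings used.

```python
import random
from typing import List, Optional


def jitter(values: List[float], amount: float,
           rng: Optional[random.Random] = None) -> List[float]:
    """Add uniform noise in [-amount, amount] to every plotted value.

    ASSUMPTION: uniform noise with a fixed half-width.
    """
    rng = rng or random.Random()
    return [v + rng.uniform(-amount, amount) for v in values]


# Hypothetical birthweights in grams, displaced by at most 25 g.
plotted = jitter([2450.0, 3300.0, 5100.0], amount=25.0, rng=random.Random(0))
```

Only the plotted coordinates are perturbed; the regression itself would be fitted to the unjittered data.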

Multivariate imputations by chained equations (MICE) model specification
We employed multivariate imputation by chained equations using the mice package in R [20].
The percentage of missing values across the dataset ranged from 0 to 57%, with an average of 13.75% missing values across all variables. Included below are plots showing the percentage of missing data per variable according to birthweight, age at enrolment, and sex. The plots show that the distributions of missing data according to birthweight categories, age at enrolment, and sex were similar, providing evidence against potential attrition bias.
Legend: Percentage of missing data of variables used in the study. Each birthweight group is in kilograms. Abbreviations: BMI = Body Mass Index, HDL = high-density lipoprotein, LDL = low-density lipoprotein, hsCRP = high-sensitivity C-reactive protein, HOMA2 = Homeostasis Model Assessment 2, Log = natural logarithm, T2D = type 2 diabetes. Comorbidities and medication do not have any missing data.
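The per-group missingness summarised in these plots can be computed as in the following sketch. The record layout (a list of dictionaries with `None` marking a missing value) and the field names are hypothetical.

```python
from typing import Dict, List, Optional


def percent_missing(records: List[dict], variable: str) -> float:
    """Percentage of records in which `variable` is missing (None)."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(variable) is None)
    return 100.0 * missing / len(records)


def percent_missing_by_group(records: List[dict], variable: str,
                             group_key: str) -> Dict[str, float]:
    """Percentage missing for `variable` within each level of `group_key`."""
    groups: Dict[str, List[dict]] = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r)
    return {g: percent_missing(rows, variable) for g, rows in groups.items()}


# Hypothetical records: birthweight group and one laboratory value.
data = [
    {"bw_group": "<3.0", "hscrp": 1.2},
    {"bw_group": "<3.0", "hscrp": None},
    {"bw_group": "3.0-3.7", "hscrp": 0.8},
    {"bw_group": "3.0-3.7", "hscrp": 2.1},
]
by_group = percent_missing_by_group(data, "hscrp", "bw_group")
```

Comparing these percentages across groups is the check the plots perform visually.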
We used multiple imputation to create and analyse 20 imputed datasets, with the number of iterations set to 10. Incomplete variables were imputed under fully conditional specification, using the default settings of the mice 3.14.0 package [20]. The parameters of substantive interest were estimated in each imputed dataset separately and combined using Rubin's rules.
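Pooling with Rubin's rules can be sketched as follows for a single parameter. This is an illustration in Python, not the authors' R code. For ratio measures such as prevalence ratios, pooling is usually done on the log scale, which is assumed here rather than stated in the text.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float],
               variances: List[float]) -> Tuple[float, float, float]:
    """Combine per-dataset estimates and their squared standard errors.

    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    q_bar = mean(estimates)              # pooled point estimate
    within = mean(variances)             # average within-imputation variance
    between = variance(estimates)        # between-imputation variance (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total, math.sqrt(total)


# Hypothetical estimates from three imputed datasets.
est, total_var, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

With 20 imputed datasets, as used in this study, the factor (1 + 1/m) equals 1.05.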
The following variables were used in the imputation model: age at enrolment, sex, place of enrolment, alcohol consumption, physical activity, weight, waist, height, BMI, waist-hip ratio, waist-height ratio, family history of type 2 diabetes, HOMA2-Beta, HOMA2 insulin sensitivity, C-peptide, blood glucose, hsCRP, birthweight, birth length, born-at-term status, total cholesterol, LDL cholesterol, HDL cholesterol, triglycerides, smoking status, systolic blood pressure, diastolic blood pressure, HbA1c, number of glucose-lowering drugs, type of glucose-lowering drugs, lipid-lowering drugs, antihypertensive medication, number of antihypertensive drugs, Charlson Comorbidity Index Score, macrovascular complications, microvascular complications, diabetes-associated neurological disease, diabetes-associated eye disease, diabetes-associated renal disease.
Variables were imputed by the following methods. For continuous variables, predictive mean matching was used; these variables were weight, waist, height, BMI, waist-hip ratio, waist-height ratio, HOMA2-Beta, HOMA2 insulin sensitivity, C-peptide, blood glucose, hsCRP, birth length, total cholesterol, HDL cholesterol, LDL cholesterol, triglycerides, systolic blood pressure, diastolic blood pressure, and HbA1c.
For binary data, logistic regression imputation was used; this applied to born-at-term status (only 49 individuals). For categorical data with more than two levels, polytomous regression imputation was used; this applied to smoking status.
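Predictive mean matching, used above for the continuous variables, can be illustrated with a deliberately simplified single-predictor sketch. It is not the algorithm implemented in mice, which uses all predictors in the imputation model and adds parameter uncertainty before matching. The function names are hypothetical, and the donor pool of five is chosen to mirror the mice default.

```python
import random
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pmm_impute(x: List[float], y: List[Optional[float]], donors: int = 5,
               rng: Optional[random.Random] = None) -> List[float]:
    """Fill missing y values with observed values from similar cases.

    For each missing y, the `donors` observed cases with the closest
    predicted value are found and one of their observed y values is drawn.
    """
    rng = rng or random.Random()
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    intercept, slope = fit_line([o[0] for o in observed],
                                [o[1] for o in observed])
    predicted = [(intercept + slope * xi, yi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = intercept + slope * xi
        pool = sorted(predicted, key=lambda p: abs(p[0] - target))[:donors]
        completed.append(rng.choice(pool)[1])
    return completed
```

Because imputed values are always drawn from observed ones, the method cannot produce implausible values such as a negative waist circumference.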
We combined the raw data variables of weight, height, waist, and hip measurements into BMI, waist-hip ratio, and waist-height ratio before the imputation procedure. This procedure was justified by our findings that inclusion of only the raw variables for hip, weight, height, and waist gave rise to imprecise imputations, as can be seen below in the density plots. Specifically, hip circumference gave rise to imprecise imputations (bimodal density plots); therefore, it was not included in the imputation model as a single raw variable. Overall, the observed and imputed values had similar distributions, as shown in the density plots below.
Distribution of variables to be either computed before or after imputations.
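The derived measures computed before imputation follow their standard definitions, sketched below. The function names, the units (kg and cm), and the rule that a missing raw value yields a missing derived value are assumptions about data handling that the text does not spell out.

```python
from typing import Optional


def bmi(weight_kg: Optional[float], height_cm: Optional[float]) -> Optional[float]:
    """Body mass index in kg/m2, or None if an input is missing."""
    if weight_kg is None or height_cm is None:
        return None
    return weight_kg / (height_cm / 100) ** 2


def waist_hip_ratio(waist_cm: Optional[float], hip_cm: Optional[float]) -> Optional[float]:
    """Waist circumference divided by hip circumference."""
    if waist_cm is None or hip_cm is None:
        return None
    return waist_cm / hip_cm


def waist_height_ratio(waist_cm: Optional[float], height_cm: Optional[float]) -> Optional[float]:
    """Waist circumference divided by height."""
    if waist_cm is None or height_cm is None:
        return None
    return waist_cm / height_cm
```

Computing the ratios first means hip circumference enters the imputation model only through the waist-hip ratio, consistent with the decision described above.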
During the entire DD2 enrolment period, the diagnosis of type 2 diabetes in routine clinical practice in Denmark has followed WHO criteria. Before 2012 this was primarily based on the oral glucose tolerance test (OGTT) or fasting plasma glucose measurements. Since 2012 it has primarily been based on glycated haemoglobin A1c (HbA1c) ≥48 mmol/mol (6.5%). No further diagnostic criteria have been applied in the DD2 project. Moreover, patients are eligible whether or not they have initiated glucose-lowering therapy at the time of DD2 cohort enrolment. Up to September 29, 2018, a "newly diagnosed" type 2 diabetes patient was principally defined in DD2 as a patient diagnosed later than 1 January 2009, with the recommendation to include only patients with a diabetes duration shorter than 1 year. Median diagnosed diabetes duration at enrolment in the total DD2 cohort is 1.3 years (interquartile range [IQR] 0.3-2.9 years).
The patient can give informed consent.
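The two units in which the HbA1c diagnostic threshold is quoted above (48 mmol/mol and 6.5%) are related by the IFCC-NGSP master equation. The sketch below shows the conversion only; it is added for orientation and is not part of the DD2 protocol, and the function names are illustrative.

```python
def ifcc_to_ngsp(mmol_per_mol: float) -> float:
    """Convert HbA1c from IFCC units (mmol/mol) to NGSP units (%)."""
    return 0.09148 * mmol_per_mol + 2.152


def ngsp_to_ifcc(percent: float) -> float:
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return (percent - 2.152) / 0.09148


print(round(ifcc_to_ngsp(48), 1))  # 6.5
print(round(ngsp_to_ifcc(6.5)))    # 48
```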

Inclusion criteria after September 29, 2018
After September 29, 2018, the first criterion was changed to "A diagnosis of type 2 diabetes made within the last 24 months"; this is not relevant for our study period. The remaining 2 criteria are unchanged.
Recruitment took place at all Danish outpatient clinics and at approximately 462 of the 1853 existing general practitioner clinics, through the courtesy of the clinicians at the recruiting sites.
Charlson comorbidity index score [9].
To assess the burden of comorbidities, the complete hospital contact history of each participant was obtained through linkage with the Danish National Patient Registry (DNPR), which contains discharge records from all Danish hospitalizations since 1977 and hospital outpatient visits since 1995, coded according to the International Classification of Diseases, 10th Edition [10]. We retrieved data on the 19 major comorbid disease categories included in the Charlson Comorbidity Index (CCI) [11,12], and computed a CCI score for each person (excluding diabetes, as it constituted the index disease of our study population).
Linkage to the Danish National Patient Registry.

Definitions and codes used in this study
Any macrovascular complications
Data on type 2 diabetes with macrovascular complications were obtained through linkage with the Danish National Patient Registry. The following ICD-10 and procedure codes were used: DI700, DI708, DI709, DN280, DI701
Any microvascular complications
Data on type 2 diabetes with microvascular complications were obtained through linkage with the Danish National Patient Registry. The following ICD-10 codes were used: DE104, DE114, DE124, DE134, DE144, DG590, DG632, DG603, DG609, DG618, DG619, DG620, DG621, DG622, DG628
Legend: Stepwise prevalence ratio adjustment. Lifestyle model = + alcohol, smoking status, physical activity. Abbreviations: BMI = body mass index, BP = blood pressure, HDL = high-density lipoprotein, LDL = low-density lipoprotein, hsCRP = high-sensitivity C-reactive protein, HOMA2 = Homeostasis Model Assessment 2, PR = prevalence ratio.
Legend: Stepwise adjustment for linear regression models. Coefficient = unit increase per 1000 g increase in birthweight. For logarithmic transformation, the coefficient represents percentage increase per 1000 g increase in birthweight. Abbreviations: BMI = body mass index, BP = blood pressure, HDL = high-density lipoprotein, LDL = low-density lipoprotein, hsCRP = high-sensitivity C-reactive protein, HOMA2 = Homeostasis Model Assessment 2, log = natural logarithm.
Legend: Prevalence ratios of birthweight and main outcomes adjusting for 1: base model (sex, age at enrolment and family history of type 2 diabetes); and 2: further adjusting for polygenic risk scores of type 2 diabetes and birthweight. Abbreviations: PRS = polygenic risk scores, BW = birthweight, aPR = adjusted prevalence ratio, 95% CI = 95% confidence interval, BMI = body mass index.
Legend: Linear regression models using polygenic risk scores for type 2 diabetes or birthweight with birthweight. Coefficient = grams increase per one standard deviation increase in polygenic risk scores. Models were calculated using PRS scaled and centred around the mean. Abbreviations: PRS = polygenic risk scores, CI = confidence intervals, SD = standard deviation.
Legend: Prevalence ratios, base model: sex, age at enrolment, and family history of type 2 diabetes. Abbreviations: BMI = body mass index, BP = blood pressure, HOMA2 = Homeostasis Model Assessment 2, aPR = adjusted prevalence ratio.
Legend: Prevalence ratios according to main analysis using birthweight (BW <3000 g and BW >3700 g) and conventional birthweights (low BW <2500 g and high BW >4500 g); adjusted for sex, age at enrolment, and family history of type 2 diabetes. Abbreviations: BMI = body mass index, BP = blood pressure, HOMA2 = Homeostasis Model Assessment 2, aPR = adjusted prevalence ratio.
Table 11: Alternative sex and born-at-term birthweight categories.