Key Points for Decision Makers
Very preterm birth or very low birth weight status was associated with a significant difference in the Health Utilities Index Mark 3 multi-attribute utility score of − 0.06 (95% confidence interval − 0.08, − 0.04) in comparison to birth at term or at normal birthweight; this was not replicated for the Short Form 6D.
Impacted functional domains included vision, ambulation, dexterity and cognition.
Very preterm birth or very low birth weight status is associated with lower overall health-related quality of life in early adulthood, particularly in terms of physical and cognitive functioning.


Very preterm (VP; < 32 weeks gestation) birth or very low birthweight (VLBW; < 1500 g) is associated with increased mortality [1,2,3], adverse neurodevelopmental outcomes [4,5,6] and a greater socioeconomic disadvantage extending into early to mid-adulthood [7,8,9,10]. Prematurity is a growing public health concern as increasing preterm birth rates coupled with improvements in survival rates place increased pressures on healthcare budgets worldwide [11,12,13]. Furthermore, the long-term adverse psychological and economic consequences of preterm birth on families, such as the increased risk of anxiety and depression and increased out-of-pocket expenses, are well documented [14,15,16,17]. Policies and interventions to improve outcomes in such populations are needed and should include the consideration of health-related quality of life (HRQoL). Generic HRQoL measures are holistic constructs [18, 19] that are highly correlated with widely used health metrics including morbidity, mortality and healthcare costs, and have well-documented positive measurement properties for assessing health status particularly in the general population [20,21,22]. In particular, HRQoL measures that are accompanied by preference-based value sets generate utility scores that represent preferences for health states on a 0–1 cardinal scale where 0 represents being dead and 1 represents full health. Health utilities can inform health economic and policy evaluations [23,24,25] and are also helpful for understanding an individual’s current health status with respect to the particular attributes or dimensions considered within each instrument. Furthermore, HRQoL measures that are accompanied by preference-based value sets have become increasingly important in understanding daily impacts on individual functioning when included in clinical studies [26,27,28,29].

While prospective cohort studies have generally highlighted the adverse effects of VP birth on longer term health and developmental outcomes, there is limited and conflicting evidence about its impact on HRQoL outcomes in adulthood [7, 30]. Some studies suggest that it is associated with lower utility scores that persist into early adulthood [30,31,32,33,34], while others find no conclusive relationship [7, 8, 35,36,37,38]. Furthermore, it remains unclear whether, and to what degree, perinatal and early life factors are associated with HRQoL in adulthood.

Research into the long-term outcomes of individuals born VP/VLBW is subject to methodological challenges primarily owing to sample attrition in cohort studies over time [39], which often disproportionately results in the loss of participants from socioeconomically disadvantaged families [40,41,42], and in participants with more impaired outcomes [43]. In some cohort studies, this has led to relatively small samples and weak external validity of the study results [7]. To overcome the limitations associated with analyses restricted to single cohorts or studies with weak external validity, the use of an individual patient data analysis consolidated over several cohorts has advantages and, with harmonisation of data, allows a detailed examination of associations between VP/VLBW status and a range of outcomes. Previous research has advocated new studies that build upon current knowledge and exploit gains from larger samples that both strengthen statistical power and allow multivariable and subgroup analyses to be performed when data across multiple cohort studies are combined [7, 30]. Such research should provide much needed data to inform policy efforts around VP/VLBW status and its long-term consequences.

This study had the following aims: (a) to estimate the association between VP/VLBW status and preference-based HRQoL outcomes in early adulthood by pooling harmonised data from five prospective longitudinal birth cohort studies and (b) to identify the specific aspects of HRQoL in early adulthood that are associated with VP/VLBW status.


Study Cohorts

For inclusion in this analysis, prospective cohort studies were required to (1) have measured the HRQoL in adulthood (defined as age ≥ 18 years [44]) of individuals born VP/VLBW using preference-based measures; (2) have included a comparison control group of term-born or normal birthweight individuals; and (3) have contributed data to the Research on European Children and Adults Born Preterm (RECAP) Consortium, a database of cohorts of individuals born VP/VLBW. Eligible cohorts were identified by a recent systematic review of preference-based HRQoL outcomes following preterm birth or low birthweight [30]. The review identified seven cohorts that measured HRQoL in adulthood of individuals born VP/VLBW using preference-based measures, one of which (POPS) did not include a comparison control group of term-born or normal birthweight individuals [45], whilst data from the McMaster cohort [31] were not contributed to the RECAP platform. Five prospective cohort studies therefore contributed to the study: the Bavarian Longitudinal Study (BLS) [46], the Victorian Infant Collaborative Study (VICS) [47], the EPICure study [48], the New Zealand Very Low Birth Weight (NZ-VLBW) study [49] and the Norwegian University of Science and Technology (NTNU) Low Birth Weight in a Lifetime Perspective (NTNU LBW Life) study [50]. The included studies were primarily designed to investigate the associations of VP/VLBW status with health outcomes [51] and had received country-specific ethical reviews along with participants’ written informed consent in adulthood. This analysis used records from the start of data collection up to the earliest assessments in adulthood (BLS at 26 years, VICS at 18 years, EPICure at 19 years, NTNU LBW Life at 19 years and NZ-VLBW at 22–23 years). In addition, we undertook a further analysis that also included repeated HRQoL assessments at 23 and 28 years for the NTNU LBW Life and NZ-VLBW cohorts, respectively.

Table 1 details the background characteristics of the samples in each cohort, including eligibility criteria, age(s) of assessment in adulthood and the composition of control groups. Additional details for each study can be found in published research as follows: BLS [46], VICS [47], NTNU LBW Life [50], EPICure [48] and NZ-VLBW [49]. The process of harmonising data across RECAP cohorts involved the application of an identical set of definitions, scaling methods and categorisations of all variables included in the analyses. Dictionaries were developed to guide harmonisation of all variables of interest across studies.

Table 1 Background characteristics of cohorts

Outcome Measures

Participants’ perceptions of their current health status were assessed using at least one of the following self-report measures: Health Utilities Index Mark 3 (HUI3) [BLS, VICS and EPICure] and either the Short Form 12 (SF-12) or Short Form 36 (SF-36) (BLS, VICS, NZ-VLBW and NTNU LBW Life). The NZ-VLBW and NTNU LBW Life cohorts administered the SF-12/SF-36 at two different ages in adulthood. The BLS and VICS studies assessed HRQoL using both the HUI3 and SF-12 or SF-36 measures at one age only [46, 47].

The HUI3 comprises eight attributes: vision; hearing; speech; emotion; pain; ambulation; dexterity; and cognition [52,53,54]. Levels of function within each attribute were scored on a 5-point or 6-point scale with a range from normal/optimal function to severe impairment. Responses were mapped onto an eight-attribute health status vector and eight single attribute utility (SAU) scores were computed [52]. Responses to the HUI3 health status classification system were subsequently converted into multiplicative multi-attribute utility (MAU) scores using the Canadian algorithms [52,53,54,55]. HUI3 MAU scores range from −0.36 to 1.0, with −0.36 representing the worst possible HUI3 health state, 0 representing being dead, and 1.0 representing full health [54, 55]. Most HUI3 attributes reflect objective aspects of the health of the individual (vision, hearing, speech, ambulation, dexterity and cognition).
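As a rough illustration of the multiplicative scoring step, the sketch below combines eight attribute multipliers using the published HUI3 multiplicative form, u = 1.371 × (b1 × … × b8) − 0.371. The function name and the example values are illustrative assumptions; the level-specific multipliers must be taken from the Canadian scoring tables [52,53,54,55] and are not reproduced here, and this is not the scoring code used in the study.

```python
from math import prod

# Attribute names follow the eight HUI3 attributes listed in the text.
HUI3_ATTRIBUTES = ("vision", "hearing", "speech", "emotion",
                   "pain", "ambulation", "dexterity", "cognition")


def hui3_mau(multipliers):
    """Combine eight attribute multipliers into one HUI3 MAU score.

    `multipliers` maps each attribute name to its multiplier (1.0 for the
    best level). Real multipliers come from the published scoring tables.
    """
    missing = set(HUI3_ATTRIBUTES) - set(multipliers)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    product = prod(multipliers[name] for name in HUI3_ATTRIBUTES)
    return 1.371 * product - 0.371


# Best level on every attribute corresponds to full health.
full_health = {name: 1.0 for name in HUI3_ATTRIBUTES}
print(round(hui3_mau(full_health), 3))  # 1.0
```

Because the attributes enter as a product, a deficit in one attribute scales down the contribution of all the others, which is the sense in which the function is multiplicative rather than additive.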

The SF-36 health status assessment questionnaire was designed to describe HRQoL using 36 items and yields an eight-dimension health profile: physical functioning, physical role limitations, social functioning, bodily pain, general health, mental health, vitality and emotional role limitations. The SF-12 includes 12 of the 36 items from the SF-36 with an identical dimension structure. For each dimension, responses to the survey items are transformed into a scale from 0 to 100, where higher scores indicate better HRQoL. Responses to the SF-36/12 items were converted [56] into SF-6D MAU scores for each participant using the UK SF-6D utility algorithms [57]. The SF-6D algorithms also reduce the eight dimensions of the SF-36/12 to six by combining role limitations due to physical and emotional problems and omitting general health perceptions. The SF-6D MAU scores range between 0 and 1.0, with 0 representing being dead and 1.0 representing full health [57]. A minority of the SF-6D dimensions (physical functioning, role limitations) reflect physical dimensions, while the remaining dimensions (pain, social functioning, mental health, emotional and vitality) reflect more socio-emotional aspects of health.
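The 0 to 100 transformation mentioned above is conventionally a linear rescaling of a raw dimension score by its possible range. The following is a minimal sketch assuming that convention; the function name is hypothetical, the example range is illustrative only, and the scoring manuals remain the authority for item recoding and ranges.

```python
def rescale_0_100(raw, lowest, highest):
    """Linearly rescale a raw dimension score to 0-100 (higher = better)."""
    if highest <= lowest:
        raise ValueError("highest must exceed lowest")
    if not lowest <= raw <= highest:
        raise ValueError("raw score outside the possible range")
    return (raw - lowest) / (highest - lowest) * 100.0


# Illustrative dimension with 10 items, each scored 1-3 (raw range 10-30).
print(rescale_0_100(25, 10, 30))  # 75.0
```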

For the main analysis, we utilised HUI3 MAU scores and SF-6D MAU scores. However, additional analyses considered the following outcome variables: indicator variables that denote optimal levels of function across each health attribute or dimension, HUI3 SAU scores and SF-12 dimension scores. Appendix A of the Electronic Supplementary Material (ESM) provides additional details regarding the HUI3 and SF-6D measures as well as data on the strength of relationship between them in our study data. For both measures, differences in MAU scores equal to or greater than 0.04 were considered clinically significant [58, 59].
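The clinical-significance rule stated above amounts to comparing the absolute MAU difference with the 0.04 threshold. A small sketch follows; the helper name and the numerical tolerance are assumptions added here to guard against floating-point rounding at the boundary, and are not part of the study's methods.

```python
MID_THRESHOLD = 0.04  # minimal important difference in MAU scores [58, 59]


def is_clinically_significant(mau_difference, threshold=MID_THRESHOLD):
    """Return True when |difference| is equal to or greater than the threshold."""
    tolerance = 1e-9  # assumption: absorbs floating-point rounding error
    return abs(mau_difference) >= threshold - tolerance


print(is_clinically_significant(-0.06))  # True
print(is_clinically_significant(0.01))   # False
```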

Main Exposure

The main independent variable in this study was an indicator for VP or VLBW status, i.e. whether an individual was born < 32 weeks’ gestation or ≤ 1500 g. Additionally, we assessed the association of VP birth alone with HUI3 and SF-6D outcomes, regardless of birthweight status.


The additional independent variables incorporated into the analyses had previously been shown to be associated with the HRQoL of those born preterm or with low birthweight [34, 45]: sex (male as the referent), age at assessment (in years), and mother’s level of education, harmonised according to the International Standard Classification of Education (ISCED) into low (ISCED levels 0–2), medium (ISCED levels 3–5) and high (ISCED levels 6–8) [60], with low maternal education as the referent category in all models. We additionally accounted for cohort effects using indicator variables for each cohort.
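The maternal education grouping described above can be sketched as a simple mapping from ISCED level to category, followed by indicator coding with low education as the referent. The function and variable names are hypothetical and the snippet is an illustration, not the harmonisation code used by the RECAP cohorts.

```python
def maternal_education_category(isced_level):
    """Map an ISCED level (0-8) to the low/medium/high grouping in the text."""
    if not isinstance(isced_level, int) or not 0 <= isced_level <= 8:
        raise ValueError("ISCED level must be an integer from 0 to 8")
    if isced_level <= 2:
        return "low"
    if isced_level <= 5:
        return "medium"
    return "high"


def education_indicators(isced_level):
    """Indicator variables with low maternal education as the referent."""
    category = maternal_education_category(isced_level)
    return {"edu_medium": int(category == "medium"),
            "edu_high": int(category == "high")}


print(maternal_education_category(4))  # medium
print(education_indicators(7))         # {'edu_medium': 0, 'edu_high': 1}
```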

Empirical Analyses

Given the structure of the data, we combined individual participant data (IPD) across cohorts as follows: the HUI3 meta-cohort included data from BLS, VICS and EPICure; and the SF-6D meta-cohort comprised data from BLS, VICS, NZ-VLBW and NTNU LBW Life. We designed an empirical strategy to examine the association between VP/VLBW status and HRQoL outcomes in adulthood across combined cohorts using generalised mixed models in a one-step approach, estimated via logistic and linear probability models. The one-stage IPD analysis could be implemented using either a fixed-effects or a random-effects model [61].

In our study, cohort study participants were enrolled across different geographical regions, ages and time frames, which suggests the presence of systematic differences across cohorts (Table 1). Study participants also shared common characteristics that were fixed across geographic and socioeconomic dimensions caused by study-specific or country-specific factors. While this motivated the use of fixed-effects models, we performed formal tests to identify whether the IPD meta-analysis should be performed using fixed-effects or random-effects models. Details are provided in Appendix B of the ESM.
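To make the fixed-effects idea concrete, the sketch below estimates the exposure coefficient by within-cohort demeaning, which is algebraically equivalent to least squares on the exposure plus one indicator variable per cohort. It is a deliberately simplified illustration on synthetic data: it omits the covariates used in the study (sex, age at assessment and maternal education), the function name and data are hypothetical, and it is not the Stata code used for the analyses.

```python
from collections import defaultdict


def within_cohort_estimate(records):
    """Fixed-effects (within-cohort) estimate of the exposure coefficient.

    records: iterable of (cohort, exposed, score), with exposed coded 0/1.
    Demeaning within cohort removes cohort-level differences in baseline score.
    """
    by_cohort = defaultdict(list)
    for cohort, exposed, score in records:
        by_cohort[cohort].append((float(exposed), float(score)))

    numerator = denominator = 0.0
    for rows in by_cohort.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            numerator += (x - mean_x) * (y - mean_y)
            denominator += (x - mean_x) ** 2
    if denominator == 0.0:
        raise ValueError("no within-cohort variation in exposure")
    return numerator / denominator


# Synthetic data: cohort baselines differ, exposure shifts scores by -0.06.
data = ([("A", 0, 0.90)] * 3 + [("A", 1, 0.84)] * 2 +
        [("B", 0, 0.80)] * 2 + [("B", 1, 0.74)] * 4)
print(round(within_cohort_estimate(data), 2))  # -0.06
```

In the synthetic example the two cohorts have different baseline scores and different case-to-control ratios, yet the within-cohort estimate recovers the common shift because only contrasts inside each cohort contribute to it.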

For the main analysis, we used a fixed-effects one-stage IPD analysis, and the earliest available HRQoL assessments during adulthood for each cohort as described in Table 1, for each of the following outcomes: indicator variables that denoted optimal levels of function across each attribute or dimension for each outcome measure, HUI3 SAU and MAU scores, SF-12 dimension scores and SF-6D MAU scores. To further investigate the robustness of findings, we implemented the following robustness checks: (a) we utilised linear mixed models in a one-step approach, which modeled cohort effects as random effects and (b) we included all HRQoL assessments available for all participants in the longitudinal analyses, applying the fixed-effects and random-effects models. Analyses were performed using Stata version 16 and p-values of 0.05 or less were considered statistically significant.


Baseline Characteristics of Prospective Cohort Studies

Table 1 reports baseline sample characteristics of each cohort. Pooled data from the five cohorts totaled 873 VP/VLBW individuals and 694 controls for the HUI3 meta-cohort and 909 VP/VLBW individuals and 680 controls for the SF-6D meta-cohort. Years of birth ranged from 1985 to 1995, and mean age at assessment varied from 18 to 28 years. Table 2 describes the characteristics of VP/VLBW individuals and controls with non-missing HUI3 and SF-6D MAU scores. Within the HUI3 meta-cohort, cases were slightly younger at assessment compared with controls, and had lower levels of maternal educational attainment compared with controls. In contrast, these differences were not apparent in the SF-6D meta-cohort but more cases in this cohort tended to have non-Caucasian ethnicity.

Table 2 Characteristics of VP/VLBW individuals and controls within HUI3 and SF-6D meta-cohorts

IPD Meta-Analysis

Table 3 shows the results from logistic regression models for optimal levels of function and linear probability models for SAU scores within the HUI3 meta-cohort. Analogous estimates for the SF-6D meta-cohort are presented in Table 4. Results from the one-stage meta-analysis presented in Table 3 demonstrate that VP/VLBW status was associated with sub-optimal levels of function and lower SAU scores for the following HUI3 attributes: vision, speech, emotion, ambulation, dexterity and cognition. Within the SF-6D meta-cohort, VP/VLBW status was associated with sub-optimal physical functioning and social functioning, which contrasts with higher odds of optimal levels of function for the following SF-6D dimensions: role limitations, mental health and vitality.

Table 3 One-stage individual participant data meta-analyses for HUI3 SAU scores and MAU scores comparing VP/VLBW with control groups
Table 4 One-stage individual participant data meta-analyses for SF-6D multi-attribute utility scores and SF-12 dimension scores comparing VP/VLBW with control groups

Using a one-stage IPD meta-analysis, we considered HUI3 and SF-6D MAU scores as outcome variables (Table 5). The adjusted impact of VP/VLBW status on the HUI3 MAU score was − 0.06 (95% confidence interval [CI] − 0.08, − 0.04) with no significant impact on SF-6D MAU scores. Results from analyses that restricted cases to very preterm status alone were similar for the HUI3 MAU score, − 0.04 (95% CI − 0.06, − 0.01), again with no association with SF-6D MAU scores.

Table 5 One-stage individual participant data meta-analyses: impact of VP/VLBW on HUI3 MAU score and SF-6D MAU score, all cohorts combined. Method: linear fixed-effects model

Higher levels of maternal education (high vs low) were associated with a higher HUI3 MAU score (mean difference 0.06; 95% CI 0.03, 0.10). Female individuals had lower MAU scores than male individuals for the SF-6D (mean difference − 0.04, 95% CI − 0.05, − 0.02), but not for the HUI3 (mean difference − 0.01, 95% CI − 0.03, 0.01). There was weak evidence that increased age at assessment was associated with higher HUI3 MAU scores (mean difference 0.01, 95% CI 0.00, 0.01), but not SF-6D MAU scores (mean difference 0.00, 95% CI − 0.00, 0.01).

Robustness Checks

Appendix C of the ESM reports the results of the one-stage IPD analysis that examined the impact of VP/VLBW status on HUI3 and SF-6D outcomes and modeled cohort effects as random effects. In this analysis, the adjusted HUI3 MAU score difference between VP/VLBW and controls was approximately − 0.06 (95% CI − 0.08, − 0.03) [Table C1 of the ESM], whilst VP/VLBW participants had poorer function than their controls across the following HUI3 attributes: vision, pain, ambulation, dexterity and cognition (Table C2 of the ESM). In contrast, VP/VLBW status was not associated with lower SF-6D MAU scores (Table C3 of the ESM) but was associated with lower SF-12 physical functioning and social functioning scores (Table C4 of the ESM). Furthermore, results from analyses that used data from all available SF-6D HRQoL assessments for all participants, described in Appendix D (Tables D1, D2 and D3) of the ESM, were again very similar to those reported in Tables 4 and 5. Overall, the results remained robust, as estimates derived from the fixed-effects and random-effects models broadly agreed and showed similar patterns regarding the impact of VP/VLBW status on HRQoL in early adulthood.


Overall, VP/VLBW status was associated with clinically significant decrements [58, 59] in the HUI3 MAU score in adulthood after adjusting for covariates. Specifically, the adjusted impact of VP/VLBW status on the HUI3 MAU score exceeded generally accepted thresholds of minimal important differences in multi-attribute utility scores for assessing effects on preference-based HRQoL outcomes [58, 59]. In contrast, our analyses did not detect differences in SF-6D MAU scores by VP/VLBW status.

One explanation for the above observations is that the HUI3 MAU score weights motor function, sensory function and cognition, which are known to be associated with clinical outcomes in VP/VLBW individuals, more highly compared with the SF-6D MAU score [62]. Furthermore, the HUI3 and SF-6D differ in terms of their conceptual underpinnings, dimension and item structures, and valuation protocols, which might also explain differences in outputs. However, the totality of evidence suggests that the HUI3 and SF-6D results are broadly consistent and show that VP/VLBW status is primarily associated with poorer overall physical and cognitive functioning in adulthood, but not with socio-emotional or mental health. Thus, while evidence suggested a weak relationship between VP/VLBW status and SF-6D MAU scores, across all models employed, we consistently found that VP/VLBW status was associated with lower SF-12 physical and social functioning scores. Moreover, results for the HUI3 revealed specific aspects of health that were most affected by VP/VLBW status because across all models and specifications employed, we found evidence that VP/VLBW status was associated with decrements in vision, ambulation, dexterity and cognition. The results demonstrate that VP/VLBW status was associated with decrements in nearly all HUI3 SAU scores, and in SF-12 physical and social functioning scores (Tables 3 and 4). However, in general, these decrements were small and attained statistical significance primarily for differences in physical aspects of health. The pattern of results suggests that the evidence obtained from the SF-6D corroborates and is broadly consistent with the HUI3 results.

Consistent with previous systematic reviews [7, 30], higher levels of maternal education are associated with higher utility scores, including in adulthood. In fact, the differences in HUI3 MAU scores between those with mothers with low versus high education are of a similar magnitude to the differences between those born VP/VLBW and those born at term. This has previously been shown for differences in functional outcomes such as intelligence [63, 64]. Our models detected differences between the sexes for SF-6D MAU scores, as female sex was associated with a utility difference of − 0.04 (95% CI − 0.05, − 0.02), which has also been observed in published research [7, 30].

Overall, our results are consistent with previously observed patterns reported in the disability literature, which shows that preterm birth is mostly associated with physical and cognitive impairments, but to a lesser degree with mental health problems [63, 65, 66]. While the differences in HRQoL scales we observe here might seem relatively small compared with effects we see in contemporaneous studies of disability, the differences may be mitigated in terms of “quality of life” because participants are reporting how they feel, as has been shown in other studies [31]. Furthermore, given the large sample size of VP/VLBW individuals in our study, it is unlikely that future studies will identify different patterns between VP/VLBW status and HRQoL beyond those we report.

Strengths and Limitations

The major strength of this study is the large sample size of VP/VLBW individuals and the longitudinal research design aimed at strengthening internal and external validity. Our constituent studies ascertained HRQoL outcomes using two measures, which differ in terms of their conceptual underpinnings and definitions of health. This unique feature of our pooled dataset allowed us to disaggregate the impacts of VP/VLBW status across different components of health and demonstrate that the influences on HRQoL largely relate to areas of physical and cognitive functioning identified as common associates of preterm birth [64, 65]. Although the mental health aspects of VP/VLBW outcomes are also well known [14, 67], within the two HRQoL measures we applied we found no effect. The use of both fixed-effects and random-effects models to investigate the relationship between VP/VLBW status and HRQoL is also a strength as this confirmed the robustness of our results.

The contributing cohorts used reliable and valid methods to recruit participants and maintained substantial participation rates throughout follow-up amongst study cases and controls. Finally, the use of adult self-report HRQoL data avoids biases associated with proxy parental reporting. From the childhood literature, child-proxy agreement around descriptions of children’s HRQoL is generally poor, particularly for subjective constructs such as emotion and pain [68, 69].

Limitations include the different eligibility criteria in terms of the number of weeks of gestation or birth weight for different constituent study cohorts. Given differences in recruitment mechanisms of study participants, with some cohorts recruiting controls during childhood, it was not possible to examine HRQoL impacts by gestational week or birth weight at a more granular level. Furthermore, although our meta-analyses adjusted for maternal education, we acknowledge that we have not accounted for other socio-economic determinants of health. Finally, the results of this study might not be applicable to low-income or middle-income countries as the data used in this study were collected in high-income countries. Equally, our findings might not be fully applicable to the Americas, Asia or Africa as our study only included cohorts from Western European countries, Australia and New Zealand, limiting the external validity of the results presented.


The results indicate that the HUI3 MAU measure may be more sensitive at detecting differences in HRQoL following the long-term sequelae of preterm birth or low birthweight than the SF-36/12 and hence the SF-6D. Our results show that the HUI3 and SF-6D instruments might not be interchangeable for use in clinical and population research, and cost-effectiveness-based decision making that considers the long-term consequences of VP/VLBW status [70, 71]. The HUI3 might be preferred to the SF-6D in economic and policy evaluations that quantify particularly physical health or cognitive outcomes in individuals born VP/VLBW. Our results suggest that differences in the health descriptive systems of the HUI3 and SF-6D measures likely drive the differences in MAU scores of VP/VLBW individuals and controls in adulthood. Results presented in this study indicate that these two measures mostly do not seem to measure similar constructs, implying complementarity between the two HRQoL measures.


Results from five prospective longitudinal cohort studies previously identified by published systematic reviews [7, 30] demonstrate that VP/VLBW status is predominantly associated with decrements in physical and cognitive aspects of HRQoL during adulthood. Studies that estimate the effects of VP/VLBW status on multi-dimensional HRQoL outcomes in later adulthood are needed.