Introduction

Preterm birth (<37 weeks GA) is a leading cause of mortality in infants and children under five years, with higher risks for those infants born more prematurely.1 In 2019, this resulted in approximately 900,000 deaths worldwide.2,3 Preterm birth is also associated with a high risk of short-term effects on health, including necrotizing enterocolitis, sepsis, bronchopulmonary dysplasia (BPD), and retinopathy of prematurity. In addition, preterm infants face various health risks at a later age, such as neurodevelopmental impairment, poorer growth, and type 2 diabetes mellitus.4,5,6,7 Those infants born small for gestational age (SGA) are generally at higher risks for these diagnoses.7,8,9,10,11

Timely clinical interventions are critical to minimize the long-term morbidities that can develop as a result of complications during a stay in the neonatal intensive care unit (NICU).12 To this end, preterm infants are continuously monitored using purposive sampling to measure blood-based markers such as C-reactive protein (CRP), bacterial cultures, and cell counts. However, the whole blood volume required for these laboratory tests is considerable compared to the total blood volume in neonates (~70 mL/kg).13 Importantly, substantial iatrogenic blood loss is associated with an increased risk of anemia.14 As a result, additional blood draws for basic research to improve fundamental understanding of common health complications in preterm infants are limited.

Whole blood contains hundreds of proteins15 that are currently not measured in standard monitoring. These proteins can provide insight into a wide range of major biological processes, including metabolic state, inflammation, the complement system, hemostasis, and both the adaptive and innate immune system,15 information that is currently not obtained in standard diagnostic clinical care. These proteins may contribute to a better basic understanding of preterm infants. Previous studies focused on either the impact of gestational age (GA) on protein levels in cord blood or postnatal protein changes in full-term infants and have found major changes in hemoglobin composition, immunoglobulins, energy homeostasis systems, and hormonal axes after birth.16,17,18,19,20,21,22,23,24,25 However, a recent study showed that most protein changes in the first month after birth of extremely preterm infants are associated with postnatal age instead of GA.26 This suggests that translational studies investigating the development of morbidities in this population require an understanding of how these circulating proteins change postnatally in order to identify deviations associated with the morbidities of interest.

Due to its unbiased nature and the small plasma volume required per assay, mass spectrometry (MS)-based profiling is an attractive tool to study this vulnerable population of preterm infants.15 Recent advances in the field of MS-based serum profiling have led to high-throughput workflows that provide a wealth of information on a wide range of biological processes from one sample. Here, we explored the potential of MS-based serum proteomics for the systematic and unbiased investigation of age-related changes in circulating protein levels and biological processes in preterm infants.

Methods

Study samples from preterm infants

Samples from preterm infants were obtained from the PulmonaRy Inflammation and glucocorticoiD sensitivity for the PReDICTion of BronchoPulmonary Dysplasia (PRIDICT-BPD) study, addressing the feasibility of various biomarkers for the prediction of BPD in an unselected population of preterm infants.27 For this specific study, left-over serum samples were used from participants if caregiver(s) gave written informed consent for usage of study material for future research. The PRIDICT-BPD study was performed between November 2019 and October 2021 at the two NICUs of the Amsterdam University Medical Center in the Netherlands. Infants were eligible to participate in this study if they were born at a GA below 30 weeks and had no major congenital malformations. The Medical Research Ethics Committee of Vrije Universiteit Amsterdam approved the study (protocol number 2019.371). In the PRIDICT-BPD study, we collected cord blood samples at birth and capillary or arterial blood samples at postnatal days 3, 7, 14, and 28, as long as the participants were admitted to the NICU (Fig. 1a). Blood draws for the study were always combined with scheduled blood draws for clinical care. After collection, the samples were left to clot for 30 minutes to 2 hours and centrifuged at 15000 g for 2 minutes at room temperature. Serum aliquots were stored at −80 °C, as described previously.27

Fig. 1: PRIDICT-BPD study design and serum proteomics.

a Overview of the PRIDICT-BPD cohort and blood sample collection, consisting of 49 appropriate for gestational age (AGA) and 18 small for gestational age (SGA) infants. Samples collected from infants and healthy adults were analysed using the proteomics workflow starting with serum sample preparation, followed by LC-MS/MS analysis and data analysis. b Overview of the samples collected in cord blood (day 0) and from capillary or arterial blood (day 3, day 7, day 14, and day 28) in the PRIDICT-BPD cohort. c Histogram representing the coefficient of variation (CV) for all proteins quantified across repeated injections of a study-wide quality control sample, with 874 proteins reliably quantified with high reproducibility (CV < 30%). d Label-free quantification (LFQ)-intensity levels plotted for quantified haemoglobin subunits (zeta: HBZ; alpha: HBA1; beta: HBB; delta: HBD; gamma: HBG1/HBG2) and haptoglobin (HP) proteins across the technical replicates.

Clinical data on neonatal characteristics were obtained during the study period from the electronic patient records. SGA was defined as birth weight below the 2.3 percentile according to the Dutch reference curve.28 Continuous non-normally distributed data were expressed as medians with interquartile range (IQR) and categorical data as counts with percentage (%). Patient characteristics were compared between infants born SGA and infants born appropriate for gestational age (AGA) using a Mann-Whitney U test for continuous variables, and a Chi-square or Fisher’s exact test for categorical variables, depending on the distribution of the data.

Healthy adults

We obtained serum samples from six healthy adults from Sanquin, Amsterdam, the Netherlands. Serum was obtained by leaving whole blood to clot for at least 30 minutes at room temperature and then centrifuging it at 1800 g for 20 min at room temperature. Aliquots were stored at −80 °C until analysis. Ethical approval was obtained from the Sanquin Ethical Advisory Board.

Serum sample preparation

Serum aliquots from the PRIDICT-BPD study cohort and healthy controls were thawed at room temperature (RT) and 10 μL was diluted 1:60 in 100 mM Tris(hydroxymethyl)aminomethane hydrochloride (Tris-HCl, Life Technologies, UK) (pH = 8.0). Next, 9 μL of diluted serum was mixed with 5 μL of 20 mM Tris(2-carboxyethyl)phosphine (Thermo Fisher Scientific, Rockford, IL) and 80 mM chloroacetamide (Sigma Aldrich, St Louis, MO) in 100 mM Tris-HCl (pH = 8.0). After incubation at 95 °C for 5 minutes, samples were cooled down to RT and proteins were digested overnight at 25 °C with 100 ng MS-grade Trypsin Gold (Promega, Madison, WI) in 30 μL of 50 mM Tris-HCl (pH = 8.0). Samples were acidified to a final concentration of 1% (v/v) trifluoroacetic acid (Thermo Fisher Scientific, Rockford, IL), diluted 8x in 0.1% formic acid in water (Biosolve, NL) and stored at −80 °C until MS analysis. Quality control (QC) samples were made by pooling equal aliquots of all samples in the cohort. Of each sample, 20 μL was loaded onto EvoTip Pure tips (EvoSep, Denmark) according to the manufacturer’s guidelines.

Mass spectrometry analysis

Samples were analyzed using an Evosep One liquid chromatography system (Evosep, Denmark)29 coupled to an Orbitrap Fusion™ Lumos™ Tribrid™ Mass Spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization source with a spray voltage of 2150 V. Peptides were separated using the predefined 30 samples per day method on a 15 cm × 150 μm, 1.5 μm Performance Column (EV1137, EvoSep, Denmark) with 0.1% formic acid in water or acetonitrile (Biosolve, NLD) as mobile phase A and B, respectively. MS data were acquired in data-independent acquisition (DIA) mode. For MS1, a scan from 390−1010 m/z at 60 K resolution was made (100% normalized automatic gain control (AGC) target; 100 ms maximal injection time). Next, 75 consecutive MS2 scans were made with an 8 m/z window size and a 1 m/z overlap at 30 K resolution with a precursor mass range of 400−1000 m/z at 300% AGC and 54 ms maximal injection time. Higher-energy collisional dissociation fragmentation was set to a normalized collision energy of 23%. All spectra were recorded in centroid mode and the default charge state was set to 3.
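The MS2 scheme above (75 windows of 8 m/z tiling 400−1000 m/z, with 1 m/z overlap) can be illustrated with a small sketch. This is not the instrument method itself: the function name `dia_windows` is hypothetical, and the assumption that the overlap is split evenly over both edges of every window is ours, since the text does not state how the overlap is applied.

```python
def dia_windows(lower=400.0, upper=1000.0, width=8.0, overlap=1.0):
    """Return (start, end) isolation windows in m/z.

    Assumption: nominal windows tile [lower, upper] without gaps, and each
    window is then widened by overlap/2 on both sides.
    """
    n_windows = round((upper - lower) / width)
    half = overlap / 2.0
    windows = []
    for i in range(n_windows):
        nominal_start = lower + i * width
        windows.append((nominal_start - half, nominal_start + width + half))
    return windows


windows = dia_windows()
n_windows = len(windows)
# Overlap shared by the first two neighbouring windows, in m/z.
first_overlap = windows[0][1] - windows[1][0]
```

Under these assumptions the precursor range divides into exactly 75 nominal windows (600 m/z divided by 8 m/z), which matches the number of MS2 scans reported.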

Data processing

A spectral library was generated in DIA-NN (version 1.8.1)30 using single-pass mode as neural network classifier, protein inference based on the genes from the reviewed FASTA (20423 entries, downloaded on 08 August 2023), and robust LC (high accuracy) for the quantification strategy. Proteins were then identified in the samples using DIA-NN with the theoretical library and match-between-runs, but without heuristic protein inference and shared spectra. Data were analyzed using R (version 4.1.1)31 and tidyverse (version 1.3.2)32 for data processing. Protein group label-free quantitation (LFQ) values were kept only if intensities were based on two or more unique precursors (Table S3a). Data were log2 transformed, and proteins were considered accurately quantified if they were identified in at least 50% of the samples per time point and in at least 40% of the samples per group (adults or infants) across the entire cohort (Table S3b). Imputation was performed using a normal distribution (downshift of 1.8, width of 0.3).
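The filtering and imputation steps can be sketched as follows. The study used R, so this Python version is only an illustration. The helper names (`fraction_quantified`, `impute_downshifted`) are hypothetical, and we assume the common convention that the downshift and width are expressed in units of the standard deviation of the observed log2 intensities within one sample; the text does not specify whether imputation was done per sample or over the whole matrix.

```python
import random
import statistics


def fraction_quantified(values):
    """Fraction of entries that are not missing (None marks a missing value)."""
    return sum(v is not None for v in values) / len(values)


def impute_downshifted(values, downshift=1.8, width=0.3, rng=None):
    """Replace missing log2 intensities by draws from a down-shifted normal.

    Assumption: mean = observed mean - downshift * sd, sd = width * observed sd.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        v if v is not None else rng.gauss(mu - downshift * sd, width * sd)
        for v in values
    ]


# Hypothetical log2 LFQ intensities of seven proteins in one sample.
sample = [20.1, 22.4, None, 25.0, 18.7, None, 21.3]
coverage = fraction_quantified(sample)
imputed = impute_downshifted(sample)
```

Drawing from the lower tail reflects the assumption that values are mostly missing because the protein was below the detection limit.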

Data analysis

The Venn diagram was made with Eulerr33 and tissue enrichment was performed using TissueEnrich.34 The similarity of protein dynamics to theoretical shapes was assessed using the squared Pearson correlation (similarity score), calculated with Hmisc35 from the median protein intensity over infants per timepoint using the imputed data. The effect size over time was calculated as the area for every protein using Flux36 based on the median intensity per timepoint using the imputed data, in which the first timepoint was used to normalize to zero. Increasing and decreasing trends in protein abundances were classified based on an r-squared correlation ≥ 0.5 and absolute area (|effect size|) ≥ 20. Biological process enrichment analysis was performed using Metascape with all 1492 identified protein symbol names (Table S3a) as background and default settings. Briefly, terms with P-value < 0.05, a minimum count of 3 and an enrichment factor of more than 1.5 were grouped into clusters based on membership similarities (Kappa scores > 0.3).37 The most statistically significant term within a cluster was selected to represent the cluster, selecting only the top-20 terms. Cytoscape was used to visualize the networks.38 For exploration of the complement cascade, we manually annotated the proteins in the cascade based on literature and visualized their effect size.39,40,41,42 Statistical significance was determined with moderated t-tests in Limma43 using the imputed data. Correction for multiple testing was performed using the Benjamini-Hochberg method, and an adjusted P-value < 0.05 combined with an absolute log-fold change (|LFC|) > 1 was considered statistically significant and biologically relevant. Differences in longitudinal trajectories between groups were determined by subtracting the median coefficient of variation (CV) of the AGA group from the median CV of the SGA group for each protein. Differences in abundance were determined by calculating the LFC based on the median protein abundance in the AGA group compared to the SGA group. Developmental trends were considered similar at an absolute CV difference below 30%.
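The trend classification described above combines a similarity score with an effect size. A minimal sketch is given below, with two assumptions that are not stated in the text: the theoretical shape is taken to be a straight line over postnatal day, and the area is computed with the trapezoid rule. All function names are hypothetical.

```python
DAYS = [0, 3, 7, 14, 28]  # sampling days used in the cohort


def pearson_r2(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no trend information
    return sxy ** 2 / (sxx * syy)


def baseline_area(days, medians):
    """Trapezoid area of the profile after subtracting the first timepoint."""
    shifted = [m - medians[0] for m in medians]
    return sum(
        (days[i + 1] - days[i]) * (shifted[i] + shifted[i + 1]) / 2
        for i in range(len(days) - 1)
    )


def classify(days, medians, template, r2_min=0.5, area_min=20.0):
    """Label a protein as increasing, decreasing, or stable."""
    r2 = pearson_r2(template, medians)
    area = baseline_area(days, medians)
    if r2 >= r2_min and abs(area) >= area_min:
        return "increasing" if area > 0 else "decreasing"
    return "stable"


# Hypothetical median log2 intensities per timepoint for three proteins.
rising = classify(DAYS, [20.0, 20.5, 21.0, 22.0, 24.0], DAYS)
falling = classify(DAYS, [24.0, 22.0, 21.0, 20.5, 20.0], DAYS)
flat = classify(DAYS, [20.0, 20.1, 19.9, 20.0, 20.1], DAYS)
rising_area = baseline_area(DAYS, [20.0, 20.5, 21.0, 22.0, 24.0])
```

Because the similarity score is a squared correlation, it carries no sign; in this sketch the direction of the trend is taken from the sign of the area.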

Results

Patient cohort and serum proteomics analysis

In the PRIDICT-BPD study, a total of 67 infants with a median GA of 27.6 weeks (IQR 26.6, 28.6 weeks) and a median birth weight of 1000 gram (IQR 810, 1190 gram) were included. Eighteen (26.9%) of these preterm infants were born SGA. The sample sets obtained at day 3 (n = 65, 97.0%), followed by day 7 (n = 56, 83.6%) and day 14 (n = 49, 73.1%), were the most complete, whereas the sets for day 0 (n = 31, 46.3%) and day 28 (n = 27, 40.3%) were less complete (Fig. 1a, b). From day 0 up to day 14, samples were missing mostly due to logistic reasons; at day 28, however, samples were missing due to discharge from the NICU, as described in the primary publication on this cohort.27

To evaluate the robustness of our unbiased label-free MS-based proteomics workflow (Fig. 1a), we performed seven repeated measurements of a study-wide pooled sample. Out of the 910 identified proteins in this pool, 874 proteins were reliably quantified with a coefficient of variation (CV) below 30% (Fig. 1c, Table S3c). This robustness is exemplified by the stability in protein levels of hemoglobin subunits, hemoglobin zeta (HBZ), epsilon (HBE), gamma-1 (HBG-1), gamma-2 (HBG-2), alpha (HBA), beta (HBB), delta (HBD) and mu (HBM) across repeated injections of the study-wide pool (Fig. 1d).
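The reproducibility criterion can be written out as a short sketch. The protein names and intensities below are invented for illustration, and we assume the CV is computed from the sample standard deviation of non-log-transformed intensities, which the text does not specify.

```python
import statistics


def cv_percent(intensities):
    """Coefficient of variation in percent (sample sd divided by mean)."""
    return 100.0 * statistics.stdev(intensities) / statistics.mean(intensities)


# Hypothetical intensities from seven repeated injections of the QC pool.
qc_replicates = {
    "PROT_A": [100, 105, 98, 102, 101, 99, 103],
    "PROT_B": [100, 40, 160, 90, 210, 30, 120],
}

# Proteins passing the CV < 30% reproducibility threshold.
reliable = [name for name, vals in qc_replicates.items() if cv_percent(vals) < 30.0]
```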

Global differences in the serum proteome between adults and preterm infants

To assess the circulating proteome of preterm infants, we explored the variety and abundances of proteins in preterm infant serum (Table S1) compared to adult serum (Table S2). More proteins were quantified in preterm infant serum than in adult serum (Fig. S1a). This was due to increased quantification of proteins in the lower regions of the measurement range. Because a similar distribution is a prerequisite for linear models,43 we opted to further compare the serum proteomes of preterm infants and adults solely qualitatively (Fig. S1b). A total of 515 proteins were present in both adult and preterm infant serum, while four proteins were only observed in adults (HBD, neutrophil elastase (ELANE), apolipoprotein C-IV (APOC4), and fructose-bisphosphate aldolase B (ALDOB)). In preterm infant serum we were able to identify 302 unique proteins, including known fetal and neonate-specific hemoglobin subunits (HBZ, HBE, HBG-2, and HBM) as well as various collagen chains (Fig. S1c, Table S3d). Out of the 302 infant-specific proteins, 131 proteins were enriched for 23 different tissues. Most tissue-specific proteins originated from the placenta (e.g. protein delta homolog 1, DLK1), bone marrow (e.g. porphobilinogen deaminase, HMBS) or cerebral cortex (e.g. protein kinase C-binding protein NELL2, NELL2) (Fig. S1d, e, Table S3e).

Development of specific protein profiles with postnatal age

To characterize age-related protein changes after preterm birth, we classified proteins and their post-partum abundance trajectories based on their similarity to theoretical developmental trajectories over time (r2 > 0.50). To distinguish proteins with similar developmental profiles but different magnitudes of change compared to baseline (cord blood), exemplified for HBG2 and IGHG1 (Fig. 2b), we also accounted for effect sizes (|area| > 20) (Fig. 2a, Table S3f). In this way, the approach can identify the similar trend over time between these proteins as well as the difference in the size of their decrease.

Fig. 2: Classification of protein levels changing after birth in preterm infants.

a Graphical visualisation of the approach to classifying proteins that show changes over time based on 1) similarity score (top left) calculated as the squared Pearson correlation to theoretical changes over time and 2) effect size (bottom left) calculated as the area of the protein level changes over time normalised to the protein levels at day 0 based on imputed data. b The label-free quantification (LFQ)-intensity levels for immunoglobulin heavy gamma 1 (IGHG1, purple) and haemoglobin subunit gamma-2 (HBG2, green) per time point in infants, with each sample shown as a dot and the distribution as a Tukey box-and-whisker plot. On the right the similarity score and effect size are depicted. c Similarity score plotted against the effect size for each protein represented as a dot. Red dots indicate haemoglobins and blue dots immunoglobulins; all other proteins are represented as grey dots. The squares in the plot show the thresholds set for the classification of proteins for which the levels decrease (orange) or increase (green) with time after birth. d Tukey box-and-whisker plot representing the distribution of LFQ-intensity levels of coagulation factor XIIIa (F13A1), sulfhydryl oxidase 1 (QSOX1), bifunctional purine biosynthesis protein (ATIC), porphobilinogen deaminase (HMBS), complement component 9 (C9) and apolipoprotein A-IV (APOA4) per timepoint. Panel backgrounds represent the classification of each protein. e Enrichment network visualization based on proteins classified as decreasing (orange) or increasing (green) with time, with manual annotation of highlighted enrichment terms. Each node represents a term, with a pie chart indicating its association with each classification.

Overall, 507 proteins, including coagulation factor XIIIa (F13A1, area: −0.09; r2: 0.15) and sulfhydryl oxidase 1 (QSOX1, area: 0.7; r2: 0.01), did not change in a clear age-dependent pattern over time (Fig. 2c, white area; Fig. 2d). Out of the 315 proteins that were classified as changing with time postnatally, the majority (235) decreased in abundance over time (Fig. 2c, orange box). In line with literature, we observed decreasing protein levels for embryonic and fetal hemoglobin (HBG1, HBG2, HBZ, HBM, HBE1) and IgG (IGHG1, IGHG2, IGHG3, IGHG4) with postnatal age (Fig. S2, S3). Additionally, we observed previously undescribed developmental decreases in protein abundances, e.g. for bifunctional purine biosynthesis protein ATIC (ATIC, area: −49.3; r2: 0.97) and porphobilinogen deaminase (HMBS, area: −71.7; r2: 0.93) (Fig. 2d). Proteins with decreased levels were associated with gluconeogenesis, cellular response to heat stress, and hydrogen peroxide metabolic process (Fig. 2e, S4, Table S3g, S3h). The 79 proteins with increasing abundance postnatally (Fig. 2c, green box) included CRP (area: 81.3; r2: 0.67), complement component 9 (C9, area: 105.8; r2: 0.97) and apolipoprotein A-IV (APOA4, area: 63.4; r2: 0.93) (Fig. 2d). Proteins with increased abundances were associated with acute inflammatory response, complement cascade and response to bacterium (Fig. 2e, S3, Table S3g, S3h). Notably, both increasing and decreasing protein levels with postnatal age were observed for proteins involved in the adaptive immune system and selenium micronutrient network (Fig. 2e, S3).

The postnatal development of the complement system in the first four weeks in preterm infants is not completely understood, yet C9 showed the most extreme postnatal increase (area > 100) (Fig. 2c). Therefore, we investigated the changes in levels of proteins involved in the entire complement cascade. In total, we quantified 36 proteins covering the entire complement cascade (Fig. 3, S5). The majority of the complement cascade regulating proteins remained stable in abundance over time (|area| < 20, white box), including CD55 (area: −12.1; r2: 0.90), clusterin (area: −10; r2: 0.74), complement factor I (area: 17.2; r2: 0.97), complement factor D (area: −1.8; r2: 0.88) and complement factor H (area: 11.6; r2: 0.94). However, decreasing levels of CD59 (area: −31.6; r2: 0.93) and increasing levels of complement factor B (area: 28.5; r2: 0.98) were observed. In particular, most proteins that are part of the membrane attack complex increased in abundance with time postnatally, namely complement component 6 (area: 20.9; r2: 0.80) and the complement component 8 chains (alpha, C8A, area: 22.1; r2: 0.92; beta, C8B, area: 22.0; r2: 0.90; gamma, C8G, area: 56.8; r2: 0.93), whereas complement component 7 (area: 8.1; r2: 0.41) did not meet the classification thresholds.

Fig. 3: Schematic representation of changes in complement cascade proteins.

Graphical representation of the complement cascade annotated by the effect size as calculated in Fig. 2, and left white if the protein was classified as stable over time. Red letters indicate proteins that were not quantified in this study and grey circles indicate changes in protein levels that the mass spectrometer is not capable of distinguishing. Square boxes represent complement cascade inhibitors.

Development of specific protein profiles associated with SGA

Because SGA is a known risk factor in preterm infants for short- and long-term health impairment, we investigated differences in protein profiles at birth between preterm neonates born SGA and preterm neonates born AGA. In this study, SGA infants were born with a median gestational age of 27.9 weeks (IQR 27.1, 28.9) and a median birth weight of 810 gram (IQR 590, 919 gram). As expected, SGA infants had a significantly lower birth weight, were significantly more often born via caesarean section, were more often diagnosed with moderate or severe BPD and were more often on invasive mechanical ventilation compared with infants born AGA (Table S1).

Comparison of the sera at birth of AGA versus SGA preterm infants revealed limited differences, with seven proteins significantly altered (Fig. 4a, Table S3i). The abundances of six of these proteins were reduced in SGA infants: hemopexin (HPX), asialoglycoprotein receptor 2 (ASGR2), bone morphogenetic protein 1 (BMP1), plasma alpha-L-fucosidase (FUCA2), serum amyloid A-4 protein (SAA4) and neutrophil gelatinase-associated lipocalin (LCN2). Only WAP, Kazal, immunoglobulin, Kunitz and NTR domain-containing protein 2 (WFIKKN2) exhibited higher levels in SGA preterm infants (Fig. 4b). No enrichment for biological processes was observed. To explore whether these significant proteins remained different over time, the protein levels over time were plotted. The levels for most proteins, e.g. SAA4, normalized rapidly. However, the levels of HPX, a glycoprotein that binds free heme in circulation to prevent iron loss and iron-induced oxidative damage, normalized slowly over time (Fig. 4c, S6), with a more rapid increase in levels for SGA infants.
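The significance call used for this comparison (Benjamini-Hochberg adjusted P < 0.05 and |LFC| > 1, as defined in the Methods) can be sketched as below. The moderated t-test from Limma is not reproduced; the sketch starts from P-values that are assumed to be already available, and the example numbers are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(pvalues, lfcs, alpha=0.05, lfc_min=1.0):
    """Flag proteins passing both the adjusted P-value and |LFC| thresholds."""
    adjusted = benjamini_hochberg(pvalues)
    return [q < alpha and abs(lfc) > lfc_min for q, lfc in zip(adjusted, lfcs)]


# Hypothetical P-values and log2 fold changes for five proteins.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
lfcs = [1.5, -0.4, 2.0, -1.2, 3.0]
adjusted = benjamini_hochberg(pvals)
flags = significant(pvals, lfcs)
```

In this toy example only the first protein passes both criteria: the second fails on fold change, and the others fail on the adjusted P-value.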

Fig. 4: Differences in the serum proteome of SGA and AGA infants.

a Volcano plot representing the comparison of differences in cord blood samples from small for gestational age (SGA) infants (n = 10) to appropriate for gestational age (AGA) infants (n = 21) determined with imputed data, with significantly up-regulated proteins in red and down-regulated proteins in blue. The x-axis represents the log2 fold-change in protein levels and the y-axis represents the -log10 t-test p-value after Benjamini-Hochberg correction. b Tukey box-and-whisker plot representing the distribution of label-free quantification (LFQ)-intensity levels of all significant proteins in SGA (orange) and AGA (green) infants, including hemopexin (HPX), asialoglycoprotein receptor 2 (ASGR2), bone morphogenetic protein 1 (BMP1), serum amyloid A-4 protein (SAA4), plasma alpha-L-fucosidase (FUCA2), neutrophil gelatinase-associated lipocalin (LCN2) and WAP, Kazal, immunoglobulin, Kunitz and NTR domain-containing protein 2 (WFIKKN2). c Tukey box-and-whisker plot representing the distribution of LFQ-intensity levels in SGA (orange) and AGA (green) infants of HPX and SAA4 with the difference in CV and LFQ (effect size) over time on the right. d Differences in trajectories over time after birth. The x-axis represents the difference in abundance (log-fold change) and the y-axis represents the difference in trend over time. The black square represents all proteins within the normal distribution (0.05 ≤ value ≤ 0.95). Significantly up-regulated proteins in cord blood of SGA infants are highlighted in red and down-regulated proteins in blue. e Tukey box-and-whisker plot representing the distribution of LFQ-intensity levels in SGA (orange) and AGA (green) infants of adiponectin (ADIPOQ), haptoglobin (HP; HPR), serum amyloid A-1/2 (SAA1; SAA2) and immunoglobulin heavy constant alpha 1 (IGHA1).

To further investigate differences in developmental trajectories between SGA and AGA preterm infants, we plotted the difference in developmental trends (longitudinal CV, |CV| > 30%) against the difference in protein levels across all timepoints (LFC < Q0.05 or LFC > Q0.95) (Table S3j). As expected based on Fig. 4c, HPX was observed with a difference in developmental trend and an effect size outside the range of normal distribution (grey box). This shows that this approach can be used to identify deviations in postnatal trends (Fig. 4c, d). The most pronounced differences (|LFC| > 1) included the upregulation of immunoglobulin heavy constant alpha 1 (IGHA1) and downregulation of haptoglobin (HP) in SGA infants compared to AGA infants (Fig. 4e). In total, 69 proteins showed similar trends (|CV| < 30%) with different protein levels. Enrichment analysis revealed 24 proteins enriched for response to bacteria and the immune system, including C4b-binding protein alpha chain (C4BPA), -beta chain (C4BPB), leucine-rich alpha-2-glycoprotein (LRG1), IGHG2;IGHG4 and IGHG1;IGHG3;IGHG4 (Table S3k). Finally, the most extreme differences in protein levels with similar trends over time were the lower levels of the insulin-sensitizing hormone adiponectin (ADIPOQ) and acute phase protein serum amyloid A (SAA)-1/2 in SGA infants compared to AGA infants (Fig. 4e).
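The two quantities plotted against each other can be sketched as follows. This is one possible reading of the Methods, not a reproduction of the original analysis: we assume a longitudinal CV is computed per infant over its available timepoints on non-log intensities, that the group value is the median of these CVs, and that a positive LFC means higher levels in SGA infants. The function names and data are hypothetical.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation in percent (sample sd divided by mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def group_summary(trajectories):
    """Median per-infant longitudinal CV and median intensity for one group."""
    cvs = [cv_percent(t) for t in trajectories if len(t) >= 2]
    pooled = [v for t in trajectories for v in t]
    return statistics.median(cvs), statistics.median(pooled)


def compare_groups(sga, aga):
    """Return (CV difference SGA minus AGA, log2 fold change SGA over AGA)."""
    cv_sga, median_sga = group_summary(sga)
    cv_aga, median_aga = group_summary(aga)
    return cv_sga - cv_aga, math.log2(median_sga / median_aga)


# Hypothetical intensities of one protein; one inner list per infant.
aga = [[100, 110, 90], [200, 220, 180]]
sga = [[50, 55, 45], [100, 110, 90]]
delta_cv, lfc = compare_groups(sga, aga)
similar_trend = abs(delta_cv) < 30.0
```

In this toy example the protein would count as having a similar trend but a lower level in SGA infants, the pattern reported for ADIPOQ.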

Discussion

We highlight the potential of MS-based serum proteomics in pediatric research by studying the proteome of preterm infants, using residual material from a clinical cohort (n = 67).27 This is especially powerful in this vulnerable population due to the limited blood volume required15 and the information-rich output it provides. We show that this technique can capture postnatal changes in the proteomic landscape and identify differences in protein trajectories according to the degree of fetal growth restriction (SGA compared to AGA).

In line with our findings, others have previously reported differences in protein levels of fetal hemoglobin, alpha-fetoprotein, alpha-2-macroglobulin, haptoglobin and collagen-alpha chains between adults and full-term infants.25,44 Strikingly, the proteomic depth measured by MS-based serum proteomics of preterm infants is much greater than in healthy adults, with a substantial group of proteins unique to infants and a difference in data distribution. To our knowledge, this has not been described previously. However, it has been shown that the albumin concentration is drastically lower in preterm infants (20 g/L albumin, 35 μg/μL protein)45,46 compared to later in life (34–54 g/L albumin, 70 μg/μL protein).47,48 We hypothesize that the increased proteomic depth is in part also an effect of the differences in total plasma protein and albumin concentrations in infants and adults, and that through albumin depletion some of the infant-specific proteins may be identified in adults as well. However, we expect other proteins, such as the placenta-enriched proteins, to remain infant-specific. Similarly, we expect some cerebral cortex proteins to remain specific, since the cerebral cortex undergoes an important developmental period between 24 and 40 weeks postmenstrual age, including cerebral maturation.49 As these infants are born during this time, the presence of cerebral cortex-enriched proteins in the blood may indicate immaturity of the blood–brain barrier in preterm infants. This is in accordance with previous studies that showed increased levels of proteins linked to neurodevelopment in cerebrospinal fluid in preterm infants compared to full-term infants.50

To characterize changes in abundance of circulating proteins over time after birth, we utilized the dynamics of well-known developmental proteins as a reference, including embryonic (HBZ, HBE) and fetal (HBG, HBA) hemoglobin, and immunoglobulins. For hemoglobin, the switch from fetal to adult hemoglobin (HBA, HBB, or HBD) starts shortly after birth,17 and is completed after approximately six months.51 In consonance with previous literature,17,18,51 we observed decreasing protein levels for embryonic (HBZ, HBE) and fetal (HBG, HBA) hemoglobin over time after birth in preterm infants. Furthermore, in full-term infants maternal IgG levels decrease postnatally and infant IgG gradually increases, resulting in a net decrease of total IgG in the first six months,20,21 which was also observed for preterm infants. In addition, C9 showed the largest increase in protein levels over time. Short-term developmental profiles of the complement system have previously been described for the first week after birth in a cohort of full-term infants.16 Through evaluation of the protein level changes of this entire complement cascade we observed that drastic changes occurred in most proteins of the membrane attack complex (MAC), which consists of C5b, C6, C7, C8α, C8β, C8γ and multiple copies of C9.52 Notably, C7 remained stable over time in preterm infants, in contrast to the decreasing levels observed for full-term infants.16 Often, these soluble forms of the MAC are detected in the serum of patients suffering from infections,53,54,55 including septic patients.56 Previous studies have shown that infants have lower concentrations of C9 at birth compared to adults.57 Moreover, C9 has been shown to play an important role in killing of Escherichia coli bacteria in septic infants,58 indicating the importance of C9 in sepsis. In this cohort 27 (40.3%) infants had culture-proven sepsis. It may be that the drastic increase in C9 protein levels over time is, at least partially, due to the immune response of septic infants in this cohort (Table S1). For further characterization of the effect of sepsis on this cascade, it may be interesting to look for differences between septic and postnatally age-matched non-septic infants at the time surrounding a sepsis episode and include measurement of C5b, as this initiates the formation of the MAC resulting in cell lysis and inflammatory triggers,59 or other complement activation products.

The total number of proteins that were classified as changing over time after preterm birth in this analysis underlines that these preterm infants indeed undergo drastic changes in circulating protein levels after birth, which should be accounted for in the characterization of disease-related changes. To further explore this, we evaluated differences in cord blood and age-related changes after birth for SGA, a known risk factor in preterm infants for short- and long-term health impairment. In cord blood, which is often used to study preterm infants, we found few significant differences in protein levels between SGA infants and AGA infants. Moreover, the significantly different protein levels between SGA infants and AGA infants were not linked to a specific biological process. In contrast, longitudinal monitoring of the groups allowed for the identification of protein differences associated with SGA infants. The observation of ADIPOQ indicates the validity of this approach, as it is one of the most important hormones in insulin sensitivity and energy homeostasis,60 and has been described to be found at lower levels in cord blood of SGA infants compared with AGA infants.61 This is associated with an altered pattern of fat accumulation in this population.62 Among the other proteins with similar developmental trends and different protein levels, we also found lower levels of platelet proteins including platelet basic protein (PPBP) and platelet factor 4 (PF4). This is likely associated with the more frequent occurrence of thrombocytopenia in SGA infants.63 Of note, higher levels of ADIPOQ, as seen in AGA infants, have been suggested to be associated with megakaryocytic maturation in mice,64,65 which is responsible for the release of platelets from the bone marrow. Thus, the lower levels of platelet proteins, which are known to correlate with platelet count,66 may be a result of the lower levels of ADIPOQ. Besides ADIPOQ and platelet proteins, we observed alterations in levels of proteins involved in the adaptive (IGHG1;IGHG3;IGHG4) and innate (C4BPA, C4BPB, C8G) immune system. As IgG mediates the activation of complement to ward off infections, the higher levels of these proteins might point to an ongoing infection. In addition, we also observed various proteins with a difference in both developmental trends and protein levels (SAA1, HP, IGHA1, HPX and GLRX). The difference in protein abundances combined with increased variability in SGA infants could indicate that these proteins were also impacted by short-term complications, as SGA is a known risk factor for complications such as BPD (Table S1). This highlights the opportunity that MS-based proteomics offers to further study protein alterations in common complications including bleeding, BPD, and sepsis.

Our study has limitations that should be taken into consideration when interpreting our findings. Importantly, the limited differences between SGA and AGA infants at birth could also be a result of variation in the protein levels due to GA, lack of cord blood samples, and heterogeneity in the SGA diagnosis.67,68,69 SGA, especially in preterm infants < 32 weeks GA, is commonly used as a proxy for intrauterine growth restriction.70 However, the two are not interchangeable,69 as the definition of SGA does not distinguish between neonates who are small but otherwise healthy and those with growth restriction due to pathological conditions, as one would see in intrauterine growth restriction.71,72 Moreover, the definition of SGA used here is based on a set threshold according to the Dutch reference curve,28 and is thus used categorically. Some protein levels may, however, depend on birth weight as a continuous measure instead, which may induce variation and thus limit statistical power in the comparison. Secondly, MS-based proteomics as applied in this study is not capable of distinguishing between the activated and non-activated forms of proteins, such as complement component 5 and its cleaved C5a counterpart. Therefore, we have to be careful in drawing conclusions on the changes in the activation of elements in the complement cascade that preterm infants undergo. Further exploration could be performed with MS using different acquisition approaches that are capable of evaluating cleavage-specific changes in proteins across the entire complement cascade. Another limitation that needs to be considered is the relatively low number of infants included in the study. Given the explorative nature of our study, we did not perform a power analysis beforehand.

Conclusions

This work demonstrates the potential of unbiased MS-based serum profiling in a vulnerable population of preterm infants with low blood volumes. This approach allowed for the evaluation of systematic changes in development after preterm birth through characterization of biological processes in which these circulating proteins are involved and further characterization of the complement cascade, one of the major processes in the human body. We also demonstrate that longitudinal monitoring of this population with this approach can provide insight into perturbations associated with SGA, and can thus be an important tool to study disease development and progression in preterm infants. These findings provide a stepping stone to focus on health complications that are frequently encountered in preterm infants, along with long-term health consequences.