This study was performed using sera from children participating in either the BABYDIAB or the BABYDIET study. These birth cohort studies enrolled children with a family history of type 1 diabetes and are prospectively monitoring the natural history of islet autoimmunity and type 1 diabetes. Together, they have enrolled 2441 children [20, 21]. By November 2014, 124 children had developed multiple islet autoantibodies and 82 of these children had progressed to clinical type 1 diabetes.
Islet autoantibodies were measured using radiobinding assays as previously described [8, 20]. The antibody assays were evaluated in the Diabetes Autoantibody Standardization Program (Laboratory 121) [23–25]. Diabetes was diagnosed according to the ADA Expert Committee criteria. Both studies were approved by the ethics committee of Bavaria, Germany (Bayerische Landesärztekammer No. 95357 and Ludwig-Maximilians University No. 329/00, respectively), and adhered to the principles of the Declaration of Helsinki.
Sample selection and study design
The analysis was performed in two phases: a peptide-selection phase, in which shotgun proteomics was used to identify peptides of potential interest, and an application phase, in which these peptides were measured by targeted proteomics (Fig. 1 and electronic supplementary material [ESM] Fig. 1). For the selection phase, we applied shotgun proteomics to samples from children who developed islet autoantibodies and progressed to clinical diabetes within 3.5 years (‘rapid’ progression: 15 children; median follow-up from seroconversion 1.9 years, interquartile range [IQR] 1.0–2.9 years, range 0.5–3.3 years) or ≥9.5 years (‘slow’ progression: 15 children; median follow-up from seroconversion 14.5 years, IQR 12.9–15.5 years, range 9.5–17.4 years), and from 15 children who remained islet autoantibody-negative (median follow-up from birth 15.9 years, IQR 14.2–17.4 years, range 5.9–21.7 years) matched for sex and age (Fig. 1). Samples from two time points were analysed separately. Specifically, one sample from each child was obtained shortly after seroconversion to the first islet autoantibody (median 0.8 years, IQR 0.3–1.4 years; sample set 1) or at the corresponding age in islet autoantibody-negative children, while the other sample was obtained at a later time (median 1.2 years after the first sample, IQR 0.8–2.9 years; sample set 2). Four children were excluded from sample set 2 in the selection phase because they had already progressed to overt diabetes by the time the second sample was collected after seroconversion.
For the application phase, we randomly selected 70 of the remaining children who developed islet autoantibodies (median age 3.2 years, median follow-up time 12.8 years, IQR 9.6–16.6 years) and 70 sex- and age-matched islet autoantibody-negative children (median age 3.1 years, median follow-up time 10.8 years, IQR 7.2–14.4 years) (Fig. 1).
We performed targeted proteomics on the peptides that discriminated between groups in the selection phase (see detailed description below). Samples from the 70 islet autoantibody-positive children were obtained shortly after seroconversion (median 1.0 years, IQR 0.5–1.3 years; Fig. 1) and 60 children were multiple islet autoantibody-positive at the time of proteomics measurement.
Sample preparation for MS
Plasma samples were depleted of highly abundant proteins and proteolysed with trypsin as previously described. All samples were randomly distributed into one of three batches for processing, and the experimenters were blinded to the sample-group allocation during the experiment. For quality control of depletion, digestion and MS measurements, each sample was spiked with ribulose-1,5-bisphosphate carboxylase oxygenase (Sigma Aldrich, Taufkirchen, Germany) at a final amount of 50 fmol in each 10 μl serum sample. After digestion, samples were stored at −80°C until further use.
Non-targeted liquid chromatography tandem MS (LC-MS/MS) and label-free quantification
LC-MS/MS analyses were performed as previously described on an LTQ-Orbitrap XL instrument (Thermo Fisher Scientific, Dreieich, Germany) operated with an RSLC system (Ultimate 3000, Thermo Fisher Scientific). The RAW files (Thermo Fisher Scientific) were analysed using the Progenesis LC-MS software (version 4.0; Nonlinear Dynamics, Waters, Eschborn, Germany), as previously described [27, 29].
Targeted LC-MS/MS using selected reaction monitoring (SRM)
Skyline software (MacCoss Lab Software, Seattle, WA, USA) was used to create the SRM assays. We developed and optimised an SRM assay if at least one peptide per protein satisfied the quality criteria defined using the AuDIT algorithm for reproducible and reliable SRM measurement. Isotope-labelled, synthetic peptides (heavy peptides; PEPotec; Thermo Fisher Scientific, Ulm, Germany) were used as internal controls for correct signal integration and relative quantification. The heavy peptide mix was added to the digested sample before the MS measurement.
SRM-MS analyses were performed on a Tempo Nano MDLC system (Eksigent Technologies, Dublin, OH, USA) coupled online to a triple quadrupole QTrap4000 (AB SCIEX, Framingham, MA, USA) MS equipped with a nanospray ion source. During the MS measurements, the preselected proteotypic peptides were fragmented and the areas under the chromatographic curves of the resulting transitions formed the basis of the SRM quantifications.
Processing of SRM data
SRM data were processed using the Skyline software as previously described. Briefly, after manual quality control, heavy to light peptide ratios were calculated at the fragment level, log2 transformed and corrected for batch effects by linear regression; fragment values were then averaged to obtain peptide values. The peptide values were normalised against control protein peptides and are referred to as adjusted intensities. Peptides with unreliable signals (>20% of measurements below the limits of quantification per peptide) were removed, resulting in robust SRM assays for 82 peptides covering 50 proteins (ESM Table 1).
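The processing steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, the orientation of the ratio (light over heavy) is an assumption, and batch correction is shown for the simplest case of a linear model with batch as the only covariate, which reduces to subtracting batch means. Normalisation against control protein peptides is omitted.

```python
import math
from collections import defaultdict
from statistics import mean


def log2_ratio(light_area, heavy_area):
    """log2 ratio of endogenous (light) to spiked (heavy) peak area.

    The orientation of the ratio is an assumption for illustration.
    """
    return math.log2(light_area / heavy_area)


def remove_batch_effect(values, batches):
    """Residuals of a linear model with batch as the only covariate.

    Equivalent to subtracting each batch mean; the grand mean is added
    back so that corrected values stay on the original scale.
    """
    grand_mean = mean(values)
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_mean = {b: mean(vs) for b, vs in by_batch.items()}
    return [v - batch_mean[b] + grand_mean for v, b in zip(values, batches)]


def fragments_to_peptide(fragment_values):
    """Average fragment-level values (one list per fragment) per sample."""
    return [mean(sample) for sample in zip(*fragment_values)]


def is_reliable(values, loq, max_fraction_below=0.20):
    """Keep a peptide unless more than 20% of its values fall below the LOQ."""
    below = sum(1 for v in values if v < loq)
    return below / len(values) <= max_fraction_below
```

In this sketch a peptide with exactly 20% of measurements below the limit of quantification is retained, matching the strict ">20%" removal criterion stated above.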
Statistical analysis in the selection phase
In the selection phase, using a univariate non-parametric test (Wilcoxon rank-sum test), we assessed group differences in both sample sets (one collected shortly after seroconversion and one collected at a later time point) between: (1) islet autoantibody-positive vs autoantibody-negative children; (2) autoantibody-negative children vs slow progressors; (3) autoantibody-negative children vs rapid progressors; and (4) slow vs rapid progressors. Multiple hypothesis testing was corrected for by controlling the false discovery rate (FDR) at 0.05.
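As an illustration of this univariate testing step, the following Python sketch implements a two-sided Wilcoxon rank-sum test and an FDR adjustment. Two points are assumptions rather than details taken from the study: the p value is obtained from the normal approximation (midranks for ties, no continuity correction) rather than an exact distribution, and the FDR is controlled with the Benjamini-Hochberg procedure, which the text does not name explicitly.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the Mann-Whitney U statistic for x and the p value.
    """
    pooled = sorted((v, g) for g, group in enumerate((x, y)) for v in group)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = w - n1 * (n1 + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all values identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under these assumptions, a peptide would be called significant in a given comparison if its adjusted p value is below 0.05.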
A double cross-validation (dCV) approach was then used to identify multivariable predictive protein and peptide signatures for the same eight comparisons (two sample sets and four group comparisons each). This approach selected a minimal combination of peptides that provided high discriminative accuracy, and estimated an unbiased, non-over-fitted AUC. A detailed explanation of the approach and the parameter settings used in our study can be found in the ESM Method.
Peptides occurring with at least 75% selection frequency in at least one of the eight comparisons were compiled into a candidate ‘selection’ list. To maximise our coverage, this list was extended by 14 peptides that were reported in a recent proteomics study.
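The compilation of the candidate list from selection frequencies can be sketched as follows. This is a minimal illustration that assumes the dCV procedure returns, for each comparison, one set of selected peptides per resampling run; the data layout, function names and peptide labels are hypothetical and are not taken from the ESM Method.

```python
from collections import Counter


def selection_frequency(selected_sets):
    """Fraction of resampling runs in which each peptide was selected."""
    counts = Counter(p for selected in selected_sets for p in set(selected))
    n_runs = len(selected_sets)
    return {peptide: count / n_runs for peptide, count in counts.items()}


def candidate_list(runs_by_comparison, threshold=0.75):
    """Peptides reaching the threshold in at least one comparison."""
    candidates = set()
    for runs in runs_by_comparison.values():
        frequencies = selection_frequency(runs)
        candidates.update(p for p, f in frequencies.items() if f >= threshold)
    return sorted(candidates)


# Hypothetical example with two comparisons and four runs each.
example = {
    "comparison_1": [{"A", "B"}, {"A"}, {"A", "C"}, {"A", "B"}],
    "comparison_2": [{"C"}, {"C"}, {"C"}, {"B"}],
}
print(candidate_list(example))
```

In the hypothetical example, peptide A is selected in all runs of the first comparison and peptide C in three of four runs of the second, so both reach the 75% threshold, whereas peptide B does not.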
Statistical analysis in the application phase
In the application phase, we tested for differences in peptide levels between islet autoantibody-positive and autoantibody-negative children using Wilcoxon rank-sum tests. To model the time from seroconversion to type 1 diabetes, we fitted univariate Cox regression models within the islet autoantibody-positive samples. Multiple hypothesis testing was corrected for by controlling the FDR at 0.05. Highly correlated peptides were identified using Pearson’s correlation coefficient.
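For readers unfamiliar with Cox regression, the univariate model can be sketched in a few lines of Python. This is an illustrative implementation only, not the software used in the study: it maximises the partial likelihood for a single covariate by Newton-Raphson iteration, treats tied event times with the Breslow approximation (an assumption), and omits safeguards such as step halving. The function names are hypothetical.

```python
import math


def cox_score_and_information(beta, times, events, x):
    """Score and observed information of the Cox partial log-likelihood."""
    score, information = 0.0, 0.0
    for i, (t_i, event_i) in enumerate(zip(times, events)):
        if not event_i:
            continue  # censored subjects contribute only to risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                weight = math.exp(beta * x_j)
                s0 += weight
                s1 += weight * x_j
                s2 += weight * x_j * x_j
        weighted_mean = s1 / s0
        score += x[i] - weighted_mean
        information += s2 / s0 - weighted_mean ** 2
    return score, information


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Return the log hazard ratio and its standard error."""
    beta = 0.0
    for _ in range(max_iter):
        score, information = cox_score_and_information(beta, times, events, x)
        if information == 0:
            break
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    _, information = cox_score_and_information(beta, times, events, x)
    return beta, 1 / math.sqrt(information)
```

Here `times` would hold the time from seroconversion to diagnosis or last follow-up, `events` would indicate whether diabetes was diagnosed, and `x` would hold the adjusted intensity of one peptide; `math.exp(beta)` gives the hazard ratio per unit increase in `x`.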
We again applied the dCV algorithm to find multivariable peptide signatures discriminating between islet autoantibody-positive and autoantibody-negative samples. A modified version of this algorithm that used Cox models instead of classification models was then applied to identify a predictive signature of progression time within the autoantibody-positive children. For the dCV analyses in the application phase, we also included age as an explanatory variable. Details on the dCV approach in the application phase can be found in the ESM Method.
Peptides with a selection frequency of at least 50% were used to fit a final Cox model, yielding progression time risk scores for each autoantibody-positive individual in the application set. These scores were divided into low-, medium- and high-risk tertiles. Differences in the survival curves between the tertiles were assessed using logrank tests. In order to investigate the improvement in discrimination conferred by the selected peptides in addition to age, a Cox model containing only age was compared with the combined model by ANOVA. In addition, the discrimination performance over time of the combined model and of age alone was evaluated using the survival AUC measure. As an overall measure of discrimination, an integrated AUC was calculated.
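The division into risk tertiles and the comparison of survival curves can be sketched as follows. This Python example is illustrative only: tertiles are assigned by rank, the logrank test is shown for two groups at a time, and the p value is taken from the chi-square distribution with one degree of freedom. The function names are hypothetical and the study's own implementation may differ.

```python
import math


def tertile_labels(scores):
    """Assign each risk score to the low, medium or high tertile by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(2, rank * 3 // n)]
    return labels


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns the chi-square statistic and p value."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of the event count in group a
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

With three tertiles, the function would be applied to each pair of groups (for example, high vs low risk); a joint test across all three groups would require the multi-group form of the statistic, which is not shown here.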
All analyses were performed using R version 3.2.0 (www.r-project.org).
GeneRanker software (Genomatix software suite V3.5; Genomatix, Munich, Germany) was used to evaluate protein enrichment. Gene symbols for the respective proteins were used as identifiers. Gene ontology enrichment was calculated by comparing all proteins that differed significantly between islet autoantibody-positive and autoantibody-negative children in the application phase against all proteins identified in plasma in the selection phase. Redundancies in enriched terms for biological processes were curated manually.