SPIROMICS is a longitudinal, multi-center, observational study with two major goals: 1) to provide robust criteria for sub-classifying COPD participants into groups to evaluate therapeutic efficacy during clinical trials; and 2) to identify biomarkers to use as intermediate outcomes to predict clinical benefit reliably during therapeutic trials [1]. SPIROMICS is enrolling 3200 participants who will undergo a baseline visit and three annual follow-up visits (four visits in total over three years). Visits will include detailed clinical evaluation. Blood specimens are collected at baseline and during visits 2 and 4. Blood analytes will be measured to determine whether they may provide a picture of COPD clinical phenotypes relevant to the two broad goals of the study.

It is well appreciated that some blood analytes are more reliably measured in one sample type than in others, e.g., serum versus plasma, and that absolute levels of analytes can vary depending upon the nature of blood processing [2–5]. During coagulation in serum samples, clot formation removes proteins from the blood sample (e.g., fibrinogen) and platelet activation releases proteins such as proinflammatory cytokines and various metabolites, which can alter analyte levels relative to plasma. Platelet activation can also affect measured levels of analytes in plasma, despite the addition of additives that prevent clot formation. This effect is due to release of the analyte from the platelets during processing [6]. Interactions of platelets and platelet mediators with leukocytes may cause leukocytes to release mediators as well. Furthermore, protein/analyte degradation during sample preparation and storage can also affect analyte measurements.

A variety of analytes of potential interest to COPD pathogenesis are known to have different values in plasma and serum (e.g., fibrinogen, matrix metalloproteinases, cytokines). Hence, the choice of blood sample type for analyte quantification is important for SPIROMICS [7, 8]. To maximize the types of analytes and assays that can ultimately be performed in this cohort, serum, EDTA plasma, and EDTA plasma plus proteinase inhibitors (P100™) specimens are all being collected in SPIROMICS. Because many samples will be collected over multiple time points, selecting the most reliable blood sample type for each analyte will improve reliability, both longitudinally and across samples, and will conserve resources.

The choice of measurement platform is also of crucial importance. Methods for blood analyte analysis vary from routine ELISA-based, single-analyte measurements to large-scale proteomics and metabolomics analyses that measure thousands of analytes simultaneously. Intermediate coverage platforms, such as ELISA-based methods conducted via multiplexing, e.g., Luminex™ (Luminex Corp, Austin, TX), are also viable options [9]. While the sensitivity of assays measuring individual analytes is likely to be higher than multiplex assays, single analyte analysis is expensive in terms of sample usage (volume) and cost per analyte. The purpose of this pilot study was to use the multiple types of blood samples collected within SPIROMICS to determine whether certain groups of analytes measured via multiplexing can be measured more reliably in one sample type versus another. SPIROMICS investigators selected a battery of analytes that were of interest to the goals of SPIROMICS and analyzed 105 specific analytes grouped in 12 multiplexes, plus a simplex for microalbumin, analyzed in serum, EDTA plasma and P100 plasma.


Sample collection

Blood is being collected from SPIROMICS participants as part of their baseline (initial) visit and additionally 1 and 3 years after the baseline visit. SPIROMICS subjects are requested to fast after midnight, and blood is drawn early in the day of the study visit. For the full SPIROMICS cohort, eight tubes are collected in the following order: two 8.5 mL red-stoppered serum tubes [Vacutainer® Plus plastic serum tube; Becton-Dickinson (BD) Diagnostics, Franklin Lakes, NJ; product number 367888]; one 10 mL yellow-stoppered tube containing 1.5 mL ACD anticoagulant (BD product number 364606); two 10 mL and one 4 mL lavender-stoppered tubes containing a sprayed-on K2EDTA anticoagulant (BD product numbers 366643 and 367861); one 8.5 mL P100 red-stoppered plasma collection tube with a mechanical separator, sprayed-on K2EDTA anticoagulant, and proprietary protease inhibitor additives (antiproteases; BD product number 366448); and one 2.5 mL red-stoppered tube with RNA preservation solution (Paxgene™ RNA, BD; product number 762165). All samples are processed within one hour of collection, aliquoted, frozen at −80°C, shipped to the SPIROMICS Biospecimen Processing Center, and kept frozen at −80°C for future use. Mean processing times for the samples used in this study were 39, 32, and 49 minutes for serum, EDTA plasma, and P100 plasma, respectively. Processing involves immediate inversion of tubes several times after sample draw, followed by centrifugation at room temperature: for serum and EDTA plasma, 1100–1300 relative centrifugal force (RCF) for 10 minutes in a swinging bucket rotor or 15 minutes in a fixed angle centrifuge; for P100 plasma, 2500 RCF for 15–20 minutes or 1100–1600 RCF for 30 minutes. SPIROMICS protocols require dividing each blood collection tube into aliquots of 150 μL to minimize freeze-thaw cycles. The 13 plexes run in this pilot study required 3 aliquots each of serum, P100 plasma, and EDTA plasma from each participant. The aliquots were sent frozen to Myriad-RBM, where they were thawed, pooled, diluted, and immediately used for analyte determination according to standard practices. Each pooled sample was run in duplicate, providing 2 replicates from the same blood draw for each blood sample type.

Selection of analytes and multiplexes

We first identified priority biomarker candidates based on known COPD pathophysiology and previously published literature, then selected from the assays available at Myriad-RBM, which were primarily multiplexes (Luminex xMap technology, Myriad-RBM Inc., Austin, TX). Each multiplex measured a number of analytes in addition to the priority biomarkers. The number of analytes per plex varied from 1 to 14 (see Additional file 1: Table S1 and Additional file 2: Table S2). In total, 105 analytes were evaluated on the 13 plexes, 12 of which were multiplexes.

Selection of samples

We next selected samples from 24 SPIROMICS participants chosen to represent individuals with a range of disease severity, as assessed by Global Initiative for Chronic Obstructive Lung Disease (GOLD) spirometric stage classification at the time of the blood draw [mean/median age 64/65 years; 12 females, 12 males; 6 non-smokers, 3 at-risk smokers, 5 GOLD stage 1 (mild), 4 GOLD stage 2 (moderate), 6 GOLD stage 3 (severe)]. Due to the small sample size, there was no intent to use the data from this study to correlate analyte levels with clinical phenotype; however, the range of GOLD spirometric stages provided an opportunity to assess some analytes whose blood levels may vary with disease conditions associated with COPD.

Results received

The measured concentration of each analyte, as well as the lower limit of quantification (LLOQ), least detectable concentration (LDD; concentration three standard deviations above the diluent blank reading), and the low to high normal range, were provided by Myriad-RBM. The LLOQ was used as the lowest reliable value. It is defined as the lowest concentration of analyte that is reliably detected and at which the total error meets the laboratory's requirement for precision. In this case, that requirement is met at the analyte concentration at which the coefficient of variation of replicate standard (Myriad-RBM defined) samples is 30%. If a sample value was below the LLOQ, it was reported as < LLOQ for that analyte. Rarely, a sample could not be measured due to technical problems during processing; these were reported as ND (not determined).

Statistical methods: measurability and reliability

For each analyte, identical analyses were carried out for the three sample types. In the first step, the percentage of samples below the LLOQ was calculated, and the number of subjects with both replicates ≥ LLOQ was determined. No further analysis was done on a particular analyte if one or both replicates were less than LLOQ in more than 50% of subjects. For the remaining analytes, descriptive statistics were calculated for the subset of subjects with both replicate values ≥ LLOQ including the mean, within-subject standard deviation, reliability coefficient, and within-subject coefficient of variation (CV). All statistical analyses were conducted using SAS version 9.2.
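The screening step described above can be sketched in a few lines of Python. This is an illustration only, not the SAS code used in the study: the function name `measurability`, the dictionary keys, and the convention of encoding values reported as < LLOQ or ND as `None` are all assumptions made for the example. In particular, how ND values entered the percentage calculation is not stated in the text; here they are counted as not measurable.

```python
from typing import Optional, Sequence, Tuple

# One subject's duplicate measurements; None stands for "< LLOQ" or "ND"
# (an encoding assumed for this sketch, not taken from the study data).
Replicates = Tuple[Optional[float], Optional[float]]


def measurability(pairs: Sequence[Replicates], lloq: float) -> dict:
    """Summarize detectability of one analyte in one sample type."""
    if not pairs:
        raise ValueError("at least one subject is required")

    def measurable(value: Optional[float]) -> bool:
        return value is not None and value >= lloq

    n_subjects = len(pairs)
    # Individual replicate values that fall below the LLOQ (or are missing).
    n_below = sum(1 for pair in pairs for v in pair if not measurable(v))
    # Subjects whose two replicates are both >= LLOQ.
    n_both = sum(1 for a, b in pairs if measurable(a) and measurable(b))
    n_failed = n_subjects - n_both

    return {
        "pct_below_lloq": 100.0 * n_below / (2 * n_subjects),
        "n_both_measurable": n_both,
        # Dropped only if MORE than 50% of subjects have one or both
        # replicates below the LLOQ.
        "analyze_further": n_failed <= 0.5 * n_subjects,
    }
```

Only the subjects counted in `n_both_measurable` would contribute to the descriptive statistics for an analyte that passes the screen.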

The within-subject and between-subject variance were calculated from the overall variance using a simple random effects linear model with a random subject intercept. The within-subject standard deviation is the square root of the within-subject variance. The reliability coefficient is the ratio of the between-subject variance and the total variance in the samples. Reliability is dependent on the overall variability of the samples. Thus, for analytes with very little variability between subjects, small variation in replicate samples within a subject may result in a reduced reliability coefficient. Conversely, for analytes with large variability between subjects, a relatively large variation in replicate samples within a subject may still result in a favorable reliability coefficient.
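For readers who want to reproduce these quantities, the following Python sketch estimates the two variance components from duplicate measurements and derives the reliability coefficient. It uses the classical one-way ANOVA (method-of-moments) estimator for balanced duplicates, which is a substitute for, and may differ numerically from, the random-effects model fitted in SAS for this study. The function names are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def variance_components(pairs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (between-subject variance, within-subject variance).

    One-way random-effects ANOVA estimator for balanced duplicates
    (k = 2 replicates per subject); an assumed stand-in for the
    mixed-model fit described in the text.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two subjects are required")
    k = 2
    grand_mean = mean(v for pair in pairs for v in pair)
    subject_means = [mean(pair) for pair in pairs]

    ss_within = sum(
        (v - m) ** 2 for pair, m in zip(pairs, subject_means) for v in pair
    )
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)

    ms_within = ss_within / (n * (k - 1))
    ms_between = ss_between / (n - 1)

    var_within = ms_within
    # Truncate at zero: the moment estimator can go negative by chance.
    var_between = max((ms_between - ms_within) / k, 0.0)
    return var_between, var_within


def reliability(var_between: float, var_within: float) -> float:
    """Between-subject variance as a fraction of total variance."""
    total = var_between + var_within
    return var_between / total if total > 0 else float("nan")
```

For example, the duplicates (10, 12), (20, 22), (30, 32) give a within-subject variance of 2 and a between-subject variance of 99, so the reliability coefficient is 99/101, about 0.98. Shrinking the spread between subjects while keeping the same replicate differences lowers the coefficient, which is the dependence on overall variability noted above.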

The within-subject coefficient of variation (CV) was calculated as the ratio of the within-subject standard deviation and the overall mean, multiplied by 100 (expressed as percentage). Smaller values of the CV indicate that the within-subject variation is small compared to the mean. In general, a CV of less than 10% is considered acceptable [10]. Alternatively, an analytic CV one half of the biologic CV may be useful in identifying change in analyte value [11]. Due to the large number of analytes and the relatively small sample size, no tests of statistical significance were performed. The following descriptive comparisons between sample types were identified as notable: differences in reliability >15%, ratios of coefficient of variation < 0.667 or > 1.5, and ratios of means < 0.667 or > 1.5.
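The CV definition and the descriptive cut-points can be expressed as a short Python sketch. The function names and the dictionary layout are hypothetical, and the reliability cut-point of 15% is interpreted here as an absolute difference of 0.15 on the 0 to 1 reliability scale, which is an assumption about how the comparison was applied.

```python
from math import sqrt


def within_subject_cv(var_within: float, overall_mean: float) -> float:
    """Within-subject SD divided by the overall mean, as a percentage."""
    return 100.0 * sqrt(var_within) / overall_mean


def notable_differences(a: dict, b: dict,
                        rel_cut: float = 0.15,
                        ratio_low: float = 0.667,
                        ratio_high: float = 1.5) -> dict:
    """Flag notable differences between two sample types for one analyte.

    `a` and `b` each hold 'reliability' (0-1), 'cv' (%) and 'mean' for
    one sample type; ratios and differences are taken as a relative to b.
    """
    cv_ratio = a["cv"] / b["cv"]
    mean_ratio = a["mean"] / b["mean"]
    reliability_diff = a["reliability"] - b["reliability"]
    return {
        "reliability": abs(reliability_diff) > rel_cut,
        "cv": cv_ratio < ratio_low or cv_ratio > ratio_high,
        "mean": mean_ratio < ratio_low or mean_ratio > ratio_high,
    }
```

As a usage example, a within-subject variance of 4 around an overall mean of 20 gives a CV of 10%. Comparing a hypothetical serum result (reliability 0.95, CV 6%, mean 30) with a plasma result (reliability 0.75, CV 12%, mean 25) flags the reliability and CV differences as notable but not the difference in means, since the mean ratio of 1.2 lies within the 0.667 to 1.5 band.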


Reliability and CV are similar across blood sample types, with important exceptions

The majority of analytes produced similar results in all sample types, both based on detection and on measurements of reliability and CV between duplicate samples (Figures 1, 2, and 3; Additional file 2: Table S2, Additional file 3: Figure S1). Exceptions are listed in Table 1 and highlighted in Figures 2 and 3.

Figure 1

Scatterplots of coefficient of variation (CV) and reliability score for all consistently detectable analytes for 24 subjects. Coefficient of variation (CV; left panel) and reliability (right panel) are plotted as shown for either serum and EDTA plasma (top row) or P100 and EDTA plasma (bottom row). Outliers in these figures with either CV > 20% or reliability < 0.60 are discussed further in Table 1.

Figure 2

Plots indicating analytes with notable differences in CV or reliability between serum and EDTA plasma for 24 subjects. A) The ratio of CV (serum:EDTA plasma) is plotted in rank order from largest to smallest by analyte. Analytes with notably better CV in EDTA plasma (ratio > 1.5; lower CV in plasma) and notably better CV in serum (ratio < 0.667; lower CV in serum) are indicated. B) The difference in reliability score between serum and plasma (serum minus plasma) is plotted in rank order from largest to smallest by analyte. Analytes with notably better reliability in serum versus plasma (difference > +0.15) and better reliability in EDTA plasma versus serum (difference < −0.15) are indicated. Horizontal lines indicate descriptive cut-points used to define notable performance.

Figure 3

Plot indicating analytes with notable differences in mean levels between serum and EDTA plasma for 24 subjects. The ratio of measured mean analyte levels (serum:EDTA plasma) is plotted in rank order from largest to smallest by analyte. Analytes with notably higher levels measured in serum (ratio >1.5) and EDTA plasma (< 0.667) are indicated. Horizontal lines indicate descriptive cut-points used to define notable differences.

Table 1 Analytes identified that showed variation in measurement performance across blood sample types

Reliability and CV were not determined for 23 of the 105 analytes (22%) because they were not detectable in both replicates for at least 50% of subjects within each of the three sample types (Table 1). Four analytes (IL1A, IL1RN, IL12B, and OLR1) were consistently detected in serum but not in plasma; conversely, three analytes (fibrinogen [FGA_FGB_FGG], IFNG, and MMP2) were consistently detected in plasma samples but not in serum. An additional 7 analytes (7%) were consistently detected in at least one of the 3 sample types, but had low reliability (< 60%) or high CV (> 20%) in the sample types in which they were consistently detected.

Twelve analytes (11%) performed notably better in serum versus EDTA plasma and another 11 better in EDTA plasma than serum, based on descriptive cut-points for differences of reliability and ratios of coefficient of variation values (see Table 1, Figures 1 and 2). The majority of the analytes performed similarly in EDTA plasma compared to P100 plasma (Figure 1). Only one analyte (CCL4) performed better in P100 plasma relative to EDTA plasma, yet the difference was modest (Additional file 2: Table S2). Eight analytes showed notably better performance (lower CV) in EDTA plasma compared to P100 (Table 1). Reliability of EDTA and P100 plasma was within 15% for all consistently detectable analytes.

Absolute values vary across sample types

Except for analytes that were only detected in either serum or plasma, the mean expression values differed by > 1.5 fold between plasma and serum for 20 analytes (19%) (analytes listed in Table 1; Additional file 2: Table S2; Supplemental Figure 1 displays the mean and a dot plot for each analyte and sample type). Of these, serum produced higher values for 17 analytes (16%), while plasma produced higher values for three (3%). We identified no notable differences in mean levels between EDTA plasma and P100 plasma samples (data not shown).

Individual analyte performance characteristics are not consistent within multiplexes

Each multiplex had its own unique features related to analyte detection and sample-type performance, which are summarized in Table 2. Within each multiplex, a majority of the analytes were detectable in either serum or plasma, with the exception of Myriad-RBM multiplexes HMPC49 and HMPCORE1, where 3/5 and 12/17 analytes, respectively, were not detectable or not reliably measured. Multiplex HMPCORE4 performed with noticeably worse CV for most analytes compared to other multiplexes, with only one analyte (GC) having a CV < 10%. Serum had noticeably better reliability and/or CV for detected analytes in HCVD4, HMPC62, and HMPC83, while plasma generally performed better for detected analytes in HMP8, HMPC35, and HMPCORE4. For multiplexes HMPC19, HMPCORE1, and HMPCORE2, certain represented analytes performed better in serum and others in plasma. For HMPC42 and HMPC84, all sample types performed similarly.

Table 2 Summary of blood sample type differences across 12 multiplexes for 24 subjects


Easily measurable biomarkers that mark complex disease phenotypes, such as those found in COPD, would be extremely valuable for the purposes of diagnosis, treatment individualization, patient selection for clinical trials, and as surrogate markers for disease progression. While the number of clinically available biomarkers for such purposes remains low, it is likely to increase as new biomarkers are discovered and as analytical methods continue to improve. Blood is an ideal sample for biomarker collection given the comparative ease of collection. Depending upon the analyte to be measured, the choice of serum versus plasma and analytical platform can be critical decisions.

In terms of overall quantification and detectability, we found that results using serum and plasma were similar for most measured analytes. The four major exceptions [OLR1, IFNG, MMP2, FGA_FGB_FGG (fibrinogen); Table 1] are consistent with previous observations. Fibrinogen is depleted in serum, since it is removed during the clotting process [12]. Poor detection of MMP2 in the serum as compared to plasma is consistent with some previous findings [8], but not others [13]. Our finding that IFNG was measured at reduced levels in serum compared to plasma is also consistent with published findings [14]. The mechanism of this loss is not clear, but the observation is critical for interpretation of data describing the innate and adaptive immune responses in tissue, where pro-coagulant activity may result in a falsely low measurement of this critical cytokine. Why OLR1 (oxidized low density lipoprotein receptor 1) was identified only in serum is not clear, but the observation likely has analytical and/or biological relevance. Thus, results of our studies are congruent with other studies suggesting that platelet activation alters concentrations of many analytes during sample processing. This effect will increase serum values, relative to plasma, for factors that are released from platelets or leukocytes during clotting, and conversely decrease serum values, relative to plasma, for factors that co-localize with clots.

We believe that similar biologic behaviour may explain our results for the far larger number of analytes for which the mean expression values differed (Supplemental Figure 1, Table 1) even though values > LLOQ were measured for all sample types. For example, VEGFA and SERPINE1 are known to be localized in platelet granules, and higher levels of these analytes in serum versus plasma are expected [15, 16]. The results of our studies are congruent with other studies suggesting the role of platelet activation in the release of many analytes during sample processing. Interestingly, there was no strong trend for higher mean levels to produce improved performance in reliability or CV; in fact, several analytes showed the reverse. Thus, alteration in analyte concentration due to blood coagulation during sample collection potentially interferes with the search for biomarkers that correlate with disease processes by obscuring actual circulating levels in the patients. Inhibition of platelet activation, which can occur even in plasma during blood sample processing, is the logic behind the development of CTAD blood (plasma) collection tubes, which are designed to prevent platelet activation [17].

Nevertheless, the study did not identify any notable differences in mean analyte levels between EDTA plasma and P100 plasma, despite the presence of protease inhibitors in the P100 blood collection tubes. These results are similar to the findings of others using mass spectrometry and multiplex ELISA methods [18–20]. Enzymatic degradation has been reported to occur during blood collection and processing [5, 21]; protease inhibitors have stabilized the proteome in some studies [22]. However, any ongoing proteolysis may not necessarily result in loss of antigenicity, the basis for the multiplex assays used in this work. Rapid processing and careful storage may also have prevented degradation in this study. The substantial additional cost of the P100 plasma tubes should be carefully weighed against their potential benefit, and the results for the analytes tested here support the conclusion that P100 tubes are generally not required.

Multiplexes are conceptually economical in terms of cost, sample volume, or both. However, these economies are only achieved if the multiplexes yield detectable, valid, and reproducible results for the analytes of interest to the study. Several factors should be considered when selecting the most appropriate blood sample type for this format. First, the specific analytes comprising the multiplex should be weighed against disease processes and biological questions. For example, in multiplex HMPC19, FAS and HGF performed better in serum, whereas CCL16 performed better in plasma. In such a situation, the analytes of greatest research interest based on the underlying pathophysiology of the disease might drive the choice of sample type. Second, the value of economizing should be balanced against the necessity for sensitivity. For example, the largest multiplex represented in this study was HMPCORE1, which consisted of 17 analytes with links to inflammation or inflammatory processes, which are relevant to COPD pathogenesis. In this pilot, 12/17 of these analytes were not detected, or not detected reliably above the LLOQ, in any of the sample types. While this finding could indeed reflect the lack of inflammation in the selected subjects, it more likely results from loss of sensitivity compared to standard single-analyte immunoassays. Finally, for assays in which key analytes produce high CV, such as HMPCORE4 (likely due to the need to dilute the original samples many-fold to bring analyte levels within the range of the standard curve), the necessity of running duplicate, or even triplicate, samples should be considered. In SPIROMICS, the high CVs for CRP and fibrinogen are of particular relevance given the previous studies evaluating these proteins as biomarkers of disease status [23, 24].


Despite the inability to detect some analytes, likely because of limited sensitivity, the Myriad-RBM platform is useful for biomarker profiling in SPIROMICS. For many of the analytes evaluated in this study, the performance of the multiplex assays using serum versus EDTA plasma versus P100 plasma was similar. However, for certain potentially critical analytes (e.g., fibrinogen, MMP9, CRP) [23, 25], reliability and/or CV differed depending upon sample source. Additionally, in our study, several analytes of importance to COPD pathogenesis were measured but had a high (> 10%) CV between duplicate samples. The choice of sample type and analytical platform must ultimately depend on a balance between availability and the need for sensitivity and reliability. The data provided in this study will be useful to other investigators considering the use of serum or plasma for specific tests using the Myriad-RBM or other similar platforms.