Background

Dementia is a syndrome with a complex pathophysiology, characterised by a heterogeneous group of clinical features and pathological hallmarks (e.g., β-amyloidopathy, tauopathy, synapse loss, oxidative stress, inflammation) [1]. In particular, Alzheimer’s disease (AD) has a long pre-clinical phase (20–30 years) [2] with pathological detection requiring visualisation of on-going amyloidopathy and tauopathy in brains of affected subjects, preferably in an early phase of the disease [3].

Pathological diagnosis can be achieved by regulatory approved β-amyloid (Aβ)-positron emission tomography (PET) and Tau-PET imaging methods [4]. PET technology is considered a minimally invasive method, with little to no complications or risks for the subject. Harmonisation is required between tracers and interpretations performed by the individual investigators. Furthermore, each tracer utilises a different brain region as negative control and each has a different specificity towards Aβ-oligomers [5]. Only visual interpretation of scans by trained physicians is presently approved by the Food and Drug Administration (FDA) [6]. The method is expensive and requires highly specialised instrumentation, training, and logistics [7]. Results are used by clinicians to estimate Aβ-neuritic plaque density or tangle load (FDA Tauvid [8]) in adult patients with cognitive impairment who are being evaluated for AD and other causes of cognitive decline.

Cerebrospinal fluid (CSF), which surrounds the brain and the spinal cord, is considered a “mirror” of the brain [9]. CSF collection is a minimally invasive procedure, which is safe when performed with care and appropriate precautions [10, 11]. Changes in CSF protein biomarker profiles, when measured accurately, demonstrate abnormalities at least 20 years prior to the expected age of onset in dominantly inherited AD pedigrees [12], earlier than can currently be detected with imaging modalities [13]. In the last two decades, CSF diagnosis of AD has focused especially on the quantification of proteins that have been identified in plaques and tangles, such as Aβ1–42 and phospho-tau [14, 15]. More recently, other CSF proteins (e.g., alpha-synuclein [α-synuclein], Neurogranin [Ng], beta-site amyloid precursor protein cleaving enzyme 1 [BACE1], neuropentraxin2, neurofilament [NFL]) have revealed the presence of co-pathologies/co-morbidities in the brain, such as Lewy bodies or loss of synapses. The latter can help to predict the rate of future cognitive decline and will likely provide tools for better stratification of individuals for inclusion in clinical trials. At the regulatory level, efforts are on-going in Europe (European Medicines Agency [EMA]) and the USA (FDA) to qualify CSF proteins for inclusion in clinical trials [16, 17].

CSF AD biomarker analysis, especially for inter-laboratory comparisons, has previously been hampered by variability in pre-analytical handling, assay design, and laboratory performance, exacerbated by the absence of consensus on how to collect, process, and store CSF [18]. Consequently, considerable differences in absolute concentrations of AD biomarkers are reported by different centres using the same assay, leading to non-uniform cut-off values for an identical context of use. These difficulties have been mitigated by (i) the establishment of the Alzheimer’s Association quality control program, which has documented a continuous improvement of assay performance by dedicated vendors of the immunoassays [19], (ii) the release of certified reference methods for analysis of CSF Aβ1–42 [20] (no reference materials are available yet for the other CSF proteins), and (iii) the introduction of automated biofluid analysis platforms to reduce inter- and intra-laboratory variation [21]. These achievements have allowed diagnostic laboratories to establish internal operator training programs and have given assay vendors a tool to harmonise their results with the current best practice in the field. However, accurate quantification of CSF proteins still requires extensive standardisation of sample analysis, as well as standard operating procedures for collection and storage of CSF [22].

Several guidance papers have been published with the ultimate goal of generating a consensus on how to handle CSF samples for analysis of Aβ proteins [18, 22, 23]. Except for the most recent guidance [24] (Hansson et al. The Alzheimer’s association international guidelines for handling of cerebrospinal fluid for routine clinical measurements of amyloid β and tau. Alz Dementia. Submitted), most recommendations were based on expert opinion rather than experimental evidence. A key issue not considered in detail in the guidance documents is the methodology for CSF collection through lumbar puncture (LP). During LP, CSF can be collected by allowing it to drip into the collection tube (gravity drip) or by aspiration with a syringe (aspiration). Proponents of gravity drip maintain that when a syringe is used to aspirate CSF, the extra surface area of the syringe (even when it is polypropylene) may adsorb analytes and thus influence assay results, while others believe that if a suitable polypropylene syringe is used, the resulting assays for AD biomarkers will be unaffected [25]. The method of CSF collection is an important issue because taking volumes of CSF greater than 10 mL by gravity drip is time consuming and can be uncomfortable for patients. The time taken could be an impediment to routine CSF sampling if higher throughput is desired for both diagnosis and monitoring of AD. Standardised guidelines have been published for CSF extraction by LP [26]; however, these have yet to provide evidence on the consistency of results across different extraction methods.

To objectively investigate potential effects of CSF collection methodology through LP when assaying AD biomarker concentrations, we provide a detailed comparison of gravity drip collection with aspiration using a polypropylene syringe during the same collection for each subject. All other pre-analytical and analytical aspects of the procedures for sample analysis were identical. We hypothesised that analyte concentrations would be unaltered by extraction method, not only for Aβ and tau proteins, but also for synapse proteins.

Methodology

Participant information

CSF samples from 50 participants of the Australian Imaging, Biomarkers and Lifestyle (AIBL) study were collected using both gravity drip and aspiration methods during the same LP visit. Participants were deemed either cognitively normal (CN; N = 36; 72%), to have mild cognitive impairment (MCI; N = 8; 16%) or to have AD (N = 6; 12%) after clinical and neuropsychological assessments, conducted as previously described [27]. Clinical assessment was performed within 6 months of CSF collection. Data for clinical parameters such as Mini-Mental State Exam (MMSE), Clinical Dementia Rating (CDR) score, the AIBL Pre-clinical Alzheimer’s Cognitive Composite (AIBL-PACC), and Apolipoprotein E ε4 (APOE ε4) allele status were available for all participants. All data presented in this study are cross-sectional.

Lumbar puncture

CSF was collected by LP, in the morning, from overnight fasted participants, using protocols described in detail elsewhere [28]. Briefly, the LPs were performed with the subjects in a sitting position, using a Temena (Polymedic®, EU, tamena.com) spinal needle micro-tip (22/27G × 103 mm; CAT 21922–27), or a RapID set pencil point spinal needle (25G; Smiths Medical ASD, Inc., Keene, NH, USA) if there was difficulty with the fine needle. Initially, up to 6 mL of CSF was aspirated for routine microbiological/biochemical assessment and other concurrent studies, then 8 mL of CSF was collected by gravity drip into a 15 mL polypropylene tube (Greiner Bio-One 188271, Fisher Scientific, Goteborg, Sweden), and placed onto wet ice. After gravity collection, a polypropylene syringe (BD, North Ryde, NSW, Australia) was used to aspirate 2 mL of CSF, which was then transferred to a second Greiner Bio-One 188271 polypropylene tube, on wet ice. Samples were processed within 1 h by centrifugation (2000 × g, 4 °C, for 10 min) and the supernatant transferred to a new Greiner Bio-One 188271 polypropylene tube before being aliquoted in 300 μL volumes into Nunc Cryobank polypropylene tubes (NUN374088, Thermo Fisher, MA, USA). Samples were stored in liquid nitrogen vapour tanks and only thawed once, immediately before analysis. Prior to thawing, CSF was shipped on dry ice to ADx NeuroSciences and stored at −80 °C until the biomarker analysis. The time taken to collect the CSF ranged from 10 to 15 min for gravity drip and from 0.5 to 1 min for aspiration. Parameters of the sample collection and adverse incidents have been reported previously [28].

Biomarker assay

Samples from the 50 participants were tested for six analytes in the facilities of ADx NeuroSciences: α-synuclein, Aβ1–42 (herein reported as Aβ42), Aβ1–40 (herein reported as Aβ40), total tau, BACE1 and Ng (trunc P75) (assay details are described in Table 1). In parallel, all samples were checked for blood contamination by testing for haemoglobin (Hb) content using an in-house developed Hb assay [29]. The latter is required to allow a more accurate interpretation of the α-synuclein concentrations obtained [30]. Samples from one subject were added onto each ELISA plate, aimed at reducing inter-plate variability. Furthermore, for Aβ40 analysis the samples were pre-diluted 21-fold with the sample diluent into a predilution plate that was treated with the same diluent for 30 min (as described previously in [31]). For the other five analytes, the samples were added directly into the antibody-coated plates for analysis. The reported concentrations were based on duplicate sample measurements for all six analytes except for Aβ42, Aβ40, total tau, and BACE1, for which 7, 8, 1, and 2 samples, respectively, were tested singly (due to a technical error or insufficient remaining volume). All samples had a value within the measuring range as defined by the provided calibrators.

Table 1 Assay characteristics for each individual biomarker

Different levels of run-validation acceptance criteria were integrated in the test procedure. For each test run, both kit control samples (Positive Control 1 and 2, prepared using lyophilized calibrators) were within the acceptance range as described on the certificate of analysis of the kit lot used. Criteria for blank value (OD < 0.100) and calibrator curves (OD highest calibrator > 1.2) were met. In addition, three “in-house” QC samples (QC1, QC2, QC3) were included in each assay run for assessment of test run performance. The QC samples were prepared from neat CSF obtained from a commercial source. QC1 and QC2 were each obtained from an individual subject, while QC3 was composed of a pool of two individual CSF samples. Samples for this purpose were collected retrospectively, and no clinical diagnosis is available for these samples. After their preparation, the QC samples (QC1, 2, and 3) were aliquoted and frozen. Before each run, an aliquot was thawed and included in the test run (duplicate testing).
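The run-acceptance rules above can be sketched as a simple check. This is a minimal illustration and not the laboratory's software: the function name `run_is_acceptable`, its arguments, and the example control ranges are hypothetical. Only the two OD thresholds (blank < 0.100, highest calibrator > 1.2) are taken from the text; the control acceptance ranges would come from the kit's certificate of analysis.

```python
def run_is_acceptable(blank_od, top_calibrator_od, controls):
    """Return True if a test run meets the acceptance criteria.

    controls: list of (measured, low, high) tuples. low/high stand for the
    acceptance range on the kit's certificate of analysis (hypothetical
    values are used in the example below).
    """
    blank_ok = blank_od < 0.100          # blank criterion stated in the text
    curve_ok = top_calibrator_od > 1.2   # highest-calibrator criterion stated in the text
    controls_ok = all(low <= measured <= high
                      for measured, low, high in controls)
    return blank_ok and curve_ok and controls_ok


# Hypothetical run: two kit controls with made-up acceptance ranges.
example_controls = [(105.0, 80.0, 120.0), (410.0, 350.0, 450.0)]
print(run_is_acceptable(0.045, 1.85, example_controls))
```

A run failing any single criterion is rejected as a whole, which mirrors the all-or-nothing wording of the acceptance criteria.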

PET imaging

Of the 50 samples, 49 had PET-Aβ imaging available acquired using either 11C-Pittsburgh Compound B (PiB; N = 4), 18F-florbetapir (FBP; N = 15), or 18F-flutemetamol (FLUTE; N = 30) tracers. The acquisition protocol for each radioligand has been detailed previously [32,33,34]. Briefly, a 20-min acquisition was started 50 min after either PiB or florbetapir injection and 90 min after flutemetamol injection. Aβ-amyloid PET scans were spatially normalised using CapAIBL [35] and the standard Centiloid (CL) method was applied for quantitation [5]. Participants were classified as PET-Aβ+ if their CL value was 20 or greater; otherwise, they were classed as PET-Aβ−.
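The classification rule can be written down in a few lines. This is an illustrative sketch only: the function name is hypothetical, and the single assumption carried over from the text is the threshold of 20 CL, with values at or above it counted as positive.

```python
CL_THRESHOLD = 20.0  # Centiloid cut-off stated in the text


def is_pet_amyloid_positive(centiloid):
    """Return True for PET-Abeta+ (CL of 20 or greater), False otherwise."""
    return centiloid >= CL_THRESHOLD


# Hypothetical Centiloid values for three participants.
for cl in (5.2, 20.0, 63.7):
    print(cl, is_pet_amyloid_positive(cl))
```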

Statistical analysis

Clinical and demographic parameters were assessed via chi-square test (gender, APOE ε4 allele status), generalised linear modelling (GLM; age, AIBL-PACC), and Kruskal-Wallis test (MMSE, CDR score) where appropriate. Biomarker comparisons between the two extraction methods were conducted using concordance correlations (CC), Passing-Bablok regression analyses, and Bland-Altman plot analyses. Paired t-tests were computed to assess possible differences in biomarker levels between extraction methods, with Box and Whisker plots demonstrating differences in biomarker levels between clinical classifications for both gravity drip and aspiration samples. Standardised differences between biomarker levels from aspiration and gravity drip extraction methods are presented via error bar plot (Fig. 4). Statistical analyses were performed using the R statistical environment (Version 3.6.1) [36].

Results

Cohort demographics

From a total of 50 participants who underwent both gravity drip and aspiration CSF extraction protocols, 49 had PET-Aβ imaging, of which 23 were PET-Aβ+. Of the CN group, 35% were PET-Aβ+, while 71% and 100% of MCI and AD participants respectively were PET-Aβ+. There were no differences in the proportion of males to females, the proportion of APOE ε4 allele carriage, or age between the three clinical classification groups. As expected, participants with either MCI or AD had significantly lower MMSE scores and significantly higher CDR scores (Table 2).

Table 2 Sub-cohort demographics

Assay characteristics and performance

Table 1 shows the analytical performance characteristics for each biomarker assay. The means for the intra-assay percent coefficient of variation (%CV, standard deviation [SD]) based on duplicate clinical samples were 3.6 (3.0)% for Aβ42, 2.1 (1.79)% for Aβ40, 3.0 (2.6)% for Tau, 3.7 (3.2)% for Neurogranin, 3.4 (2.9)% for α-synuclein, and 3.6 (3.0)% for BACE1.
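For readers who want to reproduce this kind of summary, the sketch below computes a per-sample %CV from duplicate measurements and then the mean and SD of those %CVs. The text does not state the exact formula used, so the conventional definition (sample SD of the two replicates divided by their mean) is an assumption, and the function names and example values are hypothetical.

```python
import statistics


def duplicate_cv_percent(rep1, rep2):
    """%CV of one sample measured in duplicate (sample SD / mean * 100)."""
    values = [rep1, rep2]
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def summarise_intra_assay_cv(duplicates):
    """Mean and SD of the per-sample %CVs over all duplicate pairs."""
    cvs = [duplicate_cv_percent(a, b) for a, b in duplicates]
    return statistics.mean(cvs), statistics.stdev(cvs)


# Hypothetical duplicate concentrations for three samples.
pairs = [(812.0, 790.0), (455.0, 470.0), (1020.0, 1005.0)]
mean_cv, sd_cv = summarise_intra_assay_cv(pairs)
print(f"intra-assay %CV: {mean_cv:.1f} ({sd_cv:.1f})")
```

Samples tested singly cannot contribute to this statistic and would need to be excluded before calling `summarise_intra_assay_cv`.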

Run-acceptance

All kit control concentrations were within the acceptance range for all assays and test runs. OD values for each calibrator concentration were within the standard acceptance criteria. The blank value and highest calibrator point were within specification for all analytes over all test runs. Furthermore, monitoring of the QC samples revealed an inter-assay variability between 5.0% (lowest %CV for Neurogranin) and 14.9% (highest %CV for total Tau; see also results in Table 1).

Biomarker concordance correlations between CSF extraction methods

Using the six biomarkers that were measured, along with the three ratios (Aβ42/40, Aβ42/Tau, and (Aβ42/40)/Tau) for all 50 participants, concordance correlations (CC) were all greater than 0.85 (Fig. 1). Strongest concordance correlations (CC > 0.95) for individual biomarkers between extraction methods were found for Tau (0.993 [95% confidence interval (CI) 0.988–0.996]), α-synuclein (0.995, [0.991–0.997]), BACE1 (0.987 [0.977–0.992]), Neurogranin (0.985 [0.976–0.991]), and Aβ42 (0.951 [0.915–0.972]). Of the three ratios, Aβ42/Tau had the highest CC (0.966 [0.942–0.981]).
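As an illustration of the statistic reported here, the following sketch computes Lin's concordance correlation coefficient for paired measurements. It is not the analysis code used in the study (which was run in R); the function name and example data are hypothetical, and the confidence interval calculation is omitted. The coefficient is 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²), using 1/n moments as in Lin's definition.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical paired concentrations (gravity drip vs aspiration).
gravity = [310.0, 455.0, 620.0, 780.0, 905.0]
aspiration = [305.0, 470.0, 611.0, 790.0, 899.0]
print(round(concordance_correlation(gravity, aspiration), 3))
```

Unlike the Pearson correlation, this coefficient is reduced by any systematic shift between the two methods, which is why it suits a method-comparison question.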

Fig. 1

a–i Concordance correlation plots per biomarker. Black points represent data from positron emission tomography (PET)-Aβ− participants; grey points represent data from PET-Aβ+ participants. Round points represent those participants who were cognitively normal (CN); square points represent those participants with mild cognitive impairment (MCI); triangular points represent those participants with Alzheimer’s disease (AD). The solid black line represents the concordance correlation (CC) between gravity and aspiration extraction methods. Aβ, amyloid beta; BACE1, beta-secretase 1

Further testing of the concordance via the Passing-Bablok method defined regression equations and plots (Supplementary Figure 1) for the relationship between extraction methods. For each biomarker, the plot shows the linear relationship, with narrow bootstrap confidence intervals indicating small variation between the extraction methods. For individual biomarkers, BACE1, Tau, α-synuclein, and Neurogranin had the narrowest confidence intervals, indicating a close fit between the two measurements, whilst for the ratio biomarkers the confidence intervals widened as biomarker values increased, indicating an increased variance in the fit amongst the larger ratio values.
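A simplified sketch of Passing-Bablok regression is given below to show how the slope and intercept are obtained. It is an illustration under stated assumptions, not the code used in the study: the function name and data are hypothetical, the bootstrap confidence intervals mentioned above are omitted, and the two methods are assumed to be positively related (otherwise the shifted median index can fall out of range). The slope is the shifted median of all pairwise slopes, and the intercept is the median of y − slope·x.

```python
import statistics


def passing_bablok(x, y):
    """Return (slope, intercept) of a Passing-Bablok regression of y on x."""
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points give no slope information
            if dx == 0:
                slopes.append(float("inf") if dy > 0 else float("-inf"))
                continue
            s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are excluded by the method
            slopes.append(s)
    if not slopes:
        raise ValueError("no usable pairwise slopes")
    slopes.sort()
    count = len(slopes)
    shift = sum(1 for s in slopes if s < -1)  # offset K of the shifted median
    if count % 2 == 1:
        slope = slopes[(count - 1) // 2 + shift]
    else:
        slope = 0.5 * (slopes[count // 2 - 1 + shift] + slopes[count // 2 + shift])
    intercept = statistics.median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical paired concentrations (gravity drip vs aspiration).
gravity = [310.0, 455.0, 620.0, 780.0, 905.0]
aspiration = [305.0, 470.0, 611.0, 790.0, 899.0]
print(passing_bablok(gravity, aspiration))
```

A slope close to 1 and an intercept close to 0 would indicate that neither a proportional nor a constant difference separates the two extraction methods.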

Assessment of concordance via Bland-Altman plots

Each of the nine biomarkers (six individual biomarkers and three biomarker ratios) showed a reasonable spread of data points, with each plot having only one, two, or three points that fell outside ±1.96 SD around zero (Fig. 2). Disregarding the outliers, symmetry around the zero-difference line for each individual biomarker was maintained. Lower mean levels for the ratio biomarkers (Aβ42/40)/Tau and Aβ42/Tau demonstrated smaller differences between extraction methods, while larger mean levels showed a larger spread of the data. Of the outliers, six participant samples were responsible for all values outside the ±1.96 SD lines. The participant with a large negative difference for BACE1 (−457.7) also had a large negative difference for Tau (−78.4). Another participant with unusually high BACE1 values (> 4500) also had a large difference for Tau (−58.8). The sample with the largest negative difference for Aβ42/40 (−0.04) also had the largest negative difference for (Aβ42/40)/Tau (−1.742).
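The quantities behind a Bland-Altman plot can be computed as in the sketch below. It is illustrative only: the function name, the returned keys, and the example data are hypothetical, and the direction of the difference (aspiration minus gravity drip) is an assumption. Conventional limits of agreement are centred on the mean difference; the `centre_on_zero` option instead centres them on zero, as described for the plots in this study.

```python
import statistics


def bland_altman(aspiration, gravity, centre_on_zero=False):
    """Differences, pairwise means, and 1.96 SD limits for paired data."""
    diffs = [a - g for a, g in zip(aspiration, gravity)]
    means = [(a + g) / 2.0 for a, g in zip(aspiration, gravity)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    centre = 0.0 if centre_on_zero else bias
    lower = centre - 1.96 * sd
    upper = centre + 1.96 * sd
    # Indices of participants whose difference falls outside the limits.
    outliers = [i for i, d in enumerate(diffs) if d < lower or d > upper]
    return {"diffs": diffs, "means": means, "bias": bias, "sd": sd,
            "lower": lower, "upper": upper, "outliers": outliers}


# Hypothetical paired concentrations.
aspiration = [305.0, 470.0, 611.0, 790.0, 899.0]
gravity = [310.0, 455.0, 620.0, 780.0, 905.0]
result = bland_altman(aspiration, gravity, centre_on_zero=True)
print(result["bias"], result["lower"], result["upper"], result["outliers"])
```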

Fig. 2

a–i Bland-Altman plots per biomarker. Black horizontal line represents the point at which the biomarker mean difference between aspiration and gravity drip extraction methods is equal to zero. Lower grey horizontal dashed line represents the point at which the value on the y-axis is 1.96 standard deviations below zero. Upper grey horizontal dashed line represents the point at which the value on the y-axis is 1.96 standard deviations above zero

Paired sample comparisons

Comparisons in both the complete sample and the sample stratified by clinical classification showed no significant difference in mean biomarker levels between gravity drip and aspiration extraction methods (p > 0.05, Table 3). For the individual biomarkers, BACE1 and total Tau had p values closer to one (p > 0.9), indicating smaller differences between extraction methods, while all ratio biomarkers achieved similar performance (p > 0.9). As with stratification by clinical classification, no statistical differences in biomarker levels were found when stratifying data by PET-Aβ status (Supplementary Figure 2).
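The paired comparison reported here rests on the paired t statistic, sketched below with the Python standard library. The function name and data are hypothetical. The sketch returns the t statistic and degrees of freedom only; converting these to a p value additionally requires the Student's t distribution, which the standard library does not provide.

```python
import math
import statistics


def paired_t_statistic(aspiration, gravity):
    """Return (t, degrees_of_freedom) for a paired t-test."""
    diffs = [a - g for a, g in zip(aspiration, gravity)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical paired concentrations.
aspiration = [305.0, 470.0, 611.0, 790.0, 899.0]
gravity = [310.0, 455.0, 620.0, 780.0, 905.0]
t_value, dof = paired_t_statistic(aspiration, gravity)
print(round(t_value, 3), dof)
```

A t statistic near zero corresponds to a p value near one, which is the pattern described for BACE1, total Tau, and the ratio biomarkers.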

Table 3 Mean values of AD CSF biomarkers, gravity drip versus aspiration extraction method

Finally, we evaluated whether the observed differences in biomarker concentrations between extraction methods (gravity drip, aspiration) are affected by the selected biomarker or biomarker combination. Results are presented in Fig. 4. While the overall change in concentration between methods is limited, it is obvious from the figure that the variation between subjects is much lower for total Tau, α-synuclein, and Neurogranin as compared to either Aβ proteins used as a single biomarker or when integrated into a ratio.

Outlier samples

Supplementary Table 1 shows the demographic and clinical details for the participants who had CSF values considered as outliers from either gravity drip or aspiration extraction methods. Visualisation refers to how the outlier was detected. For example, the term “Box” refers to the outlier being detected from the Box and Whisker plots (Fig. 3), while the term “BA” refers to the outlier being detected from the Bland-Altman plot (Fig. 2). Here the outlier value represents a large difference in the biomarker value between extraction methods (i.e., the value lies outside the grey dashed lines). Outlier values depicted here are also seen in Fig. 3, where outlier values are consistent across extraction methods.

Fig. 3

a–i Box and whisker plots of median biomarker levels between extraction method and clinical classification. Black boxes and points represent data from samples extracted using the aspiration method. Grey boxes and points represent data from samples extracted using the gravity drip method. Upper lines on each box represent the 3rd quartile, middle lines represent the median value, and the lower lines represent the 1st quartile

Discussion

In this study, we aimed to assess the concordance in biomarker levels between gravity drip and aspiration CSF extraction methods. After investigating each of six individual and three ratio biomarkers using multiple concordance methods, it is clear that biomarker reliability is independent of CSF extraction method. For the majority of biomarkers, concordance correlation coefficients were greater than 0.95, with reduction in precision due to a few outliers and the small sample size. Overall repeatability was consistent within both the complete sample and across individual clinical groups.

Assessment of biomarker outliers from both aspiration and gravity drip extracted samples showed that outlier values were independent of extraction method; i.e., the reason for the aberrant biomarker level was unrelated to extraction method. Furthermore, assessment of biomarker levels between extraction methods, stratified by either clinical classification or by PET-Aβ status, did not increase the variance in biomarker levels, strengthening the claim of stability across a range of different clinical or phenotypic populations.

The minimal changes across extraction methods seen in most biomarkers, as compared to the variability between extraction methods for Aβ40 and Aβ42 (Fig. 4), reiterate that Aβ is a more challenging protein for analytical assays. The number of confounding factors at the pre-analytical and analytical level that affect Aβ levels in biological fluids is higher than for other proteins. It is not clear precisely which confounding factor(s) might contribute to the variance observed in the current study. No gradient effect was observed in CSF by Le Bastard et al. (2015) for Aβ42, total tau, and pTau181P as measured by ELISA (n = 20) [37]. Some gradient effect was seen with higher collection volumes for CSF Aβ42 when analysed on an automated chemiluminescent platform [38]. The same paper provided evidence that CSF-Aβ42 levels were stable if fresh samples were processed within 2 h, followed by a freeze-thaw cycle. All AIBL samples were processed within 1 h post collection. Samples were put on wet ice immediately after collection before further processing. Further, Darrow et al. (2020) also noted an effect from blood contamination, especially in the thawed samples. Nevertheless, in this study blood contamination did not account for the higher variances observed for Aβ40 or Aβ42 levels, as verified by quantification of Hb concentrations in each CSF sample. New experimental designs and testing procedures are required to solve such dilemmas.

Fig. 4

Error bar plot for the standardised difference in biomarker levels between extraction methods. Biomarker values were standardised by subtracting the mean from each value and then dividing by the standard deviation. Standardised values of biomarkers from the gravity drip extraction method were then subtracted from the standardised values of biomarkers from the aspiration method for plotting. Error bars represent the difference values from each participant, with the mid-point being the point between the minimum and maximum difference values
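The standardisation described in this caption can be sketched as follows. This is a hedged illustration: the caption does not say whether the mean and standard deviation were computed per extraction method or on pooled data, so standardising each method separately is an assumption here, and the function names and example data are hypothetical.

```python
import statistics


def standardise(values):
    """Subtract the mean from each value, then divide by the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def standardised_differences(aspiration, gravity):
    """Aspiration minus gravity drip on the standardised scale."""
    z_aspiration = standardise(aspiration)  # assumption: standardised per method
    z_gravity = standardise(gravity)
    diffs = [a - g for a, g in zip(z_aspiration, z_gravity)]
    low, high = min(diffs), max(diffs)
    return {"differences": diffs, "min": low, "max": high,
            "midpoint": (low + high) / 2.0}


# Hypothetical paired concentrations.
aspiration = [305.0, 470.0, 611.0, 790.0, 899.0]
gravity = [310.0, 455.0, 620.0, 780.0, 905.0]
summary = standardised_differences(aspiration, gravity)
print(summary["min"], summary["midpoint"], summary["max"])
```

Working on the standardised scale puts biomarkers with very different concentration ranges on a common axis, so the width of each error bar can be compared across biomarkers.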

The method of CSF collection is an important step in the pre-analytical handling of CSF samples. While some investigators routinely use gravity drip, others use aspiration. Gravity drip has the drawback of unpredictable variation of collection times and may potentially take considerably longer than aspiration, thereby reducing feasibility in busy clinics. In dementia evaluation settings, the longer duration of CSF acquisition can be a particular problem for a patient with memory impairment or dementia since repeated reassurance and explanation may be required. Our results demonstrate that syringe aspiration does not have a significant effect on analyte concentrations and, therefore, should be acceptable and allow predictable and more rapid CSF collection.

This report extends the data of Rembach et al. [17] by utilising optimised assay formats and including other biomarker proteins, which will become important in the future stratification of subjects within the several disease areas of neurodegeneration. Our report shows the strong concordance of CSF biomarkers when CSF is collected either by standard gravity drip or syringe aspiration (Table 3). In particular, more rapid collection by aspiration suggests that wider adoption of aspiration is feasible and may become the preferred means of CSF collection for the detection of AD CSF profiles. Shorter duration (from ~10 min for gravity drip to ~1 min for aspiration) should help increase acceptance both by patients undergoing the procedure and by staff conducting the LP. Furthermore, it is our belief that the outcomes from this study will be relevant to data generated on other technology platforms.

Limitations

The current study is not without limitations. Firstly, regarding the extractions, all aspirations were performed subsequent to gravity drip collection as per AIBL protocol. Although reversal of collection order is thought unlikely to have any effect on biomarker levels, this notion has not been formally tested. A second limitation of this study was the relatively small number of subjects included, with a varying number of samples per clinical classification. The small sample size for participants with MCI/AD may overstate the strength of the biomarker concordance between extraction methods, and future research with a larger sample size is needed to test this specifically. The results of this study should therefore be interpreted in the context of these limitations.

Conclusions

In summary, the current study provides strong evidence that key CSF protein biomarker measurements are not influenced by extraction through either the gravity drip or aspiration method, and that CSF results obtained with either method are interchangeable. Much time can potentially be saved and subject burden reduced using the syringe extraction approach compared to gravity drip. Results of this study should be incorporated into the new consensus guidelines for CSF collection, storage, and analysis of biomarkers.