INTRODUCTION

Alzheimer’s disease (AD) affects millions worldwide, with rising incidence as populations age. Despite extensive efforts across industry and academia, an effective strategy for disease modification remains elusive. Among the main concerns have been how to predict who will succumb to AD and how to monitor clinical improvement during therapeutic intervention. To this end, AD biomarkers have received much attention, including CSF amyloid beta 42 (Aβ42), which is lower in AD patients than in normal controls (1–6). Low CSF Aβ42 levels also predict conversion from mild cognitive impairment (MCI) to AD (7–12).

Most Aβ42 assays rely on immunochemical detection, whether conventional ELISAs, electrochemiluminescence-based ELISAs or bead-based methods. One of the most widely used assays is the INNOTEST® ELISA from Innogenetics NV, which has been cited in numerous publications from many study centers (3–5,13–16).

Published studies using this kit invariably conduct the assay on neat CSF according to the manufacturer’s instructions. To our knowledge, no group has undertaken an independent and comprehensive method development and validation study for this assay.

In the present study, we first undertook an extensive method development study of the INNOTEST® Aβ42 ELISA, creating a number of important modifications to the assay. These were: (1) establishment of the minimum required dilution (MRD) for CSF samples, which upon empirical testing in both AD and normal human CSF was a dilution of one in six, (2) changes in the calibration curve design to better bracket Aβ42 concentration ranges observed in clinical samples, and (3) introduction of quality control (QC) samples composed of human CSF from pre-mortem AD donors.

We also systematically examined Aβ42 recovery from storage and assay vials. Although polypropylene is considered to be the least problematic polymer for adherence of hydrophobic peptides (17–19), it is known that Aβ42 can variably adsorb to polypropylene vials made by different manufacturers (20). Since such adsorption would impair assay precision and accuracy, we tested different polypropylene vial types for Aβ42 recovery after freeze/thaw, and we established a mitigation strategy in the event of Aβ42 adsorption.

Once these modifications were established, we then executed a full method validation in which the assay was characterized in terms of precision, accuracy, selectivity, specificity, parallelism, linearity of dilution, and stability. The guidelines for biomarker assay validation set out by Lee et al. (21) were used as guiding principles. As presented herein, the assay performs very well with excellent precision, accuracy and reliability.

MATERIALS AND METHODS

Materials

INNOTEST® β-AMYLOID(1–42) assay kit was obtained from Innogenetics (Ghent, Belgium) and utilizes a 21F12/3D6 sandwich pair and an Aβ1–42 peptide solution as calibration standard. Single-subject lots of normal and AD donor CSF were obtained from Precision Med, Inc. (San Diego, CA). Polypropylene assay tubes were obtained from Eppendorf (Hamburg, Germany; LoBind 1.5 mL #0030 108 116) and Thermo Fisher Scientific (Waltham, MA; 1 mL Nunc cryotubes #375353).

Methods

Method Development

Immunoassay Procedure

For initial method development, the assay was performed per manufacturer directions. See ELECTRONIC SUPPLEMENTARY MATERIALS for details. Assay modifications were made throughout the method development process; final assay conditions are described in “Method Validation” below.

MRD Determination

Donor CSF samples were diluted in INNOTEST® Sample Diluent in Eppendorf LoBind tubes prior to loading of the set-up plate. The assay procedure was followed as above from that point forward. The dilution factor beyond which the back-calculated analyte concentration values remained consistent was deemed the MRD. All subsequent development assays were performed with an upfront dilution by the MRD value determined in this series of experiments (MRD = 6).
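The consistency criterion described above can be illustrated with a minimal sketch. The function name `find_mrd`, the 5 % CV threshold, and the concentration values are hypothetical and serve only as an illustration; they are not the procedure or data used in this study, where consistency was judged empirically.

```python
from statistics import mean, stdev

def find_mrd(results, max_cv_pct=5.0):
    """Return the smallest dilution factor from which all back-calculated
    concentrations (at that and every higher dilution) agree within max_cv_pct.

    results maps dilution factor -> back-calculated concentration (pg/mL).
    The CV threshold is an assumption for illustration, not a study criterion.
    """
    factors = sorted(results)
    # A CV needs at least two values, so the highest factor alone is never tested.
    for i, factor in enumerate(factors[:-1]):
        tail = [results[f] for f in factors[i:]]
        cv_pct = 100.0 * stdev(tail) / mean(tail)
        if cv_pct <= max_cv_pct:
            return factor
    return None

# Hypothetical single-lot data showing signal suppression in neat (factor 1) CSF.
example = {1: 1000.0, 2: 1500.0, 3: 1700.0, 6: 1950.0, 8: 2000.0, 16: 1980.0}
mrd = find_mrd(example)
```

With these invented values the plateau begins at a 6-fold dilution; a real determination would repeat the check across several donor lots before fixing the MRD.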

Aβ42 Adsorption Testing

Three individual lots of donor CSF were thawed and tested to provide a “baseline” Aβ42 value. 250 μL of each lot was transferred to a separate Nunc vial, refrozen overnight at −70°C, thawed and re-tested. In addition, pooled donor CSF designated low (LQC) and mid-QC (MQC) and donor CSF spiked with Aβ42 calibrator (designated high QC (HQC)) were prepared in LoBind tubes and aliquotted for −70°C storage. For each QC level tested, a single aliquot of CSF was thawed and tested to provide a “baseline” Aβ42 value. Prior to re-freezing, 50 μL of each lot was transferred to a separate LoBind tube, refrozen overnight at −70°C, thawed and re-tested. Aβ42 levels before and after freeze–thaw were compared to determine loss of analyte in each tube type. See Supplementary Fig. 1a, b for an illustration.

Adsorption Mitigation

A previously unthawed vial of each of three AD donor CSF lots was thawed and aliquotted as follows: one 50-μL volume was transferred to a LoBind tube and two 100-μL volumes were transferred to two Nunc vials. All aliquots were frozen overnight at −70°C. Upon thawing, the LoBind tube and one of the Nunc vials were loaded onto the assay plate with no treatment aside from dilution to MRD. The other Nunc vial was spiked with 10 μL 2.2 % Tween-20 (in Sample Diluent), vortexed, incubated at room temperature for 30 min, and vortexed again before assaying. The resulting Aβ42 concentration values were compared to assess the ability of 0.2 % Tween-20 to mitigate Aβ42 adsorption. See Supplementary Fig. 2 for an illustration.

Method Validation

Preparation of Calibration Curve

The calibration standard (CS) material (Aβ1–42 Peptide; 91,163 pg/mL) from the kit was used to prepare ten standards at the nominal concentrations of 40.0, 50.0, 62.5, 125, 175, 250, 375, 500, 750 and 1,000 pg/mL in sample diluent with 0.033 % Tween-20. Unspiked sample diluent with 0.033 % Tween-20 was used as a buffer blank. The points above and below the target upper limit of quantitation (ULOQ) and lower limit of quantitation (LLOQ) were used as anchor points in the standard curve. Each standard was assayed as 1 duplicate set (n = 2).

Standard Curve Evaluation

Six standard curves were evaluated in six independent assays. Standard curves were fitted using 4-Parameter Logistic regression and (1/Y) weighting using Watson LIMS™ v. 7.03; the fit was selected based on analysis of over 20 method development runs. Acceptance was based on the following precision and accuracy criteria for each non-anchor CS level: back-calculated concentrations ≤25.0 % CV (≤30.0 % at LLOQ and ULOQ) and within ±25.0 % Bias (±30.0 % at LLOQ and ULOQ). Individual CS levels could be deactivated according to the above criteria, as long as at least 6 non-anchor points remained and no consecutive CS levels were deactivated.
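As an illustration of how concentrations are back-calculated from a four-parameter logistic curve, and of the deactivation rule for calibrators, a minimal sketch follows. The curve fitting itself (performed in Watson LIMS™ with 1/Y weighting) is not reproduced here; the parameter values and function names are hypothetical.

```python
def four_pl(x, a, b, c, d):
    """4-PL response at concentration x.

    a: response at zero concentration, d: response at infinite concentration,
    c: inflection point (same units as x), b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)

def back_calculate(y, a, b, c, d):
    """Invert the 4-PL curve: concentration that yields response y."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)

def curve_accepted(active, min_points=6):
    """active: pass/fail flags for the non-anchor calibrators, ordered by
    concentration. The curve is kept if at least min_points levels remain
    and no two neighbouring levels were deactivated."""
    consecutive = any(not p and not q for p, q in zip(active, active[1:]))
    return sum(active) >= min_points and not consecutive

# Hypothetical fitted parameters (not taken from the study).
params = dict(a=0.05, b=1.2, c=400.0, d=3.0)
signal = four_pl(250.0, **params)
recovered = back_calculate(signal, **params)  # round-trips to about 250 pg/mL
```

The back-calculated value of each calibrator is what the %CV and %bias acceptance limits above are applied to.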

Preparation of Validation Samples

Five levels of validation samples (VS) were prepared and used for accuracy and precision studies. VS2 and VS3 were created by pooling pre-screened AD donor CSF samples to obtain approximate endogenous levels of 870 and 2,200 pg/mL, respectively. VS2 and VS3 pools were aliquotted and frozen for use for the remainder of the validation as LQC and MQC (see below). Additionally, VS2 was aliquotted into vials to create VS1 (LLOQ; 375 pg/mL) by dilution in Sample Diluent and VS4 (3,600 pg/mL) and VS5 (ULOQ, 4,500 pg/mL) by spiking with Aβ42 calibrator on the day of assay. Additionally, two individual incurred samples (IS) were analyzed during the core runs and mean endogenous analyte levels were qualified for use in all stability assays. Each VS and IS sample was treated with 2.2 % Tween-20 at 10 % of total volume for at least 30 min and then diluted 6-fold in Sample Diluent for a final Tween-20 concentration of 0.033 %. Unless otherwise noted, all CSF samples were treated in this manner in the remaining validation experiments. A dilution factor of 6.6 was applied in Watson LIMS™ to accommodate MRD and Tween-20 addition to each sample.
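The Tween-20 concentrations and the overall dilution factor quoted above follow from simple volume arithmetic, sketched below. The sketch assumes the spike is added at one part 2.2 % Tween-20 to ten parts CSF (for example, 10 μL into 100 μL, as in the adsorption mitigation experiment); the function name is hypothetical.

```python
def spike_tween(sample_ul, spike_ul, stock_pct):
    """Return (Tween-20 % after spiking, dilution factor caused by the spike)."""
    total_ul = sample_ul + spike_ul
    return stock_pct * spike_ul / total_ul, total_ul / sample_ul

MRD = 6  # minimum required dilution applied after the Tween-20 spike

# Assumed volumes: 10 uL of 2.2 % Tween-20 added to 100 uL of CSF.
tween_pct, spike_factor = spike_tween(sample_ul=100.0, spike_ul=10.0, stock_pct=2.2)

final_tween_pct = tween_pct / MRD    # about 0.033 % in the well
total_dilution = spike_factor * MRD  # about 6.6, the factor applied to samples
```

Under this assumption the spike alone dilutes the CSF 1.1-fold and leaves 0.2 % Tween-20, which the 6-fold MRD then reduces to roughly 0.033 %.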

Inter- and Intra-Assay Accuracy and Precision

Six accuracy and precision runs were performed over 3 days by two analysts, with VS1–VS5 samples in five sets of duplicate wells (n = 10 wells).

The target acceptance criteria for inter- and intra-assay accuracy and precision were ≤25.0 % CV or absolute bias for VS2, VS3, and VS4, and ≤30.0 % CV or absolute bias for VS1 (LLOQ) and VS5 (ULOQ). Mean analytical recovery calculations for all VS levels were based on the established Aβ42 concentrations defined above.

For subsequent validation runs, VS2, VS3, and VS4 were assayed in duplicate on all plates as LQC, MQC, and HQC, respectively. After examining the six accuracy and precision runs, we elected to amend the validation plan to set the QC and CS acceptance criteria to ≤20 % for CV and absolute bias for all samples except for the calibrators at LLOQ and ULOQ, for which the acceptance was set to ≤25 %. For assay data to be accepted, at least six non-anchor CS levels and at least four of the six QC samples (with at least one acceptable QC at each of the LQC, MQC, and HQC levels) on each plate had to meet these adjusted criteria. Mean analytical recoveries for LQC and MQC were based on established endogenous Aβ42 concentrations determined for VS2 and VS3 during the core runs: 797 pg/mL for VS2 (LQC) and 2,161 pg/mL for VS3 (MQC). HQC continued to be prepared on the day of assay by spiking 2,730 pg/mL of Aβ42 calibration standard into VS2 (LQC); the established Aβ42 concentration used to calculate mean analytical recovery for HQC was therefore 3,527 pg/mL (2,730 + 797).
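The plate-level QC rule described above (at least four of six QCs acceptable, with at least one at each level) can be expressed as a short check. This is a minimal sketch; the function names and the example plate are hypothetical, while the nominal concentrations are those stated in the text.

```python
from collections import Counter

LQC_NOMINAL = 797.0                          # pg/mL, established for VS2
MQC_NOMINAL = 2161.0                         # pg/mL, established for VS3
HQC_NOMINAL = LQC_NOMINAL + 2730.0           # pg/mL, LQC plus calibrator spike

def qc_ok(cv_pct, bias_pct, limit_pct=20.0):
    """A QC sample is acceptable if both %CV and absolute %bias are within limit."""
    return cv_pct <= limit_pct and abs(bias_pct) <= limit_pct

def plate_accepted(qcs):
    """qcs: list of (level, cv_pct, bias_pct) for the six QC samples on a plate."""
    passed = Counter(level for level, cv, bias in qcs if qc_ok(cv, bias))
    enough = sum(passed.values()) >= 4
    each_level = all(passed[level] >= 1 for level in ("LQC", "MQC", "HQC"))
    return enough and each_level

# Hypothetical plate: one LQC and one HQC fail, four QCs pass, every level covered.
plate = [("LQC", 3.0, -5.0), ("LQC", 4.0, -25.0),
         ("MQC", 2.0, 1.0), ("MQC", 5.0, 8.0),
         ("HQC", 6.0, -12.0), ("HQC", 30.0, -3.0)]
accepted = plate_accepted(plate)
```

A plate on which both replicates of one level fail is rejected even if the other four QCs pass, which is the purpose of the per-level condition.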

Selectivity

Ten individual AD donor CSF samples were assayed in duplicate both untreated and spiked with Aβ42 calibrator solution at an approximate MQC concentration (2,200 pg/mL). Endogenous values determined for the untreated sample were subtracted from that of the spiked sample and recovery was calculated as a percentage of the nominal spike level (2,200 pg/mL). The target criterion for recovery was ±20 % bias from nominal for 80 % of samples.
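The recovery calculation described above can be sketched as follows. The function names and example concentrations are hypothetical; only the 2,200 pg/mL nominal spike, the ±20 % limit, and the 80 % pass fraction come from the text.

```python
def spike_recovery_pct(spiked, unspiked, nominal_spike=2200.0):
    """Recovered spike (spiked minus endogenous) as a percentage of nominal."""
    return 100.0 * (spiked - unspiked) / nominal_spike

def selectivity_passes(pairs, nominal_spike=2200.0, limit_pct=20.0, min_fraction=0.8):
    """pairs: (unspiked, spiked) concentrations in pg/mL, one pair per donor."""
    within = [abs(spike_recovery_pct(s, u, nominal_spike) - 100.0) <= limit_pct
              for u, s in pairs]
    return sum(within) / len(within) >= min_fraction

# Hypothetical donor: 1,000 pg/mL endogenous, 3,090 pg/mL after spiking.
recovery = spike_recovery_pct(spiked=3090.0, unspiked=1000.0)  # about 95 %
```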

Dilutional Linearity/Hook Effect

Dilutional linearity and hook effect were evaluated by spiking with Aβ42 calibrator solution to a combined level of 9,000 pg/mL into the LQC CSF solution and then diluting 2-, 5-, 10-, 25-, 50-, and 100-fold in sample diluent with 0.033 % Tween-20. Precision and accuracy of each sample with an established concentration falling between LLOQ and ULOQ were subjected to the acceptance criteria defined after the core runs. Dilutional samples with established concentrations above the ULOQ were expected to read above ULOQ, indicating the absence of a hook effect within the limits of the concentrations tested.

Parallelism

Parallelism was assessed using three individual AD donor CSF samples with varying amounts of endogenous Aβ42. Samples were tested neat and diluted to various levels in Sample Diluent; for this experiment, the assay MRD was not applied to the samples but Tween-20 was added to the neat solutions prior to diluting them. Parallelism was evaluated by determining the mean back-calculated concentration of all samples diluted beyond the point at which matrix interference is absent as described in Ref. (21); precision at ≤20 % CV of the mean described above was considered an indication of parallelism.

Specificity and Interference from Hemolysis

The effect of Aβ40 on the quantification of Aβ42 was evaluated using LQC and HQC CSF solutions analyzed in the presence of either Aβ40 at 20 %, 50 % and 100 % of the assay ULOQ or hemolyzed blood spiked at 1 in 10, 1 in 100 and 1 in 1,000 (10, 1, 0.1 % blood contamination, respectively). Acceptance was based on the accuracy and precision criteria established in the core runs.

Stability

High and Low CSF QCs in LoBind tubes and two individual CSF samples (IS1 and IS2) in Nunc tubes (three duplicate sets at each level) were used for stability evaluation. Two thirds of the stability QCs at each level must have met the acceptance criteria for %bias and %CV as defined after the core runs. If a stability time-point failed to yield acceptable results, then stability was deemed confirmed up to the previous time-point tested.
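The two-thirds rule and the fallback to the previous time-point can be sketched as below. The function names, the labels, and the example results (including the 48 h time-point) are hypothetical illustrations rather than study data.

```python
def timepoint_passes(qc_flags):
    """At least two thirds of the stability QCs at a level must be acceptable."""
    return 3 * sum(qc_flags) >= 2 * len(qc_flags)

def confirmed_stability(timepoints):
    """timepoints: list of (label, qc_flags) in chronological order.

    Returns the label of the last time-point reached before the first failure,
    or None if the first time-point already fails.
    """
    confirmed = None
    for label, flags in timepoints:
        if not timepoint_passes(flags):
            break
        confirmed = label
    return confirmed

# Hypothetical results for three duplicate sets per time-point.
series = [("4-6 h", [True, True, True]),
          ("16-24 h", [True, True, False]),
          ("48 h", [True, False, False])]
stable_to = confirmed_stability(series)
```

The integer comparison in `timepoint_passes` avoids floating-point rounding when checking the two-thirds threshold.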

The above samples were tested for bench-top (room temperature), refrigerated (on ice), and long-term frozen (−70°C) stability, as well as freeze/thaw stability (IS only). Bench-top and refrigerated stability was assessed at 4–6 and 16–24 h, and both prior to and after Tween-20 addition. Freeze/thaw stability of IS CSF after Tween-20 addition was assessed after one, two, and three cycles. Long-term stability was assessed at 1 and 3 months for QC and IS samples prior to Tween-20 addition and in IS samples after Tween addition. Each sample was subjected to MRD and Tween-20 treatment (except when Tween had been added prior to stability assessment). HQC samples tested for long-term frozen stability were freshly spiked from LQC with Aβ42 calibrator solution immediately prior to analysis, and therefore the samples held for long-term stability all contained the LQC Aβ42 concentration.

Compliance Statement

The validation experiments described herein were conducted in a GLP-compliant environment with the Tandem Labs Quality Assurance Unit overseeing the laboratory’s compliance program and assuring the quality and integrity of the validation test data generated. Unless otherwise stated, all procedures were performed in accordance with Tandem’s internal standard operating plans and the TCAS10-082 Validation Plan.

RESULTS

Method Development

Validation Feasibility Testing

INNOTEST® ELISA performance was evaluated in a series of experiments that comprehensively simulated an advanced method validation (i.e., addressing all components of assay performance required by regulatory agencies) (21). The INNOTEST® kit is based on a set of Aβ42 synthetic peptide calibrators in assay buffer spanning 125 to 2,000 pg/mL, but contains no QC samples. We modified the calibration curve to include additional points, better defining the optimal fit to a nonlinear regression algorithm. The range of the curve was also adjusted as method development progressed to best situate QC samples in relation to the anticipated reference range. To provide suitable material for VS and QC samples, for establishment of the anticipated reference range, and for selectivity testing, incurred CSF samples were obtained both from ongoing clinical studies and from a commercial source. The results of most of these experiments are recapitulated in “Method Validation”, but two important assay modifications are described in detail below.

Establishment of the MRD

CSF samples from two AD, one MCI, and two normal individuals were tested at various dilution levels to determine the impact of matrix on the measurement of endogenous Aβ42. Figure 1 illustrates that neat CSF generated Aβ42 concentration values that were on average 40–50 % lower than those reported for the same CSF samples after dilution. This apparent matrix interference was consistent across various analyte levels and was similar in normal, MCI and AD samples. Although dilution of CSF to 50 or 33 % (data not shown) relieved much of the apparent interference, dilution to 16.7 % CSF or lower was required to eliminate the effect. The back-calculated Aβ42 concentrations at 6-, 8-, and 16-fold dilutions of CSF were very similar to each other (CV of the mean of the three values, <10 %), indicating that parallelism exists between the calibration curve and the endogenous analyte upon appropriate dilution of the matrix. For these reasons, the MRD of the assay was determined to be six; all CSF samples and controls were diluted 6-fold for all subsequent experiments.

Fig. 1

MRD determination in 5 individual donor CSF lots. CSF from normal elderly individuals (blue lines) and elderly individuals diagnosed with mild cognitive impairment (MCI, green line) or Alzheimer’s disease (AD, red lines) was diluted in sample diluent to various levels prior to loading into the assay plate. The back-calculated Aβ42 concentration for each lot at each dilution level was plotted against the applied dilution factor. Error bars describe the range around the mean of two replicates

Detergent Mitigation of Aβ42 Adsorption to Polypropylene Vials

As discussed earlier, the recovery of Aβ42 from polymer vials during handling and quantification has been a recurring concern (18,20). As we had selected Nunc brand polypropylene cryovials (#375353) for the storage of clinical CSF samples and Eppendorf Protein LoBind tubes (#0030 108.116) for assay execution, we first evaluated both tube types for Aβ42 recovery after freeze–thaw as described in “Methods” and Supplementary Fig. 1a, b. These experiments revealed that Aβ42 recovery from LoBind tubes after a single freeze–thaw cycle is 91–95 % of the expected value, but that a significant and variable proportion (30–70 %) of Aβ42 is lost after a single cycle in Nunc cryovials (Supplementary Fig. 3). The >90 % recovery in LoBind tubes indicated that the low recoveries observed in Nunc vials were due to adsorption of Aβ42 to tube walls (as opposed to degradation or aggregation). To investigate adsorption mitigation, Tween-20 addition was tested as described in “Methods” and Supplementary Fig. 2. Figure 2 demonstrates that Tween-20 effectively recovers the proportion of CSF Aβ42 that would otherwise remain inaccessible in Nunc tubes. This is consistent with a report by McCush et al. (22) and provides an ex post facto adsorption mitigation strategy for CSF that was not treated with detergent prior to storage. The presence of Tween-20 at the final concentration of 0.033 % was innocuous to the performance of the calibration curve and to Aβ42 quantitation (Supplementary Fig. 4). The assay was therefore modified to include a 10 % (v/v) spike of 2.2 % Tween-20 in assay buffer to all CSF samples in method validation and any subsequent clinical analysis.

Fig. 2

Mitigation of Aβ42 adsorption to polypropylene via Tween-20 addition. AD donor CSF lots were tested after an overnight incubation at −70°C in Eppendorf LoBind tubes (black bars), after an overnight incubation at −70°C in 1.0 mL Nunc cryovials but prior to Tween-20 addition (white bars), and after an overnight incubation at −70°C in 1.0 mL Nunc cryovials and following the addition of Tween-20 to achieve a final Tween concentration of 0.2 % (see “Materials and Methods”) (gray bars). For each CSF lot, the Aβ42 concentration determined after LoBind incubation was set to 100 % in order to calculate the percent recovery of the other treatment methods. Error bars describe the range around the mean of two replicates

Summary of Assay Modifications After Method Development

Figure 3 illustrates the specifications of the modified INNOTEST® Aβ42 ELISA at the conclusion of method development. An approximate clinical reference range of 1,500 to 3,500 pg/mL was established by analyzing multiple CSF samples from early AD patients both at an MRD of six and treated with Tween-20 (data not shown). The calibration range of the original assay extended to 2,000 pg/mL; since the incurred CSF samples were subjected to MRD and the buffer-based synthetic calibrators were not, the range of quantification was effectively magnified 6-fold (to 12,000 pg/mL) with the original calibration curve. To better span the expected concentration range of samples, the highest calibrator was reduced from 2,000 to 1,000 pg/mL per well (6,000 pg/mL after adjustment for MRD). A 10-point calibration curve was designed to place 7 points in the quantification range and include three anchor points. The LLOQ of the assay was established at 375 pg/mL (62.5 pg/mL calibrator in well) and the ULOQ was set at 4,500 pg/mL (750 pg/mL calibrator in well). Five VS samples were selected for the core runs of method validation as defined in “Methods” and Fig. 3. The two non-spiked VS samples (VS2 and VS3) were created by pooling individual incurred CSF samples of established Aβ42 concentration. These pools were qualified in two separate development runs to assign an endogenous value for calculation of the spike level for VS4 (high validation standard) and VS5 (ULOQ) as well as the dilution factor for creation of VS1 (LLOQ).
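The relationship between in-well calibrator concentrations and the CSF-equivalent quantitative range can be checked with a short calculation. The sketch uses the calibrator levels, anchor points, and MRD of six stated in the text; the variable and function names are hypothetical.

```python
MRD = 6  # CSF samples are diluted 6-fold; buffer-based calibrators are not

# In-well calibrator concentrations (pg/mL) and the three anchor points.
CALIBRATORS = [40.0, 50.0, 62.5, 125.0, 175.0, 250.0, 375.0, 500.0, 750.0, 1000.0]
ANCHORS = {40.0, 50.0, 1000.0}

quantitative = [c for c in CALIBRATORS if c not in ANCHORS]

lloq_csf = min(quantitative) * MRD  # 375 pg/mL in CSF terms
uloq_csf = max(quantitative) * MRD  # 4,500 pg/mL in CSF terms
top_csf = max(CALIBRATORS) * MRD    # 6,000 pg/mL for the highest calibrator

def in_quantitative_range(csf_pg_ml):
    """True if a CSF concentration falls between LLOQ and ULOQ."""
    return lloq_csf <= csf_pg_ml <= uloq_csf
```

Both ends of the approximate clinical reference range (1,500 and 3,500 pg/mL) fall inside the resulting 375 to 4,500 pg/mL window.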

Fig. 3

Calibration curve and VS samples for method validation. The final calibration series consisted of synthetic Aβ42 peptide (small circles) in buffer at well concentrations of 1,000, 750, 500, 375, 250, 175, 125, 62.5, 50, and 40 pg/mL; anchor points were at 1,000, 50, and 40 pg/mL. The LLOQ and ULOQ of the method (dashed lines) were defined by the calibrators at well Aβ42 concentrations of 62.5 and 750 pg/mL, respectively. Five VS levels (large circles) were employed in the core runs and consisted of non-spiked AD donor CSF (blue circles) or CSF spiked with Aβ42 calibrator (red circles). The anticipated reference range determined from a limited number of incurred CSF samples is shown as a gray-shaded region

Method Validation

Calibration Curve Performance

Six validation runs were performed over 3 days by two analysts to establish the performance of the calibration curve. Each run passed pre-defined acceptance criteria as specified in “Methods.” The precision of the interpolated values of all calibrators in the quantitative assay range was ≤3.8 % CV. The accuracy was also excellent, with ≤4.1 % absolute bias. Each calibrator series fit well to the four-parameter logistic algorithm with 1/Y weighting (see Fig. 4).

Fig. 4

Calibration curve and goodness of fit to 4-PL. The logarithm of the mean OD signal over all six core runs is shown for all ten Aβ42 calibrators. The line represents the best fit of the data to a four-parameter logistic function with 1/Y weighting. Error bars represent ±1 standard deviation around the mean of six independent experiments

Precision and Accuracy

Over six runs performed on three separate days by two analysts, all VS levels passed the a priori acceptance criterion of total error (TE; %CV + |%bias|) ≤ 30 %. In fact, the highest value of TE was 18.4 % (observed in VS5/ULOQ). In Table I, the intra-assay precision and accuracy ranges at each VS level as well as the values for inter-assay precision and accuracy are displayed. Within a plate, the precision values range between 0.6 and 8.4 %CV, while between plates, the values fall between 4.2 and 12.1 %CV. All VS levels performed equally well whether they were constructed by dilution of a CSF pool (VS1), by using endogenous CSF pools (VS2 and VS3) or by spiking a CSF pool with calibrator (VS4 and VS5). The accuracy of the assessments varied to a greater degree than did the precision, ranging from −22.1 to 12.9 % bias (both extremes occurred at VS5/ULOQ), but all bias values fell within the pre-established acceptance criterion of 30 %. Aside from VS5/ULOQ, the intra-assay accuracy across the six runs ranged from −17.4 to 4.2 % bias; inter-assay accuracy varied from −10.9 to −1.8 % bias.
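For readers less familiar with these statistics, the precision, accuracy, and total error measures can be sketched as follows. The replicate values are hypothetical, and the use of the sample standard deviation for %CV is an assumption; only the 2,161 pg/mL nominal value and the 30 % limit come from the text.

```python
from statistics import mean, stdev

def percent_cv(values):
    """Precision: sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)

def percent_bias(values, nominal):
    """Accuracy: deviation of the mean from the nominal value, in percent."""
    return 100.0 * (mean(values) - nominal) / nominal

def total_error(values, nominal):
    """Total error: %CV plus the absolute value of %bias."""
    return percent_cv(values) + abs(percent_bias(values, nominal))

# Hypothetical replicate results (pg/mL) for a sample with a nominal value of 2,161.
replicates = [2050.0, 2100.0, 2150.0, 2000.0, 2200.0]
te = total_error(replicates, nominal=2161.0)
passes = te <= 30.0
```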

Table I Precision and Accuracy of the Method

Selectivity

Ten individual AD CSF samples were tested for recovery of a known spike concentration of synthetic calibrator. A spike concentration approximately equal to VS3 was selected to ensure that all tested samples would remain in range despite anticipated variation in endogenous Aβ42 concentration. The acceptance criterion for this experiment was ±20 % bias. The endogenous analyte levels ranged from 483 to 1,950 pg/mL. Across that range, the spike was recovered to an acceptable level (86.1 to 98.8 %) regardless of endogenous Aβ42 concentration (see Supplementary Table I).

Specificity

The ability of the assay to distinguish between Aβ42 and Aβ40 was tested in both VS2 (designated LQC after the core runs) and VS4 (HQC). Both QC samples were tested in the presence of 0, 900, 2,250, and 4,500 pg/mL Aβ40. The acceptance criterion for this experiment was set to ≤±20 % bias from established Aβ42 concentration. In all cases, Aβ42 recovery was acceptable (range 86.1–98.8 %), with no correlation between recovery and spike concentration (see Supplementary Fig. 5).

Parallelism

Three individual AD CSF samples were diluted from neat (in this experiment, MRD was not performed) to 1 in 8 to assess the parallelism between the endogenous analyte in CSF and the synthetic calibrators in assay buffer. As in method development, the Aβ42 reported concentration is suppressed in 100 and 50 % CSF (see Fig. 5). In this experiment, dilutions of 1 in 4, 1 in 6, and 1 in 8 all generate a similar back-calculated concentration, indicating that the assay could run at an MRD of four or six, and that the assay demonstrates parallelism at CSF concentrations of 25 % or below.

Fig. 5

Parallelism of calibrator and CSF sample concentration response. Three lots of individual AD donor CSF were diluted in assay buffer to various levels prior to loading into the assay plate. The back-calculated Aβ42 concentration for each lot at each dilution level was plotted against the applied dilution factor. Error bars describe the range around the mean of two replicates

Linearity of Dilution

To test for dilutional linearity and the presence of a hook effect, LQC was spiked to 9,000 pg/mL with Aβ42 calibrator and then diluted in several increments up to 100-fold. All samples with a nominal Aβ42 concentration ≥4,500 pg/mL generated a signal above the ULOQ, indicating that no hook effect occurs at high Aβ42 concentration. The dilutions with nominal Aβ42 concentration values within the quantitative range of the assay displayed a linear relationship between expected and observed Aβ42 concentration (see Supplementary Fig. 6). These results indicate that the assay displays acceptable dilutional linearity within the quantitative range and no hook effect.

Interference from Hemolysis

The measured Aβ42 concentration in two lots of AD donor CSF spiked with three levels of lysed blood was within 20 % of the appropriate mock-treated control (data not shown). Therefore, the method was deemed free from interference from hemolyzed blood up to the highest contamination level (10 %) tested.

Stability

LQC and HQC in LoBind tubes and two individual ISs in Nunc cryovials were stability tested at room temperature, on ice, after several freeze–thaw cycles, and at −70°C. Benchtop stability at room temperature was established for both QCs in LoBind tubes and ISs in Nunc cryovials to 21 h in the absence of Tween-20 and 5 h in the presence of Tween-20. Benchtop stability on ice was established for both QCs in LoBind tubes and ISs in Nunc cryovials to 22.5 h in the absence of Tween-20 and 4 h in the presence of Tween-20. Incurred samples in Nunc vials were tolerant to three cycles of freeze–thaw (additional cycles were not tested). Long-term frozen stability in LoBind and Nunc tubes has been established to 3 months at −70°C (studies ongoing). These studies indicate that Aβ42 is sufficiently stable in CSF over timeframes appropriate for assay execution, with sufficient tolerance for unexpected variations from defined procedure.

DISCUSSION

General Observations on the Use of Aβ42 Assays in the Field

The Innogenetics INNOTEST® Aβ42 ELISA kit has been commercially available since 2000 and is one of the most widely used kits for CSF Aβ42 measurement. It has been successfully used to categorize subjects by disease status (3,4,15,16) and has the potential to be an FDA-approved diagnostic test for AD. There is also growing interest in using these types of kits to measure CSF Aβ42 changes as a pharmacodynamic biomarker of drug effect in clinical trials (23,24). Since such pharmacodynamic changes may be subtle, excellent assay precision and reproducibility are highly desirable.

In order to support our own clinical efforts, we have, to our knowledge, performed the first GLP-equivalent advanced validation of this (or any other Aβ42) assay. During pre-validation method development, we systematically examined every aspect of the method, and made significant modifications when required to fulfill advanced validation requirements. A summary of the original manufacturer-recommended assay parameters, and the various modifications we made to support advanced validation can be found in Table II. These modifications, and the advanced assay validation that followed, are discussed in more detail below.

Table II Method Development Aspects and Improvements to the Assay

Establishment of Minimum Required Dilution for CSF in INNOTEST ELISA

The INNOTEST® manufacturer recommends that CSF samples be analyzed neat, and the field has followed this recommendation in establishing apparent Aβ42 levels in AD, MCI and cognitively normal subjects (e.g., 3,9,15,16,25). However, we have demonstrated in this report that dilution of CSF results in increased Aβ42 measured concentrations; such apparent signal suppression in neat CSF was also demonstrated in this assay recently by Bjerke et al. (17), although an MRD was not determined. Due to the adequate sensitivity of the method we opted to consider MRD as the dilution of matrix that fully mitigates the impact of matrix on the back-calculated analyte value (in this case, a dilution factor of 6). Lower dilution factors can be considered if concerns about sensitivity outweigh concerns about complete recovery of Aβ42 in the CSF (however, the precision of the assay would need to be re-established). To our knowledge, the present study is the first to rigorously assign an MRD for this assay in CSF.

Establishment of Accurate Reference Range for Aβ42 in AD CSF

A critical implication of our MRD analysis is that published studies using neat samples to establish reference ranges in normal and diseased human populations likely underestimate true Aβ42 concentrations. These studies (for example, 3–5,15,16) utilizing the INNOTEST® Aβ42 ELISA reported that human Aβ42 CSF concentrations average approximately 560–760 pg/mL in normal subjects and approximately 380–490 pg/mL in AD. In order to use the most appropriate calibration curve for our study, we analyzed multiple incurred mild AD study samples after MRD/Tween treatment and observed an Aβ42 range of 1,500–3,500 pg/mL. Further investigation of multiple normal and AD CSF samples comparing this modified INNOTEST® assay with other methodologies (once validated to a similarly advanced level) would provide a greater understanding of the true Aβ42 concentration range in humans and would provide an indication of the absolute accuracy of each validated method.

Modification of Calibration Curve to Correctly Fit Accurate Reference Range

The establishment of an MRD in this method effectively expanded the quantitative range of the method 6-fold, as diluted CSF samples were queried against neat, buffer-based calibrators. In order to optimally orient calibrators and QC samples in relation to the established reference range, the calibration range was adjusted as defined in Table II. The quantitative range of the optimized method was designed to accommodate a substantial increase or decrease in CSF Aβ42 levels due to a therapeutic effect during a clinical trial (see Fig. 3). This modified calibration series performed well in validation and is recommended for future users of the method, coupled with an MRD of 6.

Detergent Mitigation of Aβ42 Adsorption

Studies performed by Innogenetics and others (20) (Vanderstichele et al., personal communications) indicate that Aβ42 can adsorb significantly and variably to polypropylene tubes made by different manufacturers. To respond to loss of Aβ42 in our clinical sample storage vials, we used a method introduced by McCush et al. (22) in which addition of a final concentration of 0.2 % Tween-20 can recover adsorbed peptide. Furthermore, we were able to extend those findings by showing complete mitigation of Aβ42 adsorption, using non-adsorbing LoBind tubes as comparators. The chosen level of Tween-20 does not interfere with assay performance (Supplementary Fig. 4), and therefore this post-collection mitigation strategy is suggested for use across a wide range of polypropylene tubes and for existing samples already in long-term storage.

Appropriate Source of VS/QC Material

In their description of fit-for-purpose biomarker development and validation, Lee et al. (21) state that VS and QC material “should be as closely related to the study samples as possible.” We followed these recommendations by sourcing enough CSF from human AD subjects to create large pools for both assay validation and sample analysis. This CSF was collected, processed and stored using defined and standardized procedures and was therefore deemed superior to widely available remnant CSF. In addition, AD CSF contains lower Aβ42 concentrations than normal CSF, enabling the creation of VS/QC pools from purely native CSF to optimally control the performance of the method during clinical sample analysis. However, to establish the full quantitative range of the method and to allow for potential modulation of Aβ42 concentration upon therapy, adulteration of the CSF was necessary: HQC and ULOQ were created by spiking in synthetic Aβ42, and LLOQ was generated by dilution of CSF in assay buffer. Both manipulations (spiking and dilution) were supported by the MRD/Parallelism experiments (Figs. 1 and 5), in that synthetic analyte possessed a similar concentration/signal correlation to endogenous Aβ42 and back-calculated sample concentrations remained constant at CSF levels below 16.7 %. We believe that the use of highly relevant human CSF for method validation and QC represents a significant improvement over buffer-based or remnant CSF approaches.

Performance of the Optimized Method

Method validation showed that the optimized method is suitable for the detection of biomarker shifts during a clinical study. The intra-assay precision throughout the quantitative range was below 10 %, and both the intra- and inter-assay precision results compare well with commonly employed acceptance criteria for biomarker assays (26). As discussed above, accuracy could be validated only to a relative level due to the presence of endogenous analyte in all human CSF samples. Although absolute accuracy is ultimately not required for use of the method to assess therapeutic effects, the mitigation of two major sources of inaccuracy reported here represents significant progress toward that goal. Further development of the method as a diagnostic tool would require cross-validation with other validated quantitative techniques in order to confirm absolute accuracy. Despite the limitations imposed by the matrix in question, the accuracy results and total error calculations generated during method validation demonstrated consistent quantitation across the entire quantitative range and compare well with values reported in the biomarker field (26). The optimized method is selective and specific for Aβ42, is free of the artifacts of non-parallelism, dilutional non-linearity, and hook effect, and is tolerant of significant blood contamination of CSF. The CSF-based QC samples introduced into the method are stable under conditions expected during normal assay execution, and CSF samples can be stored at −70°C for at least 3 months.
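
For readers who wish to reproduce precision and total error summaries of this kind, a minimal sketch is given below. It assumes the convention commonly used for biomarker assays, in which total error is the sum of the absolute relative error (%RE) and the coefficient of variation (%CV). The replicate values and nominal concentration are hypothetical, and the exact formulas used in the validation are not restated in this section.

```python
import statistics


def percent_cv(replicates):
    """Precision: sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def percent_relative_error(replicates, nominal):
    """Relative accuracy: deviation of the mean from the nominal value,
    expressed as a percentage of the nominal value."""
    return 100.0 * (statistics.mean(replicates) - nominal) / nominal


def total_error(replicates, nominal):
    """Assumed convention: |%RE| + %CV."""
    return abs(percent_relative_error(replicates, nominal)) + percent_cv(replicates)


# Hypothetical back-calculated QC replicates (pg/mL) and nominal value.
qc_replicates = [95.0, 100.0, 105.0]
qc_nominal = 100.0

print(percent_cv(qc_replicates))
print(percent_relative_error(qc_replicates, qc_nominal))
print(total_error(qc_replicates, qc_nominal))
```

Because all CSF samples contain endogenous analyte, the nominal value in such a calculation is itself a measured quantity, so the resulting %RE reflects relative rather than absolute accuracy.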

Implications for the AD Field

In addition to diagnosing and stratifying patients, there is increasing interest in using CSF Aβ42 changes as a potential marker of drug efficacy, which requires a reliable, accurate and precise Aβ42 assay. However, the lack of rigorously validated methods, together with the documented variability both between and within commonly used Aβ42 assay platforms, has impeded the use of these assays in drug trials. In the present study, we have demonstrated that the INNOTEST® Aβ42 assay meets advanced validation criteria when an MRD is incorporated, and is thus useful for measuring drug-induced biomarker changes. However, the different absolute values obtained on different platforms create uncertainty about the absolute accuracy of each method. In several recent reports, QC human CSF samples were tested in three commonly used assay kits: INNO-BIA AlzBio3® (a Luminex bead-based method), the MSD® MULTI-SPOT® Human (6E10) Abeta Triplex Assay, and the INNOTEST® Aβ42 ELISA (17,27,28). Marked differences in Aβ42 concentration within the same CSF sample were revealed when the methods were compared, with some differences of greater than 3-fold reported. The increase in Aβ42 concentration upon dilution to MRD and adsorption mitigation revealed in this report raises the distinct possibility that matrix effects and/or adsorption contribute to the observed inter-platform discordance. Careful validation of all CSF Aβ42 assays would advance the effort to harmonize the clinical application of these technologies.

CONCLUSIONS

In conclusion, we have conducted what is, to our knowledge, the first GLP-level method development and advanced validation of a method for the quantitation of Aβ42 in human CSF. Mitigation of both matrix interference and analyte adsorption was critical to a successful validation. In addition, we conducted this validation using CSF from individual AD patients for all critical experiments, including precision and accuracy. This modified ELISA can be used with confidence to precisely and accurately measure CSF Aβ42 in current and future clinical studies of novel therapeutic modalities.