Introduction

Immunogenicity of biotherapeutics is generally evaluated through the detection of anti-drug antibodies (ADAs). Neutralizing antibodies (NAbs) are a subset of ADAs that inhibit binding of the biotherapeutic to its target. This inhibition may result in the neutralization of the biotherapeutic’s pharmacologic activity, potentially reducing efficacy. For biotherapeutics that are homologous to an endogenous protein with a non-redundant function, NAbs may cross-react with the endogenous counterpart and can have critical consequences for human safety (1, 2). Therefore, assessing and reporting the immunogenicity of biotherapeutics, including NAbs, is a regulatory expectation during clinical development and is tied to the immunogenicity risk assessment (3,4,5,6,7). In most cases, NAb testing is conducted subsequent to ADA testing, and testing in the NAb assay is limited to ADA-positive (i.e., confirmed) samples for purposes of characterizing the immune response as either neutralizing (NAb-positive) or non-neutralizing (NAb-negative). In rare instances, NAb testing may be conducted in parallel with ADA testing and may leverage one or more NAb-specific assay tiers (i.e., screen, confirm, titer). This parallel testing flow may be implemented in situations where NAb is used for trial inclusion/exclusion purposes, or to support otherwise high-risk products.

The NAb assay is a functional assay that evaluates the inhibition of the biotherapeutic’s activity. There are many NAb assay detection platforms currently in use, but all NAb assays fall under two main categories: non-cell-based assays (i.e., ligand binding assays; LBAs) and cell-based assays. The rationale and strategies for selecting the appropriate NAb assay format have been extensively discussed elsewhere (8). The overarching principle is that the assay format should reflect the in vivo therapeutic mechanism of action (MoA) to generate clinically meaningful data. Cell-based assays have been traditionally preferred by regulatory agencies since they may better reflect the biological activity and MoA of the drug. These assays depend on the availability of cell lines that stably express the receptor related to the MoA of the drug (9). Alternatively, non-cell-based assays (e.g., competitive ligand binding assays with recombinant target, such as soluble target or receptor-Fc fusion proteins) are highly reproducible and relatively easy for a trained analyst to perform and validate. They have been shown to be a viable assay option for detection of NAbs to a variety of biotherapeutics, with comparable or even superior assay performance including drug tolerance (10,11,12). The sponsor needs to decide on the most suitable assay format based on the MoA of the drug, the development program, reagent availability, and mutual understanding with the regulatory agencies.

The utility of NAb assessment in immunogenicity testing strategy has been passionately debated within the biopharmaceutical industry, especially for biotherapeutics with low risk of potential immunogenicity (e.g., fully human monoclonal antibody biotherapeutics) (13). In some instances, the results from a sensitive, robust pharmacodynamic (PD) biomarker assay can be a more biologically relevant tool by confirming the presence of in vivo neutralizing activity and the relationship to clinical outcomes. Therefore, an integrated analysis of pharmacokinetic, pharmacodynamic, and immunogenicity data to interpret neutralizing activity of low risk biotherapeutics has been proposed by the industry as an alternative approach in understanding any neutralization of therapeutic activity (14, 15). The decision to develop a NAb assay or a pharmacodynamic/biomarker assay for in vivo NAb assessment should be evaluated on a case-by-case basis. Early engagement with regulatory agencies is highly encouraged to ensure alignment on the NAb assessment approach to support the regulatory filing process.

The main goal of this publication is to provide a framework and a succinct reporting structure for NAb assay validations. This industry-wide alignment will hopefully streamline the review process by regulatory agencies. The subsequent sections summarize our recommendations for the reporting of NAb assay validation data that can be implemented across sponsors and included in various regulatory document sections pertaining to immunogenicity. ADA validation testing and reporting is out of scope for this manuscript and was published independently in Myler et al. (16).

Method Summary

Applicable method summaries should be included in the validation report and relevant sections of the common technical document (CTD) including the summary of bioanalytical studies and associated analytical methods, CTD module 2.7.1.4, and the integrated summary of immunogenicity (ISI), CTD module 5.3.5.3. Table I provides an example of the most relevant information to be included in the validation report method summary. Method summaries in the CTD module 2.7.1.4 should include all relevant detail needed to understand the complexity and history of the bioanalytical method. The ISI should include sufficient method summary detail to understand the methods within their context of use supporting the overall program. These will aid reviewers in understanding the scope of the validations, including the analytes of interest, validation data, amendments to the initial validation, participating bioanalytical laboratories, and specific clinical projects (16).

Table I Method Summary

An outline of critical assay parameters including critical reagent specifications, assay platform, method format, sample pre-treatment including assay minimum required dilution (MRD), sample volume needed for analysis with the sample storage conditions, cell passage limits, and control and sample criteria will help reviewers understand the context of the assay. Select examples of method summary details have been included in italics in Table I for reference. Due to the numerous assay formats and detection platforms that might be employed for NAb assays, Table I may not be a comprehensive list of key assay details. Thus, method-specific details can be added, edited, or omitted as they apply to a specific assay.

The intent of the method summary table is to provide a comprehensive understanding of the method parameters, including testing done after the initial pre-study validation, to provide regulators with an adequate understanding of the evolution of validation data throughout the life cycle of the assay pertinent to a specific method, patient population(s), and associated filing(s). Details from independently validated methods do not have to be included in this table but can be included in separate validation reports and in the ISI, as applicable, that is submitted at filing. Links to associated reports and any applicable amendments should be included and accessible for reviewers.

Assay Acceptance Criteria

Assay acceptance criteria should be included in Table I as applicable to direct, indirect, or other assay formats (17). Control composition and associated control criteria will depend on the assay format (17). These criteria can be calculated using data from all the control sample results generated during validation, prior to the initiation of the in-study sample analysis phase. If sufficient data from control samples were not available to reliably estimate these limits during pre-study validation, additional data from the in-study phase can be utilized to re-calculate these limits. In such cases, provisional study phase criteria can be applied until a sufficient dataset is available for a more robust assessment. These limits, along with other assay parameters, may also have to be re-assessed during the in-study phase whenever there is a significant change in the assay conditions or reagents. For example, in the case of a new immunization to generate another batch of polyclonal positive control (PC) or introduction of a PC from a different source, a re-evaluation of the low positive control level, including adequate sensitivity and associated assay acceptance criteria, should be performed. It is not crucial to re-assess assay sensitivity, selectivity, and drug tolerance when a new PC is introduced, as these assay parameters have already been established in validation. However, if changes to the assay conditions impact the cut point (CP), then parameters such as sensitivity, selectivity, and drug tolerance should be re-assessed. These concepts are further discussed in the sections pertaining to sensitivity, selectivity, and drug tolerance below and in the preceding literature (16).

The following approaches can be used to establish assay criteria and to track assay performance over time. These criteria are included in Table I and should be updated to accommodate the specific assay format.

  1. Minimally, the low positive control (LPC), high positive control (HPC), and the plate controls, such as the negative control (NC), drug control (DC), and ligand control (LC), are evaluated for assay performance as applicable.

    a. A middle PC (MPC) may be included in the validation when there is sufficient assay range and is optional during sample analysis.

    b. Extraction controls may also be included in the validation as applicable and are optional during sample analysis.

  2. The percent coefficient of variation (%CV) between replicate wells should normally be ≤ 25%. Higher CV may be suitable as addressed in the FDA guidance, Immunogenicity Testing of Therapeutic Protein Products: Developing and Validating Assays for Anti-Drug Antibody Detection (5).

  3. Rank order should be maintained between the HPC, LPC, and CP.

  4. Upper and/or lower limits using ratio or raw signal, as applicable, can be applied to plate and positive controls using the following:

    a. Upper limit: mean + t(0.01,n-1) × SD. Lower limit: mean − t(0.01,n-1) × SD. In these equations, the mean and standard deviation (SD) are calculated using the data from all the control samples tested during pre-study validation; t(0.01,n-1) is the critical value from the 1-sided t-distribution with n-1 degrees of freedom corresponding to a 1% error rate, and n represents the number of independent replicate results used in this evaluation.

    b. Should the calculated upper or lower limits for the LPC expand beyond the CP, the CP can be used as the upper or lower limit depending on the assay format.

    c. One sided limits may be suitable.

  5. Assay criteria may be established using normalized data (e.g., ratios), as well as raw signals.

The limits closer to the normalization factor are typically used for system suitability; however, it could be appropriate to use the full acceptance ranges in some instances. Examples are as follows:

  a. For a direct cell-based NAb assay, the drug stimulates the cellular response, and the PC inhibits the drug response, resulting in a PC/DC ratio less than 1. Here, PC ratios fall below the CP, so a lower limit of acceptance would be established from the PC/DC range and the CP would be set as the upper limit.

  b. For an indirect cell-based NAb assay, the drug inhibits the cellular response, and the PC reverses the inhibition, resulting in a PC/DC ratio above 1. The CP would be set as the lower limit of acceptance and the upper limit would be established from the PC/DC range.

Trends in control data post-validation should be monitored to identify drifts or changes in assay performance over time. If there is a significant change in the assay conditions or critical reagents, re-assessment of control limits for sample testing should be indicated.
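As a minimal sketch of the limit calculation described in item 4 above, the following Python snippet computes mean ± t × SD limits from validation control data and optionally caps an LPC limit at the CP. The function name, the `cap` argument, and the example ratios are illustrative assumptions and are not taken from Table I. The t critical value is supplied by the user from a standard t table rather than computed.

```python
from statistics import mean, stdev

def control_limits(values, t_crit, cut_point=None, cap="none"):
    """Return (lower, upper) limits as mean -/+ t_crit * SD.

    values    : control results (ratio or raw signal) from pre-study validation
    t_crit    : one-sided critical value t(0.01, n-1), looked up by the caller
    cut_point : optional CP that may replace an LPC limit (item 4b)
    cap       : "upper", "lower", or "none"; which limit the CP may replace
    """
    m, sd = mean(values), stdev(values)
    lower, upper = m - t_crit * sd, m + t_crit * sd
    if cut_point is not None:
        if cap == "upper" and upper > cut_point:
            upper = cut_point
        elif cap == "lower" and lower < cut_point:
            lower = cut_point
    return lower, upper

# Hypothetical LPC/DC ratios from six validation runs of a direct assay
lpc_ratios = [0.62, 0.58, 0.65, 0.60, 0.66, 0.61]
T_CRIT_DF5 = 3.365  # t(0.01, 5), one-sided, from a standard t table

lower, upper = control_limits(lpc_ratios, T_CRIT_DF5, cut_point=0.70, cap="upper")
print(f"LPC acceptance range: {lower:.3f} to {upper:.3f}")
```

In this hypothetical direct-assay example the calculated upper limit exceeds the assumed CP of 0.70, so the CP is used as the upper limit, consistent with item 4b.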

Cut Point

As with the entire immunogenicity testing strategy, the establishment of scientifically sound and meaningful NAb CPs is informed by the overarching immunogenicity risk of the therapeutic.

Statistical Approach Considerations

There are numerous publications outlining statistical methods for the evaluation of immunogenicity cut point(s) (5, 18,19,20,21,22,23,24,25), including recent recommendations published in the AAPS ADAH white paper (16). The establishment of CPs for ADA and NAb assays shares many common underlying principles. The intent here is to focus on the considerations and limitations that are unique and specific to NAb assays.

For both cell-based NAb assays and ligand binding NAb assays, normalization of data is recommended to mitigate potential assay variability and assay signal drift over time. Generally, there is a benefit in this floating method normalization application, particularly in cell-based assays due to the inherent variable nature of cells. Data is normalized by calculating a signal to background ratio (S/B) where the numerator is the individual subject sera signal and the denominator can be the mean, median, or geometric mean of a pooled negative control from the same plate. The S/B ratios may be calculated from log-transformed or non-transformed data.
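The plate-level normalization described above can be sketched as follows. This is an illustrative example only: the function name, the `center` option, and the signal values are assumptions, and the choice among mean, median, and geometric mean of the negative control remains assay-specific.

```python
from statistics import geometric_mean, mean, median

def signal_to_background(sample_signals, nc_signals, center="mean"):
    """Normalize each sample signal to the same-plate negative control (NC).

    center selects how the NC replicates are summarized:
    "mean", "median", or "geometric_mean".
    """
    summarize = {"mean": mean, "median": median, "geometric_mean": geometric_mean}
    background = summarize[center](nc_signals)
    return [signal / background for signal in sample_signals]

# Hypothetical raw signals from one plate
plate_nc = [1000.0, 1100.0, 900.0]
plate_samples = [950.0, 500.0, 1200.0]

ratios = signal_to_background(plate_samples, plate_nc)
print([round(r, 2) for r in ratios])
```

Because the denominator is recomputed for every plate, the resulting ratios float with plate-to-plate signal drift, which is the benefit noted above for cell-based assays.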

Identification and removal of outliers is an important step in the CP evaluation process due to their impact on the cut point outcome. The proposed process and considerations for identifying analytical and biological outliers have been described in the literature (18, 19, 21,22,23,24). Multiple approaches can be justified; examples include box plots, subject-level residuals, and mixed-effects models. However, some caution should be applied to iterative removal of outliers or application of overly stringent outlier criteria. The identified outliers may make sense from a statistical perspective, but it is recommended to evaluate whether they also make sense from a biological perspective. Some populations (e.g., rheumatoid arthritis) are prone to interfering factors, and strict statistically driven outlier removal in such populations may result in a high rate of false positivity.

The current US Pharmacopeia and FDA guidance documents recommend a minimum of 30 subjects tested on at least 3 different days by at least 2 analysts for NAb cut point determinations (5, 20). We recognize that LBA NAb assays and cell-based NAb assays are distinct assay formats. It may be appropriate for LBA NAb assays to have a minimum of 50 individuals tested for the CP, similar to LBA-based ADA assays, while cell-based assays have at least 30 individuals tested for the CP (5).

Some common computational methods to derive CPs rely on the assumption of a normally distributed population with homogeneous variance. Verification of this assumption can be challenging when working with small sample sizes, as deviations from normality are harder to detect. The industry has presented recommendations on when to apply parametric, robust parametric, or non-parametric methods for reporting the CP estimate and lower confidence limit cut points (18). The 2019 FDA guidance (5) mentions the lower confidence limit cut point approach as it provides a greater likelihood of achieving the desired false positive rate (22). When a dataset is determined to be normally distributed (typically using the Shapiro–Wilk test) and not skewed, a standard parametric approach can be applied to ensure an approximately 1% false positive rate. If the dataset is determined to be not normally distributed (i.e., p < 0.05) but shows only moderate skewness/kurtosis, a robust parametric approach based on the median and median absolute deviation (MAD) can be applied to cut point determination, and may be best applied to a log-transformed dataset. The non-parametric percentile cut point approach is robust against non-normality, but sensitive to outliers, and may need a larger sample size to obtain a reliable cut point estimate (18). This method should be applied when there is a departure from normality and the dataset is highly skewed (18, 21).
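To make the parametric and robust parametric options concrete, here is a hedged Python sketch. It assumes a 1% false positive rate implemented with the standard normal 99th percentile (about 2.33) and the conventional 1.4826 factor for scaling MAD to an SD estimate. Neither constant is stated in this paper, and the function names and data are hypothetical. The `direction` argument reflects that the CP may sit below or above the negative population depending on assay format. The lower confidence limit approach and any log transformation are not shown.

```python
from statistics import NormalDist, mean, median, stdev

MAD_TO_SD = 1.4826  # assumed scaling so that MAD estimates SD for normal data

def _signed_z(fpr, direction):
    """Normal quantile for the target false positive rate, signed by direction."""
    z = NormalDist().inv_cdf(1 - fpr)
    return z if direction == "upper" else -z

def parametric_cut_point(values, fpr=0.01, direction="lower"):
    """Mean +/- z * SD; suitable when normality is not rejected."""
    return mean(values) + _signed_z(fpr, direction) * stdev(values)

def robust_cut_point(values, fpr=0.01, direction="lower"):
    """Median +/- z * 1.4826 * MAD; less sensitive to moderate non-normality."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + _signed_z(fpr, direction) * MAD_TO_SD * mad

# Hypothetical normalized ratios; a real CP dataset would use far more
# subjects (for example, the 30 or more individuals discussed above).
naive_ratios = [0.95, 1.00, 1.05, 0.98, 1.02]

cp_parametric = parametric_cut_point(naive_ratios)
cp_robust = robust_cut_point(naive_ratios)
print(f"parametric CP: {cp_parametric:.3f}, robust CP: {cp_robust:.3f}")
```

The five-value dataset is for illustration only. The choice between the two estimators would follow the normality and skewness assessment described above.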

The methodology for NAb cut point evaluations will vary among applications and populations, and should be supported with a sound rationale. There is no one-size-fits-all approach.

Experimental Design for NAb Assay Cut Point

The experimental design for determining a CP for a neutralizing antibody assay borrows several concepts that have been laid out for ADA assays. However, a NAb assay also calls for special considerations in the experimental design. Those considerations may differ based on the format of the assay (i.e., cell-based vs. LBA).

The similarities between a NAb CP and an ADA CP are derived from ensuring the results are statistically sound and relevant to the study samples that are to be evaluated. As initially outlined in 2008 by Shankar et al. and reiterated in 2017 by Devanarayan et al., the CP design should be balanced; that is, the samples used to determine the CP are divided into equal subgroups and each analyst tests the samples exactly once in each assay run per plate (21). The design should also allow for the differentiation of biologic vs. analytical error by ensuring each matrix is analyzed by each analyst in each run during CP evaluation (18). Finally, samples from the target population are recommended for establishing the most suitable CP. If normal healthy individual samples are included in the CP determination, any differences between populations should be considered.

There are, of course, differences between the CP experimental designs with neutralizing antibodies as compared to ADA assays. NAb assays are frequently developed later in the drug development process when more is known about the pharmacokinetics and pharmacodynamics of the therapeutic. This additional knowledge continues to inform the immunogenicity risk assessment, therefore driving a more customized experimental design. Some of the potential design modifications may be to ensure all the analytical factors associated with the assay are incorporated, especially for cell-based assays. Other design modifications may be focused on expanding or narrowing the biological variability. Biological variability may be considered further through sub-group analysis, ultimately to ensure the appropriateness of a given CP to the population of interest.

Confirmatory and Titer Assay Tiers

The adoption of a confirmatory assay tier in NAb testing is tied to an overall NAb testing strategy within each company. It is not a common practice to have a confirmatory tier for NAb assays since, in most cases, samples have already been confirmed to have drug-specific ADA. However, due to the complexity of bioassays, especially cell-based bioassays, a confirmation of specificity may be useful to distinguish between a true NAb response and false positives stemming from potential assay artifacts (e.g., assay variability or non-specific inhibitory activities that may be present in the sample). Implementation of a confirmatory step may be particularly useful in cases when NAb assays are performed independently of the conventional tiered testing strategy where NAb is assessed only for ADA-positive samples (e.g., in cases where NAb may be used for trial inclusion/exclusion, enzyme replacement therapy, etc.).

There are several approaches to design a confirmatory bioassay:

  • Alternative stimuli. This approach is usually applied in a direct bioassay for samples containing cytotoxic or unknown inhibitory factors. It is important to identify a cell line that responds to the specific drug product, as well as to the alternative stimuli.

  • Sample background. This approach is usually applied in an indirect bioassay for serum samples containing stimulating factors. The baseline response from samples can be detected by treating cells with only sample serum. Elevated response can be an indication of non-specific neutralizing activity.

  • Protein A/G/L immunodepletion. This approach has been used in a confirmatory assay. A comparison of the response from serum before and after pre-treatment with protein A/G/L beads confirms that the neutralizing activity observed in the screening assay originates from a NAb. Implementation of protein A/G/L immunodepletion as a confirmatory tier in NAb testing is recommended when pre-existing anti-drug NAb is a concern to the drug development program.

A confirmatory assay value (CAV) is calculated for each sample. Based on the approach used in the confirmatory assay, CAV can be defined as (1) sample response normalized by response from alternative stimuli; (2) sample response normalized by response from negative control (background); and (3) ratio between response from protein A/G/L treated and untreated sample. Similar to the statistical approach used for screening assay CP, confirmatory bioassay CP is also determined based on assay variability established with samples from treatment-naïve subjects. A 1% false positive rate is usually used to derive the confirmatory assay CP.

Implementation of a NAb titer tier is not a common practice but may be useful for programs with high immunogenicity risk, such as cytokines or growth factors, where knowledge of the severity of neutralizing activity is important for clinicians. Determination of NAb titer may also be useful in distinguishing treatment-induced and treatment-boosted neutralizing antibodies (26). If NAb titer determination is indicated, two- or threefold dilution with undiluted matrix is recommended for titrating NAb samples (9). Similar approaches to the ADA titer assay are recommended for the establishment of titer plate acceptance criteria, the calculation, and the reporting of NAb titer (16).

Challenges of Assessing NAb Assay Cut Point Suitability During the In-Study Phase

An approach for verifying the suitability of ADA screening assay CP determined from validation has been described (18). In this paper, suitability of the ADA screening assay CP mainly relies on the observed false positive error rate (FPER) from in-study baseline samples after excluding samples with pre-existing ADA. If the observed FPER value from in-study baseline samples is within the range of 2 to 11%, the screening ADA CP from validation is deemed suitable for that particular patient population. However, this approach poses unique challenges when applied to the NAb assay CP determined from validation. Unlike ADA testing strategy, the confirmatory step is not routinely performed for NAb-positive samples (e.g., excluding samples with pre-existing NAb), and it is not a regulatory requirement to implement in general immunogenicity testing strategy for biotherapeutic drugs (5, 6). Furthermore, low ADA incidence from lower risk biotherapeutic drugs results in small numbers of study samples to test in the NAb assay, which becomes a critical limitation for accurately determining the observed FPER value of a NAb CP during the in-study phase. Limited or inaccessible baseline samples are another common practical concern for assessing the suitability of a NAb CP, particularly for pediatric or rare disease populations. From a technical perspective, cell-based assays have relatively lower throughput with higher variability which may impact the assessment of NAb CP appropriateness. Therefore, the observed FPER approach is deemed to be impractical for assessing the appropriateness of a NAb CP during the in-study phase.

A recent paper demonstrated that the acceptable range for the number of false positive samples is highly dependent on the size of the baseline dataset used for assessing the ADA CP (27). For a NAb CP statistically determined to achieve an approximately 1% false positive rate, it is possible to have 0% observed FPER in in-study baseline samples when the baseline data size is less than 299. This phenomenon has been widely observed in the bioanalytical community. When only a limited number of samples undergo NAb analysis, it is not recommended to perform additional NAb sample analysis solely to demonstrate the suitability of the NAb CP in the in-study phase.
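One simple way to see why a threshold near 299 can arise is sketched below. This is our illustrative reading, not necessarily the derivation used in reference 27. It assumes each baseline sample is an independent trial with a 1% chance of scoring positive, and asks how large the dataset must be before observing zero false positives becomes unlikely at a 5% level. The function names and the 5% level are assumptions.

```python
def prob_zero_false_positives(n_baseline, fpr=0.01):
    """Probability that none of n independent baseline samples score positive."""
    return (1 - fpr) ** n_baseline

def smallest_n_excluding_zero(fpr=0.01, alpha=0.05):
    """Smallest n for which observing zero false positives has probability < alpha."""
    n = 1
    while prob_zero_false_positives(n, fpr) >= alpha:
        n += 1
    return n

for n in (50, 100, 298, 299):
    print(n, round(prob_zero_false_positives(n), 4))

print("smallest n:", smallest_n_excluding_zero())
```

Under these assumptions, an observed FPER of 0% remains statistically compatible with a true 1% rate for any baseline dataset smaller than the value returned by `smallest_n_excluding_zero`.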

In instances when re-establishment of the NAb CP is warranted based on in-study data, ADA baseline samples may be utilized to establish a suitable NAb CP.

Pre-existing NAb

Pre-existing antibodies with neutralizing activity in drug-treatment naïve samples can lead to elevated CPs, decreased assay sensitivity, and potentially false negative NAb results (28). Thus, to appropriately establish a CP, samples with pre-existing NAbs should be eliminated from the CP calculation. There are several approaches to identify pre-existing antibodies, including inhibition of assay signals by drug, depletion with protein A/G/L resin, or relevant statistical approaches. If assay responses from drug-naïve samples fall into distinct subpopulations (bi-modal distribution), the least reactive population could be designated as “negative” and the most reactive population as “positive” or pre-existing (29).

Mitigation strategies to overcome the impact of pre-existing NAb on CP establishment depend on the prevalence and signal distribution of pre-existing signals in the drug-naïve sample population. Pre-existing NAbs of low prevalence and with distinct, highly reactive samples would be removed as outliers from the sample population without any impact on the resultant CP. If low prevalence pre-existing NAbs display signals on a continuous scale, outlier removal may not be feasible, and the entire sample population would be used for CP establishment. However, it is unlikely that a low percentage of pre-existing NAb samples in the overall sample population would significantly elevate CPs and reduce assay sensitivity to unacceptable levels. When pre-existing NAb is of high prevalence, outlier exclusion is not feasible. Typically, sample populations with a high prevalence of pre-existing NAb result in highly elevated CPs and compromised assay sensitivities (28).

One approach to mitigate the impact of a high prevalence of pre-existing NAb is to generate a pseudo-negative population, which could then be used for calculation of the CP by conventional methods. Pseudo-negative populations could be created using approaches similar to those described above for confirmation of pre-existing NAb. For example, using statistical methods, a pseudo-negative population could be derived from the least reactive species of the bi-modal signal distribution of the drug-naïve samples (29). A pseudo-negative population could also be created by immunodepletion of validation samples using protein A/G/L. In this procedure, all antibodies (IgG, IgM, IgA), including NAb, will be removed in the purification step and the depleted samples could be used for CP determination. While protein A/G/L immunodepletion is an effective procedure for removing pre-existing NAb, it has its own challenges. It is labor intensive and, moreover, the procedure may introduce differences between the matrix of A/G/L treated validation samples and that of non-treated study samples.

In some special cases, pseudo-negative population could be established using immune-inhibition or sequestering of pre-existing NAb (28). Due to the design of the NAb assay, where drug is an assay reagent, drug itself cannot be used for sequestering of pre-existing NAb. However, in cases with multi-domain molecules or bispecific antibodies, pre-existing NAb could be sequestered by one of the domains of the drug (or by a domain-containing molecule). Schneider et al. (28) presented a NAb method for recombinant CD22-PE38 immunotoxin where pre-existing reactivity against PE38 toxin domain was of high prevalence. The samples were incubated with a non-CD22 toxin-containing molecule to sequester pre-existing NAb. The method generated pseudo-negative population allowing establishment of CP using a conventional method. The immune-inhibition method heavily relies on availability of immune-inhibition reagents and requires development of this critical reagent ahead of time.

Sensitivity Assessment and Selection of Low Positive Control Concentrations

Assay sensitivity is defined as the concentration of a positive control neutralizing antibody which produces a result that is equal to the assay CP (1). Current FDA guidance recommends targeted sensitivity for ADA methods used to monitor clinical samples of approximately 100 ng/mL. There is currently no specific guidance for the sensitivity expectations for NAb assays, predominantly due to the limitations of some cell-based assay formats and differences in the neutralizing capacities of the positive control antibody available (17).

Assay sensitivity is determined using a neutralizing PC diluted in pooled matrix. The PC is diluted into neat matrix at ≥ 5 concentrations spanning the assay CP. In the rare case that a confirmatory assay is used, it is recommended that the concentrations chosen span the CPs for both the screening and confirmatory tiers. The dilution steps should be < 2- or 3-fold to increase the accuracy of interpolation of the NAb concentration equal to the CP.

At least six independently prepared sensitivity curves should be evaluated across multiple days by multiple analysts. It could also add value to include runs with incubation times at minimum and maximum times, including multiple instruments and lots of critical reagents, and cells at different passages to demonstrate robustness.

Depending on the therapeutic’s MoA and the NAb assay format, it may be necessary to evaluate the sensitivity of multiple positive controls. For example, bispecific therapeutics will require evaluating PCs that neutralize both parts of the drug, especially when using a two-cell system where individual targets of the bispecific are on different cells.

Data from each run included in the sensitivity assessment are used to calculate the concentration that corresponds to the plate-specific screening CP. The data from each curve should be fit to interpolate the PC concentration equal to the CP (e.g., linear regression of points above and below the CP). The mean and standard deviation (SD) of the interpolated concentrations are calculated from all PC curves and the assay sensitivity is calculated using the 95% upper confidence limit from the mean and SD [mean + t(0.05,n-1) × SD] (5). Alternatively, based on the performance of the PC in the assay, it might not be possible to obtain results that can be fit to a regression model. In these cases, the assay sensitivity could be calculated from the mean and SD of the PC above the CP from all curves. As there is no guidance on the expected sensitivity of NAb assays, a general target of 500–1000 ng/mL could be used; however, it might not be achievable with available PCs or assay formats, nor sufficient for clinical impact assessment.
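The per-curve interpolation step can be sketched as below. The example fits a straight line through the two adjacent titration points that bracket the CP, which with two points is equivalent to interpolation. Performing the fit on log10 concentration is an assumption made here for evenly spaced serial dilutions; other regression models may be more appropriate for a given assay. The function name and data are hypothetical.

```python
import math

def conc_at_cut_point(concs, responses, cut_point):
    """Interpolate the PC concentration whose response equals the cut point.

    Uses the first pair of adjacent titration points whose responses bracket
    the CP, interpolating linearly on log10(concentration).
    Returns None when no pair brackets the CP.
    """
    points = sorted(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        brackets = min(r_lo, r_hi) <= cut_point <= max(r_lo, r_hi)
        if brackets and r_lo != r_hi:
            frac = (cut_point - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical direct-assay curve: PC/DC ratio falls as NAb concentration rises
curve_concs = [125.0, 250.0, 500.0, 1000.0, 2000.0]  # ng/mL
curve_ratios = [0.98, 0.93, 0.85, 0.70, 0.50]
plate_cp = 0.89

conc = conc_at_cut_point(curve_concs, curve_ratios, plate_cp)
print(f"interpolated concentration at CP: {conc:.1f} ng/mL")
```

Returning `None` when the curve does not bracket the CP flags curves for which the concentration range would need to be adjusted, in line with the requirement that the titration span the CP.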

Based on the results of the sensitivity data, the LPC concentration should be calculated at the 99% upper confidence limit corresponding to a 1% failure rate [mean + t(0.01,n-1) × SD] (5). If the calculated concentration is within one 2-fold (or 3-fold) dilution of the LPC determined in method development and used during CP determination, the LPC concentration does not need to be changed for the remaining validation experiments. Alternatively, the LPC concentration can be set as the mean of the interpolated PC concentrations from the sensitivity assessments. PC concentrations are reported in Table II and are updated during life cycle management as needed.
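The two upper confidence limit calculations above differ only in the t critical value, as the following hedged sketch shows. The interpolated concentrations are hypothetical, the t values are one-sided values for 5 degrees of freedom taken from a standard t table, and `within_one_dilution` is one possible reading of the "within one 2-fold (or 3-fold) dilution" check rather than a prescribed rule.

```python
from statistics import mean, stdev

def upper_confidence_limit(interpolated_concs, t_crit):
    """mean + t_crit * SD of the per-curve interpolated concentrations."""
    return mean(interpolated_concs) + t_crit * stdev(interpolated_concs)

def within_one_dilution(calculated, development_lpc, fold=2.0):
    """True if calculated lies within one fold-dilution step of the existing LPC."""
    return development_lpc / fold <= calculated <= development_lpc * fold

# Hypothetical interpolated concentrations (ng/mL) from six sensitivity curves
curve_results = [350.0, 410.0, 380.0, 440.0, 395.0, 365.0]
T_05_DF5 = 2.015  # t(0.05, 5), one-sided
T_01_DF5 = 3.365  # t(0.01, 5), one-sided

sensitivity = upper_confidence_limit(curve_results, T_05_DF5)
lpc_calculated = upper_confidence_limit(curve_results, T_01_DF5)
keep_existing_lpc = within_one_dilution(lpc_calculated, development_lpc=500.0)

print(f"sensitivity: {sensitivity:.0f} ng/mL")
print(f"calculated LPC: {lpc_calculated:.0f} ng/mL, keep existing LPC: {keep_existing_lpc}")
```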

Table II Validation Summary

Control Precision

The precision of the assay controls should be evaluated during method validation and include controls critical to monitor method performance. Precision data are reported in Table II. Controls are determined by the type of method being used (e.g., indirect or direct cell-based assay). For direct assays, where the drug acts directly on the cells to elicit a response (e.g., agonist, growth factor), the controls could include a positive control at low and high levels, a drug control, and a negative control. For indirect assays, where the drug mechanism of action does not act directly on a cell line to elicit a response, controls could include a negative control, drug control, ligand control, and positive controls. For each assay format, at least three levels of PC (HPC, MPC, and LPC) and the NC are included in validation precision evaluations. For other validation assessments such as drug tolerance, sensitivity, and selectivity, the HPC, LPC, and NC are sufficient and the MPC is optional. Cell-based assays frequently have a relatively small dynamic range. In these cases, the MPC may be optional for the entire validation including the precision assessments. The MPC is optional during sample analysis irrespective of the dynamic range. Additional controls should be included as applicable to the assay format, for example, extraction controls or depletion controls.

Intra-assay precision may be assessed by evaluating independent preparation of positive control levels within a single assay assessment. Determination of intra-assay precision should be done using at least 6 independent determinations (positive control replicate sets) for each control done on one plate on 1 day. Depending on the format of the cell-based assay, there could be limitations to the number of wells used on an assay plate (e.g., only the inner wells of the plates are used or control and samples are tested in three wells on an assay plate). The number of precision determinations for these assays with lower throughput can be less than six (e.g., three replicate sets).

Inter-assay precision may be assessed by using control data from all validation runs (ensuring that there are results from n ≥ 6 runs performed by multiple analysts over multiple days). Alternatively, inter-assay precision may be determined using runs (≥ 6) designated specifically as runs for precision determinations. It is also recommended that these be performed by multiple analysts over multiple days with at least six replicates of each control (three for assays with lower throughput).

The calculation of precision is typically done using normalized results (shown below); however, raw signal responses may be used, as in the case of the negative control. Depending on the assay formats and controls used, examples of normalization could be as follows:

  • Positive control response/drug control response (indirect assay)

  • Drug control response/negative control response (direct assay)

  • Ligand control response/drug control response (indirect assay)

  • Vector control response/negative control response (direct assay)

Table II provides an example of how to report summarized precision results; it does not cover all possible scenarios, and entries should be added or omitted as appropriate. Precision acceptance limits for cell-based assays are often higher than those for LBAs (e.g., 25 to 30% CV for cell-based assays compared to 20% CV for plate-based assays) and should be defined based on assay performance from method development data.
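As an illustration of the precision calculation on normalized results, the sketch below computes the %CV of positive control/drug control ratios for an indirect assay. It assumes each LPC replicate set is paired with a drug control response from the same determination. The signal values, the function names, and the 25% limit are hypothetical examples; the actual limit should come from method development data.

```python
import statistics

def normalized_ratios(control_signals, reference_signals):
    """Divide each control response by its paired reference response."""
    return [c / r for c, r in zip(control_signals, reference_signals)]

def percent_cv(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Hypothetical raw signals for six independent LPC determinations and the
# paired drug control responses (indirect assay format).
lpc_signals = [5200.0, 4750.0, 5670.0, 5100.0, 4655.0, 5565.0]
drug_controls = [10000.0, 9500.0, 10500.0, 10000.0, 9500.0, 10500.0]

ratios = normalized_ratios(lpc_signals, drug_controls)
cv = percent_cv(ratios)

# Example acceptance limit only; not a recommendation for any specific assay.
passes = cv <= 25.0
print(f"%CV of normalized LPC = {cv:.1f}; within limit: {passes}")
```

The same two functions can be applied to the ratios collected across runs to estimate inter-assay precision, although a variance-component analysis may be preferred when intra-run and inter-run contributions need to be separated.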

Variance for the negative control raw signal is expected to be higher than that obtained for the normalized control values (5). Depending on the use of the negative control in the assay (i.e., in a direct assay format, where it is used to determine the assay CP), the acceptance criteria for the negative control precision can be set higher (e.g., 30–40%); otherwise, the precision of the negative control may simply be reported in the validation report.

Positive Control Selection

Similar to ADA assays, NAb assays utilize surrogate positive controls intended to represent study samples as closely as possible; however, a surrogate PC may not accurately represent all study samples given the diversity of the immune response across a population. This section puts forth recommendations for selecting an appropriate PC. It is not an exhaustive collection of all possible approaches, and we recommend providing scientific justification as needed.

When possible, multiple candidate antibodies should be screened to choose an appropriate PC for the NAb assay. The following parameters should be evaluated prior to choosing the PC: (1) NAb activity and dynamic range; (2) sensitivity; (3) drug tolerance; (4) the ability to withstand sample processing treatments (e.g., acid dissociation). Drug tolerance and sample processing are often inter-dependent and, hence, should be evaluated in tandem. It is a common practice to evaluate whether the PC used in the ADA assay could also be a good candidate for the NAb assay. Depending on the PC generation strategy and the percentage of neutralizing antibodies in the total population, the ADA assay’s PC could potentially be a suitable NAb PC.

When several candidates are being evaluated for PC selection, it might be valuable to outline the decision process in the validation report capturing salient details. For complex cases, we have added Table III as an example of how this data might be documented for retrieval if requested by health authorities. This could include the candidate PCs evaluated, assay performances, sensitivities, and the final PC selection. Additional information can be provided at the sponsors’ discretion to provide greater assay context.

Table III Positive Control Selection

As with ADA assays, both polyclonal antibodies (pAb) and monoclonal antibodies (mAb) are suitable for use in NAb assays (16) provided they have sufficient neutralizing properties. For long-term studies, it may not be feasible to maintain lot-to-lot consistency for a pAb, especially when additional immunizations are needed. In such cases, a pAb and a mAb PC may be used during validation with the mAb PC being used for continued assay monitoring. It is recommended that PC sensitivity is < 1 μg/mL; however, NAb assay sensitivity depends on the assay format and is balanced against other critical criteria such as drug, target, and matrix tolerance as discussed in subsequent sections of this manuscript. If the selected PC cannot achieve this sensitivity and an alternative option is not available, we recommend outlining the optimization efforts made to achieve greater sensitivity and associated justification in the validation report.

Negative Control Selection

It has become a standard practice to use a CP factor in NAb detection by normalizing (e.g., dividing or multiplying) the signal of subject sample by the mean or median negative control (NC) signal from the same plate. The advantage of this normalization is to account for the potential drift in the assay signal across the assay plates and runs. Therefore, the NC selection becomes crucial to establishing assay CP, characterizing the NAb assays and subsequent sample analyses.
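The plate-level normalization described above can be sketched as follows. This is a minimal example, assuming normalization by division and using the median NC signal; the plate signals and function name are hypothetical. It illustrates how two plates with different raw signal levels can yield the same normalized result for a sample with the same relative response.

```python
import statistics

def normalize_to_nc(sample_signal, nc_signals, use_median=True):
    """Divide a sample signal by the median (or mean) NC signal of its plate."""
    if use_median:
        nc = statistics.median(nc_signals)
    else:
        nc = statistics.mean(nc_signals)
    return sample_signal / nc

# Hypothetical NC replicates on two plates; plate B reads about 20% higher.
plate_a_nc = [1000.0, 1040.0, 980.0]
plate_b_nc = [1200.0, 1250.0, 1180.0]

# The same sample read on each plate gives different raw signals,
# but the normalized ratios agree once plate drift is divided out.
ratio_a = normalize_to_nc(700.0, plate_a_nc)
ratio_b = normalize_to_nc(840.0, plate_b_nc)
print(f"plate A ratio = {ratio_a:.2f}, plate B ratio = {ratio_b:.2f}")
```

Whether a ratio above or below the CP indicates a NAb-positive sample depends on the assay format, so the comparison against the CP is deliberately left out of this sketch.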

Regulatory agencies recommend using the same matrix as the samples to be analyzed and making an NC pool from treatment-naïve subjects. Given that sufficient volume of pre-dose samples from targeted populations is usually unavailable, and consistency over time is desired, a bulk pool of matrix from healthy individuals is, in most cases, appropriate to serve as the NC, reducing the need to bridge in a new NC and/or re-establish assay criteria that are dependent on the NC. To minimize the possibility of introducing matrix components in the NC pool that may interfere with the NAb assay, individual samples are screened in the assay prior to selecting those to be included in the NC pool. In addition, various strategies have been applied to minimize matrix interference. For instance, IgG-depleted serum has been used when there is high prevalence of pre-existing antibodies. Heat inactivation of serum and plasma samples can be used to eliminate interference on NAb detection from non-specific blood factors, such as complement or heat-labile serum factors. During validation, the precision of the NC is characterized and NC acceptance criteria, such as precision and the raw instrument value range, are established for in-study sample analysis.

Under ideal circumstances, the NC pool is prepared in sufficient volume to perform validation and study sample analysis to reduce the requirements for bridging in a new NC and potentially having to re-establish assay criteria that are dependent on the NC. In cases where a new NC pool is needed, preparation of the new NC pool should be done before exhaustion of the prior NC pool to allow for direct comparison during bridging experiments.

Any new NC pool that was not used in validation and in setting applicable assay criteria should be carefully screened to demonstrate that the performance of the new NC meets the criteria established during the assay validation. Preferably, the new NC pool can be qualified by demonstrating comparability of the responses for multiple replicates of the prior and newly introduced NC pool run on the same plate. For example, prior and new NC pools can be run along with LPC and HPC spiked in both pools, with multiple replicates on each plate, over multiple plates, by at least two analysts over at least 2 days. If the prior NC is not available, the distribution of signal for the newly introduced NC pool, run on several plates over at least 2 days, can be compared with the historical data to ensure they cover a similar range. To account for any response variance in the NC pool, it is recommended that PCs be prepared in the same pool as the negative control. When study populations change, the NC should be evaluated for its validity against the in-study samples.

Selectivity/Specificity

It is important to understand the selectivity and specificity of the NAb assay irrespective of whether an LBA or cell-based assay format is being used. For a bioanalytical assay, selectivity is the ability of the assay to measure only the analyte of interest, despite the presence of interfering components in the sample matrix; and specificity is the ability of a method to exclusively detect the target analyte, in this case the NAb molecule. If selectivity and specificity are not established, assay results may reflect interfering or non-target components rather than NAb.

Selectivity

Interfering components may result in false results or in the complete inability of the assay to detect NAb. Therefore, key assessments in testing the NAb assay selectivity during validation are as follows:

  • Matrix interference

  • Large molecule concomitant medications/co-medications

  • Cell lines, as they may be responsive to multiple stimuli other than the therapeutic under evaluation

Matrix Interference

NAb assays may be more susceptible to matrix interference than ADA assays. Matrix interference can potentially result in false positives or false negatives depending on the assay format. Cell-based assays are particularly susceptible because they are more complex, with various confounding factors that may result in assay interference. Both cell-based assays and LBAs are typically run as screening assays, providing qualitative results where samples classify as either positive or negative. A confirmatory tier can be used to eliminate false positives but is not expected by regulatory agencies. NAb assays are often less sensitive than ADA assays and rarely have a high MRD. As such, sample dilution is not a workable mitigation strategy for overcoming interference in NAb assays. Sample pre-treatment procedures, such as (1) acid dissociation, (2) acid-capture-elution (ACE), (3) solid-phase extraction with acid dissociation (SPEAD), (4) biotin drug extraction with acid dissociation (30, 31), and (5) precipitation and acid dissociation (PandA) (32), have become common practice in current ADA assays to overcome drug interference. In addition to removing drug, the sample pre-treatment steps also remove interfering matrix components, significantly reducing matrix interference. Sample pre-treatments to remove drug may be used in NAb assays. However, if such sample pre-treatments are incorporated in NAb assays, one should ensure that NAb reactivity is preserved after the sample manipulation. Approaches could also include melon gel to remove interfering substances for non-monoclonal antibody therapeutics or the use of protein A/G to purify the ADA from the matrix (4). Because NAb assays are susceptible to interference and sample dilution has limited ability to overcome the impact, we recommend evaluation of NAb assay interference during early development with assessment of anticipated disease populations (6, 7).

Disease State Matrix

During assay validation, selectivity is evaluated in at least 10 individual samples from the target matrix population, i.e., donors with the relevant disease states and in certain situations normal healthy individuals. Matrix samples, unspiked and spiked with the positive control antibody, should be examined in the assay and may be analyzed in parallel. In situations where sample volume is limiting, the unspiked readout may be substituted from the CP evaluation. The spiked samples should include at least a spike of the LPC. Additional concentrations, at the MPC or HPC levels, could also be included in selectivity assessment if indicated. The recommended acceptance criteria for unspiked samples include categorization as negative for eight out of ten samples. Spiked samples should be categorized as positive for eight out of ten samples. The selectivity results are summarized in Table II including all applicable patient populations. If selectivity tested at the LPC level fails to pass method acceptance criteria, additional levels of PC may be evaluated, and the results discussed and justified in the validation report. Additional selectivity testing may be indicated when a new disease state is introduced if not already included in the original validation.
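The selectivity acceptance rule above (eight of ten unspiked samples negative, eight of ten LPC-spiked samples positive) can be expressed as a simple tally. The sketch below is illustrative only; the function name and the example classifications are hypothetical, and the classification of each sample against the CP is assumed to have been done beforehand.

```python
def selectivity_passes(unspiked_positive, spiked_positive, required=8):
    """Evaluate selectivity from per-donor classifications.

    unspiked_positive / spiked_positive: lists of bool, True meaning the
    sample classified as NAb-positive. `required` is the minimum number of
    correctly classified samples in each set (8 of 10 in this example).
    """
    n_unspiked_negative = sum(1 for p in unspiked_positive if not p)
    n_spiked_positive = sum(1 for p in spiked_positive if p)
    return {
        "unspiked_ok": n_unspiked_negative >= required,
        "spiked_ok": n_spiked_positive >= required,
    }

# Hypothetical results for ten disease-state donors.
unspiked = [False] * 9 + [True]        # nine negative, one positive
spiked_lpc = [True] * 8 + [False] * 2  # eight positive, two negative

result = selectivity_passes(unspiked, spiked_lpc, required=8)
print(result)
```

With `required=4` and five samples per set, the same tally applies to an assessment that uses a four-out-of-five rule.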

Hemolysis/Lipemic/Bilirubin Samples

In addition to disease state matrix-specific components, study patient samples may also contain high levels of hemoglobin (hemolyzed samples), lipids (lipemic samples), and/or bilirubin which may hinder NAb detection. A risk-based approach could be applied depending on the therapeutic area, disease indication, and the possible presence of these interferents in clinical samples. Evaluation of interference from hemolysis and lipemia is performed in NAb method validations. Evaluation of interference from bilirubin is suggested for relevant disease state populations (e.g., hepatitis) and should be warranted per the clinical protocol. Due to the complexity of NAb assays and limited mitigation strategies for removing matrix interference factors, evaluation of these three interference factors is recommended during validation as described below and outlined in Table II.

The impact of hemolysis, lipemia, and bilirubin can be evaluated using individual samples or pooled human matrix. Depending on the levels of free hemoglobin, lipids, and bilirubin in disease state matrix selectivity samples, these assessments may or may not provide a cumulative effect of all interfering factors. To separate out the potential effects of the disease state matrix components from the potential effects of hemolysis, lipemia, and elevated levels of bilirubin on analyte detection, healthy human matrix can be used.

Hemolysis samples could be either individual matrix samples with known levels of hemolysis or samples prepared by spiking 100% hemolyzed blood into pooled sample matrix such as that used for the NC to contain 2–3% of hemolyzed blood. Alternatively, hemolyzed samples could be prepared by spiking approximately 2–3% whole blood into individual samples or pooled matrix, which should be subjected to at least one freeze–thaw cycle. Lipemic interference can be evaluated in samples containing ≥ 300 mg/dL of triglycerides. Individual samples with these levels could be acquired or prepared by spiking lipid solutions, such as intralipid into pooled matrix. Similarly, bilirubin interference can be evaluated in individual samples containing ≥ 1.2 mg/dL of bilirubin or in samples prepared by spiking bilirubin stock into pooled matrix.

For hemolysis, lipemia, and bilirubin interference testing, samples should be evaluated unspiked and spiked, minimally at the LPC level. At least four out of five of the unspiked samples should categorize as negative and four out of five of the spiked samples should categorize as positive. If fewer than five samples (or replicates, in the case of pooled samples) are run, this decision should be justified in the validation report based on assay variability and supportive method development data. Study samples with levels of interferents greater than the limits determined to be acceptable in validation should be excluded from reported study results. Alternate levels of interferents can be examined in cases where the assay cannot tolerate the levels specified herein. For example, if 2–3% hemolyzed blood is not tolerated, then it is suggested to evaluate tolerance with lower levels of hemolyzed blood. All acceptable results should be reported in Table II and tolerance levels should be included in the method.

Concomitant Medication Interference

Cross-reactivity between components of the NAb assays and concomitant medications (co-med) could lead to interference. Potential interference should be tested if the co-med interferes with the mechanism of the drug or has the potential to modulate the assay outcome. Co-med interference can be assessed in pooled healthy matrix containing anticipated physiological concentrations of the co-med, such as Ctrough and/or Cmax levels, in unspiked and LPC samples. Additional control or co-med levels can be examined as indicated. Co-med interference should be assessed minimally in the screening assay in a single run. If the PC fails to screen positive or if the unspiked sample fails to screen negative in the presence of the co-med, these results should be reported in Table II. The level of interference should be justified in the validation report and potentially addressed in the ISI pending any clinical consequence.

Specificity

Specificity testing is recommended during validation by evaluating the impact of structurally similar but non-target binding biologics (e.g., non-specific antibodies or isotype controls) in pooled matrix, unspiked and spiked at the LPC and/or HPC. Specificity is demonstrated if the unspiked sample classifies as negative and the PC samples classify as positive in the presence of the structurally similar compound. For cell-based NAb assays, it is also possible to test for specificity using an alternative stimulus at a physiologically relevant concentration that gives an approximately equivalent signal in the assay as the drug. This could include endogenous counterparts, such as soluble receptors or cytokines, and should be selected based upon the specific therapeutic and disease indication (8).

Drug and Target Tolerance

Because NAb assays are designed to detect the presence of ADAs that inhibit drug function/activity, the presence of circulating drug in study samples can interfere with NAb detection; therefore, drug tolerance optimization is critical. The presence of physiologically relevant levels of target in study samples can also interfere with drug activity and NAb detection adding another layer of complexity. Thus, it is critical to understand the concentrations and nature of interfering substances such as drug and target in the sample to mitigate such interferences (33).

Drug Tolerance

In a typical NAb assay, a fixed amount of drug reagent will be pre-incubated with study or control samples before adding to the LBA or cell-based neutralizing bioassay. If samples contain NAbs, they will bind drug reagent and block its activity in the assay. This reduced drug activity as compared to the negative control sample implies the presence of NAb. If study samples contain circulating drug, the ability to detect NAb can be reduced, thus making it critical to understand the levels of circulating drug the assay can withstand while still adequately detecting NAb. If drug levels in a patient sample are higher than the determined drug tolerance, the additional drug may generate false positive or false negative results depending on the assay format. Thus, in cases where the drug is not completely washed out at the time of sample collection, it is necessary to understand the impact of drug levels in study samples on the NAb assay result. A thorough evaluation of drug tolerance should be performed during method validation to demonstrate the levels of circulating drug that do not interfere with NAb detection. These results should be summarized in Table II and should be discussed within the clinical context in the ISI.

Drug interference for cell-based NAb assays can be more challenging to overcome as compared to ligand binding ADA assays due to a variety of factors related to the distinct assay designs. ADA methods are typically more sensitive and can include simpler drug tolerance mitigation steps such as overnight incubation or acid dissociation without compromising suitable assay sensitivity. These types of mitigation steps can also be applied to NAb assays but have largely been less successful in providing adequate drug tolerance while preserving adequate assay sensitivity, especially in cases where drug levels exceed NAb levels. More complex mitigation systems, such as circulating drug removal, may be needed for NAb assays to achieve the desired drug tolerance and sensitivity. To minimize drug interference, it is important to select sampling time points when circulating drug levels are at their lowest possible level. For example, samples should be collected prior to dosing during the dosing phase at drug trough levels (immediately prior to the next dose), and at the end of the study after an appropriate washout or non-dosing period approximately equivalent to five half-lives after the last exposure (5).
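To illustrate why a washout of approximately five half-lives is used, the sketch below computes the fraction of drug remaining and the corresponding washout duration. It assumes simple first-order elimination, which is an assumption of this example rather than a statement about any particular biotherapeutic; the 21-day half-life is hypothetical.

```python
def fraction_remaining(n_half_lives):
    """Fraction of drug remaining after n half-lives (first-order elimination)."""
    return 0.5 ** n_half_lives

def washout_days(half_life_days, n_half_lives=5):
    """Washout duration in days for a given half-life."""
    return half_life_days * n_half_lives

# Hypothetical half-life of 21 days.
remaining = fraction_remaining(5)
days = washout_days(21.0)
print(f"after {days:.0f} days, about {remaining:.1%} of the drug remains")
```

Under this assumption roughly 3% of the drug remains after five half-lives; whether that residual level falls below the validated drug tolerance still has to be checked against measured or predicted concentrations.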

As explained above, when there is a higher molar ratio of circulating drug than that of NAb, drug tolerance cannot be mitigated by applying a higher MRD, optimizing reagent concentrations, incubation times, or applying a simple acid dissociation. In these cases, more sophisticated removal techniques are used to enrich NAb and overcome the circulating drug interference, and some examples include affinity capture and bead extraction using acid dissociation (BEAD) (30, 31, 34), or heat dissociation (BEHD) (30, 31, 34), and a more recent method using PEG precipitation to get rid of free drug, followed with acid dissociation and biotin drug as assay drug (PABAD) (35). SPEAD (33, 36) and ACE (37, 38) may also be suitable for NAb assays. Disadvantages of these techniques include the potential loss of acid-labile NAb, poor NAb recovery during processing, and/or worsening of target interference. In addition, leaching of drug from either added biotinylated drug or drug present in the sample can impact assay results. Thus, it is imperative to optimize the conditions between reagent drug levels which drive sensitivity and the interference from leaching drug. It is suggested that the mildest assay conditions that still result in adequate drug tolerance be selected. Overall, the combination of an appropriate sample collection schedule, sample pre-treatment, and assay optimization can ideally result in the ability of the assay to detect NAb-positive study samples in the presence of circulating drug.

To validate drug tolerance, samples containing PC and drug are prepared in pooled matrix. Drug concentrations are selected that span the range of anticipated drug levels at the time of sample collection, frequently but not always trough PK levels. These can be based on historical or predicted PK values. After drug measurements in study samples become available, additional drug levels may need to be evaluated if observed drug levels, at the time of NAb sample collection, are higher than those tested during validation. A drug titration containing the expected concentrations in the samples is spiked into pooled matrix at the NC, LPC, MPC, and HPC. Alternatively, a serial dilution of PC concentrations can be used to determine the assay sensitivity in the presence of drug. PC and drug concentration should be reported in mass units in undiluted matrix. All tested concentrations should be included in Table II and described in the validation report or associated amendment as they pertain to actual study data once it is available. It is very important that the maximum drug concentrations expected to be present in NAb study samples be tested and described in the text of the validation report and discussed in relation to the validated drug tolerance result. Failure to do so will likely result in a query during regulatory review.

The highest concentration of drug tested that produces a positive result in the screening assay at a given PC concentration is deemed as the drug tolerance limit. Alternatively, the drug tolerance limit can be interpolated from the two drug levels on the drug titration curve that produce values immediately below and above the screening or confirmatory assay CP, similar to the calculation for relative sensitivity for the PC. A single validation run is sufficient if drug tolerance is determined to be sufficient to detect NAbs in study samples at expected drug concentrations or if pre-validation data are consistent with validation data. If drug tolerance results are variable or close to method requirements, additional validation runs (i.e., at least 3) may be valuable to ensure reporting of reliable results. If multiple drug tolerance runs are performed, the median tolerated drug concentration should be reported in the validation summary table (Table II), recognizing that this is an approximation and that the assay drug tolerance may sometimes fall above or below this value. If multiple PC antibodies are used as PC controls for the detection of NAb against various functional domains, the limits of drug tolerance shall be reported for all PC antibodies used in the assay.
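Both ways of deriving the drug tolerance limit described above can be sketched as follows. The example assumes a normalized response (sample/NC) in which a value below the CP classifies as positive, and it interpolates linearly on the log10 drug concentration; both choices are assumptions of this sketch, as are the function names and all numerical values.

```python
import math
import statistics

def highest_tolerated(drug_levels, positive_flags):
    """Highest tested drug level at which the PC sample still screens positive."""
    tolerated = [d for d, pos in zip(drug_levels, positive_flags) if pos]
    return max(tolerated) if tolerated else None

def interpolated_tolerance(drug_levels, responses, cut_point):
    """Drug level at which the response crosses the cut point.

    Uses the two adjacent drug levels whose responses bracket the cut point
    and interpolates linearly on log10(drug). drug_levels must be sorted in
    ascending order and be greater than zero. Returns None if no crossing.
    """
    points = list(zip(drug_levels, responses))
    for (d_lo, r_lo), (d_hi, r_hi) in zip(points, points[1:]):
        brackets = (r_lo - cut_point) * (r_hi - cut_point) <= 0 and r_lo != r_hi
        if brackets:
            frac = (cut_point - r_lo) / (r_hi - r_lo)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None

# Hypothetical drug titration (ug/mL) at the LPC and normalized responses.
drug = [1.0, 10.0, 100.0]
lpc_ratio = [0.60, 0.70, 0.90]
cp = 0.80  # hypothetical screening CP; ratio below CP = positive (assumed)

flags = [r < cp for r in lpc_ratio]
limit_tested = highest_tolerated(drug, flags)
limit_interp = interpolated_tolerance(drug, lpc_ratio, cp)

# If several runs are performed, the median tolerated level is reported.
median_limit = statistics.median([10.0, 100.0, 10.0])
print(limit_tested, round(limit_interp, 1), median_limit)
```

The interpolated value depends on the scale chosen for interpolation, so the approach used should be stated alongside the reported limit.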

The drug tolerance samples should be incubated for approximately 1 h (if the assay format allows, an overnight incubation is ideal) to allow the formation of complexes and may be frozen prior to analysis to better represent study sample conditions. The re-evaluation of drug tolerance is recommended when there is a change of assay CP (e.g., new indication or new lot of critical reagent). In such cases, it is advisable to re-calculate the drug tolerance level with existing validation data by applying the new CP.

Target Tolerance

The target of a biotherapeutic drug, when in its soluble form (e.g., ligand, soluble receptor, proteolytic fragment of whole protein), may cause interference in either cell-based or non-cell-based NAb assays (33). This section will briefly discuss the potential impact target may have on NAb assay results and mitigation strategies with specific focus on how to assess and report target tolerance during the assay validation.

For most NAb assays, the presence of soluble target in the matrix causes false positive results. Even though it is less common, target can cause false negative results, depending on NAb assay format (39). Table IV summarizes different NAb assay formats and potential impact (i.e., false positive or negative results) of target on assay results. The drug concentration, ADA titers, and the affinity/avidity of ADA responses may further contribute to the level of target interference. Erroneous NAb results caused by target interference can also impact the accuracy of assay CP assessment. For these reasons, the impact of the target on the NAb assays should be thoroughly evaluated during method development.

Table IV Examples of the Impact of Target Interference for Different NAb Assay Formats

Soluble target concentration can be affected by factors such as the disease biology, genetic regulation, proteolytic activity, and/or drug mechanism of action. Target concentrations may change in disease state matrix when compared to matrix from healthy donors. For example, the soluble form of B-cell maturation antigen (BCMA) is elevated in serum samples from multiple myeloma patients (40). It has also been observed that total target concentration can go up after drug administration (41,42,43). In addition, strategies to mitigate drug interference (e.g., sample pre-treatment with acid) may release a higher amount of free target into the sample, leading to an increased risk of target interference in the assay. Therefore, the NAb assay should be optimized in order to achieve an adequate level of target tolerance and generate reliable assay data.

It is often the case that reliable values for soluble target concentration in the sample matrix may not be available when the NAb assay method development starts. While the target concentration reported in the literature could be a useful starting point, there are cases where the published values may not be reliably estimated and therefore misleading. To assess the adequacy of assay target tolerance, it is important to accurately estimate the concentrations of soluble target present in the sample matrix of disease population. It is challenging to reliably measure free target because analytical variables such as antibody pairs used in the assay, sample dilution, or incubation time can influence the binding equilibrium between target and drug, resulting in inaccurate quantitation of free target. To accurately assess target concentrations in matrix, we recommend using a qualified or validated pharmacodynamic (PD) assay to measure free and/or total target concentrations in disease state and healthy donor matrix. Furthermore, PD data from clinical studies can also be used to help determine the optimal level of target tolerance expected for the NAb assay.

As with ADA assays, different approaches have been implemented to improve target tolerance for NAb assays. Acid treatment of matrix samples can alter protein conformation and inactivate soluble target in the serum to reduce its assay interference. Multiple acids combined with alkaline buffers may be evaluated to identify the most effective one to disrupt the target protein and mitigate its interference without affecting NAb activity. It is important to note that acid treatment also dissociates drug-target complexes and releases excess soluble target protein into the sample matrix, generating additional interference for the NAb assay. In this scenario, the accumulated soluble target could be removed using a biotinylated anti-target antibody conjugated to a solid-phase surface or blocked by adding an anti-target antibody directly to matrix samples. It is important to mention that this target-blocking approach may not be suitable for cell-based assays or homogeneous competitive ligand binding assays since the anti-target antibody would interfere with the readout of these assays.

Target tolerance is tested during validation at a range of clinically relevant levels, typically the highest level of soluble target (free or total, depending on the context and feasibility as discussed above) post treatment, in the study disease population(s). The recombinant target used for the assay should be close to the physiological form. Soluble targets that are not fused to a framework such as Fc, or that are produced from mammalian cell lines, are recommended for a more relevant assessment of target tolerance level. The Fc fusion may increase the target stability during conduct of the assay, potentially leading to an under-estimation of target tolerance. In cases where target interference leads to false positive results, a target titration curve can be done in the NC pool. If target interference results in false negatives, the target tolerance limit would be evaluated using the target titration in NC pool at LPC or other PC levels (e.g., 100, 250, and 500 ng/mL). The target tolerance limit would typically be reported as the target concentration at or above the CP. The target tolerance should be evaluated in a minimum of one independent run, and the results are used to assess the target tolerance limit. Estimates of target tolerance from method development may indicate whether additional validation runs are needed to obtain an estimate within the context of assay variability. The target tolerance level should be higher than the anticipated physiological target concentrations in samples. The target tolerance limit, along with the expected physiological target concentration in the respective population (if applicable), is to be listed in the validation summary table (Table II). PC and/or target tolerance limit should be based on mass units in undiluted matrix.

During clinical development, a new disease population or an alternative drug dosing scheme may result in an elevated target concentration. In this situation, as a part of assay life cycle management, target tolerance re-assessment and assay re-optimization may be indicated to achieve a higher target tolerance and accurate NAb assessment.

Sample Stability

It is critical to maintain appropriate chain of custody, storage, and handling of clinical samples to ensure sample integrity during the bioanalysis timeframe (2). To ensure the continued integrity of NAbs in collected samples, handling processes should be assessed during assay validation. Since it is not practical to use a subset of clinical study samples to evaluate sample stability, positive controls may serve as a surrogate for the clinical study sample. Including a negative control sample (matrix only) is optional and should be based on development data. Freshly thawed “Time 0” samples or freshly thawed plate controls (LPC and HPC) are included in each plate.

Sample stability assessments for NAb assays include freeze–thaw and short-term (both bench-top and 2 to 8 °C) evaluations. Long-term stability testing at −20 °C and below is not recommended in the 2019 FDA immunogenicity testing guidance, a position supported by multiple publications (44, 45). The goal of bench-top stability assessment is to demonstrate that samples are stable when left at room temperature beyond the duration of expected sample preparation time. It is generally recommended that three independent aliquots of the LPC and HPC, thawed at room temperature and maintained on the bench for up to 24 h, are tested. The goal for 2 to 8 °C stability is to assess the retention of neutralizing antibodies in the sample when thawed and then stored at refrigerated conditions. Three independent aliquots of LPC and HPC samples stored for 24 to 72 h at 2 to 8 °C should be tested.

Freeze–thaw (F/T) stability assessments provide information for the number of F/T cycles that a sample aliquot may undergo and still retain sample integrity specific to neutralizing antibodies. The targeted maximum number of F/T cycles shall be based on the number of sample aliquots available at each time point and the NAb assay testing strategy (such as sample repeats, titer evaluation, other characterization works). Three independent aliquots of the LPC and HPC are frozen and thawed with initial freeze step greater than 24 h and all subsequent freeze steps at least 12 h between cycles. The recommended practice is to analyze F/T samples at the targeted maximum number of cycles. If results from this assessment do not meet stability assessment criteria, samples with fewer F/T cycles may be analyzed.

The acceptance criteria used for sample stability can vary, but it is important that these criteria are defined in the validation plan. Commonly, acceptable stability assessments are expected to have 2 of 3 stability samples that produce the expected result (positive or negative for NAb). When a NAb titer tier is indicated, all stressed LPC and HPC samples may be compared to freshly thawed “Time 0” samples. Additional criteria may be adopted (e.g., within the acceptance range set in the validation plan) to account for semi-quantitative nature of titer tier analysis.
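As an illustration of the "2 of 3" criterion described above, the following minimal Python sketch counts how many stressed aliquots return the expected NAb call. The function name `stability_passes`, the string labels, and the example results are hypothetical and are not taken from any guidance; the actual criterion should be whatever is predefined in the validation plan.

```python
from typing import Sequence


def stability_passes(results: Sequence[str], expected: str,
                     min_concordant: int = 2) -> bool:
    """Return True when at least `min_concordant` stressed aliquots
    produce the expected qualitative call (e.g., "positive")."""
    concordant = sum(1 for call in results if call == expected)
    return concordant >= min_concordant


# Hypothetical calls for three independent LPC and HPC aliquots
# after the targeted maximum number of freeze-thaw cycles.
lpc_calls = ["positive", "positive", "negative"]
hpc_calls = ["positive", "positive", "positive"]

print(stability_passes(lpc_calls, "positive"))  # True (2 of 3 concordant)
print(stability_passes(hpc_calls, "positive"))  # True (3 of 3 concordant)
```

The threshold is exposed as a parameter so that a stricter criterion (e.g., 3 of 3) can be applied if the validation plan calls for it.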

Assay Robustness

The FDA guidance for immunogenicity testing of therapeutic protein products recommends the assessment of assay robustness to predict the reliability of the assay when used to analyze study samples. Negligible change of control sample responses, when specific steps of the method are varied, is the most common measure of a robust method.

The guidance recommends that the sponsor monitors robustness during the method development phase, and if small changes during specific steps of the assay impact results, precautions should be taken to control that step. Targeted minimum and maximum ranges for these method critical parameters should be identified and assessed during the method validation to ensure that the allowed variance has minimal impact on sample responses. Control responses from assays that test the established variance range are expected to meet assay acceptance criteria, confirming consistent method performance.

The use of Design of Experiment (DoE) as a systematic approach in the method development phase of assay optimization has been recommended (46). Results from DoE experiments may lead to the identification of key parameters that should be evaluated in method validation. Once the method has been finalized after method validation is complete, it is recommended that the assay conditions are not altered so that the assay performance remains consistent. Incubation temperature is one example of an assay parameter that may be optimized with DoE. Once the incubation temperature is assessed for each individual step of the assay, the nominal temperature should be documented in the method, along with a range of variance that is consistent with the qualified range of the incubator (e.g., 37 °C should be recorded as nominal temperature for 35 °C to 39 °C, per the incubator performance specifications).

The following robustness parameters are recommended for evaluation during method development/validation: equipment, incubation, plate position effects, critical reagents, and cell performance. Results should be summarized in Table II.

Cell Performance

In cell-based assays, seeding density, specified as the number of viable cells per unit of volume while in suspension, has a direct impact on the assay response. Therefore, an upper and lower limit should be assessed according to the method and documented when using continuous culture methods (e.g., 4 × 10⁵ to 8 × 10⁵ cells/mL). The percentage of live, healthy cells within the population (e.g., ≥ 90%) should be assessed during method robustness evaluations and documented as cell viability.

Because some cell lines do not maintain their characteristics indefinitely in continuous culture, resulting in control sample response drift outside of acceptable limits, it is recommended to specify a passage limit during method development and confirm this limit as part of method validation. Additional options are to record the “days in culture” ± number of days, instead of using a cell passage range for both the master cell bank (MCB) and working cell bank (WCB).

Critical Reagents

Critical reagents should be defined in the method. A partial validation or one or more qualification runs are recommended to assess the performance of new critical reagent lots used during study sample analysis. The use of multiple lots of non-critical reagents should be documented as an assay robustness parameter (47).

Incubation

The allowed variance for incubation times during the assay (e.g., coating or blocking of assay plate, signal development) should also be evaluated as a robustness parameter during method validation. Also, considering practical limits for a given step (e.g., 60 ± 5 min), combining different steps in the same robustness test is acceptable (e.g., all minimum or all maximum incubation times).

Equipment

Details for each instrument that is used to perform the assay should be documented (e.g., instrument ID, serial number, next inspection). This includes incubators, plate shakers, plate washers, and plate readers, as well as any liquid handlers (e.g., pipette or automation systems). If multiple instruments (of the same type) are used during the validation, these should be indicated.

Plate Effect

While the number of replicate wells should be determined during method development, plate effect or plate homogeneity can be evaluated as an assay robustness parameter during method development and/or validation. Assessment of intra-assay precision during method validation may also provide context around plate positional variability. It is recommended that a single sample prepared with negative control, drug, or positive control (dependent on assay format) from a single preparation be tested across a full plate. Signal readout from all wells in the assay plate should be used to calculate %CV. For acceptable assay performance, the recommended %CV should generally be ≤ 30%. The complexity of the assay format should be considered when setting a target on %CV. Alternate approaches, as predefined in the validation plan (e.g., assessment of %CV, minimum/maximum values from a combination of rows and columns or concentric circles), are also acceptable.
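The whole-plate %CV calculation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the sample standard deviation is used and that the plate is supplied as a list of rows; the function names and the small example plate are hypothetical, and the 30% limit is simply the general recommendation quoted in the text.

```python
from statistics import mean, stdev
from typing import List, Sequence


def percent_cv(signals: Sequence[float]) -> float:
    """Percent coefficient of variation: 100 * sample SD / mean."""
    if len(signals) < 2:
        raise ValueError("At least two wells are needed to compute %CV.")
    average = mean(signals)
    if average == 0:
        raise ValueError("Mean signal is zero; %CV is undefined.")
    return 100.0 * stdev(signals) / average


def plate_percent_cv(plate: List[List[float]]) -> float:
    """Flatten a plate given as rows of well signals and compute %CV."""
    wells = [signal for row in plate for signal in row]
    return percent_cv(wells)


def plate_is_homogeneous(plate: List[List[float]], limit: float = 30.0) -> bool:
    """Compare whole-plate %CV with the predefined limit (assumed 30%)."""
    return plate_percent_cv(plate) <= limit


# Hypothetical 2 x 3 excerpt of a plate; a real assessment would use all wells.
example_plate = [[90.0, 100.0, 110.0],
                 [95.0, 100.0, 105.0]]
print(round(plate_percent_cv(example_plate), 1))
print(plate_is_homogeneous(example_plate))
```

The same `percent_cv` helper could be applied to individual rows, columns, or concentric rings if one of the alternate approaches is predefined in the validation plan.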

Partial Validation

The extent of validation depends on the stage of product development and the risks of consequences of immunogenicity to subjects associated with the therapeutic protein product. Per the 2019 FDA Guidance (5), a partial validation involving assessments of assay sensitivity, specificity, precision, cut point, and drug tolerance may be adequate for the earlier stages of clinical development, whereas for high-risk products, full validation before any clinical studies may be indicated. While NAb assays are expected to be fully validated and implemented at the time of phase III pivotal studies, the timing of NAb implementation should be based upon the overall immunogenicity risk assessment and assay strategy (3). There are cases when this analysis is indicated earlier in product development, such as in the case of use for patient stratification and enrollment or to better understand neutralizing antibody effects on exposure, safety, and efficacy that can inform the course of drug development. In addition to the text in the FDA guidance, it is common to perform partial validations when a variable in the assay is modified, for example, when a new disease state is introduced, a critical reagent is changed, or the assay is optimized for better performance. In these cases, partial validations would include a subset of validation experiments to characterize any changes to the method performance and re-set any associated acceptance criteria. It should be noted that the regulatory guidances addressing immunogenicity (3,4,5,6,7, 20) cross-reference numerous associated guidances (48,49,50,51), and scientific judgement should be used when applying principles across guidances.

Cross-validation

Current bioanalytical method validation guidance addresses the performance of and rationale for cross-validation of PK (48,49,50,51) and immunogenicity assays (5). The FDA’s Immunogenicity Guidance (2019) equates the term reproducibility to the term cross-validation and states that it is needed when more than one laboratory will be used to assess samples. Reproducibility testing is meant to establish the comparability of the data produced by each laboratory and includes sensitivity, drug tolerance, and precision assessments in each laboratory. In addition to lab-to-lab comparability testing, comparability testing may be done when method or platform changes occur. Per the ICH M10 guidance, cross-validation is needed to demonstrate how the reported data are related when multiple bioanalytical methods and/or multiple bioanalytical laboratories are involved. If we take these two concepts together, there could be two scenarios:

  1. The validation data from each lab are deemed comparable, i.e., within limits set based upon scientific justification, and may be combined to support special dosing regimens, or regulatory decisions regarding safety, efficacy, and labeling.

  2. The validation data from each lab are not comparable but the relationship between the two methods/labs has been established through cross-validation experiments. In this case, full validations in each lab are needed to support the clinical data generated in each lab, and the data across labs may be compared to support special dosing regimens, or regulatory decisions regarding safety, efficacy, and labeling.

These scenarios are supported by the principles set forth in ICH M10 and generally apply to NAb assays. We recommend consultation with health authorities if there are special considerations not covered in the current guidance.

Here, we aim to provide practical examples on how comparability may be assessed for a validated NAb assay. Before reproducibility (cross-validation) assessments can commence, the assay will have to be transferred to the second laboratory. Variables will include analysts, standard lab supplies, potentially equipment specifications, and newly prepared reagents including conjugations of critical reagents, blank matrix, and controls. Ideally, a method is transferred to a secondary laboratory with the original reagents to reduce differences in performance. Optimization of the method may be needed in the secondary lab to further reduce lab-to-lab differences. If the method is fully validated in the originating lab, then a partial validation of the transferred method in the secondary laboratory is used to establish the comparability of precision, sensitivity, and drug tolerance between labs (FDA 2019 guidance (5)). To effectively establish these parameters, a lab-specific CP and associated acceptance criteria are frequently adopted. Additional parameters, such as selectivity and stability, may not be repeated given the known performance of the molecule in the matrix as reported in the originating lab’s validation report. If a new CP is established, depending on the degree of difference of the newly established CP, a second laboratory may choose to re-evaluate selectivity, similar to the approach taken for disease-specific CPs. When sensitivity, drug tolerance, and precision are deemed to be comparable, it is likely that the positivity rate will also be comparable across laboratories. Several publications discuss comparability assessments for immunogenicity assays and can provide further reference (52,53,54,55,56). Herein, we propose a practical approach for the comparability assessment of NAb assays.

Controls

Controls for a NAb cross-validation may minimally include negative control (NC), ligand control (LC), and/or drug control (DC), positive controls (PC), and potentially patient samples. Ideally, during a cross-validation, the same source reagents and samples are analyzed across laboratories, including but not limited to the NC, LC, DC, and PCs, to reduce variables. When it is not possible to share reagents and samples across laboratories, for example, if there are distribution or shipping constraints such as those commonly encountered in some Asia-Pacific regions, the secondary lab should assess the new NC, LC, DC, and PCs for similar performance to the original lab. If marked differences in sensitivity are noted and optimization of critical reagents has been performed to attempt to bring the methods into alignment, the concentrations of controls may be adjusted as needed. Performance may be evaluated by raw response limits or ratio limits (LC/DC, DC/NC, PC/DC). In addition to controls, study samples or spiked surrogate samples, representative of study samples, may also be evaluated in cross-validations. If clinical study samples are to be used in cross-validation, the sponsor should ensure the appropriate informed consent is in place to allow this evaluation.
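One way to picture the ratio-limit evaluation mentioned above is the following Python sketch, which computes the LC/DC, DC/NC, and PC/DC ratios from control responses and checks each against a range. All signal values and limits shown are hypothetical placeholders; which ratios are meaningful, and in which direction, depends on the assay format, and the limits themselves would come from the originating lab's validation data.

```python
from typing import Dict, Tuple


def control_ratios(nc: float, lc: float, dc: float, pc: float) -> Dict[str, float]:
    """Compute the control response ratios named in the text."""
    return {"LC/DC": lc / dc, "DC/NC": dc / nc, "PC/DC": pc / dc}


def ratios_within_limits(ratios: Dict[str, float],
                         limits: Dict[str, Tuple[float, float]]) -> Dict[str, bool]:
    """Flag each ratio as inside (True) or outside (False) its [low, high] range."""
    return {name: limits[name][0] <= value <= limits[name][1]
            for name, value in ratios.items()}


# Hypothetical mean control responses observed in the secondary lab.
observed = control_ratios(nc=100.0, lc=2000.0, dc=200.0, pc=1500.0)

# Hypothetical ratio limits derived from the originating lab's validation.
assumed_limits = {"LC/DC": (5.0, 20.0), "DC/NC": (1.5, 4.0), "PC/DC": (4.0, 9.0)}

print(observed)
print(ratios_within_limits(observed, assumed_limits))
```

Ratios rather than raw responses can be convenient across laboratories because they are less sensitive to instrument-specific signal scaling, although raw response limits remain an equally acceptable option per the text.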

Criteria for Assay Comparability

There are no defined quantitative criteria to assess comparability for NAb assays; thus, the sponsor should use scientific justification in the discussion of the reproducibility results, specifically addressing similarities and differences in precision, sensitivity, and drug tolerance and how these may impact clinical results across studies, methods, or labs. When using study samples to assess comparability, criteria could also include a positive/negative rate where, for example, ≥ 80% of the samples, at low and high levels as appropriate, had concordant results. If enough data exist to do so, it may also be suitable to discuss the clinical relevance of any noted differences. Cross-validation results should be reported in the method validation summary table (Table II), described in the validation report or associated addendum, and should include mitigation plans as applicable to address significant differences in sensitivity, drug tolerance, or potential associated clinical results.
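The example concordance criterion above (≥ 80% of samples with matching positive/negative results) can be illustrated with a short Python sketch. The function names, the sample calls, and the default threshold are assumptions for illustration only; the actual criterion and sample set should be justified scientifically and predefined by the sponsor.

```python
from typing import Sequence


def concordance_rate(lab_a: Sequence[str], lab_b: Sequence[str]) -> float:
    """Percentage of paired samples with the same qualitative call in both labs."""
    if len(lab_a) != len(lab_b) or len(lab_a) == 0:
        raise ValueError("Both labs must report the same, non-empty set of samples.")
    matches = sum(1 for a, b in zip(lab_a, lab_b) if a == b)
    return 100.0 * matches / len(lab_a)


def is_concordant(lab_a: Sequence[str], lab_b: Sequence[str],
                  threshold: float = 80.0) -> bool:
    """Apply the example criterion: concordance at or above the threshold."""
    return concordance_rate(lab_a, lab_b) >= threshold


# Hypothetical calls for ten shared samples ("P" = NAb-positive, "N" = NAb-negative).
originating_lab = ["P", "P", "N", "N", "P", "N", "P", "N", "N", "P"]
secondary_lab = ["P", "P", "N", "N", "P", "N", "P", "N", "P", "P"]

print(concordance_rate(originating_lab, secondary_lab))  # 90.0
print(is_concordant(originating_lab, secondary_lab))     # True
```

Where samples at low and high levels are assessed, the same calculation could be run separately for each level so that discordance concentrated near the cut point is not masked by strongly positive samples.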

Discussion

NAb assays are critical to characterize the immunogenicity of biologic therapeutics and to understand its clinical relevance. Different formats (e.g., cell-based or ligand binding) can be used for a NAb assay, depending on the type of biologic and its immunogenicity risk. An immunogenicity bioanalytical strategy, which includes target performance criteria for the NAb assay, should be established before assay development, based on the molecule’s risk assessment and the specific clinical program.

As with ADA assays, it is important to consider the quasi-quantitative (if titer is reported) or qualitative (positive/negative) nature of the results, as there is no authentic calibration reference standard and surrogate positive control(s) are used to characterize the method. Thus, clinical relevance must be established through careful analysis of the results in the context of their relationship to applicable clinical endpoints.

A sound understanding of the factors that impact results and how to address them is paramount to the method development and subsequent validation of NAb assays. Herein, we provided thoughts and recommendations on how to validate NAb assays, taking into consideration important factors such as sensitivity, drug tolerance, and reproducibility. Recommendations are also given on how to document the validation results and how to approach cross-validation and bridging of critical reagents.

The recommendations provided herein will enable the preparation of a self-standing method validation report that will allow a reviewer to assess if the assay is suitable for the proposed application. It can be helpful to provide a brief summary of critical findings from the method development phase in the introductory section of the method validation report such as rationale for choice of sample pre-treatment steps, critical reagents source, method of preparation of the surrogate positive control antibody, and a list of batch numbers for all critical reagents.

Since the validation summary tables are intended to be updated throughout the progression of a clinical program, from first-in-human through filing, it may be most practical to maintain the summary tables and any associated addendum independently following the initial validation report, for inclusion in the ISI, which also includes the clinical relevance of the immunogenicity data and is submitted with each filing (57).

NAb assays for newly emerging therapeutic modalities may have unique attributes not fully addressed in this manuscript. In these cases, a science-based approach should be taken, and the associated justification should be provided. For example, in the case of therapeutics with multiple mechanisms of action such as a combination therapeutic formulation or bispecific therapeutic, an assessment should be done to determine the NAb strategy. It may be necessary to utilize separate PCs to different domains of the therapeutic and/or to develop multiple NAb assays. For select gene therapies, NAb assays against multiple subunits of the drug product such as the transgenic-expressed protein or enzyme and the delivery vector should be developed. These products are also recognized to have a relatively higher prevalence of pre-existing antibodies, which need mitigation mechanisms in establishing a suitable NC and CP (58, 59). In some cases, a NAb assay may be developed alongside the therapeutic as a companion diagnostic (CDx). This paper does not cover the regulatory requirements and recommendations associated with the development and filing of CDx, which are covered under separate regulatory guidance (60).

NAb data are used in concert with other applicable study endpoints, such as those used to understand exposure, safety, and efficacy. In selected cases, where there is a highly sensitive PD marker and/or an appropriately designed PK assay that generate(s) data that inform clinical activity, it may be possible to use these in lieu of a NAb assay. This should be taken under consideration when developing the comprehensive bioanalytical strategy.

Finally, in-study data are routinely used to supplement the limited set of validation data, and assay parameters may be modified over time, with adequate documentation. This may apply to the LPC level selected, the assay limit ranges, or CP; selectivity would also be re-assessed in new disease indications, and the measured drug and target levels could better inform tolerance needs.

Conclusions

The purpose of this article is to improve consistency, clarity, and completeness of information presented in the method validation report for NAb assays by building on experience gained to date by industry and regulators. The recommendations are intended to facilitate the processes for preparation and review of the method validation report by providing model method and validation summary tables, in conjunction with practical advice on populating the data fields.

This includes a summary of the methodological details (Table I) and defines the requisite data fields to be completed for each relevant assay performance criterion (Table II). These formats have been designed to meet current regulatory standards and industry practices. Given the breadth of assay formats, some data fields may not be relevant and can be eliminated. Similarly, there may be omissions in these summary tables that should be added to fully address specific products, NAb assay formats, and associated bioanalytical strategy. The tabular formats are intended to be updated to reflect the evolution of the method during clinical development and the post-authorization lifecycle. All relevant information should be accessible for reference by health authorities to assess under the appropriate context, in tandem with the full body of scientific and clinical data within the ISI and other pertinent submission sections.