Introduction

This workshop report provides an overview of the discussions at the Land O’Lakes Microsampling workshop organized by the American Association of Pharmaceutical Scientists (AAPS) and held as a virtual event on the 7th of July 2020. In 2009, a workshop was organized to discuss methodology, implementation, and best practices for the adoption of dried blood spot (DBS) sampling within the pharmaceutical industry (1). Over the past decade, the utility of microsampling has evolved, as demonstrated by its broader adoption for the collection of small-volume blood samples in both non-clinical and clinical studies across the drug development continuum (2,3). Additionally, significant advancements have been made with the introduction of novel and innovative blood sampling techniques and devices focused on patient-centric sampling, where a blood sample can be collected without a visit to the clinical site or healthcare provider, a need that has become especially critical during the current COVID-19 pandemic (4). Given that the drug development continuum is highly regulated and governed by numerous guidelines and standards, moving away from standard blood collection techniques and introducing novel ones requires demonstrating equivalence, or establishing correlation, between the standard and the novel approach.

The workshop was based on five topics identified by the AAPS Microsampling subgroup that were considered important for the broader adoption and advancement of microsampling during drug development. The focus was on enabling broader adoption of microsampling supporting patients’ needs, convenience, and the transformation from clinic-centric to patient-centric drug development. To enable robust discussion, voice-recorded presentations were prepared by each speaker and distributed to participants prior to the workshop. Each presentation included questions to initiate the discussion. The discussions captured the current challenges and generated resolutions and paths forward, as well as caveats. The five topics and the discussions are detailed below.

Selection of Blood Sampling Site—Does It Matter?

Blood samples are routinely collected via direct puncture to a vein (venous blood), most often located in the antecubital area of the arm or the back (top) of the hand. Venous blood is deoxygenated blood, as opposed to arterial blood, and is the specimen of choice for most routine laboratory tests and for established laboratory reference ranges. The use of venous blood is also the default source for analyzing circulating drug and metabolite concentrations during drug development. Blood collection is performed by a phlebotomist using a procedure that has been established since the introduction of vacutainers by Joseph Kleiner in 1947. Collection of large volumes of blood (several milliliters) was required to achieve the analytical sensitivity (quantification limits) attainable with the instrumentation and technologies available over the past decades. Alternatively, the collection of capillary blood is a viable option when small volumes of blood are desired (i.e., microsampling) and has been practiced for many decades for applications such as neonatal screening using blood from a heel-stick (introduced in the 1960s by Dr. Robert Guthrie) and finger-stick blood for blood glucose monitoring at home (especially following the introduction of easy-to-use home glucose meters in the 1980s).

Capillary blood is collected from the capillary bed, which consists of the smallest arteries (arterioles) and veins (venules), and is therefore a mix of venous and arterial blood. Advances in analytical technologies have greatly reduced the volumes of blood/plasma/serum needed for analysis. Current analytical methods used during drug development for the quantification of drugs, metabolites, biomarkers, etc. typically use volumes in the range of 10 to 25 μL and therefore require the collection of much smaller volumes of blood than vacutainers are designed to hold. Recent advancements in blood sampling technologies have introduced several techniques and devices for the collection of capillary blood samples from a finger stick or from the upper arm (and potentially from other parts of the body).

Data comparing venous blood versus capillary blood are sparse, most likely because the concentrations were found to be equivalent and therefore unremarkable. There are only a few published examples comparing drug concentrations in venous blood versus capillary blood (5,6,7,8,9). Other documented examples include the comparison of glucose concentrations between capillary blood and venous blood (10,11). While these examples showed that the concentrations between capillary and venous blood were comparable, exceptions include the reporting of approximately 10% higher glucose concentrations in capillary blood compared to venous blood (11), and higher concentrations of olanzapine in capillary blood compared to venous blood, which was not considered clinically meaningful (5).

Since venous blood collected into a vacutainer has been the standard practice, there is a need to understand the correlation between the concentrations in venous blood and the concentrations in the microsampled blood, especially when they represent different matrices (i.e., plasma/serum from venous blood versus microsampled blood). Establishing correlation enables the transformation of microsampled concentration data to venous concentration equivalents, or vice versa, if needed (12). Since no distinction is made between venous blood collected via standard venipuncture and blood collected via a peripherally inserted central catheter (PICC line) or a central line catheter, the question was raised as to whether the same should apply to capillary blood collected from different locations (dried blood from a finger stick versus dried blood from elsewhere on the body) or using different techniques/devices (generating the same type of sample).

Overall, there was general agreement that blood is blood, but that there was value in establishing correlation between venous blood (the gold standard), which is analyzed as plasma or serum, and microsampled capillary blood (which is analyzed as blood). However, there was no rationale for comparing capillary blood concentrations collected using the same device from different body sites (i.e., finger-stick blood versus capillary blood from a different body location) or for comparing concentrations between different microsampling devices if the devices collect the same matrix (both collecting dried blood, e.g., Mitra VAMS (volumetric absorptive microsampling) and Tasso-M20). The bioanalytical method(s) should be validated for the techniques/devices used in each study (e.g., dried blood from Mitra VAMS tips and dried blood from Tasso-M20 tips) to establish all relevant assay validation parameters, including stability in the relevant collection device. The ability to sample capillary blood from locations other than a finger stick can be extremely beneficial, especially in pediatric patients and critically ill patients, as well as from a convenience perspective in home sampling situations. Empirical evidence suggests some patients may prefer a non-finger-stick collection, if that is an option. Establishing correlation would be less important (or not needed) if the blood sampling were being conducted only for the purpose of evaluating patient compliance/adherence or for exploratory reasons (i.e., for the purposes of internal decision-making during drug development).

One of the concerns with capillary blood is that the sample may be contaminated with interstitial fluid, which may bias the measurements; however, such examples are not common. A recent clinical application comparing self-collected microsamples from a finger stick showed minimal to no dilution effect (13). It was generally agreed that “dilution” of the sample due to interstitial fluid is negligible and, if needed, could be avoided altogether by not collecting the first drop of blood (when possible).

When Do You Need a Bridging Study?

The 2018 FDA Bioanalytical Method Validation guidance addresses the use of dried blood spots, one of the microsampling technologies, and the need for conducting “correlative studies with traditional sampling during drug development” (12). The guidance also encourages sponsors to seek feedback from the appropriate FDA review division early in drug development. It should be noted that correlative studies are commonly referred to as bridging studies within the drug development vernacular. Whether a bridging study is needed could depend on various factors, as described below, and can be raised as a question during interactions with the FDA and other regulatory agencies. While it may be premature to provide guidance to accommodate all situations, the workshop was intended to highlight and discuss some common situations and provide general consensus on the need for bridging studies and demonstration of correlation. Perspectives and considerations on the adoption of microsampling and the conduct of bridging studies during clinical development were published in 2014 by the Innovation and Quality (IQ) Consortium Microsampling Working Group, and guidance on the adoption of microsampling during non-clinical toxicokinetic studies was provided in the ICH Guideline S3A Q&A in 2017 (14,15).

The objective of conducting bridging studies is to establish a correlation between two measurements, to understand the limitations of the new assay, and ensure that the drug concentration measurements support the pharmacokinetic (PK) interpretation between the two methods. The following scenarios describe and discuss typical situations encountered during drug development.

The first scenario describes a situation where different sample matrices are used in the traditional assay and the new assay. A common case is a change from a plasma assay to a dried blood assay. A frequently encountered example is the generation of PK parameters from pediatric trials, especially those involving a neonate group, where blood volumes are limited and implementing a microsampling technique such as dried blood spot (DBS) sample collection could potentially enhance patient enrollment. Adopting DBS will require establishing the blood-to-plasma ratio to accommodate the wide range of hematocrit (HCT) in neonate patients, as well as including the appropriate HCT range in the validation experiments to meet the FDA guidelines (12); a sketch of the underlying blood-to-plasma conversion appears after this paragraph. Additionally, ex vivo bridging can be demonstrated with QCs prepared in contrived matrices, where the QCs are divided to make both DBS and plasma samples. Concordant results from this test can add assurance of in vivo bridging in clinical studies. Conducting bridging studies in neonates and younger pediatric populations is extremely challenging due to blood volume limitations. Workshop discussions suggested that additional bridging studies involving neonate patients would not be needed if a bridging study had been performed previously with patient samples in older pediatric age groups or in adult patient studies. However, it was recommended to obtain a few representative bridging samples, if and when possible, from the actual study patients (at least from the older pediatric age groups) in order to demonstrate that the correlation is maintained. Early communication with the appropriate regulatory agency is critical in this situation. The microsampling process could be built into the study protocol or added via an amendment. Highlighting the microsampling approach (and requesting feedback) in the cover letter was recommended as a good way to gain direct feedback from the agency.
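As context for the blood-to-plasma transformation mentioned above, the minimal sketch below illustrates the standard pharmacokinetic relationship C_blood = C_plasma × [(1 − HCT) + HCT × K_e/p], where K_e/p is the drug-specific erythrocyte-to-plasma partition ratio. The function name and all numerical values are hypothetical illustrations, not from the workshop; the partition ratio must be determined experimentally for each drug.

```python
# Minimal sketch: plasma-equivalent concentration from a whole-blood (e.g., DBS)
# measurement, assuming C_blood = C_plasma * [(1 - HCT) + HCT * K_ep].
# Function name and values are illustrative, not from the workshop.

def plasma_equivalent(c_blood: float, hct: float, k_ep: float) -> float:
    """Convert a whole-blood concentration to its plasma equivalent."""
    blood_to_plasma_ratio = (1.0 - hct) + hct * k_ep
    return c_blood / blood_to_plasma_ratio

# Example: a DBS result of 150 ng/mL at HCT = 0.55 (high HCT, as can occur
# in neonates) with a hypothetical partition ratio K_ep = 0.8.
print(plasma_equivalent(150.0, hct=0.55, k_ep=0.8))  # ~168.5 ng/mL
```

Because neonatal HCT spans a wide range, this conversion is only as reliable as the HCT coverage of the validation experiments, which is why bracketing the appropriate HCT range is emphasized above.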

The second scenario represents a situation where different sample matrices are used but there is no need for transformation of the measured concentrations. An example is the traditional use of blood lysate assays versus DBS assays, when blood concentrations are more pharmacologically relevant. The objective here is to use DBS for late-stage clinical studies, since DBS is preferred for broader implementation across clinical sites compared to blood lysate, which may be prone to errors resulting from the dilution step when preparing the lysate. Overall, DBS is a simpler blood sampling technique that can be implemented even in resource- and infrastructure-limited sites. If DBS was already used in discovery toxicokinetic (TK) studies, bridging studies could be conducted in non-clinical GLP studies to correlate blood lysate versus DBS, and the IND filing could use TK parameters derived from DBS samples with blood lysate concentrations as supporting information. While it seems that either or both methods (blood lysate and DBS) could be used during clinical development, it is prudent to get feedback from the regulatory agency early in the drug development process. This could be done during the IND submission, seeking specific feedback about whether additional bridging is needed during clinical development. Workshop discussions concluded that there was no need to conduct bridging if the technique used in the non-clinical program was carried into clinical development, but changing from a “wet” matrix (blood lysate) to a “dry” matrix (DBS) will require bridging to establish correlation (14).

The third scenario describes a situation where the same sample matrix is used for both methods (i.e., the traditional method and the microsampling method) but the sample collection processes are different. A common example is plasma microsampling versus traditional plasma sampling. This will involve the inclusion of assay validation experiments to cover all study aspects, especially the evaluation of sample transfer, storage, and freeze-thaw cycles for the microsamples. If desired, bridging of the two techniques (venous plasma versus microsampled plasma) could be performed in a non-clinical study to demonstrate equivalency of the methodology. Additional investigations would be needed in certain situations (disease types, patient populations) to evaluate the impact of the microsampling on study endpoints, especially if they involve hematologic endpoints.

How Do You Demonstrate Concordance and Decide on Sample Size for Correlation/Concordance?

Statistical tools are employed to explore the relationship between drug concentrations derived from blood samples obtained by microsampling and by traditional sampling methods, each analyzed in an appropriate bioanalytical assay, and between the resulting pharmacokinetic parameters (12,16,17). They assist in defining ways to show that two sets of results obtained by different methods are alike, although not exactly the same, or how different they are, based on a definition of an “acceptable difference” (17).

A current practice for determining concordance/comparability is to compute the difference in concentrations for each data pair and express this difference relative to the average of the two results. The methods are considered comparable if the relative percent differences for at least 2/3 (67%) of the samples are within ± 20% for a small molecule or ± 30% for a large molecule biotherapeutic.
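As an illustration of this rule of thumb, the sketch below (a hypothetical helper of our own; the names and thresholds are configurable assumptions) computes the relative percent difference of each pair against the pair mean and checks the two-thirds criterion.

```python
import numpy as np

def passes_two_thirds_rule(test, ref, limit_pct=20.0, fraction=2/3):
    """Relative % difference of each pair vs. the pair mean; methods are
    deemed comparable if >= `fraction` of pairs fall within +/- `limit_pct`
    (20% for small molecules, 30% for large-molecule biotherapeutics)."""
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)
    rel_diff_pct = 100.0 * (test - ref) / ((test + ref) / 2.0)
    return np.mean(np.abs(rel_diff_pct) <= limit_pct) >= fraction, rel_diff_pct

# Hypothetical paired results (microsample vs. traditional sample):
ok, diffs = passes_two_thirds_rule([102, 95, 131, 88], [100, 100, 100, 100])
print(ok, np.round(diffs, 1))  # True [  2.  -5.1  26.8 -12.8] -> 3/4 within 20%
```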

Statistical approaches that provide more objective criteria for determining concordance/comparability include (1) scatterplots with the Identity Line, as well as a regression line such as Deming (18) or Passing-Bablok (19) regression; (2) Bland-Altman (B-A) scatterplots with an estimate of bias and confidence limits to assess accuracy (20); (3) a modification of the B-A plots with limits of agreement (“LoA”) for a precision assessment, along with an acceptance range for the bias limits and LoA (21); and (4) the standard bioequivalence rule (22).

In order to standardize an approach for describing relationships between results, the statistical tools employed need to be easily executable. They need to be applied to an adequate number of samples that cover the entire calibration curve range to maximize power. Lastly, in the case where it is desirable to demonstrate comparability, a decision needs to be made as to what “comparable” means quantitatively.

A scatterplot with the test (y-axis) and reference (x-axis) concentrations and the “Identity Line” (y = x; slope = 1 and intercept = 0) is the simplest way to describe the nature of the relationship between the data pairs and to assess potential bias. A more objective evaluation adds a Deming or Passing-Bablok regression line, available in some commonly used software packages. The 95% confidence intervals (CI) for the slope and intercept serve as tests of whether these parameters are consistent with their hypothesized values (1 and 0, respectively). Since these limits can be very narrow when the data pair variation is low, an alternative approach would be an “acceptance range” within which these limits should fall (e.g., the 95% CI for the slope should be within 0.90–1.10). An added value of these tools is that, in the case of proportional changes in concentrations between methods, the regression equation can be used to adjust concentrations from one method to the other.
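Deming regression is simple enough to script when a packaged implementation is not at hand. The sketch below is a minimal version (function names are our own; `lam` is the assumed ratio of y-error variance to x-error variance, with 1.0 giving an orthogonal fit) together with bootstrap confidence intervals for the slope and intercept, one common way to obtain the 95% CIs mentioned above.

```python
import numpy as np

def deming(x, y, lam=1.0):
    """Deming regression; `lam` is the assumed ratio of the y-error
    variance to the x-error variance (lam=1.0 -> orthogonal regression)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    slope = (syy - lam * sxx
             + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

def deming_bootstrap_ci(x, y, n_boot=2000, seed=0):
    """Nonparametric bootstrap 95% CIs for the Deming slope and intercept."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    reps = []
    for _ in range(n_boot):
        i = rng.integers(0, len(x), size=len(x))  # resample pairs with replacement
        reps.append(deming(x[i], y[i]))
    lo, hi = np.percentile(reps, [2.5, 97.5], axis=0)
    return {"slope_ci": (lo[0], hi[0]), "intercept_ci": (lo[1], hi[1])}
```

Per the acceptance-range idea above, one would then check, for example, whether the slope CI falls within 0.90–1.10 rather than merely whether it contains 1.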

In order to correctly apply these tests, certain theoretical assumptions need to be satisfied: while Passing-Bablok regression does not assume normality of the data distributions (Deming does), both methods assume knowledge about the underlying variance structures for the two sets of data, and both assume that the line has no curvature. A simple regression residual plot, including the B-A plot, can address concerns around curvature.

If available, the concordance correlation coefficient (CCC) is an excellent measure of agreement, looking at the relationship of the data pairs relative to the Identity Line. It is better suited than the Pearson “r”, which measures strength of linear relationship rather than agreement. Interpretation of the CCC is similar to that of the Pearson “r” since it is scaled from −1 to +1, where strong agreement is associated with values near 1. Confidence limits can be calculated to show how precisely the CCC is estimated.
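Lin's CCC is a one-line computation; the sketch below (a hypothetical helper name of our own) uses the usual moment estimator and demonstrates why agreement differs from correlation.

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient: agreement of the data
    pairs with the Identity Line (y = x), not just linear association."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

# A constant shift leaves Pearson r at 1 but lowers the CCC:
x = np.array([1.0, 2.0, 3.0, 4.0])
print(lin_ccc(x, x + 0.5))  # ~0.91: < 1 despite a perfect linear relationship
```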

The B-A plot, an adaptation of the Tukey mean-difference plot, is a way to explore the relationship between sample spread (the pairwise difference, on the y-axis) and location (the pairwise mean, on the x-axis). It is a natural companion to the Deming regression since it converts interpretation of the differences around a 45-degree diagonal line into interpretation of differences around a horizontal “zero” line; it is thus a kind of residual plot for the Deming regression that can reveal trends suggesting curvature. The standard method includes an estimate of average bias, with 95% CI, which provides an estimate of accuracy. A modification suggested in (21) includes 67% limits of agreement (LoA) and a proposal for an acceptance rule that if both the bias CI and the LoA fall within a pre-defined acceptance range (e.g., 20% or 25%), the methods are comparable. Both versions of the B-A plots are easy to compute. The modification that includes 67% LoA is inspired by the 4-6-15/20 rule (23). The premise is that if the bias (accuracy) and agreement (precision) lines are within x%, then the two sets of results can be considered comparable.
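A minimal B-A summary along these lines might look like the following sketch (our own function; the 67% LoA and the percent-difference scale follow the modification in (21), and the acceptance check against a pre-defined range is left to the caller).

```python
import numpy as np
from scipy import stats

def bland_altman_summary(test, ref, loa_coverage=0.67):
    """Bland-Altman statistics on the percent-difference scale: mean bias
    with its 95% CI (accuracy) and limits of agreement (precision).
    loa_coverage=0.67 gives the 67% LoA discussed in the text; the classic
    plot uses ~95% coverage (1.96 * SD)."""
    test, ref = np.asarray(test, dtype=float), np.asarray(ref, dtype=float)
    pair_mean = (test + ref) / 2.0                 # x-axis of the B-A plot
    diff_pct = 100.0 * (test - ref) / pair_mean    # y-axis of the B-A plot
    n, bias, sd = len(diff_pct), diff_pct.mean(), diff_pct.std(ddof=1)
    bias_ci = stats.t.interval(0.95, n - 1, loc=bias, scale=sd / np.sqrt(n))
    z = stats.norm.ppf(0.5 + loa_coverage / 2.0)   # ~0.97 for 67% coverage
    loa = (bias - z * sd, bias + z * sd)
    return {"bias_pct": bias, "bias_95ci": bias_ci, "loa": loa,
            "x": pair_mean, "y": diff_pct}
```

Under the proposed acceptance rule, the methods would be called comparable if both the bias CI and the LoA fall within the pre-defined range (e.g., ±20%).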

Workshop discussions led to the conclusion that the use of X-Y scatter plots with appropriate assessment of concordance and B-A plots with appropriate CI/LoA were the most broadly (and easily) applicable techniques to assess comparability/concordance.

The ability to make good decisions using the methods described above depends on statistical power, which is controlled by the mean difference of interest, the variability (%CV), and the sample size. Table I shows the minimum number of samples required to show differences of 10–20% between methods, for different intra-sample %CVs; the calculations are based on a paired t test.

Table I Minimum Number of Samples Required to Show Mean Differences Between Methods
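Calculations of this kind can be reproduced approximately with standard power functions. The sketch below uses statsmodels (our choice, not specified in the workshop) and assumes the SD of the paired percent differences is approximated by the intra-sample %CV, so the exact Table I values may differ depending on the assumptions used.

```python
# Approximate paired t-test sample size, in the spirit of Table I.
# Assumption: SD of the paired percent differences ~ intra-sample %CV.
from statsmodels.stats.power import TTestPower

def n_pairs_required(mean_diff_pct, cv_pct, alpha=0.05, power=0.80):
    effect_size = mean_diff_pct / cv_pct  # Cohen's d for paired differences
    return TTestPower().solve_power(effect_size=effect_size, alpha=alpha,
                                    power=power, alternative='two-sided')

# e.g., to detect a 10% mean difference with a 15% CV at 80% power:
print(round(n_pairs_required(10, 15)))  # ~20 pairs under these assumptions
```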

Application of the standard bioequivalence (BE) rule using log-transformed concentrations has been proposed for providing an objective, statistically based definition of comparability, especially helpful when incurred samples are used. Referred to as the “two one-sided t test” approach, the difference in log means for test and reference, along with the 90% CI for the log mean difference, is computed in the context of a repeated measures analysis of variance model. The antilog of these values yields the ratio of geometric means with its 90% CI. The BE rule states that the methods are comparable if the 90% CI for the ratio of geometric means falls within the acceptance range. This range is generally ± 20% for BE studies, but alternative ranges can be utilized if the study is powered with an appropriate sample size. Calculation is straightforward, but use of the method requires an accurate estimate of sample size, which can be based on incurred sample data or the intra-run %CV from the validation QC samples. Table II shows, for various %CVs and expected mean differences between methods, the approximate minimum sample sizes needed to assure that the 90% CI for the ratio of geometric means falls within ± 20% (24). A different set of sample sizes would be needed for other equivalence ranges.

Table II Minimum Number of Samples Needed to Meet the BE Rule (20% Equivalence Interval)
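For a simple paired design, the BE check reduces to a t-interval on the paired log differences. The sketch below is that simplified version (our own helper, not the full repeated-measures ANOVA described above) and assumes the conventional BE bounds of 80–125% as the “± 20%” acceptance range.

```python
import numpy as np
from scipy import stats

def be_rule(test, ref, acceptance=(0.80, 1.25)):
    """Simplified two one-sided t test (paired design): 90% CI on the
    ratio of geometric means from paired log-transformed concentrations.
    acceptance=(0.80, 1.25) is the conventional BE range for the 20% rule."""
    d = np.log(np.asarray(test, dtype=float)) - np.log(np.asarray(ref, dtype=float))
    n = len(d)
    lo, hi = stats.t.interval(0.90, n - 1, loc=d.mean(),
                              scale=d.std(ddof=1) / np.sqrt(n))
    gmr, gmr_ci = np.exp(d.mean()), (np.exp(lo), np.exp(hi))
    comparable = acceptance[0] <= gmr_ci[0] and gmr_ci[1] <= acceptance[1]
    return gmr, gmr_ci, comparable
```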

Allowing for up to a 10% difference between test and reference, the recommendation is to use a minimum of 30 to 40 data points, where the data points come from a larger number of individuals (i.e., two data points each from 20 individuals as opposed to five data points each from eight individuals).

Patient-Centric Sampling—Opportunities and Challenges

The patient-centric sampling section focused on opportunities and challenges. New technological innovations can enable more patient-centric approaches, such as remote sample collection at home, which could revolutionize the way clinical trials are conducted (4). Remote sample collection could shift the clinical trial paradigm from site-centric to patient-centric. This “bringing the trial to the patient” model has many potential benefits for both trial participants and sponsors. At-home sampling capability could provide clinical trial enrollment opportunities to individuals living in remote or underdeveloped areas. Participants would not have to be geographically co-located with a trial site, which could enhance trial enrollment and retention. Currently, trial data collection is limited to site visits, which can be very sparse in late-phase trials. Implementing remote sampling could enable more robust data sets for drug development decision making.

Dried blood spot technology has been widely and successfully implemented in healthcare for newborn screening, therapeutic screening, and pharmaceutical clinical trials (2). This demonstrates the utility of both microsampling and the collection of dried matrices, which are amenable to at-home sample collection. New devices are being developed to enable simple, painless sample collection (25,26). In addition, molecular and antibody testing during the COVID-19 pandemic has highlighted the need for collecting biological samples outside traditional settings, as seen with sample collection at drive-through facilities as well as in the home to provide critical infectivity data.

When implementing remote sample collection, there are three regulatory components to consider: (1) are there specific safety concerns with the device, (2) what is the quality of the endpoint assay and the quality of sample collection, and (3) what is the acceptability of the data and how will the data be used. Data may be used for inclusion/exclusion criteria for treatment, primary or secondary endpoints for inclusion in a filing, or as exploratory endpoints for internal decision-making. The quality of the collected sample and whether the collection is supervised or unsupervised may impact how the data can ultimately be used.

Adoption of new technology can be disruptive to existing operational processes. Although there are many opportunities for patient-centric sampling, there are challenges in implementation, both logistical and scientific. Training is a major component, and this includes training of both sites and participants to ensure sample collection quality. This also involves developing and translating training materials into multiple languages for large global trials. Establishing chain of custody of the sample was a major concern in the session discussion, since ensuring that the sample was collected and handled correctly is critical for data accuracy. The date/time of sample collection must also be captured remotely and integrated into existing internal databases. Although there are many newly developed devices, many of these are not commercially mature, which makes implementation difficult.

In addition, new assays must be developed on alternative sorbents, with the high sensitivity needed to accommodate the small sample volumes. Since these samples may be exposed to varying environmental conditions such as heat and high humidity, sample stability must be assessed and established under many conditions. Although there are barriers to implementation, the discussion focused on how patient-centric sampling opportunities make the development of new processes worth the investment and that sponsors need to take the first step and learn.

A major focus of the discussion was the need for remote safety laboratory assessments and the need to build capability beyond testing for drug concentrations. This is critical for developing site-less clinical trials. Remote safety lab testing would require partnerships between sponsors, device manufacturers, and companies responsible for commercial assays, and a consortium may be needed to speed the development of these approaches.

What Else Can We Do with Microsampling?

Currently, microsampling technologies such as DBS and VAMS have been successfully applied to nonclinical pharmacology/toxicology and pediatric clinical studies, where the collection of small volumes of blood is advantageous. However, the current clinical trial/drug development paradigm is limited in some respects. These studies are conducted as discrete units, under highly controlled conditions, with limited patient enrollment. Consequently, they provide data on drug behavior that is, in a sense, contrived or derived under the most ideal conditions. For instance, drug-drug interaction studies in which maximal inhibitor doses are employed represent the worst-case scenario. However, the questions “are patients chronically exposed to maximal inhibition?” and “should patients’ doses be adjusted for drug-drug interactions (DDIs) differently over time?” cannot be answered with our current testing paradigm. Further, many patients who would be good candidates for the therapeutics being developed cannot be included simply because of geographical isolation from the centers where clinical trials are conducted. Therefore, the development of microsampling technologies, especially patient-portable technologies, offers a very useful adjunct to the current drug development paradigm.

Developing the ability to send patients home with a sampling device would allow for greater inclusion of patients and generation of data to better understand the behavior of the new drugs we develop. Such technologies would allow us to ask and answer more (and perhaps better) questions about how DDIs affect patients, how their diet impacts therapy, or whether their compliance habits come into play; we may then be better able to adjust doses for these considerations. Overall, in conjunction with traditional drug development, portable microsampling technologies offer the potential for greater individualization of therapy by helping us better understand the behavior of both the drug and the patient and, via therapeutic drug monitoring, mediate more effective interventions under more realistic conditions of daily life.

The conversation that followed centered on three questions: what would the FDA think of such approaches, who would be the interested parties, and how could they be engaged in developing these ideas? Dr. Booth replied that the FDA is not averse to the development and use of new technologies. Naturally, there will always be questions about the appropriate validation of these methods to ensure safe use and the generation of reliable data. These “nuts and bolts” issues have always been successfully addressed, so this should not be viewed as an impediment to the development of new science or approaches. Interaction of the bioanalytical community with the clinical and clinical pharmacology communities was encouraged because, as the ultimate users of drug concentration data, they have a vested interest in these technologies and are crucial to the discussions of how to capitalize on them in drug development. In these interactions, the bioanalytical community plays an important role in raising awareness about the technical possibilities of microsampling. Whether these discussions could be leveraged by consortia such as the IQ Consortium or should be mediated by professional groups such as AAPS was discussed but remains an open question.

Conclusions

As an adjunct to traditional drug development, microsampling, especially portable technologies, represents an exciting means to better understand drug behavior in a more realistic daily setting, enhance individualized patient medicine, and include a greater number and variety of patients who can benefit from new therapies. Broader adoption of microsampling during drug development, as well as in general healthcare, has been slow and sporadic. This could be partially attributed to the fact that new technologies are disruptive to established processes, as well as to a perceived lack of validation of such technologies. This workshop discussed key issues identified by the members of the AAPS microsampling subgroup and provided recommendations and additional guidance, ranging from understanding capillary blood, to how and when to establish correlation (and the number of samples needed for statistical analysis), to overcoming the challenges of patient-centric sampling, and the need for, and benefit of, continuous data collection and integration with the tools of tomorrow.