Stratum Corneum Sampling to Assess Bioequivalence between Topical Acyclovir Products

Purpose To examine the potential of stratum corneum (SC) sampling via tape-stripping in humans to assess bioequivalence of topical acyclovir drug products, and to explore the potential value of alternative metrics of local skin bioavailability calculable from SC sampling experiments. Methods Three acyclovir creams were considered in two separate studies in which drug amounts in the SC after uptake and clearance periods were measured and used to assess bioequivalence. In each study, a “reference” formulation (evaluated twice) was compared to the “test” in 10 subjects. Each application site was replicated to achieve greater statistical power with fewer volunteers. Results SC sampling revealed similarities and differences between products consistent with results from other surrogate bioequivalence measures, including dermal open-flow microperfusion experiments. Further analysis of the tape-stripping data permitted acyclovir flux into the viable skin to be deduced and drug concentration in that ‘compartment’ to be estimated. Conclusions Acyclovir quantities determined in the SC, following a single-time point uptake and clearance protocol, can be judiciously used both to objectively compare product performance in vivo and to assess delivery of the active into skin tissue below the barrier, thereby permitting local concentrations at or near to the site of action to be determined. Electronic supplementary material The online version of this article (10.1007/s11095-019-2707-3) contains supplementary material, which is available to authorized users.

Figure S1. Number of tapes (columns) collected and mass of SC removed (symbols) from each subject for three ACV creams after 6 h uptake and 17 h of clearance. The results are represented as the arithmetic mean and standard deviation of the arithmetic average of the duplicate measurements in 10 subjects for each of the three products after uptake and clearance. *Significantly different from other cream applications measured after uptake or clearance in the same study (p<0.05).
Notes:
[1] The geometric mean for the duplicate measurements of the mass of drug in the SC is used because we expect (based on other skin permeation measurements) that this is likely to be log-normally distributed. However, it is not clear that the number of tape-strips, or the mass of SC collected on the tape-strips, would be expected to be log-normally distributed. It therefore makes more sense to use the arithmetic average of the duplicates. In practice, though, this changes the numerical values of the specific data points plotted above only minimally.
[2] For Study 2, clearance, the maximum permitted number of tape-strips (30) was taken at 57 of 60 sites. In contrast, in Study 2, uptake, and in Study 1, uptake and clearance, 30 tape-strips were removed at only about half of the sites (27 for uptake and clearance in Study 1, and 36 for uptake in Study 2).

Figure S2.
Average SC collection efficiency of the tape strips (mg/cm²) for each cream (arithmetic mean ± 90% confidence interval for 10 subjects) after 6 h uptake (filled symbols) and 17 h clearance (open symbols).
These results and the methodology employed indicate some possible procedural recommendations that might be introduced to ensure the uniformity and reproducibility of the approach.
First, the refined SC sampling technique introduced in the econazole work (1) did not measure the mass of SC on the tapes. However, a pilot study was performed that employed TEWL measurements to decide if tape-stripping had completely and reliably removed most of the SC; hence, it was unnecessary to measure the mass of SC on the tapes. A sensible suggestion, therefore, would be to quantify the amount of SC removed unless a pilot study (using TEWL, for example) had verified that the method did indeed ensure that most of the barrier had been collected.
Second, the improved protocol (1) specified that 12 tape-strips should always be taken (in fact, no TEWL values were recorded in that study until 12 tape-strips had been removed). In the present work, fewer than 12 tape-strips were acquired before the procedure was stopped at only one site, on a single volunteer. A reasonable recommendation would therefore be to set a 12 tape-strip minimum for SC sampling. Nonetheless, attention should always be paid to the nature and quantity of the excipients present in the drug products tested, as there will certainly be some cases (see, for example, (2)) where a formulation alters the SC and dramatically increases the amount of tissue collected. In such circumstances, to spare volunteers from inordinate discomfort, acquisition of less
Figure S3. The concentration profiles for the ACV-AT products were clearly different from that observed for the ACV-US cream after both uptake and clearance.

Calculated metrics
The mass per unit area of drug in the tape strips was determined in the kth replicated site treated with formulation i on subject j ($Q_{ijk}$). Each formulation in the two studies was measured at 2 sites in 10 subjects; i.e., $n_r = 2$ and $n = 10$. The first-order clearance rate constant ($k$) and the flux of drug out of the SC into the underlying tissue in vivo ($J$) were calculated from the geometric mean of the duplicate ( . (Table S1) and for in vivo, $J_{ij}$ and $k_{ij}$ (Table S2).
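The equations for these metrics did not survive in the text above, so the following is only a minimal sketch of one plausible calculation. It assumes simple first-order decay of the SC drug amount over the clearance period and an average flux equal to the amount lost divided by the clearance time; both formulas, the function names, and the numerical values are illustrative assumptions rather than the authors' stated method.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive replicate measurements."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def clearance_metrics(q_uptake_reps, q_clearance_reps, t_clearance_h):
    """Return (Q_uptake, Q_clearance, k, J) for one formulation in one subject.

    Assumed model (not taken from the source):
      Q_clearance = Q_uptake * exp(-k * t)   -> k = ln(Q_uptake / Q_clearance) / t
      J = (Q_uptake - Q_clearance) / t       -> average flux leaving the SC
    """
    q_up = geometric_mean(q_uptake_reps)
    q_cl = geometric_mean(q_clearance_reps)
    k = math.log(q_up / q_cl) / t_clearance_h
    flux = (q_up - q_cl) / t_clearance_h
    return q_up, q_cl, k, flux


# Hypothetical duplicate amounts (ug/cm^2) after uptake and after 17 h clearance.
q_up, q_cl, k, flux = clearance_metrics([4.0, 9.0], [1.0, 2.25], 17.0)
print(f"Q_up={q_up:.3f}  Q_cl={q_cl:.3f}  k={k:.4f} 1/h  J={flux:.4f} ug/cm^2/h")
```

Taking the geometric mean of the duplicates before computing k and J matches the log-normal assumption made for the drug amounts in the SC.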

Bioequivalence assessment
Bioequivalence calculations for a balanced study design with $n_r$ replicated measurements of each formulation in a total of $n$ subjects were conducted using the following procedures for a metric determined in the kth replicated site treated with formulation i on subject j ($M_{ijk}$).

For $M_{ijk}$ representing a metric that is always a positive number (i.e., the mass per unit area of drug collected in the SC tape strips, $Q_{ijk}$), the BE analysis is performed by comparing the log-transformed value of $M_{ijk}$, defined as $Z_{ijk}$, determined in each subject and then averaging across subjects as follows:
The within-subject variance for a formulation i measured in subject j is calculated as follows:
Table S3.
Note that calculated values for the GMR and the lower and upper confidence intervals are the same for data transformed using natural logarithms (as described here) or base 10 logarithms (as described previously (1)) provided the anti-log step is consistent with the type of log transformation (i.e., $\exp(x)$ for natural log transformed data and $10^x$ for base 10 log transformed data). Here, we recommend natural log transformation for the ABE calculations to be consistent with the scaled average bioequivalence (SABE) procedure below, which is specific to the type of logarithmic transformation.
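The independence of the GMR and its confidence limits from the logarithm base can be checked numerically. The sketch below is a simplified paired ABE calculation, not necessarily the exact procedure used here: it averages the log-transformed replicates within each subject, takes the test-minus-reference difference per subject, and builds a 90% confidence interval from a Student t quantile supplied by the caller. The function name, the data, and the hard-coded quantile for 9 degrees of freedom (10 subjects) are illustrative assumptions.

```python
import math
import statistics

# One-sided 95% Student t quantile for 9 degrees of freedom (n = 10 subjects).
T_CRIT_90_DF9 = 1.833


def abe_interval(test, ref, t_crit, log=math.log, antilog=math.exp):
    """Return (GMR, lower, upper) of the 90% CI for test vs. reference.

    test, ref: one list of positive replicate values per subject.
    log/antilog must be a matching pair (ln/exp or log10/10**x).
    """
    diffs = []
    for reps_t, reps_r in zip(test, ref):
        z_t = statistics.fmean([log(m) for m in reps_t])  # subject mean, test
        z_r = statistics.fmean([log(m) for m in reps_r])  # subject mean, reference
        diffs.append(z_t - z_r)
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    half_width = t_crit * statistics.stdev(diffs) / math.sqrt(n)
    return (antilog(mean_d),
            antilog(mean_d - half_width),
            antilog(mean_d + half_width))


# Hypothetical duplicate measurements for 10 subjects (arbitrary units).
test = [[1.0 + 0.1 * j, 1.2 + 0.1 * j] for j in range(10)]
ref = [[1.1 + 0.1 * j, 1.0 + 0.1 * j] for j in range(10)]

natural = abe_interval(test, ref, T_CRIT_90_DF9)
base10 = abe_interval(test, ref, T_CRIT_90_DF9,
                      log=math.log10, antilog=lambda x: 10.0 ** x)
print(natural)
print(base10)
```

Both calls return the same three values to within floating-point rounding, because changing the base rescales every log value by the same constant and the matching anti-log undoes that scaling.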

Scaled average bioequivalence (SABE) assessment
The SABE methodology is indicated when the within-subject standard deviation for the reference formulation (i.e., $s_{Wi}$ for i = 2 in an assessment of the difference between formulations 1 and 2 for the natural log-transformed metric) is
Table S3.
Table S1. ACV amounts (µg/cm²) and log-transformed amounts recovered from the SC after uptake and clearance in the replicate samples ($Q_{ijk}$, replicate k of formulation i in subject j) from each of the 10 subjects for the US-C+ and US-Ref formulations (designated as formulations 1 and 2, respectively)

Power simulations of bioequivalence assessments
We evaluated the number of subjects required to adequately power the traditional ABE and SABE methods for the m = 1.25 limit by performing simulations. For each such study, the inputs of the power function are the within-subject standard deviation of the reference product ($s_{W2}$ in an assessment of the difference between formulations 1 and 2), the between-subjects standard deviation ($s_I$), the number of subjects and the number of replicates. This process is repeated 500,000 times under the assumption of bioequivalence. The value of the power is then the percentage of these trials that correctly captured the equivalence relationship between the two products. Table S6 lists Table S7.
The SABE methodology was estimated to achieve a statistical power close to 80% with 10 subjects for the products compared in Study 1 for both uptake and clearance, and for uptake in Study 2 (which involved a different cohort of 10 subjects); see Figures S5 and S6 and Table S7. Approximately 15 subjects are needed to adequately power the clearance results in Study 2 (Figure S6). By comparison, the traditional ABE methodology is estimated to require between 15 and 50 subjects to achieve the same power, with fewer subjects needed in the assessment of the positive control with the corresponding reference product (Figures S5 and S6 and Table S7). Increasing replication from two to three sites for each product in this study had minimal benefit, reducing the number of subjects required to achieve the same power in the SABE assessment by approximately one subject (Figure S7).
Figure S5. Estimated power as a function of the number of subjects (n) for traditional average bioequivalence (ABE) and scaled average bioequivalence (SABE) assessments of the positive control to the corresponding reference product in Studies 1 and 2 for the bioequivalence margin m = 1.25: uptake (solid) and clearance (dashed).
The statistical power of 0.8 commonly recommended for acceptance by regulatory agencies is indicated.
Figure S6. Estimated power as a function of the number of subjects (n) for traditional average bioequivalence (ABE) and scaled average bioequivalence (SABE) assessments of the test to reference products in Studies 1 and 2 for the bioequivalence margin m = 1.25: uptake (solid) and clearance (dashed). The statistical power of 0.8 commonly recommended for acceptance by regulatory agencies is indicated.
Figure S7. Estimated power as a function of the number of subjects (n) when the number of replicates for comparing the test and reference products in Studies 1 and 2 is increased from two (dashed) to three (solid) for the bioequivalence margin m = 1.25. The statistical power of 0.8 commonly recommended for acceptance by regulatory agencies is indicated.
The power simulation results for m = 1.25 compared with m = 1.33 are presented in Figure S8 for assessments of the positive control to the corresponding reference product (i.e., US-C+ to US-Ref and AT-C+ to AT-Ref in Studies 1 and 2, respectively), and in Figure S9 for assessments of the test to reference product (i.e., UK-Test to US-Ref and US-Test to AT-Ref in Studies 1 and 2, respectively).
Table S7 lists the expected minimum number of subjects required to achieve a statistical power of at least 80% for m equal to both 1.25 and 1.33.
The power of a bioequivalence study using cutaneous pharmacokinetic endpoints can be substantially increased by widening the bioequivalence limits from the traditional m = 1.25 to m = 1.33 for an ABE assessment. The advantage of this approach is that fewer subjects are needed to power the study. However, the disadvantage of widening the bioequivalence limits is that it essentially lowers the standard for comparability of the test and reference products. Using an SABE analysis instead of an ABE analysis, while maintaining the traditional bioequivalence limit of m = 1.25, increases the power of the study to an even greater degree than by widening the bioequivalence limits for an ABE analysis to m = 1.33. The additional power gained by widening the bioequivalence limits from m = 1.25 to m = 1.33 in an SABE analysis is much smaller than for the ABE assessment. The results presented in Figure S8 confirm that the comparison of AT-C+ and AT-Ref products in Study 2 with 10 subjects was slightly underpowered for an SABE assessment at m = 1.25, but is adequately powered at m = 1.33 (in which case the SABE assessment successfully demonstrated bioequivalence as shown in Table 4 in the paper).
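The Monte Carlo power estimate described in this section can be sketched as follows for the traditional ABE test only. This is a simplified illustration under stated assumptions, not the authors' simulation code: the log-scale data are drawn from normal distributions, both products share the same within-subject standard deviation, the true difference is zero (bioequivalence), the between-subject term is modelled as an additive subject effect, and the t quantile is passed in by the caller. All function names and numerical inputs are hypothetical, and the trial count is kept far below the 500,000 used in the study so that the example runs quickly.

```python
import math
import random
import statistics


def abe_passes(diffs, t_crit, m=1.25):
    """True if the 90% CI of the mean log difference lies within +/- ln(m)."""
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    half_width = t_crit * statistics.stdev(diffs) / math.sqrt(n)
    limit = math.log(m)
    return -limit <= mean_d - half_width and mean_d + half_width <= limit


def simulate_abe_power(s_w, s_i, n, n_r, t_crit, trials=2000, m=1.25, seed=0):
    """Fraction of simulated studies that conclude bioequivalence.

    s_w: within-subject SD (log scale), assumed equal for both products.
    s_i: between-subject SD (log scale); as an additive subject effect it
         cancels in the paired difference under this simplified model.
    n, n_r: number of subjects and of replicate sites per product.
    """
    rng = random.Random(seed)
    passes = 0
    for _ in range(trials):
        diffs = []
        for _ in range(n):
            subject = rng.gauss(0.0, s_i)
            z_test = statistics.fmean(
                [subject + rng.gauss(0.0, s_w) for _ in range(n_r)])
            z_ref = statistics.fmean(
                [subject + rng.gauss(0.0, s_w) for _ in range(n_r)])
            diffs.append(z_test - z_ref)
        passes += abe_passes(diffs, t_crit, m)
    return passes / trials


# Hypothetical inputs: 10 subjects, 2 replicates, t quantile for 9 d.f.
for s_w in (0.05, 0.3, 1.0):
    power = simulate_abe_power(s_w=s_w, s_i=0.5, n=10, n_r=2, t_crit=1.833)
    print(f"s_w = {s_w}: estimated ABE power = {power:.3f}")
```

Rerunning with a larger `n` (and the matching t quantile) or with `m=1.33` shows qualitatively how sample size and wider limits raise the estimated ABE power; an SABE criterion would need to be substituted for `abe_passes` to reproduce the scaled comparison.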