Key Summary Points

Why carry out this study?

Excessive daytime sleepiness associated with obstructive sleep apnea or narcolepsy can impair attention and vigilance, which may lead to impaired driving performance and increase the risk for motor vehicle crashes.

Evaluating the effects of treatments for excessive daytime sleepiness, such as solriamfetol, on task performance is challenging because of differences in sleep history and sleep behavior between individuals.

These analyses explored the use of the Sleep, Activity, Fatigue, and Task Effectiveness (SAFTE) biomathematical model as a proxy for healthy controls using psychomotor vigilance task data from two placebo-controlled clinical trials of solriamfetol.

What was learned from the study?

This study represents a novel application of the SAFTE biomathematical model to approximate task effectiveness in healthy controls in sleep disorder research and provides valuable lessons that may optimize future research.

Future studies should perform a priori power analyses for model-tested outcomes and use sleep measures that capture sleep fragmentation characteristic of sleep disorders for sleep input (e.g., total sleep time rather than time in bed).

The use of SAFTE to predict performance in patients with narcolepsy may be limited due to the model’s assumption of a normal relationship between sleep duration and changes in performance.

Introduction

Excessive daytime sleepiness (EDS) is a core symptom of narcolepsy [1] and has been reported to affect between 47% and 87% of patients with obstructive sleep apnea (OSA) at the time of their diagnosis [2, 3]. Further, EDS is reported to persist in 9–22% of patients with OSA despite treatment with continuous positive airway pressure [4, 5]. EDS associated with sleep disorders can impair vigilance and attention [6] and increases the risk for occupational injuries [7] and motor vehicle crashes [8]. Indeed, individuals with either narcolepsy or OSA are at increased risk of traffic collisions [9, 10], and a recent study of patients with OSA found that the presence of sleepiness rather than OSA severity (determined with the apnea-hypopnea index) determined this risk [11]. Wake-promoting agents have been demonstrated to reduce EDS in narcolepsy and OSA, and have been shown to improve real or simulated driving performance in controlled clinical studies [12, 13].

In clinical settings, attention and vigilance can be examined using a variety of measures, including the psychomotor vigilance task (PVT), a measure of sustained attention that correlates with subjective sleepiness [14]. Furthermore, sleep-related declines in PVT function correlate with poor performance on driving measures, such as lane drift [15]. Compared with healthy controls, patients with sleep disorders and sleep deprivation perform poorly on the PVT [16, 17]. However, some commonly used PVT metrics have been found to lack sensitivity, and differential vulnerability to poor performance on the PVT has been noted in a subset of healthy individuals [18, 19]. Additionally, measures commonly used to assess performance in clinical trials rarely overlap with those used in occupational settings.

When evaluating treatments for EDS secondary to sleep disorders, comparisons can be made between patients and healthy controls, and/or between patients receiving the treatment of interest and those receiving placebo, to determine the impact on performance in the target population. Ideally, comparisons would be made against a control group with the same sleep history; however, current approaches do not control for differences in sleep behavior (as either a function of disease history or drug effect) between individuals. Because differences in sleep behavior can affect outcomes of interest, such as measures of attention [20], the ability to isolate the specific effects of treatments on such outcomes can be confounded. Integrating biomathematical modeling, as used in occupational assessments of performance, may help compensate for shortcomings related to control of sleep behavior in clinical research.

The Sleep, Activity, Fatigue, and Task Effectiveness (SAFTE) biomathematical model has been used to predict the effects of fatigue on performance [21] in healthy individuals [22]. This model predicts cognitive performance based on several input parameters, including the accumulation of sleep over time, taking into account circadian processes and time of day. Pairing the SAFTE model with the Fatigue Avoidance Scheduling Tool (FAST) has provided a means (SAFTE-FAST) to estimate and mitigate exposure to fatigue risk throughout the day. The SAFTE model has widespread use among transportation (e.g., aviation, rail) agencies and the US Department of Defense, where it allows for the construction of schedules that avoid performance impairment attributable to fatigue [22]. The SAFTE model has previously been demonstrated to predict human factors–related freight rail accident risk based on predicted fatigue-induced impairments [23]. Analysis showed that the relative risk was increased by 42% when SAFTE-predicted effectiveness scores were ≤ 77%, but was reduced by 30% when scores were > 90%. However, the SAFTE model has not been investigated in patients with sleep disorders. In the novel application described herein, the SAFTE model functioned as a proxy for healthy controls by predicting the treatment- and time-dependent effects that would have been expected had the participants been healthy controls with exactly the same sleep history as the participants from whom data were collected.

Solriamfetol (Sunosi™; Axsome Therapeutics, Inc., New York, NY), a dopamine and norepinephrine reuptake inhibitor, is approved in the USA [24] and European Union [25] to improve wakefulness in adults with EDS associated with narcolepsy (75–150 mg once daily) or OSA (37.5–150 mg once daily). Solriamfetol was shown to reduce EDS and improve quality of life in short-term (12 weeks) and longer term (up to 52 weeks) phase 3 clinical trials in participants with OSA or narcolepsy [26,27,28,29,30,31,32]. The effect of solriamfetol on real-world driving was recently examined in two randomized, crossover, placebo-controlled phase 2 trials in participants with narcolepsy or OSA [33, 34]. In both studies, solriamfetol significantly decreased (improved) standard deviation of lateral position (SDLP) at 2 h post-dose, meeting the primary efficacy endpoints. PVT and actigraphy data were also collected during these trials.

The aim of the analysis reported here was to explore the utility of the SAFTE model as a substitute for a healthy control group using predicted PVT performance based on actigraphy-derived sleep parameters compared with actual PVT performance from these two solriamfetol phase 2 clinical trials.

Methods

Study Design

This analysis used data collected during two phase 2 studies examining the effects of solriamfetol on driving performance in participants with OSA (NCT02806895, EudraCT 2015-003930-28) or narcolepsy (NCT02806908, EudraCT 2015-003931-36) [33, 34]. The SAFTE modeling analyses were planned secondary efficacy endpoints for both trials. In these trials, participants were randomly assigned 1:1 to receive solriamfetol 150 mg/day for 3 days followed by 300 mg/day for 4 days, or placebo for all 7 days (Period 1), and then were crossed over to the other 7-day treatment (Period 2). As the studies were conducted before finalization of regulatory approval or dosing recommendations, the dose of 300 mg/day was based on prior phase 2 clinical trial results [35, 36] and aligned with the maximum dose used in the phase 3 clinical trials [26,27,28, 30].

These studies were conducted in accordance with the Declaration of Helsinki of 1964, and its later amendments. The study protocols were approved by the medical ethics committee of University Hospital Maastricht and Maastricht University (www.toetsingonline.nl; OSA: NL56214.068.16; narcolepsy: NL56215.068.16), and all participants provided written informed consent. The primary endpoint for both studies was the effect of solriamfetol on on-road driving performance, as assessed by SDLP at 2 h post-dose (reported separately).

Participants

As reported previously [33, 34], participants ranged in age from 21 to 75 years, had either an OSA diagnosis per International Classification of Sleep Disorders–Third Edition (ICSD-3; American Academy of Sleep Medicine, Darien, IL, USA) or a narcolepsy diagnosis per ICSD-3 or Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (American Psychiatric Association, Richmond, VA, USA), and an average total nightly sleep ≥ 6 h. Additional inclusion criteria for participants with OSA included baseline Epworth Sleepiness Scale (ESS) score ≥ 10, mean sleep latency < 30 min on the Maintenance of Wakefulness Test (MWT), and one of the following: use of a primary therapy for OSA ≥ 1 night per week with ≤ 2 days of variation week to week, lack of use of primary therapy after a history of ≥ 1 month of attempted use with ≥ 1 documented adjustment, or a history of surgical intervention. Participants who were unwilling to attempt to use ≥ 1 primary OSA therapy were excluded.

Procedures

Participants were instructed to take a single capsule containing their treatment once daily, within 1 h of waking in the morning, on an empty stomach, and then to wait ≥ 30 min before having breakfast. At visits on days 7 and 14, participants completed the PVT along with an on-road driving test and other measures (reported separately) [33, 34]. On test days, the PVT was administered pre-dose and at approximately 2 and 6 h post-dose. The PVT was administered over 10 min, with visual stimuli appearing at random variable intervals of 2–10 s; participants responded to a digital signal by pressing a key on a computer terminal. Participants also wore an actigraph from screening through day 14 (except during testing sessions, as data were extracted during that time) and kept accompanying sleep diaries.

Assessments and Outcomes

Participants completed the PVT by responding to a digital signal on a computer terminal via key presses, as described above. Calculated PVT measures included mean reaction time (MRT), lapses (reaction times > 500 ms or failures to respond), and errors (responses made without stimulus or false starts).
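
The PVT measures defined above can be illustrated with a minimal Python sketch. This is not the study's scoring software: the `PvtTrial` record layout and the `summarize_pvt` function are hypothetical, and the choice to exclude false starts and non-responses from the MRT is an assumption made here for illustration. Only the 500-ms lapse threshold and the definitions of lapses and errors come from the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence

LAPSE_THRESHOLD_MS = 500.0  # from the text: lapses are reaction times > 500 ms


@dataclass
class PvtTrial:
    """One PVT event (hypothetical record layout, not the study's data format)."""
    rt_ms: Optional[float]     # reaction time in ms; None means no response was made
    false_start: bool = False  # True if a key was pressed without a stimulus


def summarize_pvt(trials: Sequence[PvtTrial]) -> dict:
    """Compute MRT, lapses, and errors for one PVT session."""
    # Errors: responses made without a stimulus (false starts).
    errors = sum(1 for t in trials if t.false_start)
    # Assumption: false starts are not scored as lapses and do not enter the MRT.
    stimulus_trials = [t for t in trials if not t.false_start]
    # Lapses: reaction time > 500 ms, or failure to respond.
    lapses = sum(
        1 for t in stimulus_trials
        if t.rt_ms is None or t.rt_ms > LAPSE_THRESHOLD_MS
    )
    # Assumption: MRT is averaged over trials that received a response.
    answered = [t.rt_ms for t in stimulus_trials if t.rt_ms is not None]
    mrt = mean(answered) if answered else None
    return {"mrt_ms": mrt, "lapses": lapses, "errors": errors}


# Made-up example session: two normal responses, one slow response,
# one missed stimulus, and one false start.
session = [
    PvtTrial(250.0),
    PvtTrial(620.0),
    PvtTrial(None),
    PvtTrial(None, false_start=True),
    PvtTrial(300.0),
]
summary = summarize_pvt(session)
```

In this made-up session, the slow response and the missed stimulus each count as a lapse, and the false start counts as an error.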

Sleep metrics, including time in bed (TIB), total sleep time (TST), and daily sleep intervals (DSIs), were measured at baseline and throughout the treatment period. Raw actigraphy data were scored using Actiware software (Philips Respironics, Bend, OR, USA) and manual scoring techniques [37]. Sleep diaries were used to assess the major sleep interval start and stop times and any naps not scored by the Philips algorithm but confirmed by activity pattern. For manual scoring, sleep diaries were compared with actigraphy data.

Task Effectiveness

Actual task effectiveness scores were calculated based on each participant’s inverse reaction time (IRT [1/RT]) as described below. Sleep intervals derived from participants’ actigraphy data (or from sleep diaries, if available, where actigraphy data were missing) were used to calculate modeled healthy control task effectiveness scores using the SAFTE model [38, 39]. Participants evidenced good sleep hygiene, with minimal missing data. Task effectiveness was not modeled for participants missing data from the night before PVT testing. A SAFTE model effectiveness score of 100 indicates typical best performance during the day, based on a normal healthy population. SAFTE-FAST effectiveness predictions have been validated against IRT on the PVT [22].

Statistical Analyses

Target enrollment in the studies used to generate PVT data was based on the primary endpoint [33, 34]. SAFTE modeling was an experimental endpoint and as such was not considered for sample size calculations.

SAFTE-FAST modeled task effectiveness based on measured sleep served as a proxy for performance measures from healthy controls, referred to here as “modeled healthy control task effectiveness”. Sleep intervals exported from the Actiware software—date and time of each recorded sleep interval’s start and end—were used in SAFTE-FAST to determine a continuous estimate of modeled healthy control task effectiveness across the entire study period. As SAFTE-FAST generates a continuous estimate of task effectiveness, modeled healthy control task effectiveness scores from the time when PVTs were taken were identified using time stamp data from the PVT. Synchronized task effectiveness scores were exported from SAFTE-FAST for subsequent comparison against actual effectiveness.
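
The synchronization step described above can be sketched as a lookup of a continuous effectiveness series at a PVT time stamp. The sketch below is illustrative only: it does not reproduce the SAFTE-FAST export format or its internal calculations, the function name `effectiveness_at` is hypothetical, and linear interpolation between exported samples is an assumption made here.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Sequence


def effectiveness_at(
    times: Sequence[datetime], scores: Sequence[float], when: datetime
) -> float:
    """Interpolate a modeled effectiveness series at one PVT time stamp.

    `times` must be sorted in ascending order and match `scores` in length.
    Linear interpolation between samples is an assumption of this sketch.
    """
    if not times or len(times) != len(scores):
        raise ValueError("times and scores must be non-empty and equal in length")
    if when < times[0] or when > times[-1]:
        raise ValueError("PVT time stamp falls outside the modeled period")
    i = bisect_left(times, when)  # index of the first sample at or after `when`
    if times[i] == when:
        return scores[i]
    t0, t1 = times[i - 1], times[i]
    frac = (when - t0) / (t1 - t0)  # timedelta / timedelta yields a float
    return scores[i - 1] + frac * (scores[i] - scores[i - 1])


# Made-up hourly samples and a PVT taken halfway between them.
sample_times = [datetime(2016, 1, 1, 9, 0), datetime(2016, 1, 1, 10, 0)]
sample_scores = [90.0, 94.0]
modeled = effectiveness_at(sample_times, sample_scores, datetime(2016, 1, 1, 9, 30))
```

Raising an error for time stamps outside the modeled period, rather than extrapolating, mirrors the decision not to model task effectiveness where sleep data were missing.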

Actual effectiveness scores were calculated as a percentage of a theoretical “best performance” from participants’ average IRT per PVT trial (for all PVT test sessions, actual effectiveness = IRT/3.96 × 100). Average speed (IRT) on a PVT under optimal conditions—healthy sleepers taking the test at 11:00 in the morning, following an 8-h sleep opportunity—is approximately 3.96 [40], and is assumed to represent an actual effectiveness score of 100 (a healthy participant’s typical best performance). Repeated-measures analysis of variance (ANOVA) was used to estimate the effect of treatment on actual effectiveness over time.
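
The actual effectiveness formula (IRT/3.96 × 100) can be written as a short Python sketch. Two points are assumptions rather than statements from the text: that IRT is expressed in responses per second (1/RT with RT in seconds), and that the average IRT is the mean of per-response reciprocals rather than the reciprocal of the mean RT. The function name `actual_effectiveness` is hypothetical.

```python
from statistics import mean
from typing import Sequence

REFERENCE_SPEED = 3.96  # from the text: average IRT under optimal conditions


def actual_effectiveness(rts_ms: Sequence[float]) -> float:
    """Actual effectiveness (%) = mean IRT / 3.96 x 100.

    Assumes reaction times are given in milliseconds and that IRT is
    expressed in responses per second (1 / RT in seconds).
    """
    if not rts_ms:
        raise ValueError("at least one reaction time is required")
    if any(rt <= 0 for rt in rts_ms):
        raise ValueError("reaction times must be positive")
    mean_irt = mean(1000.0 / rt for rt in rts_ms)  # responses per second
    return mean_irt / REFERENCE_SPEED * 100.0


# Made-up example: reaction times of 250 ms and 500 ms give IRTs of 4.0 and 2.0.
score = actual_effectiveness([250.0, 500.0])
```

Under these assumptions, a consistent reaction time of about 252.5 ms (1000/3.96) corresponds to an actual effectiveness score of 100, and faster responses yield scores above 100.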

For additional analyses of data from model input parameters, repeated-measures ANOVA was performed to compare the effects of treatment on PVT measures (IRT, MRT, lapses, and errors). Fixed effects included treatment, time, and treatment × time interaction. Treatment order; habitual TIB, TST, and DSIs; and previous day’s TIB, TST, and DSIs were treated as covariates, while participant was modeled as a random effect. Post hoc tests (marginal means, pairwise comparisons) were performed to compare PVT measures between each trial. Mean habitual sleep parameters and sleep parameters obtained the day before PVT testing were compared across all conditions and with those obtained during the baseline period using Student’s t tests, assuming unequal variance. The anticipated risk reduction associated with solriamfetol was computed as the difference between PVT performance metrics by treatment condition. All statistical analyses were performed with Stata 15.1 (StataCorp LLC, College Station, TX, USA) and Excel 2016 (Microsoft Corp., Redmond, WA, USA). P values were not controlled for multiplicity and therefore are nominal.
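
A t test that assumes unequal variance, as used above for the sleep parameter comparisons, is commonly computed as Welch's t test. The sketch below shows the test statistic and the Welch–Satterthwaite degrees of freedom in plain Python. It is an illustration only and not the Stata code used in the study; the function name `welch_t` and the example values are hypothetical, and the P value (which requires the t distribution) is omitted.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error contributed by each group (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up total sleep times (hours) for two conditions.
t_stat, dof = welch_t([7.0, 7.5, 8.0], [6.0, 6.5, 7.0])
```

Because the degrees of freedom are estimated from the data, they are generally not an integer and fall below the pooled-variance value when group variances or sizes differ.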

Results

Demographics

For participants with OSA (safety population, n = 34; detailed demographics and disposition previously reported [33, 34]), 88% were male, mean (standard deviation [SD]) age was 51.6 (12.3) years, mean (SD) MWT sleep latency was 14.3 (7.3) min, and 85% used a primary OSA therapy.

For participants with narcolepsy (safety population, n = 24; detailed demographics and disposition previously reported [33, 34]), 54% were male, mean (SD) age was 40.4 (11.8) years, and mean (SD) MWT sleep latency was 4.0 (2.5) min (MWT latencies were based on historical data and were not collected as part of the study; n = 22).

Participants with missing sleep data (due to nonadherence or device errors) were excluded from analyses of PVT data and biomathematical modeling. Data were analyzed for 31 participants with OSA and 20 participants with narcolepsy.

Use of SAFTE Biomathematical Modeling to Characterize Task Effectiveness Changes

Participants with OSA

In participants with OSA, analysis of actual task effectiveness showed no main effect of treatment or treatment × time interaction effects based on repeated-measures ANOVA (Table 1); however, there was an effect of previous day’s sleep (P = 0.001). In a regression analysis, longer sleep the night before testing was not independently related to actual effectiveness (R² = 0.01, P = 0.98), suggesting this relationship was nonlinear. Actual task effectiveness was lower in participants treated with solriamfetol compared with placebo at pre-dose (P = 0.02) but not at 2 or 6 h post-dose (P > 0.05 for both).

Table 1 Actual and modeled healthy control task effectiveness scores by treatment

In participants with OSA, repeated-measures ANOVA of modeled healthy control task effectiveness showed no main effect of treatment or treatment × time interaction effects; however, similarly to actual task effectiveness, there was an effect of previous day’s sleep (P = 0.04). Longer sleep was independently related to higher modeled healthy control task effectiveness (R² = 0.47, P < 0.001).

There were no differences between modeled healthy control and actual task effectiveness scores for solriamfetol or for placebo in participants with OSA (P > 0.05 overall and for all time points) (Fig. 1).

Fig. 1

Predicted and actual task effectiveness across PVT test sessions for participants with OSA under (a) placebo conditions and (b) solriamfetol conditions. Actual task effectiveness between drug conditions was compared with modeled healthy control task effectiveness using paired-samples t test, controlling for drug condition and test session. P values are nominal. ᵃModeled healthy control vs. actual task effectiveness. OSA, obstructive sleep apnea; PVT, psychomotor vigilance task; SD, standard deviation

Participants with narcolepsy

In participants with narcolepsy, repeated-measures ANOVA showed a treatment × time interaction effect (P < 0.001) on actual task effectiveness, with task effectiveness improving after administration of solriamfetol (from pre-dose to 2 h post-dose to 6 h post-dose) but deteriorating after the administration of placebo (Table 1). Additionally, there was a main effect of previous day’s sleep (P = 0.004), similar to that observed in participants with OSA. Longer sleep the night before testing was independently related to poorer actual effectiveness (R² = 0.21, P < 0.001).

There was no main effect of treatment or treatment × time interaction effects on modeled healthy control task effectiveness based on repeated-measures ANOVA; however, in line with other analyses, there was an effect of previous day’s sleep on modeled healthy control task effectiveness (P < 0.001). Longer sleep was independently related to higher modeled healthy control task effectiveness (R² = 0.47, P < 0.001).

In participants with narcolepsy, modeled healthy control task effectiveness was higher than actual task effectiveness for placebo (P = 0.009), primarily driven by PVT performance at 2 h post-dose (P = 0.03). Modeled healthy control and actual task effectiveness did not differ for solriamfetol (P > 0.05 overall and at all time points) (Fig. 2).

Fig. 2

Predicted and actual task effectiveness across PVT test sessions for participants with narcolepsy under (a) placebo conditions and (b) solriamfetol conditions. Actual task effectiveness between drug conditions was compared with modeled healthy control task effectiveness using paired-samples t test, controlling for drug condition and test session. P values are nominal. ᵃModeled healthy control vs. actual task effectiveness. PVT, psychomotor vigilance task; SD, standard deviation

Change in SAFTE model input parameters and PVT measures

In participants with OSA or narcolepsy (Electronic Supplementary Material Figs. S1 and S2, respectively), there were no differences between placebo and solriamfetol treatment in either habitual sleep measures (TIB, TST, DSIs) or TIB, TST, or DSIs measured the day before PVT testing.

There was no main effect of solriamfetol on any PVT measure in participants with OSA (Table 2; repeated-measures ANOVA, P > 0.05 for all) or participants with narcolepsy (Table 3; repeated-measures ANOVA, P > 0.05 for all).

Table 2 Results of psychomotor vigilance task in participants with obstructive sleep apnea
Table 3 Results of psychomotor vigilance task in participants with narcolepsy

Discussion

In this analysis, SAFTE modeling data (i.e., modeled healthy control task effectiveness) were used as a proxy for healthy participants in a clinical trial. This represents a novel application of SAFTE, which previously has been used to model the effects of sleep deprivation in healthy populations [21, 22]. In participants with OSA, actual task effectiveness did not differ from modeled healthy control task effectiveness with placebo or with solriamfetol, despite participants’ EDS (evidenced by an MWT mean sleep latency of 14 min at baseline). In light of this finding (i.e., an apparent lack of a deficit in performance in participants with OSA), it is unsurprising that actual task effectiveness did not differ between placebo and solriamfetol in participants with OSA. In participants with narcolepsy, actual task effectiveness with placebo was lower than modeled healthy control task effectiveness (suggesting a deficit in performance in participants with narcolepsy), whereas actual task effectiveness with solriamfetol did not differ from modeled healthy control effectiveness; actual task effectiveness improved with solriamfetol relative to placebo. Thus, in contrast to the differences in SDLP between solriamfetol and placebo (reported elsewhere), the present findings suggest that, in these participants with OSA, task effectiveness while on placebo or solriamfetol was similar to that of “healthy controls”, despite the participants’ sleep disorder. In participants with narcolepsy, however, task effectiveness while on placebo was lower than that of “healthy controls” and improved on solriamfetol to a level similar to that of “healthy controls.”

Analyses of the primary outcomes of these studies demonstrated that solriamfetol improved real-world on-road driving performance, as measured by SDLP, at 2 h post-dose in participants with narcolepsy as well as in participants with OSA. Therefore, the SAFTE model findings can be considered to be generally consistent with the primary study findings for participants with narcolepsy. In contrast, the results for participants with OSA suggest that, despite their degree of EDS, these individuals had a relative lack of impairment in PVT performance and task effectiveness scores while on placebo. Although the small sample sizes preclude any definitive conclusions, it is possible that the lack of detectable improvement may be due to a “ceiling effect.” This finding may indicate that PVT performance is not sensitive enough to detect impairment attributable to EDS in OSA in a study of this size.

There was no effect of solriamfetol treatment on any of the sleep measures recorded in participants with OSA or participants with narcolepsy. This finding aligns with data from phase 3 trials of solriamfetol in participants with narcolepsy and OSA in which sleep measured by polysomnography was unchanged with solriamfetol compared with placebo [25, 26]. However, as sleep is a fundamental input for the SAFTE model, the lack of an effect of solriamfetol on sleep measures is a key consideration in interpreting the present findings.

The present analysis found no effect of solriamfetol on PVT metrics in participants with OSA; however, participants with narcolepsy were faster (IRT) and had fewer lapses after treatment with solriamfetol compared with placebo. These results are consistent with the SAFTE findings. However, previous studies established the efficacy of solriamfetol in reducing EDS in patients with OSA or narcolepsy [26, 28, 30], and the primary results of the present studies demonstrated that solriamfetol improved real-world on-road driving performance, as measured by SDLP, at 2 h post-dose in both groups of participants. As this study was powered for SDLP rather than PVT or SAFTE, it may be that larger studies are needed to observe further differences in participants with OSA.

Several factors should be kept in mind when considering the PVT findings in particular. Most important, these studies had small sample sizes and were not powered to detect a difference in PVT scores with solriamfetol treatment. In patients with OSA, PVT scores were previously found not to differ by apnea–hypopnea index severity [41], and have been shown to be associated with subjective (ESS) but not objective (MWT) sleepiness in patients with OSA [14]. Further, PVT performance has been found to be worse in patients with narcolepsy than in patients with insufficient sleep syndrome [16].

The real-world driving performance, as measured by SDLP (an established correlate of drug- and alcohol-induced motor vehicle crash risk [42]), was improved at 2 h post-dose in both studies [33, 34]. SAFTE-modeled healthy control task effectiveness was not examined in the context of real-world driving performance in healthy controls in this study or in previous studies. The SAFTE model was not designed as a performance predictor for SDLP. If in future studies investigators intend to use a biomathematical model in lieu of a healthy control population, power analyses should be performed to reflect PVT performance or similar model-tested performance outcomes a priori. The SAFTE model has not been optimized for the prediction of performance or risk in people with sleep disorders.

While the goal for the current analysis was to use SAFTE-FAST as a proxy for healthy controls, some lessons can be extrapolated to the application of biomathematical modeling to sleep disorders. For this analysis, the model used patient TIB and assumed excellent sleep quality, as the goal of the model was to serve as a proxy for healthy controls. Patients with sleep disorders often have sleep fragmentation (e.g., because of apneas, as in patients with OSA), which may not be apparent in actigraphy data and is not considered when computing TIB. SAFTE-FAST can use TST instead of TIB for a sleep input, and users can modify the sleep quality setting to reflect disrupted sleep. Using TST or adjusting the sleep quality of sleep events could help replicate the objective fragmentation of sleep seen in sleep disorders. One area of limitation is that the model assumes a normal relationship between sleep duration and changes in performance, but this relationship may be weaker in patients with narcolepsy [43]. The weakening of this relationship may explain why longer prior day’s sleep was not independently correlated with improved actual task effectiveness in either group. The biological underpinnings that could explain the breakdown between sleep and performance in narcolepsy need to be better understood before a biomathematical model can reasonably be developed.

This novel application of the SAFTE model in a clinical trial setting highlights the need for additional metrics and thereby informs future studies. The crossover design used here strengthens the validity of its comparisons by controlling for interindividual differences between treatment conditions. Indeed, although PVT and SAFTE-FAST provide useful safety benchmarks, neither PVT nor SAFTE-FAST task effectiveness scores are used in isolation as determinants of ability to drive or perform other tasks. Aviation and other industries want to limit the amount of time spent below 77% effectiveness, which is equivalent to a 0.05% blood alcohol concentration [44]. In the present analysis of participants with OSA or narcolepsy, both populations had actual task effectiveness levels above this cutoff based on the model as implemented; however, these findings should not be over-interpreted as an unequivocal statement of safety, especially given the extensive published literature demonstrating the increased risk of motor vehicle crashes in these populations [9, 10, 45,46,47,48].

There are several limitations to this study. First, the exclusion of participants who did not habitually sleep ≥ 6 h per night may limit the interpretation of the effect of solriamfetol on participants whose sleep patterns are severely disrupted by OSA or narcolepsy. Therefore, it is possible that a more broadly defined population of patients with OSA or narcolepsy with varied sleep tendencies would have yielded different results. Second, there was no washout period between treatments; thus, there may have been carryover effects of solriamfetol. However, testing did not occur until day 7 of each treatment period. Solriamfetol is rapidly absorbed (median Tmax: approx. 2 h), with a mean elimination half-life of approximately 6 h [49], which suggests that plasma levels would be negligible by day 7 after the final dose of solriamfetol. Thus, the study design effectively created a 7-day “washout” for participants who crossed over from solriamfetol to placebo. Further, a previous phase 2 trial in participants with narcolepsy did not suggest carryover effects when demonstrating the efficacy of solriamfetol using a crossover design with no washout between treatment periods [35]. Third, there are no published task effectiveness standards for patients with OSA or narcolepsy; thus, there is no basis in the literature for comparison of the present results. Fourth, actual healthy control data were not collected in this study to compare against SAFTE prediction of modeled healthy control task effectiveness, or to compare sleep quality against SAFTE assumptions about healthy control sleep history or PVT performance. Fifth, PVT was the only attention measure used in the analysis. Since PVT performance does not always predict impaired attention associated with sleep deprivation, even among healthy participants [18], inclusion of additional attention metrics may have provided additional context with respect to drug effects and performance. Finally, all participants received 300 mg/day of solriamfetol in the final 4 days of the active treatment period; therefore, any impact of solriamfetol on task effectiveness would reflect effects of the 300 mg/day dose. The maximum recommended dose of solriamfetol in the US and EU is 150 mg/day.

Conclusions

This study represents a novel application of the SAFTE-FAST biomathematical model as a substitute for healthy controls in a research investigation of patients with sleep disorders. The results provide valuable lessons that may help to optimize future studies that incorporate this model to ensure the applicability of their results. For example, future studies should examine the use of sleep measures (other than TIB) that reflect the objective fragmentation of sleep seen in sleep disorders. This analysis also provides additional context for how PVT performance during a clinical trial may differ from performance on real-world performance measures, such as lane drift.