Subjects and protocol
Twenty-two subjects (12 men and 10 women), aged 18–27 years, participated in this study. Recruitment was done through social media and posters at the Université du Québec en Outaouais in Gatineau (Québec, Canada), where the study was conducted. The protocol was approved by the ethics committee of the university in accordance with the Declaration of Helsinki. Exclusion criteria were the presence of a neurological, psychiatric, hormonal, or sleep disorder; taking psychotropic drugs or tricyclic birth control pills; a history of alcohol or drug abuse; the presence of an uncorrected vision disorder; and being a smoker. Also, only nonnappers who kept a regular sleep–wake schedule, regularly slept 7 to 9 h per night, and had a habitual bedtime between 10 PM and midnight and a habitual wake time between 6 AM and 8 AM were allowed to participate. People who had worked night shifts in the last three months or who had recently had jet lag (one week of recovery was required for each 1 h of jet lag) were also excluded.
All subjects were pretrained on a PVT-192 (PVT) and on a fourth-generation iPod touch (iOS 7.1) that had s2P installed (version 1.2; released on October 28, 2013). The training consisted of three practice sessions on each device to familiarize the subject with both the RT tests and the equipment. Subjects were then monitored using actigraphy and a sleep diary for 7 days before the TSD protocol was applied. During this period, all subjects were asked to maintain their regular sleep–wake habits and daily activities and to abstain from taking naps, doing intense physical exercise after 6 PM, and taking nonprescribed drugs (except Tylenol). Coffee consumption was restricted to a maximum of one coffee per day (ingested before noon), and alcohol was forbidden for three days before the TSD session.
On the seventh day, starting at 8 AM (on average, about an hour after waking up in the morning), subjects completed subjective measures of sleepiness (the Stanford Sleepiness Scale [SSS], the Karolinska Sleepiness Scale [KSS], and the visual analogue scale [VAS]) and both vigilance tests (s2P and PVT) in a counterbalanced design at every even hour until 8 PM at home. The testing sessions then continued at the sleep laboratory under close supervision from 10 PM until 6 PM the next day, for a total of about 35 h of continuous wakefulness. During each testing session, a 5-min break was scheduled between the two RT tests. Each subject had to fill out a testing diary in which they confirmed the time and the order of completion of both RT tests. During the night at the laboratory, subjects were free to engage in various activities, such as reading, playing board games, watching movies, or using the Internet. Meals and snacks were provided to control the consumption of stimulants (caffeine, sugars). A safe departure from the lab was scheduled with each subject (they either took the bus or were driven home by the experimenter).
Outcome measures
Sleep-2-Peak
Sleep-2-Peak is an app running on Apple (iPod Touch and iPhone) and Android (Blackberry, Samsung, etc.) mobile operating systems. In the present study, the s2P app (version 1.2; released on October 28, 2013) was installed on a fourth-generation iPod Touch (iOS 7.1) from Apple, with a screen size of 7.5 cm × 5.0 cm. The app is designed to track changes in RTs over the course of the day and can be used to relate these changes to the components of sleep (Proactive Life LLC, 2012). The user can retrieve all of the data and graphs obtained with the app by e-mail. The task involves tapping on the screen as quickly as possible with the dominant index finger when the stimulus, a sun, appears on the device screen (see Fig. 1). The sun’s size is 34 mm in diameter, including 3-mm sunrays (28-mm diameter without sunrays). The sun is centered horizontally and positioned one third of the vertical screen dimension from the top of the screen. The stimulus size and location are always the same, independent of the mobile device or screen size used. The specific instructions were “Hover the index finger of your dominant hand close (1 cm) to the screen. Tap as quickly as possible on the sun when it appears.” Subjects were also asked to hold the iPod in their nondominant hand at lower-abdomen level and to sit upright in a chair in a quiet room without distractions. They were asked to keep their arms free from the armrests and to keep both feet on the ground. No immediate feedback on the subject’s RTs was presented after each trial, but subjects viewed their average RT at the end of each session.
S2P offers the flexibility of adjusting the duration of the testing session from 10 s (one trial) to several minutes: up to 60 trials (10 min) on the Android version, and up to 999 trials (166.5 min) on iOS. Because a 3-min version of the PVT, instead of the classic 10-min version, has been shown to be a promising tool to differentiate alert from sleepy individuals (Basner & Dinges, 2011), 3-min versions of both s2P and the PVT were selected for the present study. For software and programming reasons, the interstimulus intervals (ISIs) in s2P are set randomly from 4 to 15 s, which differs from the PVT (with ISIs of 2 to 10 s). Data on the touch responsiveness delay are integrated in the app so that the RTs obtained are accurate.
All standard outcome variables were extracted from s2P in order to compare them to the classic PVT outcome variables (Basner & Dinges, 2011): the numbers of lapses and false starts, the mean RT, the reciprocal response time (RRT = 1/RT), the 10 % fastest RT, and the 10 % slowest RRT. Only RTs over 100 ms were entered in the analyses; responses faster than 100 ms were considered false starts. RTs longer than 500 ms were counted as lapses.
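As an illustration of how these outcome variables can be derived from the raw RTs of a single test bout, a minimal Python sketch follows. It is not the scoring code of s2P or the PVT-192. The function name, the expression of the RRT per second (1000/RT in ms), the treatment of an RT of exactly 100 ms as valid, and the rounding used to size the 10 % tails are all assumptions made for this example.

```python
from statistics import mean

FALSE_START_MS = 100.0  # responses faster than this are false starts
LAPSE_MS = 500.0        # responses slower than this are lapses


def pvt_outcomes(rts_ms):
    """Summarize one test bout from a list of raw RTs in milliseconds."""
    false_starts = sum(1 for rt in rts_ms if rt < FALSE_START_MS)
    # Valid RTs, sorted from fastest to slowest (assumption: 100 ms is valid).
    valid = sorted(rt for rt in rts_ms if rt >= FALSE_START_MS)
    if not valid:
        raise ValueError("no valid responses in this bout")
    lapses = sum(1 for rt in valid if rt > LAPSE_MS)
    # Reciprocal response time, expressed per second (assumption).
    rrts = [1000.0 / rt for rt in valid]
    # Number of trials in each 10 % tail (assumption: at least one trial).
    k = max(1, round(len(valid) * 0.10))
    return {
        "false_starts": false_starts,
        "lapses": lapses,
        "mean_rt": mean(valid),
        "mean_rrt": mean(rrts),
        "fastest_10pct_rt": mean(valid[:k]),
        # The slowest responses sit at the end of the sorted list.
        "slowest_10pct_rrt": mean(rrts[-k:]),
    }


# Hypothetical bout: one false start (90 ms) and one lapse (600 ms).
example = pvt_outcomes([90, 200, 220, 250, 260, 280, 300, 320, 350, 400, 600])
```

The same function can be applied to bouts from either device, so that both tests are scored with identical thresholds.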
Psychomotor vigilance test
The PVT is currently the gold standard for objectively measuring alertness and was used in this study as the main tool for validation of the s2P app. The PVT was performed on the PVT-192 device (Ambulatory Monitoring Inc., Ardsley, NY) in a 3-min version. The subjects were told to keep their dominant index finger on the button (a 1-cm black square on the lower part of the device) and to press as quickly as possible when a red stimulus counter appeared on the small screen (located on the upper part of the device). Subjects were asked to hold the device in their nondominant hand at lower-abdomen level, and to sit upright in a chair in a quiet room without distractions. They were asked to keep their arms free from the armrests and to keep both feet on the ground. Pressing the button stopped the counter, which then displayed the subject’s RT for 1 s. The ISIs varied from 2 to 10 s. The same variables were calculated as those extracted for s2P: the numbers of lapses and false starts, the mean RT, the RRT, the 10 % fastest RT, and the 10 % slowest RRT. The same thresholds were set for false starts (<100 ms) and lapses (>500 ms).
Subjective measures of alertness
In addition to comparing s2P’s performance to outcomes on the classic PVT, the changes in RTs on s2P following sleep loss were compared to various subjective measures of sleepiness. To do this, three widely used questionnaires were administered along with the PVT and s2P: the SSS, the KSS, and a VAS.
The SSS (Hoddes, Zarcone, Smythe, Phillips, & Dement, 1973) is a validated test of subjective sleepiness with a 7-statement scale ranging from 1, “feeling active, vital, alert, or wide awake,” to 7, “no longer fighting sleep, sleep onset soon; having dream-like thoughts.” The subject was told to choose the value corresponding to the statement that best fit how they felt at the current moment; thus, the dependent variable varied from 1 to 7 (MacLean, Fekken, Saskin, & Knowles, 1992). The KSS (Åkerstedt & Gillberg, 1990) is also a validated test measuring the current degree of sleepiness, but it uses a 9-point scale (from 1, “very alert,” to 9, “very sleepy, great effort to keep awake, fighting sleep”). States 1, 3, 5, 7, and 9 are labeled, and the intermediate states are only numbered. The subject chooses the number that best fits his or her level of sleepiness (Åkerstedt & Gillberg, 1990). A VAS is often used to assess the current degree of sleepiness. On a straight 100-mm line, ranging from “not sleepy at all” to “extremely sleepy,” the individual marked their current level of sleepiness with a dash. The distance in millimeters between the beginning of the scale and the location of the drawn mark reflected the level of sleepiness and was considered the dependent variable (Herbert, Johns, & Doré, 1976).
Data analysis
The data analysis and statistical procedure presented here are based on Basner, Mollicone, and Dinges (2011). All analyses were generated using SPSS version 21 and Mathematica version 8. For various reasons (technical problems, late morning awakening, personal obligation, etc.), test bouts were missing for some of the subjects. A total of 333 pairs of PVT and s2P test bouts, out of 396 possible, were included in the final analysis. To take into account multiple comparisons, a conservative significance level of p = .001 was used unless otherwise specified.
To verify whether the PVT and s2P had similar sensitivities to the TSD protocol (from 8 AM on the first day to 6 PM the next day), the strength of the relationship between the two devices was determined. Thus, Pearson product-moment correlations between each device’s outcomes were conducted (mean performance score from 8 AM to 6 PM for each dependent variable). A significant positive correlation would indicate that both tests measured substantially the same construct, whereas a nonsignificant positive or a negative correlation would indicate a discrepancy between the measures.
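The correlation described above can be sketched in a few lines of Python. This is only an illustration of the Pearson product-moment formula, not the SPSS procedure used in the study; the function name and the per-subject values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (n >= 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-subject mean RTs (ms) on each device, for illustration only.
pvt_mean_rt = [255.0, 270.0, 310.0, 290.0, 265.0]
s2p_mean_rt = [330.0, 352.0, 401.0, 371.0, 340.0]
r = pearson_r(pvt_mean_rt, s2p_mean_rt)
```

A value of r close to 1 for a given outcome would be read, as stated above, as evidence that the two devices track the same construct, even if their absolute RTs differ.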
The main concern in validation studies for PVT-type tasks is to determine whether the new test is as sensitive in detecting cognitive decline as the original PVT (Basner & Dinges, 2012; Basner et al., 2011; Lamond et al., 2008; Loh et al., 2004; Thorne et al., 2005). To evaluate the ability of s2P to differentiate between the sleep-deprived and alert states of the subjects, the test bouts from 8 AM to 10 PM were averaged to reflect the “alert” state, whereas the test bouts from 12 AM to 6 PM were averaged to reflect “sleepiness.” A similar cutoff had been used in previous studies (Basner et al., 2011) and is based on research showing that performance on the PVT usually begins to decline after 16 h awake (Van Dongen, Maislin, Mullington, & Dinges, 2003).
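The split into “alert” and “sleepy” bouts can be made explicit with a short Python sketch. Bouts are indexed here by hours since midnight of the first day, a convention chosen for this example. Skipping missing bouts when averaging is also an assumption, because the handling of missing bouts within each state is not detailed above.

```python
N_SUBJECTS = 22

# Test bouts every 2 h, as hours since midnight of the first day:
# 8 (8 AM, day 1) ... 22 (10 PM), then 24 (12 AM, day 2) ... 42 (6 PM, day 2).
BOUT_HOURS = list(range(8, 43, 2))
ALERT_HOURS = [h for h in BOUT_HOURS if h <= 22]   # 8 AM to 10 PM
SLEEPY_HOURS = [h for h in BOUT_HOURS if h >= 24]  # 12 AM to 6 PM next day


def state_means(bouts):
    """Average one subject's scores within each state.

    `bouts` maps an hour from BOUT_HOURS to a score; a missing bout is
    simply absent from the mapping (assumption: it is skipped).
    """
    alert = [bouts[h] for h in ALERT_HOURS if h in bouts]
    sleepy = [bouts[h] for h in SLEEPY_HOURS if h in bouts]
    if not alert or not sleepy:
        raise ValueError("need at least one bout in each state")
    return sum(alert) / len(alert), sum(sleepy) / len(sleepy)


# 18 bouts per subject for 22 subjects gives the 396 possible test bouts.
possible_bouts = N_SUBJECTS * len(BOUT_HOURS)

# Hypothetical subject with a few completed bouts (mean RT in ms).
alert_mean, sleepy_mean = state_means({8: 250, 10: 260, 24: 300, 26: 320, 28: 340})
```

With this schedule, 8 bouts fall in the alert state and 10 in the sleepy state.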
We used a one-sample t test to determine whether there was a significant difference between the sleep-deprived state and the non-sleep-deprived state, and then calculated the effect sizes for those analyses. The effect size can be interpreted as small (>0.2 and <0.5), medium (≥0.5 and <0.8), and large (≥0.8) according to Cohen (1988). As a measure of effect size precision, we calculated 95 % nonparametric bootstrap confidence intervals based on 1,000,000 samples (Efron & Tibshirani, 1993).
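A minimal Python sketch of this step is given below, under explicit assumptions. The effect size is taken here as the mean within-subject difference divided by the standard deviation of the differences, and the confidence interval is a percentile bootstrap over subjects; neither choice is specified above. The sketch defaults to 2,000 resamples for speed rather than the 1,000,000 used in the study, and all data are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def paired_effect_size(alert, sleepy):
    """Mean within-subject difference divided by the SD of the differences."""
    diffs = [s - a for a, s in zip(alert, sleepy)]
    return mean(diffs) / stdev(diffs)


def one_sample_t(alert, sleepy):
    """t statistic testing whether the mean difference departs from zero."""
    diffs = [s - a for a, s in zip(alert, sleepy)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def bootstrap_ci(alert, sleepy, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the effect size, resampling subjects."""
    rng = random.Random(seed)
    pairs = list(zip(alert, sleepy))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        diffs = [s - a for a, s in sample]
        sd = stdev(diffs)
        if sd == 0:
            continue  # degenerate resample: effect size undefined
        stats.append(mean(diffs) / sd)
    stats.sort()
    lo_idx = int((1 - level) / 2 * len(stats))
    hi_idx = int((1 + level) / 2 * len(stats)) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical per-subject mean RTs (ms) in each state.
alert = [250, 260, 240, 255, 245, 265]
sleepy = [300, 290, 310, 280, 320, 295]
d = paired_effect_size(alert, sleepy)
t = one_sample_t(alert, sleepy)
ci_low, ci_high = bootstrap_ci(alert, sleepy)
```

With this definition the t statistic equals the effect size multiplied by the square root of the number of subjects, which is why the two are reported together.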
To verify whether both devices were able to track fatigue-related changes during the TSD, a repeated measures analysis of variance (ANOVA) of Device (s2P vs. PVT) × Test Time (test bouts from 8 AM to 6 PM the next day) was calculated on all outcome variables. To eliminate possible systematic differences in performance between the devices due to confounding factors (hardware, operating mode, etc.), the mean RTs on each device were also presented centered on the average performance in the “alert” state. The centering method had previously been used in other validation studies (Basner et al., 2011; Lamond et al., 2008; Roach et al., 2006). In this case, we again calculated, for each test bout, 95 % nonparametric bootstrap confidence intervals based on 1,000,000 samples. The outcomes of the two devices were then compared using t tests for paired samples for every test bout from 12 AM to 6 PM. To account for multiple comparisons, we adjusted the p values using the false discovery rate method (Curran-Everett, 2000).
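The centering and the p value adjustment can be illustrated with the following Python sketch. It assumes that the false discovery rate method corresponds to the Benjamini-Hochberg step-up procedure, which is an interpretation made for this example; the function names and the data are hypothetical.

```python
def center_on_alert(bout_means, alert_mean):
    """Express each bout mean as a deviation from the alert-state mean."""
    return [m - alert_mean for m in bout_means]


def fdr_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices of the p values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical mean RTs (ms) for three bouts, centered on an alert mean of 250 ms.
centered = center_on_alert([250, 300, 350], 250)

# Hypothetical p values from four paired t tests.
adjusted = fdr_adjust([0.01, 0.04, 0.03, 0.005])
```

Centering removes a constant offset between devices, so the subsequent paired comparisons reflect differences in the change from the alert state rather than differences in absolute RT.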
To verify that the performance on s2P varied in accordance with changes in subjective measures of sleepiness, Pearson product-moment correlations were conducted between each outcome variable on both devices and each dependent variable of the subjective measures (SSS total score, KSS total score, VAS score).