Introduction

Approximately one third of the general adult population suffers from insomnia, reporting difficulties in initiating sleep or in maintaining sleep, and feelings of nonrestorative sleep (Ohayon 2002; Morin et al. 2006). Although non-pharmacological strategies, such as cognitive behavioral therapy, are increasingly being implemented in the treatment of insomnia, pharmacotherapy is still the most frequently used treatment (Morin et al. 2007). The primary choice of sleep-enhancing medication is sedative hypnotics, such as benzodiazepines and the newer benzodiazepine receptor agonists zopiclone, zolpidem, and zaleplon (Verster et al. 2007; Lader 2011).

Ideally, a hypnotic should improve sleep, but be free from residual sedative effects after arising. It is known, however, that a number of hypnotics that are currently prescribed can produce next-day residual sedation, depending on type of hypnotic, dose, time after administration, and frequency of dosing (Vermeeren 2004). This may lead to impairment of a wide range of cognitive abilities and can have serious consequences for daily activities, such as driving a car. In epidemiological studies, for example, it has been shown that use of hypnotics is related to an increased risk of becoming involved in traffic and occupational accidents (Ray et al. 1992; Hemmelgarn et al. 1997; Barbone et al. 1998; Neutel 1998; Dubois et al. 2008; Dassanayake et al. 2011). Experimental studies assessing actual driving performance after administration of hypnotics confirm these data by showing residual driving impairment in the morning after dosing (Volkerts and O’Hanlon 1988; Vermeeren 1995; Vermeeren et al. 1998, 2002; Verster et al. 2002; Leufkens et al. 2009).

In order to select the safest alternative among the available hypnotics, patients and prescribing physicians should be informed about the possible impairing effects of hypnotics. To date, information on the residual effects on driving performance is mainly derived from experimental studies conducted with young, healthy, hypnotic-naïve volunteers after a single night of treatment. Investigating the residual effects in this population leaves two important issues unanswered. First, the effects of hypnotics may interact with insomnia in such a way that they differ between insomniacs and healthy, young volunteers. Second, the majority of insomniacs use hypnotics chronically, which may induce tolerance to their residual effects. As a consequence, the impairing effects may be less pronounced in insomniacs than in healthy volunteers.

Untreated insomniacs report reduced performance in daily life routines, which may have serious detrimental consequences (Chambers and Keller 1993; Varkevisser and Kerkhof 2005; Sarsours et al. 2011; Roth et al. 2011). It can, therefore, be expected that daytime performance after hypnotic-induced sleep is improved in insomniacs. Healthy volunteers, on the other hand, cannot benefit from hypnotic treatment as their performance is already optimal. Impairment observed due to hypnotic sedation in healthy volunteers may therefore be an overestimation of the net effects of hypnotics in patients. However, most experimental studies examining cognitive and psychomotor performance showed that daytime functioning in untreated insomnia patients was not (Riedel and Lichstein 2000; Fulda and Schulz 2001) or only minimally impaired (Shekleton et al. 2010; Bastien 2011; Fortier-Brochu et al. 2012).

Whereas experimental studies have failed to demonstrate impaired daytime functioning, it has been shown in a cross-sectional epidemiological study that difficulties in sleeping are associated with an increased risk of fatal occupational accidents (Akerstedt et al. 2002). More recently, in a study investigating the relationship between health-related complaints and crash involvement risk, it was found that sleep disturbances were associated with an elevated risk of becoming involved in car accidents (Sagberg 2006). The differences in findings of experimental and epidemiological research may be explained by possible limitations related to the type of research. Epidemiological studies may not have been able to control for other factors contributing to impaired daytime functioning. For example, insomnia is strongly associated with disorders such as depression and anxiety (Stewart et al. 2006). These disorders have been shown to affect daily functions as well (Kindermann and Brown 1997; Kizilbash et al. 2002; Wingen et al. 2006) and may have influenced the results. The absence of objective impairment in experimental studies is mainly explained by a lack of laboratory tests demanding high effort (Vignola et al. 2000; Altena et al. 2008; Edinger et al. 2008). Most performance tasks are of short duration, and insomniacs may relatively easily be able to maintain high-level performance during testing (Varkevisser and Kerkhof 2005).

The second issue that has not yet been clarified in experimental designs using healthy young volunteers is whether residual effects of hypnotics are still present in insomniacs who chronically use hypnotics. Although it is not recommended to use hypnotics for periods longer than 4 weeks, a majority of insomnia patients are treated for prolonged periods (Ashton 2005; Paterniti et al. 2002). This may induce the development of tolerance to the residual effects of hypnotics (Bateson 2002). The impairing effects in hypnotic-naïve volunteers might therefore be larger than the actual effects on performance of insomniacs chronically using hypnotics. For example, a recent experimental study demonstrated that performance of chronic users of hypnotics was comparable to that of untreated insomnia patients and self-defined good sleepers (Vignola et al. 2000).

In summary, it is unclear whether driving performance of chronic users of hypnotics is impaired. The primary objective of the present study was therefore to compare actual driving performance and driving-related skills of chronic users of hypnotics to good sleepers. In case no difference in performance is found between these groups, this might be due to tolerance to the adverse effects of hypnotics, or the therapeutic effects of hypnotics on insomnia and performance, or both. To determine whether insomnia itself has adverse effects on driving performance, the secondary objective was to compare driving and driving-related skills between insomnia patients who do not or infrequently use hypnotics and good sleepers.

Methods

Subjects

A total of 42 insomnia patients and 21 healthy controls were recruited through a network of local general practitioners in the region of Maastricht, The Netherlands (Regionaal Netwerk Huisartsen), and by advertisement in local newspapers. All participants had to meet the following inclusion criteria: aged between 50 and 75 years; possession of a valid driving license for at least 3 years; average driving experience of at least 3,000 km per year over the last 3 years; good health based on a pre-study physical examination, medical history, vital signs, electrocardiogram, blood biochemistry, hematology, serology, and urinalysis. Exclusion criteria were history of drug or alcohol abuse; presence of a significant medical, neurological, or psychiatric disorder, or of a sleep disorder other than insomnia; chronic use of medication that affects driving performance, except hypnotics; drinking more than 6 cups of coffee per day; drinking more than 21 units of alcohol per week; smoking more than 10 cigarettes per day; and body mass index outside the range of 19 to 30 kg/m².

Insomnia patients had to meet the inclusion criteria for primary insomnia according to DSM-IV (American Psychiatric Association 1994): (1) subjective complaints of insomnia, defined as difficulties initiating sleep (sleep latency >30 min) and/or maintaining sleep (awakenings >30 min); (2) duration of more than 1 month; (3) the sleep disturbance causes clinically significant distress or impairment; (4) insomnia does not occur exclusively during the course of a mental disorder; and (5) insomnia is not due to another medical or sleep disorder or effects of medication or drug abuse.

Volunteers were screened by a telephone interview, questionnaires, and a physical examination. Insomnia patients’ sleep complaints were evaluated by a trained psychologist using Dutch versions of the Pittsburgh Sleep Quality Index (PSQI) (Buysse et al. 1989), the Sleep Wake Experience List (SWEL) (van Diest et al. 1989), and the Groningen Sleep Quality Scale (GSQS) (Mulder-Hajonides van der Meulen 1981). The GSQS provides a score between 0 and 14 representing a number of sleep complaints. In addition, subjects completed a sleep log for 14 days, providing daily information about estimated sleep and wake times, and sleep quality using the GSQS (Mulder-Hajonides van der Meulen 1981). Major psychopathology was screened using the Symptom Checklist 90 Revised (SCL90-R) (Derogatis 1983), the Beck Depression Inventory (BDI) (Beck et al. 1961), the State-Trait Anxiety Inventory (STAI) (Spielberger et al. 1983), and the Multidimensional Fatigue Inventory (Smets et al. 1995).

Insomnia patients were assigned to one of two groups depending on the frequency and duration of their use of hypnotic drugs (benzodiazepine, zopiclone, or zolpidem). Patients were assigned to a “frequent users” group when they used a hypnotic for at least four nights per week and longer than 3 months (n = 22). Patients not using hypnotics or using hypnotics less than or equal to 3 days per week were assigned to the “infrequent users” group (n = 20).
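The assignment rule above can be written as a small decision function. This is only a sketch of the stated cut-offs; the function name and argument names are hypothetical, and the text does not say how a patient using hypnotics on four or more nights per week for 3 months or less would be handled, so that case is left unclassified here.

```python
from typing import Optional


def assign_group(nights_per_week: float, months_of_use: float) -> Optional[str]:
    """Classify an insomnia patient by hypnotic use (hypothetical helper).

    Cut-offs follow the text: at least four nights per week for longer
    than 3 months means "frequent users"; no use, or use on three nights
    per week or fewer, means "infrequent users".
    """
    if nights_per_week >= 4 and months_of_use > 3:
        return "frequent users"
    if nights_per_week <= 3:
        return "infrequent users"
    # Not covered by the stated criteria (for example, four or more nights
    # per week but for 3 months or less): leave unclassified.
    return None
```

For example, a patient taking a hypnotic six nights per week for several years falls in the "frequent users" group, whereas a patient who has never used a hypnotic falls in the "infrequent users" group.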

The study was conducted in accordance with the code of ethics on human experimentation established by the World Medical Association’s Declaration of Helsinki (1964) and amended in Edinburgh (2000). The protocol was approved by the medical ethics committee of Maastricht University and University Hospital of Maastricht. The aims, methods, and potential hazards of the study were explained to the subjects, who signed a written informed consent form prior to any study-related assessments.

Assessments

Driving performance

Driving performance was assessed using two standardized driving tests developed to measure different aspects of driving performance. The primary test is the Highway Driving Test (O’Hanlon 1984), which measures road tracking performance. Performance in this test is mainly determined by the delay between sensory information, execution of the motor reaction, and the vehicle’s dynamic response. In this test, subjects operate a specially instrumented vehicle over a 100-km (62-mi) primary highway circuit, accompanied by a licensed driving instructor having access to dual controls. The subjects’ task is to maintain a constant speed of 95 km/h (59 mi/h) and a steady lateral position between the delineated boundaries of the slower traffic lane. The vehicle speed and lateral position are continuously recorded. These signals are edited offline to remove data recorded during overtaking maneuvers or disturbances caused by roadway or traffic situations. The remaining data are then used to calculate means and standard deviations of lateral position and speed. Standard deviation of lateral position (SDLP, in centimeters) is the primary outcome variable. SDLP is a measure of road tracking error or “weaving.” The test duration is approximately 1 h.
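As an illustration of how the outcome variable could be derived from the edited recordings, the following minimal sketch computes the mean and standard deviation of lateral position after dropping flagged samples. The function name, the per-sample exclusion flags, and the choice of the sample (n − 1) standard deviation are assumptions made for this example; the text does not specify them.

```python
from statistics import mean, stdev


def summarize_lateral_position(lateral_position_cm, excluded):
    """Return mean lateral position and SDLP (both in cm).

    `lateral_position_cm` holds the recorded samples; `excluded` holds one
    boolean per sample, True where the sample was recorded during an
    overtaking maneuver or another disturbance and must be dropped.
    """
    if len(lateral_position_cm) != len(excluded):
        raise ValueError("one exclusion flag is needed per sample")
    kept = [x for x, flag in zip(lateral_position_cm, excluded) if not flag]
    if len(kept) < 2:
        raise ValueError("at least two valid samples are required")
    # Sample standard deviation (n - 1 denominator); an assumption here.
    return {"mean_cm": mean(kept), "sdlp_cm": stdev(kept)}
```

A larger `sdlp_cm` value corresponds to more "weaving" within the lane.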

SDLP is an extremely reliable index of individual driving performance and has proven sensitive to many sedating drugs (O’Hanlon and Ramaekers 1995; O’Hanlon et al. 1995; Ramaekers 2003; Vermeeren 2004; Verster 2004). Test–retest reliability ranges between 0.7 and 0.9 in individual studies and is on average 0.85. The test was calibrated for the effects of alcohol in a closed circuit study wherein 24 social drinkers were tested sober and after controlled drinking to raise blood alcohol concentrations in steps of 0.3 g/L to a maximum of 1.2 g/L (Louwerens et al. 1987). In line with the relationship between blood alcohol concentration (BAC) and accident risk as estimated in a large epidemiological study by Borkenstein (1974), the relationship between BAC and SDLP was shown to be an exponential function. Based on this relationship, BACs of 0.5, 0.8, and 1.0 g/L were associated with mean changes in SDLP of 2.4, 4.2, and 5.1 cm, respectively. The increase in SDLP caused by 0.5 g/L alcohol is considered clinically meaningful since accident risk has been demonstrated to increase significantly above this BAC in large epidemiological studies (Borkenstein 1974; Krüger et al. 1994). The average SDLP score of 219 healthy volunteers was 18.2 cm (SD = 3.1 cm) after placebo treatments in various recent studies (data on file). Based on these data, sample sizes of 20 subjects per group ensure 80% power to detect a clinically meaningful difference between groups (with an alpha level of 0.05) in the primary variable of this study, SDLP.

The Car-Following Test (Brookhuis et al. 1994; Ramaekers and O’Hanlon 1994) measures changes in controlled information processing such as selective attention, stimulus interpretation and decision making, and speed of an adaptive motor response to events which are common in driving. In the test, two vehicles travel in tandem over a two-lane, undivided, secondary highway at 70 km/h (44 mi/h). An investigator drives the leading car and the subject, in the second car, is instructed to follow at a distance between 25 and 35 m. Subjects are further instructed to attend constantly to the leading car since it may slow down or speed up at unpredictable times. They are required to follow the leading car’s speed changes, i.e., to maintain the initial headway by matching the velocity of their car to that of the leading car. During the test, the speed of the leading car is automatically controlled by a modified “cruise control” system. At the beginning, it is set to maintain a constant speed of 70 km/h and, by activating a microprocessor, the investigator can start sinusoidal speed changes reaching an amplitude of −10 km/h and returning to the starting level within 50 s. The maneuver is repeated six times. The leading car’s speed and signals indicating the beginning of the maneuver are transmitted via telemetry to be recorded in the following vehicle together with the following vehicle’s speed. Phase delay, converted to a measure of the subject’s average reaction time to the movement of the leading vehicle (RT, in seconds), is taken as the primary dependent variable in this test. A secondary measure in the Car-Following Test is Gain. It represents an amplification factor between the signals of the two cars. This will be larger than 1 when the subject overreacts to speed adaptations of the leading car. Test duration is 25 min.
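The text does not describe how phase delay and Gain are computed from the two speed signals. One plausible approach, sketched here purely as an assumption, is to project both signals onto a sinusoid at the maneuver frequency (one cycle per 50 s) and compare the resulting complex amplitudes: the phase difference divided by the angular frequency gives a delay in seconds, and the amplitude ratio gives a gain. All names are hypothetical, and the sketch assumes that the analyzed segment spans a whole number of maneuver periods.

```python
import cmath
import math


def fourier_coefficient(signal, dt, period_s):
    """Complex amplitude of `signal` at the frequency 1 / period_s."""
    omega = 2 * math.pi / period_s
    centre = sum(signal) / len(signal)  # remove the constant cruising speed
    return sum((v - centre) * cmath.exp(-1j * omega * k * dt)
               for k, v in enumerate(signal))


def car_following_measures(lead_speed, follow_speed, dt, period_s=50.0):
    """Return (delay in seconds, gain) of the follower relative to the leader.

    Both signals must be sampled every `dt` seconds over a whole number
    of maneuver periods for the projection to be exact.
    """
    lead = fourier_coefficient(lead_speed, dt, period_s)
    follow = fourier_coefficient(follow_speed, dt, period_s)
    if lead == 0:
        raise ValueError("leading car shows no speed change at this frequency")
    ratio = follow / lead
    omega = 2 * math.pi / period_s
    delay_s = -cmath.phase(ratio) / omega  # a lag appears as a negative phase
    gain = abs(ratio)                      # > 1 means the follower overreacts
    return delay_s, gain
```

With a synthetic follower that reproduces the leader's 50-s speed change 2 s late and 20 % too strongly, this sketch returns a delay of about 2 s and a gain of about 1.2.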

Cognitive and psychomotor performance

Cognitive and psychomotor performance was assessed by tests for word learning, digit span, tracking, divided attention, sustained attention, and inhibitory control. Tests were selected based on their sensitivity to residual sedating effects of hypnotics or sleep disturbances, and their relationship to driving performance (Vermeeren 1995; Vermeeren et al. 1998, 2002; Fulda and Schulz 2001; Verster et al. 2002; Vermeeren 2004; Leufkens et al. 2009; Leufkens and Vermeeren 2009).

The Word Learning Test (Rey 1964) is a verbal memory test for the assessment of immediate recall, delayed recall, and recognition performance. Fifteen monosyllabic nouns are presented, and at the end of the sequence, the subject is asked to recall as many words as possible. This procedure is repeated five times, and after a delay of at least 30 min, the subject is again required to recall as many words as possible. At this trial, the nouns are not presented. Finally, a sequence of 30 monosyllabic nouns is presented, containing 15 nouns from the original set and 15 new nouns in random order. The subject has to indicate whether each noun comes from the original set or is new. The main performance parameters are Immediate Total Recall Score (number of immediately correctly recalled words), Delayed Recall Score (number of correctly recalled words after a 30-min delay), Recognition Score (number of correctly recognized words), and Recognition Response Time (in milliseconds).

The Digit Span Forward and Backward is a subtest of the Wechsler Adult Intelligence Scale-Revised and measures the storage capacity of an individual’s working memory (Wechsler 1981). In this test, subjects are asked to repeat orally presented digits with increasing sequence length, either in forward or reverse order. There are two trials at each series length, and the test continues until both trials of a series length are failed. One point is awarded for each correct trial resulting in a Digit Span Forward Score and a Digit Span Backward Score.
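The scoring and stopping rule described above can be sketched as follows. The function name and the input layout (one pair of pass/fail results per sequence length, in increasing order of length) are assumptions made for illustration.

```python
def digit_span_score(results_by_length):
    """Score one Digit Span condition (forward or backward).

    `results_by_length` is a list of (first_trial_correct,
    second_trial_correct) pairs ordered by increasing sequence length.
    One point is awarded per correct trial; testing stops after the first
    length at which both trials are failed, so later entries are ignored.
    """
    score = 0
    for first, second in results_by_length:
        score += int(bool(first)) + int(bool(second))
        if not first and not second:
            break  # both trials of this length failed: test ends
    return score
```

The same function would be applied once to the forward trials and once to the backward trials to obtain the two scores.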

The Critical Tracking Test (Jex et al. 1966) measures the ability to control an unstable signal in a tracking task. The signal deviates horizontally from a midpoint with increasing velocity, and the subject has to compensate for this deviation by moving a joystick in the opposite direction. The velocity (lambda, in radians/second) at which the subject loses control is measured. The test includes five trials, of which the lowest and the highest scores are discarded. The performance parameter is the average lambda (radians/second) of the remaining three scores, reflecting the critical tracking velocity.
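The trimming rule for the five trials is simple enough to state as code. The sketch below uses a hypothetical function name and assumes exactly five lambda values are supplied.

```python
def critical_tracking_score(lambdas):
    """Average lambda (rad/s) after discarding the lowest and highest trial."""
    if len(lambdas) != 5:
        raise ValueError("the test consists of exactly five trials")
    middle_three = sorted(lambdas)[1:-1]  # drop the minimum and the maximum
    return sum(middle_three) / len(middle_three)
```

For trial scores of 3.1, 4.0, 3.6, 2.2, and 3.8 rad/s, the values 2.2 and 4.0 are discarded and the score is the mean of 3.1, 3.6, and 3.8, i.e., 3.5 rad/s.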

The Divided Attention Task (Moskowitz 1973) measures the ability to divide attention between two simultaneously performed tasks. The first task is to perform the tracking test at a fixed level of difficulty, with velocity set at 50% of the maximum score obtained after extensive training on the Critical Tracking Test. In the other task, the subject has to monitor 24 single digits that are presented in the four corners of the screen. The digits change asynchronously at 5-s intervals. The subjects are instructed to remove their foot from a pedal as rapidly as possible whenever the digit “2” appears. This signal occurs twice at every location, in random order, at intervals of 5 to 25 s.

In the Stop Signal Task (Logan et al. 1984), the concept of inhibitory control is defined as the ability to stop a pending thought or action and to begin another. The paradigm consists of two concurrent tasks, i.e., a go task (primary task) and a stop task (secondary task). The go signals (primary task stimuli) are two letters (“X” or “O”) presented one at a time in the center of a computer screen. Subjects are required to respond to each letter as quickly as possible by pressing one of two response buttons. Occasionally, a stop signal (secondary task stimulus) occurs during the test. Subjects are required to withhold any response when a stop signal is presented. The stop signal consists of an auditory cue, i.e., a 1,000-Hz tone, that is presented for 100 ms. The interval at which the stop signal is presented depends on the subject’s own successful and unsuccessful inhibitions. By continuously monitoring the subject’s responses, the Stop Signal Delay is adjusted so that the probability of responding [p(respond|signal)] equals 0.50. The stop signal reaction time is then estimated by determining the mean of the inhibition function and subtracting it from the mean go RT. Task duration is 14 min.
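The adaptive procedure and the resulting estimate can be sketched as below. Several details are assumptions rather than facts from the text: the direction and size of the delay adjustment (here a common staircase that lengthens the delay by 50 ms after a successful stop and shortens it after a failed one), and the use of the mean Stop Signal Delay as the mean of the inhibition function. Function names are hypothetical.

```python
from statistics import mean


def next_stop_signal_delay(current_ms, inhibited, step_ms=50, floor_ms=0):
    """One staircase step toward p(respond|signal) = 0.50.

    A successful inhibition makes the next stop trial harder (longer
    delay); a failed inhibition makes it easier (shorter delay). The
    50-ms step is an assumed value.
    """
    if inhibited:
        return current_ms + step_ms
    return max(floor_ms, current_ms - step_ms)


def stop_signal_reaction_time(go_rts_ms, stop_signal_delays_ms):
    """Estimate SSRT as mean go RT minus mean Stop Signal Delay.

    This treats the mean delay reached by the staircase as the mean of
    the inhibition function, which is an assumption of this sketch.
    """
    return mean(go_rts_ms) - mean(stop_signal_delays_ms)
```

With go RTs averaging 500 ms and stop signal delays averaging 250 ms, the estimated stop signal reaction time is 250 ms.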

The Psychomotor Vigilance Task (Dinges and Powell 1985) is based on a simple visual RT test. Subjects are required to respond to a visual stimulus presented at variable intervals (2,000 to 10,000 ms) by pressing a button with the dominant hand. The visual stimulus is a counter turning on and incrementing from 0 to 60 s at 1-ms intervals. In response to the subject’s button press, the counter display stops incrementing, allowing the subject 1 s to read the RT before the counter restarts. If a response has not been made in 60 s, the clock resets and the counter restarts. The median reaction time and the number of lapses (i.e., response times >500 ms) were used as the main performance parameters. The test duration is 10 min.
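The two performance parameters follow directly from the list of response times. The sketch below uses hypothetical names; only the 500-ms lapse criterion comes from the text.

```python
from statistics import median


def pvt_summary(reaction_times_ms, lapse_threshold_ms=500):
    """Return the median RT and the number of lapses (RT > threshold)."""
    lapses = sum(1 for rt in reaction_times_ms if rt > lapse_threshold_ms)
    return {"median_rt_ms": median(reaction_times_ms), "lapses": lapses}
```

A response of exactly 500 ms is not counted as a lapse under this definition, since the criterion is stated as strictly greater than 500 ms.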

Polysomnography

On nights before testing, sleep quality and duration were measured by polysomnography using a montage including electroencephalogram, electrooculogram, and electromyogram. Sleep stages were visually assessed by experienced technicians according to standardized criteria (Iber et al. 2007). Technicians were informed only about the age and sex of the subjects and were blinded to their group affiliation. Parameters derived after analysis are sleep onset latency (in minutes), wake after sleep onset (in minutes), total sleep time (in minutes), sleep efficiency (in percent), and number of awakenings.

Subjective evaluations

Upon arising, subjects completed the specific version of the Groningen Subjective Quality of Sleep questionnaire (GSQS) (Mulder-Hajonides van der Meulen 1981), providing a score between 0 and 14 representing a number of sleep complaints. In addition, subjects estimated sleep onset latency (in minutes), number of awakenings, time awake before rising (in minutes), and total sleep time (in minutes).

Subjective evaluations of mood, sedation, and driving quality were assessed using a series of visual analogue scales (100 mm). The subjects were instructed to rate their subjective feelings using a 16-item mood scale which provides three factor-analytically defined summary scores for “alertness,” “contentedness,” and “calmness” (Bond 1974).

Subjective feelings of sleepiness were rated with the Karolinska Sleepiness Scale (Akerstedt and Gillberg 1990), ranging from 1 (extremely alert) to 9 (very sleepy, fighting sleep).

Subjects rated the degree of mental effort they had to put into driving with the Rating Scale Mental Effort (Zijlstra 1993). The scale is a visual analogue scale (150 mm) with additional verbal labels.

The driving instructors rated each subject’s driving quality and apparent sedation at the conclusion of the Highway Driving Test, using two 100 mm visual analogue scales ranging from “very bad” to “very good,” and from “not at all” to “completely,” respectively.

Procedure

All subjects completed two nights of sleep evaluation and testing. The first night was a habituation and practice condition to familiarize the subjects with the sleeping facilities and polysomnographic and test procedures. The second night was for actual sleep and performance assessments. Within 10 days before their habituation night, the subjects were individually trained to perform the cognitive and psychomotor tests during two sessions of approximately 1.5 h.

A test condition started in the evening of day 1, when the subjects arrived at the site at approximately 1900 hours, and lasted until day 2, when they were discharged at approximately 114 hours. On arrival at the sleeping facility, the subjects rated their subjective feelings and subjective sleepiness. From 1930 until 2030 hours, they performed the first session of laboratory tests, comprising the Word Learning Test immediate and first delayed recall, the Critical Tracking Task, the Divided Attention Task, the Psychomotor Vigilance Task, the Stop Signal Task, and the Digit Span forward and backward. Hereafter, electrodes for polysomnographic recording were attached and subjects retired to bed at 2330 hours. Immediately preceding retiring, the subjects in the “frequent users” group ingested their own prescribed hypnotic, whereas the subjects in the “infrequent users” group and controls did not ingest medication.

The subjects were awakened at 0730 hours, and after arising, a light standardized breakfast was served. At 0800 hours, the subjects evaluated sleep quality and duration, and feelings of daytime sleepiness and alertness. Subsequently, they started the second session of laboratory tests, comprising the second delayed recall and word recognition parts of the Word Learning Test, the Critical Tracking Task, the Divided Attention Task, the Psychomotor Vigilance Task, and the Digit Span test. At 0900 hours, the subjects were transported to the Highway Driving Test, which they performed between 0930 and 1030 hours. Upon completion, the subjects rated the mental effort it took to perform this driving test and then performed the Car-Following Test. After this, the subjects returned to the testing facilities for removal of the electrodes and were discharged.

During participation, use of caffeine was prohibited from 8 h prior to arrival on test days, until discharge the next morning. Alcohol intake was not allowed from 24 h prior to each dosing until discharge. Smoking was prohibited from 1 h prior to bedtime until discharge.

Statistical analysis

The primary parameter of the study was the SDLP (in centimeters). All performance-related parameters were analyzed for group differences at separate time points (evening, morning) using analysis of covariance (ANCOVA) with participants’ sex, age, and years of education as covariates. Significant (p < 0.05) main effects of group were further analyzed using three pairwise comparisons with LSD adjustment for multiple comparisons. Sleep parameters were analyzed for group differences using analysis of variance (ANOVA) with Welch test for unequal variances. Significant (p < 0.05) main effects of group were further analyzed using three pairwise comparisons with LSD adjustment for multiple comparisons. A natural log transformation was used before analysis of highly skewed variables. Tables show means and standard deviations of untransformed scores.
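For readers unfamiliar with the Welch test for unequal variances, the sketch below computes the Welch one-way ANOVA statistic and its degrees of freedom from raw group data. It is a generic textbook formulation written for illustration, not the SPSS routine used in the study; the function name is hypothetical, and converting the statistic to a p value would additionally require the F distribution, which is omitted here.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's one-way ANOVA for groups with unequal variances.

    Returns (F, df1, df2). Each group needs at least two observations
    and a non-zero variance.
    """
    k = len(groups)
    if k < 2:
        raise ValueError("at least two groups are required")
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisier groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_weight) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator, k - 1, (k ** 2 - 1) / (3 * lam)
```

With only two groups, the statistic reduces to the square of Welch's t statistic, which offers a convenient check on the implementation.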

All statistical analyses were done by using the Statistical Package for the Social Sciences (SPSS) statistical program (version 21.0.0.1 for Windows; SPSS, Chicago, IL).

Results

Group differences for the screening variables

A total of 63 subjects (34 men, 29 women) completed the study. They had a mean (±SD) age of 61.6 (±5.1) years and an average education of 12 (±3) years. They had their driving license on average for 39 (±8) years and drove on average 9,965 (±7,848) km per year over the last 3 years. There were no significant differences between the three groups (Table 1).

Table 1 Means (±SD) of pre-study group characteristics

Evaluation of sleep at home differed significantly between groups. As shown in Table 1, sleep quality was poorer in both insomnia groups as compared to controls, as indicated by significantly higher scores on the PSQI, GSQS-general, and the sleep subscale of the SCL90-R (p < 0.001). There were no differences in mean PSQI, GSQS, and SCL90-R sleep scores between the insomnia groups. Scores on the SWEL showed that sleep complaints were most frequently classified by patients as sleep initiation and sleep maintenance problems, and occasionally as early morning awakenings, with similar prevalences in frequent and infrequent users. None of the controls reported problems with sleep onset, sleep maintenance, or early awakening.

Both insomnia groups scored significantly higher than controls on rating scales of depression (BDI p < 0.001; SCL90-R p < 0.004), without significant differences between the frequent and infrequent users. Anxiety was also increased in insomniacs as compared to controls, as shown by significantly higher scores on both subscales of the STAI in both groups (STAI state: frequent users p = 0.044 and infrequent users p < 0.001; STAI trait: frequent users p = 0.005 and infrequent users p < 0.001). A significant difference on the SCL90-R anxiety scale was found between the infrequent users and controls (p = 0.006), but not between the frequent users and controls. There were no significant differences in anxiety ratings between the insomnia groups.

The Multidimensional Fatigue Inventory showed significant differences between groups on all five subscales. Both insomnia groups reported suffering from significantly more general fatigue (both p < 0.001), physical fatigue (both p < 0.001), and more mental fatigue (frequent users p < 0.001 and infrequent users p = 0.002) when compared to the control group. In addition, both groups reported significant reductions in motivation (frequent users p = 0.015 and infrequent users p = 0.027), and frequent users also reported reduced activity (p = 0.040). No differences between the insomnia groups were found on any of the scales.

Sleep diary

The 2-week sleep diary showed that complaints of disturbed sleep as measured by the GSQS-specific were significantly increased in both insomnia groups as compared to controls (Table 2). Compared with the healthy, good sleepers, the infrequent users group reported significantly more sleep complaints (p < 0.001), longer sleep onset time (p < 0.01), shorter total sleep time (p < 0.001), reduced sleep efficiency (p < 0.001), earlier morning awakenings (p < 0.05), and more awakenings (p < 0.001). Sleep quality in the frequent users group was significantly worse as compared to the good sleepers in number of sleep complaints (p < 0.001), sleep onset time (p < 0.01), and sleep efficiency (p < 0.01). Comparisons between the two insomnia groups revealed that the infrequent users reported a shorter total sleep time (p < 0.01), lower sleep efficiency (p < 0.01), and more awakenings (p < 0.05) than the frequent users, showing that sleep was worst in the infrequent users group.

Table 2 Sleep diary

Hypnotic use

The hypnotics used in the frequent users group were zopiclone (n = 4), temazepam (n = 4), midazolam (n = 4), oxazepam (n = 3), zolpidem (n = 2), lormetazepam (n = 2), clonazepam (n = 1), flurazepam (n = 1), and nitrazepam (n = 1). Their average (±SD) duration of hypnotic use was 7.7 (6.8) years. The average (±SD) nightly use of the hypnotics was 6.4 (1.2) nights a week (Table 3).

Table 3 Overview of individual hypnotic use and their pharmacokinetic properties. Hypnotics are listed in increasing order of expected residual effects

Within the infrequent users group, seven patients reported no history of hypnotic use. The remaining 13 patients used hypnotics infrequently and irregularly. Their average (±SD) nightly use was 4.1 (2.9) nights a month, with an average (±SD) duration of hypnotic use of 7.8 (7.9) years. The hypnotics used were temazepam (n = 6), zopiclone (n = 4), lorazepam (n = 1), loprazolam (n = 1), and nitrazepam (n = 1).

Sleep the night before driving

Table 4 presents the sleep parameters of the objective and subjective sleep evaluation for each group. Polysomnography showed no differences between the three groups on any of the parameters measured. In contrast, significant overall group differences were found for subjective evaluations of total sleep time (p = 0.005), sleep onset time, and sleep quality (p = 0.001). Both insomnia groups reported significantly more sleep complaints on the GSQS than the control group (p values <0.01). In addition, the infrequent users group reported significantly shorter total sleep times (p < 0.01) and longer sleep onset times (p < 0.05) than the control group. The frequent users group did not differ from the control group.

Table 4 Mean (±SD) of objective and subjective sleep parameters

Driving, cognitive, and psychomotor performance

Figure 1 and Table 5 show the driving performance parameters. The primary performance parameter, SDLP in the highway driving test, was on average normal and similar for infrequent users (17.7 ± 2.9 cm), frequent users (17.4 ± 4.3 cm), and controls (16.8 ± 2.7 cm) (Fig. 1). Table 3 illustrates the hypnotics and doses used by the frequent users and the corresponding SDLP scores. Inspection of the average SDLP scores from users of category I drugs and category II drugs shows that performance in the category I group appears to be better than the category II group. Mean (±SD) SDLP scores are 16.8 (4.7) and 18.3 (3.5) cm, respectively.

Fig. 1 Individual SDLP data for each group separately. Horizontal bars indicate average SDLP

Table 5 Mean (±SD) of driving performance parameters

Performance on the secondary driving task, the Car-Following Test, did not differ between the groups either. All participants were able to respond to the changes of the preceding car in a similar fashion.

Finally, there were no significant differences in performance on any of the psychomotor or cognitive tests. There were tendencies (p < 0.10) towards group differences in the divided attention test and the digit span test. For the digit span test, this was mainly due to worse performance by frequent users as compared to controls in the morning. In the divided attention test, there was a tendency for group differences in the tracking subtask, due to worse tracking performance by the controls as compared to both patient groups.

Comparisons between performance in the evening and morning sessions showed significant differences in the reaction times in the PVT and delayed recall in the word learning test. The latter was as expected because the interval between learning and delayed recall increased from evening to morning sessions. Average reaction time in the PVT was significantly faster in the morning as compared to the evening in controls (p < 0.005), but not in the insomnia groups.

Performance in the critical tracking test, divided attention task, and the stop signal task showed no significant differences between times of day (evening vs. morning).

Subjective rating scales

As shown in Table 6, patients felt on average less alert and more sleepy than controls as measured by the mood rating scale and the KSS, but this was only significant for alertness in the evening (p = 0.002). In the evening, frequent users had significantly lower alertness scores than controls. Furthermore, all groups felt on average less alert and more sleepy in the morning as compared to the evening. Pairwise comparisons showed that these differences were significant only in the controls (p < 0.01), not in the patients.

Table 6 Mean (±SD) of psychomotor and cognitive performance parameters

No group differences were found in the instructors' evaluation of the participants’ driving quality and apparent sedation. In addition, although there was a marginal trend to the detriment of the infrequent users, no differences between the groups were found in the subjectively rated mental effort that participants needed to invest in the driving test.

Discussion

The present study is the first to directly compare driving performance between insomnia patients who chronically use hypnotics, insomnia patients who do not or infrequently use hypnotics, and healthy, good sleepers. Results show that driving, as measured by a standardized highway driving test and a car-following test, is not impaired in insomniacs, irrespective of the use of hypnotics. In addition, the present study shows that driving-related psychomotor and cognitive performance is not different between pharmacologically treated insomnia patients, untreated insomnia patients, and healthy, good sleepers.

Results of the study corroborate previous findings showing an absence of neuropsychological deficits in both pharmacologically treated and untreated insomniacs (Vignola et al. 2000). Aside from minor attentional problems in insomniacs, that study reported no significant differences in performance between treated insomniacs, untreated insomniacs, and healthy, good sleepers. The authors suggested, however, that the absence of cognitive impairment may have been due to methodological limitations. The tests used in their study were of short duration and demanded low effort (e.g., the Digit Symbol Substitution Test and the Purdue Pegboard Test). Consequently, insomnia patients may have been able to exert enough effort for a short period to complete the tests successfully.

In the present study, subjects performed a standardized highway driving test for approximately 1 h. The prolonged attentional demands of the task were expected to reveal possible performance deficits in insomnia patients, which were not found with tasks of short duration. Yet, there were no indications of deterioration in driving performance in insomnia patients, irrespective of use of hypnotics.

These results can be interpreted in two ways. First, sleep in the insomnia groups was undisturbed according to polysomnographic criteria, leaving daytime performance unaffected. Despite significantly more subjective sleep complaints reported by the insomniacs when compared with the healthy, good sleepers, there was no objective evidence for any sleeping problems. Polysomnographic data showed that there were no differences between the groups on any of the sleep parameters obtained from the laboratory recordings. Discrepancies between subjective and objective sleep parameters have been established in previous studies (e.g., Orff et al. 2007; Vignola et al. 2000). A limitation in those studies was that participants’ sleep was only assessed during one night, possibly causing “first night effects” in healthy, good sleepers and “reversed first night effects” in insomnia patients. The present study, therefore, used the recommended second night recordings for determination of objective sleep complaints. Yet, there was no objective evidence of sleeping difficulties. It has been suggested, however, that the current standards of sleep analysis may not have sufficient sensitivity for distinguishing insomnia from healthy, undisturbed sleep (Bastien et al. 2003). A possible solution can be found in spectral analysis of the sleep microstructure, dissociating characteristic electroencephalographic activities. More research is needed to confirm this presumption.

The lack of objectively measured sleep complaints may also be due to the sleeping environment. It has been shown that home-based polysomnographic recordings yield different results than recordings at the laboratory (Edinger et al. 1997). Insomnia patients’ sleep appears more disturbed when they sleep at home than when they sleep at the laboratory. Indeed, subjective sleep evaluations in the present study show that sleep quality at the laboratory improved in both insomnia groups. In addition, subjective sleep quality for the healthy controls was worse after laboratory sleep than after sleep at home. On some of the parameters, their sleep quality even approached insomnia levels. These changes in sleep quality may have resulted in a less pronounced difference between the groups and may have minimized existing differences in objective sleep recordings. Additionally, the changed sleep quality may have had an influence on daytime performance, i.e., improved in patients due to improved sleep and impaired in healthy controls due to impaired sleep. Improved daytime functioning as a result of improved sleep in insomnia patients has been shown previously (Edinger et al. 2003). In contrast, healthy good sleepers, in particular the elderly, do not seem to suffer greatly from a single night of disturbed sleep (Philip et al. 2004). Still, it remains to be investigated whether sleep at home would influence driving performance differently than sleep at the laboratory in both insomnia patients and good sleepers.

A second explanation for the absence of impairment in driving performance may be that the insomnia patients, who had ample driving experience, were easily able to complete the driving tests successfully. A recent study has shown that establishing performance impairment in insomnia may depend on task complexity (Altena et al. 2008). Insomnia patients performed worse than healthy controls only in a complex vigilance task, whereas the patients’ performance in a simple reaction time task appeared to be even better than that of healthy controls. The authors concluded that chronic insomnia is associated with cognitive dysregulation, but that this may only be revealed in tasks measuring higher level functioning. Driving is a well-practiced and highly automated skill (Brouwer 2002) and may not place such high demands on cognitive functioning. Judging from the low values scored on the mental effort scale, this assumption seems to be confirmed. Scores on the scale can range from 0 to 150. The average scores in the present study were 30.3 for the medicated insomnia patients, 35.7 for the unmedicated insomnia patients, and 20.3 for the healthy controls. In a study comparing cognitive performance between patients with seasonal allergic rhinitis and healthy controls, subjects rated the mental effort they had to put in a 45-min sustained attention test considerably higher (Hartgerink-Lutgens et al. 2009). Mental effort scores were around 90 for both groups, indicating substantially higher demands of that test as compared with the highway driving test.
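To put the mental effort scores in perspective, the short sketch below expresses each reported mean as a percentage of the scale maximum of 150. It is only an illustration using the values quoted above; the variable and function names are hypothetical, and the score of 90 for the sustained attention test is the approximate figure given in the text.

```python
# Scale maximum and mean mental effort scores as quoted in the text.
SCALE_MAX = 150

MEAN_EFFORT = {
    "medicated insomnia patients (driving test)": 30.3,
    "unmedicated insomnia patients (driving test)": 35.7,
    "healthy controls (driving test)": 20.3,
    # Approximate value ("around 90") reported for the comparison study.
    "45-min sustained attention test (both groups)": 90.0,
}


def percent_of_scale(score: float, scale_max: float = SCALE_MAX) -> float:
    """Express a mental effort score as a percentage of the scale maximum."""
    return round(100.0 * score / scale_max, 1)


for label, score in MEAN_EFFORT.items():
    print(f"{label}: {score} / {SCALE_MAX} = {percent_of_scale(score)} %")
```

On this reading, all three driving-test means fall below a quarter of the scale range, whereas the sustained attention test sits at roughly 60 % of it.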

Yet, the insomniacs, in particular the infrequent users group, rated the degree of mental effort they had to put in the driving test somewhat higher than the healthy controls. Although the difference did not reach statistical significance, this may suggest that the insomnia patients masked possible performance difficulties by increasing their effort. As mentioned earlier, the short duration of cognitive tasks was considered to be a limitation for revealing any performance deficits in insomnia patients. It seems that even tasks with a duration of at least 1 h can be completed successfully by individuals reporting sleeping difficulties.

In addition to the finding that driving is not affected in insomnia patients, results of the present study show that driving is not impaired in patients chronically using hypnotics either. The absence of impairment, combined with the still present subjective sleep complaints, suggests the development of tolerance to both therapeutic and residual effects of hypnotics. With respect to the residual effects, the results are partly supported by epidemiological data (Neutel 1998). These showed that prolonged use of hypnotics is associated with a lowered risk of becoming involved in a car accident when compared with initial use of hypnotics. Nevertheless, the risk of injurious traffic accidents remained twice as high in long-term hypnotic users as in healthy, unmedicated drivers.

The absence of residual effects in the present study may, however, be explained by the wide variety of hypnotic drugs and doses used in the frequent users group. In addition, the majority of hypnotics were unlikely to produce residual effects between 10 and 11 h post dose. Consequently, the large differences in degree of residual effects may have contributed to the variability of performance in this group and masked any detectable impairment. This is supported by the inspection of the average SDLP scores from the category I and the category II users, showing that the former group had a slightly lower average SDLP score (16.8 cm) than the latter (18.3 cm). Comparing these values with two studies investigating the residual effects of zopiclone 7.5 mg in younger (Leufkens et al. 2009) and older (Leufkens and Vermeeren 2009) healthy, medication-naïve participants reveals that performance was in the range of driving after administration of placebo. Both age groups had placebo SDLP scores of 17.8 cm (95 % confidence interval 16.8–18.2 cm). Future research in patients chronically using the same hypnotic is needed to shed more light on this issue. The present study aimed, however, to evaluate driving performance in a representative, non-selective study sample of insomnia patients chronically using hypnotics.

The results may differ when driving is investigated in a younger sample of insomniacs. A recent review showed that the residual effect of zopiclone on driving in older subjects (age ranging from 56 to 73 years) was generally less than that found in younger subjects (age ranging from 21 to 45 years) (Leufkens and Vermeeren 2013). An explanation for this effect may be found in age-related increases in driving experience. The driving-related skills of less experienced drivers may be more sensitive to drug-induced sedation. Consequently, driving performance could be impaired to a larger extent by hypnotic use in younger insomnia patients than in older patients.

It may be argued that the driving tests are not sensitive enough to detect performance deficits as a consequence of a disorder. Studies assessing driving performance in other patient groups using the same standardized driving test have been conducted previously. Direct comparisons with healthy controls have been reported for patients with chronic nonmalignant pain (Veldhuijzen et al. 2006) and depressed patients receiving long-term antidepressant treatment (Wingen et al. 2006). In contrast to the present study, these studies showed that driving was significantly impaired in both patient groups as compared with healthy controls. These results suggest that the driving test has sufficient sensitivity for detecting driving impairment in patient groups. In addition, sample sizes of the previous studies were similar to the present study, suggesting that the absence of effects of insomnia could not be explained in terms of statistical power.

Finally, it may be questioned whether the study had sufficient power to detect differences in driving performance between insomnia patients and controls. We do not believe that a lack of power explains the results. First of all, the mean difference in SDLP between infrequent users and controls was small (i.e., less than 1 cm), and not considered clinically relevant. The minimum clinically relevant difference in SDLP is 2.4 cm, which corresponds to the effect of alcohol when blood alcohol concentrations are at the legal limit for driving in most countries. Secondly, previous studies assessing driving performance in other patient groups, using the same standardized driving test, have shown that the method is sufficiently sensitive to detect significant impairment in small samples of patients with chronic nonmalignant pain (Veldhuijzen et al. 2006) and depressed patients receiving long-term antidepressant treatment (Wingen et al. 2006). It seems therefore that the effects of insomnia and chronic use of hypnotics are relatively small and less debilitating than the effects of pain and depression.
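The comparison against the clinical relevance criterion can be made explicit with a few lines of arithmetic. The sketch below uses only the group means reported in the Results and the 2.4 cm criterion stated above; the names are hypothetical, and the check on group means is an illustration rather than a reproduction of the study's statistical analysis.

```python
# Group mean SDLP values (cm) as reported in the Results section.
MEAN_SDLP_CM = {
    "infrequent users": 17.7,
    "frequent users": 17.4,
    "controls": 16.8,
}

# Minimum clinically relevant SDLP difference (cm) stated in the text.
CLINICAL_THRESHOLD_CM = 2.4


def difference_from_controls(group: str) -> float:
    """Return the group's mean SDLP minus the control mean, rounded to 0.1 cm."""
    return round(MEAN_SDLP_CM[group] - MEAN_SDLP_CM["controls"], 1)


def is_clinically_relevant(diff_cm: float) -> bool:
    """Check whether an SDLP difference reaches the 2.4 cm criterion."""
    return abs(diff_cm) >= CLINICAL_THRESHOLD_CM


for group in ("infrequent users", "frequent users"):
    diff = difference_from_controls(group)
    print(f"{group}: {diff:+.1f} cm vs controls, "
          f"clinically relevant: {is_clinically_relevant(diff)}")
```

With the reported means, the differences from controls are 0.9 cm for infrequent users and 0.6 cm for frequent users, both well below the 2.4 cm criterion.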

To conclude, results of the present study indicate that driving performance is not substantially impaired in older chronic users of hypnotics and older insomnia patients who do not use hypnotics. This supports conclusions from previous studies that effects of insomnia and chronic use of hypnotics on cognitive performance are subtle.