Introduction

Salivary cortisol is frequently used as a surrogate for free serum cortisol. Flatter diurnal cortisol slope (DCS) has been associated with poorer emotional health outcomes [1]. The cortisol awakening response (CAR) is positively related to general life stress [2] and alterations of the cortisol awakening response are associated with chronic stress [3]. However, salivary cortisol is rapidly converted to salivary cortisone [4] and cortisone was recently found to be a superior surrogate for free serum cortisol compared to salivary cortisol [5, 6]. Despite this, studies investigating both cortisol and cortisone circadian rhythm patterns in subjects in free-living conditions are lacking [7]. To assess these measures it is necessary to collect salivary samples at several time points and across multiple days [8, 9], which can be challenging when participants are performing the sampling in a daily-life setting [10]. Therefore, it is important to examine participant compliance to a sampling protocol, and to investigate the within-subject day-to-day variability of cortisol and cortisone outcomes. This could inform decisions that may help minimize the participant burden and optimize use of resources.

The primary aim of this study was to investigate the level of compliance to a home-based three-day salivary sampling protocol among healthy adults. The secondary aims were to investigate the within-subject day-to-day variability of cortisol and cortisone outcomes, to estimate the number of measurement days required to obtain acceptable reproducibility, and to inform statistical power calculations for future trials.

Main text

Methods

Study design and data

The current study uses data from the SCREENS (no abbreviation) pilot trial (www.clinicaltrials.gov, NCT03788525), a two-arm parallel-group cluster randomized trial with no control group, carried out in a free-living setting [11]. The SCREENS pilot trial included 12 families (see Additional file 1).

Salivary sampling protocol

Participants received instructions and written information detailing the saliva sampling protocol in a face-to-face meeting with a member of the research team in the participants’ home. Participants were instructed to collect salivary samples on three consecutive days at baseline and at follow-up. Samples were collected immediately upon awakening (S1), i.e. when participants woke up and got out of bed, 30 min and 45 min after awakening, and once just before bedtime (Additional file 2). Participants were provided with a dual digital timer (S. Brannan Sons Ltd., England) and were instructed to start the timer upon awakening. The timer then rang 30 and 45 min after awakening, reminding participants to collect the second and third sample, respectively.

Participants were also instructed not to eat, smoke, exercise, or drink anything but water between the morning samples. Participants were allowed to brush their teeth within the first 20 min after the first salivary sample, and instructed not to drink water after the first 20 min had passed and until they finished the morning sampling routine. In the last 30 min before the evening sample (just before bedtime), participants were instructed not to eat, smoke, exercise, or drink anything other than water.

Samples were collected using Salivette tubes containing a synthetic swab (Sarstedt, Nümbrecht, Germany). To collect a sample, participants were instructed to place a swab in their mouths, chew on it lightly for 45–60 s, transfer it back into the tube directly from the mouth and put a pre-labelled sticker on the tube. The samples were stored in a freezer in the participants’ homes and, at the end of the trial, were transported to Slagelse Hospital for storage at −80 °C before laboratory analyses.

Analyses of salivary cortisol and cortisone

Salivary cortisol and cortisone were measured using isotope dilution-liquid chromatography-tandem mass spectrometry. The cortisol and cortisone awakening responses were each calculated in two ways: as the difference between the sample collected immediately upon awakening and the sample collected 30 min later (CAR30), and as the difference between the sample collected immediately upon awakening and the second or third morning sample, whichever was larger (CARpeak). Summary indicators of the cortisol and cortisone awakening responses were calculated as the area under the curve (CARauc) for the morning samples (upon awakening, and 30 min and 45 min after awakening). Diurnal cortisol and cortisone slopes were measured as wake-to-bed slopes and peak-to-bed slopes. Wake-to-bed slopes were calculated by subtracting the bedtime sample value from the value of the sample collected immediately upon awakening and dividing by the number of hours separating these two samples. Peak-to-bed slopes were calculated by subtracting the bedtime sample value from the peak value of the second or third morning sample and dividing by the number of hours separating these two samples.
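The outcome definitions above can be illustrated with a minimal sketch. It is not the analysis code used in the study. The function name, argument names and example values are hypothetical, and three points are assumptions rather than statements from the protocol: the awakening response is taken as the later sample minus the awakening sample, the area under the curve is computed with the trapezoid rule with respect to zero, and the morning samples are assumed to be taken at exactly 0, 30 and 45 min.

```python
def cortisol_indices(wake, plus30, plus45, bedtime, hours_wake_to_bed):
    """Illustrative daily outcomes from four sample values (same concentration unit).

    hours_wake_to_bed is the number of hours between the awakening sample
    and the bedtime sample.
    """
    peak = max(plus30, plus45)

    # Awakening response: assumed sign convention is later sample minus awakening sample.
    car30 = plus30 - wake
    car_peak = peak - wake

    # Assumption: trapezoid rule with respect to zero over the intervals
    # 0-30 min and 30-45 min; the result is in concentration x minutes.
    car_auc = 30 * (wake + plus30) / 2 + 15 * (plus30 + plus45) / 2

    # Slopes as described in the text: morning value minus bedtime value,
    # divided by the hours separating the two samples.
    wake_to_bed = (wake - bedtime) / hours_wake_to_bed

    # Assumption: the peak sample was taken at its nominal time (0.5 h or 0.75 h).
    peak_offset_hours = 0.5 if plus30 >= plus45 else 0.75
    peak_to_bed = (peak - bedtime) / (hours_wake_to_bed - peak_offset_hours)

    return {
        "CAR30": car30,
        "CARpeak": car_peak,
        "CARauc": car_auc,
        "wake_to_bed_slope": wake_to_bed,
        "peak_to_bed_slope": peak_to_bed,
    }


# Hypothetical values for one participant-day.
example = cortisol_indices(wake=10.0, plus30=16.0, plus45=14.0,
                           bedtime=2.0, hours_wake_to_bed=16.0)
```

With this sign convention a declining diurnal profile gives a positive slope; in practice the self-reported sampling times would replace the nominal offsets.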

Checklist

Participants filled in a checklist to report their wake-up time, bedtime, and the time at which they had taken each sample. The research team manually corrected obvious typos, e.g. in the recorded time of awakening or salivary sampling (5 samples). An obvious typo could be a participant reporting salivary collection at 07:00 am, 07:15 am and 07:45 am, but a wake-up time of 08:00 am. In that case, the wake-up time was corrected to 07:00 am. The self-reported times were used to measure the level of compliance to the protocol by comparing the timing of the samples with the wake-up time (Table 1).

Table 1 Definition of compliance
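A timing check of the kind described above could be sketched as follows. The compliance thresholds of Table 1 are not reproduced here, so the 5-minute tolerance is a placeholder and not the study definition. The function names and the "HH:MM" input format are likewise hypothetical.

```python
from datetime import datetime


def minutes_after_waking(wake_time, sample_time):
    """Minutes from self-reported wake-up to a sample, both given as 'HH:MM'.

    A negative value means the sample was reported before wake-up, which is
    the kind of inconsistency that would prompt a manual check for a typo.
    """
    fmt = "%H:%M"
    delta = datetime.strptime(sample_time, fmt) - datetime.strptime(wake_time, fmt)
    return delta.total_seconds() / 60


def morning_timing_ok(wake_time, sample_times, targets=(0, 30, 45), tolerance_min=5):
    """True if every morning sample lies within the tolerance of its target time.

    tolerance_min is a hypothetical placeholder, not the Table 1 definition.
    """
    offsets = [minutes_after_waking(wake_time, t) for t in sample_times]
    return all(abs(o - target) <= tolerance_min for o, target in zip(offsets, targets))


# Hypothetical checklist entries for one morning.
print(morning_timing_ok("07:00", ["07:00", "07:32", "07:45"]))
```

The sketch assumes that all times fall on the same calendar day, which holds for the morning samples but not necessarily for a bedtime sample taken after midnight.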

Socioeconomic status

Education levels were coded using the International Standard Classification of Education (ISCED) [12] and used as an indicator of socioeconomic status.

Statistical analyses

To investigate the reproducibility of cortisol and cortisone concentrations over the three measurement days, the within-subject coefficient of variation (CV%) was calculated, and intraclass correlation coefficients were estimated using linear mixed models with baseline cortisol and cortisone measurements as the outcome and subject ID as a random effect [13]. The intraclass correlation coefficient corresponds to the expected correlation between cortisol or cortisone outcome measures (e.g. the awakening response) on any pair of measurement days. Follow-up measurements were used for an additional secondary analysis. The Spearman-Brown prophecy formula was used to calculate the number of sampling days necessary to obtain moderate and high reproducibility [14]. Finally, we conducted a series of power calculations to estimate the sample sizes necessary for a future parallel-group superiority randomized trial with 80% power, for a range of clinically relevant differences (see Additional file 3).

Statistical analyses were performed using Stata/IC (version 16).
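The two reproducibility measures can be illustrated with a short sketch. It is not the Stata analysis: in place of the linear mixed model it uses the one-way ANOVA estimator of the intraclass correlation coefficient, which applies to balanced data (the same number of days for every subject). The CV% is taken here as the mean across subjects of each subject's standard deviation divided by that subject's mean, which is an assumption about the exact definition. The function names and data are hypothetical.

```python
from statistics import mean, stdev


def within_subject_cv_percent(days_by_subject):
    """Mean over subjects of (SD across days / mean across days) x 100."""
    return mean(stdev(v) / mean(v) * 100 for v in days_by_subject.values())


def icc_oneway(days_by_subject):
    """One-way ANOVA intraclass correlation for balanced repeated measures.

    Swapped-in estimator for illustration; the study used linear mixed models.
    """
    groups = list(days_by_subject.values())
    n = len(groups)        # number of subjects
    k = len(groups[0])     # measurement days per subject
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch requires the same number of days per subject")

    grand_mean = mean(x for g in groups for x in g)
    # Between-subject and within-subject mean squares.
    msb = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n - 1)
    msw = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical outcome values: three subjects, three days each.
data = {"p1": [1.0, 2.0, 3.0], "p2": [4.0, 5.0, 6.0], "p3": [7.0, 8.0, 9.0]}
cv = within_subject_cv_percent(data)
icc = icc_oneway(data)
```

A mixed model handles unbalanced data, such as a participant with an excluded day, which this estimator does not.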

Results

Participants

Nineteen adults completed the study. Two participants were noncompliant on one of the six sampling days. The samples from this day were excluded from the analysis of reproducibility (8 samples). The research team checked the objective sleep measurement for these participants before exclusion. The final analytic sample included 354 samples nested in 16 participants (Additional files 4 and 5) and 180 complete days at baseline and 171 days at follow-up. The analyses of compliance were completed before correction and exclusion (n = 19, samples = 434). The raw cortisol data for each included individual at different time points and on different days are shown in Additional file 6 to visualize the within- and between-subject variability.

Compliance

As shown in Table 2, 18 (95%) participants were compliant with respect to the number of samples according to the definition. A total of 16 (84%) participants were compliant in reporting the salivary sampling time in the checklist for 16 or more samples. Similarly, 16 (84%) participants were compliant with respect to the timing of all morning samples. Requiring complete data for all three days resulted in fewer compliant participants. A small difference was found in the number of compliant participants between baseline and follow-up when analyzing all three days.

Table 2 Compliance to the salivary sampling protocol

Reproducibility of cortisol and cortisone

The within-subject CV% values for samples obtained at comparable time points on different days were moderate to large (mean CV% = 14.7%–75.3%, Table 3 and Additional file 7). Analyses indicated moderate within-subject reproducibility at baseline for the area under the curve for the morning samples and the peak-to-bed slope for both cortisol and cortisone, and for CARpeak for cortisone (Table 3). The remaining intraclass correlation coefficients indicated high within-subject day-to-day variability. The within- and between-subject reproducibility of the area under the curve for the morning salivary samples and the peak-to-bed slope for cortisol at baseline and follow-up is also shown graphically in Additional file 8. The predicted number of measurement days needed to obtain high reproducibility (intraclass correlation coefficient = 0.80) was three days for the peak-to-bed slope for both cortisol and cortisone and four days for the area under the curve for the morning salivary samples for cortisol and cortisone. To obtain moderate reproducibility (intraclass correlation coefficient = 0.70) for the area under the curve for the morning salivary samples for cortisol and cortisone and for the peak-to-bed cortisone slope, samples need to be collected over two days. One day of samples is needed to obtain an intraclass correlation coefficient of 0.70 for the peak-to-bed cortisol slope. One day of samples is enough to obtain intraclass correlation coefficients of 0.60 for the area under the curve for the morning salivary samples and the peak-to-bed slope for cortisol and cortisone. The analyses indicated fairly similar results when they were based on follow-up measurements (Additional file 9).

Table 3 Intraclass correlation coefficients (ICC) and required measuring days based on baseline measurements
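The required number of measurement days follows from the Spearman-Brown prophecy formula, sketched below. With a single-day intraclass correlation coefficient r, the reliability of the mean over k days is k·r / (1 + (k − 1)·r), and solving for k at a target reliability gives the number of days. The function names and the example value r = 0.5 are hypothetical and are not taken from Table 3. Rounding up to whole days is an assumption about how the reported day counts were obtained.

```python
import math


def reliability_of_mean(single_day_icc, n_days):
    """Spearman-Brown: reliability of the average over n_days measurement days."""
    r = single_day_icc
    return n_days * r / (1 + (n_days - 1) * r)


def required_days(single_day_icc, target_icc):
    """Smallest whole number of days whose average reaches target_icc."""
    r = single_day_icc
    k = target_icc * (1 - r) / (r * (1 - target_icc))
    # Round first so floating-point noise (e.g. 4.000000000000001) does not add a day.
    return max(1, math.ceil(round(k, 9)))


# Hypothetical single-day ICC of 0.5.
print(required_days(0.5, 0.80), required_days(0.5, 0.70))
```

A single-day coefficient that already meets the target gives one day, since k is then at most 1.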

Necessary sample size in a future study

According to our power calculations, the minimum required sample sizes for future parallel-group superiority randomized trials are 56 to 162, 36 to 58 and 26 to 42 participants to detect a small (Cohen’s d = 0.3), moderate (Cohen’s d = 0.5) and large (Cohen’s d = 0.6) effect size, respectively, depending on the cortisol and cortisone outcome (Additional file 3).
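For orientation, the textbook sample-size formula for comparing two group means on a single measurement can be sketched as follows. This is not the calculation documented in Additional file 3, and the numbers it produces need not match those reported above, which depend on the assumptions made there. The two-sided significance level of 0.05 and the normal approximation are assumptions of this sketch, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(cohens_d, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / cohens_d ** 2)


# Effect sizes considered in the text.
for d in (0.3, 0.5, 0.6):
    print(d, n_per_group(d))
```

The required number of participants falls with the square of the standardized effect size.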

Discussion

The main finding of the current study was the high level of compliance to the home-based three-day salivary sampling protocol, which was designed to balance the burden on participants against the acquisition of valid data. Although 5–15% missing data due to lack of compliance to the assessment protocol is unlikely to be a serious threat to the internal validity of a future study, additional measures to improve participant compliance may be needed. The protocol could be improved by validating the timing of the salivary sampling using objective sleep measurements (i.e. a wrist-worn device) to record the wake-up time. Furthermore, reporting errors may be eliminated by using an app to record timing.

The area under the curve for the morning salivary samples and the peak-to-bed slope showed moderate day-to-day reproducibility over three days, which is in line with findings in the literature [15,16,17]. The current study found high day-to-day variability and low reproducibility for the remaining measures of cortisol and cortisone, which is also consistent with the literature [15, 17, 18]. In addition, a previous study by Bakusic et al. confirms a high day-to-day variability of both cortisol and cortisone. However, that study found higher reproducibility for cortisone than for cortisol, which may indicate that cortisone is a more stable measure than cortisol [7]. In the current study we found no evidence of a difference in reproducibility between cortisol and cortisone.

The results showed that samples should be collected over at least 3–4 days if high reproducibility is desired and over at least 1–2 days if moderate reproducibility is sufficient.

The study has several strengths, including the careful collection of participant-reported sample timing and bed and wake times. The possibility of cross-checking self-reported wake time against objective sleep measurement allowed a reasonable investigation of the feasibility of our sampling protocol. A further strength was the use of isotope dilution-liquid chromatography-tandem mass spectrometry, which has high accuracy and specificity.

Conclusion

In conclusion, we found high levels of compliance to the home-based salivary sampling protocol in a sample of healthy adults. The within-subject day-to-day variability was fairly high for all cortisol and cortisone outcomes investigated. A two-day sampling protocol appears to yield the right balance between resource use and minimization of participant burden if moderate reproducibility is deemed sufficient in a study.

Limitations

The assessments of compliance to the timing of the samples were based on self-report. Also, the relatively small number of participants may limit the external validity of the study. Finally, the objective sleep and wake times could be used to validate the self-reported waking time in only some cases.