Temporal development of unpleasantness and symptomatology
Data collection
In this first part, we investigate how unpleasantness and symptomatology develop with the progression of motion sickness. To do so, we (re-)analyzed motion sickness ratings collected during five previously published experiments (Exp 1 = Nooij et al. 2017b; Exp 2 = Nooij et al. 2017a; Exp 3 = Nooij et al. 2021; Exp 4 = Bos et al. 2005; Exp 5 = Bos 2015) and two additional experiments to be published later (Exp 6–7). In all experiments, subjects were exposed to either physical or virtual motion for a maximum duration of 30 min and indicated their level of unpleasantness or symptomatology at regular intervals (every two to five minutes). Unpleasantness was assessed in Exp 1–3 using the FMS, whilst symptomatology was assessed in Exp 4–7 using the MISC. The provocative stimulation was aborted when a subject reported an FMS class of ≥ 15 or a MISC class of ≥ 7, except in Exp 4, which used no stop criterion. All experiments except Exp 3 consisted of multiple provocative sessions, which were presented on separate days. Additional experimental details are summarized in Supplementary Table S1.
Data analysis
We analyzed the FMS ratings from 58 subjects performing a total of 132 sessions with at least two ratings within each session, and MISC ratings from 148 subjects performing a total of 528 sessions with at least two ratings within each session. For all scale ratings, we analyzed the difference in rated class between two consecutive ratings, which we will further refer to as a rating transition. We first determined the number of observed transitions between two classes, and subsequently calculated the proportion of cases in which the rating after a certain class remained constant, increased, or decreased. Our null hypothesis is a monotonic increase of unpleasantness and symptomatology with the progression of motion sickness over time, implying that their respective ratings should increase or remain constant. Decreases in ratings might occur due to random fluctuations in rating, and thus should be infrequent and evenly distributed over the whole range of the scale.
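The transition tally described above can be illustrated with a minimal sketch. The function name, the data layout (one list of ratings per session), and the example ratings are hypothetical and serve only to show the bookkeeping; they are not taken from the experiments.

```python
from collections import Counter

def transition_proportions(sessions):
    """Per starting class, the proportion of next ratings that stayed, rose, or fell."""
    counts = {}
    for ratings in sessions:
        # Pair each rating with its successor within the same session.
        for current, following in zip(ratings, ratings[1:]):
            if following > current:
                kind = "increased"
            elif following < current:
                kind = "decreased"
            else:
                kind = "constant"
            counts.setdefault(current, Counter())[kind] += 1

    proportions = {}
    for start, tally in counts.items():
        total = sum(tally.values())
        proportions[start] = {
            kind: tally[kind] / total
            for kind in ("constant", "increased", "decreased")
        }
    return proportions

# Hypothetical MISC ratings from two sessions (illustration only).
example_sessions = [[0, 1, 1, 3, 2], [1, 2, 3, 3, 6]]
result = transition_proportions(example_sessions)
```

Under the null hypothesis stated above, the "decreased" proportions would be small and roughly equal across starting classes.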
To facilitate comparison with the normalized results for unpleasantness from the psychophysical scaling tasks (see next section), we rescaled the FMS ratings describing the temporal development of unpleasantness to range from 0 (“no sickness”) to 1 (“frank sickness”); we refer to the rescaled ratings as FMS’.
Relationship between unpleasantness and symptomatology
Data collection
In the second part, we assessed the relationship between unpleasantness and symptomatology. This part was performed in Exp 6 and 7, in which subjects performed a psychophysical scaling task before and/or after the last provocative session of the experiment.
In Exp 6, subjects judged the level of unpleasantness associated with each MISC class using magnitude estimations (MAG) as originally used for the ratio scaling of psychophysical stimuli, such as the brightness of light (Stevens 1956) or social phenomena (Kuennapas and Wikstroem 1963; Lodge 1981; Venrooij et al. 2015). Here, we asked subjects to draw lines whose lengths represented the level of unpleasantness they associated with each MISC class description (1 to 10). We only provided the descriptions, without referring to the numerical values corresponding to the classes. We provided two A4 sheets in landscape orientation, with a horizontal 10.5 cm reference line at the top of each page. This line represented the unpleasantness for MISC 6, whose description was printed below the line. In addition, four or five other descriptions were printed below, which we asked subjects to judge by drawing a line. We explained to subjects that drawing a line twice the length of the reference line would imply twice the amount of unpleasantness as compared to the reference symptom (i.e., feeling a little nauseated). Lines could be of any length and, if needed, could consist of multiple line segments. The class descriptions were randomized in four different orders. We let subjects perform this task both before the first session and after the last, to investigate whether exposure to a provocative motion affected the judgements.
In Exp 7, we investigated whether the choice of reference class affected the judgements. We therefore repeated the MAG task of Exp 6 using class description MISC 4 instead of MISC 6 as the reference. In addition, we investigated whether the type of psychophysical task affected the judgements by letting subjects perform a two-alternative forced choice (2AFC) task (Thurstone 1927). In this 2AFC task, we presented subjects with two MISC class descriptions and asked them “which of these two symptoms do you consider most unpleasant?”. Ignoring the order of the two descriptions within each comparison, this resulted in 45 comparisons that were presented in a random order using a computer. Both the MAG and the 2AFC task were performed once, either before the first session or after the last. The order of tasks was counterbalanced between subjects.
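The count of 45 comparisons follows from taking all unordered pairs of the ten class descriptions, 10 × 9 / 2 = 45. A minimal sketch of generating and shuffling such pairs is given below; the variable names and the fixed random seed are illustrative assumptions, not the software used in the experiment.

```python
import random
from itertools import combinations

# The ten MISC class descriptions (1 to 10), identified here by class number.
misc_classes = range(1, 11)

# All unordered pairs: order within a pair is ignored.
pairs = list(combinations(misc_classes, 2))

# Present the comparisons in a random order (seed chosen only for reproducibility).
rng = random.Random(0)
presentation_order = pairs[:]
rng.shuffle(presentation_order)
```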
In Exp 6–7, we asked subjects to rate their experienced unpleasantness directly after a session on a visual analogue scale (VAS). Whilst the MAG and 2AFC tasks asked subjects to imagine how they would feel when experiencing the symptoms described, and were thus made independent of a motion stimulus, the VAS rating allowed for a direct comparison of the experienced unpleasantness and the highest MISC rating given during that session. The VAS consisted of a 12 cm line segment with endpoints “very unpleasant” and “very pleasant”. Subjects marked their judgement on this line and also indicated the main reason for their experienced unpleasantness by choosing one of the following categories: motion sickness, physical stress, temperature, smell, sound, boredom, other, and not applicable.
Data analysis
To equalize the scale range between subjects and allow for an optimally balanced comparison of the three tasks, we normalized all psychophysical ratings. For the MAG task, we first measured the drawn line length (L) for each question with a ruler. We then determined the normalized MAG ratings for each subject using their shortest and longest drawn line, giving \(\text{MAG} = (L - L_{\min})/(L_{\max} - L_{\min})\). We add subscripts 6 and 4 to refer to the reference used: MAG6 for the task using MISC 6 (n = 30) and MAG4 for the task using MISC 4 (n = 79). For the 2AFC task (n = 83), we first counted the number of times (C) a subject chose a MISC class as the most unpleasant. We then determined the normalized 2AFC ratings for each subject using the counts of the classes they had rated least and most unpleasant, giving \(\text{2AFC} = (C - C_{\min})/(C_{\max} - C_{\min})\). For the VAS task (n = 107), we first measured the distance up to the mark that each subject had drawn (V). We then determined the normalized VAS rating for each subject by dividing this distance by the total line length, giving \(\text{VAS} = V/12\).
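The three normalizations can be written compactly, as in the sketch below. The function names and example values are hypothetical, and raising an error when a subject's minimum and maximum coincide is an assumption, since the text does not describe that case.

```python
def min_max_normalize(values):
    """Rescale one subject's values to [0, 1] using that subject's own min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: the normalization is undefined when all values are equal.
        raise ValueError("all values are equal; normalization is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def normalize_vas(distance_cm, line_length_cm=12.0):
    """Divide the marked distance V by the total VAS line length (12 cm)."""
    return distance_cm / line_length_cm

# Hypothetical line lengths L (cm) drawn by one subject in the MAG task.
line_lengths_cm = [1.0, 2.5, 4.0, 10.5, 16.0]
mag = min_max_normalize(line_lengths_cm)

# Hypothetical counts C of how often each of ten classes was chosen in the 2AFC task.
choice_counts = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
afc = min_max_normalize(choice_counts)

# Hypothetical VAS mark at 9 cm along the 12 cm line.
vas = normalize_vas(9.0)
```

With these definitions, each subject's least and most unpleasant MAG and 2AFC judgements map to 0 and 1, respectively, whereas the VAS rating is scaled by the fixed line length only.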
Five subjects in Exp 6 and six subjects in Exp 7 did not perform all rating tasks. Two subjects misinterpreted the MAG4 task and reversed the direction of their line drawings (i.e., MISC 1 or 2 receiving 1 and MISC 9 or 10 receiving 0), although they performed as expected in their 2AFC ratings. For these subjects, we replaced the MAG4 ratings by 1 − MAG4. Due to an administrative error, two subjects performed the 2AFC task twice. We averaged their responses in the data analysis.
Our null hypothesis is a monotonic increase in unpleasantness with increasing symptom progression. To test for possible reductions in unpleasantness with increasing symptom progression, we compared the MAG and 2AFC ratings for all pairs of successive MISC classes using one-sided Wilcoxon Signed Rank tests with Bonferroni correction (α = 0.0056). For the VAS ratings, we followed the same procedure but with one-sided Mann–Whitney U tests instead (α = 0.0063).
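The reported significance levels are consistent with a Bonferroni division of a family-wise level of 0.05 by the number of comparisons, as the sketch below illustrates. The family-wise level of 0.05 and the counts of nine comparisons (all successive pairs of ten classes) and eight comparisons (VAS) are inferred here from the reported values rather than stated above.

```python
def bonferroni_alpha(n_comparisons, family_alpha=0.05):
    """Per-comparison significance level under Bonferroni correction."""
    return family_alpha / n_comparisons

# Ten MISC classes give nine pairs of successive classes (assumed count).
n_classes = 10
n_successive_pairs = n_classes - 1
alpha_mag_2afc = bonferroni_alpha(n_successive_pairs)  # about 0.0056

# The VAS level matches eight comparisons (assumed count).
alpha_vas = bonferroni_alpha(8)  # about 0.0063
```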
Regarding the visual presentation of data, error bars are generally plotted in the direction of the axes. Because some data allowed for a within-subject comparison of ratings (Figs. 2a and 4a), we used the opportunity to determine the interquartile ranges in directions that take the within-subject characteristics into account: along the identity line and perpendicular to that. The rotation applied to these data resulted in the displacement of some medians due to an asymmetric distribution of data points (see Supplementary Fig. S1).
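One way to obtain interquartile ranges along and perpendicular to the identity line is to rotate each pair of ratings by 45° and compute the quartiles of the rotated coordinates, as sketched below. The function names are hypothetical, and the quartile convention (the default of Python's `statistics.quantiles`) is an assumption; the exact procedure used for the figures may differ.

```python
import math
from statistics import quantiles

def rotate_to_identity_axes(x, y):
    """Express a point in coordinates along and perpendicular to the line y = x."""
    along = (x + y) / math.sqrt(2)   # position along the identity line
    across = (y - x) / math.sqrt(2)  # signed distance from the identity line
    return along, across

def iqr(values):
    """Interquartile range using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical within-subject rating pairs (illustration only).
rating_pairs = [(0.1, 0.2), (0.3, 0.3), (0.5, 0.4), (0.6, 0.8), (0.9, 0.7)]
rotated = [rotate_to_identity_axes(x, y) for x, y in rating_pairs]
iqr_along = iqr([a for a, _ in rotated])
iqr_across = iqr([c for _, c in rotated])
```

Because the median of the rotated coordinates need not coincide with the rotated median of the original coordinates, an asymmetric distribution of points can displace the plotted medians, as noted above.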