The timing processes sustaining perception of simultaneity and temporal order are respectively inferred from observers’ performance in simultaneity judgment (SJ) and temporal-order judgment (TOJ) tasks. In both cases, each trial presents a pair of stimuli whose onsets (or offsets) occur at arbitrary times, resulting in a temporal delay regarded as positive when the stimulus designated “test” lags the stimulus designated “reference” and negative when the test stimulus leads the reference stimulus. In SJ tasks, observers report whether presentation (or extinction) of the two stimuli was subjectively simultaneous. Because observers directly report perception of simultaneity, the temporal delay at which “simultaneous” responses are maximally prevalent is interpreted as the point of subjective simultaneity (PSS). In contrast, in TOJ tasks observers report which stimulus was subjectively presented (or extinguished) first (or second), with no option to report subjective simultaneity. The temporal delay at which observers give equal numbers of “test first” and “reference first” (alternatively, “test second” and “reference second”) responses is taken as the PSS, but this is only an indirect measure because the TOJ task collects no evidence of perceived simultaneity. Strictly speaking, such analysis of TOJ data only identifies the temporal delay at which “test first” and “reference first” responses are equally prevalent, which coincides with the PSS only under the strong additional assumption that observers give “test first” and “reference first” responses equally often when they perceive simultaneity. Unfortunately, the validity of such an assumption cannot be tested with TOJ data, which carry no evidence as to when observers perceived simultaneity and then guessed a response.

PSS estimates obtained with SJ and TOJ tasks have systematically been reported to differ in within-subjects studies using both tasks with the same stimuli and conditions (Barnett-Cowan & Harris, 2009, 2011; Capa, Duval, Blaison, & Giersch, 2014; Donohue, Woldorff, & Mitroff, 2010; Fujisaki & Nishida, 2009; Li & Cai, 2014; Linares & Holcombe, 2014; Love, Petrini, Cheng, & Pollick, 2013; Schneider & Bavelier, 2003; Stevenson & Wallace, 2013; van Eijk, Kohlrausch, Juola, & van de Par, 2008; van Eijk et al. 2010; Sanders, Chang, Hiss, Uchanski, & Hullar, 2011; Vatakis, Navarra, Soto-Faraco, & Spence, 2008). This has prompted the notion that simultaneity and temporal-order judgments are governed by different processes (Spence & Parise, 2010; Vroomen & Keetels, 2010). It should be noted that the PSS estimates supporting this conclusion are obtained by fitting arbitrary psychometric functions to the data, so that the SJ estimate of the PSS is the abscissa at which the fitted function peaks and the TOJ estimate is the abscissa at which the fitted function evaluates to 0.5. The fitted functions accommodate the path of the data to obtain these estimates but without links to the timing, decisional, and response processes that presumably produced the data. The absence of these links only permits a description of observed performance, precluding an evaluation of the underlying processes.

This paper looks at performance in SJ and TOJ tasks under an independent-channels model that makes explicit assumptions about the operation of timing, decisional, and response processes (García-Pérez & Alcalá-Quintana, 2012a). The resultant model-based psychometric functions thus include parameters relating to each of these processes and permit a direct test of differences in each of them across SJ and TOJ tasks. In search of these differences, the model was used to analyze 12 data sets from published studies in which both tasks defined a within-subjects factor. The analysis of data from three of the studies will be described in full detail. In the first one, Linares and Holcombe (2014) reported different PSS estimates from SJ and TOJ tasks with audiovisual pairs and concluded that perceptual latencies differ across tasks. In the second, Capa et al. (2014) had schizophrenia patients and normal controls perform SJ and TOJ tasks with visuo-visual pairs and compared PSS estimates across tasks as well as error rates in the TOJ task with error rates predicted from performance in the SJ task, concluding that PSSs as well as actual vs. predicted error rates differ, further showing that patients display a selective impairment at long temporal delays only in TOJ tasks. In the third study, Li and Cai (2014) reported different PSS estimates from SJ and TOJ tasks with visuo-visual pairs involving digits of different numerical magnitudes, concluding that under both tasks large numbers are perceived later than small numbers. Analysis of the remaining data sets is succinctly described in the Electronic Supplementary Material and involved visuo-visual pairs under a diversity of attentional cuing conditions (Schneider & Bavelier, 2003); audio-visual, visuo-tactile, audio-tactile, and tactile-tactile pairs (Fujisaki & Nishida, 2009); and audio-vestibular pairs (Sanders et al., 2011).

All analyses corroborate that discrepancies in performance across SJ and TOJ tasks result from task-dependent decisional and response processes operating on the outcome of timing processes that are identical under both tasks. The next section briefly describes the model used in our analyses, including a variant that holds under the conditions of some of the data sets. Subsequent sections describe relevant details of the experiments whose data are analyzed, specific details of the way in which the model was fitted to each data set, and the results. The Discussion provides a final overview and also comments on some methodological issues arising in research on perception of temporal order and simultaneity.

Independent-channels model

The independent-channels (IC) model was presented in full detail in García-Pérez and Alcalá-Quintana (2012a) but a brief description will be given here. Before doing so, a broad overview of its components and assumptions is worth giving in plain language. In IC models, sensory signals from each stimulus component (e.g., auditory and visual) are independently processed through the corresponding sensory pathway to render a perceived onset time for each component. These perceived onsets vary across trials according to some distribution. A timing judgment (e.g., audio first, video first, or simultaneous) then results by application of a decision rule to the perceived onset of each component. Response processes subsequently map this judgment onto an observable response (among the options given by the task), although the observer may make response errors.

The model is formally presented next in terms adapted to the experimental conditions of Linares and Holcombe (2014), where the auditory (instead of the visual) component was the reference. For all practical purposes, this only implies that subscripts “a” and “v” (denoting the auditory and visual components) are swapped in model equations with respect to equations when the visual component is the reference. A variant of the model will subsequently be presented that applies to the conditions of Capa et al. (2014) and Li and Cai (2014), in which temporal delays were delivered via pairs of visual stimuli presented side by side on a monitor.

The perceived onsets (or perceived latencies) Tv and Ta of the visual and auditory components of a stimulus are random variables with densities gv and ga given by the shifted exponential distributions

$$ {g}_i(t)={\uplambda}_i \exp \left[-{\uplambda}_i\left(t-\left(\varDelta {t}_i + {\uptau}_i\right)\right)\right],\kern1em t\ge \varDelta {t}_i + {\uptau}_i,\kern1em i\in \left\{\mathrm{v},\ \mathrm{a}\right\} $$
(1)

where Δti is the physical onset of component i, and λi and τi reflect peripheral processing and neural transmission times. If the auditory component is the reference, Δta = 0 by definition and Δt ≡ Δtv is the visual delay with which the pair is presented, where Δt < 0 (Δt > 0) reflects that the visual onset precedes (lags) the auditory onset. Figure 1a shows sample distributions when Δt = 0 (i.e., no delay), reflecting the variability with which visual and auditory onsets are perceived across trials.
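As an illustration of Eq. (1), the following minimal sketch draws perceived onsets from a shifted exponential distribution. The parameter values (`lam_v`, `tau_v`) and the helper name `sample_onset` are hypothetical and chosen only for illustration; they are not estimates from any data set discussed here.

```python
import random

def sample_onset(delta_t, lam, tau, rng):
    """Draw one perceived onset from the shifted exponential of Eq. (1)."""
    # Physical onset plus fixed shift tau, plus an exponential delay with rate lam.
    return delta_t + tau + rng.expovariate(lam)

rng = random.Random(1)           # fixed seed so the sketch is reproducible
lam_v, tau_v = 1 / 70, 50.0      # hypothetical values: rate in 1/ms, shift in ms
n = 100_000
samples = [sample_onset(0.0, lam_v, tau_v, rng) for _ in range(n)]

mean = sum(samples) / n
# No sample can precede delta_t + tau; the theoretical mean is delta_t + tau + 1/lam.
print(min(samples) >= tau_v, round(mean, 1), tau_v + 1 / lam_v)
```

Under these assumptions the sample mean should approach Δt + τ + 1/λ, which makes explicit that λ governs the spread of perceived onsets while τ only shifts them.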

Fig. 1

IC model of timing judgments. (a) Distributions of perceived visual onset (red curve) and perceived auditory onset (blue curve) for simultaneously presented stimuli (i.e., with a visual delay Δt = 0 ms). The parameters of each distribution are indicated in the inset. (b) Distribution of perceived-onset differences (curve) and boundaries in decision space (vertical lines at D = ±δ with δ = 160), determining the probability of each type of judgment. The distribution of perceived-onset differences is asymmetric unless λa = λv and peaks at D = Δt + τ. (c) Curves describing the probability of each type of judgment as a function of visual delay Δt. Circles denote the probabilities when Δt = 0 ms, coming from the partition illustrated in (b)

On a given trial, perceived onsets are drawn from these distributions and judgments arise from a decision rule applied to the perceived-onset difference D = Tv − Ta, which has the asymmetric Laplace distribution

$$ f\left(d;\ \varDelta t\right) = \left\{\begin{array}{cc}\hfill \frac{\uplambda_{\mathrm{a}}{\uplambda}_{\mathrm{v}}}{\uplambda_{\mathrm{a}}+{\uplambda}_{\mathrm{v}}} \exp \left[{\uplambda}_{\mathrm{a}}\left(d-\varDelta t-\uptau \right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d\le \varDelta t+\uptau \hfill \\ {}\hfill \frac{\uplambda_{\mathrm{a}}{\uplambda}_{\mathrm{v}}}{\uplambda_{\mathrm{a}}+{\uplambda}_{\mathrm{v}}} \exp \left[-{\uplambda}_{\mathrm{v}}\left(d-\varDelta t-\uptau \right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d>\varDelta t+\uptau \hfill \end{array}\right. $$
(2)

where τ = τv − τa. Figure 1b shows the distribution of D for the case in Fig. 1a. A resolution parameter δ (Fig. 1b) limits the observer’s ability to resolve small differences in perceived onset. Then, “video-first” (VF) judgments arise when D is large and negative (D < −δ), “audio-first” (AF) judgments arise when D is large and positive (D > δ), and “simultaneous” (S) judgments arise when D is below the resolution limit (−δ ≤ D ≤ δ). The probability of each judgment varies with Δt, as this shifts the distribution of perceived-onset differences. Figure 1c shows the functions pVF, pS, and pAF describing how the probability of each judgment varies with Δt. These functions are given by

$$ {p}_{\mathrm{VF}}\left(\varDelta t\right) = F\left(-\delta;\ \varDelta t\right) $$
(3)
$$ {p}_{\mathrm{S}}\left(\varDelta t\right) = F\left(\delta;\ \varDelta t\right)-F\left(-\delta;\ \varDelta t\right) $$
(4)
$$ {p}_{\mathrm{AF}}\left(\varDelta t\right) = 1-F\left(\delta;\ \varDelta t\right) $$
(5)

with

$$ F\left(d;\ \varDelta t\right) = \kern0.5em {\displaystyle {\int}_{-\infty}^df\left(z;\kern0.5em \varDelta t\right)}\;\mathrm{d}z = \left\{\begin{array}{cc}\hfill \frac{\uplambda_{\mathrm{v}}}{\uplambda_{\mathrm{a}}+{\uplambda}_{\mathrm{v}}} \exp \left[{\uplambda}_{\mathrm{a}}\left(d-\varDelta t-\uptau \right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d\le \varDelta t+\uptau \hfill \\ {}\hfill 1-\frac{\uplambda_{\mathrm{a}}}{\uplambda_{\mathrm{a}}+{\uplambda}_{\mathrm{v}}} \exp \left[-{\uplambda}_{\mathrm{v}}\left(d-\varDelta t-\uptau \right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d>\varDelta t+\uptau \hfill \end{array}\right. $$
(6)
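The judgment probabilities in Eqs. (3)–(5) can be sketched numerically as follows. The cumulative distribution coded here is the integral of the density in Eq. (2), so its lower branch carries the factor λv/(λa + λv) and its upper branch the factor λa/(λa + λv). Function names and parameter values are hypothetical, except δ = 160, which is the value used for illustration in Fig. 1b.

```python
import math

def cdf_d(d, delta_t, lam_a, lam_v, tau):
    """CDF of D = Tv - Ta, i.e., the integral of the density in Eq. (2)."""
    x = d - delta_t - tau
    if x <= 0:
        return lam_v / (lam_a + lam_v) * math.exp(lam_a * x)
    return 1.0 - lam_a / (lam_a + lam_v) * math.exp(-lam_v * x)

def judgment_probs(delta_t, lam_a, lam_v, tau, delta):
    """Return (p_VF, p_S, p_AF) from Eqs. (3)-(5) at visual delay delta_t."""
    f_lo = cdf_d(-delta, delta_t, lam_a, lam_v, tau)
    f_hi = cdf_d(delta, delta_t, lam_a, lam_v, tau)
    return f_lo, f_hi - f_lo, 1.0 - f_hi

# Hypothetical timing parameters (rates in 1/ms, tau in ms); delta as in Fig. 1b.
lam_a, lam_v, tau, delta = 1 / 40, 1 / 70, 30.0, 160.0
for dt in (-200, 0, 200):
    p_vf, p_s, p_af = judgment_probs(dt, lam_a, lam_v, tau, delta)
    print(dt, round(p_vf, 3), round(p_s, 3), round(p_af, 3))
```

The three probabilities sum to one at every delay, and increasing Δt moves probability mass from VF through S to AF judgments.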

Under the model, these judgments precede and are independent of the question that observers answer at the end of the trial (i.e., the task that is administered). Also, the distribution of D in Eq. (2) must be invariant across SJ and TOJ tasks for any given observer when stimuli and conditions are identical. Nevertheless, the observed psychometric functions in SJ and TOJ tasks will differ because they express the probabilities in Eqs. (3)–(5) differently, as discussed next.

In SJ tasks, AF and VF judgments are aggregated into nonsimultaneous (NS) judgments and the psychometric function for S responses in SJ tasks is

$$ {\Psi}_{\mathrm{S}\mathrm{J}}\left(\varDelta t\right) = {p}_{\mathrm{S}}\left(\varDelta t\right) $$
(7)

(Fig. 2a). In TOJ tasks, however, observers must give VF or AF responses even when they judge simultaneity. This calls for a guessing parameter ξ describing the probability with which an observer gives an AF response in these cases. The psychometric function for AF responses in TOJ tasks is, then,

Fig. 2

Psychometric functions in SJ (a) and TOJ (b) tasks. In the absence of response errors, the psychometric function in SJ tasks equals the curve describing the probability of “simultaneous” judgments illustrated in Fig. 1c. In TOJ tasks, and also in the absence of response errors, the form of the psychometric function depends on how observers transform judgments of simultaneity into audio-first (AF) or video-first (VF) responses, which is determined by the response bias parameter ξ. When ξ = 1 (red curve), judgments of simultaneity are always reported as AF responses; when ξ = 0 (blue curve), judgments of simultaneity are always reported as VF responses instead; for intermediate cases, the psychometric function will lie in between those two. The vertical dashed line in (a) is the nominal PSS in the SJ task, defined at the peak of the psychometric function; the vertical dashed lines in (b) indicate the 50% point of the psychometric function (the nominal PSS in the TOJ task), whose location is greatly affected by the response bias parameter ξ

$$ {\Psi}_{\mathrm{TOJ}}\left(\varDelta t\right) = \xi {p}_{\mathrm{S}}\left(\varDelta t\right) + {p}_{\mathrm{AF}}\left(\varDelta t\right) $$
(8)

which may seem inconsistent with ΨSJ depending on the observer’s response bias (Fig. 2b).
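The dependence of the 50% point of ΨTOJ on the response bias ξ can be illustrated with a short sketch of Eq. (8). Parameter values and function names are hypothetical; the 50% point is located by bisection, which is merely a convenient choice here and relies on ΨTOJ increasing monotonically with Δt.

```python
import math

def cdf_d(d, delta_t, lam_a, lam_v, tau):
    """CDF of D = Tv - Ta (integral of the density in Eq. 2)."""
    x = d - delta_t - tau
    if x <= 0:
        return lam_v / (lam_a + lam_v) * math.exp(lam_a * x)
    return 1.0 - lam_a / (lam_a + lam_v) * math.exp(-lam_v * x)

def psi_toj(delta_t, xi, lam_a, lam_v, tau, delta):
    """Eq. (8): probability of an AF response without response errors."""
    f_lo = cdf_d(-delta, delta_t, lam_a, lam_v, tau)
    f_hi = cdf_d(delta, delta_t, lam_a, lam_v, tau)
    return xi * (f_hi - f_lo) + (1.0 - f_hi)

def fifty_percent_point(xi, lam_a, lam_v, tau, delta, lo=-2000.0, hi=2000.0):
    """Bisection for the delay at which psi_toj equals 0.5."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if psi_toj(mid, xi, lam_a, lam_v, tau, delta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters: identical timing processes, only xi changes.
lam_a, lam_v, tau, delta = 1 / 40, 1 / 70, 30.0, 160.0
pss = {xi: fifty_percent_point(xi, lam_a, lam_v, tau, delta) for xi in (0.0, 0.5, 1.0)}
for xi, value in pss.items():
    print(xi, round(value, 1))
```

With timing parameters held fixed, moving ξ from 1 to 0 displaces the 50% point by exactly 2δ in this sketch, because the point then tracks the median of D crossing −δ or δ, respectively.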

The IC model just presented must be amended to cover realistically the mapping of judgments onto responses, a process during which errors can be made (see García-Pérez & Alcalá-Quintana, 2012a, 2012b). With this extension, the final forms of the psychometric functions in SJ and TOJ tasks are

$$ {\Psi}_{\mathrm{S}\mathrm{J}}\left(\varDelta t\right) = {\boldsymbol{\upvarepsilon}}_{\mathrm{VF}}^{\left(\mathrm{S}\mathrm{J}\right)}\ {p}_{\mathrm{VF}}\left(\varDelta t\right) + \left(1 - {\boldsymbol{\upvarepsilon}}_{\mathrm{S}}^{\left(\mathrm{S}\mathrm{J}\right)}\right){p}_{\mathrm{S}}\left(\varDelta t\right) + {\boldsymbol{\upvarepsilon}}_{\mathrm{AF}}^{\left(\mathrm{S}\mathrm{J}\right)}\ {p}_{\mathrm{AF}}\left(\varDelta t\right) $$
(9)
$$ {\Psi}_{\mathrm{TOJ}}\left(\varDelta t\right) = {\boldsymbol{\upvarepsilon}}_{\mathrm{VF}}^{\left(\mathrm{T}\mathrm{O}\mathrm{J}\right)}\ {p}_{\mathrm{VF}}\left(\varDelta t\right) + \xi {p}_{\mathrm{S}}\left(\varDelta t\right) + \left(1 - {\boldsymbol{\upvarepsilon}}_{\mathrm{AF}}^{\left(\mathrm{T}\mathrm{O}\mathrm{J}\right)}\right){p}_{\mathrm{AF}}\left(\varDelta t\right) $$
(10)

where each ε reflects the probability of misreporting the judgment indicated by its subscript under the task indicated by its parenthetical superscript. Note that Eqs. (7) and (8) obtain when all εs are zero (i.e., when response errors are never made).
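Equations (9) and (10) amount to weighting the three judgment probabilities differently in each task, which the following sketch makes explicit. The functions take the judgment probabilities as inputs; all numerical values are hypothetical.

```python
def psi_sj(p_vf, p_s, p_af, eps_vf=0.0, eps_s=0.0, eps_af=0.0):
    """Eq. (9): probability of a 'simultaneous' response in the SJ task."""
    return eps_vf * p_vf + (1.0 - eps_s) * p_s + eps_af * p_af

def psi_toj(p_vf, p_s, p_af, xi, eps_vf=0.0, eps_af=0.0):
    """Eq. (10): probability of an 'audio-first' response in the TOJ task."""
    return eps_vf * p_vf + xi * p_s + (1.0 - eps_af) * p_af

# Hypothetical judgment probabilities at a large negative delay,
# where VF judgments dominate.
p_vf, p_s, p_af = 0.98, 0.02, 0.0

# With all error parameters at zero, Eqs. (7) and (8) are recovered.
print(psi_sj(p_vf, p_s, p_af), psi_toj(p_vf, p_s, p_af, xi=0.5))

# A hypothetical 10% rate of misreporting VF judgments lifts both
# functions away from zero at this delay.
print(psi_sj(p_vf, p_s, p_af, eps_vf=0.1), psi_toj(p_vf, p_s, p_af, xi=0.5, eps_vf=0.1))
```

Because pVF or pAF approaches 1 at extreme delays, the error parameters set the asymptotes of the psychometric functions there and leave the central region largely unaffected.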

It is useful to consider at this point how IC model parameters affect the (theoretical) shapes of ΨSJ and ΨTOJ, although we will mostly discuss these effects with respect to ΨSJ because the same effects just carry over to ΨTOJ. Parameters λa and λv come from the distributions of perceived onsets (Fig. 1a), making ΨSJ bilaterally asymmetric about its peak when λa ≠ λv and being separately responsible for the different drop-off rates of ΨSJ on either side (Fig. 2a). Parameter τ is responsible for the location of the peak of the distribution of perceived-onset differences (Fig. 1b) which, in turn, determines the location of the peak of ΨSJ (Fig. 2a). Parameter δ determines the width of the central region in decision space (Fig. 1b), which has direct effects on the height of ΨSJ at its peak and also on its overall breadth: ΨSJ is short and narrow when δ is small, whereas it is tall and broad when δ is large (not illustrated in the preceding figures). Because these parameters produce different effects on ΨSJ, they are not confounded and, thus, can be recovered (within sampling error) from empirical SJ data showing various degrees of asymmetry, lateral shifts, or breadths and heights. Some examples of the empirical diversity of these characteristics will be seen in our forthcoming analyses of data. The effects of the additional response bias parameter ξ unique to the TOJ task can be appreciated in the three sample psychometric functions in Fig. 2b, which also incorporate the effects of the parameters discussed above. Finally, the ε parameters only affect the asymptotic behavior of either type of psychometric function at large (positive or negative) delays, although the effect can be as large as shown in some cases (e.g., SJ and TOJ data and fitted functions for observer DL at large negative delays in Fig. 4a below, or analogous effects at large delays of both signs for observer P07 in Fig. 6a below).

The IC model thus describes performance in SJ and TOJ tasks with timing parameters (λa, λv, and τ) that have common values as long as stimuli and conditions are identical for both tasks and with decisional and response parameters (δ, ξ, and the various εs) that may vary across tasks. These features permit direct tests of the model in within-subjects conditions, checking the feasibility of accounting for the data under the assumption of identical timing processes across tasks. It should nevertheless be noted that timing parameters may vary greatly across sensory modalities (i.e., they will reasonably vary for visual, auditory, tactile, or vestibular stimuli due to specificities of peripheral processing and neural transmission along the corresponding pathways), across stimuli within the same sensory modality (e.g., due to the place–frequency map of the basilar membrane, low and high auditory frequencies reach their associated locations at different times; Uppenkamp, Fobel, & Patterson, 2001; Wojtczak, Beim, Micheyl, & Oxenham, 2012; Wojtczak et al., 2013), or across conditions involving the same stimuli (e.g., when different types of attentional cue are used in separate experimental conditions; see our analysis of Schneider and Bavelier’s (2003) data in the Electronic Supplementary Material). What the IC model posits is that, all else being equal, timing parameters do not vary across SJ and TOJ tasks.

The model simplifies when temporal delays are delivered via identical stimuli as in the experiment of Capa et al. (2014), under the reasonable assumption that timing processes do not operate differently when visual stimuli are presented on the left or on the right visual field. Even though both stimuli are identical, it is still necessary to designate one of them as the reference to differentiate negative from positive delays. Without loss of generality, we will regard the stimulus presented on the left side of the monitor as the reference so that subscripts “a” and “v” in the previous equations are respectively replaced with “l” and “r.” Nevertheless, because the two stimuli are identical, perceived onsets must be identically distributed for both of them, yielding λl = λr = λ and τ = τr – τl = 0. Carrying over these substitutions, Eq. (2) reduces to the Laplace distribution

$$ f\left(d;\ \varDelta t\right) = \frac{\uplambda}{2} \exp \left[-\uplambda \left|d-\varDelta t\right|\right] $$
(11)

and Eq. (6) becomes

$$ F\left(d;\ \varDelta t\right)=\left\{\begin{array}{cc}\hfill \frac{1}{2} \exp \left[\uplambda \left(d-\varDelta t\right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d\le \varDelta t\hfill \\ {}\hfill 1-\frac{1}{2} \exp \left[-\uplambda \left(d-\varDelta t\right)\right]\hfill & \hfill \mathrm{if}\kern0.5em d>\varDelta t\hfill \end{array}\right. $$
(12)

A further consequence of using pairs of identical stimuli is that the resultant removal of parameter τ permits relaxing the assumption of a symmetric central region in decision space. Amending the model to allow for non-symmetric boundaries respectively placed at δ1 and δ2 (with δ1 ≠ −δ2) incorporates the notion of decisional bias towards one of the response options (Fig. 3). A symmetric central region (and, hence, the assumption of lack of decisional bias) is needed in the general version of the model because SJ and TOJ tasks are instances of the method of single stimuli (MSS) and share its limitations. Specifically, and with reference to the diagram in Fig. 1b, a rigid displacement of the decision boundaries (by shifting the boundary currently at –δ to δ1 = –δ + k and similarly shifting the boundary at δ to δ2 = δ + k, with positive or negative k) produces the same psychometric functions that would result by leaving the boundaries at –δ and δ and replacing τ with τ − k, because F in Eq. (6) depends on the boundary only through its distance from Δt + τ. This non-identifiability is not a property of this model but a general shortcoming of MSS (García-Pérez & Alcalá-Quintana, 2013; Schneider & Bavelier, 2003; Yarrow, Jahn, Durant, & Arnold, 2011). Yet, under the conditions discussed now, where τ = 0, this non-identifiability is removed and potentially asymmetric boundaries at δ1 and δ2 can be estimated. Then, in place of Eqs. (3)–(5) above, the probabilities of “right-first” (RF), “simultaneous” (S), and “left-first” (LF) judgments are given by

Fig. 3

IC model of timing judgments when temporal delays are delivered via identical visual stimuli presented on the left and on the right of a fixation point on a monitor. (a) Distribution of perceived onset for the only stimulus implied in this case. (b) Distribution of perceived-onset differences (curve) and potentially asymmetric boundaries in decision space (vertical lines at D = δ1 = −160 and D = δ2 = 80). Because perceived onsets are identically distributed for both stimuli, the distribution of perceived-onset differences is symmetric and peaks at D = Δt. (c) Curves describing the probability of each type of judgment as a function of the temporal delay Δt with which the stimulus on the right is presented. Circles denote the probabilities when Δt = 0 ms, coming from the partition illustrated in (b). Because the decision boundaries δ1 and δ2 are not symmetrically placed around D = 0, the vertical axis of bilateral symmetry of the ensemble of curves is at Δt = (δ1 + δ2)/2 = −40 ms

$$ {p}_{\mathrm{RF}}\left(\varDelta t\right) = F\left({\delta}_1;\ \varDelta t\right) $$
(13)
$$ {p}_{\mathrm{S}}\left(\varDelta t\right) = F\left({\delta}_2;\ \varDelta t\right)-F\left({\delta}_1;\ \varDelta t\right) $$
(14)
$$ {p}_{\mathrm{LF}}\left(\varDelta t\right) = 1-F\left({\delta}_2;\ \varDelta t\right) $$
(15)

which are subsequently combined with appropriately subscripted error parameters as in Eqs. (9) and (10) to render the psychometric functions in SJ and TOJ tasks under these conditions. Thus, the IC model again describes performance in SJ and TOJ tasks involving identical stimuli and conditions with common timing parameters (now only λ) and with decisional and response parameters (δ1, δ2, ξ, and the various εs) that may vary across tasks.
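Two properties of the variant with asymmetric boundaries can be checked numerically with a brief sketch. First, with identical stimuli (λl = λr = λ, τ = 0) and the boundaries of Fig. 3 (δ1 = −160, δ2 = 80), the probability of S judgments in Eq. (14) peaks at Δt = (δ1 + δ2)/2. Second, in the general model a rigid shift of both boundaries by k is indistinguishable from replacing τ with τ − k, since F depends on the boundary only through d − Δt − τ. Values of λ, τ, and k below are hypothetical.

```python
import math

def cdf_d(d, delta_t, lam_a, lam_v, tau):
    """General CDF of perceived-onset differences; Eq. (12) when lam_a == lam_v, tau == 0."""
    x = d - delta_t - tau
    if x <= 0:
        return lam_v / (lam_a + lam_v) * math.exp(lam_a * x)
    return 1.0 - lam_a / (lam_a + lam_v) * math.exp(-lam_v * x)

def p_simultaneous(delta_t, lam_a, lam_v, tau, d1, d2):
    """Probability that D falls between boundaries d1 < d2 (Eq. 14 when tau == 0)."""
    return cdf_d(d2, delta_t, lam_a, lam_v, tau) - cdf_d(d1, delta_t, lam_a, lam_v, tau)

# (1) Identical stimuli: hypothetical lam, boundaries as in Fig. 3.
lam, d1, d2 = 1 / 60, -160.0, 80.0
grid = range(-300, 301)
peak = max(grid, key=lambda dt: p_simultaneous(dt, lam, lam, 0.0, d1, d2))
print(peak, (d1 + d2) / 2)

# (2) General model: shifting the boundaries by k versus replacing tau with tau - k.
lam_a, lam_v, tau, delta, k = 1 / 40, 1 / 70, 30.0, 120.0, -40.0
max_gap = max(
    abs(p_simultaneous(dt, lam_a, lam_v, tau, -delta + k, delta + k)
        - p_simultaneous(dt, lam_a, lam_v, tau - k, -delta, delta))
    for dt in grid
)
print(max_gap)
```

The second check illustrates why τ and a decisional bias k cannot both be estimated in the general model, whereas fixing τ = 0 leaves δ1 and δ2 free to be estimated.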

Data and model fit

Linares and Holcombe (2014)

In the study by Linares and Holcombe (2014), seven observers performed SJ and TOJ tasks with the same stimuli and conditions. The (reference) auditory stimulus was a 10-ms white noise burst; the visual (test) stimulus was a 10-ms change from black to white and back to black in the luminance of a circle displayed on a grey background. Test onset times were manipulated to occur with a delay Δt ranging from −250 ms to 250 ms in steps of 50 ms and 20 trials were given at each delay.

We downloaded the data made available by the authors from http://www.dlinares.org and fitted the IC model for each observer jointly across tasks using a Fortran version of the software described in Alcalá-Quintana and García-Pérez (2013). A joint fit implies that the timing parameters λa, λv, and τ are estimated to have common values for both tasks, whereas decisional and response parameters (δ, ξ, and the εs) vary across tasks. If the data cannot be properly accounted for on the implied assumption that timing parameters have common values in both tasks, the fit will be poor and fitted curves will not follow the path of the data. Non-null values for the εs were not pre-assumed; rather, the software option model = “best” (Alcalá-Quintana & García-Pérez, 2013) was used to determine which εs ought to be non-null. The joint fit involves somewhere between 6 parameters (when all εs are null) and 11 parameters (when all εs are non-null) per observer. For comparison, we also fitted the data using the approach of Linares and Holcombe (2014), which separately fits a four-parameter cumulative Gaussian to TOJ data and a three-parameter scaled Gaussian to SJ data. This yields seven parameters (two of which are error rates) per observer. The interpretation of those parameters in terms of underlying timing processes is impossible, because the descriptive approach only produces curves whose location and slope best delineate the path of the data. For both fitting approaches, we computed the likelihood-ratio statistic G² as a measure of goodness of fit. The outcomes of the two approaches were compared with the Bayesian Information Criterion (BIC), which assesses economy of description for each approach by combining the goodness-of-fit measure and the number of parameters used to attain that fit.
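For readers unfamiliar with these measures, the sketch below computes a likelihood-ratio statistic and a BIC for binomial data in their textbook forms. It is not the code used in the analyses, and the exact BIC variant (in particular, whether the sample size is the number of trials or the number of delays) is an assumption here. Response counts and model probabilities are hypothetical.

```python
import math

def binom_loglik(successes, trials, probs):
    """Binomial log-likelihood, omitting the constant binomial coefficients."""
    total = 0.0
    for x, n, p in zip(successes, trials, probs):
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # guard against log(0)
        total += x * math.log(p) + (n - x) * math.log(1.0 - p)
    return total

def g2(successes, trials, probs):
    """Likelihood-ratio statistic: model versus the saturated model."""
    saturated = [x / n for x, n in zip(successes, trials)]
    return 2.0 * (binom_loglik(successes, trials, saturated)
                  - binom_loglik(successes, trials, probs))

def bic(loglik, n_params, n_obs):
    """Textbook BIC; n_obs is taken as the total number of trials (an assumption)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical 'simultaneous' counts at 11 delays with 20 trials each.
trials = [20] * 11
counts = [1, 2, 5, 11, 17, 19, 18, 14, 7, 3, 1]
model_p = [0.05, 0.12, 0.30, 0.55, 0.80, 0.92, 0.88, 0.68, 0.40, 0.15, 0.05]

stat = g2(counts, trials, model_p)
ll = binom_loglik(counts, trials, model_p)
print(round(stat, 2), round(bic(ll, 6, sum(trials)), 2), round(bic(ll, 7, sum(trials)), 2))
```

For a given log-likelihood, each additional parameter raises the BIC by the logarithm of the sample size, which is how the criterion penalizes less economical descriptions.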

Capa et al. (2014)

In the study by Capa et al. (2014), 20 schizophrenia patients and 20 normal controls performed SJ and TOJ tasks with identical stimuli and conditions. The reference stimulus for our analysis was a gray rectangle presented on the left of a central fixation point on the monitor; the test stimulus was an identical rectangle presented on the right of the fixation point with a temporal delay Δt that ranged from −96 ms to 96 ms in steps of 24 ms. Offset times did not differ, as both stimuli remained on display until observers responded. At each non-null temporal delay, 20 TOJ trials and 54 SJ trials were given; the number of trials was doubled at the null temporal delay. Some observers missed some trials in the SJ task, so that the effective number of SJ trials was usually smaller and also varied across temporal delays. In the TOJ task, observers reported which stimulus was presented second (rather than first), which is largely inconsequential for our analyses: Judgments LF and RF referred to in Eqs. (13)-(15) straightforwardly translate into “right second” (RS) and “left second” (LS), respectively.

We obtained raw data from Anne Giersch and fitted the adapted version of the IC model also jointly across tasks. The same software was used for this purpose, once modified to implement the constraints and extensions discussed above. The joint fit again implies that the timing parameter λ is estimated to have the same value in both tasks, whereas decisional and response parameters (δ1, δ2, ξ, and the εs) vary across tasks. Evidence that timing processes are identical in both tasks will again show in the form of a good fit (i.e., nonsignificant G²) and fitted curves that follow the path of the data from each task. Decisions about which of the εs ought to be non-null were made as with the previous data set. In this case, the joint fit also involves somewhere between 6 parameters (when all εs are null) and 11 parameters (when all are non-null) per observer. Descriptive psychometric functions were not fitted to the data because the analyses reported by Capa et al. (2014) were not based on fitted functions.

Li and Cai (2014)

In the study of Li and Cai (2014), 16 observers carried out SJ and TOJ tasks with pairs of visual stimuli flashed for 50 ms on the left and on the right of a fixation point on a monitor under identical conditions. Stimuli were black digits presented on a uniform gray background and the reference versus test distinction was based on the numerical magnitude of the digits. The study investigated whether the numerical magnitude of a digit affects the speed with which it is processed. Small digits were “1” or “2” and large digits were “8” or “9.” A large and a small digit were selected at random for presentation in each trial and, in turn, the positions (left and right of the fixation point) of the two digits also were decided at random in each trial. The small digit was regarded as the reference irrespective of the position at which it was displayed and the large (test) digit was presented at the other position with a temporal delay Δt of 0, ±10, ±20, ±30, or ±50 ms. Data were aggregated across presentation positions and choice of digit, yielding 48 trials at each temporal delay under each task. Observers kept their gaze at the fixation point throughout the trial and reported whether or not the two presentations were subjectively simultaneous (SJ task) or the side on which a digit had subjectively appeared first (TOJ task; with “left first” and “right first” responses translated accordingly into “small number first” or “large number first”).

We fitted the IC model to raw data provided by Shuang-Xia Li, using the simplified model used for the data from Capa et al. (2014), because the distributions of perceived onsets are unlikely to differ for reference (small number) and test (large number) stimuli. The numerical magnitude of a digit can only be extracted through cognitive processing once the stimulus image has been peripherally processed and transmitted up the visual pathway, and such late processing cannot retroactively affect perceived onsets. Then, at the stage in which stimulus onsets are perceived, test and reference stimuli might only differ in their overall luminance (given by the spatial extent of the black area making up each digit), but these were reportedly equated by making all digits contain the same number of black pixels. Our analyses thus proceeded as described in the preceding section for the data from Capa et al. (2014), although we also fitted the general model in the unlikely event that it provided a better account of the data (which was not the case; those results will not be presented in detail).

Results

Linares and Holcombe (2014)

Figure 4a shows data and fitted IC model psychometric functions for each observer in each task (see Table 1 for parameter estimates). The G² statistic was not significant for any observer (see Footnote 1) and model curves follow the path of the data as closely as the occasionally noisy data permit. Thus, performance in both tasks can be understood as emanating from identical timing processes governed by parameters λa, λv, and τ, with the participation of decisional and response processes that vary across tasks. Despite common timing parameters, the peak of ΨSJ (red vertical line in each panel) does not occur at the same Δt as the 50% point on ΨTOJ does (blue vertical line in each panel), mostly because the latter is determined by the guessing parameter ξ with which observers give AF or VF responses when they judge simultaneity (Fig. 2b). Indeed, individual estimates of ξ (Table 1) differ meaningfully from 0.5 except for observer SN, revealing that observers tend to give imbalanced AF and VF responses upon S judgments. For this reason, the 50% point on ΨTOJ does not reflect the true location of the (sensory) PSS, and a comparison with a PSS defined at the location of the peak of ΨSJ is inadequate. The lack of an S response option in TOJ tasks seems to prompt attentive observers to push their resolution limit to minimize the need for guessing. This results in a functionally narrower central region in decision space or, in other words, a smaller value for parameter δ. Evidence to this effect has been found in other studies (García-Pérez & Alcalá-Quintana, 2012a) and is apparent in Table 1: estimated δ(SJ) is invariably larger than estimated δ(TOJ) for each observer. A two-tailed, paired-samples t test for means revealed that these differences were statistically significant (t(6) = 3.36; p = 0.015).

Fig. 4

Data and fitted psychometric functions for the study of Linares and Holcombe (2014) under the IC model (a) and under a descriptive approach (b). Results for the SJ task are shown in red; results for the TOJ task are shown in blue. Each column pertains to the observer indicated at the top. Estimated PSSs for each observer from the SJ and TOJ tasks are shown by vertical lines of the appropriate color in the upper panels

Table 1 Estimated IC model parameters in the joint fit to SJ and TOJ data for each observer in the experiment by Linares and Holcombe (2014)

Figure 4b shows the results of independently fitting a scaled Gaussian to SJ data and a cumulative Gaussian to TOJ data. This approach describes the paths of the data without links to underlying processes and only serves the goal of identifying landmarks (peaks or percent points) on the fitted functions. The fitted functions nevertheless do not do justice to the data. They cannot capture conspicuous asymmetries or flat plateaus in SJ data, and nonmonotonic patterns or reductions in slope around the null delay in TOJ data. Surely as a result of this failure to accommodate the data, the G² statistic (with 8 and 7 degrees of freedom respectively for SJ and TOJ data) was significant in 5 of 14 cases (35.7%): in the SJ task, for observers AH (G² = 35.60; p < 0.001), AL (G² = 21.11; p = 0.007), and SM (G² = 21.62; p = 0.006); in the TOJ task, for observers DL (G² = 21.45; p = 0.003) and SM (G² = 16.88; p = 0.018).
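For readers unfamiliar with the likelihood-ratio statistic, the following minimal sketch computes G² for binary response data, comparing observed counts with the counts expected under a fitted psychometric function. It assumes binomial data at each delay and fitted probabilities strictly between 0 and 1; the function and argument names are hypothetical and do not correspond to any particular software.

```python
import math

def g2_statistic(successes, trials, predicted):
    """Likelihood-ratio statistic G^2 for binomial data.

    successes[i]: count of the target response at delay i
    trials[i]:    number of trials at delay i
    predicted[i]: fitted probability of the target response at delay i
                  (assumed to lie strictly between 0 and 1)
    """
    g2 = 0.0
    for y, n, p in zip(successes, trials, predicted):
        # One term for the target response and one for its complement
        for observed, expected in ((y, n * p), (n - y, n * (1 - p))):
            if observed > 0:  # a zero count contributes nothing (0 * log 0 := 0)
                g2 += 2 * observed * math.log(observed / expected)
    return g2

# Hypothetical data: 8 target responses in 10 trials where the model predicts 0.5
print(g2_statistic([8], [10], [0.5]))
```

A perfect fit yields G² = 0, and larger values indicate greater discrepancy between data and fitted curve. Significance is then judged against a chi-square distribution with the appropriate degrees of freedom.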

Note also in Table 1 that the good fit of the IC model was attained with only six parameters for observers SG and SN, with seven parameters for observers AH and GS, and with eight, nine, and ten parameters respectively for observers DL, SM, and AL. In contrast, the descriptive approach estimates a fixed number of seven parameters per observer. Although it seems clear in Fig. 4 that the better fit of the IC model is structural and not attained simply by recourse to more parameters, Fig. 5 shows a scatter plot of the BIC of each model across observers. A model with a smaller BIC accounts for the data more efficiently, provided the models under comparison fit the data similarly well in the first place (which is not the case here, as the descriptive approach failed to fit the data properly in a number of cases). By the BIC, the IC model also outperforms the descriptive approach for all observers, even in cases in which it uses more parameters to account for the data.
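The BIC comparison can be sketched as follows, using the standard definition BIC = −2 log L + k log n. The sketch assumes that both models are fitted to the same data, so that the saturated-model likelihood cancels and the BIC difference can be obtained directly from the G² values. All names and numbers are hypothetical and chosen only to illustrate that a model with more parameters can still attain the smaller BIC.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def bic_difference_from_g2(g2_a, k_a, g2_b, k_b, n_obs):
    """BIC(A) - BIC(B) for two models fitted to the same data.

    Since G^2 = 2 (log L_saturated - log L_model), the saturated term
    cancels in the difference; negative values favor model A.
    """
    return (g2_a - g2_b) + (k_a - k_b) * math.log(n_obs)

# Hypothetical case: model A uses 10 parameters and fits well (G^2 = 10),
# model B uses 7 parameters and fits poorly (G^2 = 40), with 1000 observations.
print(bic_difference_from_g2(10.0, 10, 40.0, 7, 1000))
```

In this made-up case the difference is negative, so the penalty for three extra parameters does not offset the improvement in fit.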

Fig. 5

Plot of the BIC of the IC model against the BIC of the descriptive model for fitting data from the study of Linares and Holcombe (2014). Each symbol denotes an observer. The color of each symbol indicates the number of parameters in the IC model fit (see legend); the descriptive model fit always included seven parameters

Finally, related to the number of parameters needed to account for the data, note in Table 1 that error parameters were rarely needed to account for SJ data, whereas they were needed more often to account for TOJ data. This seems to be caused by a further difference between the tasks in terms of the responses they request. Specifically, some observers seem to identify non-simultaneity easily but have difficulty indicating which of the two stimuli came first. Then, errors in the TOJ task (defined as failures to correctly report which stimulus was presented first) are observed for delays at which no errors are observed in the SJ task (e.g., data at large negative or large positive visual delays for observers AH, AL, and GS in Fig. 4).

It is interesting to look at the picture that the IC model fit offers with respect to the issue investigated by Linares and Holcombe (2014), namely, differences in perceptual latency across tasks. First, note that the location of the peak of ΨSJ (the classical estimate of the PSS) is not interpretable as the mean difference between visual and auditory latencies (more generally, between the latencies of reference and test stimuli). Hence, the location of the peak of ΨSJ does not reflect the true PSS. As discussed in detail two paragraphs down, the mean difference in perceptual latency is 1/λv – 1/λa + τ under the IC model, whereas ΨSJ peaks instead at Δt = δ(λv – λa)/(λv + λa) – τ also under the model (García-Pérez & Alcalá-Quintana, 2012a). Second, even if ξ = 0.5, the 50% point on ΨTOJ occurs at Δt = δ(λv – λa)/(λv + λa) – τ + (logλv – logλa)/(λv + λa) (García-Pérez & Alcalá-Quintana, 2012a), matching the location of the peak of ΨSJ only when λv = λa. Then, (classical) PSS estimates from SJ and TOJ tasks have diverse influences and their comparison cannot reveal differences in perceptual latency.
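The three expressions in the preceding paragraph can be evaluated side by side to see how far the classical landmarks may lie from the mean latency difference. The sketch below simply transcribes those expressions; the function names and the parameter values are hypothetical and are not estimates from Table 1.

```python
import math

def mean_latency_difference(lam_v, lam_a, tau):
    """Mean difference in perceptual latency: 1/lam_v - 1/lam_a + tau."""
    return 1 / lam_v - 1 / lam_a + tau

def sj_peak(lam_v, lam_a, tau, delta):
    """Location of the peak of Psi_SJ, as given in the text."""
    return delta * (lam_v - lam_a) / (lam_v + lam_a) - tau

def toj_midpoint_unbiased(lam_v, lam_a, tau, delta):
    """Location of the 50% point on Psi_TOJ when xi = 0.5, as given in the text."""
    shift = (math.log(lam_v) - math.log(lam_a)) / (lam_v + lam_a)
    return sj_peak(lam_v, lam_a, tau, delta) + shift

# Hypothetical parameter values, in the same time units as the delay
lam_v, lam_a, tau, delta = 1 / 60, 1 / 30, 20.0, 80.0
print(mean_latency_difference(lam_v, lam_a, tau))
print(sj_peak(lam_v, lam_a, tau, delta))
print(toj_midpoint_unbiased(lam_v, lam_a, tau, delta))
```

Setting lam_v equal to lam_a makes the last two quantities coincide, which is the only case in which the two classical PSS estimates agree when ξ = 0.5.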

In contrast, some of the IC model parameters describe the assumed distributions of perceived onsets and could thus be used to refer performance back to perceptual latencies. Specifically, from Eq. (2) with Δti = 0 for i ∈ {v, a}, the mean and variance of the distributions of perceived onsets (perceptual latencies) are respectively μi = 1/λi + τi and σi² = 1/λi². Estimating the distributions of visual and auditory latencies would be possible under a model-based approach were it not that SJ and TOJ tasks are instances of MSS and thus have the two inescapable shortcomings discussed next.
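The stated mean and variance are those of a shifted exponential distribution, and a quick simulation makes the relation concrete. The sketch below assumes that perceived onsets are distributed as τ plus an exponential variate with rate λ, which is consistent with the moments given above but is stated here as an assumption about Eq. (2). The function name and parameter values are hypothetical.

```python
import random
import statistics

def simulate_latencies(lam, tau, n, seed=0):
    """Draw n perceptual latencies as tau plus an exponential variate with rate lam."""
    rng = random.Random(seed)
    return [tau + rng.expovariate(lam) for _ in range(n)]

# Hypothetical values: rate 0.05 (mean 20 time units) and shift 10
samples = simulate_latencies(lam=0.05, tau=10.0, n=100_000)

# Sample moments next to the theoretical values 1/lam + tau and 1/lam**2
print(statistics.fmean(samples), 1 / 0.05 + 10.0)
print(statistics.pvariance(samples), 1 / 0.05 ** 2)
```

Note that the shift moves the mean but leaves the variance untouched, which is why the variance remains estimable even when the shift is not.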

Even though observers rely on separate perceived latencies to make their judgment in each trial, they are requested to respond in a way that only the relative difference in latency (not the individual values) manifests under either task. This became clear during model development at the point in which parameters τa and τv combine additively into τ, which is the only parameter that can be estimated with these tasks. This is not a property of this model, but the consequence of a task requiring the comparison of two sensory inputs (two perceptual latencies in this case). Then, the mean perceptual latency of visual or auditory signals cannot be estimated in absolute terms and only the difference between them can be estimated at μv − μa = 1/λv – 1/λa + τ, identical for SJ and TOJ tasks when timing parameters are common. From parameter estimates in Table 1, the difference is negative at –4.4 only for observer AL and ranges from 10.7 for observer SM to 76.1 for observer AH. Estimated visual latencies that are longer than estimated auditory latencies are consistent with the slower peripheral processing and neural transmission of visual compared to auditory signals (see references to this effect in Vroomen & Keetels, 2010). Despite consistency with other sources of evidence, these estimated differences between visual and auditory latencies must be incorrect because of the second shortcoming of MSS.

In the general model, the central region in decision space is assumed to be symmetrically placed around the origin for reasons discussed above (Fig. 1b). Yet, this region might well be asymmetrically placed with boundaries at δ1 and δ2 (with δ1 ≠ −δ2) as a result of decisional bias (Fig. 3b). Because this asymmetry is confounded with parameter τ in the general model, fitting the model under the necessary assumption that |δ1| = |δ2| = δ converts nominal estimates of τ into estimates of τ + (δ1 + δ2)/2, with unknown δ1 and δ2. Then, estimates of μv − μa presented in the preceding paragraph are contaminated by a decisional bias whose magnitude cannot be assessed with SJ or TOJ tasks involving test and reference stimuli of different types.

In sum, the good fit of the IC model with common timing parameters to the data of Linares and Holcombe (2014) implies that identical perceptual latencies under both tasks cannot be ruled out, although estimating their values (or the difference between them) is hampered by the inevitable shortcomings of MSS. Some ways around these difficulties will be presented in the Discussion.

Capa et al. (2014)

Figure 6 shows data and fitted psychometric functions for each observer in each task, separately for schizophrenia patients (Fig. 6a) and normal controls (Fig. 6b), and with a final panel for average data and average fitted functions in each group. (Some observers were excluded from these averages; see the caption to Fig. 6.) Parameter estimates are listed in Tables 2 and 3. Compared with the analysis carried out by Capa et al. (2014), we did not collapse data at negative and positive delays of the same absolute magnitude. In fact, TOJ data often are asymmetric about the null delay (e.g., TOJ data for patients P01, P08, and P09 or controls C01, C06, C08, and C10). This asymmetry is captured by the fitted model, as also is the approximate symmetry of other data (e.g., TOJ data for patients P14, P19, and P20 or controls C03, C13, and C20). Asymmetric TOJ data are mostly determined by the response bias parameter ξ, but an asymmetric central region in decision space also plays a role. Several other peculiar aspects of the data are well captured by the model, as discussed next.

Fig. 6

Data and fitted psychometric functions for the study of Capa et al. (2014) under the IC model in the sample of patients (a) and normal controls (b). Results for the SJ task are shown in red; results for the TOJ task are shown in blue. Each panel pertains to an observer; the grayed panel at the bottom right in each part plots data and psychometric functions aggregated across observers. Due to their peculiar responses in the TOJ task, data and fitted curves for patients P03, P04, P05, P15, P17, and P18 and control C19 are excluded from these averages

Table 2 Estimated IC model parameters in the joint fit to SJ and TOJ data for each schizophrenia patient in the experiment by Capa et al. (2014)
Table 3 Estimated IC model parameters in the joint fit to SJ and TOJ data for each normal control in the experiment by Capa et al. (2014)

First, note that some observers seemed to respond to the TOJ task in reverse, as the proportion of RS responses decreases as right delay increases (see TOJ data for patients P04, P05, P15, and P18 or for control C19). The model nevertheless fits the data well by assuming common timing parameters under both tasks, because this response pattern is taken to reflect response errors (Alcalá-Quintana & García-Pérez, 2013) that do not hamper the estimation of timing and decisional parameters determining other aspects of observed performance. Second, note that patient P03 responded to the TOJ task as if he/she were performing an SJ task instead, using the designated response key for RS judgments to report asynchrony. The model also accommodates these data via response error parameters, adequately describing other aspects of observed performance via timing and decisional parameters. Third, note that patient P17 did not give a single RS response in the TOJ task, providing uninformative data that are nevertheless compatible with the model via a broad central region in decision space coupled with a strong bias against the RS response. Fourth, the path of TOJ data often displays a remarkable reduction in slope around the null delay (e.g., data from patients P02, P06, P07, P08, P10, P16, and P19 or from controls C01, C02, C05, C11, C12, C13, C15, C16, C17, C18, and C20). This characteristic is expected from the response bias parameter ξ (Fig. 2b) and is well captured by the model curves. Fifth, note that SJ data show bilateral symmetry, which is expected when the same stimulus is used to deliver delays and, hence, the distribution of perceived onsets is the same for both stimuli (Fig. 3). Model curves incorporating this assumption fit the data well, although the vertical axis of bilateral symmetry is not always at the null delay (e.g., data from patients P15 and P17) due to asymmetric placement of the decision boundaries (Fig. 3c).
Bilateral symmetry sometimes breaks at the tails due to response errors (e.g., SJ data from patients P03, P06, P07, P11, and P14 or from control C20). Finally, note that TOJ performance again indicates errors of identification of temporal order when SJ performance reveals errorless identification of asynchrony (e.g., SJ and TOJ data points from patients P01 and P15 or controls C12 and C13 at large negative delays, or from patients P06, P10, and P13 or control C12 at large positive delays). All of these aspects are adequately captured through suitable model parameters, thus permitting an interpretable account of the timing, decisional, and response processes underlying observed performance within and across tasks.

Note in Tables 2 and 3 that the G² statistic was not significant for any patient and was significant for only two controls (C02 and C20). A statistically good fit is not surprising given that model curves follow the path of the data very closely for each observer, even in the two cases in which the G² statistic was significant. There are nevertheless meaningful differences between the location of the peak of ΨSJ and the location of the 50% point on ΨTOJ (when the latter exists) in the panel for each observer in Fig. 6, differences that are also patent at the average level among patients (see the grayed panel in Fig. 6a). These differences occur despite the fact that parameter λ is estimated to have the same value in both tasks under our joint fit, which reveals again that the different PSSs obtained with SJ and TOJ tasks are compatible with timing processes operating identically under both tasks.

Given the manifest performance differences across patients and controls at the average level (grayed panels in Figs. 6a, b), differences in parameter estimates across groups are worth looking at. The average 1/λ was 17.11 in the group of patients and 14.72 in the group of controls. This difference was not significant by a two-tailed, separate-variance two-sample t test for means (z = 0.83; p = 0.404), revealing that timing processes do not differ between patients and controls. Differences between groups in decisional parameters also were sought by analyzing the width δ2 – δ1 and the center (δ1 + δ2)/2 of the central region in decision space (see scatter plots of these measures across tasks for each group in Fig. 7). Parameter estimates from patient P17 were excluded from these analyses, as this observer did not provide usable data for the estimation of task-dependent parameters under the TOJ task. The average width δ2(SJ) − δ1(SJ) was 80.18 among patients and 62.44 among controls, whereas the average width δ2(TOJ) − δ1(TOJ) was 93.10 among patients and 88.53 among controls. A 2×2 ANOVA with task (SJ vs. TOJ) as a repeated-measures factor, group (patients vs. controls) as a grouping factor, and δ2 – δ1 as the dependent variable revealed a significant main effect of task (F(1, 37) = 17.587, p < 0.001) with no main effect of group (F(1, 37) = 2.072, p = 0.158) and no interaction (F(1, 37) = 2.003, p = 0.165). This shows the absence of differences between patients and controls as to the width of the central region in decision space, although both groups display a higher temporal resolution (narrower central region) under the SJ task. A similar ANOVA using (δ1 + δ2)/2 as the dependent variable revealed no significant main effects of task (F(1, 37) = 0.074, p = 0.788) or group (F(1, 37) = 1.946, p = 0.171) and no significant interaction (F(1, 37) = 2.984, p = 0.092). This shows that patients and controls do not differ as to decisional bias either.

Fig. 7

Scatter plot of estimated width (a) and center (b) of the central region in decision space for each patient (red symbols) and control (blue symbols) in the study of Capa et al. (2014). In each panel, the oblique dashed line is the identity line (note that the horizontal and vertical ranges and scales are identical in panel a, but they are very different in panel b). Overlaid crosses sketch the distribution of the data in each group of observers. The two arms of each cross meet at the coordinates of the mean of each variable and the length of each arm spans the range from one standard deviation below the mean to one standard deviation above it

Having ruled out differences of timing or decisional processes between patients and controls by these analyses, the groups may still differ as to response errors. Average errors were the main focus of the analyses performed by Capa et al. (2014), and the following discussion attests to the fact that the groups differ mostly in their propensity to make errors. Gross evidence to this effect comes from the error parameters that turned out to be needed to account for the data. Note by a bird’s-eye view of Tables 2 and 3 that error parameters were needed more often for patients than for controls under either task and, through a closer look at the tables, that estimated values of the required error parameters are generally smaller for controls than they are for patients. In addition, for patients, estimated values were larger under the TOJ task than under the SJ task. This outcome reflects characteristics that can be readily observed in the data, as discussed next.

Consider first the SJ task. For controls (red symbols and curves in Fig. 6b), data and fitted curves generally decrease down to an ordinate of zero as the absolute value of right delay increases, with scarce evidence of response errors at the outer extremes; in contrast, patients (red symbols and curves in Fig. 6a) display more evidence of response errors at the outer extremes. This shows clearly for average data (red symbols and curves in the grayed panels of Fig. 6) in that the left and right tails of the average curve follow the path of the raw data towards lower ordinate values in the group of controls than in the group of patients. Results are analogous in the TOJ task (blue symbols and curves in Fig. 6), even when observers are excluded whose inverted responses artificially increase the average error rate. (It should nevertheless be noted that the number of patients giving inverted responses was larger than the number of controls giving them, which might point to a further difference between the groups if these samples were representative.) While data and fitted curves for controls generally increase from zero at large negative delays to unity at large positive delays (Fig. 6b), data and fitted curves for patients often increase instead from above-zero levels (e.g., patients P01, P07, P12, and P20) and/or grow only to below-unity levels (e.g., patients P06, P07, P10, P11, P13, and P16) due to response errors. As a result, the average psychometric function for TOJ data is shallower for patients than it is for controls (Fig. 6, blue symbols and curves in the gray panels). Thus, the difference between the two average TOJ curves progressively increases as absolute delay increases, in line with the results reported by Capa et al. (2014) and their conclusion. 
It is nevertheless unclear why patients are more prone to errors than controls or why patients seem less able than controls to identify temporal order at relatively large delays where their SJ data suggest that they actually perceive asynchrony.

The good fit of the IC model with common timing parameters to the data of Capa et al. (2014) also permits further inferences analogous to those discussed at the end of our analyses of the data of Linares and Holcombe (2014). Capa et al. did not address these issues, but it is useful to discuss what the IC model can say about perceptual latencies in the simplified conditions that arise when the two stimuli used to deliver temporal delays are identically processed up the sensory pathway. We will use the subscript “v” (for visual) in parameters, but the discussion clearly holds regardless of what type of stimulus is involved. In these conditions, the mean and variance of perceptual latencies are respectively μv = 1/λv + τv and σv² = 1/λv² = 1/λ², so that only the variance of perceptual latencies can again be estimated. The impossibility of a complete characterization of perceptual latencies in these simplified conditions is again a consequence of the limitations of MSS, and ways around them will be presented in the Discussion. On the other hand, the mean difference of perceptual latencies is trivially μ = μv − μv = 0, and τ = τv − τv = 0. This is what permits the estimation of potentially asymmetric boundaries δ1 and δ2 in decision space. In any case, the peak of ΨSJ does not occur at Δt = 0 when δ1 ≠ –δ2 (Fig. 3c), so the classical PSS estimate in these conditions is only a measure of decisional bias, not of the average difference in perceptual latencies that is known to be 0 by the preceding argument. Similarly, the location of the 50% point on ΨTOJ (the classical TOJ estimate of the PSS) is affected by this decisional bias, besides the larger influence that the response bias parameter ξ has on its location. Then, also under these simplified conditions, classical estimates of the PSS from SJ and TOJ tasks (which will rarely match) do not portray perceptual latencies.

Li and Cai (2014)

Figure 8 shows data and fitted psychometric functions for each observer in each task, including a final panel for average data and average fitted curves. Parameter estimates are given in Table 4. As before, psychometric functions with common timing parameters under both tasks account remarkably well for the different paths described by the data and also for the large differences in classical estimates of the PSS (i.e., the location of the peak of ΨSJ and the location of the 50% point on ΨTOJ). The G² statistic was significant only for observer #4, for whom TOJ data were remarkably noisy (Fig. 8, corresponding panel). For all observers, SJ data show the bilateral symmetry expected when perceived onsets have the same distribution for both stimuli, and TOJ data also show clear evidence of the intermediate region of reduced slope that reflects guesses and is responsible for the spurious shift of the 50% point away from the null delay. A comparison of the SJ and TOJ performance of each observer again reveals errors in the TOJ identification of temporal order when SJ identification of asynchrony is errorless: See SJ and TOJ data points at the largest (negative and positive) delays for virtually all observers, but especially for #4, #6, #8, #12, and #16.

Fig. 8

Data and fitted psychometric functions for the study of Li and Cai (2014) under the IC model. Results for the SJ task are shown in red; results for the TOJ task are shown in blue. Each panel pertains to an observer; the grayed panel at the bottom right plots data and psychometric functions aggregated across observers

Table 4 Estimated IC model parameters in the joint fit to SJ and TOJ data for each observer in the experiment of Li and Cai (2014). The columns labeled G² and p give the value and p value of the likelihood-ratio goodness-of-fit statistic. Parenthetical superscripts for δs and εs indicate the task (SJ or TOJ) to which the parameter applies. Long dashes in cells for εs indicate that the parameter is excluded from the fitted model

The average 1/λ across observers was 8.16. When we fitted the general model in which separate timing parameters are estimated for reference and test stimuli, the average 1/λs (for the reference, “small” stimulus) was 7.74 and the average 1/λl (for the test, “large” stimulus) was 8.79, with standard deviations of 2.64 and 2.39, respectively. We submitted these values to a simultaneous test for equality of means and variances (Bradley & Blackwood, 1989), which is more efficient and powerful than separate tests for equality of means and equality of variances (García-Pérez, 2013). The result showed that means and variances do not differ significantly (F(2, 14) = 0.745; p = 0.492), as expected from the fact that perceived onsets cannot be affected by features determined cognitively once the stimuli have been processed up the sensory pathway. This leaves open the questions of how decisional or response processes produce the different PSSs reported for SJ and TOJ tasks by Li and Cai (2014) and how the PSS under both tasks suggests that large numbers are perceived later than small numbers despite the fact that images of the numbers must be perceived before their numerical magnitudes can be assessed.

To look at differences in decisional processes between SJ and TOJ tasks, we computed the width and center of the central region in decision space as we did with the data from Capa et al. (2014). Scatter plots are shown in Fig. 9. The average width δ2(SJ) − δ1(SJ) was 48.65 and the average width δ2(TOJ) − δ1(TOJ) was 58.50. A two-tailed, paired-samples t test for means showed that the difference is statistically significant (t(15) = −2.34; p = 0.034), although an outlier with a large value for δ2(TOJ) − δ1(TOJ) has certainly contributed to this outcome (Fig. 9a, stray data point). As a result of this stray data point, a two-tailed, paired-samples t test for variances showed that the difference between the variability of δ2(SJ) − δ1(SJ) and the variability of δ2(TOJ) − δ1(TOJ) is statistically significant (t(14) = −2.47; p = 0.027). If the outlier is excluded, the central region in decision space is still significantly larger under the TOJ task (t(14) = −2.59; p = 0.021) but intersubject variability is analogous in both tasks (t(13) = 0.37; p = 0.714). On the other hand, the average center (δ1(SJ) + δ2(SJ))/2 was −1.00 and the average center (δ1(TOJ) + δ2(TOJ))/2 was 4.26, a difference that was not statistically significant by a two-tailed, paired-samples t test for means (t(15) = −1.73; p = 0.104). It is nevertheless clear from Fig. 9b that the variability of center locations across observers is much larger in the TOJ task than it is in the SJ task. A two-tailed, paired-samples t test for variances revealed that this difference is statistically significant (t(14) = −39.58; p = 9.0 × 10⁻¹⁶). Removing the outlier observation for width did not alter these results for center. Altogether, these analyses reveal that, compared with the SJ task, temporal resolution is functionally lower in the TOJ task and that decisional bias is subject to larger individual variability in the TOJ task.
A nonsymmetric placement of the central region in decision space under the SJ task is responsible for the shift of the classical PSS away from the null delay (Fig. 3c), whereas asymmetric placement combined with response bias in the TOJ task produces a shift of the classical PSS that is not identical to that observed in the SJ task.
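The width and center measures, and a paired test for variances of the kind reported above, can be sketched as follows. The text does not name the variance test that was used; the Pitman-Morgan test shown here is one standard choice whose n − 2 degrees of freedom match those reported, but its use is an assumption. Function names and data are hypothetical.

```python
import math

def width_and_center(delta1, delta2):
    """Width and center of the central region bounded by delta1 and delta2."""
    return delta2 - delta1, (delta1 + delta2) / 2

def paired_variance_t(x, y):
    """Pitman-Morgan t statistic for equality of variances in paired samples.

    Relies on cov(x - y, x + y) = var(x) - var(y): the statistic is the
    correlation r between differences and sums, converted to t with n - 2
    degrees of freedom. Negative t means x is less variable than y.
    Assumes differences and sums both vary and are not perfectly correlated.
    """
    n = len(x)
    d = [a - b for a, b in zip(x, y)]
    s = [a + b for a, b in zip(x, y)]
    mean_d, mean_s = sum(d) / n, sum(s) / n
    ss_d = sum((v - mean_d) ** 2 for v in d)
    ss_s = sum((v - mean_s) ** 2 for v in s)
    sp = sum((u - mean_d) * (v - mean_s) for u, v in zip(d, s))
    r = sp / math.sqrt(ss_d * ss_s)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r), n - 2

# Hypothetical boundaries and hypothetical paired measures
print(width_and_center(-30.0, 50.0))
print(paired_variance_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 11]))
```

Under this sign convention, entering SJ measures as x and TOJ measures as y gives a negative statistic when variability is larger in the TOJ task, as in the values reported above.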

Fig. 9

Scatter plot of estimated width (a) and center (b) of the central region in decision space for each observer in the study of Li and Cai (2014). In each panel, the oblique dashed line is the identity line (note that the horizontal and vertical ranges and scales are identical in panel a, but they are very different in panel b). Overlaid crosses sketch the distribution of the data in each panel. The two arms of the cross meet at the coordinates of the mean of each variable and the length of each arm spans the range from one standard deviation below the mean to one standard deviation above it

As for response errors, they were rare under the SJ task in the experiment of Li and Cai (2014), as is evident from the widespread absence of error parameters for this task in Table 4 and also from the fact that data points (and, accordingly, fitted curves) fall down to an ordinate of zero at the most extreme (positive or negative) delays (Fig. 8). In contrast, response errors were more prevalent in the TOJ task, as manifested in TOJ data paths that do not cover the full range of the ordinate (Fig. 8; see also the frequent occurrence of non-zero error parameters for TOJ data in Table 4). These aspects of the data seem to reveal again a difficulty in identifying temporal order for delays at which asynchrony is clearly perceived.

Discussion

At the surface level, the 50% point on a psychometric function fitted to TOJ data rarely matches the location of the peak of a psychometric function fitted to SJ data in within-subjects studies involving the same stimuli and conditions. The same is true about just-noticeable differences or other performance measures extracted from SJ and TOJ data, although these additional measures have not been discussed here (García-Pérez & Alcalá-Quintana, 2012a). These empirically observable and undeniable facts reflect a mixture of timing, decisional, and response processes that can only be separated out by fitting model-based psychometric functions with parameters that capture these influences. Assessing whether timing processes differ across SJ and TOJ tasks requires consideration of these separate influences in an analysis that allows for explicit tests of identity.

Our model-based analyses supported the identity hypothesis by showing that observed differences in performance across SJ and TOJ tasks in the experiments by Linares and Holcombe (2014), Capa et al. (2014), and Li and Cai (2014) can be accounted for when common timing processes sustain both tasks. Further evidence to this effect arises from the analyses of data sets from other published studies (see the Electronic Supplementary Material). As discussed in the Introduction, different PSSs in each task arise naturally because TOJ estimates are indirect and greatly affected by the observer’s bias towards one of the response options when they perceive simultaneity. As shown in Fig. 2, this bias can shift the 50% point on ΨTOJ far away from the location of the peak of ΨSJ. The implicit assumption underlying classical analyses is that observers are unbiased and, thus, that they evenly choose both response options when they perceive simultaneity, but this assumption seems untenable in the light of the data.

The identity hypothesis is supported by other analyses that indicate that temporal-order and simultaneity judgments are mediated by identical timing processes (García-Pérez & Alcalá-Quintana, 2012a; Matsuzaki et al., 2014; Regener, Love, Petrini, & Pollick, 2014). Analogous model-based analyses in other areas of psychophysics also concur to indicate that differences in observed outcomes across tasks are a consequence of task-dependent decisional and response processes operating on the outcome of sensory processes that are invariant across tasks (Allik, Toom, & Rauk, 2014; García-Pérez & Peli, 2014).

We also have shown that IC model parameters relate to the timing, decisional, and response processes underlying performance in SJ and TOJ tasks and that their estimation is feasible. However, because SJ and TOJ tasks are both instances of MSS, the tasks themselves hide or combine some of the parameters that would additionally be needed for a complete characterization of timing and decisional processes. This shortcoming does not lessen the validity of our conclusion regarding identical timing processes in SJ and TOJ tasks. Yet, it reveals that model-based analyses cannot overcome the inherent limitations of the task used to collect data when it comes to making quantitative inferences about processes that are confounded or hidden under the task. One way to overcome these limitations is to collect response times (RTs) separately and integrate RT data with SJ/TOJ data via a suitable model (Diederich & Colonius, 2015).

Our model-based approach suggests another way around these limitations. Specifically, the confound involving τ, δ1, and δ2 in the general model can be solved by abandoning MSS (i.e., SJ and TOJ tasks) and using instead a paired-comparison method of the type used by Allan and Kristofferson (1974) or Fouriezos, Capstick, Monette, Bellemare, Parkinson, and Dumoulin (2007), where each trial presents two temporal delays (i.e., two pairs of stimuli) that observers must compare. The theoretical advantages of paired-comparison methods over MSS have been discussed elsewhere (García-Pérez, 2014; García-Pérez & Alcalá-Quintana, 2013) and their validity has been empirically demonstrated (García-Pérez & Peli, 2014). A model of performance in simultaneity and temporal-order judgments under paired-comparison tasks has been developed and tested empirically, but these results have not been reported yet. Use of the paired-comparison task followed by a model-based analysis of the data permits estimating all the relevant sensory and decisional parameters that are needed to test hypotheses or make inferences about perceptual latencies for different types of stimuli or about differences in timing, decisional, and response processes across stimulus types (e.g., in studies on cross-modal perception of synchrony), experimental conditions (e.g., in studies on prior entry), or groups of subjects (e.g., in studies assessing differences between patients and controls).