Octuplicate this interval! Axiomatic examination of the ratio properties of duration perception

Birkenbusch, Jana; Ellermeier, Wolfgang; Kattner, Florian

doi:10.3758/s13414-015-0846-0


Published: 27 March 2015

Volume 77, pages 1767–1780, (2015)

Attention, Perception, & Psychophysics


Abstract

The relationship between the physical intensity of a stimulus and its perceived magnitude can be described by Stevens’ power law (Stevens, American Journal of Psychology, 69(1), 1–15, 1956), i.e., a power function with an exponent depending on the sensory modality studied. Direct scaling methods used to determine the power function exponent are based on the assumption that subjects are capable of processing ratios of magnitudes. The present experiments investigate whether this assumption holds for duration perception by empirically testing Narens’ (Journal of Mathematical Psychology, 40(2), 109–129, 1996) fundamental axioms of monotonicity, commutativity, and multiplicativity. To determine whether the exponent can be interpreted in a meaningful way, i.e., whether it is invariant under changes of the reference stimulus, two further axioms, invertibility and weak multiplicativity (Augustin, Acta Psychologica, 128(1), 176–185, 2008), are evaluated. In two experiments, N=25 participants were required to adjust the duration of a comparison tone to specific ratios of different standard durations. In accordance with previous findings for other sensory continua, monotonicity held for the duration adjustments of most participants. Significant violations of the commutativity axiom were found in 12.5 % of all pertinent tests, whereas multiplicativity was violated in 32 % of such tests. The axioms of weak multiplicativity and invertibility, however, were violated in over 50 % of the tests. These results indicate that even though a ratio scale for perceived duration exists, the numbers as used by the participants cannot always be taken at face value, and that even though power functions fit the data quite well, the exponent depends on the size of the standard and therefore cannot always be interpreted in a meaningful way.


Introduction

The relationship between physical time and perceived duration has been explored in many facets in psychophysics. Stevens’ power law is invoked particularly when duration perception is compared with other sensory modalities. Employing it raises two related and fundamental questions: first, whether perceived duration satisfies the condition of ratio scalability, and second, whether the power law parameters obtained in duration scaling experiments remain unaffected by certain characteristics of the task. This study examines these questions by testing the validity of a number of pertinent axioms from representational measurement theory.

The relationship between the physical intensity of a stimulus and its perceived magnitude can be described by Stevens’ power law (1946, 1956), which is formulated as:

$$ \varphi(t) = \alpha t^{\beta}, t > 0. $$

(1)

That is, the perceived magnitude of a stimulus t is described by a power function α t ^β. Whereas the parameter α is a proportionality factor depending on the units used, the exponent β depends on the sensory modality. If the value of β is > 1, the perceived magnitude of the stimulus grows faster than the intensity of the physical stimulus. If β is < 1, the increments in perceived stimulus magnitude become smaller with increasing physical stimulus intensity. In the case of β=1, there is a directly proportional relationship between physical and perceived stimuli, i.e., the relationship can be described by a simple linear function.
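The three cases of the exponent can be illustrated numerically. The following is a minimal sketch, not part of the original study; the function names and the example values of β are chosen for illustration only. It shows how the perceived ratio of a physically doubled duration depends on β, with the factor α cancelling out.

```python
def perceived_magnitude(t, alpha=1.0, beta=1.0):
    """Stevens' power law: phi(t) = alpha * t**beta, defined for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return alpha * t ** beta


def perceived_ratio(t, factor, beta):
    """Ratio phi(factor * t) / phi(t); equals factor**beta because alpha cancels."""
    return perceived_magnitude(factor * t, beta=beta) / perceived_magnitude(t, beta=beta)


# Illustrative exponents only: compressive, linear, and expansive growth.
for beta in (0.5, 1.0, 2.0):
    print(beta, round(perceived_ratio(100.0, 2.0, beta), 3))
```

With β < 1 the doubled duration is perceived as less than twice as long, with β = 1 as exactly twice as long, and with β > 1 as more than twice as long.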

Physical time and its perceived duration were also found to be related by a power function (Stevens and Galanter, 1957; Allan, 1979). The power function was fitted in several experiments applying different scaling methods (Eisler, 1975), among them ratings and magnitude estimation with and without a standard (Bobko et al., 1977). These approaches yielded exponents ranging from 0.44 to 1.87 (Kornbrot et al., 2013), with an average exponent of 0.90 most suitably describing the relationship between physical and perceived duration (Eisler, 1976).

Established methods to determine the exponent of Stevens’ psychophysical function are scaling procedures, in which participants are asked to produce correspondences between the perceived intensity of stimuli and numerical values consistent with the instruction. Stevens (1956) described two direct scaling methods, which are called magnitude estimation and magnitude production.

Though Stevens, in his later writings (e.g., Stevens, 1975) expressed a preference for using these methods without any constraints such as fixed standards or pre-assigned numerical values, their earliest applications were implemented in a similar manner as the classical methods to measure sensory thresholds, that is they used a fixed stimulus, the standard, and a variable stimulus called the comparison. These versions of magnitude estimation and production have later been termed ‘ratio estimation’ and ‘ratio production’, respectively (Gescheider, 1997).

There are two implicit assumptions fundamental to these direct scaling procedures: It is assumed that the participants are able to estimate or to produce perceived intensities on a ratio-scale level and, furthermore, that the numerals the participants use to describe their sensations may be treated like rational numbers in mathematics and therefore can be taken at face value.

Narens (1996) may be credited with making these assumptions, which were never actually tested by Stevens or his followers, explicit and with formulating mathematical axioms that make it possible to validate them. He distinguishes between behavioral and cognitive axioms: The untestable cognitive axioms describe the relationship between the participant’s unobservable sensation of a stimulus’ intensity and its numerical representation. The behavioral axioms characterize the participant’s behavior in a scaling experiment and relate their numerical representation to the number words used to describe the stimulus’ intensity. In contrast to the cognitive axioms, the behavioral axioms are empirically testable.

The behavioral axioms crucial for the assumption that participants are capable of estimating or producing ratios of stimulus intensities are monotonicity, commutativity, and multiplicativity. Their validity can be evaluated by analyzing data collected in magnitude or ratio production experiments (Luce, 2002). In the latter, when applied to the psychophysics of duration, the participant is instructed to adjust the duration of a comparison stimulus (such as w, x, y, z in the following) so that it is perceived as p, q, or r times the duration of the standard stimulus t. The notation (x,p,t) represents a participant’s adjustment x, which is perceived to last p times as long as the standard interval t, with the boldface letter referring to the number word used in the magnitude production instructions.

First of all, besides a number of technical axioms concerning the continuity of the physical stimulus values, the axiom of monotonicity (Augustin, 2008; Axiom 3.1 in Narens, 1996), also known as ordering, has to be tested. It is formulated as:

$$ \text{If} \ (x,\mathbf{p},t) \in E \ \text{and} \ (y,\mathbf{q},t) \in E, \ \text{then} \ p > q \Leftrightarrow x \succ y. $$

(2)

This means, if x has been adjusted to appear p times as long (×p, in the following) as the standard t and another adjustment y is q times as long (×q, in the following) as the standard, and p is greater than q, then the adjusted duration x must be longer than the duration y. According to Narens’ (1996) theory, if the axiom of monotonicity holds, it can be assumed that the perception of stimuli of the investigated modality occurs on a sensory continuum. It is a necessary condition not only for the subsequently elaborated axioms of commutativity and multiplicativity, but also fundamental to any scaling at all, because even the categories of an ordinal scale can be arranged in an ascending or descending (and therefore monotonic) order. Furthermore, the axiom of commutativity can be evaluated, which is formulated as:

$$ \text{If} \ (x,\mathbf{p},t) \in E, \ (z,\mathbf{q},x) \in E, \ (y,\mathbf{q},t) \in E, \ \text{and} \ (w,\mathbf{p},y) \in E, \ \text{then} \ z = w. $$

(3)

In other words, commutativity holds, if the stimulus duration resulting from a successive production sequence ×p×q is equal to the stimulus duration resulting from successive adjustments with interchanged ratio production factors ×q×p. For example, doubling the duration of a standard tone and then tripling the outcome should result in the same final duration as tripling the standard duration first and then doubling the result. Narens showed that if the axiom of commutativity holds, it can be assumed that the participant perceives stimulus magnitudes of the investigated modality on ratio scale level. But even if a ratio scale of perception does exist, there is no evidence that the scale values used by the participants can be interpreted as scientific numbers. To show the latter, the axiom of multiplicativity has to be evaluated, which is formulated as:

$$ \text{If} \ (x,\mathbf{p},t) \in E, \ (z,\mathbf{q},x) \in E, \ \text{and} \ r = qp, \ \text{then} \ (z,\mathbf{r},t) \in E. $$

(4)

In other words, the multiplicativity property holds, if the stimulus duration resulting from the successive adjustments ×p×q is equal to the stimulus duration resulting from a single adjustment ×r with r being the mathematical product of p and q. For example, doubling the duration of a standard tone and then tripling this adjustment should result in the same final duration as making the standard six times as long in a single adjustment. If the axiom of multiplicativity holds, the numerals as used by the participants to describe the perceived stimulus magnitudes can be taken at face value.
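The difference between the two axioms can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analysis: it assumes that a ×p production yields a duration x with φ(x) = W(p)·φ(t), where φ(t) = t^β and W is a hypothetical function describing how the participant interprets the number word p. The names `produce`, `veridical`, and `linear_compression`, the value β = 0.9, and the form of the distortion are all chosen for illustration.

```python
import math


def produce(t, p, beta, W):
    """Duration x with phi(x) = W(p) * phi(t), assuming phi(t) = t**beta."""
    return t * W(p) ** (1.0 / beta)


def veridical(p):
    """Number words taken at face value: W(p) = p."""
    return float(p)


def linear_compression(p):
    """Hypothetical distortion: W(p) = 1 + 0.8 * (p - 1), so W(p)W(q) != W(pq)."""
    return 1.0 + 0.8 * (p - 1.0)


def successive_and_direct(t, p, q, beta, W):
    """Return the outcomes of xp xq, xq xp, and the single xr adjustment (r = p*q)."""
    pq = produce(produce(t, p, beta, W), q, beta, W)
    qp = produce(produce(t, q, beta, W), p, beta, W)
    r = produce(t, p * q, beta, W)
    return pq, qp, r


for W in (veridical, linear_compression):
    pq, qp, r = successive_and_direct(100.0, 2, 3, 0.9, W)
    print(W.__name__,
          "commutative:", math.isclose(pq, qp),
          "multiplicative:", math.isclose(pq, r))
```

Under this assumed model, commutativity holds for any W because the order of the two factors does not matter, whereas multiplicativity additionally requires W(p)·W(q) = W(pq), which the distorted interpretation violates.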

During the last decade, the axiomatic approach to magnitude scaling pioneered by Narens (1996) has been extended by Luce and colleagues (Luce, 2002, 2008; Luce et al., 2010). One recent interpretation concerning the axiom of multiplicativity argues that a veridical interpretation of numbers, and thus the validity of multiplicativity, is not mandatory for direct ratio scaling: If the axiom of commutativity is satisfied, thus implying ratio scalability for the modality studied, it may be said that the participants interpret the numbers as some ratio, though not the exact ratio stated in the instructions.

The axiomatic framework has been applied to a number of psychophysical dimensions such as loudness (Ellermeier and Faulhammer, 2000; Steingrimsson and Luce, 2005a, b; Zimmer, 2005), area (Augustin and Maier 2008), brightness (Steingrimsson, 2011; Steingrimsson et al., 2012), and, most recently, pitch (Kattner and Ellermeier, 2014). Duration perception, however, has not been studied in this axiomatic manner.

Therefore, the aim of the first experiment was to investigate whether the fundamental axioms of Narens’ theory hold for duration perception, i.e., whether participants are capable of processing durations on a ratio scale. This was tested in a ratio production experiment in which participants were required to adjust the duration of a comparison tone to specific positive integer ratios of two different standard durations (t ₁=100 ms, t ₂=400 ms).

The experiment employed a method that is typical for axiomatic testing requiring the participant to adjust the duration of the comparison interval in an iterative fashion until it subjectively matches with the desired ratio. In contrast to one-shot estimations (e.g., “Turn the sound off as soon as it is p times as long.”), which seem to be less cumbersome, this procedure does not introduce a bias due to motor latency. Furthermore, the initial duration of the comparison was randomly chosen to fall above and below the estimated ‘target duration’ for the purpose of counterbalancing trials in which the participants had to shorten or lengthen the comparison tone.

In the second experiment, two further axioms, weak multiplicativity and invertibility (Augustin, 2008), were tested to provide evidence for the psychological meaningfulness of scaling perceived duration, i.e., whether the size of the power law exponent for duration perception remains unaffected by the size of the standard used in ratio production. Again, participants had to adjust the duration of auditory intervals to a certain ratio with respect to a standard tone (t₃=600 ms). This time, fractions as well as integers were used as ratio production factors.

Experiment 1

Method

Participants

Ten participants took part in the experiment. The sample consisted of four female and six male participants with a median age of 24 years ranging from 21 to 56 years. They did not have any prior knowledge of the hypotheses being tested. The experiment was conducted individually in a double-walled sound-attenuated listening chamber (IAC).

Stimuli and apparatus

All stimuli were sine waves of the same frequency of 440 Hz (A4 standard pitch) converted with a sampling rate of 44.1 kHz, and with 16-bit resolution. Their duration varied as a result of the protocol and contained 10-ms cosine-shaped rise-and-decay ramps to avoid unwanted switching transients. The standards were of fixed durations of 100 and 400 ms or of individual duration generated according to the adjustments of the participants. The comparison stimuli varied accordingly; their initial length was randomly chosen between one and ten times the duration of the corresponding standard. The tones were preset to a comfortable sound pressure level of 65 dB SPL. After passing through a headphone amplifier (Behringer HA 800 Powerplay PRO 8), the tones were presented diotically via headphones (Beyerdynamics DT 990 PRO). The experiment was programmed in MATLAB using the PsychToolbox-package by Brainard (1997) and Pelli (1997).
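The stimulus description can be sketched in code. The experiment itself was programmed in MATLAB; the Python sketch below is only an illustration, and the function name `make_tone`, the raised-cosine shape of the ramps, and the choice to include the ramps within the nominal duration are assumptions not stated in the text.

```python
import math


def make_tone(duration_ms, freq_hz=440.0, fs=44100, ramp_ms=10.0):
    """Return a list of samples for a sine tone with cosine-shaped on/off ramps.

    Assumes duration_ms >= 2 * ramp_ms so that the two ramps do not overlap.
    """
    n = int(round(duration_ms * fs / 1000.0))
    n_ramp = int(round(ramp_ms * fs / 1000.0))
    samples = []
    for i in range(n):
        if i < n_ramp:
            # Rise: gain grows from 0 towards 1 along half a cosine period.
            gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:
            # Decay: mirror image of the rise, reaching 0 at the last sample.
            gain = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        else:
            gain = 1.0
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return samples


tone = make_tone(100.0)  # the shorter standard of 100 ms
print(len(tone))
```

Scaling to 16-bit integers and calibration to 65 dB SPL depend on the playback chain and are left out of this sketch.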

Procedure

In the first time-production experiment, the participants had to complete 264 trials altogether. They were divided into four identical test sessions taking place on different days. Each session was composed of three blocks of 22 trials, resulting in a total of 66 trials, respectively. After the completion of a block, the participants were allowed to take a short break. The recording of the data started after the participants had become familiar with the task during three practice trials at the beginning of each session.

Each trial consisted of two duration intervals marked by continuous tones, which were presented successively. The first tone, or standard, was of fixed duration, either 100 or 400 ms, while the second tone, or comparison, was of variable starting duration and could be adjusted by the participants. The tones were separated by a fixed silent inter-stimulus interval of 500 ms. During the presentation of both tones, a yellow numeral p (p=1,2,3,4,6,8) was displayed in the upper part of the screen, which was the instruction for the participant to adjust the duration of the second tone so that it was perceived to be p-times as long as the first tone. The adjustments could be made by pressing either the left cursor key for decreasing or the right cursor key for increasing the duration of the comparison tone. The steps for incrementing/decrementing duration were $\frac {1}{20}$ of the standard interval, that is 5 ms for the standard of 100 ms and 20 ms for the standard of 400 ms. To increase step size, participants could press the shift key together with the cursor key resulting in steps being ten times as long as the original steps, that is 50 ms or 200 ms, respectively.

The participants were asked to adjust the duration of the comparison tone step by step, i.e., after each key press response, the current standard and the altered comparison were replayed and the instruction was presented again. The participants were instructed to adjust the comparison until they were satisfied with the result and to eventually press the enter key to register the final value. The next trial started after an inter-trial interval of 2,000 ms. There was no time restriction to performing the task.
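The adjustment rule can be written down compactly. This is a minimal sketch with hypothetical names (`step_ms`, `adjust`); it does not clamp the comparison to a minimum duration, because no lower bound is stated in the text.

```python
def step_ms(standard_ms, shift=False):
    """Step size: 1/20 of the standard, or ten times that with the shift key."""
    step = standard_ms / 20.0
    return step * 10.0 if shift else step


def adjust(comparison_ms, standard_ms, key, shift=False):
    """Apply one key press ('left' shortens, 'right' lengthens the comparison)."""
    delta = step_ms(standard_ms, shift)
    if key == "right":
        return comparison_ms + delta
    if key == "left":
        return comparison_ms - delta
    raise ValueError("key must be 'left' or 'right'")


print(step_ms(100), step_ms(400), step_ms(100, shift=True), step_ms(400, shift=True))
```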

In each of the blocks, the standards of t ₁=100 and t ₂=400 ms were combined with the ratio production factors p=1,2,3,4,6 and 8. These trials are called basic trials and their outcomes are primarily used to test monotonicity. The testing of commutativity and multiplicativity is based on the outcomes of so-called successive trials, in which the individual adjustments produced by the participants in the basic trials were used as standards. They were combined with the ratio production factors q=2,3, and 4. Each type of adjustment was made 12 times, i=12. In the following, the basic adjustments are indicated by (x _i,p,t). As an example, (x ₃,2,100) is the third (i=3) adjustment of a trial with a ratio production factor p=2 and a standard stimulus t=100 ms.

In the successive trials, for each participant, the individual basic adjustments of each (x _i,p,100) and (x _i,p,400) were used as standard stimuli. More precisely, the new standards (x _i,2,100) and (x _i,2,400), derived from a basic doubling trial, had to be made q=2, 3, and 4 times as long. Likewise, the standards (x _i,3,100), (x _i,4,100), (x _i,3,400) and (x _i,4,400) were subsequently doubled (q=2). The procedure might become more obvious by inspecting Fig. 1: The arrows starting from the x-axis depict the basic adjustments, whereas the arrows starting from the arrowheads depict the successive adjustments.

On the whole, there were 22 different types of adjustments: Each of the two standard stimuli was paired with each of the six ratio production factors p=1,2,3,4,6, and 8, resulting in 12 types of basic ×p adjustments. In addition, each standard was combined with each of the five pairs (p,q)=(2,2),(2,3),(2,4),(3,2), and (4,2), resulting in ten different types of successive ×p×q adjustments. Each type of adjustment was made 12 times, resulting in 264 trials per participant.
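The trial counts follow directly from the design and can be checked by enumerating the adjustment types. The sketch below uses hypothetical variable names and only restates the design described above.

```python
from itertools import product

STANDARDS_MS = (100, 400)
BASIC_FACTORS = (1, 2, 3, 4, 6, 8)
SUCCESSIVE_PAIRS = ((2, 2), (2, 3), (2, 4), (3, 2), (4, 2))
REPETITIONS = 12

# Basic trials: every standard combined with every ratio production factor p.
basic = [("basic", t, p) for t, p in product(STANDARDS_MS, BASIC_FACTORS)]

# Successive trials: every standard combined with every (p, q) pair.
successive = [("successive", t, p, q)
              for t, (p, q) in product(STANDARDS_MS, SUCCESSIVE_PAIRS)]

trial_types = basic + successive
total_trials = len(trial_types) * REPETITIONS

print(len(basic), len(successive), len(trial_types), total_trials)
```

The enumeration gives 12 basic and 10 successive types, 22 types in total, and 22 × 12 = 264 trials, which also matches four sessions of three blocks with 22 trials each.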

Results and discussion

Overall results

Overall mean adjustments for (N=10) participants are depicted in Fig. 1 in the upper panel for the shorter standard duration of t ₁=100 ms and in the lower panel for the longer standard of t ₂=400 ms. The mean number of adjustments made in one trial was M=13.3. In 66 % of all trials, participants made fine-step adjustments of duration. In further analyses, after a brief descriptive overview, the data sets of each participant are treated separately.

Monotonicity

The axiom of monotonicity was tested to confirm that duration perception of short intervals (100 to 4000 ms) occurs on a sensory continuum, i.e., that unequal temporal intervals are perceived as such and can be discriminated, respectively. From a descriptive point of view, the axiom of monotonicity seems to hold, because, as Fig. 1 shows, the mean outcome durations increase for increasing ratio production factors.

For the inferential statistics, two one-factor, repeated-measures analyses of variance (ANOVAs) tested the effect of the ratio production factor on the mean individual duration adjustments produced in basic trials only, separately for the two standards. For the standard t₁=100 ms, the ANOVA yielded significant differences among the different ratio production factors, F(5, 45) = 306.9, p < .001, η² = .97. A post hoc Tukey HSD test was conducted to check whether the mean adjustments of a pair of two adjacent ratio production factors are similar (∼). The results showed that all pairs of ratio production factors, (x,1,100)∼(x,2,100), (x,2,100)∼(x,3,100), (x,3,100)∼(x,4,100), (x,4,100)∼(x,6,100), and (x,6,100)∼(x,8,100), differ significantly at p < .001. For the standard t₂=400 ms, a comparable ANOVA also yielded significant variations among the ratio production factors, F(5, 45) = 140.7, p < .001, η² = .94. Post hoc Tukey HSD comparisons revealed significant differences for all pairs of ratio production factors: p < .001 for (x,4,400)∼(x,6,400) and (x,6,400)∼(x,8,400), p < .01 for (x,1,400)∼(x,2,400) and (x,3,400)∼(x,4,400), and p < .05 for (x,2,400)∼(x,3,400). Further analyses of variance containing the factors block and session revealed no main effects for them; thus, any practice effects can be ruled out.

Furthermore, a graphical analysis based on cumulative sums of the adjustments made, as proposed by Augustin and Maier (2008), was conducted for each participant. The axiom of monotonicity requires that, for a fixed standard stimulus t, ratio production factors p<q, and a fixed number of repetitions i, the inequality S(x _i,p,t)<S(x _i,q,t) holds, with S representing the sum of duration adjustments x made up to the i-th trial. That is, the axiom of monotonicity holds if, for each standard t and each number i of repetitions (adjustments), the cumulative sums can be ordered by the ratio production factors used: S(x _i,1,t)<S(x _i,2,t)<S(x _i,3,t)<S(x _i,4,t)<S(x _i,6,t)<S(x _i,8,t). Thus, for each participant and both standards t₁ and t₂, the n=12 outcome durations of each type of ×p adjustment were summed up successively across trials. The cumulative sums, S(x,1,t), S(x,2,t), S(x,3,t), S(x,4,t), S(x,6,t) and S(x,8,t), of participant mg12, who is representative of the sample, are depicted in Fig. 2; the left panel shows the shorter and the middle panel the longer standard duration. Although the outcome durations of all trials n=1 to 12 were summed up successively, only the cumulative sums in the range of trials n=7 to 12 are plotted, in order to avoid inspecting the effects resulting from random influences for a small number of observations. Both graphs show that the curves for different ratio production factors never cross, e.g., that for the standard duration t₁, each cumulated outcome duration for p=2 is shorter than the corresponding cumulated outcome duration for p=3. This means that at no point in the sequence of trials is monotonicity violated, which provides a more rigorous test than a comparison of overall condition means would.
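The cumulative-sum criterion can also be checked programmatically rather than by eye. The following is a minimal sketch with hypothetical function names and made-up example data; restricting the check to trials 7 to 12 mirrors the plotted range and is passed as a parameter.

```python
from itertools import accumulate


def cumulative_sums(adjustments_ms):
    """S(x_i, p, t) for i = 1..n: running sums of the adjustments of one type."""
    return list(accumulate(adjustments_ms))


def monotonic_from(adjustments_by_factor, first_trial=7):
    """True if, from trial `first_trial` on, the sums are ordered by the factor p."""
    factors = sorted(adjustments_by_factor)
    sums = {p: cumulative_sums(adjustments_by_factor[p]) for p in factors}
    n = min(len(s) for s in sums.values())
    for i in range(first_trial - 1, n):
        for smaller, larger in zip(factors, factors[1:]):
            if not sums[smaller][i] < sums[larger][i]:
                return False  # curves touch or cross at trial i + 1
    return True


# Made-up adjustments (ms) for a 100-ms standard, for illustration only.
example = {p: [100.0 * p + (i % 3) for i in range(12)] for p in (1, 2, 3)}
print(monotonic_from(example))
```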

Commutativity

The axiom of commutativity provides evidence for the assumption that duration perception is based on a ratio scale. For testing commutativity, adjustments produced in successive trials are analyzed: Commutativity is taken to be satisfied, if a successive ×p×q adjustment is statistically indistinguishable from a successive ×q×p adjustment, i.e., if both types of raw adjustments emanate from the same distribution. For a descriptive analysis, Fig. 1 shows that most of the corresponding pairs of successive adjustments ×p×q and ×q×p which are connected by dashed lines coincide, indicating that the axiom holds for the overall means.

For individual inferential testing, nonparametric Mann–Whitney U tests (two-tailed, α=.1) for both pairs (p,q)=(2,3) and (2,4) and both standards were conducted resulting in four tests per participant and a total of 40 tests for the entire sample.

A standard significance level of α=.1 was used, because the aim of the analysis was to accept a statistical null hypothesis, thus making it harder to assume that an axiom holds for a particular comparison. A correction for multiple comparisons was not applied for the same reason.
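The decision rule can be sketched as follows. This is not the authors' analysis code: it uses the large-sample normal approximation of the Mann–Whitney U test without continuity or tie correction, which is only an approximation for n=12 adjustments per condition, and the function names and example data are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p). No continuity or tie correction is applied.
    """
    n1, n2 = len(a), len(b)
    # Count pairs in which a exceeds b; ties contribute one half.
    u1 = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p


def axiom_retained(a, b, alpha=0.1):
    """The axiom is taken to hold if the two samples do not differ at level alpha."""
    _, p = mann_whitney_u(a, b)
    return p >= alpha


# Made-up durations (ms) for a xp xq and a xq xp sequence, for illustration only.
pq = [590, 610, 605, 620, 600, 615, 595, 630, 585, 625, 608, 612]
qp = [600, 605, 615, 610, 595, 620, 590, 635, 580, 628, 602, 618]
print(axiom_retained(pq, qp))
```

Because the axiom corresponds to the null hypothesis, a larger α makes it easier to detect a violation and therefore harder to conclude that the axiom holds.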

For the entire sample, five violations in the 40 tests of the axiom of commutativity were observed (compare Table 1). Four of the five violations were produced by two participants (ml06, mn21), both for the standard of 100 ms. For seven of ten participants, the axiom of commutativity held in all cases.

Table 1 Experiment 1: Empirical evaluation of the commutative property for both standard stimuli with t ₁=100 ms and t ₂=400 ms for each (N=10) participant


Multiplicativity

The axiom of multiplicativity was tested to check whether the numerals as used by the participants can be taken at face value, i.e., whether there is a veridical transformation between perceived and mathematical numbers. For testing multiplicativity, the adjusted durations resulting from successive trials are compared with durations adjusted in basic trials: The axiom holds, if the duration resulting from the successive ×p×q (×q×p, respectively) adjustment is statistically indistinguishable from the basic ×r adjustment, with r=p q. In a descriptive manner, Fig. 1 also shows that most of the pairs of successive adjustments ×p×q and ×q×p are commensurate with the corresponding adjustments of ×r (with which they are connected by dashed lines), thus indicating multiplicativity to hold for the entire sample.

The individual inferential statistics tested multiplicativity by conducting Mann–Whitney U tests (two-tailed, α=.1) for the three pairs (p,q)=(2,2), (2,3) and (2,4) and both standards, which results in six tests for each participant and a total of 60 tests for the entire sample. Altogether, 19 violations of 60 comparisons for the axiom of multiplicativity were observed (compare Table 2). For only two participants did the axiom of multiplicativity hold in all cases, whereas the other participants showed violations in one to five of six tests.

Table 2 Experiment 1: Empirical evaluation of the multiplicative property for both standard stimuli with t ₁=100 ms and t ₂=400 ms for each (N=10) participant


Model fitting procedure

Furthermore, linear regressions were computed for all participants and both standards to estimate the parameters α and β for the power law (φ(t)=α t ^β) as well as the parameters a and b for a simple linear function (φ(t)=a+b t). It was assumed that the individually adjusted durations of (x,p,100) and (x,p,400) are perceived to be p times as long as the standards, respectively. Thus, for the linear model, a linear regression of the ratio production factors p constituting the dependent variable on the individual adjustments constituting the independent variable was computed. For the power function, a linear regression was computed as well, with the logarithmically transformed ratio production factor p as the dependent variable and the logarithmically transformed individual adjustments serving as the independent variable.
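The two regressions can be sketched with ordinary least squares. The code below is an illustration with hypothetical function names and synthetic data; it follows the description above in treating the ratio production factor p as the dependent variable and the adjusted duration as the independent variable.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_linear(adjustments_ms, factors):
    """Linear model p = a + b * x; returns (a, b)."""
    return fit_line(adjustments_ms, factors)


def fit_power(adjustments_ms, factors):
    """Power model p = alpha * x**beta, fitted as log p = log alpha + beta * log x."""
    intercept, beta = fit_line([math.log(x) for x in adjustments_ms],
                               [math.log(p) for p in factors])
    return math.exp(intercept), beta


# Synthetic, noise-free adjustments generated with beta = 0.9 for a 100-ms standard.
factors = [1, 2, 3, 4, 6, 8]
adjustments = [100.0 * p ** (1.0 / 0.9) for p in factors]
alpha, beta = fit_power(adjustments, factors)
print(round(alpha, 4), round(beta, 3))
```

On these synthetic data the fit recovers the exponent used to generate them; real adjustments would of course scatter around the fitted function.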

The estimated parameters and squared correlation coefficients R² for both the linear model and the power function and for both standards are shown in Table 3. The comparison between the linear and the power-function model shows that for the short standard, the power-function model results in a slightly better fit (t(13.15) = 1.885, p = .082), explaining 4.7 % more of the variance. For the longer standard, the linear model seems to fit the data as well as the power-function model (t(15.96) = 0.735, p = .47), the latter explaining only 2.3 % more of the variance. Furthermore, the power function exponents estimated for the two standards differ significantly in size, t(11.58) = 3.67, p = .003. The exponent β of the power function yielded an average of β(t₁)=0.87 (β<1 in all cases) for the shorter standard and β(t₂)=1.02 (β>1 in 6 of 10 cases) for the longer standard duration. Both the linear and the power function indicate a reasonable fit to the data, with R² ranging from 0.71 to 0.98 for the raw-data adjustments.

Table 3 Experiment 1: Estimated parameters and squared correlation coefficients for linear model and power function for both standard stimuli with t ₁=100 ms and t ₂=400 ms and each (N=10) participant


Summary

The analyses showed that the axiom of monotonicity was not violated, i.e., the participants were able to produce monotonically ordered adjustments according to the different ratio production factors. The axiom of commutativity was violated in 12.5 % of all tests, while multiplicativity was violated in 32 % of all tests. The estimated power function exponents for the two standards clearly differ in value, that is, the estimation of the parameters of the power law seems to depend on the duration of the standard, and, for the longer standard, seems to be close to 1 resulting in a simple linear function.

Experiment 2

The previous experiment investigated the axioms of monotonicity, commutativity, and multiplicativity for the perception of duration to test the validity of assumptions basic to Stevens’ direct scaling methods. Since the axiom of commutativity was found to be valid in 87.5 % of all cases, it can be assumed that participants’ processing of short durations in a ratio production experiment is based on a ratio scale. However, it might be difficult to describe the relationship between the mathematical numbers provided in the experimental instruction and the numbers as interpreted by the participants, because the axiom of multiplicativity held in only 68 % of the tests, i.e., in roughly a third of the tests the participants do not appear to process the numbers at their face value. Comparisons of the estimated exponents of the power functions describing the relationship between physical and perceived duration yielded significantly different exponents for the two standard durations employed.

The observation that the two different standard durations used in Experiment 1 result in diverging exponents has traditionally been classified as a context effect. In the domain of psychophysical scaling, several types of context effects have been described: Besides the stimulus range used in the experiment (Garner, 1954; Ward et al., 1996), the numerical examples given in the experimental instruction (Robinson, 1976) and the number values assigned to the standard stimuli (Beck and Shaw, 1965), or even the entire experimental context might have an influence on the size of the exponent. Therefore, the psychological meaningfulness of the exponent has been called into question (Lockhead, 1992). In contrast to this point of view, other investigators have argued that finding the ‘true’ exponent is still possible (Teghtsoonian and Teghtsoonian, 2003; Teghtsoonian, 2012).

However, in the axiomatic-measurement literature, this problem has been framed as a more fundamental issue of meaningfulness (Stevens, 1946; Luce, 1978; Narens, 1981). For each power function describing the relationship between the physical intensity of a stimulus and its perceived magnitude, one might ask whether the parameters of this function are psychologically meaningful, i.e., invariant under certain transformations. Note that the exponent of the power function depends on the sensory continuum, the participant’s individual perception—which does not exert a very strong influence (Teghtsoonian and Teghtsoonian, 1983)—and potential contextual influences as mentioned above. Furthermore, it might also vary under changes of the physical measurement scale f (Narens and Mausfeld, 1992) and the size of the standard (Augustin, 2008) used in an experiment. If, for example, the measurement scale f is transformed to another scale g measuring the same physical intensity as f and if these scales are neither log-interval nor ratio scales, then it must be assumed that the choice of the scale has an influence on the exponent of the power function. Thus, the obtained exponent has no psychological relevance, or is not meaningful.

But even if the exponent of the power function is invariant under changes of the physical stimulus scale applied in the experiment, it remains to be investigated whether the exponent is also invariant under changes of the standard stimulus t that serves as the basis for the estimates or adjustments made by the participants. Augustin (2008) suggests a mathematical method to examine the dependency on the standard by postulating two further axioms that can be evaluated empirically, namely weak multiplicativity and invertibility. The axiom of weak multiplicativity is formulated as:

$$\begin{array}{l} \text{For} \ t, y, z \in X \ \text{and a real number} \ p > 0,\\ (y,\mathbf{p}, t) \in E, \ (z,\mathbf{1/p},y) \in E \Rightarrow (z, \mathbf{1}, t) \in E. \end{array} $$

(5)

That is, weak multiplicativity holds if the stimulus intensity resulting from the successive adjustments $\times \mathbf {p} \times \frac {\mathbf {1}}{\mathbf {p}}$ is equal to the stimulus intensity resulting from the basic adjustment with p=1. For example, doubling the duration of the standard and then halving this adjustment should result in the same final duration as matching the duration of the comparison interval to that of the standard. Weak multiplicativity is closely related to Narens’ axiom of multiplicativity: Whereas multiplicativity has to hold for all p>0 and q>0, weak multiplicativity is the special case of multiplicativity with $\mathbf {q} = \frac {\mathbf {1}}{\mathbf {p}}$. Consequently, the axiom of weak multiplicativity might hold even if the axiom of multiplicativity is violated.

The axiom of invertibility is formulated as:

$$ \text{For} \ t, y \in X \ \text{and} \ \mathbf{p} > 0, (y,\mathbf{p},t) \in E \Leftrightarrow (t, \mathbf{1/p}, y) \in E. $$

(6)

In other words, invertibility holds if the intensity of a stimulus resulting from the successive adjustments $\times \mathbf {p} \times \frac {\mathbf {1}}{\mathbf {p}}$ is equal to the stimulus intensity of the standard t or, put simply, if it is possible to undo a ×p adjustment by producing its reciprocal $\times \frac {\mathbf {1}}{\mathbf {p}}$. Weak multiplicativity and invertibility thus differ in what the result of the successive adjustments $\times \mathbf {p} \times \frac {\mathbf {1}}{\mathbf {p}}$ is compared with: the ×1 adjustment in the first case, and the actual duration of the standard in the second case. As Augustin (2008) stated, both axioms are necessary and sufficient conditions for the exponent of Stevens’ power law to be invariant under changes of the standard t.
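The operational difference between the two axioms can be illustrated with a minimal, noise-free sketch. Everything in it is an assumption made for illustration and is not taken from Augustin (2008) or from the present experiments: the helper `make_producer`, the distortion function `w` (mapping the instructed ratio p to the ratio actually produced), the exponent `beta`, and the three hypothetical producers. Empirical tests of the axioms are statistical, whereas this sketch only compares deterministic productions.

```python
import math


def make_producer(w, beta=1.0):
    """Return a hypothetical noise-free producer x = t * w(p)**(1/beta).

    Assumption (not from the source): psi(x) = x**beta and
    psi(x) = w(p) * psi(t), where w distorts the instructed ratio p.
    """
    def produce(t, p):
        return t * w(p) ** (1.0 / beta)
    return produce


def weak_multiplicativity_holds(produce, t, p, rel_tol=1e-9):
    """(y, p, t) in E and (z, 1/p, y) in E should imply (z, 1, t) in E."""
    y = produce(t, p)            # adjust to p times the standard t
    z = produce(y, 1.0 / p)      # adjust to 1/p times that result
    z_direct = produce(t, 1.0)   # direct x1 match to the standard
    return math.isclose(z, z_direct, rel_tol=rel_tol)


def invertibility_holds(produce, t, p, rel_tol=1e-9):
    """Undoing a xp adjustment by x(1/p) should give back the standard t."""
    y = produce(t, p)
    t_back = produce(y, 1.0 / p)
    return math.isclose(t_back, t, rel_tol=rel_tol)


# Hypothetical producer 1: numerals are taken at face value.
veridical = make_producer(lambda p: p)

# Hypothetical producer 2: fractions and integers are distorted differently.
asymmetric = make_producer(lambda p: p ** 0.8 if p >= 1 else p ** 1.2)

# Hypothetical producer 3: the x1 match overshoots by 5 %, and x2 followed
# by x0.5 overshoots by the same amount (defined for these ratios only).
_table = {1.0: 1.05, 2.0: 2.1, 0.5: 0.5}
biased_match = make_producer(lambda p: _table[p])

for name, producer in [("veridical", veridical),
                       ("asymmetric", asymmetric),
                       ("biased_match", biased_match)]:
    print(name,
          weak_multiplicativity_holds(producer, 100.0, 2.0),
          invertibility_holds(producer, 100.0, 2.0))
```

For a standard of 100 ms and p=2, the third producer satisfies weak multiplicativity (both routes end at the same duration) but violates invertibility (that duration is not the standard), which is exactly the distinction drawn above.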

However, previous magnitude production experiments using ratio production factors p<1<q suggest that fractions and integers are processed differently: A study by Luce, Steingrimsson and Narens (2010) showed the axiom of commutativity to be violated for the N=2 participants tested when fractions and integer ratios were mixed. Steingrimsson and Luce (2007) found comparable discrepancies for the axiom of multiplicativity for N=3 participants in an experiment on loudness production. Augustin (2008) explicitly tested the two crucial axioms of weak multiplicativity and invertibility and found them to be violated for all N=10 participants who performed ratio productions of the area of visually presented circles.

For the perception of duration, numerous experiments to determine the exponent of Stevens’ power law have been conducted using standard durations ranging from 50 ms to 300 s (Eisler, 1976). Although the exponents derived from these experiments vary between β=0.23 and 1.36, it has not been sufficiently investigated whether these differences may be caused by the use of different standards. A study by Kane and Lown (1986) used standard durations of 30 and 180 s and did not find the length of the standard duration to affect the size of the power law exponent. Eisler’s (1976) review of 111 studies on duration perception, however, reported lower exponents for experiments using standard durations shorter than 500 ms, but did not specify this observation in more detail.

Because, in contrast, even the exponents derived from Experiment 1, using standards of t₁=100 and t₂=400 ms, significantly differ in size, β(t₁)=0.87, β(t₂)=1.02, it is plausible to investigate the meaningfulness of the power law exponent for the perception of duration by means of Augustin’s (2008) additional axioms.
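How a standard-dependent exponent would show up in ratio productions can be sketched as follows. The sketch assumes, for illustration only, that productions follow x = t·p^(1/β), so that β is the reciprocal of the slope when log(x/t) is regressed on log(p). The function names, the set of ratios, and the noise-free synthetic productions are hypothetical. The fitting procedure actually used in Experiment 1 is not reproduced here, and the values 0.87 and 1.02 are taken from the text solely to generate the synthetic data.

```python
import math


def fit_exponent(standard, ratios, productions):
    """Estimate beta for one standard from ratio productions.

    Assumes x = c * t * p**(1/beta): ordinary least squares of
    log(x / t) on log(p), with intercept; beta is 1 / slope.
    """
    u = [math.log(p) for p in ratios]
    v = [math.log(x / standard) for x in productions]
    u_mean = sum(u) / len(u)
    v_mean = sum(v) / len(v)
    sxy = sum((a - u_mean) * (b - v_mean) for a, b in zip(u, v))
    sxx = sum((a - u_mean) ** 2 for a in u)
    return sxx / sxy  # reciprocal of the regression slope sxy / sxx


def synthetic_productions(standard, ratios, beta):
    """Noise-free productions generated from the assumed power law."""
    return [standard * p ** (1.0 / beta) for p in ratios]


RATIOS = [0.5, 2.0, 3.0, 4.0]  # hypothetical ratio production factors

beta_100 = fit_exponent(
    100.0, RATIOS, synthetic_productions(100.0, RATIOS, 0.87))
beta_400 = fit_exponent(
    400.0, RATIOS, synthetic_productions(400.0, RATIOS, 1.02))

print(round(beta_100, 2), round(beta_400, 2))
```

If the exponent were invariant under changes of the standard, the two estimates would coincide up to sampling error. A systematic difference between them is what the two additional axioms are meant to diagnose.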