A hallmark of Euclidean geometry is the ability to compare distances in different directions. Any distance interval can be calculated using the Pythagorean theorem, \( \mathrm{distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \), where \( \{x_1, y_1, z_1\} \) and \( \{x_2, y_2, z_2\} \) are coordinates indicating the location of two points within a three-dimensional Cartesian spatial framework. Because of this ability to calculate distance between arbitrary locations, it is obviously easy, in Euclidean geometry, to compare two such distance intervals (i.e., determine distance ratios), no matter how they are oriented in space (e.g., one can easily compare horizontal and in-depth intervals). This ability to compare distances in different directions does not extend to other geometries. In affine geometry, for example, only parallel distance intervals can be compared (Coxeter, 1961): One could compare the magnitudes of two horizontal intervals or two in-depth intervals, but one could not compare (i.e., determine) the relative length of horizontal and in-depth intervals.
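To make the Euclidean comparison concrete, the following minimal sketch computes two interval lengths with the Pythagorean theorem and then their ratio. The function names and the coordinates are hypothetical and chosen only for illustration; they do not correspond to any stimulus used in the experiment.

```python
import math

def euclidean_distance(p, q):
    """Distance between two 3-D points given as (x, y, z) tuples."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

def distance_ratio(interval_a, interval_b):
    """Ratio of the longer interval to the shorter one.

    Each interval is a pair of (x, y, z) endpoints.
    """
    len_a = euclidean_distance(*interval_a)
    len_b = euclidean_distance(*interval_b)
    return max(len_a, len_b) / min(len_a, len_b)

# Hypothetical coordinates in meters (x = horizontal, y = height, z = depth).
horizontal = ((0.0, 0.0, 10.0), (3.0, 0.0, 10.0))  # 3 m wide
in_depth = ((0.0, 0.0, 10.0), (0.0, 0.0, 22.0))    # 12 m deep
ratio = distance_ratio(horizontal, in_depth)
print(ratio)
```

In a purely affine framework, by contrast, the two lengths above could not be placed on a common scale, because the intervals are not parallel.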

If visual space has properties like those of Euclidean geometry, it should be possible for human observers to compare distance intervals oriented in different directions and perceive distance ratios (e.g., determine that one distance interval is 3 times larger than another). In a recent study by Norman, Adkins, and Pedersen (2016), observers viewed two nonparallel small intervals (individual lengths ranged from 10 to 97 cm) on any given trial and were asked to estimate the distance ratio (the magnitude of the larger interval relative to that of the smaller). Norman et al. found that their observers could make reliable judgments of distance ratios—the average Pearson r correlation coefficient relating the perceived and actual ratios was 0.87 (therefore, 76% of the variance in the estimated distance ratios could be accounted for by variations in the actual stimulus distance ratios). While their judgments were always reliable, the majority of individual observers nevertheless exhibited some inaccuracies (i.e., significantly over- or underestimated the stimulus ratios).

As we have seen, it is already known (Norman et al., 2016) that distance ratios can be reliably perceived when individual distance intervals are small (less than 1 m). We do not know, however, if this ability extends to the large distances that occur in ordinary outdoor environments. In their study, conducted outdoors, Koenderink, van Doorn, and Lappin (2000) found that visual space is non-Euclidean (i.e., intrinsically curved) and changes curvature as the spatial scale increases. Koenderink et al. found that visual space is elliptic (curved like the surface of a sphere) for small distances of 2 m or less, and hyperbolic (curved like the surface of a horse saddle) for large distances of 15 m or more (see their Figs. 4 & 6). The results of Experiment 1 of Norman, Crabtree, Clayton, and Norman (2005) were similar to those of Koenderink et al. The judgments of Norman et al.’s observers were inconsistent with Euclidean geometry; in addition, the observers’ visual space was generally elliptic for small stimulus configurations (2 m) and hyperbolic for large stimulus configurations (15 m). Since visual space is apparently a space of varying curvature, it may not be possible for human observers to compare large distances (in different directions) outdoors with any degree of accuracy or reliability. The primary purpose of the current experiment was to investigate this issue.

A second purpose motivating the current experiment involves the potential effect of aging. In 2013, Bian and Andersen (also see Gajewski, Wallin, & Philbeck, 2015) demonstrated that older adults’ ability to judge egocentric distance (distance from oneself to a single target) outdoors was superior to that of younger adults. Following this perhaps surprising result, Norman, Adkins, Norman, Cox, and Rogers (2015) found that older adults’ performance for judging small exocentric distance intervals (distance between two environmental locations, irrespective of oneself) was also superior. In a later investigation, however, Norman, Adkins, and Pedersen (2016) found no difference between younger and older adults in their ability to estimate distance ratios for small indoor intervals. If the age-related superiority in distance perception is associated with large distances viewed outdoors (Bian & Andersen, 2013), then one would expect to obtain a similar age-related superiority for judging large-scale distance ratios outdoors. In contrast, if age-related superiorities do not occur for distance ratio estimation tasks (Norman et al., 2016), then one would not expect to obtain an age-related superiority for judgments of large-scale distance ratios outdoors. The current experiment will resolve this ambiguity.

Method

Apparatus and stimulus displays

A random order of stimulus distance ratios was determined for each observer using an Apple Mac Pro computer. Twenty locations were distributed across a 26 × 60 m area of a grassy field on Western Kentucky University’s campus (these 20 locations were marked with small plastic markers that were 12.5 cm tall). The exact spatial configuration is depicted in Fig. 1. These 20 individual locations were quasi-randomly selected, subject to the following constraints: (1) the locations must span the entire field (i.e., be distributed homogeneously), (2) the locations must permit the desired stimulus distance ratios (e.g., 2.0, 5.0, 8.0) to be presented to the participants, (3) the smaller stimulus extent of a distance pair must itself be appreciable in magnitude (i.e., not tiny; e.g., be greater than 5 m), and (4) the angle between any two stimulus extents defining a ratio must be greater than 30 degrees (to prevent parallel, or nearly parallel, stimulus extents; parallel distances can be accurately compared using affine geometry). On each trial, two distance intervals were highlighted by placing PVC (polyvinyl chloride) poles (1.56 m tall × 2.7 cm diameter) at four of the locations shown in Fig. 1. The (usually) longer spatial intervals were defined by the placement of two pink poles, while the (usually) shorter spatial intervals were defined by the placement of two white poles. Eight stimulus distance ratios were used in the experiment; the specific distance intervals that defined these ratios are shown in Table 1. A photograph of a typical trial is shown in Fig. 2.
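Constraints (3) and (4) can be expressed as a simple check on a candidate pair of extents. The sketch below is illustrative only: the function names, the top-view coordinate convention (x, depth), and the example coordinates are assumptions, and the angle is taken as the acute angle between undirected extents (0 to 90 degrees), which is consistent with the range of angles reported later but is not stated explicitly in the text.

```python
import math

def extent_length(p, q):
    """Length in meters of the extent joining top-view points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_between_extents(a, b):
    """Acute angle in degrees (0-90) between two undirected extents."""
    ax, ay = a[1][0] - a[0][0], a[1][1] - a[0][1]
    bx, by = b[1][0] - b[0][0], b[1][1] - b[0][1]
    cos_theta = abs(ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(min(1.0, cos_theta)))

def is_valid_pair(a, b, min_length_m=5.0, min_angle_deg=30.0):
    """Check constraints (3) and (4): shorter extent > 5 m, angle > 30 degrees."""
    shorter = min(extent_length(*a), extent_length(*b))
    return shorter > min_length_m and angle_between_extents(a, b) > min_angle_deg

# Hypothetical extents, each a pair of (x, depth) locations in meters.
pink = ((0.0, 10.0), (24.0, 10.0))    # 24 m, horizontal
white = ((5.0, 20.0), (5.0, 26.0))    # 6 m, in depth
parallel = ((0.0, 20.0), (6.0, 20.0)) # 6 m, parallel to the pink extent

print(is_valid_pair(pink, white))     # perpendicular extents, ratio 4.0
print(is_valid_pair(pink, parallel))  # rejected: extents are parallel
```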

Fig. 1

Top view of the 20 spatial locations used to define the distance intervals used in the current experiment. The observers’ viewpoint is represented by an x. A sample distance ratio (4.0) is also indicated

Table 1 Stimulus distance ratios
Fig. 2

Photograph of the grassy field where the experiment was conducted (a stimulus distance ratio of 4.0 is depicted). On any given trial, the endpoints of the (usually) larger stimulus distance were marked by two pink poles (rightmost two poles in this specific example), while the endpoints of the (usually) shorter distance were marked by two white poles (leftmost two poles in this photograph). This photograph was taken from the observers’ approximate point of view

Procedure

On any given trial, observers viewed one of the eight stimulus distance ratios. As in our previous investigation (Norman et al., 2016), the observers were instructed to provide verbal estimates of the highlighted distance ratio (i.e., determine how much longer the “pink” interval was relative to the “white” interval). An experimenter recorded the observers’ subsequent responses. In between trials, the observers faced away from the grassy field (in the opposite direction) so that they could not see the movement of the poles. The observers judged each of the eight stimulus distance ratios three times, and thus made a total of 24 judgments. The observers had unlimited time to view and estimate the distance ratios; no feedback was given to the observers during the experiment regarding their performance.

Observers

There were a total of 32 observers. Sixteen of the observers (eight male and eight female) were older adults (mean age was 71.2 years, SD = 5.6, range: 61–78 years), while the remaining sixteen (eight male and eight female) were younger adults (mean age was 23.0 years, SD = 2.4, range: 18–28 years). All observers gave written consent prior to participation in the experiment. The experiment was approved by the Western Kentucky University Institutional Review Board. Our research was carried out in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki). All observers were naïve. The visual acuity of the observers was good; the acuity of the younger and older observers measured at 1 meter was −0.04 and −0.01 LogMAR, respectively.

Results

Typical results from two individual observers (one younger male and one older male) are presented in Fig. 3; individual results for all 32 observers are included in the Supplementary Materials. For each observer included in Fig. 3, their judged distance ratios are plotted as a function of the actual stimulus ratios. One can see that the slopes of the best fitting regression lines are quite different for the younger male (left panel) and the older male (right panel); the judgments of the older male observer were much more accurate. This pattern was typical: consider Fig. 4. This figure plots the slopes of the best fitting regression lines for all of the participants (a slope of 1.0 indicates accurate performance, such that, e.g., a change of 3.0 in stimulus ratio leads to a change of 3.0 in judged ratios). It is readily apparent from the figure that the slopes of the older male observers did not significantly differ from 1.0 (t(7) = −1.235, p = .26); note that the confidence interval for this group of observers includes 1.0. When considered as a group, these observers’ judgments were essentially accurate. Figure 4 also indicates, however, that the judgments for the remaining groups of observers (older females, younger females, and younger males) were inaccurate: The slopes of the best fitting regression lines for these observers were, overall, significantly lower than 1.0 (t values were −6.0, −7.5, and −7.6 for the older females, younger females, and younger males, respectively, all ps < .001), such that a change in stimulus ratio of 3.0 might produce, for example, a change in judged ratios of 1.5.
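The accuracy analysis described above (a regression slope per observer, then a one-sample t test of a group's slopes against 1.0) can be sketched as follows. All data in this example are hypothetical and the function names are illustrative; the sketch shows the form of the computation, not the experiment's actual values.

```python
import math
from statistics import mean, stdev

def regression_slope(actual, judged):
    """Least-squares slope of judged ratios regressed on actual ratios."""
    mx, my = mean(actual), mean(judged)
    sxy = sum((x - mx) * (y - my) for x, y in zip(actual, judged))
    sxx = sum((x - mx) ** 2 for x in actual)
    return sxy / sxx

def one_sample_t(values, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / math.sqrt(n))
    return t, n - 1

# Hypothetical observer who halves every change in stimulus ratio.
actual = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
judged = [0.5 * a + 1.0 for a in actual]
slope = regression_slope(actual, judged)  # slope of 0.5: underestimation

# Hypothetical slopes for a small group of observers, tested against 1.0.
group_slopes = [0.4, 0.5, 0.6, 0.5]
t_value, df = one_sample_t(group_slopes, 1.0)
print(slope, t_value, df)
```

A slope near 1.0 with a nonsignificant t value corresponds to the accurate group-level performance described for the older males; a strongly negative t value corresponds to systematic underestimation.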

Fig. 3

Plots of the judged distance ratios as a function of the actual distance ratios for two representative observers (one younger male and one older male). Accurate performances would be indicated by the dashed lines. Solid lines plot best fitting linear regression lines. The strength of each observer’s correlation (i.e., Pearson r value) is also indicated

Fig. 4

Slopes of best fitting regression lines relating actual stimulus distance ratios and judged distance ratios. Perfectly accurate performance would be indicated by a slope of 1.0 (represented by dashed line). Observers’ slopes are plotted as functions of both age and sex— results for the female observers are indicated by gray bars; analogous results for male observers are indicated by white bars. Error bars indicate the 95% confidence intervals

A two-way between-subjects analysis of variance (ANOVA), with age and sex as factors, revealed that the Pearson r correlation coefficients (relating judged and actual distance ratios) were significantly higher (19.1%) for males than for females, that is, a main effect of sex: F(1, 28) = 18.2, p < .001; \( \eta_p^2 = 0.39 \). The magnitude of the correlation coefficient (r) can be considered as a measure of precision: if an observer’s judgments are precise (and thus reliable over time), the repeated estimates will cluster tightly about the best fitting regression line and the magnitude of r will be high. If an observer’s repeated judgments are imprecise (and vary erratically over time), their estimates will deviate further from the regression line and the resulting magnitude of r will be low. Figure 5 plots the overall precision of our observers’ judgments; it is readily apparent that the male observers exhibited higher precision. At this point, it is important to keep in mind, however, that the precision exhibited by all of our observers was good: The overall (mean) Pearson r correlation coefficient was 0.76. Thus, the majority of the variance (58.1%) in our observers’ estimates of distance ratios could be accounted for by variations in the actual stimulus ratios themselves.
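The distinction between precision (Pearson r) and accuracy (slope) can be illustrated with a short sketch. The two sets of judgments below are hypothetical: the first is perfectly consistent but compressed (slope of 0.5), whereas the second scatters erratically about its regression line. The function name is illustrative.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

actual = [2.0, 4.0, 6.0, 8.0]

# Hypothetical precise but inaccurate judgments (slope of 0.5, no scatter).
precise = [1.5, 2.5, 3.5, 4.5]
# Hypothetical imprecise judgments (erratic scatter about the regression line).
imprecise = [3.0, 2.0, 8.0, 5.0]

r_precise = pearson_r(actual, precise)
r_imprecise = pearson_r(actual, imprecise)

# Proportion of variance in the judgments accounted for by the stimulus ratios.
print(r_precise, r_precise ** 2)
print(r_imprecise, r_imprecise ** 2)
```

The first observer would count as highly precise despite underestimating every change in stimulus ratio, which is why r and slope are reported separately.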

Fig. 5

Magnitudes of the observers’ correlations (i.e., Pearson r values) between actual stimulus distance ratios and judged distance ratios. Observers’ Pearson r values are plotted as functions of both age and sex— results for female observers are indicated by gray bars; analogous results for male observers are indicated by white bars. Error bars represent ± 1 SE.

As Table 1 indicates, the angle between the individual stimulus extents varied (from 37.0 to 89.2 degrees) across the different stimulus ratios. In previous research, Norman et al. (2016) found that for small stimulus extents (10–97 cm), the judged distance ratios generally became larger as the angle increased above 50 degrees. As Fig. 6 indicates, however, the effect of variations in angle was quite different in the present investigation of large distance extents (6.4–55.2 m). In the current experiment, most of the observers (62.5%) demonstrated a pattern like that exhibited by Observer 1 (left panel of Fig. 6), whose judged ratios generally decreased with increases in angle; for the remaining minority of observers (37.5%; see Observer 8, right panel of Fig. 6), there was little or no effect of angle upon the estimated distance ratios. The average magnitude of the (nonlinear) correlation coefficient was 0.318, which was significantly different from zero, t(31) = 15.8, p < .000001. Despite the fact that the average correlation was significantly different from zero, it is clear that variations in angle account for little of the variance (only 10.1%, \( r^2 = .101 \)) in the observers’ judgments. Nevertheless, it is interesting that for small extents (Norman et al., 2016), judged distance ratios increase at larger angles, while for large extents (current experiment), judged distance ratios often decrease for larger angles. This type of result, that perceived geometrical relations differ at small and large spatial scales, has been observed before (e.g., Koenderink et al., 2000; Norman et al., 2005).

Fig. 6

Plots of judged distance ratios (relative to actual stimulus ratios) as a function of the angle between individual stimulus extents (see Table 1) for two typical observers. Accurate performances would be indicated by dashed lines. Solid curves plot best fitting nonlinear (quadratic) regression

Past research upon the visual perception of environmental distance and spatial relations has produced conflicting outcomes. Some studies have found evidence that visual space is not Euclidean in nature, but is intrinsically curved (e.g., Battro, Netto, & Rozestraten, 1976; Blank, 1958; Hardy, Rand, & Rittler, 1951; Higashiyama, 1981; Koenderink et al., 2000; Norman et al., 2005). Others (e.g., Lappin, Shelton, & Rieser, 2006; Norman, Lappin, & Norman, 2000) have found that distances in depth are perceived to be significantly longer than they actually are. A large number of studies, however, have found the opposite result: the perceptual compression of in-depth intervals, such that they appear significantly smaller than their physical size (e.g., Bian & Andersen, 2013; Da Silva & Dos Santos, 1984; Foley, Ribeiro-Filho, & Da Silva, 2004; Gilinsky, 1951; Loomis, Fujita, Da Silva, & Fukusima, 1992; Loomis & Philbeck, 1999; Norman et al., 2005; Norman, Todd, Perotti, & Tittle, 1996; Wagner, 1985).

Given the frequency of this finding (that visual space is compressed along the in-depth axis during perception), we decided to investigate whether our current results are consistent with this widely observed phenomenon. At this point, it is important to remember that many of the current observers’ judgments were inaccurate (i.e., they frequently underestimated the physical stimulus distance ratios; see Fig. 4). The modeling of each observer’s individual results was conceptually straightforward. Assuming that human observers’ perceptions of distances are in fact compressed (i.e., shortened) in depth during perception, is it possible that our participants’ judgments were in fact accurate if this well-known perceptual distortion is taken into account? For each observer, we performed a Monte Carlo analysis (e.g., see Kroese, Brereton, Taimre, & Botev, 2014). First, the spatial configuration of 20 stimulus locations (see Fig. 1) was compressed in depth 100,000 times by random magnitudes (sampled from the full range of zero to 100%). For each of the 100,000 iterations, the individual stimulus extents (defined in Table 1) and effective stimulus distance ratios were determined after the stimulus space had been compressed in depth. Then, for each iteration, a correlation (Pearson r) was calculated between the observer’s judged stimulus ratios and the effective stimulus distance ratios (i.e., the magnitudes of the stimulus distance ratios as they existed in that transformed/compressed space). For each observer, a single scale factor (i.e., amount of compression; scale factor of 1.0 = no compression, scale factor of zero = complete compression of 3-D space into a plane) was thus determined (out of the 100,000 randomly selected scale factors) that produced the most accurate performance (i.e., the slope closest to 1.0).

As examples, consider the performance of Observers 3 (age = 24 years) and 25 (age = 78 years), whose individual results are illustrated in Figs. 7 and 8, respectively: Both observers underestimated the stimulus ratios as they existed in physical space (left panels of Figs. 7 & 8), but their judgments became much more accurate when considered relative to the stimulus ratios as they existed within the transformed (affinely compressed) space (right panels). This pattern occurred for the entire set of observers: All 32 observers’ judgments were more accurate in transformed space than physical space, t(31) = 8.617, p < .000001. According to our current analysis and data, in the transformation from physical space to perceived space, distances in depth are compressed by about half (average scale factor that produced the most accurate performance was 0.469). There were no effects of age or sex upon the amounts of in-depth compression needed to achieve maximally accurate performance: Fs for main effects of age and sex were both less than 1.0; for the interaction of age & sex, F(1, 28) = 2.05, p = .16.
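The logic of the Monte Carlo search over in-depth scale factors can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the stimulus coordinates are hypothetical (top view, with each point given as x and depth in meters), the function names are illustrative, the simulated observer is constructed to respond exactly as if depth were halved, and the demonstration uses fewer iterations than the 100,000 described above to keep the run time short.

```python
import math
import random
from statistics import mean

def compressed_length(p, q, scale):
    """Length of extent p-q after scaling only the depth coordinate."""
    return math.hypot(q[0] - p[0], scale * (q[1] - p[1]))

def effective_ratios(trials, scale):
    """Pink/white length ratio of each trial in the depth-compressed space."""
    return [compressed_length(*pink, scale) / compressed_length(*white, scale)
            for pink, white in trials]

def regression_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def best_scale(trials, judged, n_iter=100_000, seed=0):
    """Randomly sample scale factors; keep the one whose slope is closest to 1.0."""
    rng = random.Random(seed)
    best_s, best_err = None, math.inf
    for _ in range(n_iter):
        scale = rng.uniform(0.0, 1.0)  # 1.0 = no compression, 0.0 = complete
        if scale == 0.0:
            continue  # complete compression can collapse an extent to zero length
        slope = regression_slope(effective_ratios(trials, scale), judged)
        if abs(slope - 1.0) < best_err:
            best_s, best_err = scale, abs(slope - 1.0)
    return best_s

# Hypothetical trials: ((pink endpoints), (white endpoints)), points as (x, depth).
trials = [
    (((0.0, 20.0), (2.0, 32.0)), ((-6.0, 15.0), (0.0, 16.0))),
    (((4.0, 10.0), (6.0, 40.0)), ((-8.0, 12.0), (-2.0, 13.0))),
    (((-3.0, 8.0), (0.0, 56.0)), ((2.0, 20.0), (8.0, 22.0))),
]
# Simulated observer whose judgments match a space compressed in depth by half.
judged = effective_ratios(trials, 0.5)
print(best_scale(trials, judged, n_iter=20_000))
```

Because the scale factor is a single parameter on a bounded interval, a grid or bracketing search would serve equally well in this sketch; random sampling is used only to mirror the procedure described in the text.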

Fig. 7

Plots of judged distance ratios of Observer 3 (representative younger male) as a function of physical distance ratios (left panel) and of stimulus distance ratios following affine compression in depth of stimulus space (right panel). Accurate performances would be indicated by dashed lines. Solid lines plot best fitting linear regression lines. Strength of observer’s correlations (i.e., Pearson r values) and slopes (of best fitting regression line) are also indicated

Fig. 8

Plots of judged distance ratios of Observer 25 (representative older female) as a function of physical distance ratios (left panel) and of stimulus distance ratios following affine compression in depth of the stimulus space (right panel). Accurate performances would be indicated by dashed lines. Solid lines plot best fitting linear regression lines. Strength of observer’s correlations (i.e., Pearson r values) and slopes (of best fitting regression line) are also indicated

Discussion

Over the past 65 years, many studies have evaluated the visual perception of distance. As reviewed earlier, many of these experiments have demonstrated perceptual compression of in-depth intervals, such that they appear significantly smaller than their physical size (e.g., Bian & Andersen, 2013; Da Silva & Dos Santos, 1984; Foley, Ribeiro-Filho, & Da Silva, 2004; Gilinsky, 1951; Loomis et al., 1992; Loomis & Philbeck, 1999; Norman et al., 1996; Norman et al., 2005; Wagner, 1985). Other studies have found either no perceptual compression (Lappin, Shelton, & Rieser, 2006) or significant individual differences, such that some observers exhibit systematic perceptual compression while others exhibit perceptual expansion (Norman, Adkins, Pedersen, et al., 2015). Even for single observers, the outcomes can be task-specific. When three observers in Experiment 1 of Norman et al. (2005) were asked to adjust distances between markers outdoors to create apparent equilateral triangles, their judgments (binocular viewing, 15-m condition) were consistent with Euclidean geometry; those same observers exhibited affine perceptual compression of in-depth intervals when they were later asked to perform a depth-to-width matching task. When the literature is considered as a whole (e.g., see Wagner, 2006), the conclusion is inescapable: Visually perceived spatial relations and distances cannot be explained in terms of a single geometry. Foley et al. (2004) have concluded that “although the perception of location and the perception of extent are related, they are not related by Euclidean geometry, nor by any metric geometry” (p. 147).

Physical space, at least at scales relevant for human behavior, can be well characterized by Euclidean geometry. Euclidean geometry is mathematically useful, because it provides a metric by which distances can be compared in different directions (i.e., distance ratios for nonparallel intervals can be determined). It is clear from the previous review of the literature (and previous results from our own lab), however, that human visual space is not Euclidean. Nevertheless, one wonders how well human observers can compare visual distances oriented in different directions (i.e., perceive distance ratios). In a recent investigation, the observers of Norman, Adkins, and Pedersen (2016) viewed small distances (1 m or less) indoors and were required to estimate distance ratios for nonparallel spatial intervals (the angle separating the two intervals to be compared on any given trial varied from 31.5 to 84.6 degrees). Given the previous findings that visual space is not only non-Euclidean, but may be nonmetric, it is perhaps surprising that the observers of Norman et al. performed relatively well; for example, the average Pearson r correlation coefficient relating the perceived distance ratios and actual stimulus ratios was 0.87 (thus, 76% of the variance in the perceived distance ratios could be accounted for by variations in the actual stimulus ratios). The results of the current experiment confirm and extend the earlier findings of Norman et al. (2016). Whereas the previous study evaluated the perception of small distances indoors, the current experiment involved the perception of large distances (up to 55.2 m) outdoors. Our results demonstrate that human observers can reliably perceive distance ratios over both large and small spatial scales and in both indoor and outdoor environments.

While the current results demonstrate that observers’ judgments of distance ratios are reliable and precise (see Fig. 5), they were generally not accurate, at least when considered relative to the stimulus distance ratios in physical space (see Fig. 4). Our observers’ judgments are accurate (see right panels of Figs. 7 & 8), however, if they are compared to the stimulus distance ratios as they exist following affine compression in depth by about half (such that a 20 m physical distance in depth appears to be 10 m in extent). This amount of perceptual compression of in-depth intervals is quite similar in magnitude to the amounts of compression obtained in previous experiments (e.g., Bian & Andersen, 2013; Gilinsky, 1951; Harway, 1963; Loomis et al., 1992; Wagner, 1985). Given the wide variety of specific tasks employed in the current and past studies (e.g., match depth to width, indicate successive equal-appearing intervals in depth, pulling rope, estimate distance ratios), it is satisfying to see such a large amount of converging evidence to support the idea that the visual space that we experience is affinely compressed in depth relative to physical space.

A consistent effect of sex upon the precision of the observers’ judgments was found in the current experiment: Males, regardless of age, exhibited higher precision than females (the Pearson r correlation coefficients for males were 19.1% higher than for females; see Fig. 5). This modest, but significant, effect of sex is perhaps not surprising: Males frequently perform better than females for a variety of spatial tasks (Contreras, Rubio, Peña, Colom, & Santacreu, 2007; Grön, Wunderlich, Spitzer, Tomczak, & Riepe, 2000; Kaufman, 2007; Lawton, 1996; Linn & Petersen, 1985; Tapley & Bryden, 1977; Voyer & Saunders, 2004; Voyer et al., 2006; Yen, 1975; Zuidhoek, Kappers, & Postma, 2007). Interestingly, the frequently obtained male-related superior performance for spatial tasks appears to be related to circulating levels of testosterone in the bloodstream (Cherrier et al., 2001; Driscoll, Hamilton, Yeo, Brooks, & Sutherland, 2005; Little, 2013; Janowsky, Oviatt, & Orwoll, 1994). According to Hausmann, Slabbekoorn, Van Goozen, Cohen-Kettenis, and Güntürkün (2000), while estradiol (produced by the female ovaries) produces deteriorations in human spatial ability, testosterone increases ability. It is important at this point to note that high levels of free testosterone are not needed for superior spatial ability: Shute, Pellegrino, Hubert, and Reynolds (1983) demonstrated that there is a curvilinear relationship between circulating testosterone and spatial ability (moderate levels of circulating testosterone produced maximal spatial performance). This may account for the fact that our older male observers also exhibited higher precision in the current experiment (see Fig. 5), despite the fact that aging moderately reduces levels of testosterone (Harman, Metter, Tobin, Pearson, & Blackman, 2001; Morley et al., 1997); apparently, moderate levels of circulating testosterone are all that are needed to produce improvements in performance for spatial tasks.

In addition to the main effect of sex upon precision, a significant effect involving age occurred with respect to accuracy in the current study—the judgments of older male adults were more accurate than those of younger male adults (at least when performance is evaluated relative to physical stimulus distances; see Fig. 4). The average slope (of the relationship between judged and physical stimulus distance ratios) of the older males was 45.6% higher than that of the younger males. There was no comparable age effect for female observers. The current finding that some older adults (older males) can judge distance ratios more accurately than younger adults reinforces and extends previous findings of age-related superiorities in distance perception (Bian & Andersen, 2013; Gajewski, Wallin, & Philbeck, 2015; Norman, Adkins, Norman, et al., 2015).

Conclusion

Human observers can reliably perceive and estimate distance ratios, even for large distances viewed in outdoor environments. Age and sex significantly modulate performance.