1 Introduction

One benefit of having two eyes with overlapping fields of view is stereopsis, or depth perception. Stereopsis is essential for locating the position of an object relative to its surroundings, especially at near distances. When an object is seen from two different points of view, its image undergoes a slight relative horizontal position shift. The visual system encodes this relative positional difference (called binocular disparity) and perceives the object at a certain z-coordinate in visual space. This 3D location information must be obtained, computed, and conveyed for visually guided actions such as avoiding obstacles, grasping a mug, or descending staircases.

Julesz proposed that one of the most important functions of stereopsis is to break camouflage, i.e., to support the ability to distinguish a target from the background even when the two share a matching pattern (Julesz 1971). With stereopsis, one can extract global depth information from autostereograms (single 2-dimensional images with repeating patterns that reveal 3-dimensional content when viewed with the appropriate eye convergence) or random dot stereograms. This information is derived purely from binocular disparity, based on the corresponding regions (or points) matched between the two eyes. While stereopsis is based solely on relative position information between the two eyes' images, the resulting depth percept can be enriched or degraded by monocular cues. Artists have long been able to render a flat canvas into an impression of 3-dimensional depth by tricking our brains with pictorial cues. Pictorial cues in the visual scene such as texture (Johnston et al. 1993; Hornsey and Hibbard 2021), texture gradient (Hillis et al. 2004; Stevens 1981; Tozawa 2012; Tsutsui et al. 2002), blur (Held et al. 2012; Mather 1997), relative size, lighting direction (Langer and Bülthoff 2000), occlusion (Marshall et al. 1996), tilt (Fiorentini and Maffei 1971; Oluk et al. 2022), distance to the horizon (Gardner et al. 2010), and shading (Chen and Tyler 2015) provide additional information for depth perception and stereopsis. For 3D shapes, differences in material properties, local luminance contrast changes at borders, shadow, reflectance, and motion can help us identify a figure against a background with matching texture (Troscianko et al. 2009). Beyond these cues, the extraction of 3D visual information about an object from its surroundings also depends on the complexity of the background scene and on the contrast and color differences between the target and its background (O’Shea et al. 1994; Scaccia and Langer 2018).

Extraction of 3D visual information involves integrating information from both the object and its surroundings (Salmela and Laurinen 2007). Background textures of different spatial patterns (Sutter et al. 1989), contrasts (Frisby and Mayhew 1978), reference planes (Andrews et al. 2001), nearby features (Mitchison and Westheimer 1984), or colors may improve or worsen stereopsis. For example, neighboring surfaces with similar texture profiles may camouflage one another, making them harder to tell apart and potentially degrading stereopsis. As a simple demonstration of how background texture affects depth perception, try cross-fusing the left and right images in Fig. 1A; specific instructions are given in the figure caption. One will see a square target appearing in depth, in front of the non-textured gray background. With matching target-background texture (Fig. 1B), one will have difficulty separating the 3D target from the background. In contrast, with a lower spatial frequency (SF) texture in the background (Fig. 1C), the background appears blurred, strengthening the depth percept of the target.

Fig. 1
figure 1

A simple demonstration of the effect of background texture on depth perception. A helpful trick is to place a finger slightly in front of the screen and fixate on it. Move the finger closer to your face, which forces your eyes to converge. While fixating on the finger, the two images behind it will create a third image in the middle. Adjust the finger closer or farther until the black dots above the images overlap. A A textured square will appear in depth, in front of the gray background. B With matching target-background texture, the target blends in with the background, making it harder to detect the center square monocularly; a center square nevertheless appears in front of the background despite the similar texture pattern. C A lower SF background gives an illusion of blur when fixating on the target and enhances depth perception

The human visual system comprises multiple spatially tuned filters that extract luminance information and give rise to the bandpass-shaped contrast sensitivity function. Different SFs carry different information: high SFs convey details and edges, while low SFs convey coarse global information such as shape (Bex et al. 2009; Westheimer 2001). Stereopsis exhibits similar behavior. Many psychophysical studies have investigated the spatial characteristics of the target in stereopsis using Gabor patches (Schor and Wood 1983), sinusoidal gratings, or depth corrugations (Reynaud and Hess 2017). While these studies provided a fundamental understanding of spatial frequency effects in stereopsis, in real-world scenes spatial frequency content is also carried by surface textures: some luminance variation across an object's texture is needed to carry disparity information.

It remains uncertain how the surroundings, especially the spatial content of the background texture, affect the integration of figure-background depth information (Sawayama and Kimura 2015). Spatial content here refers to spatial frequency (SF), the periodic distribution of dark and light imaged within a given area on the retina. In visual perception studies, sinusoidal, square-wave, and noise patterns defined by spatial frequency (cycles per degree, cpd) are typically used. Only a few studies have examined the impact of texture spatial properties on stereopsis. Brown and Weisstein, using sinusoidal gratings, found a bias whereby higher SF gratings are perceived as being in front when surrounded by gratings 2 octaves lower in SF (Brown and Weisstein 1988). In contrast, another study found no bias as a function of spatial frequency (Siderov and Harwerth 1993). Another example of how background texture affects depth perception is disparity averaging, the combination of disparity information across space (Cammack and Harris 2016; Li and Motter 2007). For example, the disparity information extracted from a single target in isolation may differ from that of the same target surrounded by other objects or a cluttered background. The SF and contrast of the surround affect disparity averaging: generally, when two targets whose SF patterns are within 3.5 octaves of each other are placed side by side, the perceived depth is biased toward the higher SF (Rohaly and Wilson 1994).

In virtual reality (VR), textures are often applied to a surface to provide visual detail. The SF of a texture can easily be manipulated in the virtual environment. For example, a texture from an asset library typically comes in a fixed size: it can be stretched to wrap an entire area or be tiled repeatedly, so different tiling and wrap modes yield contrasting SF content, as shown in Fig. 2. Well-designed virtual content can improve visual comfort and the immersive experience, especially distance perception (Creem-Regehr et al. 2022; Diaz et al. 2017; El Jamiy and Marsh 2019; Ellis et al. n.d.; Kline and Witmer 1996). The underestimation of, and bias in, distance perception in VR environments has been studied extensively (Bruder et al. 2015; Creem-Regehr et al. 2015; Kunz et al. 2015; Renner et al. 2013; Scarfe and Glennerster 2021). Hornsey and Hibbard found that texture reduced this bias (Hornsey and Hibbard 2021). In addition, realistic texture has been reported to improve spatial perception in virtual reality compared with “paper model/non-textured” scenes (Lucaci et al. 2022). Therefore, beyond its aesthetic role, texture can also be used to improve depth judgment in virtual reality. This study specifically explores the effect of texture (on both the target and the background) as a pictorial cue for depth perception in a virtual environment.
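The relationship between tiling and SF content is simple arithmetic: repeating a tile packs more luminance cycles into the same visual angle. A minimal Python sketch makes the relation explicit (the function name and the example values are illustrative, not taken from the study):

```python
def effective_cpd(cycles_per_tile: float, repeats: int, surface_deg: float) -> float:
    """Effective spatial frequency (cpd) of a tiled texture.

    Tiling a texture `repeats` times across a surface packs
    cycles_per_tile * repeats cycles into the same visual angle,
    so on-screen spatial frequency scales with the repeat count.
    """
    return cycles_per_tile * repeats / surface_deg

# One tile holding 9 cycles, stretched over a 3-degree surface: 3 cpd.
print(effective_cpd(9, 1, 3.0))   # 3.0
# The same tile repeated 3 times triples the spatial frequency: 9 cpd.
print(effective_cpd(9, 3, 3.0))   # 9.0
```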

Fig. 2
figure 2

Example of how different tiling and wrap modes affect the SF content of a texture. A Non-repeating wrap mode. B Low repeat. C Excessive tiling and repeat

In our experiment, bandpass noise patterns at different SFs were used to systematically investigate the role of target and background texture in relative distance judgments in virtual reality. Complete randomization of pixel values yields a white noise pattern (Fig. 3A) containing energy at all spatial frequencies. Bandpass noise patterns (Fig. 3B–F), in contrast, are filtered to contain a narrow band of spatial frequencies, allowing the study of fundamental mechanisms of visual perception (Allard et al. 2015). Unlike gratings, they have no orientation bias and more closely resemble natural textures; they also allow disparity to be assessed as a function of texture SF. We found that the texture of the target is critical for depth perception: participants performed best with mid-SF bandpass noise texture on the target and significantly worse with no texture on the target. Background texture also affected performance, which was best with a mid-SF target against a mid-SF background. We propose several models to investigate the influence of SF tuning and texture integration in depth perception.

Fig. 3
figure 3

Example of white noise pattern and bandpass noise pattern. The spatial frequency is defined by the number of black-and-white cycles in each square, assuming each square subtends a visual angle of 3°. A White noise pattern, B Bandpass noise pattern 0.33 cycles per degree (cpd), C 0.66 cpd, D 1 cpd, E 3 cpd, F 4.5 cpd

2 Methodology

2.1 Subjects and apparatus

Eleven subjects (mean age = 29.3 years, three female) volunteered for this pilot study; one of them was an author of the manuscript. None had binocular vision abnormalities; all had stereoacuity of 32 arcsec or better on the Random Dot 2-S Stereotest (Vision Assessment Corporation) and monocular visual acuity of at least 20/25, and all wore their prescribed correction during the experiment. Written informed consent was obtained from all participants before the experiment, and the experimental procedures were approved by the Institutional Review Board of the University of Houston.

Custom code was written in JavaScript using the Three.js 3D rendering library to assess distance discrimination in a virtual reality environment. The program ran in a web browser with the WebXR emulator, with the rendered scene shown on each display of an HTC Vive Pro headset. This headset has a resolution of 1080 × 1200 pixels per eye, a 90 Hz refresh rate, and an approximately 110° field of view. The scene was stable on the display, with all head-tracking features disabled. Luminance was gamma corrected (gamma = 2.2) in software using a custom WebGL shader, and the noisy-bit method was used to provide finer resolution of luminance steps (Allard and Faubert 2008). The mean luminance of the stimuli was ~ 50 cd/m2, equivalent to a typical indoor brightness level and comfortable for prolonged viewing. The HTC Vive Pro can reach luminances as high as ~ 110 cd/m2 at full white. Subjects responded using a standard computer keyboard.
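The correction itself ran as a WebGL shader; the NumPy sketch below illustrates the same two steps under stated assumptions (an 8-bit display and the standard formulation of the noisy-bit method; function and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng()

def to_display(luminance: np.ndarray, gamma: float = 2.2) -> np.ndarray:
    """Map desired relative luminance (0-1) to 8-bit display values.

    Raising to 1/gamma linearizes the display's gamma response.  The
    noisy-bit step (Allard & Faubert 2008) adds uniform noise before
    rounding, so that averaged over pixels the quantization error no
    longer limits luminance resolution to 256 discrete steps.
    """
    v = luminance ** (1.0 / gamma)               # inverse of display gamma
    levels = v * 255.0
    dithered = np.round(levels + rng.uniform(-0.5, 0.5, levels.shape))
    return np.clip(dithered, 0, 255).astype(np.uint8)
```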

2.2 Stimulus

The virtual scene consisted of a room with four walls (width and length of 12 m virtual distance) surrounding a perspective camera positioned at the center. Wearing the headset, observers saw a background wall located 6 m ahead. On a keypress, three squares appeared halfway to the wall, at a 3 m distance, for 1 s. Each square tile subtended 3 × 3°, and the tiles were separated by 4.7° center-to-center. The middle square was positioned straight ahead along the line of sight, with one square to its left and one to its right. The targets and background were rendered with textures made of bandpass filtered noise patterns (0.33, 0.66, 1, 3, and 4.5 cycles per degree (cpd)); higher SF textures have finer detail and a grainier appearance. The bandpass filtered noise patterns were pre-generated offline, loaded as PNG files, and wrapped around the objects. The materials were non-shiny, non-specular MeshLambertMaterial surfaces as defined in Three.js. The spatial filter has a bandwidth (full width at half height) of 1 octave and is radially symmetric in the log2 frequency domain. The filter equation, modified from Peli (1990), is given by:

$$\text{Amplitude} = 0.5 + \frac{\cos \left( \pi \left( \log_{2} \left( vals \right) - \log_{2} \left( ctr \right) \right) \right)}{2}$$

where vals is the radial position in pixels from the center of the frequency spectrum and ctr corresponds to the center spatial frequency. First, random values between 0 and 1 were generated for every pixel within the dimensions of the image. The image was then Fourier transformed and multiplied by the log2 filter centered on that SF in cycles per image. The entire image was scaled to the visual angle of each object, yielding textures defined in cycles per degree (cpd). A large texture was generated to match the size needed for the background; a snippet of this texture was cropped for use on the square tiles and randomly rotated for each target on each trial. An ambient light source (intensity = 1) that illuminated all objects in the scene equally was placed at the origin, with no cast shadows.
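A NumPy sketch of this generation procedure is given below. It is a reconstruction from the description above, not the authors' offline script; we assume the filter is clipped to zero beyond ± 1 octave from the center, where the raised cosine naturally reaches zero:

```python
import numpy as np

def bandpass_noise(size_px: int, ctr_cpi: float, rng=None) -> np.ndarray:
    """Generate a 1-octave band-pass noise texture (values 0-1).

    ctr_cpi is the center frequency in cycles per image; dividing by
    the visual angle the image will subtend (deg) gives its cpd.
    """
    rng = rng or np.random.default_rng()
    noise = rng.random((size_px, size_px))            # white noise in 0-1

    # Radial frequency (cycles per image) of every FFT coefficient.
    fx = np.fft.fftfreq(size_px) * size_px
    radius = np.sqrt(fx[None, :] ** 2 + fx[:, None] ** 2)
    radius[0, 0] = 1e-9                               # avoid log2(0) at DC

    # Raised-cosine filter, radially symmetric on a log2 axis (Peli 1990):
    # amplitude = 0.5 + cos(pi * (log2(f) - log2(ctr))) / 2,
    # assumed to be zero beyond +/- 1 octave from the center frequency.
    octaves = np.log2(radius) - np.log2(ctr_cpi)
    amp = np.where(np.abs(octaves) <= 1.0,
                   0.5 + np.cos(np.pi * octaves) / 2.0, 0.0)

    filtered = np.real(np.fft.ifft2(np.fft.fft2(noise) * amp))
    # Rescale symmetrically around mean luminance for display.
    filtered -= filtered.mean()
    return 0.5 + 0.5 * filtered / np.abs(filtered).max()
```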

During each trial, one of the targets appeared closer than the other two, i.e., at a smaller z-coordinate than the fixation plane at 3 m. For example, as illustrated in Fig. 4A, the target on the left is positioned 2.7 m from the observer while the middle and right non-targets are at 3 m. The task was to identify the target's position and respond with the left, down, or right arrow key. To eliminate size cues when the target is displaced forward, a scaling factor compensated for the change in retinal size so that the three squares subtended a fixed angular size regardless of z-position; a short sketch of this compensation follows. A black grid was drawn on the background surface to support binocular fusion throughout the experiment. None of the subjects reported double vision during the experiment.
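The compensation follows from simple trigonometry: the physical edge length needed to subtend a fixed angle grows linearly with distance, so a square displaced from z0 to z is scaled by z/z0. A minimal sketch using the values from the setup above (the helper name is ours):

```python
import math

def metric_size(angular_deg: float, distance_m: float) -> float:
    """Physical edge length (m) subtending `angular_deg` at `distance_m`."""
    return 2.0 * distance_m * math.tan(math.radians(angular_deg) / 2.0)

# A 3-degree square at the 3 m fixation plane vs. displaced to 2.7 m:
print(metric_size(3.0, 3.0))   # ~0.157 m
print(metric_size(3.0, 2.7))   # ~0.141 m -> scale factor z/z0 = 0.9
```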

Fig. 4
figure 4

A An illustration of the experimental setup. Through the HTC Vive Pro headset, a textured background wall is located 6 m ahead. The targets, three square tiles, were presented for 1 s at 3 m on each trial, and one of them was closer to the observer, indicated by the red arrow. The task was to identify the closer one with a keypress. The target and background were tested with different spatially filtered noise textures (SF = 0.33, 0.66, 1, 3, 4.5 cpd, or no texture: isoluminant gray). B The distance thresholds between the target and the two non-targets were obtained through a staircase method with seven reversal points, indicated by the black ‘O’ symbols. The distance thresholds were calculated as the mean of the last 5 reversal points. Green dots indicate correct responses, while red ‘X’ symbols indicate erroneous responses

A 3-down 1-up adaptive staircase method with seven reversals was implemented to measure the distance threshold (in meters), the smallest distance separation needed to distinguish the target from the two non-targets. The separation between the target and the two non-targets decreased after three correct responses in a row and increased after a mistake, as shown in Fig. 4B. The mean of the last five reversals was used to estimate the distance threshold. The maximum value of the staircase was capped at 1.5 m from the fixation plane because left and right squares presented beyond that limit had a significant horizontal displacement that could be used as a cue. Each staircase was repeated for each condition. The dependent variable of the study was the distance threshold, while the independent variables were the different combinations of texture. Three sets of conditions were tested for each subject: textured targets against a gray background (the effect of target texture alone), non-textured gray targets against a textured background (the effect of background texture alone), and different combinations of target and background texture (the combined effect with matching and non-matching textures). To test the effect of spatial frequency, each texture condition was tested at 5 spatial frequencies: 0.33, 0.66, 1, 3, and 4.5 cpd. A minimal sketch of the staircase logic is given below.
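In this sketch, the step size and the simulated observer are illustrative assumptions; only the 3-down 1-up rule, the 1.5 m cap, the seven reversals, and the mean of the last five reversals come from the text:

```python
import numpy as np

def staircase_3down_1up(respond, start=1.5, step=0.05, n_reversals=7):
    """3-down 1-up adaptive staircase (the step size is an assumed value).

    `respond(sep)` runs one trial at target/non-target separation `sep`
    (m) and returns True when the response is correct.  The separation
    shrinks after 3 consecutive correct trials, grows after any error,
    and is capped at the 1.5 m limit; the threshold estimate is the
    mean of the last 5 of 7 reversals.
    """
    sep, run, direction, reversals = start, 0, 0, []
    while len(reversals) < n_reversals:
        if respond(sep):
            run += 1
            if run == 3:                      # 3 correct in a row -> harder
                run = 0
                if direction == +1:
                    reversals.append(sep)     # turn from ascending = reversal
                direction = -1
                sep = max(sep - step, step)
        else:                                 # any error -> easier
            run = 0
            if direction == -1:
                reversals.append(sep)         # turn from descending = reversal
            direction = +1
            sep = min(sep + step, start)      # capped at 1.5 m
    return float(np.mean(reversals[-5:]))

# Simulated observer: always correct above a 0.25 m separation,
# guessing (1 in 3) below it.
rng = np.random.default_rng(1)
print(staircase_3down_1up(lambda s: s > 0.25 or rng.random() < 1 / 3))
```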

The subjects each completed a total of 70 blocks (35 blocks × 2 repeats), and each block took about 2–5 min. All subjects ran a few practice blocks before the experiment to become familiar with the task. The total duration of the study was approximately 3–4 h, divided into 3 or 4 visits, with short breaks every ~ 15–20 min. We encouraged breaks and stopped each session after an hour to reduce fatigue and accommodation-vergence effects. The details for each condition are given below:

Condition A (5 blocks)

Textured targets (0.33, 0.66, 1, 3, 4.5 cpd) against isoluminant gray background (no texture)

Condition B (5 blocks)

Gray target (no texture) against textured background (0.33, 0.66, 1, 3, 4.5 cpd)

Condition C (25 blocks)

Combinations of textured target and background (0.33, 0.66, 1, 3, 4.5 cpd). In the “Matched Condition,” targets and background had the same SF texture; in the “Unmatched Condition,” the targets had a different SF texture from the background

The distance thresholds in meters were converted to disparity thresholds in arc min, assuming a 65 mm inter-pupillary distance and fixation at 3 m, using the formula below (Howard 1919), where d is the threshold distance and D = 3 m is the fixation distance. Disparity sensitivity is calculated as the reciprocal of the disparity threshold.

$$\text{Disparity}\;(\text{arc min}) = \frac{0.065 \times d \times 57.29 \times 60}{D^{2} - dD}$$
$$\text{Disparity sensitivity}\;\left( \min^{-1} \right) = 1/\text{disparity threshold}$$
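As a worked example, the conversion can be written as a small helper (a sketch under the stated 65 mm IPD assumption; the sample value is the Condition A mean threshold at 3 cpd reported in the Results):

```python
import math

IPD = 0.065                    # assumed inter-pupillary distance (m)
RAD_TO_ARCMIN = 57.29 * 60     # radians -> degrees -> arc minutes

def disparity_arcmin(d: float, D: float = 3.0) -> float:
    """Relative disparity (arc min) between the fixation plane at D
    and a target displaced to D - d, for the assumed 65 mm IPD."""
    return IPD * d * RAD_TO_ARCMIN / (D ** 2 - d * D)

def log_disparity_sensitivity(d: float, D: float = 3.0) -> float:
    """Log10 of disparity sensitivity (min^-1), as used in the modeling."""
    return math.log10(1.0 / disparity_arcmin(d, D))

# Condition A mean threshold at 3 cpd: 21.18 cm -> ~5.7 arc min.
print(round(disparity_arcmin(0.2118), 2))   # 5.66
```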

2.3 Data analysis

We performed data analyses in Python using the NumPy/SciPy scientific libraries (Oliphant 2007). A Shapiro–Wilk test for normality of the distance thresholds and distance sensitivities indicated that the sample data do not come from a normal distribution, and other transformations also failed to normalize the data. We therefore used the non-parametric Wilcoxon signed-rank test to compare results between conditions, with p < 0.05 as the threshold for statistical significance.
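A minimal SciPy sketch of this decision procedure (the helper and its parametric fallback branch are illustrative; the study used the Wilcoxon branch throughout):

```python
import numpy as np
from scipy import stats

def compare_conditions(a: np.ndarray, b: np.ndarray, alpha: float = 0.05):
    """Paired comparison of matched threshold arrays (one value per
    subject x SF).  Falls back to the Wilcoxon signed-rank test when
    Shapiro-Wilk rejects normality of the paired differences."""
    _, p_norm = stats.shapiro(a - b)
    if p_norm < alpha:
        return stats.wilcoxon(a, b)      # non-parametric paired test
    return stats.ttest_rel(a, b)         # parametric alternative
```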

3 Results

For each subject, there were 70 outcome measures of the distance threshold, the smallest distance at which the target could be distinguished as being in front of the other two non-targets, across all conditions. The mean distance threshold across all subjects was computed for each condition. Distance sensitivity is the reciprocal of the distance threshold (m−1). Since the goal of the study is to characterize the trend of depth perception with different combinations of texture defined by spatial frequency, descriptive analysis was performed.

3.1 Condition A (textured target against a gray background)

Figure 5A shows the mean distance threshold for textured targets against a gray background. If the spatial frequency of the target texture did not contribute to depth perception, the expected result would be a flat horizontal line; instead, the results showed a clear trend. The mean distance thresholds were 34.55, 29.98, 25.42, 21.18, and 33.4 cm for 0.33, 0.66, 1, 3, and 4.5 cpd, respectively (right y-axis). The distance sensitivities (m−1) were 3.13, 3.66, 4.24, 6.12, and 3.27, respectively (left y-axis). Higher distance sensitivity, i.e., lower distance threshold, indicates that subjects could distinguish a smaller distance separation between the target and non-targets. On average, subjects had the lowest distance thresholds, or peak sensitivity, when the targets were presented at 3 cpd. Each color and symbol represents an individual subject. Most subjects could not give reliable responses when the targets were presented at 4.5 cpd; they reported that the 4.5 cpd target resembled a “fuzzy” gray texture, making it hard to distinguish from the gray background. Based on the resolution and field of view of the headset, 1° of visual angle subtends about 10 pixels, while a 4.5 cpd texture requires a minimum of 9 pixels per degree with alternating black and white pixels; 4.5 cpd is therefore at the verge of the display's resolution limit. Staircases were terminated when the distance threshold exceeded 1.5 m, and those data points were excluded from the analysis.

Fig. 5
figure 5

Distance thresholds for different spatial frequency textures. A Condition A (textured target against gray background). The distance thresholds obtained from the response are converted to distance sensitivity (1/thresholds). From the mean across subjects, the sensitivity curve had a peak performance at the mid-SF range, 3 cpd. B Condition B (gray target against textured background). The overall sensitivity is lower with non-textured targets. The peak sensitivity occurs in the background with low spatial frequency texture. Different points in the plots represent the results of each subject. The ribbon diagram on the top of each plot is an illustration of the target and background in each condition. The vertical scale on the right side of each plot is the distance threshold in virtual distance units (cm)

3.2 Condition B (gray target against a textured background)

Figure 5B shows the results for non-textured gray targets against a textured background. The mean distance thresholds were 42.70, 39.97, 48.67, 50.41, and 71.83 cm for 0.33, 0.66, 1, 3, and 4.5 cpd, respectively (right y-axis). The mean distance sensitivities were 2.60, 2.74, 2.16, 2.07, and 1.41, respectively (left y-axis). If the spatial frequency of the background texture did not contribute to depth perception, the expected result would be a flat horizontal line; instead, subjects performed best with lower SF textures in the background. Subjects again had difficulty with the task when the background was presented at 4.5 cpd. Despite using the same combinations of SFs as Condition A, with target and background reversed, the distributions of distance thresholds pooled across all SFs differed significantly between the two conditions (Wilcoxon signed-rank test, p < 0.001). The orange curve (Fig. 5B) lies below the red curve (Fig. 5A), revealing poorer performance with the non-textured target.

3.3 Condition C (textured target + textured background)

Figure 6A shows the mean distance sensitivity and distance thresholds across all observers for different background textures (background SF increasing from the left to the right column). The peak sensitivity remained robust at 3 cpd across the different background textures. Note that the “inverted U-shape” of each curve also changed with the background texture. A heat map was generated to illustrate the pattern across the twenty-five target-background texture combinations (see the red square in Fig. 6B). We treated the gray texture as the maximum SF and extended the axes accordingly: an extremely high SF (dense) texture would exceed the resolution of the display and render as the mean luminance, which is gray. The result from Condition A is added as the rightmost column and the result from Condition B as the topmost row. Overall, subjects performed best with target and background textures within 1–3 cpd, indicated by the brighter region in the middle, and performed poorly when the target was presented at 4.5 cpd or gray. In each column, the color changes from bottom to top, showing how the SF of the target affected depth perception for a given background. To determine whether background or target texture plays the larger role, the standard deviations of the distributions for each row and column within the red box were computed. The standard deviations for each row (bottom to top) were 3.31, 2.31, 3.57, 4.19, and 6.03 for target SFs of 0.33–4.5 cpd, while those for each column (left to right) were 4.19, 7.40, 8.84, 9.74, and 9.61 for background SFs of 0.33–4.5 cpd. The larger standard deviations within columns suggest that changing the SF of the target texture had a larger impact than changing that of the background.
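The row/column comparison reduces to standard deviations along the two axes of the 5 × 5 sensitivity grid; a sketch of the computation (the function name and axis convention are ours):

```python
import numpy as np

def texture_influence(sens: np.ndarray):
    """sens[i, j]: mean distance sensitivity for target SF i (row, bottom
    to top) against background SF j (column, left to right), i.e. the
    5 x 5 grid inside the red box of Fig. 6B."""
    row_sd = sens.std(axis=1)  # fix target SF, vary the background
    col_sd = sens.std(axis=0)  # fix background SF, vary the target
    # Larger column SDs mean that varying the target texture moves
    # performance more than varying the background texture.
    return row_sd, col_sd
```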

Fig. 6
figure 6

Results for Condition C. A Mean distance sensitivity and distance thresholds for different target textures against different background SFs (background SF increasing from left to right). The pattern of peak sensitivity in the mid-SF range remains robust across backgrounds, while the curvature of the “inverted U-shape” plot changes with the background. B The left panel replotted as a heatmap for better visualization. The red squares outline the results for Condition C with the different target-background pairings. The results from Condition A (gray background) and Condition B (gray target) were added as the rightmost column and topmost row, respectively. It was impossible to discriminate the target from the background in the gray target-gray background pairing; therefore, we did not test it and inserted a black square. The ‘*’ symbol indicates incomplete data across subjects due to difficulty performing the task under those conditions. Overall, the colors are brighter in the middle region; a localized brighter region indicates better depth judgment performance, or higher sensitivity, with mid-SF textures

3.4 Modelling the effect of target and background texture based on spatial frequency processing in human vision

All modeling is based on transforming distance sensitivity to log units, in line with other studies. To fit these data, we assumed that performance for a given background condition (a column of Fig. 6) would resemble results from prior experiments measuring disparity sensitivity. Specifically, with a gray background (Fig. 5A), the paradigm is comparable to that of Reynaud et al. (2015), who measured sensitivity to depth corrugations and modeled their data with a truncated log parabola. This function captures a band-pass shape that falls off at high spatial frequencies and plateaus at low spatial frequencies, like the spatial contrast sensitivity function (CSF) (Robson 1966; Watson and Ahumada 2005). The high-spatial-frequency limit of the disparity sensitivity function (DSF) has been reported in the depth corrugation literature (Bradshaw and Rogers 1999; Ding and Levi 2011; Peterzell et al. 2017; Serrano-Pedraza and Read 2010; Tyler 1973). This limit, between 3 and 5 cpd, is proposed to be a byproduct of the cross-correlation used in computing disparity information (Filippini and Banks 2009).

This model, adapted from Reynaud et al. (2015), has four parameters: peak spatial frequency fmax, peak disparity sensitivity γmax, bandwidth β, and γlow, which flattens sensitivity at low spatial frequencies. The truncated log parabola fits our results from the gray background condition well, as shown in Fig. 7F. However, with a textured background the function can no longer adequately describe the data: with low or middle SF backgrounds, performance at high target SFs plateaus, suggesting a dependence on the background. For example, lowering the SF of the background texture increases the saliency of the target, as shown in Fig. 5B. To capture this effect, we introduce a double-truncated log parabola, which adds a fifth parameter, γhigh, to flatten sensitivity at high target SFs. The importance of γhigh is discussed further in the Discussion section.

$$S^{\prime}\left( f \right) = \log_{10} \left( \gamma_{max} \right) - k\left( \frac{\log_{10} \left( f \right) - \log_{10} \left( f_{max} \right)}{\beta^{\prime}/2} \right)^{2}, \quad \text{where}\;k = \log_{10} \left( 2 \right)\;\text{and}\;\beta^{\prime} = \log_{10} \left( 2\beta \right)$$
$$S\left( f \right) = \log_{10} \left( \gamma_{low} \right)\quad\text{if}\;f < f_{max}\;\text{and}\;S^{\prime}\left( f \right) < \log_{10} \left( \gamma_{low} \right)$$
$$S\left( f \right) = \log_{10} \left( \gamma_{high} \right)\quad\text{if}\;f > f_{max}\;\text{and}\;S^{\prime}\left( f \right) < \log_{10} \left( \gamma_{high} \right)$$
$$S\left( f \right) = S^{\prime}\left( f \right)\quad\text{otherwise}$$
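A compact implementation of this five-parameter function, together with an illustrative fit to the gray background means from Condition A (the initial guesses and bounds are our assumptions; the study fit each subject and background condition separately):

```python
import numpy as np
from scipy.optimize import curve_fit

def dtlp(f, f_max, g_max, beta, g_low, g_high):
    """Double-truncated log parabola: log10 sensitivity vs. target SF."""
    k = np.log10(2.0)
    beta_p = np.log10(2.0 * beta)
    s = np.log10(g_max) - k * ((np.log10(f) - np.log10(f_max)) / (beta_p / 2)) ** 2
    s = np.where((f < f_max) & (s < np.log10(g_low)), np.log10(g_low), s)    # low-SF plateau
    s = np.where((f > f_max) & (s < np.log10(g_high)), np.log10(g_high), s)  # high-SF plateau
    return s

# Mean sensitivities for the gray background (Condition A, Fig. 5A).
sf = np.array([0.33, 0.66, 1.0, 3.0, 4.5])
log_sens = np.log10([3.13, 3.66, 4.24, 6.12, 3.27])

p0 = [2.0, 5.0, 2.0, 2.0, 1.0]            # fmax, gmax, beta, glow, ghigh
params, _ = curve_fit(dtlp, sf, log_sens, p0=p0, bounds=(1e-6, np.inf))
```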
Fig. 7
figure 7

Double-truncated log parabola fit for mean result. Each column shows a different background texture (from 0.33 to 4.5 cpd and a gray background). The horizontal red dashed line indicates conditions where the subjects were not able to do the task. The change in the peak of the fit (x-coordinate, y-coordinate) is illustrated in Fig. 8

The optimal parameters for each subject and each background condition were determined independently using nonlinear least-squares fitting. Figure 7 shows the fit to the mean sensitivity across subjects, with each column representing a different background; the red horizontal lines mark texture combinations at which subjects were unable to do the task. As in Reynaud et al. (2015), the γlow and β parameters showed no meaningful trend with spatial frequency and were excluded from further analysis. The remaining three parameters (fmax, γmax, γhigh) showed several characteristic effects of the background on the DSF, as shown in Fig. 8. While Fig. 7 plots one fit per background condition (one for each column in Fig. 6A), Fig. 8 summarizes the trend of each parameter across conditions. The fmax parameter decreases from approximately 2 to 1.5 cpd as background spatial frequency increases, producing a slight leftward shift of the peak of each function in Fig. 7 from the leftmost to the rightmost column. γmax reflects the relative height of each function in Fig. 7; specifically, most subjects had the highest sensitivity with a 1 cpd background. Lastly, γhigh decreases as background SF increases, with a slight peak at 0.66 cpd; its effect can be seen in the shrinking of the rightmost flat portion of each curve in Fig. 7 from the leftmost to the rightmost column.

Fig. 8
figure 8

Summary of the best fits for the three relevant parameters (fmax, γmax, γhigh). fmax is the spatial frequency of peak sensitivity for each plot in Fig. 7, γmax is the peak of each plot in Fig. 7, and γhigh is the plateau sensitivity at high SFs. A The fmax parameter changes with background SF, peaking at 0.66 cpd before declining at high SF backgrounds. B The γmax parameter has a bandpass shape and peaks at the 1.00 cpd background. C The γhigh parameter decreases as background SF increases, with a slight peak at 0.66 cpd. The individual fits for each subject are represented by different colored points. The solid lines indicate the average parameter values across subjects at each background SF, with the shaded regions indicating ± one standard deviation. The horizontal red dashed line indicates where the subjects were not able to do the task

4 Discussion

The results show that texture of different spatial frequencies affects depth perception in a virtual environment, whether on the surface of the target or in its surroundings. First, against a gray background, distance judgments were better for targets presented with mid-spatial-frequency texture, and performance was significantly worse with a non-textured target. Across different combinations of target and background textures, the SF characteristics of the target modulated the thresholds more strongly than those of the background. We fitted the results with a double-truncated log parabola based on spatial frequency characteristics of human perception. The additional parameter, γhigh, was introduced to account for the contribution of the background texture, which can aid discrimination when information from the target texture is limited.

Are simple models based on known SF properties sufficient to describe our results? Several simple models of the interaction between target and background SF properties are described below. The heatmap in Fig. 6B is replotted on a log-log scale with interpolation in Fig. 9G.

1. Model 1 (Fig. 9A): A background with lower SF gives an illusion of being far away. Naturally, far-away objects are perceived as blurred and less detailed, so a high SF target texture against a low SF background texture may produce a blur-mediated bias, as reported by another study (Brown and Weisstein 1988). The hypothetical result is a heatmap with a bright gradient region on the left side (Fig. 9A), where performance is better with lower SF background texture; the brighter region represents better depth judgment, or lower distance thresholds. Refer to the Supplementary Material for the computation.

2. Model 2 (Fig. 9B): Blur-mediated bias is present only when the background has a lower SF than the target. The hypothetical result is a heatmap in which performance is better in the conditions above the diagonal line (background SF lower than target SF), as in Fig. 9B.

3. Model 3 (Fig. 9C): The camouflage effect (matching background and target textures) causes poorer performance. Studies have shown that segregating a figure from its background is easier when the textures differ in luminance (Julesz 1962), color (Saarela and Landy 2012), orientation (Peterzell et al. 2017), or size. When the target and background are similar, visual search errors increase and reaction times lengthen (Neider and Zelinsky 2006; Nothdurft 1992). Figure-background textures with a similar pattern may appear as a continuous surface (Caputo 1996) or as camouflage (Song and Geng 2010), which blends the target with the background and makes it harder to segment the figure. Since the texture patterns comprise band-limited noise, a dissimilarity index can be defined as the arithmetic difference between the target and background spatial frequencies. The hypothetical result is a heatmap with the worst performance along the identity diagonal and better performance when the two textures are at opposite ends of the spectrum (low SF target texture against high SF background texture and vice versa), as shown by the brighter corners in Fig. 9C.

4. Model 4 (Fig. 9D-F): Depth judgment is tuned to the spatial frequencies of the target and background textures. The SF tuning to the target can be modeled as a Gaussian with peak sensitivity at the target's middle SFs, shown by the bright horizontal band in Fig. 9D; the SF tuning to the background is illustrated by the bright vertical band in Fig. 9E. If subjects were tuned to both SFs (target and background) during the task, the resultant peak sensitivity would be a localized bright spot, as shown in Fig. 9F. A sketch generating these hypothetical predictions follows this list.
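The four predictions can be generated in a few lines of NumPy (a sketch on a log2 SF axis; the Gaussian width and the ~2 cpd peak of Model 4 are illustrative assumptions, and each map is only defined up to a monotonic scaling):

```python
import numpy as np

sfs = np.array([0.33, 0.66, 1.0, 3.0, 4.5])        # texture SFs (cpd)
bg, tg = np.meshgrid(np.log2(sfs), np.log2(sfs))   # columns: background; rows: target

model1 = -bg                              # Fig. 9A: better with lower SF background
model2 = np.where(bg < tg, 1.0, 0.0)      # Fig. 9B: blur bias only when bg SF < target SF
model3 = np.abs(tg - bg)                  # Fig. 9C: worst when SFs match (dissimilarity index)

def gaussian(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

peak = np.log2(2.0)                       # assumed mid-SF peak (~2 cpd)
model4 = gaussian(tg, peak) * gaussian(bg, peak)   # Fig. 9F: separable target x background tuning
```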

Fig. 9
figure 9

Comparison between a few simple models based on SF properties and our results. A Model 1: A background with lower SF gives a blur-mediated cue, resulting in better performance (brighter region) with a low SF background. B Model 2: Blur-mediated bias is present only when the background has a lower SF than the target. C Model 3: The camouflage effect (matching background and target textures) causes poorer performance. Performance, or sensitivity, worsens when the target and background have similar SF textures, along the diagonal dashed line, and is highest when the target and background have opposite SFs, in the two corners. D Depth judgment tuned to the spatial frequency of the target texture only, resulting in a bright horizontal band. E Depth judgment tuned to the spatial frequency of the background texture only, resulting in a bright vertical band. F Model 4: Depth judgment tuned to the spatial frequencies of both the target and background textures. G Our results from Fig. 6B replotted as an interpolated surface plot on a log-log scale. Our results bear partial resemblance to the tuning model (F)

Based on the behavioral results (Fig. 9G), the contributions of target and background texture are not as straightforward as any of these simple models. When there is no background texture, the SF property of the target is the sole cue and shows a tuning characteristic, as in Condition A and Fig. 9D. However, as shown in Fig. 8A, the spatial frequency of the peak, fmax, shifts with the background; the tuning is therefore not as simple as the naïve tuning model (Fig. 9D). At high SFs, the target texture becomes a less effective cue, and the background becomes informative, as shown by the trend in γhigh (Fig. 8C). Indeed, Fig. 8C shares the same trend as Fig. 5B, reaffirming that γhigh captures the contribution of the background texture.

Textured patterns trigger vergence movements that facilitate depth perception (Kidd et al. 1979). We speculate that a target without spatial modulation (no texture) cannot elicit responses from disparity-selective neurons, so the visual system must rely on information from the local contrast gradient at the edge of the background texture. This could involve a different mechanism of depth perception, possibly independent of SF channels. It has been suggested that depth perception depends linearly on the available cues, with the weight of each cue adjusted according to its reliability (Young et al. 1993). In this scheme, when the target texture cue is unavailable, the background texture and edge cues would be upweighted. In addition, when the target and background have similar texture patterns, the abrupt change in luminance contrast at the edge may nullify the camouflage effect (Troscianko et al. 2009), especially for low SF textures.

While we have shown that textures improve depth judgment in VR, an extremely high SF texture may not be beneficial unless paired with a lower SF background. The finest usable texture is also limited by the resolution, or pixel density, of the display, and current VR headsets are not dense enough to render very fine detail at high SFs. Mid-SF textures may therefore be the best choice for a good stereo experience.

5 Conclusion

Manipulating the detail of textures in virtual reality affects depth judgments. Disparity information derived from texture, especially at middle spatial frequencies, optimally allows users to gauge depth and distance in a virtual environment. Since stereo mechanisms integrate spatial frequency information across the visual scene, the background texture may improve or worsen the ability to discriminate depth. Beyond its aesthetic value, texture can thus be used by VR content creators and graphic designers, in different target-background combinations, to support more accurate depth judgments.