Abstract
The perception of pictorial surfaces has been studied quantitatively for more than 20 years. During this time, the "gauge figure method" has proven to be a fast and intuitive way to quantify pictorial relief. In this method, observers adjust the attitude of a gauge figure so that it appears to lie flat on a surface in pictorial space. Although the method has received substantial attention in the literature and has become increasingly popular, a clear, step-by-step description has not yet been published. This article provides such a description: stimulus and sample preparation, running the experiment, and reconstructing a 3-D surface from the experimental data. Furthermore, software (written in PsychToolbox) based on this description is provided in an online supplement. The report serves three purposes. First, it helps experimenters who want to use the gauge figure task but have been unable to implement it because of the lack of information in the literature. Second, the detailed description can guide the design of software for other platforms, possibly Web-based. Third, the method described in this article is extended to objects with holes and inner contours, a class of objects that has not yet been investigated with the gauge figure task.
Introduction
Almost 20 years ago, Koenderink, van Doorn, and Kappers (1992) published a key study on quantifying the perceived 3-D structure of a pictorial surface. The experimental task was to adjust the attitude of a gauge figure probe. The gauge figure consists of a circle with a rod that sticks out perpendicularly from the middle. Observers are instructed to manipulate the 3-D orientation of this probe such that the disk appears to lie flat on the pictorial surface, with the rod consequently pointing in the normal direction. Since then, numerous researchers have used this paradigm to study the visual perception of 3-D shape. Yet the community of scientists using the method has been limited to those who understand the underlying mathematics and are able to implement it experimentally; detailed documentation has never been published in complete form. This article explains all steps of the procedure in detail. Furthermore, software written for PsychToolbox (Brainard, 1997; Pelli, 1997) is made available that covers all of these steps, so that any user of PsychToolbox can conduct gauge figure experiments.
Method
The procedure for running an experiment is visualized in Fig. 1. Each of the steps can be described as follows.
Contour selection
After selecting a stimulus image, the experimenter needs to select which part of the pictorial surface is to be used for the experiment.
Triangulation
Within the contour, measurement samples need to be defined. This is done using a triangulation grid.
Experiment
After it is set up, the experiment can be conducted. The gauge figure probe should be rendered in the picture, and the observer should be able to manipulate its attitude.
3-D reconstruction
On the basis of the observers’ settings, the 3-D surface can be reconstructed. These data are the final result; further analysis will depend on the specific research question, and should thus be designed by the experimenter.
These four steps will now be explained in detail.
Contour selection
First, an image is needed in which some shape is visible, preferably a smooth one. In previous research, only an outer contour was defined. The method presented here also allows for defining a hole and inner contours. These three types of contours are visualized in Fig. 2. The experimenter defines a contour manually by selecting contour sample points that form a polygon approximating the actual contour. The distance between the contour sample points can be of approximately the same size as the triangulation faces; a sampling of contour points as shown by the red dots in Fig. 3 is thus sufficiently detailed.
The output of the contour procedure consists of three sets of coordinates, for the outer, inner, and hole contours, which can be written in n-by-2 arrays. A point on the contour can be written as \( {\mathbf{c}}_j^q = \left( {x_j^q,y_j^q} \right) \), where q defines the contour type by the letter o, h, or i (for outer, hole, or inner contour), and j is the index. For example, the first inner contour point is \( {\mathbf{c}}_1^{\text{i}} = \left( {x_1^{\text{i}},y_1^{\text{i}}} \right) \). For the outer and hole contours, the last element (say, n and m, respectively) equals the first: \( {\mathbf{c}}_1^{\text{o}} = {\mathbf{c}}_n^{\text{o}} \) and \( {\mathbf{c}}_1^{\text{h}} = {\mathbf{c}}_m^{\text{h}} \).
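For concreteness, the following minimal MATLAB sketch shows how such arrays might be stored. The variable names Co, Ch, and Ci (and the coordinate values) are illustrative only, not taken from the supplemental software:

```matlab
% Illustrative contour storage: n-by-2 arrays of (x, y) pixel coordinates.
Co = [120 340; 180 310; 240 330; 120 340];  % outer contour: last point equals first
Ch = [160 322; 200 318; 180 305; 160 322];  % hole contour: also closed
Ci = [130 335; 150 328; 170 331];           % inner contour: an open curve
```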
Triangulation
Based on the contour data, the triangulation can be defined. To this end, a triangular grid is used that in principle covers the whole screen. However, only points within the outer contour that are not within the hole contour should be used and displayed (at this stage, inner contours are neglected). This requires an algorithm to test whether a point is between the outer and hole contours.
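As an illustration of such a grid, the following MATLAB sketch generates the points of a triangular grid covering a w-by-h screen by shifting every other row by half the grid spacing. This is a minimal sketch under those assumptions, not the genPoints.m function from the supplement:

```matlab
function P = triGrid(w, h, d, tx, ty)
% Triangular grid covering a w-by-h screen (pixels).
% d: grid distance; tx, ty: translation of the whole grid.
dy  = d * sqrt(3) / 2;             % vertical distance between rows
P   = zeros(0, 2);
row = 0;
for y = 0:dy:h
    xOff = mod(row, 2) * d / 2;    % shift every other row by half the spacing
    xs   = (0:d:w) + xOff;
    P    = [P; xs(:) + tx, repmat(y + ty, numel(xs), 1)]; %#ok<AGROW>
    row  = row + 1;
end
end
```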
Contour-enclosed points filtering
As can be seen in Fig. 3, a simple rule can be defined to assess whether a point \( \mathbf{p}_i \) is within the closed contours: When a horizontal line is drawn from the point in the positive x direction, it intersects the closed contour a number of times. If this number is odd, \( \mathbf{p}_i \) is within the closed contours. Thus, the number of intersections needs to be calculated.

First, we need to select contour point pairs whose y-coordinates enclose the y-coordinate of the point. In Fig. 3, these pairs are \( \{\mathbf{c}_i, \mathbf{c}_{i+1}\} \) and \( \{\mathbf{c}_j, \mathbf{c}_{j+1}\} \). Now we can define straight lines through these point pairs. A straight line through two subsequent contour points \( \mathbf{c}_i = (x_i, y_i) \) and \( \mathbf{c}_{i+1} = (x_{i+1}, y_{i+1}) \) can be defined as \( y = ax + b \), with \( a = (y_{i+1} - y_i)/(x_{i+1} - x_i) \) and \( b = y_i - ax_i \). The x-coordinate of the intersection point \( \mathbf{p}_1' \) can now be calculated (the y-coordinates of \( \mathbf{p}_1 \) and \( \mathbf{p}_1' \) are obviously the same): \( p_1^{\prime x} = (p_1^y - b)/a \). Since the rule counts only intersections in the positive x direction from the point, an intersection is counted only when \( p_1^{\prime x} - p_1^x > 0 \).

When this procedure is performed for all points, we get one intersection for \( \mathbf{p}_1 \) (odd, so this point is inside the closed contours), two intersections for \( \mathbf{p}_2 \) (outside), and so forth. Testing whether a point is within the closed contours is computationally laborious. Therefore, the procedure may start with a preselection of the triangulation positions that lie within the bounding rectangle of the outer contour, as shown by the outer dotted line in Fig. 3.
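A minimal MATLAB sketch of this odd-crossings test might look as follows (illustrative; the supplemental software implements it in pointsInContour.m). The intersection x-coordinate is computed from the same line equation, rearranged so that the division is by \( y_{i+1} - y_i \), which the enclosure test guarantees to be nonzero:

```matlab
function inside = inContour(p, C)
% Odd-crossings test: count intersections of the rightward horizontal
% ray from p with the closed contour C (n-by-2, last row = first row).
nInt = 0;
for i = 1:size(C, 1) - 1
    y1 = C(i, 2);  y2 = C(i + 1, 2);
    if (y1 <= p(2)) ~= (y2 <= p(2))           % edge spans the ray's y level
        % x-coordinate of the intersection (y2 ~= y1 is guaranteed here)
        xInt = C(i, 1) + (p(2) - y1) * (C(i + 1, 1) - C(i, 1)) / (y2 - y1);
        if xInt - p(1) > 0                    % count rightward crossings only
            nInt = nInt + 1;
        end
    end
end
inside = mod(nInt, 2) == 1;                   % odd number: point is inside
end
```

For a point between the outer and hole contours, this test should succeed for the outer contour and fail for the hole contour.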
Adjusting the triangulation grid size and position
It can be useful for the experimenter to adjust the grid size and position of the triangulation in real time. This can be done using the procedures described above, which are also implemented in the supplemental software. In the software, the position can be adjusted with the mouse and grid size by the arrow keys, as is illustrated in Fig. 4a, but other types of implementations may be equally user friendly.
a An initial triangulation is shown at startup. The experimenter can adjust the position with the mouse and can increase (up arrow) or decrease (down arrow) the triangle size. b When the experimenter is satisfied with the settings, the final version can be shown with barycentres and additional cuts in the mesh by the inner contour (see the Calculating Faces and Barycentres and Performing Final Point Filtering section)
Calculating faces and barycentres and performing final point filtering
Up to now, only points that span the triangular grid have been used, without explicit face numbering. However, the reconstruction algorithm requires explicit information about the vertex numbers that constitute the triangles. Each face (triangle) is defined by three vertices, so a definition of all faces comprises an m-by-3 matrix, with a set of three numbers in each row that refer to the vertices. Note that the number of faces does not equal the number of vertices.
The faces can be calculated with a brute-force method, which is why this algorithm is not used during the real-time adjustment of grid size and location. Having defined the individual triangles, we can calculate the barycentres of the faces, which are the actual sample points where the gauge figure will be rendered.
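A brute-force search can, for instance, test every triple of vertices and accept those whose pairwise distances all equal the grid distance within a tolerance. The sketch below is one way to do this in MATLAB (requires R2016b or later for implicit expansion); the names are illustrative, and it is not the facit.m of the supplement:

```matlab
function [F, B] = facesAndBarycentres(P, d)
% P: n-by-2 vertex positions; d: grid distance.
% F: m-by-3 face (vertex index) matrix; B: m-by-2 barycentres.
n   = size(P, 1);
tol = 0.1 * d;                                        % edge-length tolerance
D   = hypot(P(:,1) - P(:,1)', P(:,2) - P(:,2)');      % pairwise distances
F   = zeros(0, 3);
for i = 1:n-2
    for j = i+1:n-1
        if abs(D(i,j) - d) > tol, continue; end
        for k = j+1:n
            if abs(D(i,k) - d) < tol && abs(D(j,k) - d) < tol
                F = [F; i j k]; %#ok<AGROW>           % found a triangle
            end
        end
    end
end
B = (P(F(:,1),:) + P(F(:,2),:) + P(F(:,3),:)) / 3;    % barycentres
end
```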
Lastly, triangles crossing the inner contours need to be filtered out. To this end, a line intersection algorithm is needed. The basic question is, do the line segments between two point pairs \( \{\mathbf{p}_1, \mathbf{p}_2\} \) and \( \{\mathbf{p}_3, \mathbf{p}_4\} \), as shown in Fig. 5, intersect? This can be calculated by parameterizing lines through these points: \( \mathbf{v}_{12}(t) = t(\mathbf{p}_2 - \mathbf{p}_1) + \mathbf{p}_1 \) and \( \mathbf{v}_{34}(s) = s(\mathbf{p}_4 - \mathbf{p}_3) + \mathbf{p}_3 \). When \( 0 < t < 1 \), \( \mathbf{v}_{12}(t) \) lies between \( \mathbf{p}_1 \) and \( \mathbf{p}_2 \), and the equivalent holds for the parameter s. The intersection parameters can be found by solving \( \mathbf{v}_{12}(t) = \mathbf{v}_{34}(s) \). If both parameters lie between zero and one, the segments intersect.
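Solving \( \mathbf{v}_{12}(t) = \mathbf{v}_{34}(s) \) is a 2-by-2 linear system. A minimal MATLAB sketch (with illustrative names) could be:

```matlab
function doCross = segmentsIntersect(p1, p2, p3, p4)
% Solve v12(t) = v34(s), i.e. t*(p2-p1) - s*(p4-p3) = p3 - p1,
% and report whether both parameters lie strictly between 0 and 1.
A = [p2(:) - p1(:), -(p4(:) - p3(:))];   % 2-by-2 system matrix
if abs(det(A)) < eps                     % parallel or degenerate segments
    doCross = false;
    return
end
ts = A \ (p3(:) - p1(:));                % ts = [t; s]
doCross = all(ts > 0 & ts < 1);
end
```

Applying this test between each triangle edge and each inner-contour (and, as noted below, hole-contour) segment marks the faces to be removed.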
As can be seen in Fig. 4, it can happen that a triangle line crosses the hole, which is unwanted. To overcome this problem, the hole contour should be included in the last filtering procedure. When the experimenter is satisfied with the triangulation parameters, the points (vertices), faces, and barycentres should be saved. A screen shot of the final result from the supplemental software is also exported, for later reference (see Fig. 4b).
The experiment
During the experiment, a gauge figure is presented successively at all barycentres, in random order. Figure 6 shows a screen shot of the experiment. The 3-D rotation of the gauge figure can be implemented as follows. The circle, of radius r, is defined by a polygon of n points that lie in the (x, y) plane, centered at the origin. The rod has length r and sticks out in the z direction. Although the circle and rod are defined in 3-D coordinates, only the x- and y-coordinates are used for the actual rendering (orthographic projection). Changing the slant and tilt of the gauge figure is achieved by rotating all coordinates around the y- and z-axes, respectively (the order is important, since rotations do not commute). The rotations can be implemented with rotation matrices.
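The following MATLAB sketch illustrates this construction for one attitude; the drawing itself would use only the first two columns of the rotated coordinates. The radius, polygon resolution, and example slant and tilt values are illustrative assumptions:

```matlab
r = 30; n = 24;                                  % radius and polygon resolution
phi  = linspace(0, 2*pi, n)';
disk = [r*cos(phi), r*sin(phi), zeros(n, 1)];    % circle in the (x, y) plane
rod  = [0 0 0; 0 0 r];                           % rod of length r along z

sigma = pi/6; tau = pi/4;                        % example slant and tilt
Ry = [cos(sigma) 0 sin(sigma); 0 1 0; -sin(sigma) 0 cos(sigma)];  % slant
Rz = [cos(tau) -sin(tau) 0; sin(tau) cos(tau) 0; 0 0 1];          % tilt
R  = Rz * Ry;                                    % slant first, then tilt

diskR = disk * R';  rodR = rod * R';             % rotate the row vectors
% Orthographic rendering: draw diskR(:,1:2) and rodR(:,1:2), translated
% to the barycentre of the current trial.
```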
The observer needs control over the attitude—that is, the slant and tilt of the gauge figure. A simple interface for this control is to use the mouse position, as is illustrated in Fig. 6. The location (x, y) of the mouse is tracked with respect to the middle of the screen. The tilt τ is defined by arctan(y / x), and the slant σ is defined by the distance \( g\sqrt {{{x^2} + {y^2}}} \), where g denotes a gain to tune the sensitivity of the mouse. To prevent confusion about where the mouse position starts, it is recommended that the gauge figure start at the (τ, σ) = (0, 0) position.
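In MATLAB/PsychToolbox this mapping takes only a few lines. The sketch below assumes a window pointer window, screen-centre coordinates cx and cy, and a gain g; it uses atan2, which extends arctan(y / x) to all quadrants. Clipping the slant at 90 deg is an added assumption, not part of the description above:

```matlab
[mx, my] = GetMouse(window);      % PsychToolbox mouse query
x = mx - cx;  y = my - cy;        % position relative to the screen centre
tau   = atan2(y, x);              % tilt
sigma = g * hypot(x, y);          % slant, scaled by the gain g
sigma = min(sigma, pi/2);         % assumed clip: slant of at most 90 deg
```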
Surface reconstruction
The data from the experiment are attitudes that can be interpreted as depth gradients: the local change of depth in the x and y directions. This can be seen as follows. The gauge figure can be regarded as the normal vector on a local plane (the triangle is a small plane) and can be written as a function of tilt τ and slant σ: \( \mathbf{n}(\tau, \sigma) = (\cos\tau \sin\sigma, \sin\tau \sin\sigma, \cos\sigma) \). This normal vector defines a plane, as is illustrated in Fig. 7. The equation for this plane is \( n_x x + n_y y + n_z z = d \), where d is some unknown depth offset. It can readily be seen from this equation that the relative depth difference between two points is \( z_2 - z_1 = -[n_x(x_2 - x_1) + n_y(y_2 - y_1)]/n_z \). Similarly, the depth difference \( z_3 - z_1 \) can be calculated, while \( z_3 - z_2 \) evidently follows from the other two differences and is thus omitted. Thus, one experimental setting results in two depth differences.
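In code, one gauge setting therefore reduces to a few lines. The sketch below assumes a setting (tau, sigma) measured at a triangle with vertices (x1, y1), (x2, y2), and (x3, y3):

```matlab
% Normal vector from tilt and slant.
nrm  = [cos(tau)*sin(sigma), sin(tau)*sin(sigma), cos(sigma)];
% The two depth differences implied by this setting.
dz21 = -(nrm(1)*(x2 - x1) + nrm(2)*(y2 - y1)) / nrm(3);   % z2 - z1
dz31 = -(nrm(1)*(x3 - x1) + nrm(2)*(y3 - y1)) / nrm(3);   % z3 - z1
```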
A system of linear equations is built from these depth differences. The basic idea is to construct a matrix M that fulfills the equation \( \mathbf{Mz} = \Delta\mathbf{z} \), in which \( \Delta\mathbf{z} \) is a vector containing all of the depth differences and z contains all depth values. This can be done by using the matrix in Eq. 1.
To reconstruct a continuous surface, one needs to constrain all triangles to be connected with their neighbours. This means that an edge shared by two triangles [e.g., (3, 5) in Fig. 7] results in two equations with possibly different depth differences, one from the depth gradient of triangle (2, 3, 5) and the other from triangle (3, 5, 6). This needs to be accounted for in the system of equations from Eq. 1. To ensure that the reconstructed surface has zero average depth (the depth offset is irrelevant anyway), a last row consisting of all 1s is added to the matrix. This boundary condition implies that \( \sum z_i = 0 \). The reconstruction can now be calculated by taking the pseudo-inverse of the matrix M: \( \mathbf{z} = \tilde{\mathbf{M}}^{-1}\Delta\mathbf{z} \). The software outputs a 3-D plot of the reconstructed surface (as shown in Fig. 8) and the 3-D vertices as a text file, which can be used for further analysis. A different version of the reconstruction algorithm has previously been published by Nefs (2008).
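As a sketch of the whole reconstruction step, the following MATLAB fragment assembles M and solves for the depths. It assumes faces F (m-by-3) and a depth-difference vector dz ordered as two entries per face, matching the row construction of M; the names are illustrative, and it is not the reconstruction.m of the supplement:

```matlab
n = max(F(:));                      % number of vertices
m = size(F, 1);                     % number of faces
M = zeros(2*m + 1, n);
for f = 1:m
    i = F(f, 1); j = F(f, 2); k = F(f, 3);
    M(2*f - 1, [i j]) = [-1 1];     % row encoding z_j - z_i
    M(2*f,     [i k]) = [-1 1];     % row encoding z_k - z_i
end
M(end, :) = 1;                      % boundary condition: sum(z) = 0
dz = [dz(:); 0];                    % matching zero on the right-hand side
z  = pinv(M) * dz;                  % least-squares depths (pseudo-inverse)
```

Shared edges automatically contribute two, possibly inconsistent, equations, which the least-squares solution reconciles.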
Discussion
This article gives a complete description of the implementation of the gauge figure method. This does not mean that no input from the user is needed. To start with, users should be cautious about using large holes or inner contours in the stimuli. As may have become clear, the reconstruction algorithm integrates the depth gradients, and the stability of this integration depends on the sampling. Some noise in the observer settings is automatically filtered out, because the reconstruction algorithm imposes an integrable surface (see Note 1): The shared edges of the faces are joined; that is, the surface is "continuous." An example of a shared edge is (5, 3) in Fig. 7, while (5, 2) is an outer edge. When there are relatively few shared edges, as in the arm of the bodybuilder in Fig. 9, the integration may become unstable: Instead of averaging out, the observer noise is integrated and may yield unreliable results.
Two potential issues have been raised by researchers using the gauge figure method. Firstly, “the gauge orientation task suffers from several limitations. First, there is no way to distinguish between errors due to misunderstanding the orientation of the shape and errors due to misunderstanding the orientation of the gauge” (Cole et al., 2009). These authors imply that the mental image of the stimulus contains attitude information that is matched to the attitude information of the mental image of the gauge figure. However, the mental image is (evidently) in the mind, and therefore whether it contains attitude information is unknown. As Koenderink, van Doorn, Kappers, and Todd (2001) pointed out, “pictorial relief can only be defined operationally”; that is, the perceived attitude can only be defined through a task. In physics, a ruler measures distance, and a clock measures time. There is no way of distinguishing the time on the clock and the actual time, because time is operationally defined by the clock. Similarly, the perceived attitude is operationalized by the gauge figure probe. Hence, the point raised by Cole et al. (2009) does not seem to be an issue, in the context of Koenderink et al. (2001). The second issue was raised by an anonymous reviewer: “When confronted with a reconstruction of the shape that they [the observers] reported using the gauge figures, they quickly identify locations where the reported shape does not correspond to the shape that they actually intended to report.” Indeed, this can be an issue, but only with respect to the reconstruction itself, and not with respect to the task. It is debatable whether one should let observers view a real-time reconstruction of the pictorial relief while doing the task. One may as well let observers use 3-D modeling software to reproduce pictorial shapes, or perform a shape discrimination task. All of these tasks may be useful to study vision, but they ignore the concept of operationalizing pictorial relief by using a (i.e., any) gauge figure probe.
It should also be noted that other experimental tasks exist for measuring pictorial surface structure (Koenderink et al., 2001). However, the gauge figure task is the most widely used and seems to be the most intuitive task for observers. Using different methods may nevertheless be worthwhile, since the gauge figure task is essentially based on first-order (orientation) estimates; tasks that probe zeroth-order (Koenderink et al., 2001) or second-order information may give different insights into pictorial relief. Furthermore, pictorial surfaces are just one geometric facet of pictorial space. Recently, a method to quantify the spatial layout of objects in pictorial space was developed (Wijntjes & Pont, 2010), in which observers used a pointer to point from one object to another; these data were then used to reconstruct the relative depth differences in pictorial space. The combination of pictorial surface and spatial layout methods should increase our understanding of pictorial space.
The analysis of the results is to be defined by the user of the method described in this article. There is much literature that can serve as background material, and essentially, the way that the data are analysed depends on the specific research question. The method described in this report merely provides the means to start with a picture and arrive at the 3-D reconstructed relief.
Notes
1. This assumption is met when the curl of the normal vector field vanishes identically (Koenderink et al., 1992). In most (if not all) studies, the integrability assumption was justified, but this evidently depends on the stimulus. It is not unthinkable that for some visual stimuli, the vector field will be nonintegrable.
References
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. doi:10.1163/156856897X00357
Cole, F., Sanik, K., DeCarlo, D., Finkelstein, A., Funkhouser, T., Rusinkiewicz, S., et al. (2009). How well do line drawings depict shape? ACM Transactions on Graphics, 28(3). doi:10.1145/1531326.1531334
Koenderink, J. J., van Doorn, A. J., & Kappers, A. M. L. (1992). Surface perception in pictures. Perception & Psychophysics, 52, 487–496. doi:10.3758/BF03206710
Koenderink, J. J., van Doorn, A. J., Kappers, A. M. L., & Todd, J. T. (2001). Ambiguity and the “mental eye” in pictorial relief. Perception, 30, 431–448. doi:10.1068/p3030
Nefs, H. T. (2008). Three-dimensional object shape from shading and contour disparities. Journal of Vision, 8(11), 11:1–16.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442. doi:10.1163/156856897X00366
Wijntjes, M. W. A., & Pont, S. C. (2010). Pointing in pictorial space: Quantifying the perceived relative depth structure in mono and stereo images of natural scenes. ACM Transactions on Applied Perception, 7(4), 24:1–8. doi:10.1145/1823738.1823742
Author Note
This work was supported by a grant from the Netherlands Organization for Scientific Research (NWO). Supplemental materials may be downloaded from the author’s website: www.maartenwijntjes.nl/gaugefigure.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Appendix
The algorithms described above have been implemented in ready-to-use software written with PsychToolbox for MATLAB. Visit the PsychToolbox webpage (http://psychtoolbox.org) for installation instructions. New users are advised to familiarize themselves with the PsychToolbox demos to understand the general structure of the toolbox, although this should not be necessary for using the gauge figure software described here. The supplemental software package, an instruction movie, and documented instructions can be downloaded from www.maartenwijntjes.nl/gaugefigure. For practical instructions, the instruction document is recommended.
The software package consists of four main .m files, one for each of the four procedural steps shown in Fig. 1.
Contour creation
The contour can be manually defined by registering mouse clicks on the contour. The mouse position is tracked continuously, and after each click the position is appended to the matrix holding the contour coordinates (here, the outer contour):
![figure d](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figd_HTML.gif)
(The ellipsis in the listing stands for irrelevant code.) The nested conditional checks whether the new location is within a threshold distance from the start location, which is indexed as the third location. If this is true, the loop is exited: The outer contour is finished, and the program continues with the other contours.
Triangulation
The triangulation starts by generating a triangular grid, which is done by the auxiliary .m file genPoints.m (included in the software package):
![figure j](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figj_HTML.gif)
The input parameters are the width and height of the screen in pixels (the latter obtained from the screen query in triangulation.m), the grid distance, and a translation in the x and y directions. The function is called in triangulation.m with the mouse position supplying the translation; the grid distance can be adjusted with the arrow keys:
![figure u](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figw_HTML.gif)
The triangulation points have to be filtered for being inside the contour. This is done by the auxiliary file pointsInContour.m. This file first restricts the points to the smallest rectangle that encloses the outer contour and arranges the contour points in pairs. Then it performs the final filtering:
![figure aa](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figaa_HTML.gif)
The first nested loop selects the contour pairs that enclose the y value of the candidate point, using betweenY.m. The second nested loop performs the intersection test described in the Contour-Enclosed Points Filtering section. Lastly, the program tests whether the number of intersections is odd.
After the filtering, the inner points should be connected by mesh lines, which is done by meshit.m; this function basically searches for point pairs whose distance equals the grid distance. Up to now, point filtering has been done in real time, so that the experimenter can see the triangulation while manually adjusting the grid size and translation. When the experimenter is satisfied with the triangulation, the faces are calculated by facit.m. These faces are then filtered for triangles that intersect with the inner contours, as described in the Calculating Faces and Barycentres and Performing Final Point Filtering section:
![figure ag](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figag_HTML.gif)
Experiment
In the experiment, the gauge figure has to be rendered at the (randomly ordered) barycentres. The gauge figure is defined in three dimensions, which makes it easy to rotate. Here, the disk and the rod are defined:
![figure ai](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figai_HTML.gif)
During a trial, the mouse position defines the slant and tilt:
![figure am](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figam_HTML.gif)
The sensitivity of the slant can be tuned by a gain parameter. The slant and tilt then define the rotation matrix:
![figure ap](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figap_HTML.gif)
Since the gauge figure is still three-dimensional, the barycentre location is used as a 3-D translation. In the drawing loop, the disk is rotated and translated to the barycentre, and the same is done for the rod.
Reconstruction
The experimental data (n trials) are in the form of an n-by-4 matrix, with the first two columns for the barycentres and last two columns for the slant and tilt. The first thing that reconstruction.m does is transform these data into normal vectors, and then into gradients:
![figure aw](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figaw_HTML.gif)
These depth gradients need to be converted to relative depth differences, as described in Eq. 1. For each triangle, the depth difference between the first vertex and the other two vertices is calculated.
![figure ax](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figax_HTML.gif)
Note that a zero is appended to the vector to satisfy the boundary condition \( \sum z_i = 0 \) (the overall depth is zero). Now the matrix M has to be defined. This is done as follows:
![figure az](http://media.springernature.com/lw685/springer-static/image/art%3A10.3758%2Fs13428-011-0127-3/MediaObjects/13428_2011_127_Figaz_HTML.gif)
Again, note the appended row of ones to satisfy \( \sum z_i = 0 \). Finally, the depths are calculated through the pseudo-inverse of M. The remainder of reconstruction.m writes the 3-D vertices to a data file and produces a visualization of the results.