Introduction

Almost 20 years ago, Koenderink, van Doorn, and Kappers (1992) published a key study on quantifying the perceived 3-D structure of a pictorial surface. Their experimental task was to adjust the attitude of a gauge figure probe. The gauge figure consists of a circle with a rod that sticks out perpendicularly from its middle. Observers are instructed to manipulate the 3-D orientation of this probe so that the disk appears to lie flat on the pictorial surface, with the rod consequently sticking out in the normal direction. Since then, numerous researchers have used this paradigm to study the visual perception of 3-D shape. Yet the community of scientists using the method has been limited to those who understand the underlying mathematics and are able to implement it experimentally; complete, detailed documentation has never been published. This article explains all steps of the procedure in detail. Furthermore, software written for PsychToolbox (Brainard, 1997; Pelli, 1997) is made available that covers all of these steps, so that any user of PsychToolbox can conduct gauge figure experiments.

Method

The procedure for running an experiment is visualized in Fig. 1. Each of the steps can be described as follows.

Fig. 1 Illustration of the four procedural steps

Contour selection

After selecting a stimulus image, the experimenter needs to select which part of the pictorial surface is to be used for the experiment.

Triangulation

Within the contour, measurement samples need to be defined. This is done using a triangulation grid.

Experiment

After it is set up, the experiment can be conducted. The gauge figure probe should be rendered in the picture, and the observer should be able to manipulate its attitude.

3-D reconstruction

On the basis of the observers’ settings, the 3-D surface can be reconstructed. These data are the final result; further analysis will depend on the specific research question, and should thus be designed by the experimenter.

These four steps will now be explained in detail.

Contour selection

First, an image is needed. Some shape should be visible, preferably a smooth one. In previous research, only an outer contour was defined; the method presented here also allows for defining a hole and inner contours. These three types of contours are visualized in Fig. 2. The experimenter defines a contour manually by selecting contour sample points that form a polygon approximating the actual contour. The spacing between the contour sample points can be approximately the same as the size of the triangulation faces, so a sampling like that shown by the red dots in Fig. 3 is sufficiently detailed.

Fig. 2 Definition of the three types of contours

Fig. 3 Visualization of the algorithm that tests whether a point is within the closed contours

The output of the contour procedure consists of three sets of coordinates, for the outer, inner, and hole contours, which can be written in n-by-2 arrays. A point on the contour can be written as \( {\mathbf{c}}_j^q = \left( {x_j^q,y_j^q} \right) \), where q defines the contour type by the letter o, h, or i (for outer, hole, or inner contour), and j is the index. For example, the first inner contour point is \( {\mathbf{c}}_1^{\text{i}} = \left( {x_1^{\text{i}},y_1^{\text{i}}} \right) \). For the outer and hole contours, the last element (say, n and m, respectively) equals the first: \( {\mathbf{c}}_1^{\text{o}} = {\mathbf{c}}_n^{\text{o}} \) and \( {\mathbf{c}}_1^{\text{h}} = {\mathbf{c}}_m^{\text{h}} \).
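For concreteness, such a data layout might look as follows. This is a minimal sketch in Python/NumPy; the supplemental software itself is written for PsychToolbox, so the names and coordinate values here are purely illustrative:

```python
import numpy as np

# Hypothetical storage of the three contour types as n-by-2 arrays of
# (x, y) image coordinates. The outer and hole contours are closed, so
# their last point repeats their first; inner contours may remain open.
contours = {
    "outer": np.array([[10., 10.], [200., 15.], [210., 180.], [12., 170.], [10., 10.]]),
    "hole":  np.array([[80., 80.], [120., 85.], [115., 120.], [80., 80.]]),
    "inner": np.array([[140., 40.], [160., 60.], [150., 90.]]),
}

# Closure conditions c_1^o = c_n^o and c_1^h = c_m^h from the text:
assert np.array_equal(contours["outer"][0], contours["outer"][-1])
assert np.array_equal(contours["hole"][0], contours["hole"][-1])
```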

Triangulation

On the basis of the contour data, the triangulation can be defined. To this end, a triangular grid is used that in principle covers the whole screen. However, only points that lie within the outer contour and not within the hole contour should be used and displayed (at this stage, inner contours are neglected). This requires an algorithm that tests whether a point lies between the outer and hole contours.
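As an illustration, such a screen-covering triangular grid could be generated along the following lines (a Python/NumPy sketch; the function name, parameters, and spacing convention are our own, not taken from the supplemental software):

```python
import numpy as np

def triangular_grid(width, height, spacing, offset=(0.0, 0.0)):
    """Vertices of a triangular lattice covering a width-by-height screen.

    Rows are spacing * sqrt(3)/2 apart, and every other row is shifted by
    half the spacing, which yields equilateral triangles.
    """
    dy = spacing * np.sqrt(3) / 2.0
    points = []
    for row, y in enumerate(np.arange(0.0, height + dy, dy)):
        shift = (row % 2) * spacing / 2.0
        for x in np.arange(0.0, width + spacing, spacing):
            points.append((x + shift + offset[0], y + offset[1]))
    return np.array(points)
```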

Contour-enclosed points filtering

As can be seen in Fig. 3, a simple rule can be used to assess whether a point \( {\mathbf{p}}_i \) is within the closed contours: When a horizontal ray is drawn from the point in the positive x direction, it intersects the closed contour a number of times. If this number is odd, \( {\mathbf{p}}_i \) is within the closed contours. Thus, the number of intersections needs to be calculated.

First, we need to select contour point pairs whose y-coordinates enclose the y-coordinate of the point. In Fig. 3, these pairs are \( \{{\mathbf{c}}_i, {\mathbf{c}}_{i+1}\} \) and \( \{{\mathbf{c}}_j, {\mathbf{c}}_{j+1}\} \). Now we can define straight lines through these point pairs. A straight line through two subsequent contour points \( {\mathbf{c}}_i = (x_i, y_i) \) and \( {\mathbf{c}}_{i+1} = (x_{i+1}, y_{i+1}) \) can be defined as \( y = ax + b \), with \( a = (y_{i+1} - y_i)/(x_{i+1} - x_i) \) and \( b = y_i - ax_i \). The x-coordinate of the intersection point \( {\mathbf{p}}_1^{\prime} \) can now be calculated (the y-coordinates of the points \( {\mathbf{p}}_1 \) and \( {\mathbf{p}}_1^{\prime} \) are obviously the same): \( p_1^{\prime x} = (p_1^y - b)/a \). Since the rule states that only intersections from the point in the positive x direction are to be counted, an intersection is counted only when \( p_1^{\prime x} - p_1^x > 0 \).

When this procedure is performed for all points, we get one intersection for \( {\mathbf{p}}_1 \) (an odd number, so this point is inside the closed contours), two intersections for \( {\mathbf{p}}_2 \) (outside), and so forth. Testing whether a point is within the closed contours is computationally laborious. Therefore, the procedure may include an initial selection of only those triangulation positions that lie within the bounding rectangle of the outer contour, as is shown by the outer dotted line in Fig. 3.
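A compact implementation of this test might look as follows (a Python sketch; it solves for the intersection x-coordinate per segment directly, which is equivalent to the y = ax + b formulation above but also survives vertical segments):

```python
import numpy as np

def inside_closed_contour(p, contour):
    """Ray-casting test: count intersections of the ray from p in the
    positive x direction with a closed contour (an n-by-2 array whose
    last point repeats the first); an odd count means p is inside."""
    x, y = p
    crossings = 0
    for (x1, y1), (x2, y2) in zip(contour[:-1], contour[1:]):
        # Keep only segments whose y-range encloses the point's y-coordinate.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this segment crosses the horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:  # count crossings in the positive x direction only
                crossings += 1
    return crossings % 2 == 1

def between_outer_and_hole(p, outer, hole):
    """Keep points inside the outer contour but outside the hole contour."""
    return inside_closed_contour(p, outer) and not inside_closed_contour(p, hole)
```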

Adjusting the triangulation grid size and position

It can be useful for the experimenter to adjust the grid size and position of the triangulation in real time. This can be done using the procedures described above, which are also implemented in the supplemental software. In the software, the position is adjusted with the mouse and the grid size with the arrow keys, as is illustrated in Fig. 4a, but other implementations may be equally user friendly.

Fig. 4 a An initial triangulation is shown at startup. The experimenter can adjust the position with the mouse and can increase (up arrow) or decrease (down arrow) the triangle size. b When the experimenter is satisfied with the settings, the final version can be shown with barycentres and additional cuts in the mesh by the inner contour (see the Calculating Faces and Barycentres and Performing Final Point Filtering section)

Calculating faces and barycentres and performing final point filtering

Up to now, only points that span the triangular grid have been used, without explicit face numbering. However, the reconstruction algorithm requires explicit information about the vertex numbers that constitute the triangles. Each face (triangle) is defined by three vertices, so a definition of all faces comprises an m-by-3 matrix, with a set of three numbers in each row that refer to the vertices. Note that the number of faces does not equal the number of vertices.

The faces can be calculated with a brute-force method; because this is computationally expensive, it is not done during the real-time adjustment of grid size and location. Having defined the individual triangles, we can calculate the barycentres of the faces, which are the actual sample points where the gauge figure will be rendered.
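Given a face matrix, the barycentres reduce to a vertex average. The sketch below (Python/NumPy) obtains the faces with a Delaunay triangulation from SciPy rather than the brute-force search described above; for a regular triangular grid, both approaches should yield the same m-by-3 face matrix:

```python
import numpy as np
from scipy.spatial import Delaunay

def faces_and_barycentres(vertices):
    """Face list (m-by-3 vertex indices) and barycentres for 2-D points.

    The barycentre of each triangle is simply the mean of its three
    vertices; these are the sample points where the gauge figure is shown.
    """
    faces = Delaunay(vertices).simplices
    barycentres = vertices[faces].mean(axis=1)
    return faces, barycentres
```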

Lastly, triangles crossing the inner contours need to be filtered out. To this end, a line intersection algorithm is needed. The basic question is, do the line segments between two point pairs \( \{{\mathbf{p}}_1, {\mathbf{p}}_2\} \) and \( \{{\mathbf{p}}_3, {\mathbf{p}}_4\} \), as shown in Fig. 5, intersect? This can be calculated by parameterizing the lines through these point pairs as \( {\mathbf{v}}_{12}(t) = t({\mathbf{p}}_2 - {\mathbf{p}}_1) + {\mathbf{p}}_1 \) and \( {\mathbf{v}}_{34}(s) = s({\mathbf{p}}_4 - {\mathbf{p}}_3) + {\mathbf{p}}_3 \). When \( 0 < t < 1 \), \( {\mathbf{v}}_{12}(t) \) lies between \( {\mathbf{p}}_1 \) and \( {\mathbf{p}}_2 \), and the equivalent holds for the parameter s. The intersection parameters can be found by solving \( {\mathbf{v}}_{12}(t) = {\mathbf{v}}_{34}(s) \). If both intersection parameters are between zero and one, the segments intersect.

Fig. 5 Illustration of two intersecting lines
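In code, solving \( {\mathbf{v}}_{12}(t) = {\mathbf{v}}_{34}(s) \) amounts to a 2-by-2 linear system, as in this Python sketch (the tolerance for parallel segments is an arbitrary choice of ours):

```python
import numpy as np

def segments_intersect(p1, p2, p3, p4, eps=1e-12):
    """Solve v12(t) = v34(s); the segments intersect when 0 < t < 1 and 0 < s < 1."""
    p1, p2, p3, p4 = map(np.asarray, (p1, p2, p3, p4))
    # Rearranged: t*(p2 - p1) + s*(p3 - p4) = p3 - p1.
    A = np.column_stack((p2 - p1, p3 - p4))
    if abs(np.linalg.det(A)) < eps:
        return False  # parallel or degenerate segments: no single crossing
    t, s = np.linalg.solve(A, p3 - p1)
    return 0.0 < t < 1.0 and 0.0 < s < 1.0
```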

As can be seen in Fig. 4, it can happen that a triangle edge crosses the hole, which is unwanted. To overcome this problem, the hole contour should be included in the last filtering procedure. When the experimenter is satisfied with the triangulation parameters, the points (vertices), faces, and barycentres should be saved. A screen shot of the final result from the supplemental software is also exported, for later reference (see Fig. 4b).

The experiment

During the experiment, a gauge figure is presented successively at all barycentres, in random order. In Fig. 6, a screen shot of the experiment is shown. The 3-D rotation of the gauge figure can be implemented as follows. The circle, of radius r, is defined by a polygon of n points that lie in the (x, y) plane, centered at the origin. The rod has length r and sticks out in the z direction. Although the circle and rod are defined in 3-D coordinates, only the x- and y-coordinates are used for the actual rendering (orthographic projection). Changing the slant and tilt of the gauge figure can be achieved by rotating all coordinates around the y- and z-axes, respectively (the order is important, since rotations do not commute: first the slant rotation, then the tilt rotation). Rotations can be implemented using rotation matrices.
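A possible implementation of this rotation is sketched below in Python/NumPy (this is not the PsychToolbox code itself; the radius, point count, and names are arbitrary):

```python
import numpy as np

def gauge_figure_2d(slant, tilt, r=20.0, n=32):
    """2-D (orthographic) projection of the gauge figure at a given attitude.

    The circle (an n-point polygon in the x-y plane) and the rod (length r,
    along z) are rotated first about the y-axis by the slant and then about
    the z-axis by the tilt; only x and y are kept for rendering.
    """
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    circle = np.column_stack((r * np.cos(phi), r * np.sin(phi), np.zeros(n)))
    rod = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, r]])

    Ry = np.array([[ np.cos(slant), 0.0, np.sin(slant)],
                   [ 0.0,           1.0, 0.0          ],
                   [-np.sin(slant), 0.0, np.cos(slant)]])
    Rz = np.array([[np.cos(tilt), -np.sin(tilt), 0.0],
                   [np.sin(tilt),  np.cos(tilt), 0.0],
                   [0.0,           0.0,          1.0]])
    R = Rz @ Ry  # slant first, then tilt
    return (circle @ R.T)[:, :2], (rod @ R.T)[:, :2]
```

With this order of rotations, the rod tip lands at r(cos τ sin σ, sin τ sin σ, cos σ), which is exactly along the normal n(τ, σ) defined in the Surface Reconstruction section.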

Fig. 6 The experiment. On the left, a typical setting of the gauge figure during the experiment is shown. In the middle, the slant and tilt are defined. On the right, the relation between the mouse position and the slant/tilt parameters is explained

The observer needs control over the attitude (i.e., the slant and tilt) of the gauge figure. A simple interface for this control is the mouse position, as is illustrated in Fig. 6. The location (x, y) of the mouse is tracked with respect to the middle of the screen. The tilt τ is defined by arctan(y / x), and the slant σ by the distance \( g\sqrt{x^2 + y^2} \), where g denotes a gain that tunes the sensitivity of the mouse. To prevent confusion about where the mouse position starts, it is recommended that the gauge figure start at the (τ, σ) = (0, 0) position.
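This mapping could be sketched as follows (Python; atan2 is used as the quadrant-aware form of arctan(y / x), and the gain value is only an example):

```python
import numpy as np

def mouse_to_attitude(mouse_xy, screen_centre, gain=0.01):
    """Map a mouse position, relative to the screen centre, to (tilt, slant)."""
    x = mouse_xy[0] - screen_centre[0]
    y = mouse_xy[1] - screen_centre[1]
    tilt = np.arctan2(y, x)        # arctan(y / x), correct in all quadrants
    slant = gain * np.hypot(x, y)  # g * sqrt(x^2 + y^2)
    return tilt, slant
```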

Surface reconstruction

The data from the experiment are attitudes that can be interpreted as depth gradients: the local change of depth in the x and y directions. This can easily be seen by noting the following. The gauge figure can be regarded as the normal vector on a local plane (the triangle is a small plane) and can be written as a function of tilt and slant: \( {\mathbf{n}}(\tau, \sigma) = (\cos\tau \sin\sigma, \sin\tau \sin\sigma, \cos\sigma) \). This normal vector defines a plane, as is illustrated in Fig. 7. The equation of this plane is \( n_x x + n_y y + n_z z = d \), where d is some unknown depth offset. It can readily be seen from this equation that the relative depth difference between two points is \( (z_2 - z_1) = -[n_x(x_2 - x_1) + n_y(y_2 - y_1)]/n_z \). Similarly, the depth difference \( (z_3 - z_1) \) can be calculated, while the depth difference \( (z_3 - z_2) \) evidently follows from the other two and is thus omitted. Thus, one experimental setting results in two depth differences.
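As a worked sketch (Python; the function and variable names are ours), one gauge setting on a triangle yields the two depth differences like this:

```python
import numpy as np

def depth_differences(slant, tilt, v1, v2, v3):
    """Two depth differences for one triangle from one gauge setting.

    v1, v2, v3 are the triangle's 2-D vertex positions. The attitude gives
    the normal n, and dz = -(n_x * dx + n_y * dy) / n_z along each edge
    from v1. Assumes slant < 90 deg, so that n_z is nonzero.
    """
    nx = np.cos(tilt) * np.sin(slant)
    ny = np.sin(tilt) * np.sin(slant)
    nz = np.cos(slant)
    dz21 = -(nx * (v2[0] - v1[0]) + ny * (v2[1] - v1[1])) / nz
    dz31 = -(nx * (v3[0] - v1[0]) + ny * (v3[1] - v1[1])) / nz
    return dz21, dz31  # (z2 - z1) and (z3 - z1); (z3 - z2) is their difference
```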

Fig. 7 Left: A single triangle with depth difference that is based on the depth gradient from the gauge figure. Right: Vertex numbering of the triangle faces

From these depth differences, a system of linear equations can be constructed. The basic idea is to construct a matrix M that fulfills the equation M z = ∆z, in which ∆z is a vector of all the depth differences and z is the vector of all depth values. This can be done using the matrix in Eq. 1.

$$ \left( {\begin{array}{*{20}{c}} 1 & { - 1} & 0 & 0 & \ldots \\ 1 & 0 & { - 1} & 0 & \ldots \\ 1 & 0 & 0 & { - 1} & \ldots \\ \vdots & \vdots & \vdots & \vdots & \ldots \\ 0 & 1 & { - 1} & 0 & \ldots \\ 0 & 1 & 0 & { - 1} & \ldots \\ \vdots & \vdots & \vdots & \vdots & \ddots \\ 1 & 1 & 1 & 1 & {} \\ \end{array} } \right)\left( {\begin{array}{*{20}{c}} {{z_1}} \\ {{z_2}} \\ {{z_3}} \\ {{z_4}} \\ \vdots \\ \end{array} } \right) = \left( {\begin{array}{*{20}{c}} {{z_1} - {z_2}} \\ {{z_1} - {z_3}} \\ {{z_1} - {z_4}} \\ \vdots \\ {{z_2} - {z_3}} \\ {{z_2} - {z_4}} \\ \vdots \\ 0 \\ \end{array} } \right) $$
(1)

To reconstruct a continuous surface, one needs to constrain all triangles to be connected with their neighbours. This means that a triangle edge shared by two triangles [e.g., (3, 5) in Fig. 7] results in two equations with possibly different depth differences, one from the depth gradient of triangle (2, 3, 5) and the other from triangle (3, 5, 6). This needs to be accounted for in the system of equations from Eq. 1. To ensure that the reconstructed surface has zero average depth (the depth offset is irrelevant anyway), a last row consisting of all 1s is added to the matrix. This boundary condition implies that \( \sum\nolimits_i z_i = 0 \). The reconstruction can now be calculated by taking the pseudo-inverse of the matrix M: \( {\mathbf{z}} = {\mathbf{\tilde{M}}}^{-1}\Delta{\mathbf{z}} \). The output of the software is a 3-D plot of the reconstructed surface (as shown in Fig. 8) and the 3-D vertices as a text file, which can be used for further analysis. A different version of the reconstruction algorithm has previously been published by Nefs (2008).
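A minimal reconstruction along these lines might read as follows (a Python/NumPy sketch; the edge bookkeeping is simplified, and shared edges simply contribute two rows, one per adjacent triangle's gauge setting):

```python
import numpy as np

def reconstruct_depths(n_vertices, edges, dz):
    """Least-squares depths from edge-wise depth differences.

    `edges` is a sequence of (i, j) vertex-index pairs with measured
    differences dz = z_i - z_j (one row per gauge setting and edge, so a
    shared edge appears twice). A final all-ones row enforces sum(z) = 0,
    and the pseudo-inverse gives the least-squares solution of M z = dz.
    """
    M = np.zeros((len(edges) + 1, n_vertices))
    for row, (i, j) in enumerate(edges):
        M[row, i] = 1.0
        M[row, j] = -1.0
    M[-1, :] = 1.0                 # zero-average-depth boundary condition
    b = np.append(np.asarray(dz, dtype=float), 0.0)
    return np.linalg.pinv(M) @ b   # z, with sum(z) = 0
```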

Fig. 8 Resulting 3-D relief superimposed on the picture

Discussion

This article gives a complete description of the implementation of the gauge figure method. This does not mean that no input from the user is needed. To start with, users should be cautious about using large holes or inner contours in the stimuli. As may have become clear, the reconstruction algorithm integrates the depth gradients, and the stability of this integration depends on the sampling. Some noise in the observer settings is automatically filtered out, because the reconstruction algorithm imposes an integrable surface: The shared edges of the faces are joined; that is, the surface is “continuous.” An example of a shared edge is (5, 3) in Fig. 7, while (5, 2) is an outer edge. When there are relatively few shared edges, as in the arm of the bodybuilder in Fig. 9, the integration may become unstable: Instead of averaging out, the observer noise is integrated and may yield unreliable results.

Fig. 9 Example of a stimulus in which the integration may become unstable

Two potential issues have been raised by researchers using the gauge figure method. Firstly, “the gauge orientation task suffers from several limitations. First, there is no way to distinguish between errors due to misunderstanding the orientation of the shape and errors due to misunderstanding the orientation of the gauge” (Cole et al., 2009). These authors imply that the mental image of the stimulus contains attitude information that is matched to the attitude information of the mental image of the gauge figure. However, the mental image is (evidently) in the mind, and therefore whether it contains attitude information is unknown. As Koenderink, van Doorn, Kappers, and Todd (2001) pointed out, “pictorial relief can only be defined operationally”; that is, the perceived attitude can only be defined through a task. In physics, a ruler measures distance, and a clock measures time. There is no way of distinguishing the time on the clock from the actual time, because time is operationally defined by the clock. Similarly, the perceived attitude is operationalized by the gauge figure probe. Hence, the point raised by Cole et al. (2009) does not seem to be an issue, in the context of Koenderink et al. (2001).

The second issue was raised by an anonymous reviewer: “When confronted with a reconstruction of the shape that they [the observers] reported using the gauge figures, they quickly identify locations where the reported shape does not correspond to the shape that they actually intended to report.” Indeed, this can be an issue, but only with respect to the reconstruction itself, not with respect to the task. It is debatable whether one should let observers view a real-time reconstruction of the pictorial relief while doing the task. One may as well let observers use 3-D modeling software to reproduce pictorial shapes, or perform a shape discrimination task. All of these tasks may be useful for studying vision, but they ignore the concept of operationalizing pictorial relief by using a (i.e., any) gauge figure probe.

It should also be noted that other experimental tasks exist for measuring pictorial surface structure (Koenderink et al., 2001). However, the gauge figure task is the most widely used and seems to be the most intuitive task for observers. Using different methods may nevertheless be worthwhile, since the gauge figure task is essentially based on first-order (orientation) estimates; tasks that probe zeroth-order (Koenderink et al., 2001) or second-order information may give different insights into pictorial relief. Furthermore, pictorial surfaces are just one geometric facet of pictorial space. Recently, a method to quantify the spatial layout of objects in pictorial space was developed (Wijntjes & Pont, 2010). In this method, observers are instructed to use a pointer to point from one object to another, and these data are then used to reconstruct the relative depth differences in pictorial space. The combination of pictorial surface and spatial layout methods should increase our understanding of pictorial space.

The analysis of the results is to be defined by the user of the method described in this article. There is much literature that can serve as background material, and essentially, the way that the data are analysed depends on the specific research question. The method described in this report merely provides the means to start with a picture and arrive at the 3-D reconstructed relief.