All faces are created equal, in the sense that they all contain the same set of facial features: two eyes, a nose, and a mouth. Moreover, these features are arranged in a common configuration in which the eyes are situated above the central nose feature, which is located above the mouth. Therefore, our ability to quickly and accurately differentiate one face from another must depend on resolving the relatively subtle differences in their featural and configural qualities. In this article, we examine the roles of featural and configural information in face perception and how their contributions might vary as a function of their location in the upper (eye) and lower (mouth) halves of the face stimulus.

In the face perception literature, it has been well established that inversion disproportionately impairs recognition of faces, relative to the discrimination of other, nonface objects (e.g., airplanes, stick figures, birds, cars; Yin, 1969). The so-called face inversion effect is a robust phenomenon that has been demonstrated across a wide variety of study–test paradigms in which the encoding study item, the test item, or both the study and test items are shown in their inverted orientations. The effect has been reported in tests of perceptual matching (Goffaux & Rossion, 2007; Riesenhuber, Jarudi, Gilad, & Sinha, 2004), immediate memory (Leder & Carbon, 2006; Rhodes, Brake, & Atkinson, 1993), and long-term recognition memory (Valentine & Bruce, 1986). Some face recognition researchers have suggested that understanding how inversion affects face perception might provide insight into the processes that uniquely distinguish face recognition from other forms of object recognition (Valentine, 1988). A critical question then is, Are featural and configural information differentially impaired by inversion?

This question has proven to be both theoretically controversial and empirically challenging. On one side of the issue, proponents of the configural-processing view (Footnote 1) argue that sensitivity to the spatial relations between the features of a face (e.g., distance between the eyes or distance between the nose and mouth) is differentially disrupted through inversion (Freire, Lee, & Symons, 2000; Le Grand, Mondloch, Maurer, & Brent, 2001; Maurer, Le Grand, & Mondloch, 2002; Mondloch, Le Grand, & Maurer, 2002). For example, in discrimination tasks, face stimuli were constructed such that they varied in either their configural information (i.e., the distance between the eyes and the distance between the nose and mouth) or their featural information (i.e., the shape of the eye and mouth features). When the faces were inverted, discrimination accuracy decreased and reaction times increased for faces that differed in their configural information, but not for faces that differed in their featural information. These results suggested that inversion differentially disrupts the coding of configural information in a face. Other studies have shown that configural information is more vulnerable to the effects of inversion than are color or luminance changes (Barton, Keenan, & Bass, 2001; Leder & Bruce, 1998, 2000).

Subsequent studies have indicated that inversion seems to cause a selective rather than a global disruption of configural information. For instance, changes in the vertical placement of the eyes are more difficult to detect in an inverted face than are changes in the horizontal spacing between the eyes (Goffaux & Rossion, 2007; Sekunova & Barton, 2008). Inversion also differentially distorts configural information in the lower, mouth half of the face relative to configural changes in the upper, eye half (Malcolm, Leung, & Barton, 2004).

On the other side of the issue, advocates of the “featural plus configural” view have argued that nothing is special about the disruption of the configural properties of a face through inversion. According to this position, the features (Footnote 2) and configuration of a face are equally vulnerable to the influence of inversion when the featural and configural discriminations are equated for difficulty in a face’s upright orientation (McKone & Yovel, 2009; Riesenhuber et al., 2004; Yovel & Kanwisher, 2004). Indeed, when Yovel and Kanwisher (2004) controlled for baseline differences between featural and configural information in upright orientations, they found that the facial features in an inverted face were as difficult to discriminate as the configural spacing between the features. Others have argued that inversion does not differentially impair the perception of configural over featural information, but equally disrupts the perception of both types of information (Sekuler, Gaspar, Gold, & Bennett, 2004).

Comparing featural and configural discriminations in inverted faces presents several empirical challenges. First, featural and configural forms of information are intrinsically intertwined in the stimulus, such that changes to the features of a face (e.g., shape or size) produce concomitant changes in the configural distances between the features (e.g., altering the shape of the eyes produces subtle changes in the distance between the nose and mouth features; Sergent, 1984). Second, it is not obvious how the configural spacing should be calculated. One approach is to compute spacing as the distance from one feature to the closest edge of its neighboring feature. Alternatively, spacing can be calculated by measuring the distances between the centroids of the features (i.e., centers of mass). Manipulating feature size would be expected to affect the features’ edge-to-edge relations, whereas changing a feature’s shape should shift its centroid. Despite such complexities, Yovel and Kanwisher (2004) claimed that the effect of inversion on featural and configural discriminations can be fairly compared if the two judgments are (1) approximately matched for difficulty in their upright orientations and (2) calibrated for sensitivity to avoid ceiling and floor effects.

Toward this goal, the face dimensions task was developed, in which featural and configural dimensions were manipulated independently in the eye and mouth regions in a step-wise fashion (Bukach, Le Grand, Kaiser, Bub, & Tanaka, 2008). The discrimination of featural and configural information was equated in the upright orientation, and discrimination was then tested in the inverted orientation (see also Malcolm et al., 2004). The advantage of this approach was that we could investigate the separate contributions of information type (featural vs. configural) and spatial region (eye vs. mouth) to the perception of upright and inverted faces. In previous studies, the face dimensions task has been successfully applied to examine the face perception strategies of individuals with autism (Wolf et al., 2008), patients with prosopagnosia (Bukach et al., 2008; Rossion, Le Grand, Kaiser, Bub, & Tanaka, 2009), and infants (Quinn & Tanaka, 2009; Quinn, Tanaka, Lee, Pascalis, & Slater, 2013).

In the following experiments, featural information was manipulated in two ways: In Experiment 1, we scaled the size of the eyes and mouth features, and in Experiment 2, we changed the shape by morphing the eyes (or mouth) of one exemplar with the eyes (or mouth) of another. In Experiment 3, the effects of inversion on the perception of featural and configural information in nonface objects (houses) were examined. In Experiment 4, we employed a spatial cueing procedure to direct participants’ attention to either the eye or mouth regions of the face. The collective findings from these experiments showed that inversion impaired the perception of information in the lower (mouth) region of faces but not of nonface objects, and that the inversion effect could be ameliorated via a spatial cueing manipulation.

Experiment 1: upright and inverted faces with featural size changes

The goal of Experiment 1 was to test the perception of featural and configural information in upright and inverted faces. In a previous study, Malcolm et al. (2004) had tested the discrimination of configural, featural size, and external contour information in upright and inverted faces. Participants were shown three faces and were asked to identify the face that differed from the other two. Although they found an overall inversion effect, misorientation impaired the discrimination of configural changes more than the discrimination of contour and featural changes, and this impairment was more pronounced for configural mouth discriminations than for configural eye discriminations. We developed the face dimensions task, in which the featural and configural information in the eye and mouth regions was independently and incrementally manipulated. For the featural dimension, the size of the eyes and the size of the mouth were parametrically varied to produce five levels of change. The manipulation of feature size has the advantage of preserving the feature’s shape and absolute location, while only minimally affecting the edge-to-edge distances between features. For the configural dimension, the distance between the eyes or the distance between the center of the upper lip (i.e., the philtrum) and the bottom of the nose was parametrically varied, to produce five levels of configural change. We anticipated that performance should vary monotonically as a function of the faces’ separation along the face dimension, in that faces that differed by three steps along a given dimension should be relatively “easy” to discriminate, faces that differed by two steps should be “intermediate” in their discriminability, and faces that differed by one step should be the most “difficult” to discriminate.
If inversion differentially impairs configural information over featural information, we would expect to find an interaction between face orientation (upright vs. inverted) and information type (featural vs. configural). In contrast, if inversion differentially impairs information in the mouth region of the face, we would expect to see an interaction between face orientation and region (eye, mouth).

Method

Participants

A group of 24 undergraduate students at the University of Victoria received course credit for their participation (22 female, two male; mean age = 20.6, range = 18–43 years). The participants had normal or corrected-to-normal vision.

Stimuli

The stimuli were created using six high-quality, grayscale digitized photographs of children’s faces (three male, three female), ranging in age from 9 to 12 years. The stimuli were from the “Let’s Face It” face-processing battery (Wolf et al., 2008) and have been used in previous experiments to test face processing in patients with prosopagnosia (Bukach et al., 2008; Rossion et al., 2009), children with autism (Wolf et al., 2008), and infants (Quinn & Tanaka, 2009). The images were cropped at each side of the head. To discourage reliance on nonfacial cues, the faces had no jewelry, glasses, or makeup, and facial markings such as freckles, moles, and blemishes were removed digitally. The faces were 300 pixels in width by 400 pixels in height. The images subtended a visual angle of 10.0° (width) × 13.2° (height) when presented at the testing distance of 60 cm. Using the graphic software program Adobe Photoshop, the eyes or mouth of each original face was modified either featurally or configurally, to create four dimensions of change: featural eyes, featural mouth, configural eyes, and configural mouth. Each dimension of change consisted of five faces along a continuum: the original (primary) face and four incrementally varied (secondary) face images. This process created 20 faces for each original face, for a total of 120 face stimuli (see Fig. 1 for an example).
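The reported stimulus size can be sanity-checked with the standard visual-angle formula. In the sketch below, the pixel pitch (cm per pixel) is an assumed value chosen to approximately reproduce the reported angles; it is not a figure from the original setup, and small rounding differences (e.g., 13.3° vs. the reported 13.2°) are expected.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (in degrees) subtended by a stimulus of size_cm viewed at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

PIXEL_PITCH_CM = 0.035  # assumed monitor pixel pitch; not reported in the original

width_deg = visual_angle_deg(300 * PIXEL_PITCH_CM, 60)   # close to the reported 10.0 deg
height_deg = visual_angle_deg(400 * PIXEL_PITCH_CM, 60)  # close to the reported 13.2 deg
```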

Fig. 1

Examples of the featural and configural face stimuli used in Experiment 1: (top row) featural changes in the size of the eyes, (second row) featural changes in the size of the mouth, (third row) configural changes in the interocular distance, and (bottom row) configural changes in the nose–mouth distance

Featural and configural values selected for the face dimensions task were based on the previous work of Bukach et al. (2008). In the featural condition, the size of the eyes or mouth was manipulated by resizing the original feature to 80 %, 90 %, 110 %, or 120 % of its original size, while the shape and position were unchanged. The size scaling manipulation did not change the location of the eye or mouth in the face. Due to the nature of the manipulations in the featural condition, some degree of configural change was unavoidably introduced. Specifically, changing the size of the eyes altered the spatial distance between the internal edge of the left eye and the internal edge of the right eye by four pixels for each level of change. Changing the size of the mouth altered the distance between the philtrum of the mouth and the bottom of the nose by two pixels for each level of change. For the configural manipulations of the eyes, the interocular distance was increased (or decreased) by ten pixels for each level of change. For the configural mouth manipulations, the mouth was shifted upward (or downward) by five pixels for each level of change.
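The parametric levels above can be summarized in tabular form. The percentage values come from the text; the assignment of the four secondary configural levels to two decrements and two increments around the original is an assumption for illustration.

```python
# Featural levels: eye/mouth size relative to the original (values from the text).
featural_scale = [0.80, 0.90, 1.00, 1.10, 1.20]

# Configural levels (assumed layout): two decreased and two increased variants
# around the original (0), in 10-px steps for interocular distance and
# 5-px steps for vertical mouth position.
configural_eye_px = [-20, -10, 0, 10, 20]
configural_mouth_px = [-10, -5, 0, 5, 10]

# Step sizes are uniform along each continuum.
eye_steps = {b - a for a, b in zip(configural_eye_px, configural_eye_px[1:])}
```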

Procedure

The research ethics board of the University of Victoria approved the following experiment. Prior to testing, the procedure was explained and informed written consent was obtained from the participants. A complete debriefing was given on completion. Participants were told that they would be presented with two images in sequence on a computer monitor, and they were asked to determine whether these images were identical. They were directed to respond as quickly and accurately as possible using keys labeled “same” and “different” on a serial response box. We emphasized to participants that in order to respond “same,” the two faces should be physically identical. All instructions were provided verbally by an experimenter, as well as in writing at the beginning of the experiment. Participants sat in a darkened room, approximately 60 cm from a 15-in. monitor. Each trial was constructed as follows: A fixation cross was displayed for 250 ms, then the first stimulus was presented for 500 ms, followed by a noise mask displayed for 500 ms, and finally the second stimulus, which remained on the screen until the participant responded or until a maximum of 3,000 ms had elapsed. Trials were separated by 1,500 ms. In each trial, the two stimuli were presented either both upright or both inverted. Stimuli were centered horizontally on the screen and positioned vertically so that the nose was at the center of the screen in both upright and inverted trials. Participants were tested using the computer software E-Prime Version 1.0. The computers were equipped with Intel Pentium 4 processors and 15-in. Sony Trinitron E240 or LG Flatron F700P monitors, set at a screen resolution of 1,024 × 768 pixels.

Design

Four within-subjects independent factors were manipulated: Type (configural or featural), Region (eyes or mouth), Orientation (upright or inverted), and Level (easy, intermediate, or difficult). The “easy” trials were separated by three degrees of difference along the continuum (e.g., Face2–Face5), the “intermediate” trials by two degrees (e.g., Face2–Face4), and the “difficult” trials by one degree (e.g., Face3–Face4). Each face pair appeared approximately the same number of times. The “same” trials were sampled equally from faces along the Face1–Face5 continuum. Half of the trials were “same” trials and half were “different” trials. The experiment comprised 576 trials in total, presented in four blocks in a random order to minimize participant strategies (Riesenhuber et al., 2004).
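This factorial design implies a fixed trial budget per cell. The sketch below is a hypothetical reconstruction: the figure of 12 “different” trials per design cell is an inference from the stated totals, not a number given in the text.

```python
from itertools import product

# The four within-subjects factors described in the Design section.
types = ["featural", "configural"]
regions = ["eyes", "mouth"]
orientations = ["upright", "inverted"]
levels = ["easy", "intermediate", "difficult"]

cells = list(product(types, regions, orientations, levels))  # 2*2*2*3 = 24 cells

TOTAL_TRIALS = 576
different_trials = TOTAL_TRIALS // 2          # half of all trials are "different"
per_cell = different_trials // len(cells)     # "different" trials per design cell
```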

Results and discussion

A repeated measures analysis of variance was performed on the d' scores, with Region (eye, mouth), Type (featural, configural), Orientation (upright, inverted), and Level (easy, intermediate, difficult) as within-subjects factors.
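For reference, a minimal sketch of how d' is typically computed in a same–different task; the log-linear correction shown is a common convention for handling perfect hit or false-alarm rates, not necessarily the one used in this study. Here a “hit” would be correctly responding “different” on a different trial, and a “false alarm” responding “different” on a same trial.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate); the +0.5/+1 log-linear
    correction keeps rates of exactly 0 or 1 finite."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    return z(hit_rate) - z(fa_rate)
```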

The main effect of region was significant, F(1, 23) = 59.04, p < .001, η² = .72, demonstrating an advantage for discrimination in the eye region over the mouth region. A significant main effect of orientation, F(1, 23) = 107.86, p < .001, η² = .82, showed that discriminations in inverted faces were more difficult than discriminations in upright faces. The main effect of level was also significant, F(2, 46) = 251.86, p < .001, η² = .92, indicating that performance varied according to the three levels of discrimination. The main effect of type approached but did not reach significance, F(1, 23) = 4.22, p > .05, showing that the difference between featural and configural information was not reliable.

With respect to interactions, face region did not interact with type, F(1, 23) = 2.28, p > .10, showing that the discriminations of configural and featural information did not differ in the mouth and eye regions. The Type × Orientation interaction also was not reliable, F(1, 23) = 0.34, p > .50, demonstrating that inversion did not impair the discrimination of configural information more than that of featural information.

Critically, we found a significant interaction between orientation and region, F(1, 23) = 33.60, p < .001, η² = .59. Post-hoc tests showed that the discrimination of upright eyes (M = 1.71) was better than the discrimination of inverted eyes (M = 1.39), p < .01, and that the discrimination of upright mouth information (M = 1.52) was better than the discrimination of inverted mouth information (M = 0.23), p < .001. Although reliable inversion effects were found for both mouths and eyes, the magnitude of the inversion effect was four times as large for the mouth (1.29) as for the eyes (0.32). The discrimination of information in the mouth region did not differ from discrimination in the eye region in the upright orientation, p > .10. However, when the faces were inverted, the discrimination of information in the mouth region was reliably worse than the discrimination of information in the eye region, p < .001. Thus, processing was more severely impaired by inversion for information in the mouth region than for information in the eye region.

The factors Orientation, F(2, 46) = 63.74, p < .001, η² = .74, Region, F(2, 46) = 31.13, p < .001, η² = .58, and Type, F(2, 46) = 5.36, p < .001, η² = .19, interacted with Level, showing that the discrimination of parametric differences in the face images was affected by their orientation, the region of a change, and the type of information, respectively. Finally, the interaction of orientation, region, and level, F(2, 46) = 18.10, p < .001, η² = .44, demonstrated that as the level of discrimination became more difficult, the difference between the mouth and eye inversion impairments decreased, probably due to a floor effect. No other interactions were reliable.

The main finding of Experiment 1 was that inversion disproportionately impaired the discrimination of information in the lower, mouth region of the face as compared to the discrimination of featural and configural information in the upper, eye region. Reaction time analyses revealed that the eye discriminations were faster than mouth-region discriminations, F(1, 23) = 78.83, p < .001, ruling out the possibility of speed–accuracy trade-offs. As is shown in Fig. 2, the locus of the face inversion effect is determined by the location of the face information (i.e., eye vs. mouth regions), not by the type of information (i.e., featural vs. configural) (Maurer et al., 2002; Mondloch et al., 2002). In contrast to the Malcolm et al. (2004) finding, but consistent with more recent studies (Crookes & Hayward, 2012; Goffaux & Dakin, 2010; Goffaux & Rossion, 2007; Sekunova & Barton, 2008), we found that relative to discrimination in the mouth region, the discrimination of featural and configural eye information was surprisingly preserved in the inverted orientation. Also divergent from the Malcolm et al. results, we found that discriminations of featural mouth information and of configural nose–mouth information were equally compromised by inversion, suggesting that inversion affects a region of the face (mouth vs. eye region) rather than a particular kind of information (configural vs. featural information).

Fig. 2

Experiment 1: Manipulating the size and distance of facial features. The two graphs show d' scores for detecting configural and featural shape differences across three levels of change (easy, intermediate, difficult) in the eye and mouth regions, respectively. Bars represent standard errors of the means

What factors might account for the contrasting findings between the present experiment and Malcolm et al. (2004)? First, whereas we used a sequential same–different task under conditions of brief exposure, Malcolm et al. used an oddball task in which participants were presented with three faces for 2 s of inspection time and were asked to select the face that differed from the other two faces. Second, whereas Malcolm et al. blocked their trials according to feature and configural changes, we used a mixed design in which upright and inverted faces were shown randomly, to minimize participant strategies (Riesenhuber et al., 2004). Finally, in Experiment 1, we manipulated feature information by changing the size of the feature, whereas Malcolm et al. manipulated the shapes of features. These differences in tasks, encoding times, and stimulus manipulations might explain the featural impairment found in our experiment but not in the Malcolm et al. study.

Experiment 2: upright and inverted faces with morphed featural changes

As we discussed in the introduction, the manipulation of a facial feature invariably changes its configural relations to the other features in a face. For example, increasing the size of the eyes decreases the edge-to-edge distance and the distance of each eye to the tip of the nose. In Experiment 2, similar to Malcolm et al. (2004), we manipulated featural properties by altering the shape of the eyes and mouths of adult faces through a morphing procedure. Morphing transformations have an advantage over size transformations, in that they are less disruptive to edge-to-edge distances. In the morphing procedure, two pairs of eyes (or two different mouths) were averaged together to form a five-step continuum of featural changes. In a same–different matching task, on “different” trials, the two faces could be separated by four degrees along the continuum (easy condition), three degrees (intermediate condition), or two degrees (difficult condition) (see Fig. 3). If the featural mouth effect in Experiment 1 was an artifact of configural changes, we would expect that this confound would be reduced by the morphing procedure applied in Experiment 2, and that inversion would be more disruptive to configural mouth changes than to featural mouth changes. Alternatively, if inversion disproportionately disrupts information in the mouth region of the face, we would predict that both featural mouth and configural nose–mouth judgments would be compromised by inversion.

Fig. 3

Examples of the featural and configural face stimuli used in Experiment 2: (top row) featural changes in the shape of the eyes, (second row) featural changes in the shape of the mouth, (third row) configural changes in the interocular distance, and (bottom row) configural changes in the nose–mouth distance

Method

Participants

A group of 24 undergraduate students at the University of Victoria received course credit for their participation (14 female, ten male; mean age = 20.5, range = 18–31 years). All had normal or corrected-to-normal vision.

Stimuli

The images used to create these stimuli were adult faces taken from the Karolinska face set (KDEF; Lundqvist, Flykt, & Öhman, 1998). Six primary faces (three male and three female adult faces) were created using morphed eyes and mouths prepared with the Morph version 2.5 computer program. The eyes and mouths of the primary faces were each a 50:50 morph of two distinct mouths or pairs of eyes from two different faces. None of the features used in morphing were the original features of the face frame. The featural modifications consisted of progressive morphing between the two features (Feature A and Feature B). Thus, the five levels of morphing in the featural continuum consisted of 100 % of Feature A, 75 % of Feature A and 25 % of Feature B, 50 % of Feature A and 50 % of Feature B, 25 % of Feature A and 75 % of Feature B, or 100 % of Feature B. In this experiment, the configural changes introduced as a result of the featural manipulations were minimal: In the eye condition, the interocular distance varied by 0.75 pixels or fewer between each level of change, and in the mouth condition, the philtrum position varied by 0.5 pixels or fewer. Configural eye stimuli were created by increasing or decreasing the interocular distance in increments of six pixels. The configural mouth stimuli were shifted vertically upward or downward in increments of four pixels. The images were identical in size to those of Experiment 1 (i.e., 300 × 400 pixels) and subtended a visual angle of 10.0° (width) × 13.2° (height) when viewed at the testing distance of 60 cm.
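At its core, a morph continuum like this one is a weighted average of two corresponded features. Real morphing software (such as the Morph program used here) also warps geometry between matched control points; the sketch below shows only the cross-dissolve step, assuming pre-aligned grayscale feature patches.

```python
def blend(feature_a, feature_b, weight_a):
    """Pixel-wise weighted average: weight_a of A plus (1 - weight_a) of B,
    for two equally sized 2-D grayscale patches given as nested lists."""
    return [[weight_a * a + (1 - weight_a) * b
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(feature_a, feature_b)]

# The five levels of the featural continuum (proportion of Feature A):
morph_levels = [1.00, 0.75, 0.50, 0.25, 0.00]
```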

Design

The experiment design was as in Experiment 1, except that the “easy” trials were separated by four degrees of difference along the continuum (e.g., Face1–Face5), the “intermediate” trials by three degrees (e.g., Face2–Face5), and the “difficult” trials by two degrees (e.g., Face2–Face4). Each face pair appeared approximately the same number of times. The “same” trials were sampled equally from faces along the Face1–Face5 continuum. Half of the trials were “same” trials, and half were “different” trials.

Procedure

The research ethics board of the University of Victoria approved the following experiment. The procedure was identical to that of Experiment 1.

Results and discussion

A repeated measures analysis of variance was conducted on the d' scores, with the within-subjects factors Type (configural, featural), Region (eye, mouth), Orientation (upright, inverted), and Level (easy, intermediate, difficult).

The main effect of region was significant, F(1, 23) = 28.94, p < .001, η² = .56, showing that eye discriminations were superior to mouth discriminations. Upright faces were easier to discriminate than inverted faces, as was indicated by the significant effect of orientation, F(1, 23) = 100.57, p < .001, η² = .81. The main effect of level of difficulty was reliable, F(2, 46) = 99.63, p < .001, η² = .81, but not that of type, F(1, 23) = 2.93, p = .10.

Orientation reliably interacted with region, F(1, 23) = 29.37, p < .001, η² = .56, showing that discrimination in the mouth region was more impaired by inversion than was discrimination in the eye region. Post-hoc tests showed that the discrimination of upright eyes (M = 1.80) was better than the discrimination of inverted eyes (M = 1.32), p < .001, and that the discrimination of upright mouth information (M = 1.54) was better than the discrimination of inverted mouth information (M = 0.27), p < .001. Although reliable inversion effects were found for both mouths and eyes, the magnitude of the inversion effect was more than twice as large for the mouth (1.27) as for the eyes (0.48). Post-hoc tests (Bonferroni) also showed that the discrimination of upright eyes was better than the discrimination of upright mouths, p < .05, and the discrimination of inverted eyes was better than the discrimination of inverted mouths, p < .001. Although reliable regional differences were found for both orientations, the magnitude of this difference was about four times as large for the inverted (1.05) as for the upright (0.26) orientation.

Type interacted with region, F(1, 23) = 26.79, p < .001, η² = .54, demonstrating that the relative discriminability of configural and featural information differed between the eye and mouth regions. Bonferroni-corrected comparisons revealed that configural information was easier to discriminate than featural information in the eye region, p < .001, but more difficult to discriminate than featural information in the mouth region, p < .001. Importantly, type did not interact with orientation, F(1, 23) = 0.23, p > .10, showing that inversion did not differentially affect configural or featural information.

Level of difficulty interacted with region, F(2, 46) = 8.52, p < .001, η² = .27, type, F(2, 46) = 5.36, p < .01, η² = .19, and orientation, F(2, 46) = 28.38, p < .001, η² = .55. Level also entered into a three-way interaction with region and orientation, F(2, 46) = 8.60, p < .001, η² = .27. No other interactions were reliable.

The main finding of Experiment 2 was that inversion disproportionately impaired the discrimination of featural and configural information in the lower, mouth region of the face, but had relatively little effect on the perception of featural and configural information in the eye region. Reaction time analyses revealed that eye discriminations were faster than mouth-region discriminations, F(1, 23) = 22.384, p < .001, ruling out the possibility of speed–accuracy trade-offs. As is shown in Fig. 4, a large difference emerged across the three levels of discrimination for featural and configural judgments in the mouth region of the face. However, inversion had a minimal effect on the eye region of the face. Although Rhodes, Hayward, and Winkler (2006) found greater inversion effects for configural morphed changes than for featural morphed changes, both their eye and mouth features were concurrently rather than independently manipulated. The advantage of the dimensions task is that the source of configural effects can be isolated to the eye or the mouth region of the face. In summary, the results of Experiment 2 indicated that the source of the face inversion effect was determined not by the type of information (i.e., featural vs. configural), but by the region of the face information (i.e., eye vs. mouth region). These results replicate and extend the findings of Experiment 1 with a set of adult faces rather than child faces and a featural manipulation of shape rather than size. In Experiment 3, we examined whether the upper and lower visual field differences observed in Experiments 1 and 2 were specific to faces or could be generalized to nonface objects, such as houses.

Fig. 4

Experiment 2: Manipulating the shape and distance of facial features. The graphs show d' scores for detecting configural and featural shape differences across three levels of change (easy, intermediate, difficult) in the eye and mouth regions, respectively. Bars represent standard errors of the means

Experiment 3: house dimensions task

The general finding from the previous experiments was that inversion disrupted the perception of information in the lower, mouth half of the face stimulus, but did little to perturb the discrimination of information in the eye half. Importantly, detection was impaired for both featural size (Exp. 1) and shape (Exp. 2) changes in the mouth, as well as for configural changes to the distances between the nose and mouth. In contrast, the perception of size and shape differences in the eyes and changes in the horizontal spacing between the eyes remained relatively preserved in the inverted face. Although these results suggest that the effect of inversion on the perception of face information depends on the relative location of this information, it was unclear whether this pattern of deficit is specific to faces. In a test with nonface stimuli, Yovel and Kanwisher (2004) examined the effects of inversion on houses. They found that misorientation had little effect on the perception of the features of a house or their spacing. However, in that study, information in both the upper and lower portions of the house stimulus was altered, so that it was not possible to know whether inversion had regional influences similar to the effects demonstrated for faces in Experiments 1 and 2. Also, the spatial layout of the house stimuli in that study contained two elements in the upper region (two windows) and two elements (a window and a door) in the lower region. Thus, it is possible that visual attention was equally distributed across the top and bottom portions of the house.

For our study, we constructed house stimuli that contained two elements (i.e., small windows) in the upper region and a single element (i.e., a large window) in the lower region. As is shown in Fig. 5, the arrangement of elements in the house stimulus approximated the spatial layout of a face (i.e., two elements in the upper visual field and one element in the lower visual field), thereby allowing us to examine the regional disruption of featural and configural changes in a nonface stimulus. Following a procedure similar to the one we used for faces, featural information was manipulated by changing the size of the windows, and configural information was manipulated by altering the spacing between the windows or the vertical height of the large window. The discrimination of house features and their configuration was tested in upright and inverted orientations. If inversion causes a general impairment of the perception of features in the lower halves of images, similar inversion effects should be observed for houses and faces. On the other hand, if inversion distinctly disrupts information that is important to face perception, there should be no difference in the discriminability of information in the upper and lower halves of upside-down houses.

Fig. 5

Examples of the featural and configural house stimuli used in Experiment 3: (top row) featural changes in the size of the upper windows, (second row) featural changes in the lower window, (third row) configural changes in the distance between the upper windows, and (bottom row) configural changes in the distance between the upper windows and the lower window

Method

Participants

A group of 24 undergraduate students at the University of Victoria received course credit for their participation (16 female, eight male; mean age = 19.0, range = 18–24 years). All had normal or corrected-to-normal vision.

Stimuli

The stimuli used in Experiment 3 were comparable to those in Experiments 1 and 2, except that six house images were used in place of faces. These photos were of houses in the Victoria, BC, area. In order to keep the images uniform, most plants and other decorative items were removed from the images using Adobe Photoshop; however, some shrubs were left to maintain a realistic house representation. The primary house images had a pair of small windows near the top and a single, larger window near the bottom. These windows were used as analogues of the eyes and mouth, respectively. Again, configural and featural modifications were made to the windows of the houses, to create four secondary images within each condition, using the procedures described in Experiment 1. In the featural condition, the size of either the pair of top windows or the single bottom window was manipulated. The four secondary house images in each condition had windows that were 60 %, 80 %, 120 %, and 140 % of the size of the primary windows. In both conditions, the position of the center points of the windows remained constant. Again, some degree of configural change was inherent in these manipulations. In the top-window condition, the interwindow distance varied by 2.5–5.0 pixels per degree of change; in the bottom-window condition, the vertical distance between the inferior edge of the top windows and the superior edge of the bottom window varied by 1.5–2.5 pixels per degree of change.

To create stimuli for the configural condition, the spacing between the windows was manipulated. In the top-window condition, the distance between the upper windows was increased or decreased by 20 or 40 pixels, and in the bottom-window condition, the window was shifted vertically up or down by 20 or 40 pixels. The stimuli were 400 pixels in width by 425 pixels in height, and the images subtended a visual angle of 13.1° × 14.0° from the testing distance of 60 cm.
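The reported visual angles follow directly from the stimulus geometry. A minimal sketch of the calculation (the monitor's pixel pitch is our assumption; only the pixel dimensions, the visual angles, and the 60-cm viewing distance are given in the text):

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full visual angle (in degrees) subtended by a stimulus of
    physical extent size_cm viewed from distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

PX_PITCH_CM = 0.0345  # assumed pixel pitch (~29 px/cm); not reported in the article

width_deg = visual_angle_deg(400 * PX_PITCH_CM, 60)   # ~13.1 deg
height_deg = visual_angle_deg(425 * PX_PITCH_CM, 60)  # ~14.0 deg
```

With this assumed pitch, the 400 × 425 px images recover the reported 13.1° × 14.0° to within a tenth of a degree.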

Design

The experimental design was similar to that described in Experiment 1; however, the levels of the independent region-of-change variable were modified. Since house images were used in this experiment in place of face images, the top windows and bottom window were used in place of eyes and mouth, respectively. As in Experiment 1, three levels of discrimination were employed: The easy, intermediate, and difficult degrees of difference were defined as being separated by four, three, and two degrees of change, respectively. Half of the trials were “same” trials, and half were “different” trials.

Procedure

The research ethics board of the University of Victoria approved the following experiment. The procedures were identical to those described for Experiments 1 and 2.

Results and discussion

A repeated measures analysis of variance was performed on the d' scores, with Region (upper or lower half of the house), Type (featural, configural), Orientation (upright, inverted), and Level (easy, intermediate, difficult) as within-subjects factors.
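The d' scores entered into this analysis are the standard signal detection measure of sensitivity for same/different judgments, z(hit rate) − z(false-alarm rate). A minimal sketch (the log-linear correction for extreme rates is our assumption; the article does not state how hit or false-alarm rates of 0 or 1 were handled):

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate) for a
    same/different task: "different" responses on "different" trials
    count as hits; on "same" trials, as false alarms."""
    # Log-linear correction keeps z() finite when a rate is 0 or 1
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

For example, a participant detecting 45 of 50 changes while false-alarming on 10 of 50 "same" trials would score roughly d' = 2.1, whereas chance performance yields d' = 0.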

The main effect of region, F(1, 23) = 20.56, p < .001, η² = .47, was significant, showing that overall, changes in the upper half of the house (i.e., small windows) were more accurately detected than were changes in the lower half. The significant effect of level, F(2, 46) = 143.08, p < .001, η² = .86, showed that discriminability improved as the differences in the physical stimuli increased. Orientation was also significant, F(1, 23) = 6.17, p < .05, η² = .21, demonstrating that overall sensitivity was greater in the upright than in the inverted orientation.
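The effect sizes above are consistent with partial eta-squared recovered from each F statistic and its degrees of freedom; this reconstruction (the article does not name the η² variant used) reproduces the reported .47, .86, and .21:

```python
def partial_eta_sq(f_stat, df_effect, df_error):
    """Partial eta-squared from an F ratio:
    eta_p^2 = F * df_effect / (F * df_effect + df_error)."""
    return f_stat * df_effect / (f_stat * df_effect + df_error)

# Reproducing the effect sizes reported for Experiment 3
region = partial_eta_sq(20.56, 1, 23)       # ~.47
level = partial_eta_sq(143.08, 2, 46)       # ~.86
orientation = partial_eta_sq(6.17, 1, 23)   # ~.21
```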

The two-way interaction between type and region, F(1, 23) = 8.34, p < .01, η² = .27, demonstrated that the featural changes in the lower window were more difficult to detect than were featural changes in the upper windows. The interaction between orientation and level, F(2, 46) = 3.38, p < .05, η² = .13, showed that orientation had a greater impact on the larger than on the smaller degrees of difference. Notably, orientation did not interact with region, F(1, 23) = 0.08, p > .10, showing that inversion had equivalent effects on information in the lower and upper portions of the house. The absence of a Region × Orientation interaction for houses presented a striking contrast to the findings reported in Experiments 1 and 2, in which both featural and configural forms of information at the bottom of the face were severely affected by inversion. Nor did orientation interact with type, F(1, 23) = 0.17, p > .10, indicating that inversion did not differentially impair featural or configural house information. The graphs in Fig. 6 reveal little effect of orientation and no difference between the types of information (featural vs. configural) or between the regions (upper vs. lower half of the house). The main finding of Experiment 3 was that, unlike with faces, inversion failed to produce a selective loss of sensitivity to information in the lower half of the house stimuli. These results argue against the hypothesis that inversion produces a global impairment of information in the lower half of a stimulus, and suggest instead that this loss may be specific to faces.

Fig. 6

Experiment 3: Manipulating the shape and distance of house features. The two graphs show d' scores for detecting configural and featural shape differences across three levels of change (easy, intermediate, difficult) in the small-window and large-window regions, respectively. Bars represent standard errors of the means

The preceding experiments have offered a precise account of the types of information and stimuli that are affected by inversion. The results from Experiments 1 and 2 demonstrated that inversion selectively disrupts the perception of featural and configural information in the lower half of the face, while preserving information in the upper region, around the eyes. Experiment 3 showed that inversion had little influence on the perception of featural or configural information in the upper or lower regions of nonface (house) stimuli.

These results speak to the importance of eye information in face processing. Although it has been acknowledged that the eyes play a prominent role in the processing of an upright face (Vinette, Gosselin, & Schyns, 2004), the eye preference seems to be even more pronounced in an inverted face. When a face is turned upside down, perception focuses on information regarding the shape, size, and interocular distance of the eyes, whereas information is lost involving the vertical position of the eyes (Goffaux & Rossion, 2007), the distance between the nose and the mouth (Barton, Deepak, & Malik, 2003; Barton et al., 2001), and the size and shape of the mouth (Exps. 1 and 2).

An important question is whether the loss of information in the inverted face is perceptual or attentional. According to Rossion (2008), upright faces are viewed with a broad perceptual window that is large enough to incorporate the entire face. However, when a face is turned upside down, the perceptual window shrinks in size and is unable to detect vertical displacements of the eyes. Sekunova and Barton (2008) showed that cueing had little effect on participants’ ability to detect vertical changes of the eyes and eyebrows, relative to cueing of the mouth region (see also Crookes & Hayward, 2012). This finding suggests that despite the benefit of attention, the perceptual system cannot resolve vertical eye displacements in an inverted face. In Experiment 4, we tested whether the loss of featural and configural mouth information in an inverted face is similarly immune to the influence of attention.

Experiment 4: spatial cueing of eye and mouth regions

Previous studies have shown that cueing participants to the mouth half of the face reduced the inversion effect for spatial discriminations between the nose and mouth and for discriminations of mouth luminance (Barton et al., 2003; Sekunova & Barton, 2008). In the present experiment, trials were blocked according to face region (i.e., eye, mouth). For eye-region differences, faces could differ with respect to their interocular distance (configural) or the size of the eyes (featural). For mouth-region differences, faces could differ with respect to the nose–mouth distance (configural) or the size of the mouth (featural). Prior to the beginning of the block, participants were informed of the region of potential change (i.e., eye or mouth region). If the inversion effect observed in Experiments 1 and 2 was due to a failure to fully attend to information in this area, cueing participants to the mouth region should abolish the inversion effect.

Method

Participants

A group of 24 undergraduate students at the University of Victoria received course credit for their participation (18 female, six male; mean age = 18.875, range = 17–22 years). Six additional undergraduate participants were tested, but two of the participants did not finish the task because of technical difficulties, and four were excluded from the data analysis because they did not follow the instructions.

Stimuli

The stimuli used in Experiment 4 were identical to those in Experiment 1.

Design

The experimental design was the same as in Experiment 1, except that it consisted of two blocks with upright-oriented faces and two blocks with inverted faces. In a given block, a “different” trial could differ in featural or configural information in the eye region only or the mouth region only. Half of the trials were “same” trials, and half were “different” trials.

Procedure

We followed the same procedures as in Experiment 1, except that before each block, the participants were informed in what region (eye, mouth) changes could occur. Half of the participants started with the two upright blocks and finished with the two inverted blocks, and the other half of the participants started with the two inverted blocks and finished with the two upright blocks. For the upright blocks, half of the participants started with a block in which “different” trials could differ in the eye region; the other half started with a block in which “different” trials could only differ in the mouth region. The same was true for the participants who started with the inverted blocks. As a result, 24 participants were divided among the four possible permutations, with six participants in each permutation.

Results and discussion

A repeated measures analysis of variance was performed on the d' scores, with Region (eye, mouth), Type (featural, configural), Orientation (upright, inverted), and Level (easy, intermediate, difficult) as within-subjects factors.

The main effect of type was significant, F(1, 23) = 7.76, p < .01, η² = .25, indicating that overall, configural changes were more difficult to detect than featural changes. A significant main effect of orientation, F(1, 23) = 19.17, p < .001, η² = .46, showed that discriminations of inverted faces were more difficult than discriminations of upright faces. The main effect of level was also significant, F(2, 46) = 610.60, p < .001, η² = .96, indicating that sensitivity improved across the three levels of discrimination as the physical differences increased.

We found a significant interaction between region and level, F(2, 46) = 5.58, p < .01, η² = .20: The eye region of the face was more influenced by differences in the images than was the mouth region. Type interacted with level, F(2, 46) = 3.78, p < .05, η² = .14, in that configural information was more influenced by parametric differences in the images than was the featural information. Finally, the interaction between orientation and level, F(2, 46) = 6.12, p < .01, η² = .21, showed that orientation had a greater impact on larger than on smaller degrees of difference. No other interactions were significant. In particular, orientation did not interact with region, F(1, 23) = 3.40, p > .05, showing that information in the bottom half of the face was not more adversely affected by inversion than was information in the eye half. Nor did orientation interact with type, F(1, 23) = 0.008, p > .90, indicating that inversion did not differentially impair the perception of featural and configural information in faces.

Consistent with previous results (Barton et al., 2003; Sekunova & Barton, 2008), the main finding of Experiment 4 was that when participants were cued to the mouth region, inversion had relatively little effect on the perception of featural mouth size or the distance between the nose and mouth (see Fig. 7). This result contrasts with the findings reported in Experiments 1 and 2, in which the perception of featural and configural mouth information was severely impaired by inversion. These findings suggest that when faces are turned upside down, by default, attention is directed to the eye region, at the expense of information in the mouth area. By instructing participants to attend to the mouth region, this attentional bias can be reversed.

Fig. 7

Experiment 4: Cueing to the eye region or the mouth region. The graphs show d' scores for detecting configural and featural size differences across three levels of change (easy, intermediate, difficult) in the eye and mouth regions, respectively. Bars represent standard errors of the means

General discussion

The goal of the present study was to test the effects of inversion on the discrimination of features and feature spacing in the eye and mouth regions of a face. In Experiment 1, inversion differentially disrupted the discrimination of faces that varied in mouth size or nose–mouth distances. By comparison, inversion had little effect on the discrimination of faces that varied in eye size or interocular distance. In Experiment 2, we tested the effects of inversion on the discrimination of feature shape using a morphing manipulation. Whereas inversion severely impaired the discrimination of changes to the mouth shape and the nose and mouth spacing, it had relatively weak effects on the discrimination of eye shape or eye spacing. In Experiment 3, we found that regional inversion effects do not generalize to nonface objects; specifically, unlike for faces, the discrimination of featural and configural information in house stimuli was preserved in the inverted orientation. In Experiment 4, we found that the regional deficit can be ameliorated if participants are cued to the critical mouth region. Together, these results indicate that face inversion does not systematically impair featural or configural face information so much as it disrupts information in the mouth region of the face. We propose that in an upside-down face, perception is spontaneously drawn to the eye features at the expense of processing information in the mouth region. The source of the eye-region bias appears to be attentional, because the mouth-region deficit can be overridden when participants are cued to the previously unattended mouth area of the inverted face.

After more than four decades of research, since the face inversion effect was first reported by Yin (1969), the present research adds to the growing consensus in the literature regarding the types of information that are lost and preserved in an inverted face. The literature shows that inversion disproportionately impairs perception of (1) vertical displacements of the eyes and eyebrows relative to the entire face (Crookes & Hayward, 2012; Goffaux & Dakin, 2010; Goffaux & Rossion, 2007; Sekunova & Barton, 2008), (2) the vertical spacing between the nose and the mouth (Crookes & Hayward, 2012; Malcolm et al., 2004; Sekunova & Barton, 2008; Exps. 1 and 2 in the present study), and (3) the size (Exp. 1 in the present study) and shape (Malcolm et al., 2004; Exp. 2 in the present study) of the mouth.

Conversely, inverting a face does little to disrupt (1) the horizontal spacing between the eyes (Crookes & Hayward, 2012; Goffaux & Dakin, 2010; Goffaux & Rossion, 2007; Sekunova & Barton, 2008; Exps. 1, 2, and 4), (2) the vertical distance between the eyes and the eyebrows (Sekunova & Barton, 2008), (3) the size and shape of the eyes (Goffaux & Rossion, 2007; Malcolm et al., 2004; Exps. 1 and 2 in the present study), and (4) the perception of featural and configural information in nonface stimuli, such as houses (Yovel & Kanwisher, 2004; Exp. 3 in the present study) or cars or scenes (Goffaux & Dakin, 2010).

Regional impairment of featural face information due to inversion

The present study provides novel insights into our understanding of the regional deficits of face processing due to inversion. The results of Experiments 1 and 2 clearly demonstrate that inversion disrupts featural information related to the size (Exp. 1) and shape (Exp. 2) of the mouth feature. Whereas Malcolm et al. (2004) showed that inversion impaired configural mouth discriminations, the present findings demonstrate that featural mouth information is also susceptible to inversion effects.

Regional biases have been observed in clinical populations who show specific deficits in their face recognition abilities. For example, on the dimensions task, individuals with autism show a preserved ability to discriminate spacing and feature differences in the mouth region, but an impaired ability to detect spacing and feature differences in the eye region (Wolf et al., 2008). Similarly, patients with prosopagnosia show normal discrimination in the mouth region and impaired discrimination in the eye region (Bukach et al., 2008; Rossion et al., 2009). For individuals with autism and patients with prosopagnosia, impaired recognition of upright faces is likely due to their compromised perception of eye-region information. For neurotypical adults, impaired recognition of inverted faces is due, at least partially, to impaired processing of information in the mouth region.

The face specificity of the face inversion effect

The present results provide a strong test of the specificity of the face inversion effect. It has been suggested that faces as an object class have a distinctive spatial geometry in which the majority of visual elements are located in the upper region of the stimulus, rather than the lower region (Macchi, Turati, & Simion, 2004). Following this logic, inversion effects should be obtained for nonface stimuli that preserve the general spatial layout of faces. To test the generality of the inversion effect, we constructed a nonface stimulus set that maintained the spatial geometry of faces (i.e., two elements in the upper region and a single element in the lower region). Indeed, as is depicted in Fig. 5, it is not difficult to interpret the house stimulus as a face, with its two small windows serving as “eyes” and the large lower window as a “mouth.” Yet, when comparing upright and inverted performance in Experiment 3, we found that inversion had little effect on the perception of the house features or of their configuration. Nor was there any evidence to suggest that the information in the lower region of the house was more severely affected by inversion than was information in the upper region. These results support the specificity of the inversion effect to faces, rather than a more general effect that extends to the perception of other, nonface objects that share the geometric properties of faces.

Attentional factors of the face inversion effect

The present findings clarify the conditions in which attention ameliorates the face inversion effect. Sekunova and Barton (2008) speculated that the perception of long-range distances that span two featural units would be more impaired by inversion than would a short-range, local discrimination. Rossion (2008, 2009) characterized the area of vision from which the observer can extract diagnostic visual information as its perceptual field. According to Rossion (2009), when a face is inverted,

the perceptual field is constricted and limited to one feature at a time (i.e., analytical processing). In general, such an analytic mode of processing will be the most detrimental for the observer when faces differ only by long-range relative distances between features, because such diagnostic cues require to consider several elements over a wide space. (p. 306)

For Rossion (2008, 2009) and Sekunova and Barton (2008), configural information about the spacing between features is more vulnerable to inversion than is information about a feature.

However, our results (Exps. 1 and 2) show that the features of a face (e.g., the mouth feature) are as susceptible to inversion effects as is the spacing between features (e.g., the spacing between the nose and mouth features). In an upright face, the perceptual field is broad, encompassing featural and configural information in the eye and mouth regions (see Fig. 8a). In an inverted face, the perceptual field contracts and focuses on information in the eye region by default. Featural information regarding eye shape and eye size and configural information regarding the spacing between the eyes and between the eyes and eyebrows are preserved within the restricted perceptual field (see Fig. 8b). Information outside the perceptual field, such as the vertical position of the eye–eyebrow unit, the distance between the nose and mouth, and the shape and size of the mouth, is lost (Crookes & Hayward, 2012; Goffaux & Dakin, 2010; Goffaux & Rossion, 2007; Malcolm et al., 2004; Sekunova & Barton, 2008; the present Exps. 1 and 2). To offset the eye-region bias, the perceptual field can be cued to the previously unattended mouth area, thereby improving color, featural mouth, and configural discriminations in the mouth region (Barton et al., 2003; the present Exp. 4; see Fig. 8c). However, cueing to the mouth region comes at the cost of detecting changes in the previously attended eye region (Barton et al., 2001). Consistent with this account, Xu and Tanaka (2013) found in an eyetracking study that fixations prior to response were predictive of successful discrimination when viewing an inverted, but not an upright, face. That is, if participants were attending to the region of change in an inverted face, they were more likely to detect differences.
In contrast to Rossion’s (2009) original claims, our results suggest that the perceptual field preserves featural and configural information in the eye region of an inverted face, but is insensitive to both featural and configural information in the mouth region when a face is turned upside down.

Fig. 8

Perceptual field. (a) In an upright face, the perceptual field encompasses the entire face, binding featural and configural information in a holistic representation. (b) Inversion reduces the size of the perceptual field, and attention is directed to the high-relevance eye region by default. (c) Cueing can redirect the perceptual field to the mouth region, facilitating the discrimination of nose–mouth configurations and mouth features, but impairing featural and configural discriminations in the eye region

In summary, our results show that the disruptive effects of inversion are determined by the spatial location of the face information (eye vs. mouth region) more than by the type of face information (configural vs. featural information). Whereas featural and configural information in the eye region are likely to be preserved in an inverted face, featural and configural information in the mouth region are impaired. The selective sparing and loss of face information according to spatial location argues against a strict qualitative view, in which it is claimed that configural information is more vulnerable to inversion than is featural information (Maurer et al., 2002; Rossion, 2008). Moreover, the relatively preserved perception of eye information in an inverted face suggests that the impaired recognition of an inverted face cannot be directly attributed to the loss of eye information. Instead, the source of the face inversion effect is likely related to the disruption of holistic processes that integrate the eyes with other elements of the face (e.g., mouth and nose features and external face contours; McKone & Yovel, 2009; Tanaka & Farah, 1993). Although the eyes may be special for the perception of facial identity and expression, the key to face recognition must depend on the holistic integration of featural and configural information from other regions of the face.