Encoding differences affect the number and precision of own-race versus other-race faces stored in visual working memory

Zhou, Xiaomei; Mondloch, Catherine J.; Emrich, Stephen M.

doi:10.3758/s13414-017-1467-6

Published: 17 January 2018

Volume 80, pages 702–712 (2018)

Attention, Perception, & Psychophysics

Abstract

Other-race faces are discriminated and recognized less accurately than own-race faces. Despite a wealth of research characterizing this other-race effect (ORE), little is known about the nature of the representations of own-race versus other-race faces. This is because traditional measures of this ORE provide a binary measure of discrimination or recognition (correct/incorrect), failing to capture potential variation in the quality of face representations. We applied a novel continuous-response paradigm to independently measure the number of own-race and other-race face representations stored in visual working memory (VWM) and the precision with which they are stored. Participants reported target own-race or other-race faces on a circular face space that smoothly varied along the dimension of identity. Using probabilistic mixture modeling, we found that following ample encoding time, the ORE is attributable to differences in the probability of a face being maintained in VWM. Reducing encoding time, a manipulation that is more sensitive to encoding limitations, caused a loss of precision or an increase in variability of VWM for other-race but not own-race faces. These results suggest that the ORE is driven by the inefficiency with which other-race faces are rapidly encoded in VWM and provide novel insights about how perceptual experience influences the representation of own-race and other-race faces in VWM.

Across a broad range of research paradigms investigating face recognition, there is a robust other-race effect (ORE), defined here as inferior performance when identifying faces of a different race than faces of the same race as the perceiver (see Bothwell, Brigham, & Malpass, 1989; Meissner & Brigham, 2001, for reviews).

In numerous studies examining the ORE, participants have been presented with own-race and other-race faces during a study phase and then asked to recognize those faces when they are intermixed with novel identities (the old/new face recognition task). A ubiquitous finding is that participants make more false alarms (incorrectly identifying an unseen face as familiar) and fewer hits (correctly identifying a previously seen face as familiar) for other-race compared to own-race faces, reflecting impairments in the encoding, storage and/or retrieval of other-race face representations from memory (Meissner & Brigham, 2001; Young, Hugenberg, Bernstein, & Sacco, 2012). A similar own-race advantage is found when learning is more extensive (e.g., Cambridge Face Memory Test, in which faces were learned from multiple angles; McKone et al., 2012), and when memory demands are minimized by asking participants to make same/different judgments for pairs of faces that differ only in feature shape or spacing (e.g., Hayward, Rhodes, & Schwaninger, 2008; Mondloch et al., 2010).

Although impaired memory for other-race relative to own-race faces is robust, traditional measures only provide a single binary measure of perceivers’ memory performance; each response is scored as either correct or incorrect. Such measures fail to capture potential variability in the quality of the representation, and so little is known about differences in the precision with which own-race and other-race faces are stored. The assumption that the representation of any given face stored in memory is a perfect representation is theoretically untenable and has recently been challenged by studies examining the precision with which basic visual features (colors, orientations) are stored in both visual working memory (VWM; Bays, Catalao, & Husain, 2009; Wilken & Ma, 2004; Zhang & Luck, 2008) and long-term memory (LTM; Brady, Konkle, Gill, Oliva, & Alvarez, 2013; also see Luck & Vogel, 2013, for a review).

A recent and more refined approach, the continuous response paradigm, provides a more sensitive index of the structure of memory (and perceptual) representations (Bays et al., 2009; Bays & Husain, 2008; Brady, Konkle, & Alvarez, 2011; Heyes, Zokaei, & Husain, 2016; Sarigiannidis, Crickmore, & Astle, 2016). In the continuous response paradigm, participants are asked to recall and report the remembered target, which is presented in an array of stimuli that vary along a continuous feature dimension (e.g., color, orientation). Response error is evaluated by calculating the angular deviation between the target item and the item reported by the participant. Probabilistic mixture modeling allows one to measure many sources of overall error (Bays et al., 2009; Bays & Husain, 2008; Brady et al., 2013), including (a) failure in encoding or retrieving the target item, leading to a random response (i.e., guessing); (b) noisiness of the stored representation, leading to decreased precision when the target is recalled; (c) trial-by-trial variability in the mean precision of those responses (i.e., how consistently the stored representation is recalled); and (d) representation of the target item being interrupted by a nontarget item, which leads to recalling the nontarget instead of the target (i.e., a swap error). Here, we used this methodological combination of continuous recall and mixture modeling to provide a more refined examination of the nature of own-race and other-race face representations, and the types of errors that lead to recognition impairments for other-race faces.

Although the continuous response paradigm has been widely used in studies examining VWM for basic features (e.g., hue, line orientation), its use with more complex stimuli is limited. Lorenc, Pratte, Angeloni, and Tong (2014) investigated the role of perceptual experience in encoding and storing face representations in VWM by contrasting VWM for upright versus inverted faces. It is widely established that inverted faces are discriminated and recognized less accurately than upright faces; like the ORE, this inversion effect has been attributed to differential experience (Maurer, Le Grand, & Mondloch, 2002). Lorenc et al. reported a significant loss of precision for inverted faces relative to upright faces with no difference in the guess rate. The fidelity of representations in LTM is constrained by those in VWM (Brady et al., 2013). Thus, the difference in recognition performance between upright and inverted faces is partially attributable to the effect of visual experience on the fidelity of face representations encoded in VWM. Whether a similar difference in fidelity characterizes own-race compared to other-race faces remains unknown.

Here, we provide the first examination of the extent to which the ORE is attributable to a failure to encode and retrieve other-race faces from memory versus a loss of precision in their representations. To examine this question, we used a continuous response paradigm in which participants were asked to maintain own-race or other-race faces in VWM, and to report a target face on a unique circular face space that smoothly varied along the dimension of identity. The angular deviation between the target face and the face selected by the participant provides a more sensitive measure of face memory than can be obtained through traditional face recognition paradigms, as it captures continuous variability in face representations.

In two experiments, we examined the nature of the representations of own-race and other-race faces that are stored in VWM. In Experiment 1, we presented two faces on each trial, one of which was then cued for recall. By applying two different mixture models to the raw error, we differentiated potential sources of error that contribute to the ORE: random guesses, swap errors, and lack of precision and/or trial-by-trial variability in precision for a remembered face. In Experiment 2, we presented only one face but varied presentation time. Applying mixture modeling here allowed us to examine whether reducing presentation time especially impaired VWM for other-race faces.

Experiment 1: Storing two faces with ample encoding time

Method

Participants

Fifteen Caucasian adults (one male, ages 19–30 years, SE = 0.68) from Brock University participated in the study and were included in the final analysis, a sample size comparable to that in other studies using the continuous response paradigm (Brady et al., 2013; Lorenc et al., 2014). All participants reported minimal contact with other-race identities and verbally confirmed normal or corrected-to-normal vision. An additional seven participants were excluded from the final analysis because they reported extensive contact with Asian identities (n = 1) or had extremely poor performance (i.e., a guess rate more than 2.5 standard deviations above the mean; n = 6). All participants provided written informed consent and received either research credit or a small honorarium for their participation. This study received clearance from the Research Ethics Board at Brock University.

Stimuli

Four Caucasian and four East Asian faces were acquired from the Let’s Face It database at Brock University. All faces were female, physically similar, displayed in full-front view and unfamiliar to the participants. Each identity was paired with each of the other same-race identities to create six pairings. We then used a linear morphing procedure to create 19 morphed faces for each pairing by blending the two faces in 5% steps (e.g., 95/5, 90/10, . . . , 5/95). Nineteen morphs across six face pairs for each of the two race categories resulted in a total of 236 faces (228 morphs; eight originals) that were used in the experiment.

A unique circular face space composed of Caucasian or East Asian faces, analogous to a color wheel, was created on each trial by randomly placing the four original (anchor) faces with equal distances between them. Based on their relative location, morphed faces were then placed among the anchor faces such that identity varied continuously around the wheel. Because all faces used to create the face wheel were wholly unfamiliar to our participants, no face on the wheel had special status (i.e., categorical perception was precluded). Thus, in the 360° circular face space, 80 faces (four anchors; 76 morphs) were evenly distributed, making the difference between any two neighboring faces equivalent to 4.5°. All faces were standardized at 395 × 510 pixels and were presented on a 19-inch computer monitor at a viewing distance of approximately 60 cm. Stimuli were presented, and participants’ responses were collected, using PsychoPy 1.8 (Peirce, 2007, 2009).
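The stimulus counts and wheel spacing described above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the variable names are ours, and it assumes that each wheel uses the 19 morphs between each of the four adjacent anchor pairs.

```python
from itertools import combinations

identities_per_race = 4   # original (anchor) faces per race
morphs_per_pair = 19      # 95/5, 90/10, ..., 5/95 in 5% steps
races = 2

# Every identity paired with every other same-race identity: C(4, 2) = 6
pairs_per_race = len(list(combinations(range(identities_per_race), 2)))

total_morphs = morphs_per_pair * pairs_per_race * races
total_faces = total_morphs + identities_per_race * races

# One wheel: four anchors in a ring, so four adjacent anchor pairs (assumption)
wheel_faces = identities_per_race + identities_per_race * morphs_per_pair
step_deg = 360 / wheel_faces
wheel_angles = [i * step_deg for i in range(wheel_faces)]

print(pairs_per_race, total_morphs, total_faces, wheel_faces, step_deg)
```

Only four of the six possible pairings appear on any single wheel under this assumption, which is why the full stimulus set (236 faces) is larger than one wheel (80 faces).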

Procedure

Each participant completed a 1-hour session, comprising eight practice trials (four/race) followed by 240 test trials. The race of face was blocked such that half of the participants were presented with Caucasian faces first and the other half with East Asian faces first.

Each trial began with a sequential presentation of two faces (e.g., 90%A–10%B; 55%C–45%D) that were chosen randomly from the face space (could be anchor or morphed faces), followed by a delay period of 900 ms, and then a face wheel (see Fig. 1). The two faces were cued by different colors (red or green) and were presented sequentially for 1,500 ms each, with a 150-ms interstimulus interval. A 1,500 ms presentation time ensures full encoding of each face in VWM (Lorenc et al., 2014). One of the two faces was randomly assigned as the target face and the other as the nontarget face. Participants were unaware of which face was the target and were instructed to memorize both of them. After the 900-ms delay, a red or green rectangle appeared in the center of the screen indicating which face was the target. Eight randomly chosen and equidistant faces from the face wheel were presented around the central target item at equal intervals. Participants were instructed to locate the target face by using a computer mouse to select a point on the face wheel. While they moved the mouse along the face wheel, the face in the center changed simultaneously to indicate the face they were selecting. Like the composition of the face wheel, both the color (red/green) and the position (first/second) of the target were randomized across trials. Participants proceeded at their own pace and were asked to be as accurate as possible in their decision.

Data analysis

Overall response error

Response error was calculated for each trial as the angular deviation (in degrees; −180° to 180°) between the correct position of the target face on the face wheel and the position of the face reported by the participant. To obtain a generic measure of the overall precision of responses, we calculated the reciprocal of the standard deviation (1/SD) of response error across trials separately for own-race and other-race faces.
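As a minimal sketch of this computation, the signed error can be wrapped onto the circle and summarized as 1/SD. The function names are hypothetical, and the use of the ordinary sample standard deviation (rather than a circular standard deviation) is an assumption, since the text does not say which was used.

```python
import numpy as np

def response_error_deg(reported_deg, target_deg):
    """Signed angular deviation in degrees, wrapped into [-180, 180)."""
    diff = np.asarray(reported_deg, dtype=float) - np.asarray(target_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0

def overall_precision(errors_deg):
    """Reciprocal of the sample standard deviation of the errors (1/SD)."""
    return 1.0 / np.std(errors_deg, ddof=1)

# A response at 10 deg to a target at 350 deg is a +20 deg error, not -340 deg
errors = response_error_deg([10.0, 350.0, 95.0], [350.0, 10.0, 90.0])
print(errors, overall_precision(errors))
```

Wrapping matters because the face wheel is circular: without it, responses just across the 0°/360° boundary from the target would be scored as near-maximal errors.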

To further identify the sources of increased response error for other-race faces, we fit the raw error using two models: a variable-precision model, in which precision of face representations varies across items and trials (Fougnie, Suchow, & Alvarez, 2012; van den Berg, Shin, Chou, George, & Ma, 2014), and an equal-precision model, in which each face representation is assumed to have equal precision (Bays & Husain, 2008; Zhang & Luck, 2008). The general method for both model types involves finding the maximum likelihood of a mixture of distributions, which are fit to the raw error.

Variable-precision model

In the case of the variable-precision model, the precision of responses is assumed to vary according to a higher order, truncated normal distribution (Fougnie et al., 2012). The model therefore takes the following form:

$$ p\left(\widehat{\theta}\right)=\left(1-\gamma \right)\psi \left(\widehat{\theta}-\theta \right)+\gamma \frac{1}{2\pi } $$

where γ represents the proportion of trials on which the participant is randomly guessing (i.e., a flat distribution). The error term on the remaining trials is defined as the difference between the target face (θ) and the face selected by the participant ($ \widehat{\theta} $); these responses fall under a wrapped Student’s t distribution (ψ). The model, therefore, returns three parameters of interest: the proportion of trials on which the participant is assumed to be guessing (γ), the mean standard deviation of responses on remaining trials (trials on which they did report the target; inverse of precision), and the standard deviation of response error on these remaining trials (reflecting intertrial variability in precision). A larger standard deviation of response error indicates more variability in the quality of the face representation stored across items and trials.

Equal-precision model

We also fit an equal-precision model to each participant’s data set for own-race and other-race faces. We used the three-component model (Bays et al., 2009; Bays, Gorgoraptis, Wee, Marshall, & Husain, 2011), described by the following equation:

$$ p\left(\widehat{\theta}\right)=\alpha {\phi}_{\kappa}\left(\widehat{\theta}-\theta \right)+\beta \frac{1}{m}\sum \limits_{i=1}^{m}{\phi}_{\kappa}\left(\widehat{\theta}-{\varphi}_i\right)+\gamma \frac{1}{2\pi } $$

where α, β, and γ represent the probability of reporting the correct target face, the probability of mistakenly reporting the nontarget face, and the probability of responding randomly, respectively. Here, α + β + γ = 1. In addition, θ represents the correct location of the target face, and $ \widehat{\theta} $ represents the location of the face reported by the participant. The von Mises (circular normal) distribution is ϕ_κ, with mean zero and concentration parameter κ. Greater κ indicates a more concentrated von Mises distribution. The number of nontarget faces is m (in this case, m = 1), and {φ₁, φ₂, …, φ_m} are the locations of the m nontarget faces. Thus, according to this model, the overall response distribution comprises a mixture of three components (Bays et al., 2009): (1) the proportion of trials on which the participant is assumed to be guessing; (2) target (correct) responses, from a von Mises distribution centered on the target face, indicating the probability that perceivers correctly remembered the target face; and (3) nontarget responses, drawn from the same von Mises distribution but centered on the nontarget face (i.e., the distractor face), indicating the probability of a swap error.
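The three-component density can be written out directly. The following is a sketch for illustration, not the MemToolbox implementation; the function names and the example parameter values are ours, and angles are assumed to be in radians.

```python
import numpy as np

def von_mises_pdf(x, kappa):
    """Von Mises density with mean zero and concentration kappa (x in radians)."""
    return np.exp(kappa * np.cos(x)) / (2.0 * np.pi * np.i0(kappa))

def mixture_pdf(resp, target, nontargets, alpha, beta, gamma, kappa):
    """Density of a response under the target + swap + guess mixture."""
    assert np.isclose(alpha + beta + gamma, 1.0), "weights must sum to 1"
    nontargets = np.atleast_1d(nontargets)
    target_part = alpha * von_mises_pdf(resp - target, kappa)
    # Average over the m nontarget locations (m = 1 in Experiment 1)
    swap_part = beta * np.mean(
        [von_mises_pdf(resp - phi, kappa) for phi in nontargets], axis=0
    )
    guess_part = gamma / (2.0 * np.pi)  # flat distribution on the circle
    return target_part + swap_part + guess_part

# Hypothetical parameter values, chosen only to show the shape of the density
grid = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
density = mixture_pdf(grid, 0.5, [2.0], alpha=0.6, beta=0.2, gamma=0.2, kappa=8.0)
print(density.sum() * (2.0 * np.pi / grid.size))  # numerical integral over the circle
```

Because each component is itself a proper density on the circle and the weights sum to one, the mixture integrates to one, which the last line checks numerically.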

The proportion of correct responses can also be transformed into an estimate of the number of successfully maintained faces by multiplying the probability of correct responses by the set size (e.g., n = 2 in Experiment 1) for both own-race and other-race faces.

For all model fits, maximum likelihood estimates of the mixture parameters for each participant and face race were obtained using an expectation-maximization algorithm implemented with MemToolbox 1.0 (Myung, 2003; Suchow, Brady, Fougnie, & Alvarez, 2013).
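To illustrate what maximum likelihood estimation of mixture parameters involves, the sketch below fits a simplified two-component model (target plus guessing, no swap component) to simulated errors. It deliberately substitutes a coarse grid search for the expectation-maximization algorithm used in the study, and the simulated data, parameter values, and function names are all hypothetical.

```python
import numpy as np

def von_mises_pdf(x, kappa):
    """Von Mises density with mean zero (x in radians)."""
    return np.exp(kappa * np.cos(x)) / (2.0 * np.pi * np.i0(kappa))

def neg_log_likelihood(errors, guess_rate, kappa):
    """Negative log-likelihood of the errors under a target + guess mixture."""
    dens = (1.0 - guess_rate) * von_mises_pdf(errors, kappa) + guess_rate / (2.0 * np.pi)
    return -np.sum(np.log(dens))

def fit_grid(errors, guess_grid, kappa_grid):
    """Return the (guess_rate, kappa) pair on the grid with the highest likelihood."""
    best = min(
        (neg_log_likelihood(errors, g, k), g, k)
        for g in guess_grid
        for k in kappa_grid
    )
    return best[1], best[2]

# Simulated data: 25% random guesses, remaining errors von Mises with kappa = 6
rng = np.random.default_rng(0)
n_trials = 2000
true_guess, true_kappa = 0.25, 6.0
is_guess = rng.random(n_trials) < true_guess
errors = np.where(
    is_guess,
    rng.uniform(-np.pi, np.pi, n_trials),
    rng.vonmises(0.0, true_kappa, n_trials),
)

guess_hat, kappa_hat = fit_grid(
    errors, np.linspace(0.0, 0.9, 91), np.linspace(0.5, 20.0, 79)
)
print(guess_hat, kappa_hat)
```

A grid search is slow and limited to the grid resolution, but it makes the objective explicit: the fitted parameters are simply those under which the observed errors are most probable.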

Results

Overall response error

The distribution of errors for own-race and other-race faces is shown in Fig. 2. A paired-samples t test revealed a significant main effect of face race, t(14) = 3.69, p = .002, Cohen’s d = 0.95; overall, participants had smaller response errors for own-race faces (M_SD = 56.61°) than for other-race faces (M_SD = 69.66°).
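The reported effect size is consistent with the common paired-samples convention d = t/√n. This is an assumption on our part, as the text does not state which formula was used; the sketch below only shows that the convention reproduces the reported value.

```python
import math

def cohens_d_paired(t_value, n_participants):
    """Cohen's d for a paired design, computed as t / sqrt(n) (assumed convention)."""
    return t_value / math.sqrt(n_participants)

d_overall = cohens_d_paired(3.69, 15)  # t(14) = 3.69 with 15 participants
print(round(d_overall, 2))
```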

Variable-precision model

In order to examine the trial-to-trial variability in precision, as well as the proportion of trials in which participants guessed, a variable-precision model was fit to the raw error (see Table 1 for parameter means). Paired-samples t tests revealed a significantly higher guess rate for other-race faces than for own-race faces, t(14) = 3.57, p = .003, Cohen’s d = 0.92, with no difference in precision of VWM for own-race versus other-race faces, t(14) = 0.67, p = .514, Cohen’s d = 0.17, and no difference in variability in the precision of VWM for own-race versus other-race faces, t(14) = 0.634, p = .537, Cohen’s d = 0.16.

Table 1. Mean (and SD) of variable-precision model parameters (Experiment 1)

Equal-precision model

This pattern was confirmed using the equal-precision model. The result of the model fit is plotted in Fig. 2. Paired-samples t tests revealed a lower correct response rate for other-race faces (M = .58) than for own-race faces (M = .78), t(14) = 3.57, p = .003, Cohen’s d = 0.95. The significant difference in the proportion of correct responses was attributable to a significant difference in guess rate (M = .24 vs. .03 for other-race vs. own-race faces), t(14) = 3.36, p = .005, Cohen’s d = 0.88, with no difference in swap errors (M = .18 vs. .19 for other-race vs. own-race faces), t(14) = 0.17, p = .865, Cohen’s d = 0.04. The change in guess rate reflects a diminished number of stored faces for other-race (k = 1.16) relative to own-race (k = 1.56) faces. Notably, we did not detect any difference between the precision of VWM for own-race and other-race faces, t(14) = 0.74, p = .472, Cohen’s d = 0.19, as indicated by comparable standard deviations of von Mises distributions for own-race faces (35.26°) and other-race faces (32.34°).
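The reported capacity estimates follow from multiplying the proportion of correct (target) responses by the set size of two, and the three mean mixture proportions for each face race sum to one. A short check, using only the means reported above (the dictionary layout is ours):

```python
set_size = 2  # two faces were presented on each trial in Experiment 1

# Mean mixture proportions as reported in the text
fits = {
    "own-race": {"target": 0.78, "swap": 0.19, "guess": 0.03},
    "other-race": {"target": 0.58, "swap": 0.18, "guess": 0.24},
}

# Number of faces maintained: k = P(target response) * set size
capacity = {race: round(p["target"] * set_size, 2) for race, p in fits.items()}

# Target, swap, and guess proportions should account for all responses
totals = {race: round(sum(p.values()), 2) for race, p in fits.items()}

print(capacity, totals)
```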

Discussion

When holding two potential target faces in VWM and given ample encoding time, participants made significantly larger errors in their recall of other-race compared to own-race faces, as indicated by the greater angular deviations (SD) between the target face and the face that was reported by the participant. Results of both variable-precision and equal-precision modeling further informed us that the increase in overall errors for other-race faces was attributable to an increased guess rate but not to reduced precision or an increase in swap errors. Under these task conditions, differences in performance between own-race and other-race faces can be attributed to impairments in the encoding, consolidation, and/or retrieval of other-race face representations, rather than a change in either the precision with which remembered faces are stored or an increase in identity confusion.

Experiment 2: Storing one face with limited encoding time

In Experiment 1, participants were given ample time (1,500 ms) to encode each of two faces; one face was then cued for recall. This protocol is maximally sensitive to storage limitations (Bays et al., 2011) and also enabled us to examine the contribution of interference by other faces to the ORE. Encoding limitations are best captured by very brief presentations (Bays et al., 2011). To examine whether any observed differences in Experiment 1 were attributable to differences in encoding, in Experiment 2 we examined whether reducing presentation time (from 1,500 to 200 ms) especially impairs the probability and/or precision of correct responses for other-race faces. To isolate limitations in encoding, we further reduced the set size to one, thus working well below the capacity of VWM observed in Experiment 1.