The relationship between words and their referents is not always arbitrary. In particular, individuals systematically map the linguistic properties of words onto the physical characteristics of shapes. Ramachandran and Hubbard (2001), replicating classic work by Kohler (1947), found that participants almost unanimously matched curvy shapes with the novel word “bouba” and angular shapes with the novel word “kiki.” The Bouba/Kiki effect, which has now been replicated across ages (Maurer, Pathman, & Mondloch, 2006; Ozturk, Krehm, & Vouloumanos, 2013), cultures (Bremner et al., 2013; Davis, 1961; Ramachandran & Hubbard, 2001), and perceptual modalities (Ngo, Misra, & Spence, 2011), may reflect synesthesia-like cross-modal mapping between the roundedness of a visual shape and the roundedness of the mouth when pronouncing its name.

Like two-dimensional geometric shapes, people’s faces and bodies are more or less angular (Enlow & Hans, 1996). Therefore it is plausible that, like shape names, people’s names will seem more suited to them when they are similar in shape to the people themselves: that the name “Lou,” for example, whose pronunciation requires rounding the mouth, will be judged a better fit to a person with geometrically rounded features. Although this hypothesis has not been tested directly, several lines of research suggest its plausibility.

Lea, Thomas, Lamkin, and Bell (2007) provided quantitative evidence for the intuition that some names fit some people better than others. Participants agreed in a majority of cases that certain names (e.g., Bob, Justin) were most suitable for certain facial prototypes (established by previous participants). Although the researchers did not determine what factor(s) make for a good-fitting name, Sidhu and colleagues have more recently shown that the bouba/kiki effect may play a role, extending the phenomenon to “real” names (rather than nonsense words), both in Canada (the “Bob/Kirk” effect; Pexman & Sidhu, 2014) and France (the “Benoit/Éric” effect; Sidhu, Pexman, & Saint-Aubin, 2016). These studies, however, did not establish a direct link between names and the physical shape of people who have them.

The current studies were therefore designed, first, to test for a “social” bouba/kiki effect, such that perceivers associate round and angular names, defined by their vowel content, with round and angular faces, respectively. Second, and more important, the current studies test the affective consequences of “correct” versus “incorrect” name assignment. A significant literature on schema violation (Deliza, MacFie, & Hedderley, 2003; Killen et al., 1996; Michie, Marteau, & Bobrow, 1997; Williams, Weinman, Dale, & Newman, 1995; Yeomans, Chambers, Blumenthal, & Blake, 2008) suggests that social perceivers may not only expect “round” people to have round names, but they may also judge those people more positively when they do, and/or punish them when they do not. Although no research has examined the affective implications of name congruency, other consistency effects, such as the fact that boys with “girls’” names are more prone to disruptive behavior (Figlio, 2007), suggest that having the “wrong” name can have serious social implications for some people.

In sum, we test the hypotheses that (1) names will be judged more suitable when they are congruent in shape with the people they denote, and (2) people whose names match their faces will be judged more positively than people with incongruent names. A final study also tests whether name-face congruency is predictive in an ecologically valid context: the vote shares earned by real Senatorial candidates.

Study 1

As an initial test of a social Bouba/Kiki effect, participants rank-ordered “round” and “angular” names in terms of their suitability for round and angular caricatured faces.

Method

Participants

We aimed for a sample comparable to that used by Lea et al. (2007), described above. Forty-two female and 15 male students at the University of Otago volunteered in exchange for credit toward their first or second year psychology courses (M age = 20.36 years, SD = 5.08). The majority of participants identified as New Zealand European (85%) and spoke English as their first language (91%). All participants in all studies in this paper provided written informed consent and were debriefed.

Stimuli and procedure

Participants were asked to rank order six names in terms of their suitability for ten rounded and ten angular male face caricatures. (Only male faces were used in the current studies, in order to minimize error variance associated with gender, a theoretically irrelevant factor in this context.) The faces were created on the website www.pimptheface.com (see Fig. 1A for examples) to have either exaggerated round or angular features (e.g., round vs. narrow head, puffy vs. thin lips, etc.). Analogously, names consisted of three “round” stimuli (Jono, George, and Lou) and three “angular” stimuli (Pete, Kirk, and Mickey), so-classified based on their use of back versus front vowels, and the shape the speaker’s lips take in their pronunciation.

Fig. 1

Panel A: Examples of stimuli used in Study 1 (top panel) and Study 2

Participants were tested in individual light- and sound-attenuated experimental cubicles containing 21-in. iMac computer workstations running custom-made SuperLab software (SuperLab, 2006). Participants completed 20 randomized trials in which a face appeared on the left side of the screen with the six possible stimulus names below it in random order. Participants used their mouse to drag the names into rank order, then clicked “next” to advance to the next face.

Results

The primary dependent measure was the average rank of congruent names (i.e., the rank of round names for round faces, and the rank of angular names for angular faces). One-sample t-tests showed that this measure was significantly higher (closer to 1) than chance (3.5) for both round (M = 3.32, SD = .29) and angular (M = 3.34, SD = .37) faces, t (56) = −4.46, p < .001, d = .62; t (56) = −3.22, p = .002, d = .43. A paired-samples t-test comparing congruent name rankings between round and angular faces was not significant, t (56) = −.36, p = .72, d = .05. Round faces were as likely to be named congruently as angular faces.
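The chance value and effect sizes above can be checked against the reported summary statistics. The sketch below assumes that chance is the mean of the ranks 1 to k for k names, and that d is the distance of the sample mean from chance in standard deviation units. The function names are illustrative and are not part of the original analysis.

```python
def chance_rank(n_names: int) -> float:
    """Expected mean rank when n_names items are ordered at random (ranks 1..n)."""
    return (n_names + 1) / 2


def cohens_d_one_sample(mean: float, sd: float, reference: float) -> float:
    """Distance of a sample mean from a reference value, in SD units."""
    return (mean - reference) / sd


# Study 1 used six names per trial, so chance is 3.5.
chance = chance_rank(6)

# Means and SDs as reported for round and angular faces.
d_round = cohens_d_one_sample(3.32, 0.29, chance)
d_angular = cohens_d_one_sample(3.34, 0.37, chance)

# Negative d means the congruent names were ranked closer to 1 than chance.
print(chance, round(abs(d_round), 2), round(abs(d_angular), 2))  # 3.5 0.62 0.43
```

The same helper gives chance values of 5.5 and 4.5 for the ten-name and eight-name designs used in the later studies.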

We also examined the data with faces as the unit of analysis. A face was classified as “congruently named” if the average rank of congruent names was higher than the average rank of incongruent names. This analysis revealed that nine of the ten round faces and eight of the ten angular faces were congruently named, a non-significant difference, χ² (1) = .39, p = .53, Φ = .16.

Study 2

A second, replication study used a larger set of names, real faces (see Fig. 1), and a non-psychology student sample. In addition, rather than relying on our informal measurement, we used faces validated against both subjective and objective criteria.

Method

Participants

Forty-six female and 21 male students at the University of Otago took part in the study (M age = 21.85 years, SD = 3.2). Participants, who had not taken classes in psychology, were recruited from a campus job clearing house, and were paid NZ$15 to cover their travel expenses. The majority identified as New Zealand European (73%), and all were fluent in English.

Stimulus development

A stimulus pool of “round” and “angular” faces (all male and Caucasian to minimize error variance) was compiled by drawing on several databases and online sources, including the Facelab database (Rhodes, personal communication), University of Stirling face database (http://pics.psych.stir.ac.uk), and the Karolinska Directed Emotional Faces database (Lundqvist, Flykt, & Öhman, 1998), and supplemented by online searches (e.g., “man with round face”) of Google and Bing images. Faces were chosen if they were relatively round or angular in appearance, forward facing with a neutral or positive expression, and of high image quality. The search resulted in a pool of 267 faces.

We then quantified face roundedness using both objective and subjective methods. The objective method involved a simple measurement of three facial anthropometry landmarks, based on a widely used system designed by Farkas (1994). Specifically, an angle was computed by drawing lines from the base of each ear lobe (landmark “sba” in Farkas’ system) to the tip of the chin (landmark “gn”). These data were supplemented with subjective “roundness” judgments from a series of pre-tests (N = 482 in total, sourced from Amazon’s Mechanical Turk and local participants), made on a scale anchored at 1 (very round) and 9 (very angular). Subjective ratings showed very high inter-rater reliability (Cronbach’s α = .98) and correlated highly with the objective measure of roundness (Pearson’s r = .88), so the subjective ratings were ultimately used to select stimuli.
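One way to compute an angle from three landmarks is sketched below. It assumes that the angle is measured at the chin (gn), between the two lines running to the ear lobe bases (sba), and that landmarks are given as (x, y) pixel coordinates. The text does not specify these details, so the function name, the coordinate convention, and the example points are all hypothetical.

```python
import math


def chin_angle_degrees(left_sba, right_sba, gn):
    """Angle at gn between the lines to the two sba points, in degrees."""
    # Vectors from the chin to each ear lobe base.
    v1 = (left_sba[0] - gn[0], left_sba[1] - gn[1])
    v2 = (right_sba[0] - gn[0], right_sba[1] - gn[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Hypothetical landmark coordinates, not taken from any stimulus face.
angle = chin_angle_degrees((0, 0), (100, 0), (50, 50))
print(angle)
```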

Thirty-two faces that differed significantly in their pretest shape ratings were chosen from this database for use in Study 2: for round versus angular faces, Ms = 2.51 and 6.29, SDs = .58 and .74, t (30) = 16.02, p < .001, d = 5.69. Ten names were chosen using criteria similar to those described in Study 1, five round (Paul, Joe, Lou, George, and Bob) and five angular (Rick, Mike, Kirk, Vic, and Pete).

Procedure

The procedure was identical to Study 1, except for an increased number of trials.

Results

One-sample t-tests showed that the average ranking of congruent names was significantly higher (closer to 1) than chance (5.5) for both round (M = 5.12, SD = .44) and angular (M = 5.14, SD = .37) faces, t (66) = −7.04, p < .001, d = .86; t (66) = −8.03, p < .001, d = .97. A paired-samples t-test comparing congruent name rankings between round and angular faces was not significant, t (66) = −.35, p = .724, d = .05. Round faces were as likely to be named congruently as angular faces.

As in Study 1, we also examined the data with faces as the unit of analysis, which showed that 14 of 16 round faces and 15 of 16 angular faces were congruently named, a non-significant difference, χ² (1) = .37, p = .54, Φ = .11.
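The face-level comparison can be reproduced from the counts alone. The sketch below assumes a Pearson chi-square on the 2 × 2 table of face shape by naming outcome, without a continuity correction, and Φ computed as the square root of χ² divided by the number of faces. These choices are assumptions that happen to match the values reported for this study, and the function names are illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def phi_coefficient(chi2: float, n: int) -> float:
    """Effect size for a 2 x 2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n)


# Rows: round faces, angular faces. Columns: congruently named, not.
chi2 = chi_square_2x2(14, 2, 15, 1)
phi = phi_coefficient(chi2, 32)
print(round(chi2, 2), round(phi, 2))  # 0.37 0.11
```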

Study 3

Study 3 sought to replicate the effects with a larger pool of names that had been validated, like the faces, using objective and subjective criteria, as well as to examine participants’ liking for people with congruent names.

Method

Participants

Without knowing how effects generated on MTurk would compare to those in the laboratory in our paradigm, we conservatively increased our sample size by 50%, aiming to recruit approximately 100 participants. Forty-one female and 55 male MTurk workers (M age = 38.15 years, SD = 10.25) were thus recruited and remunerated US$0.40 for their participation. The majority of the participants lived in the USA (95%) and spoke English as their first language (96%).

Stimuli and procedure

Faces were chosen from the stimulus pool described in Study 2. An analogous method was used to create a pool of stimulus names. Eighty names were initially compiled based on the researchers’ assessment of the New Zealand Department of Internal Affairs’ list of the top 100 boys’ and girls’ names since 1999 (dia.govt.nz, 2011), and from a site listing the top 1,000 baby names in the USA (Babynamewizard.com, 2015). Objective name roundedness was quantified by assigning an ordinal scale score from 1 to 3 for each vowel, based on the mouth shape required to pronounce it, according to the International Phonetic Alphabet (Ladefoged, 1990). For instance, the vowels /i/ and /e/ occur at the front and high in the mouth, and thus were assigned a score of 3. The centrally formed vowel /a/ was assigned a score of 2, and the back and high positioned vowels /u/ and /o/ were assigned a score of 1. Averaging across all vowels in a name, “Gordon,” for example, receives a score of 1 (two vowels of roundedness level 1), while “Maverick” receives a score of 2.7 (two vowels of roundedness level 3, one of roundedness level 2). These data were supplemented with subjective “roundedness” judgments from 200 Mechanical Turk workers, each of whom rated one of two sets of 40 names (20 round and 20 angular based on the aforementioned objective analysis) on a sliding scale anchored at 1 (“Very Round”) and 9 (“Very Angular”). Subjective ratings showed very high inter-rater reliability (Cronbach’s α = .99) and correlated highly with the objective measure of roundness (r = .86), and the subjective ratings were used to select stimulus names. All names and their subjective and objective roundedness appear in the Supplementary Online Material.
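The objective scoring rule can be illustrated with a short sketch. It assumes that each name has already been reduced to a list of vowel symbols and that only the five vowels mentioned above are scored. The mapping, the function name, and the vowel lists are illustrative stand-ins chosen to reproduce the levels reported for the two example names, not phonetic transcriptions from the original materials.

```python
# Vowel scores as described in the text: 1 = most round, 3 = most angular.
VOWEL_SCORES = {"i": 3, "e": 3, "a": 2, "u": 1, "o": 1}


def name_roundedness(vowels):
    """Average the per-vowel scores for one name."""
    scores = [VOWEL_SCORES[v] for v in vowels]
    return sum(scores) / len(scores)


# Simplified vowel lists (assumed): two level-1 vowels for "Gordon";
# one level-2 and two level-3 vowels for "Maverick".
gordon = name_roundedness(["o", "o"])
maverick = name_roundedness(["a", "e", "i"])
print(gordon, round(maverick, 1))  # 1.0 2.7
```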

The procedure was similar to that of Studies 1 and 2, with the following exceptions. Participants rank ordered eight names (four round, four angular), drawn randomly from a larger subset of 36 stimulus names, in terms of their suitability for each of 20 round and 20 angular faces. Both names and faces differed significantly according to pretest ratings described above: for round versus angular names, Ms = 2.99 and 6.45, SDs = .36 and .72, t (33) = 17.69, p < .001, d = 6.41; for round versus angular faces, Ms = 2.35 and 6.73, SDs = .50 and .68, t (38) = 23.29, p < .001, d = 7.42. In addition to ranking the names, participants were also asked to type their top name choice into a box provided below the list of names and, at the bottom of the screen, to rate how much they liked the target individual on a 9-point scale anchored at 1 (“Very little”) and 9 (“Very much”).

Results

One-sample t-tests showed that face shape-congruent names were ranked higher (closer to 1) than chance (4.5) for both round (M = 4.15, SD = .33) and angular (M = 4.12, SD = .30) faces, round t (95) = −10.32, p < .001, d = 1.06; angular t (95) = −12.49, p < .001, d = 1.27. A paired-samples t-test comparing congruent name rankings between round and angular faces was not significant, t (95) = .84, p = .403, d = .09. Round faces were as likely to be named congruently as angular faces. With faces as the unit of analysis, 19 of 20 round faces and 18 of 20 angular faces were congruently named, a non-significant difference, χ² (1) = .36, p = .55, Φ = .09.

To examine the relation between ranking and liking, each face, for each participant, was classified as congruently or incongruently named, based on the mean ranking of shape-congruent names (i.e., above or below 4.5). Liking ratings were then analyzed in a 2 (round versus angular face) × 2 (congruently named versus incongruently named) repeated measures ANOVA. There was a significant main effect for face shape, F (1, 95) = 63.01, p < .001, partial η² = .40, such that angular faces (M = 5.27, SE = .13) were liked more than round faces (M = 4.69, SE = .14). There was also a significant main effect for congruence, F (1, 95) = 22.19, p < .001, partial η² = .19, such that congruently named faces (M = 5.13, SE = .13) were liked more than incongruently named faces (M = 4.83, SE = .14). The interaction did not reach significance, F (1, 95) = 2.87, p = .094, partial η² = .03.
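The effect sizes above follow from the F ratios and their degrees of freedom. The sketch below uses the standard identity for partial η² and is offered only as a consistency check on the reported values. The function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios as reported for Study 3, each with (1, 95) degrees of freedom.
face_shape = partial_eta_squared(63.01, 1, 95)
congruence = partial_eta_squared(22.19, 1, 95)
interaction = partial_eta_squared(2.87, 1, 95)
print(round(face_shape, 2), round(congruence, 2), round(interaction, 2))  # 0.4 0.19 0.03
```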

Study 4

Study 3 again replicated the association between face and name shape, and also showed that people assigned congruent names are liked better than those assigned incongruent names. To establish the causal direction of the latter effect, Study 4 manipulated rather than measured congruency.

Method

Participants

Aiming again for 100 participants per between-subjects condition, we recruited 94 female and 107 male Mechanical Turk workers (M age = 35.95 years, SD = 11.78), remunerating them US$0.40 for their participation. The majority of the participants lived in the USA (86%) or Europe (7%). Most of the participants spoke English as a first language (92%).

Stimuli and procedure

Participants rated their liking for 20 round and 20 angular faces, each of which had been assigned a single round or angular name, on a continuous sliding scale anchored at 1 (not at all) and 9 (very much). Both faces and names differed significantly according to pretest ratings described in Studies 2 and 3: for round versus angular faces, Ms = 2.56 and 6.63, SDs = .54 and .71, t (38) = −20.44, p < .001, d = 6.51; for round versus angular names, Ms = 3.11 and 6.20, SDs = .38 and .61, t (48) = −21.71, p < .001, d = 6.24. Names were randomly assigned to faces, without replacement, for each participant, with the constraint that half of the faces of each type were paired with round names and half with angular names.
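A constrained random assignment of this kind might be implemented as in the sketch below. It assumes that the constraint means half of the round faces and half of the angular faces receive a round name, with the remainder receiving an angular name, and that names are sampled from each pool without replacement. The function name, the pool sizes, and the placeholder labels are hypothetical and do not reproduce the original experiment software.

```python
import random


def assign_names(round_faces, angular_faces, round_names, angular_names, rng):
    """Pair every face with one unique name, balancing congruence within face type."""
    half_r = len(round_faces) // 2
    half_a = len(angular_faces) // 2
    # Round names go to congruent round faces and incongruent angular faces.
    r_names = rng.sample(round_names, half_r + (len(angular_faces) - half_a))
    a_names = rng.sample(angular_names, (len(round_faces) - half_r) + half_a)
    # Shuffled copies, so which faces end up congruent varies by participant.
    r_faces = rng.sample(round_faces, len(round_faces))
    a_faces = rng.sample(angular_faces, len(angular_faces))
    pairs = []
    for i, face in enumerate(r_faces):
        if i < half_r:
            pairs.append((face, r_names.pop(), "congruent"))
        else:
            pairs.append((face, a_names.pop(), "incongruent"))
    for i, face in enumerate(a_faces):
        if i < half_a:
            pairs.append((face, a_names.pop(), "congruent"))
        else:
            pairs.append((face, r_names.pop(), "incongruent"))
    rng.shuffle(pairs)  # randomize presentation order
    return pairs


# Placeholder stimulus labels; the name pool sizes are arbitrary.
round_faces = [f"round_face_{i}" for i in range(20)]
angular_faces = [f"angular_face_{i}" for i in range(20)]
round_names = [f"round_name_{i}" for i in range(25)]
angular_names = [f"angular_name_{i}" for i in range(25)]

pairs = assign_names(round_faces, angular_faces, round_names, angular_names,
                     random.Random(0))
print(len(pairs), sum(1 for _, _, c in pairs if c == "congruent"))  # 40 20
```

Calling the function with a fresh random generator for each participant gives every participant a different pairing while keeping the design balanced.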

On each trial, participants first rated their liking for round and angular individuals on their own (i.e., with no name information). After rating a face, participants were given “an additional piece of information,” the person’s (congruent or incongruent) name, and then had the opportunity to change their original liking rating up or down if they wished (the slider on the scale was set to their original rating). Thus, each face was rated twice, first without a name, and then after the name was revealed.

Results

A 2 (face shape) × 2 (congruence) × 2 (rating stage: before versus after name was provided) repeated measures ANOVA revealed a main effect for rating stage, F (1, 200) = 10.58, p = .001, partial η² = .05; faces were, overall, liked better after they were assigned names (Ms = 4.40 vs. 4.35, SEs = .09). There was also a main effect for face shape, F (1, 200) = 182.80, p < .001, partial η² = .48, with round faces liked overall less than angular faces (Ms = 4.07 vs. 4.68, SEs = .09). There was a marginal main effect for congruence, F (1, 200) = 3.45, p = .065, partial η² = .02. Overall liking ratings for congruently named faces were higher than liking ratings for incongruently named faces (Ms = 4.41 vs. 4.34, SEs = .09).

More importantly, the three-way interaction was significant, F (1, 200) = 4.60, p = .033, partial η² = .022. To interpret the interaction, separate 2 (congruence) × 2 (rating stage) ANOVAs were conducted for round and angular faces. For round faces, there was a significant interaction between congruence and rating stage, F (1, 200) = 6.04, p = .015, partial η² = .03, such that participants liked faces more after learning they had congruent names, t (200) = −6.13, p < .001, d = .38, but not after learning they had incongruent names, t (200) = −1.55, p = .122, d = .12. The interaction was also significant, and stronger, for angular faces, F (1, 200) = 22.49, p < .001, partial η² = .10: faces were liked better when given congruent names, t (200) = −4.55, p < .001, d = .35, and marginally worse when given incongruent names, t (200) = 1.77, p = .078, d = .12. The mean liking ratings according to face shape, face-name congruency, and rating stage are shown in Fig. 2.

Fig. 2

Liking as a function of face shape, face-name congruency, and time of measurement (before vs. after name was revealed), Study 4. Error bars represent standard errors of the mean

Study 5

Study 4 showed that, not only do people “expect” (i.e., implicitly) individuals to have names that match the shape of their face, but, at least in extreme cases, they diminish their estimation of the individuals if they happen to violate this expectation. The effects are robust and, although small, could have implications for seemingly more complex and important social judgments, particularly those that are made spontaneously or heuristically. One such judgment domain is voting decisions: previous research suggests that individuals can very rapidly extract trait information, such as attractiveness (Rosar, Klein, & Beckers, 2008) and competence (Olivola & Todorov, 2010; Todorov, Mandisodza, Goren, & Hall, 2005), from faces and that these judgments predict actual candidates’ success at the polls.

Method

Participants

One hundred and seven female and 92 male Mechanical Turk workers (M age = 40.04 years, SD = 13.0) took part in the study and were remunerated US$0.60. The majority of the participants lived in the USA (93%) and spoke English as their first language (97%).

Stimuli and procedure

The stimulus faces were 158 political candidates who ran for the U.S. Senate between 2000 and 2008 inclusive, selected from races in which the two primary opponents were both male Caucasians. Photographs of each candidate were sourced from the Internet using Google and Bing image searches; they were standardized to be 190 pixels wide (keeping height in proportion), and ranged in quality from 96 to 300 dpi; 96% were in color. Individual vote shares (the proportion of votes won in the election) were obtained from previous research (Olivola & Todorov, 2010; Todorov et al., 2005), courtesy of the authors. When candidates had run in two elections during the period (n = 28), their vote shares were averaged.

Participants made either face shape (N = 94) or name shape (N = 105) ratings of all candidates in random order, using the scale described in Study 2.

Results

Inter-rater reliability, measured as Cronbach’s alpha, was high for both name shape ratings and face shape ratings (α = .97 in both cases). Ratings were averaged across participants in each task to create single face roundedness and name roundedness estimates for each candidate. We then derived a “matching score” by taking the absolute difference of the two standardized estimates for each candidate, such that a higher score indicates a poorer fit between a candidate’s name and face. The mean match score was 1.18, with a standard deviation of 0.84, and a range of 0.02 (Bob Weygand, D-Rhode Island, 2000) to 3.91 (Rocky Raczkowski, R-Michigan, 2002) (Fig. 3).
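The matching score can be sketched as follows. The example assumes that each set of mean ratings is standardized across candidates using the sample standard deviation, which the text does not specify, and it uses made-up ratings for four hypothetical candidates rather than the actual data. The function names are illustrative.

```python
import statistics


def standardize(values):
    """Convert raw values to z-scores (sample SD assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def matching_scores(face_ratings, name_ratings):
    """Absolute difference between standardized face and name roundedness."""
    z_face = standardize(face_ratings)
    z_name = standardize(name_ratings)
    return [abs(f - n) for f, n in zip(z_face, z_name)]


# Made-up mean ratings on the 1 (very round) to 9 (very angular) scale.
face_ratings = [2.0, 4.0, 6.0, 8.0]
name_ratings = [8.0, 4.0, 6.0, 2.0]

scores = matching_scores(face_ratings, name_ratings)
# A higher score means a poorer fit between name and face.
print([round(s, 2) for s in scores])
```

In this toy example the second and third candidates have identical standardized face and name ratings, so their scores are zero, while the first and fourth candidates are maximally mismatched.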

Fig. 3

Panel A: the two Senatorial candidates in Study 5 with the best fit between their names and faces. Panel B: the two Senatorial candidates in Study 5 with the worst fit between their names and faces

The overall correlation between match score and vote share (the number of votes cast for a candidate relative to the total number cast in the election) was negative and nonsignificant, r (158) = −.08, p = .34 (Fig. 4).

Fig. 4

The overall correlation between matching score and vote share across all politicians

However, matching scores were overall moderate in size, in contrast to the stimuli in our laboratory studies, which were extreme and bimodally distributed by design. Thus, as a more powerful test, we also compared extremely well-matched and extremely poorly matched candidates (i.e., those more than 1 SD below and above the mean match score, respectively, comparable to the stimuli used in Studies 1–4), which revealed an advantage for well-named candidates: those with congruent names earned a greater proportion of votes (M = .57, SD = .11) than those with incongruent names (M = .47, SD = .16), t (51) = 2.45, p = .018, d = .67.

Discussion

Overall, our results tell a consistent story. People’s names, like shape names, are not entirely arbitrary labels. Face shapes produce expectations about the names that should denote them, and violations of those expectations carry affective implications, which in turn feed into more complex social judgments, including voting decisions (Abelson, Kinder, Peters, & Fiske, 1982; Ballew & Todorov, 2007; Chandrashekar et al., 2009; Todorov et al., 2005; Yeomans et al., 2008). The latter effect is particularly noteworthy, even if the congruency effect was limited to a subset of highly (in)congruent candidates. The fact that candidates with extremely well-fitting names won their seats by a larger margin (10 points) than obtains in most American presidential races (Friedman, 2015) suggests the provocative idea that the relation between perceptual and bodily experience could be a potent source of bias in some circumstances. We note, however, that the effects in the current studies were of medium magnitude, but of small absolute size, and their importance in other “real-world” applications will depend on the weight given to improved prediction in a given context.

We also note that shape congruency is not the only way that people might (mis)fit with their names, nor are names the only feature that might vary in their suitability. For example, research has found an association between oral-somatosensory experience (in a sample of sparkling water) and angular shapes (Chandrashekar et al., 2009); it is possible that people with angular features or names are expected to have more “sparkly” personalities or more energetic behavior. Indeed, Sidhu and Pexman (2015), who extended the bouba-kiki effect to real names (see the Introduction), also found that participants associated those names with shape-congruent traits (e.g., reflecting a “round and curvy personality”). Other research has found an association between lower tones and larger objects (Sapir, 1929), perhaps leading to expectations about the pitch of a person’s name or voice as a function of their physical size. These examples, which are easily open to empirical test, suggest that social judgment involves not only amodal application of stored information (e.g., stereotypes) to new stimuli, but also, as some “embodiment” theorists argue (e.g., Niedenthal, Barsalou, Winkielman, Krauth-Gruber, & Ric, 2005), an integration of perceptual and bodily input.