Does the language we speak shape the way we think about the world? This question has been debated for more than half a century, and was developed into the tenet of the linguistic relativity hypothesis, or the Sapir-Whorf hypothesis, formulated in the 1950s. Ever since it came to prominence in the linguistic field, the linguistic relativity hypothesis has been highly controversial in such disciplines as anthropology, psychology, education, and linguistics. Lucy (1997) noted that “[f]ew ideas generate as much interest and controversy as the linguistic relativity hypothesis…” (p. 291). Twenty years after Lucy’s (1997) claim, the situation remains largely the same. What is different from before, however, is that more rigorous scientific studies with multiple approaches and methods have been conducted to test and elucidate linguistic relativity in recent decades. What has made the hypothesis so controversial and, at the same time, so interesting? The long-standing, die-hard interest, despite intense criticism by a certain school of thought, suggests that the hypothesis has something significant at its core. The premise of the language-thought connection has also led to more sophisticated questions as to whether language functions as a lens or a mirror (or both).

Kuhn’s (2012) notion of the paradigm shift applies to linguistic relativity as well. As one of the epigraphs above shows, Kuhn (2012) explains the development of paradigm shifts in science. Kuhn uses the phrase normal science to refer to traditional scientific activities, including answering specific questions, collecting data, and making interpretations based on the data collected. According to Kuhn (2012), in the process of normal science, anomalies emerge that cannot be explained by an existing paradigm. When anomalies have accumulated against a current paradigm, the scientific discipline calls for extraordinary research, which is exploratory in nature, to address the anomalies accrued. As a result of extraordinary research on the anomalies, a new paradigm is formed; this is what is referred to as a paradigm shift. A paradigm shift encounters resistance. As the new paradigm gradually gets accepted and goes through gestalt-like changes, however, the old paradigm eventually dies (Kuhn, 2012). In the long run, the new paradigm becomes the dominant one.

The controversy of linguistic relativity has led to a wide range of laboratory studies as a traditional approach (i.e., normal science) and established a foundation for a paradigm shift by extensively exploring linguistic and nonlinguistic domains as extraordinary research in relevance to our thinking. Hacking (2012) notes that “[w]e have a tendency to see what we expect, even when it is not there. It often takes a long time for an anomaly to be seen for what it is, something contrary to the established order” (Hacking, 2012, p. xxvi). The opposition to Whorfianism has shown the inability to explain differences shown by different language groups. With the technological advances, brain imaging research has become available. Especially given that adults’ brains are reshaped as a result of literacy (see the second epigraph), the impact of reading on our cognition warrants a new treatment as a paradigm shift.

Since the linguistic relativity hypothesis has gone through an unprecedented cycle of acceptance and dismissal for more than five decades, this chapter first reviews the heated debate over the hypothesis, focusing on the evolution and dismissal of the hypothesis, followed by accounts of why and how it was dismissed. Next, empirical evidence that has been accrued in multiple disciplines in recent decades is reviewed. This chapter ends with an expansion on the linguistic relativity hypothesis to the script relativity hypothesis.

1 The Evolution and Dismissal of the Linguistic Relativity Hypothesis

The idea of the linguistic relativity hypothesis was incubated in the early 1900s, evolving from an ethnolinguistic inquiry. The idea that language and thought were intertwined was first indirectly expressed by Wilhelm von Humboldt, who saw language as the key to understanding the worldviews of its speakers and who observed relations between language and the mind in his cultural study of Kawi, a literary language in Java (Odlin, 2005). The proposal was further refined by Franz Boas, Edward Sapir, and Benjamin Lee Whorf in the mid-1900s (Koerner, 1992). Among them, Whorf became the primary figure of the linguistic relativity hypothesis with his research into the language of the Hopi Indians of Arizona and his comparison of temporal markings between Hopi and English in the 1930s. Whorf attempted to explain the way in which language and syntactic systems affected human perception and ideas through his study of the Native American language. Whorf (1940) argued “… the background linguistic system (in other words, the grammar) of each language is not merely a reproducing instrument for voicing ideas but rather is itself a shaper of ideas…” (p. 212; cited in Koerner, 1992, p. 181). Although Whorf lacked an advanced degree in linguistics and was a fire prevention engineer and inspector for an insurance company with a degree in chemical engineering from the Massachusetts Institute of Technology, his insights were considered prudent in providing anecdotal ethnographic evidence and were highly regarded by linguistic authorities, such as Boas, Sapir, Bloomfield, and Lucy. Lucy (1997) notes that, although Whorf did not have formal training in psychology and linguistics, his work in linguistics is still considered to be of outstanding quality. After Whorf’s premature death in 1941 at age 44, a book entitled Language, Thought and Reality was published posthumously in 1956, compiling unpublished papers that he had left behind. The thesis of Whorfianism was continuously developed by linguists, psychologists, and anthropologists who investigated the effect of habitual use of language on habitual thinking and cognition.

Although Whorf himself did not put forth the strong deterministic effect of language on thinking, the hypothesis was later interpreted in two versions: (1) linguistic determinism as a strong version that posits that language determines thought and cognition and (2) linguistic relativity as a weak version that postulates that linguistic categories and habitual use of language affect our thought patterns (Pinker, 1994). The first view was the main source of strong opposition and quickly fell out of favor among scholars. The second view has received both acceptance and extreme dismissal over time. However, it has been repeatedly espoused by many scholars who argue that language indeed influences certain areas of cognition or cognitive processes.

Although many scholars believe that Whorf subscribed to linguistic determinism, another camp of scholars, such as Lee (1997) and Lucy (1992, 1997, 2016), reinterprets Whorf’s view based on his words, and claims that Whorf did not subscribe to the linguistic deterministic view. Schwanenflugel, Blount, and Lin (1991) seem to join the camp of Lee (1997) and Lucy (1997). They note that “Whorf’s major points appear to be arguments against the simplistic view that languages are directly translatable, category for category and word for word. His linguistic analyses were accordingly designed to highlight differences in grammatical and lexical patterns and to argue that a speaker must adhere to the patterns of his/her specific language in order to be understood” (p. 73).

Two types of examples are dominant in cross-linguistic comparisons under the notion of linguistic relativity: Lexical differentiation and grammatical differentiation. At the lexical level, Whorf argued that the way in which languages differentiate concepts in domains was different according to the culturally significant meaning assignment showing the high concentrations of differentiation in words in some domains and low concentrations in others. A well-known example is the statement that the Eskimo languages, including Yupik and Inuit, have a much larger number of words for “snow” in the lexicon than English. Whorf claimed that “[w]e [English speakers] have the same word for falling snow, snow on the ground, snow hard packed like ice, slushy snow, wind-driven snow--whatever the situation may be. To an Eskimo, this all-inclusive word would be almost unthinkable...” (Carroll, 1956, p. 216). Another example is that the American Indian language of Hopi uses an umbrella word to refer to everything that flies except birds; that is, the same word is used for insects, airplanes, aviators, etc. (Carroll, 1956). Whorf’s lexical examples received criticisms that resulted from a different view on morphological differentiations. Regardless of the focus of the debate, it suggests that each language has its own way of differentiating lexical domains, which is different across languages. The real question is whether or not linguistic variations yield differences in thinking and thought patterns.

At the syntactic level, languages differ in the use of word order or morphology to represent meaning. Whorf claimed that grammatical classifications or distinctions would also impact individuals’ ways of thinking. Relatedly, the syntactic ordering of subject-verb-object (SVO) is the norm in English. In principle, each sentence begins with a noun or pronoun, followed by a verb (and then by another noun or noun phrase or ends with only S+V). This overt rule may reinforce a reliance on the subject and its action or description. Li and Thompson (1976) dub English a subject-prominent language. In contrast, Japanese and Korean use an SOV order, in which the subject is most of the time omitted in the sentence. Even objects are at times omitted in the sentence, but the speaker and the listener do not have difficulty understanding the meaning of the sentence or message. Japanese and Korean are called topic-prominent languages or context-bound languages in that sentences are structured around a given topic and that contextual cues play a significant role in deciphering the sentence. The SOV word order and null-subject usage in the Japanese and Korean languages may have to do with context-focused problem-solving strategies Japanese and Korean people typically use, as discussed in Chapter 6.

Whorf’s hypothesis indicating that the habitual use of language affects habitual thinking and behavior has been challenged mostly by nativists or universalists from the 1960s through the 1980s. Opponents, such as Chomsky and Pinker, criticize Whorf’s hypothesis for implausibility or lack of logic in the accounts of how language affects thought and for Whorf’s arguments being in the form of anecdotes and speculations without hard evidence. The nativists argue that all languages share a common underlying structure that is largely innate. They believe that linguistic differences across languages are at the surface and do not make differences in the universal linguistic processes of the brain. Since they believe that all human beings possess the same set of psychological faculties, biological construction, and neural configuration, similar cognitive patterns are expected to show in language use across different language speakers; as a result, cultural variability is of less importance.

As a vehement opponent, Pinker (1994) criticizes Whorf’s hypothesis, in his book The Language Instinct, as a “conventional absurdity: a statement that goes against all common sense…” (p. 47). He also mentions “… the more you examine Whorf’s arguments, the less sense they make” (p. 50) and “[a]s a cognitive scientist I can afford to be smug about common sense being true (thought is different from language) and linguistic determinism being a conventional absurdity” (p. 57). He goes on to assert that “[p]eople do not think in English or Chinese or Apache; they think in a language of thought” (p. 72), which is a meta-language called mentalese, and that “[k]nowing a language… is knowing how to translate mentalese into strings of words and vice versa” (p. 73).

As shown in his words, Pinker equated Whorfianism with the strong version, linguistic determinism, which can be seen as a misinterpretation of Whorf’s claim. Considering that the notion of strong and weak versions of Whorfianism was posthumously invented by other scholars, there is no evidence that Whorf himself advocated determinism. In his later book, Pinker (2007) continues to debunk the linguistic relativity hypothesis by again relying on the strong version of linguistic determinism. Ironically, he essentially acknowledges linguistic relativity, as shown in his own words: “[l]et me say at the outset that language surely affects thought--at the very least, if one person’s words didn’t affect another person’s thoughts, language as a whole would be useless” (p. 125). However, he still erroneously equates the hypothesis with determinism and tries to make Whorfianism “banal” (p. 126).

Malotki (1983) was an anthropologist who rejected Whorfianism. He argued that, contrary to Whorf’s claim, the Hopi language contains a series of time-related linguistic features, such as tense, metaphors for time, and time units (e.g., days, weeks, months). Lee (1991, 1997) directly criticized Malotki’s (1983) analysis of adverbial particle “tensors” as problematic and invalid. Lee also contended that, since his interest was geared toward showing that Hopi was similar to English, Malotki overlooked how Hopi grammar and time concepts were different from those of English.

There was an additional group of scholars who were opposed to linguistic relativity. Following Lenneberg’s line of inquiry, Berlin and Kay (1969) continued color research and indicated that the formation of color terminology was universal based on the three core color names (i.e., black, white, and red) commonly found across cultures. Berlin and Kay endorsed universal typological color principles, which were regarded to be determined by physical-biological universals, not by linguistic factors. However, Lucy (1992) criticized Berlin and Kay’s interpretation of their findings, arguing that the results of their study actually did not disprove linguistic relativity in color naming mainly because of questionable assumptions and data-related problems that were contained in their study of basic color terms. Due to the controversial accounts of linguistic relativity and conflicting research results, the debate has been continuing.

2 Rekindled Interest in the Linguistic Relativity Hypothesis

In the midst of the criticism on the linguistic relativity hypothesis, Fishman (1982) attempted to expand on Whorfianism as an intrinsic cultural value. He suggested that Whorfianism be the third kind above and beyond the linguistic relativity and linguistic determinism hypotheses. This third kind of hypothesis supports ethnolinguistic diversity as an intrinsic value of societal assets to promote pan-human creativity, problem solving, and mutual cross-cultural acceptance. He viewed this third kind as a “valuable humanizing and sensitizing effect on the language-related disciplines” (p. 1).

This line of refocusing on the linguistic relativity hypothesis continued in the late 1980s and early 1990s when cognitive linguistics solidified its way. Lakoff (1987) argues in his book Women, Fire and Dangerous Things: What Categories Reveal about the Mind that language is used metaphorically and that our knowledge is organized by the mapping of idealized cognitive models which are a by-product of category structures and cultural metaphors. In his elaboration on cultural metaphors, Lakoff (1987) revisits linguistic relativity focusing on how linguistic categorizations influence mental categories. He asserts that opponents have used different parameters to describe linguistic relativity to the degree that their criticisms are not fully grounded in the tenet of linguistic relativity. He also stresses that misunderstanding and confusion got in the way of opposition by noting “[t]he point is to show that there is not one concept of relativism but literally hundreds and that much of the emotion that has been spent in discussion of the issue has resulted from confusions about what is meant by ‘relativism’” (p. 304). Lakoff (1987) continues to assert that the dismissal of relativism was a result of “… scholarly irresponsibility, fuzzy thinking, lack of rigor, and even immorality” (p. 304). When it comes to different conceptual systems across languages, the degree, depth, nature, and locus of variations need to be scientifically addressed above and beyond the monolithic system issue.

A stockpile of studies accumulated by Lucy (1992, 1997), Lee (1991), and Levinson and colleagues (Bowerman & Levinson 2001; Gumperz & Levinson, 1996; Levinson, 2003) shows how the linguistic relativity hypothesis was misinterpreted, and also suggests a nuanced approach to study how language is intertwined with speakers’ cognition and mental processes. Levinson (2003) points out how the view of Simple Nativists was “simply ill informed” (p. 28). He continues indicating that “… Simple Nativism has outlived its utility; it blocks a proper understanding of the biological roots of language, it introduces incoherence into our theory, it blinds us to the reality of linguistic variation and discourages interesting research on the language-cognition interface.” (2003, p. 43).

Hunt and Agnoli (1991) indicate from a perspective of cognitive psychology that thought is related to variations in the lexicality, syntax, semantics, and pragmatics of language, and that different languages bring up different challenges and support for the cognition of diverse speakers. They also note that “[t]he Whorfian hypothesis is properly regarded as a psychological hypothesis about language performance and not as a linguistic hypothesis about language competence” (p. 387). Rediscovering Whorf’s insights, Lee (1997) argues that relativism has significant implications for pedagogy and education such that accepting the language-mind-experience relationship would facilitate teaching and thinking.

Another effort to rethink and reformulate linguistic relativity has been made with an anthology entitled Rethinking Linguistic Relativity edited by Gumperz and Levinson (1996). The compilation of articles focuses on cognitive and social aspects of linguistic relativity, ranging from the cognitive processes of spatial semantic categories to the linguistic and cultural relativity of inference, and includes both pro-Whorfian and anti-relativist perspectives. The collection covers language-specific effects on cognition as well as cross-linguistically and cross-culturally specific and universal constructs. In addition, it covers not only language and linguistic structures that are situated within particular cultural contexts, but also the ramifications of linguistic and cultural concepts as well as language use and the variability of language. This line of resurrected interest has been extended to conceptual discussions in cross-language or second language studies (Bylund & Athanasopoulos, 2014; Casasanto, 2008; Cook & Bassetti, 2011).

3 Empirical Evidence for Linguistic Relativity

Lucy (1997) laments that, although linguistic relativity has drawn a long-standing historical interest from scholars of multi-disciplines, there has been a paucity of empirical studies, compared to other subjects. There are several reasons for the lack of empirical studies. First, as indicated in Chapter 1, it has to do with the interdisciplinary nature of the hypothesis, which makes the specialization of approach and methodology difficult to reconcile among different disciplines (Lucy, 1997, 2016). Second, it is related to the fact that, as briefly discussed earlier, some scholars equate Whorfianism with determinism, which has led to misinterpretations, unjust treatments of the hypothesis, and prejudices and biases (Lucy, 1997). Third, the intricately interwoven nature of language and cognition has also made empirical research challenging. Whorf discussed many linguistic classifications, but they were difficult to disentangle without assessing language independently of cognition. Boroditsky (2001) also points out a challenge involved in research of linguistic relativity. Although comparison studies have been conducted in different languages, a lack of instruments that are comparable to and reliable in each language imposes huge difficulties in the interpretation of results. The next challenge is related to nonlinguistic tasks used in the research. Although tasks are claimed to be nonlinguistic, it is difficult to ensure that nonlinguistic tasks are not reinforced or affected by the participant’s language due to the nature of interrelatedness between language and cognition and between language and human behavior. Last, Whorf’s views did not fit well with the tradition of behaviorists in psychology that prevailed at the time nor with subsequent nativism that was pioneered by Chomsky in the 1950s.

Lucy (1997) summarizes empirical research into linguistic relativity in three main approaches, focusing on language, thought, and reality as the central orientations: structure-centered, domain-centered, and behavior-centered approaches. The structure-centered approach focuses on the lexicogrammatical structures of languages and examines structural differences between languages as well as their possible implications for thought and reality (e.g., number, gender, aspect markings). The three key elements of language, thought, and reality are closely interrelated such that “[l]anguage embodies an interpretation of reality and language can influence thought about that reality” (Lucy, 1997, p. 294; emphasis in original). Human thought not only is closely linked to perception and attention, but also regulates the personal, sociocultural, and linguistic systems of classification, inference, and memory. The domain-centered approach involves the domains of experienced reality as well as the way in which a language encodes and construes semantic categories (e.g., color, time, space). Lastly, the behavior-centered approach concerns practical matters in relation to the behavioral aspects of the linguistic system (e.g., usage-based analysis).

Besides the three main foci on language, thought, and reality, other conceptual and methodological considerations are worth mentioning. First, the parameter of differences in languages needs to be defined. This has been addressed by looking at the presence or absence of a particular linguistic marker in the languages under comparison. Another way is to address how the differences, if any, are manifested in the languages being compared. Second, if a language shapes or affects the speaker’s cognition or thought patterns, the degree to which the language affects cognition needs to be defined, clarified, and identified. Third, differences in cognition or thought patterns also need to be defined. Since cognition and thought patterns are latent constructs, they are difficult to measure. Therefore, research has taken an indirect route to examine color perception, time perception, number perception, and so on. As indicated in Chapter 1, the opponents of linguistic relativity claim that evidence should come from nonverbal behavior in order to make linguistic relativity tenable. However, it is difficult to draw a distinct line between language and cognition because these two have an interlocking relationship that has been formed since infancy (Perszyk & Waxman, 2018). Although perceptual and conceptual domains, such as color, time, number, and space, can be considered nonverbal, whether they truly are remains an open question because linguistic representations associated with these concepts are bound to be activated in the performance of tasks that elicit color, time, number, and space concepts.

With these issues related to research in Whorfianism in mind, a review of scientific evidence that supports or refutes linguistic relativity is in order. Research on first language influences on thinking is first reviewed and then studies of cross-language transfer in relation to linguistic relativity are discussed.

3.1 Studies of First Language Influences on Cognition among Various Language Communities

3.1.1 Color

Zipf’s (1935) law refers to the inverse relationship between the frequency of a word and its rank in the frequency table as well as a negative correlation between the length of a word and its frequency of usage. The higher the frequency of a word, the shorter the word. This notion was used in Brown and Lenneberg’s (1954) study of color codability based on the relationship between codability and ease of expression. Brown and Lenneberg asked college students to name 24 different colors and examined their reaction time. They found that colors with longer names (meaning less codable or less focal, according to them) took longer time, produced less agreement among the participants, and produced less consistency from one time to another.
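
As a rough illustration of the two relationships just described, the following minimal Python sketch ranks the words of a short sentence by frequency and records each word's length. The sample sentence, the function names (rank_frequency_table, idealized_zipf_frequency), and the exponent of exactly 1 in the idealized formula are assumptions introduced here for illustration only; they are not taken from Zipf (1935) or from Brown and Lenneberg (1954).

```python
from collections import Counter


def rank_frequency_table(tokens):
    """Return (rank, word, frequency, length) rows, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending frequency; ties are broken alphabetically so the
    # ordering is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, freq, len(word))
        for rank, (word, freq) in enumerate(ranked, start=1)
    ]


def idealized_zipf_frequency(top_frequency, rank):
    """Idealized inverse rank-frequency relation: f(rank) = f(1) / rank.

    Assumption: an exponent of exactly 1, the simplest textbook form.
    """
    return top_frequency / rank


# Hypothetical toy sentence. It is far too small to test the law and only
# shows the bookkeeping involved.
SAMPLE = "the cat saw the dog and the bird saw the extraordinary caterpillar".split()

if __name__ == "__main__":
    table = rank_frequency_table(SAMPLE)
    top_frequency = table[0][2]
    for rank, word, freq, length in table:
        expected = round(idealized_zipf_frequency(top_frequency, rank), 2)
        print(rank, word, freq, length, expected)
```

In a large corpus, short function words would be expected to cluster at the top ranks, which is the kind of pattern that the codability argument builds on.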

Given that Brown and Lenneberg’s (1954) study used only English, linguistic relativity could not be fully addressed without a comparison between (at least) two language groups. Berlin and Kay (1969) investigated color terms and codability in 20 different languages. They took the nativist position that color recognition and coding were an innate physiological process rather than a form of cultural acquisition, a position that rested on the premise of cross-linguistic regularities and constraints in the coding of colors and of biological sources of color patterns. They noted universal restrictions on the number of basic color terms across languages. They claimed that all color terms of all languages could be broken down into 11 monomorphemic color terms, which appeared in a five-level hierarchy in languages: (1) black and white, (2) red, (3) yellow, green, and blue, (4) brown, and (5) purple, pink, orange, and grey. If a language had just two basic color terms, they would be black and white (e.g., among New Guinean people). If a language had three basic color terms, they would be black, white, and red, and so forth, according to the hierarchy. This hierarchy was taken as evidence that human physiology would determine the categorization of color terms and put constraints on linguistic variations in color classification and perception. Berlin and Kay interpreted their findings as anti-Whorfian.
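
The implicational logic of this hierarchy can be sketched in a few lines of Python. The sketch below is only an illustration of the five levels as summarized in this chapter, not Berlin and Kay's own procedure; the function name predicted_terms and its three-part return value are assumptions made here. Given the number of basic color terms in a language, it returns the terms the hierarchy treats as certain, the level from which any remaining terms would be drawn, and how many terms come from that level.

```python
# Five-level hierarchy as summarized in the text (Berlin & Kay, 1969).
HIERARCHY = [
    ["black", "white"],
    ["red"],
    ["yellow", "green", "blue"],
    ["brown"],
    ["purple", "pink", "orange", "grey"],
]


def predicted_terms(n_terms):
    """Return (certain, candidates, n_from_candidates) for n_terms basic terms.

    certain: terms from every level that is completely filled.
    candidates: the next level, when it is only partly filled.
    n_from_candidates: how many terms are drawn from that partial level.
    """
    total = sum(len(level) for level in HIERARCHY)
    if not 2 <= n_terms <= total:
        raise ValueError(f"n_terms must be between 2 and {total}")

    certain = []
    remaining = n_terms
    for level in HIERARCHY:
        if remaining >= len(level):
            # The whole level fits, so all of its terms are predicted.
            certain.extend(level)
            remaining -= len(level)
            if remaining == 0:
                return certain, [], 0
        else:
            # Only part of this level fits; the summary in the text does not
            # say which members come first, so all are listed as candidates.
            return certain, list(level), remaining
    return certain, [], 0


if __name__ == "__main__":
    for n in (2, 3, 4, 11):
        print(n, predicted_terms(n))
```

For a language with four basic terms, for example, the sketch returns black, white, and red as certain and leaves one term to be chosen among yellow, green, and blue, because the hierarchy as stated here does not order the members within a level.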

Early studies of the lexical codability of colors showed that more codable colors (i.e., aforementioned focal colors) were better remembered than less codable colors in nonlinguistic tasks. Agrillo and Roberson (2009) revisited Brown and Lenneberg’s (1954) color study by comparing communication accuracy and recognition memory with varying distractor arrays for color items in order to overcome or control for the influence of context and task demands on the results. Unlike the findings of Brown and Lenneberg’s (1954) study, Agrillo and Roberson found that colors that were easier to name showed no recognition advantage for memory in a randomized array of distractors which was more akin to real life situations outside the laboratory setting. They concluded that the eight basic colors were not inherently more codable and memorable than other colors.

In another study, Kay and Kempton (1984) compared color categorization between English speakers and speakers of Tarahumara, a Uto-Aztecan language of northern Mexico, who did not have a distinction between green and blue and had instead a collective term siyóname meaning green or blue, in order to examine whether the lexical difference would result in a distinct judgment of the distances between the two colors. In Experiment 1, 56 triads of color chips were presented, in which three chips were shown at a time, and participants were asked which of the three chips was most different from the other two (i.e., a “pick an odd one out” method). Two chips were distinct in the colors of green and blue, while the hue of the other item was somewhere between green and blue. English speakers tended to exaggerate the distinction of colors close to the lexical category boundary of blue and green, whereas Tarahumara speakers did not show this tendency. In other words, English speakers clearly distinguished the green and blue chips based on the lexical category, while Tarahumara speakers did not distinguish the blue-green contrast. Kay and Kempton interpreted this result as a clear Whorfian effect in the direct subjective judgment of colors. When speakers are forced to judge color discrimination, they may use the lexical classification of the judged objects as if it were relevant to the required dimension of judgment, as long as the task does not block this connection. Under this assumption, Experiment 2 eliminated the participants’ use of the color name strategy to examine whether or not participants used a name strategy as a cognitive mechanism when discriminating between blue and green colors according to their lexical categories. The participants made discriminations based on the distance between the two colors but not on the lexical category, which showed no group difference. Results indicated that no sensitivity to lexical category boundaries was found in English speakers and that the Whorfian effect found in Experiment 1 disappeared when the use of color names was removed from the experiment.

Roberson et al. (2000, 2005) investigated perceptual judgments and memory in different language groups whose basic color terms were different. They found that differences in color cognition between different language groups yielded significant effects on perception and memory for colors (Roberson, Davies, & Davidoff, 2000). In order to overcome limited evidence from a tiny and remote language community, Roberson et al. (2005) studied a large language community of semi-nomadic tribesmen in Southern Africa and found a different cognitive organization of color was involved in both English and semi-nomadic tribesmen’s language with five color terms (Roberson et al., 2005). Roberson et al. (2000, 2005) suggested that categorical perceptions were language-dependent given the close interaction found between language and cognition, supporting the cultural relativity hypothesis.

Research has also been conducted to investigate whether having a word for a concept influences visual color perception. Given that English and Russian color terms are different in the color spectrum (while English has a single word for blue, Russians use different color terms for light blue goluboy and dark blue siniy), Winawer et al. (2007) examined whether the difference in color terms made differences in color discrimination. They tested native speakers of English and Russian in a speeded color discrimination task using two shades of blue. Russian speakers were faster to discriminate two shades when they fell into different shades used in Russian (one siniy and the other goluboy) than the same shades (both siniy or both goluboy). In order to determine whether words were unconsciously activated, they asked Russian participants to perform a verbal task at the same time when making their color discrimination. The reaction time advantage of different shades of goluboy and siniy disappeared. The different results of the verbal dual tasks indicated that the task of discriminating color shades was facilitated by the unconscious activation of verbal categories. English speakers showed no difference in discriminating the two blue shades. Winawer et al. (2007) concluded that color categories in language influenced color discrimination in simple perceptual color tasks and that the effect of language was disrupted by verbal interference. These findings are a piece of evidence for pro-Whorfianism.

Özgen and Davies (2002) also examined categorical color perception and claimed that color perception could be learned through repeated practice, such as laboratory training. They interpreted the findings of four experiments as support for the linguistic relativity hypothesis, claiming that “language may shape color perception” (p. 477). Lu, Hodges, Zhang, and Wang’s (2012) study followed a similar line. They investigated the effects of Chinese color names on recognition in the left and right hemispheres using color naming and color memory. Results showed that, contrary to previous assumptions, linguistic effects on color discrimination were not confined to the left hemisphere. They suggested that the right hemisphere’s relative specialization in color discrimination and the left hemisphere’s relative specialization in linguistic discrimination might have yielded varying degrees of effects on timing. Gibson and colleagues (2017) also conducted a large-scale study of 110 languages using the World Color Survey. They found cross-language similarity in color naming efficiency as well as differences in the overall usefulness of color across cultures.

Importantly, Kay and Regier (2006) seem to support this line of reasoning. They acknowledge that there are universal constraints on color categories but, at the same time, hold that differences in color categorization across languages yield differences in color cognition and perception. This is a significant advancement for linguistic relativity, compared to the claim made in Berlin and Kay (1969), which was anti-Whorfian.


Another set of studies in relation to linguistic relativity concerns the encoding patterns of motion events. Athanasopoulos and Albright (2016) adopted a perceptual learning approach to the linguistic relativity hypothesis to examine the way English speakers categorize motion events by training them in an English-like way (aspect language) and in a Swedish-like way (non-aspect language), with and without verbal interference in English. Results showed that verbal interference effects were salient only in the within-language condition (i.e., English speakers’ categorizing events in an English-like way) but not in the between-language condition (i.e., English speakers’ categorizing events in a Swedish-like way). This suggests a selective language influence on the classification of motion events among English speakers. Gennari, Sloman, Malt, and Fitch’s (2002) study also examined lexicalization patterns of motion events among English and Spanish speakers using two nonlinguistic tasks of recognition memory and similarity judgment. They found a linguistic effect in the similarity task with verbal encoding only, indicating that language-specific encoding patterns were observed in the form of language-dependent regularities involving the lexicalization of motion events.

Choi and Bowerman (1991) reported that children learning English and Korean showed different patterns of lexicalization of motion as early as 17-20 months. American children tended to quickly generalize spatial words of path particles, such as up, down, and in, to both spontaneous and caused changes of location. In contrast, Korean children were more likely to use different words for spontaneous and caused motion expressions. These findings indicated that children’s language acquisition was influenced by the semantic organization of their native language from the early phase of language acquisition. This suggests that language input and cognition interact with each other from the beginning of learning about motion and space.

3.1.2 Number

An attempt to redefine a Whorfian effect as a processing difference according to the language spoken has been made through research on numbers. Brysbaert, Fias, and Noël (1998) examined number sense and numerical encoding among French- and Dutch-speaking students. Whorfian effects on numerical cognition were examined using the Dutch number naming system, in which the order of tens and units is reversed (e.g., 24 is read ‘four-and-twenty’). In Experiment 1, the researchers used two conditions of mathematical addition problems: (1) different order of the combination of two- and single-digit operands (e.g., 20 + 4 vs. 4 + 20) and (2) different presentation modality (i.e., Arabic numeral vs. oral). A significant difference was found between the two language groups in the presentation modalities. Experiment 2 showed that the difference disappeared when the participants were asked to type in their answers instead of responding verbally. This indicated that the difference found in the methods of presentation might be related to input or output processes rather than the mathematical addition operation per se. Although numerical cognition could be independent of the language system, the authors did not completely dismiss the possibility of Whorfian effects on human cognition.

Lucy (1992) also examined relationships between grammatical number markings and cognition among speakers of American English and Yucatec Maya. English speakers use obligatory plural markings to accord with associated countable nouns, whereas Yucatec speakers optionally indicate plural terms. The two groups of different language speakers performed differently in nonverbal experimental tasks with a preference made based on the lexical structure of their native language. Specifically, English speakers showed a preference for shape-based classifications, while Yucatec speakers demonstrated material-based categorizations. This is an interesting study because not all languages have obligatory plural markings as shown in English. For example, the Japanese and Korean languages do not require number agreement between the subject and the verb or between the number marking and related countable nouns in the sentence. The Korean language, in particular, has a specific classifier that collocates with a given noun. For example, the English phrases three books and three dogs are expressed in Korean as book three kwon (kwon is a designated classifier for books) and dog three mari (mari is a designated classifier for animals). Although no empirical data are available on this as of today, it is possible that these kinds of linguistic differences yield differences in shape-based, material-based, or animacy-based categorization as well.

Scientific attention has been paid to morphological differences in number coding between East-Asian languages and English as well as their effect on children’s conceptualization of numbers, and, ultimately, their mathematics performance. The number naming system in English is less straightforward than that of the East-Asian languages. In English, for example, the number name for 11 is hardly related to the unit name for 1, although the decade names for 13 through 19 are consistent with the unit names 3 through 9. The three East-Asian languages have a systematic code of number names from 11 and beyond; that is, the decade name followed by the unit name. For example, 11 and 12 are coded as literally (one) ten one and (one) ten two, respectively, and so forth. Likewise, the names for 21 and 22 are literally two ten one and two ten two, respectively, and so on. The numbers greater than 100 follow the same rule. This consistent way of combination does not require the use of new additional words to refer to numbers, unlike the number names from 13 to 19 in English. Notably, the English number names for 13 through 19 have inconsistent combinations because they consist of the unit name before the decade name, which is different from the other number names (i.e., names for 20 and onward). In short, the three East-Asian languages code the number names by the principle of place-value structure, meaning that the numeric values of multi-digit numbers are represented by the position of constituent digits in the structure of descending power from left to right (e.g., 123 = 1 × 10² + 2 × 10¹ + 3 × 10⁰).
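
As a rough illustration of the place-value naming principle described above, the following Python sketch composes literal English glosses of regular, East-Asian-style number names (e.g., 21 as two ten one) and breaks a number into digit and power-of-ten pairs. The function names, the word lists, and the restriction to 1 through 999 are assumptions made for this illustration only; the glosses stand in for the actual Chinese, Japanese, or Korean number words, and the sketch always spells out the leading one that those languages may omit.

```python
# Illustrative sketch only: English glosses stand in for the actual
# Chinese, Japanese, or Korean number words (an assumption for readability).
UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
PLACES = [(100, "hundred"), (10, "ten")]  # place words in descending power


def regular_name(n):
    """Gloss n (1 through 999) as digit-plus-place words, e.g. 21 -> 'two ten one'."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only handles 1 through 999")
    parts = []
    for value, place_word in PLACES:
        digit, n = divmod(n, value)
        if digit:
            # The leading 'one' is always spelled out here, matching the
            # '(one) ten one' gloss for 11.
            parts.append(f"{UNITS[digit]} {place_word}")
    if n:
        parts.append(UNITS[n])
    return " ".join(parts)


def place_value_expansion(n):
    """Return (digit, power of ten) pairs in descending power for a positive integer."""
    digits = [int(ch) for ch in str(n)]
    return [(d, len(digits) - 1 - i) for i, d in enumerate(digits)]
```

For example, regular_name(11) gives one ten one and regular_name(22) gives two ten two, while place_value_expansion(123) gives the pairs (1, 2), (2, 1), and (3, 0), which correspond to 1 × 10² + 2 × 10¹ + 3 × 10⁰. No lookup table of irregular forms is needed, which is the regularity that the English names for 11 through 19 lack.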

Based on these formal place-value structures of numbers, research has been conducted on the effect of the numeric name system on mathematics performance among students of different language groups. Miura et al. (1988, 1994) carried out cross-national comparisons of mathematics performance among American, Chinese, Japanese, and Korean children (1988) and among Chinese, French, Japanese, Korean, Swedish, and American children (1994). The results of the two studies showed differences in cognitive representations of numbers and their effects on math achievement. Children speaking the three East-Asian languages consistently outperformed their peers of European and American backgrounds. The researchers attributed the East-Asian children’s superior performance to the characteristics of their numerical language. In other words, East-Asian children tended to construct decade blocks and unit blocks in a systematic way to show the place value, showing a better understanding of the place-value structure of the number system. However, children from France, Sweden, and the U.S. showed a preference for a collection of unit blocks to represent numbers as a grouping of counted objects. Furthermore, Asian students showed a greater flexibility in mental number manipulations than their counterparts. Miura et al. (1988, 1994) concluded that the systematic numeric characteristics expressed in the three East-Asian languages might facilitate the learning of mathematics, especially arithmetic.

Differences in the naming speed of numbers have also been found among different language groups. Miller et al. (1995) found that Chinese children were faster in counting between 11 and 99 than English-speaking children, although there was no difference in the range of numbers between 1 and 10 and beyond 99. This difference may be attributable to the systematic number name structure between 11 and 99, as explained earlier. Additional studies also indicated that Chinese speakers pronounced numbers faster than English speakers. Hoosain and Salili (1987) noted that working memory capacity depended on the time-based duration of sounds rather than the item-based number chunks. In their three experiments with English- and Chinese-speaking undergraduate students, they reported that Chinese speakers’ pronunciation speed was faster and their sound duration for numbers was shorter than those of English speakers. They also reported that Chinese speakers had greater digit spans than English speakers. They suggested that pronunciation speed for numbers in a language affects the speaker’s mental capacity for the cognitive manipulation of numbers.

It seems plausible that East-Asian children take advantage of the greater regularity embedded in their languages than English when they acquire number names and number sense. Ng and Rao (2010) have indicated in a comprehensive review that the Chinese language offers benefits for math learning and that the language is a contributing factor to the early attainment of math skills, although language, culture, cultural beliefs, and educational systems are interrelated. Klein et al. (2013) also show that a direct comparison of Italian-speaking children to German-speaking children further corroborates the previous findings that language affects cognitive number processing. They conclude that numerical development can be language-universal, but it might be modulated by language.

Another study with an Amazonian tribe provides an interesting piece of evidence that challenges the idea that people have an innate mathematical ability. Frank et al. (2008) argue that number is a cognitive technology for creating mental representations for accurate memory. The Pirahã, an Amazonian tribe of hunter-gatherers in remote northwestern Brazil, have no words that express exact quantity (not even one), although they have words loosely corresponding to the quantities “one,” “two,” and “many” (Everett, 2005). These number words do not refer to counting numbers, but rather signify relative quantities (e.g., one for any quantity between one and four; two for as many as six). Frank et al. (2008) carried out two experiments for an investigation of the number language (Experiment 1) and numerical abilities (Experiment 2). They showed that the Pirahã could perform exact matching tasks with large numbers of objects when the tasks did not involve memory. However, their responses were inaccurate on matching tasks that involved memory. These results suggest that language for the exact cardinal number is a cultural invention rather than a linguistic universal. They also indicate that number words do not change our underlying number representations, but instead are a cognitive technology for keeping track of the cardinality of large sets across time, space, and modality (Frank et al., 2008). Although the results do not support the strong version of Whorfianism, they do suggest that language influences cognition and memory.

3.1.3 Time

The concept of time has also been studied. Universalists view time as a universally abstract concept, while relativists stress that different languages frame and express time differently. Boroditsky (2001) investigated the concept of time perceived by native speakers of Mandarin and English by looking at whether time is perceived horizontally or vertically because Mandarin and English encode time concepts differently. She demonstrated different ways of indicating time in English and Chinese, showing that English speakers tended to express time horizontally, while Mandarin speakers were likely to express time vertically. Specifically, Mandarin speakers responded faster when March and April were presented in a vertical display. In contrast, English speakers’ judgment was faster when March and April were presented in a horizontal array. She offered support for the weak version of linguistic relativity by concluding that the native language was a tool that shaped habitual thought and cognition of abstract concepts. Although January and Kako (2007) rebutted Boroditsky’s (2001) conclusion in a replication study, the inconsistent findings have not dampened continued research interest in time perception.

Bylund and Athanasopoulos (2017) investigated how people construct their mental representations of time passage and estimate time among native speakers of Spanish and Swedish as well as Spanish-Swedish bilinguals. The Swedish language describes time in terms of length (i.e., long or short), while the Spanish language describes it in terms of volume (i.e., big or small). When the participants were asked to estimate the time duration (i.e., how much time had passed) while watching on the computer screen either a line gradually growing or a container being filled or both, “Swedish speakers were misled by stimulus length, and Spanish speakers were misled by stimulus size/quantity” (Bylund & Athanasopoulos, 2017, p. 911). Based on the language-specific interference found in the duration reproduction task, they asserted that language could play a powerful role in transforming our psychophysical experience of time, given the robust presence of preferred expressions of time duration in magnitude according to the native language; that is, the long-short concept in Swedish and the big-small concept in Spanish. Bylund and Athanasopoulos’ (2017) bilingual data showed a different interference effect depending on the language used in the context. When the word “duración” (duration in Spanish) was presented first, bilinguals were likely to base their time estimate more on how full the container was than on how much the line grew. When they were prompted with the word “tid” (duration in Swedish), they based their time estimate merely on the distance that the lines had grown. These results were not counterevidence to linguistic relativity. The researchers concluded that humans’ mental representation of time was malleable in the form of a “highly adaptive information processing system” (p. 911).

Montemayor (2019) recently suggested that the mechanism for time perception be examined in a broader context (i.e., early and late time perception) of time cognition and perception to overcome the narrow scope of temporal properties of time. He states that time perception provides researchers with new possibilities to investigate linguistic modulation through the interface between semantic categorization and mental representations in different forms.

3.1.4 Object

Conceptual categories pertaining to object names seem to be constructed as early as when children learn their mother tongue, if not before. Gopnik and Choi (1990) examined early semantic and cognitive development among Korean-, French-, and English-speaking children by having them perform object-permanence, means-ends problem solving, and categorization tasks. Gopnik and Choi found that Korean children used significantly different forms than English-speaking children in encoding disappearance and success-failure words. English- and French-speaking children developed categorization and naming earlier than did Korean children. A longitudinal study (Gopnik, Choi, & Baumberger, 1996) showed that Korean-speaking children used not only more means-ends and success-failure words, but also more verbs than English speakers. These results are consistent with the observation that Korean-speaking mothers used more verbs and fewer nouns than English-speaking mothers (Gopnik, Choi, & Baumberger, 1996). In an observational study, they found that Korean mothers tended to emphasize actions, while English-speaking mothers tended to emphasize categorical names. Consistent with the previous study, Korean-speaking children were delayed in categorization but superior in means-ends abilities, compared to English-speaking counterparts. These findings suggest that differences in linguistic input and linguistic usage influence children’s cognitive development through two-way interactions between language and cognition in the early phase of language acquisition.

The specification of object position was also examined. Koster and Cadierno (2018) examined whether the perception of placement is universal or not using German and Spanish verbs. They examined categorization (Experiment 1), recognition memory (Experiment 2), and object orientation (Experiment 3). Null effects were found in the categorization and mental simulations of object orientation. However, German speakers demonstrated better recognition memory for object position than did Spanish-speaking counterparts. Although it did not show fully involved mental processes in the perception of placement, the study demonstrated robust language-specific effects involved in the specification of object position. More studies in this line are warranted for a better understanding of the interface between language and perception.

3.1.5 Nonlinguistic Representations

Nonlinguistic representations were also examined using musical pitch. Dolscheid, Shayan, Majid, and Casasanto (2013) used nonlinguistic psychophysical tasks to investigate the mental representation of musical pitch among native speakers of Dutch and Farsi. The two languages encode pitches differently; Dutch describes pitches using the adjectives high or low, while Farsi describes pitches using the terms thin or thick. Performance differences were found in two pitch-reproduction tasks between the two groups. The Dutch-speaking group was further trained to describe musical pitches as in Farsi (i.e., thin or thick in description). Training actually made Dutch participants describe pitch in a similar way to that of Farsi speakers, which provided psychophysical evidence for linguistic relativity. The authors concluded that “[l]anguage can play a causal role in shaping nonlinguistic representations of musical pitch” (p. 613).

3.1.6 Other Areas

The framework of the linguistic relativity hypothesis has been addressed in diverse areas. Gender issues were examined in a social identity analysis through the prism of the linguistic relativity hypothesis (Khosroshahi, 1989). Sign language was also used to examine a Whorfian effect. Xia, Xu, and Mo (2019) investigated deaf people’s color perception using visual search and oddball tasks. Both behavioral and electrophysiological findings showed that sign language affected the perception of color categories among deaf people, and the authors concluded that the nature of language influenced perception and thought. Given the limited relevance of these studies to the thesis of this book, albeit important in terms of addressing linguistic relativity, the review of these studies is limited here.

Also examined was how language or grammatical usage could make workers misconstrue dangerous situations in the workplace. Strømnes observed that the linguistic features of Swedish prepositions could represent space in three dimensions, while Finnish cases could represent space in two dimensions coupled with a third dimension of time or duration. In other words, the Swedish language describes movement in detail in three-dimensional spaces, whereas the Finnish language places emphasis on static and holistic relationships between or among people. This could be extended to the linguistic difference between Indo-European languages and Uralic languages. Indo-European languages (e.g., Swedish, Norwegian, English) tend to form coherent temporal entities in a way that actions are explained linearly from the beginning to the end in the setting. In contrast, Uralic languages (e.g., Finnish, Hungarian, Estonian) tend to describe static settings with minimal movement of the person in a way that settings are expressed with the global sentiment of people involved within the setting. Due to these linguistic differences in the emphasis placed in the situation, the Finns tend to organize their work environment in a way that focuses more on individual workers (i.e., person-centered) than on the work process for overall production. This lack of emphasis on the overall temporal organization of production processes is likely to lead to frequent disruptions in production, and ultimately result in higher occurrences of work-related accidents than among Swedish-speaking counterparts (summarized from Lucy, 1997; see pp. 303-304).

3.2 Studies of Cross-Language Influences

The debate over the linguistic relativity hypothesis has mainly concerned the monolingual mind. However, Neo-Whorfianism exemplifies universal constraints and cross-cultural regularities. As such, linguistic relativity has been resurrected as an active research topic in psycholinguistics and studies of a second language (L2) or a third language (L3). Jarvis and Pavlenko (2007) employed the linguistic relativity hypothesis as a framework of crosslinguistic influences on bilinguals’ and multilinguals’ minds and learning additional languages regardless of the directionality of cross-language influences (i.e., L1 to L2, L2 to L1, or L2 to L3). The new wave of studies of L2 learning in recent decades in a wide range of areas, including phonetics and phonology, speech perception, lexical access, morphology, reading, and pragmatics, has provided a different perspective on the accounts of linguistic relativity as well as a groundwork for continued research on linguistic relativity.

Negating, at times, helps better explain the phenomenon under consideration. If language does not influence our thoughts, why do speakers of different languages display different perceptions, different worldviews, and different behavioral patterns? If language does not affect our cognition, why do we observe cross-language transfer and how should we interpret it? On the flip side, if our cognition affects language, why does language not change as a result of different thoughts? Language does evolve. However, it hardly evolves due to the change of our thinking or cognition. New words are coined in response to necessity, new technology, new discoveries, or social movements.

Empirical evidence of second language studies generally concurs with the paradigm of linguistic relativity. Bylund and Athanasopoulos (2014) suggest that linguistic relativity be a new approach to L2 research. They underscore neo-Whorfianism in studies of L2 acquisition with refined methodological and theoretical prerequisites for linguistic relativity research, and encourage the use of nonverbal methods to examine the effects of linguistic relativity among L2 speakers to avoid argument circularity (which was one of Pinker’s criticisms about linguistic relativity). In order to demonstrate the extent and the nature of cognitive restructuring in L2 learning as a function of learner variations, Bylund and Athanasopoulos (2014) also call for an identification and delineation of cognitive mechanisms related to the associative learning involved in L2 acquisition and nonverbal behavior. Factors characterizing individual learner trajectories, such as L2 proficiency, L2 contact and use, learning context, and age, need to be taken into account in recalibrating nonverbal behavior among L2 speakers. Pavlenko (1999) also offers a new look at the bilingual mind. Pavlenko (1999) attempted to interpret L1-based description of events among speakers of Russian and English within the framework of the linguistic relativity hypothesis. Although her focus is semantics and concepts in bilingual memory, the results of her study are essentially in support of the relativistic approach.

Recent studies have attempted to tease apart the extent, dimension, and directionality of cross-language transfer. L2 research is especially effective in filling gaps presented in the debate about linguistic relativity. Odlin (2005) adopts the linguistic relativity hypothesis as a theoretical framework to explain cross-linguistic influences, especially to explain conceptual transfer from L1 to L2 or from L2 to L1. While highlighting the intersection between L2 acquisition and linguistic relativity, Odlin (2005) uses the concept of the “binding power” of language over the mind or cognition. He points out that even highly skillful speakers of L2 “never free themselves entirely of the ‘binding power’ of L1” (p. 3) in L2 comprehension or production because cognitive templates are established in L1. By the same token, Slobin (1996) proposes thinking for speaking as a moderate version of linguistic relativity, and notes that an L1-specific worldview affects the subsequent learning of another language.

Pederson et al. (1998) examined spatial relations using prepositions among 13 typologically and genetically different languages. Their linguistic data revealed that prepositions showed functional similarities, but represented different semantics across languages. Their nonlinguistic data showed a correlation between the cognitive frame of reference and the linguistic frame of reference in the same referential domain of spatial arrays among the languages. For example, Dutch speakers used direct deictic locations and gestures (e.g., this one; explicit pointing) to recall the location of objects, while speakers of Arandic, a language belonging to the Pama-Nyungan language family spoken in Australia, used their linguistic system of absolute geo-cardinal-derived (and intrinsic) information (e.g., north, south) to recall the same objects. Speakers of languages using the absolute frame of reference, such as Tzeltal (Mayan language spoken in Mexico) and Longgu (or Logu; Austronesian language spoken in the Solomon Islands archipelago), tended to show more accurate recall of the location of objects than those who use the relative frame of reference, such as Japanese.

L1 effects on personality perception were also examined (Chen, Benet-Martinez, & Ng, 2014). Chinese-English bilinguals showed more dialectical thinking and greater differences between self-ratings and observer-ratings of personality when they used Chinese rather than English. The authors indicate that language affects personality perception and that culture-related linguistic cues are perceived differently according to the language used to fulfill a specific demand.

Since studies of Chinese, Japanese, and Korean in relation to English are reviewed more in-depth in Chapters 8 and 9, I keep this section (of cross-language influences) rather short in this chapter. An expansion on linguistic relativity to script relativity is in order.

4 From Linguistic Relativity to Script Relativity

Lucy (1997) classified three levels of potential linguistic influences on thought: (1) semiotic level, (2) structural level, and (3) functional level. The semiotic level concerns “whether having a code with a symbolic component (versus one confined to iconic-indexical elements) transforms thinking” (p. 292). This inherently refers to the semiotic relativity of thought. The second level, structural level, involves a question of whether the morphosyntactic configuration of meaning affects thought or not. This is basically what the traditional linguistic relativity posits. The last functional level concerns a question of whether the use of language in a particular way affects thought or not. This largely has to do with the context or setting in which language is used (e.g., casual setting vs. academic setting).

Among these three levels, what is most related to my claim, script relativity, is the first level of Lucy’s (1997) classification. Semiotic relativity has not been investigated or drawn scientific attention so far in the discussion of relativism. Given that linguistic relativity has been saturated for more than a half century, for better or worse, we can easily identify what is known so far and what is unknown so far. It is time to extend the linguistic relativity hypothesis to a script relativity hypothesis. In this regard, my claim is to extend semiotic relativity to script relativity. Semiotics is the study of signs, symbols, or sign processes. Although it includes nonlinguistic sign systems, semiotics primarily refers to the linguistic study of signs or symbols because meaning-making is crucial in semiotics.

Signs are by and large arbitrary. The arbitrariness of signs refers to the absence of natural connections between a sign and its sound or between a sign and its meaning. As most written signs are assigned arbitrarily within the writing system, arbitrariness is one of the linguistic characteristics that are common among almost all languages. Although a Chinese logographic character signifies a meaning, the Chinese writing system is not free from arbitrariness. This is heightened in simplified characters. Strictly speaking, Chinese is not purely logographic because some signs refer to the morphemes of the word, while others indicate their pronunciation. In this sense, Chinese is a morphosyllabary, as indicated in Chapter 1. Since scripts rely on cultural conventions, each script has a unique convention that evolves over time.

Just as linguistic relativity postulates that habitual language use results in a unique set of habitual thought and thinking patterns, habitual reading of a particular script has great potential to yield unique thought processes or patterns in the reader’s mind as an embodied experience. As mentioned in Chapter 1, Logan’s (2004) book entitled The Alphabet Effect captures this point well with the focus on the alphabetic script (regardless of criticisms that the book has received for Eurocentrism and the inaccurate presentation of Chinese characters). Dehaene (2009) notes, as one of the epigraphs shows, that brain imaging shows that the fixed neural networks and circuitry of skilled adults’ brains delicately adjust to reading. This suggests that prolonged literacy rewires our brain to be conducive to reading. Hence, it is natural to surmise the consequences of literacy, as many scholars (Goody & Watt, 1963; Logan, 2004; Ong, 1986) postulated before brain imaging technology became available.

The concept of the paradigm shift is related to linguistic relativity. The existing paradigm of anti-Whorfianism cannot explain why the same phenomenon is viewed and interpreted differently by different linguistic and cultural groups. This inability can be seen as Kuhn’s (2012) term anomalies that nativists or opponents of linguistic relativity cannot explain. The anomalies have been addressed by extraordinary research of structure-centered and domain-centered subjects as well as L2-related inquiries with advanced research tools, including brain-imaging. Accrued findings have formed a new paradigm, which is neo-Whorfianism. If the paradigm shift from anti-Whorfianism to neo-Whorfianism is tenable, the extension of linguistic relativity, which is script relativity, has a sound ground. Hence, it can be said that script relativity is an offspring of the new paradigm shift.

Since I will gradually develop the thesis, script relativity, throughout this book, I use this section as a signal to a more in-depth discussion of the thesis in the following chapters in Part II, and, therefore, I keep this section rather short. In the meantime, I would like the reader to think about the competitive plausibility of pro-Whorfianism and anti-Whorfianism. If Whorfianism is more plausible to explain how our perception and thought patterns are molded, I ask the reader again to think about how we are affected by what we read every day. If you are a bilingual and biliterate individual, I ask you to think about the script-shifting between the script in which you read most comfortably and your less comfortable script. If you are like me, you are likely to see differences in reading two scripts. I can sense differences in my eye movement and the attention I pay within the passage during reading in Korean and English. I will cover the alphabet and nonalphabetic scripts in the following chapters for the purpose of comparison. The Chinese, Japanese, and Korean writing systems are considerably different from the Roman alphabet. Although it is classified as an alphabetic script, the Korean writing system is discussed along with Chinese and Japanese as a batch of the East-Asian scripts due to its unambiguous syllabic configuration. Part II discusses the alphabet, the three East-Asian scripts, the differences between the East and the West, and psycholinguistic and neurolinguistic evidence of script relativity.