It is widely accepted that during spoken word recognition phonological neighbors, or words that sound similar to the target word, are activated and subsequently compete with the target word for recognition. Furthermore, numerous studies have found that words with fewer competitors or phonological neighbors are recognized more quickly and accurately than words with many competitors or phonological neighbors (for a review see Vitevitch & Luce, 2016).

In addition to the number of phonological neighbors influencing spoken word recognition, it has been shown that the internal structure of a word’s neighborhood also influences spoken word recognition. The internal structure of a word’s neighborhood can be quantified via the clustering coefficient, C, which represents the extent to which a word’s neighbors are also neighbors of each other. Recent work has shown that C influences various aspects of language processing, including spoken word recognition (Chan & Vitevitch, 2009), word production (Chan & Vitevitch, 2010), word learning (Goldstein & Vitevitch, 2014), and short- and long-term memory processing (Vitevitch, Chan, & Roodenrys, 2012). In general, controlling for neighborhood size, words with low C (less interconnected neighborhoods) are recognized and produced more quickly and accurately than words with high C (more interconnected neighborhoods).

The influence of C on processing suggests that simply counting the number of phonological neighbors may not capture all of the sources of competition that occur during word recognition. Rather, it appears that activation and competition are also influenced by subtle differences in the structure of a word’s neighborhood and not simply modulated by the number of activated, competing word-forms. Given that prior work has shown that the connectivity of a word’s immediate neighborhood (as measured by C) influences lexical processing, the present report investigated whether the connectivity of a word’s distant neighbors also influences processing.

The results reported in Siew and Vitevitch (2016) strongly suggest that it is important to consider the influence of a word’s distant neighbors on language processing. Siew and Vitevitch identified words residing in the largest connected component of the phonological language network consisting of more than 6,000 words (giant component words) and words residing in smaller connected components of the network, ranging in size from 2 to 53 (lexical island words). Words were matched on a number of lexical characteristics and differed only on the number of distant neighbors. Giant component words, which reside in the largest connected component of the network, have several more distant neighbors as compared to lexical island words, which are found in smaller connected components of the network.

Lexical island words were responded to more quickly than giant component words in various tasks. Siew and Vitevitch suggest that, as compared to giant component words, lexical island words had fewer competitors overall and therefore were able to “stand out” more and were more readily recognized. On the other hand, giant component words had several distant neighbors overall, which made it difficult for these words to stand out from their competitors, and they were thus less readily recognized. Given that these results demonstrated that the mere presence (number) of distant connections has measurable effects on spoken word recognition, the present paper examines whether the connectivity of a word’s distant neighbors also influences spoken word recognition.

To quantify the connectivity of a word’s distant neighbors, I used the computational tools of Network Science, an emerging area of complexity science that uses mathematical techniques to study complex systems in diverse fields, such as telecommunications and biological systems (Barabási, 2009). The tools of Network Science have been applied in the cognitive and language sciences to investigate how the organization of phonological word-forms in the mental lexicon influences language processing (Vitevitch, 2008).

In the phonological network of language described in Vitevitch (2008), links were placed between words that were phonologically similar to each other. Two words were considered phonologically similar if the first word could be transformed into the other by substituting, adding, or deleting one phoneme in any position (Luce & Pisoni, 1998). Based on this operationalization, words such as /kʌt/ (“cut”) and /æt/ (“at”) are phonological neighbors of the target word /kæt/ (“cat”). These words are immediate phonological neighbors of /kæt/ (i.e., the grey nodes in Fig. 1). On the other hand, words such as /kʌp/ (“cup”) and /ænt/ (“ant”) are distant phonological neighbors of /kæt/ (i.e., the white nodes in Fig. 1), because they are indirectly connected to the target via /kʌt/ and /æt/ respectively. Although it is possible to consider the interconnectivity of distant neighbors that are several steps removed from the target (in some cases up to 29 steps; see Vitevitch, Goldstein, & Johnson, 2016), in the present paper I focused on the interconnectivity of the distant neighbors that are two steps removed from the target (i.e., the white nodes in Fig. 1). To examine whether the level of connectivity of distant neighbors influences lexical processing, I introduced a new Network Science metric known as 2-hop density, which measures the level of interconnectivity that exists among a word’s immediate and distant neighbors. Conventional psycholinguistic tasks were then conducted to determine whether any processing differences exist between words with high and low 2-hop density.
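The one-phoneme similarity rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to build the network in Vitevitch (2008): the function names `is_neighbor` and `build_network`, the toy lexicon, and the representation of each word as a tuple of phoneme symbols are all assumptions made for the example.

```python
def is_neighbor(a, b):
    """Return True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion of a phoneme."""
    a, b = list(a), list(b)
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: removing a single phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def build_network(lexicon):
    """Map each word to the set of words it is linked to (1-hop neighbors)."""
    network = {word: set() for word in lexicon}
    words = list(lexicon)
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            if is_neighbor(lexicon[w1], lexicon[w2]):
                network[w1].add(w2)
                network[w2].add(w1)
    return network


# Toy lexicon (assumed for illustration): word -> tuple of phoneme symbols.
LEXICON = {
    "cat": ("k", "æ", "t"),
    "cut": ("k", "ʌ", "t"),
    "at": ("æ", "t"),
    "cup": ("k", "ʌ", "p"),
    "ant": ("æ", "n", "t"),
}
NETWORK = build_network(LEXICON)
```

In this toy lexicon, “cut” and “at” end up linked to “cat”, whereas “cup” and “ant” are reachable from “cat” only through those intermediate words.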

Fig. 1

Illustration of a word with low 2-hop density (left) and a word with high 2-hop density (right). The target word is shown in black, its 1-hop neighbors are shown in grey, and its 2-hop neighbors are shown in white. For visual clarity, only a few connections are shown. Note that both words have the same number of 1-hop and 2-hop neighbors.

Experiment 1: Naming

In the present experiment, a conventional psycholinguistic task was used to examine how 2-hop density might influence lexical processing. In the auditory naming task, participants repeated the words they heard out loud as quickly and accurately as possible.

Method

Participants

Thirty native English speakers were recruited from the Introductory Psychology subject pool at the University of Kansas. All participants had no previous history of speech or hearing disorders and received partial course credit for their participation.

Materials

Forty English words were selected as stimuli. Half of the stimuli had high 2-hop density and half had low 2-hop density. A male native speaker of American English produced the stimuli by speaking at a normal speaking rate into a high-quality microphone in an Industrial Acoustics Company sound-attenuated booth. Individual sound files for each word were edited from the digital recording with SoundEdit16 (Macromedia, Inc). The Normalization function in SoundEdit16 was used to ensure that all sound files were comparable in amplitude. Due to a recording error, one low 2-hop density word (“scurry”) was excluded from all subsequent analyses. Therefore, there were 20 high 2-hop density words and 19 low 2-hop density words.

Stimuli comparisons

Word stimuli were selected such that high and low 2-hop density words were matched on a number of characteristics known to influence processing. These include traditional lexical variables, such as word length (number of phonemes), number of syllables, subjective familiarity (measured on a 7-point scale, where 7 indicates “You recognize the word and are confident that you know the meaning of the word” and 1 indicates “You have never seen the word before”; Nusbaum, Pisoni, & Davis, 1984), word frequency (represented by log-base 10 of frequency counts from the SUBTLEXUS corpus; Brysbaert & New, 2009), age of acquisition ratings (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012), neighborhood frequency (mean word frequency of the phonological neighbors of a word), phonotactic probability (positional segment probability: probability that a segment occurs in a certain position of a word; biphone probability: probability that two adjacent segments co-occur in a word; Vitevitch & Luce, 2004), as well as network science measures, such as number of phonological neighbors (also known as degree in the network science literature), clustering coefficient, and number of 2-hop neighbors. The duration and amplitude of sound files for these words also were matched. Descriptive statistics and t-tests for the stimulus comparisons are shown in Table 1. Because network science measures were more recently developed as compared to traditional psycholinguistic variables, network science measures are described in more detail below.

Table 1 Lexical characteristics of high 2-hop density and low 2-hop density words
  • 1-hop degree. Within the context of the phonological network, 1-hop degree refers to the number of words that are directly connected to the target word. These words are also known as 1-hop neighbors, because they are one “hop” or step away from the target node (i.e., the grey nodes in Fig. 1). For a given target word /kæt/ (“cat”), examples of 1-hop neighbors include /kʌt/ (“cut”) and /æt/ (“at”). In the traditional psycholinguistic literature, these words are considered to be phonological neighbors of the target word (Luce & Pisoni, 1998). To be consistent with the terminology used in the network science literature, 1-hop degree will be used to refer to the number of words that are directly connected to the target word (i.e., number of 1-hop neighbors).

  • Clustering coefficient. Clustering coefficient, C, refers to the extent to which the (immediate) 1-hop neighbors of a word are also neighbors of each other. To calculate C, the number of connections between 1-hop neighbors of a target word was counted and divided by the number of possible connections that could exist among the 1-hop neighbors. Therefore, C is the ratio of the actual number of connections existing among 1-hop neighbors to the number of all possible connections among 1-hop neighbors if every 1-hop neighbor were connected (Batagelj & Mrvar, 1998). For a definition of C in the context of the phonological network, see Chan and Vitevitch (2010). C ranges from 0 to 1; when C = 1 all 1-hop neighbors of a word are neighbors of each other, and when C = 0 no 1-hop neighbors of a word are neighbors of each other.

  • 2-hop degree. Within the context of the phonological network, 2-hop degree refers to the number of words that are indirectly connected to the target word via the target’s immediate neighbors. These words are also known as 2-hop neighbors, because they are two “hops” or steps away from the target node (i.e., the white nodes in Fig. 1). For a given target word /kæt/ (“cat”), examples of 2-hop neighbors include /kʌp/ (“cup”) and /ænt/ (“ant”), which are indirectly connected to the target via its 1-hop neighbors /kʌt/ (“cut”) and /æt/ (“at”) respectively. To be consistent with the terminology used in the network science literature, 2-hop degree will be used to refer to the number of words that are indirectly connected to the target word via its 1-hop neighbors (i.e., number of 2-hop neighbors).

  • 2-hop density. Mathematically, 2-hop density is calculated in a similar manner as clustering coefficient. Whereas C represents the ratio of the actual number of connections existing among a word’s 1-hop neighbors to the number of all possible connections among its 1-hop neighbors, 2-hop density represents the ratio of the actual number of connections existing among a word’s 1-hop and 2-hop neighbors to the number of all possible connections among its 1-hop and 2-hop neighbors. Therefore, 2-hop density is a measure of the level of interconnectivity among a word’s 1-hop and 2-hop neighbors. These values were calculated in Gephi, a freely accessible graph visualization software package (Bastian et al., 2009). Note that Vitevitch et al. (2012) reported that 1-hop degree and C are not correlated; by extension, one would not expect 2-hop degree and 2-hop density to be correlated. As high 2-hop density words and low 2-hop density words are matched on the number of 2-hop neighbors (2-hop degree), the number of 1-hop neighbors (1-hop degree), as well as the level of interconnectivity among 1-hop neighbors (as represented by C), the key difference between the two sets of words lies in the level of interconnectivity among 2-hop neighbors (Fig. 1).
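The four network measures defined above can be illustrated with a small Python sketch. It is not the Gephi computation used in the study. It follows the verbal definitions literally, and in particular it assumes that the target word itself is excluded when counting connections for 2-hop density, which the text does not state explicitly. The toy network and all function names are hypothetical.

```python
from itertools import combinations

# Toy undirected network (assumed): word -> set of directly linked words.
NETWORK = {
    "cat": {"cut", "at"},
    "cut": {"cat", "cup"},
    "at": {"cat", "ant"},
    "cup": {"cut"},
    "ant": {"at"},
}


def one_hop(network, target):
    """Words directly connected to the target (1-hop neighbors)."""
    return set(network[target])


def two_hop(network, target):
    """Words exactly two steps from the target (2-hop neighbors)."""
    first = one_hop(network, target)
    reached = set()
    for neighbor in first:
        reached |= network[neighbor]
    return reached - first - {target}


def density(network, nodes):
    """Ratio of existing links among `nodes` to all possible links."""
    nodes = set(nodes)
    possible = len(nodes) * (len(nodes) - 1) // 2
    if possible == 0:
        return 0.0
    actual = sum(1 for a, b in combinations(nodes, 2) if b in network[a])
    return actual / possible


def clustering_coefficient(network, target):
    """C: density of links among the target's 1-hop neighbors."""
    return density(network, one_hop(network, target))


def two_hop_density(network, target):
    """Density of links among the target's 1-hop and 2-hop neighbors
    (target excluded; this exclusion is an assumption)."""
    return density(network, one_hop(network, target) | two_hop(network, target))
```

For the toy target “cat”, the 1-hop degree is 2, the 2-hop degree is 2, C is 0 because “cut” and “at” are not linked to each other, and 2-hop density is 2/6, since only two of the six possible links among the four neighbors exist.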

Procedure

Participants were tested individually. Each participant was seated in front of an iMac computer that was connected to a New Micros response box. PsyScope 1.2.2 was used to randomize and present the stimuli via BeyerDynamic DT100 headphones at a comfortable listening level. A response box containing a dedicated timing board provided millisecond accuracy for the recording of response times.

In each trial, the word “READY” appeared on the screen for 500 ms. Participants heard one of the randomly selected stimuli and were instructed to repeat the word as quickly and accurately as possible. Reaction times were measured from stimulus onset to the onset of the participant’s verbal response. Verbal responses were recorded for offline scoring of accuracy. The next trial began 1 s after the participant’s response was made. Prior to experimental trials, each participant received five practice trials to become familiar with the task; these trials were not included in the subsequent analyses.

Results and Discussion

Reaction times and accuracy were the dependent variables of interest. Reaction times shorter than 500 ms or longer than 2,000 ms (which approximated ±2 SD from the mean) were considered to be outliers and were excluded from the data. This resulted in the removal of < 1 % of the data. Only accurate word responses with reaction times between 500 ms and 2,000 ms were included in the analysis of reaction times. A within-participants ANOVA was used to analyze the data.
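The trimming rule described above can be sketched as follows. The trial records are made up purely for illustration, and the field names (`participant`, `condition`, `rt`, `correct`) and function names are assumptions rather than the format of the actual data files.

```python
from statistics import mean


def trim(trials, low=500, high=2000):
    """Keep accurate trials whose RT lies between low and high ms."""
    return [t for t in trials if t["correct"] and low <= t["rt"] <= high]


def condition_means(trials):
    """Mean trimmed RT for each (participant, condition) cell."""
    cells = {}
    for t in trim(trials):
        key = (t["participant"], t["condition"])
        cells.setdefault(key, []).append(t["rt"])
    return {key: mean(rts) for key, rts in cells.items()}


# Made-up trials for one hypothetical participant.
TRIALS = [
    {"participant": 1, "condition": "low", "rt": 900, "correct": True},
    {"participant": 1, "condition": "low", "rt": 2400, "correct": True},    # too slow
    {"participant": 1, "condition": "high", "rt": 950, "correct": True},
    {"participant": 1, "condition": "high", "rt": 1000, "correct": False},  # error
    {"participant": 1, "condition": "high", "rt": 450, "correct": True},    # too fast
]
MEANS = condition_means(TRIALS)
```

The resulting per-participant condition means are the values that would enter a within-participants comparison of high and low 2-hop density words.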

Participants named low 2-hop density words (M = 931 ms, SD = 109) more quickly than high 2-hop density words (M = 950 ms, SD = 104), F(1, 29) = 15.53, p < .001, ηp² = .349. Accuracy did not differ between low 2-hop density words (M = 99.1 %, SD = 2.0) and high 2-hop density words (M = 98.2 %, SD = 3.3), F(1, 29) = 2.10, p = .16, ηp² = .068.
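As a check on the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity F × df_effect / (F × df_effect + df_error). The short sketch below applies that identity to the two F values reported above; the function name is an assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Reaction times: F(1, 29) = 15.53
print(round(partial_eta_squared(15.53, 1, 29), 3))  # 0.349
# Accuracy: F(1, 29) = 2.10
print(round(partial_eta_squared(2.10, 1, 29), 3))   # 0.068
```

Both values agree with the effect sizes reported in the text.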

The results from Experiment 1 indicated that words with low 2-hop density were named more quickly than words with high 2-hop density—words found in less interconnected 2-hop neighborhoods were named more quickly than words found in more interconnected 2-hop neighborhoods.

Experiment 2: Lexical Decision

Experiment 2 sought to replicate the above finding in another commonly used psycholinguistic task: auditory lexical decision. Participants were presented with words and nonwords and had to decide whether each stimulus was a real word. In a lexical decision task, the real words are the stimuli of interest, whereas nonwords (made-up words) act as foils.

Method

Participants

Forty native English speakers were recruited from the same population described in Experiment 1. All participants were right-handed and had no previous history of speech or hearing disorders; none took part in Experiment 1.

Materials

The word stimuli for the present experiment consisted of the same 40 words used in Experiment 1. In addition, a list of 40 phonotactically legal nonwords was constructed by replacing a phoneme (at any position except the first and last positions) of the word stimuli with another phoneme. For instance, the nonword vilt (/vilt/) was created by replacing /ɔ/ in vault (/vɔlt/) with /i/. The phonological transcriptions of the nonwords are listed in Appendix B. The nonwords were recorded by the same male speaker in a similar manner as in Experiment 1. The same method for editing and digitizing the word stimuli was used to create individual sound files for each nonword. The Normalization function in SoundEdit16 was used to ensure that all word and nonword sound files were comparable in amplitude. The duration of the stimulus sound files was equivalent across words and nonwords. The nonwords served only as foils, and the lexical decision data obtained for them were not analyzed.

Procedure

Participants were tested in groups no larger than three. The same equipment used in Experiment 1 was used in the present experiment, except that participants responded by pressing buttons on the response box, whose dedicated timing board recorded the response times.

In each trial, the word “READY” appeared on the screen for 500 ms. Participants heard one of the randomly selected stimuli and were instructed to decide, as quickly and accurately as possible, whether the item heard was a real English word or a nonword. If the item was a word, participants pressed the button labeled “WORD” with their right (dominant) index finger. If the item was a nonword, participants pressed the button labeled “NONWORD” with their left index finger. Reaction times were measured from stimulus onset to the onset of the participant’s button press. The next trial began 1 s after the participant’s response was made. Prior to experimental trials, each participant received eight practice trials to become familiar with the task; these trials were not included in the subsequent analyses.

Results and Discussion

Reaction times and accuracy were the dependent variables of interest. Reaction times shorter than 500 ms or longer than 2,000 ms (approximately ±2 SD from the mean) were considered to be outliers and were excluded from the data. This resulted in the removal of < 1 % of the data. Only accurate word responses with reaction times between 500 ms and 2,000 ms were included in the analysis of reaction times. A within-participants ANOVA was used to analyze the data.

Participants responded to low 2-hop density words (M = 955 ms, SD = 118) more quickly than high 2-hop density words (M = 987 ms, SD = 113), F(1, 39) = 14.30, p = .001, ηp² = .268. There was no difference in accuracy between the low 2-hop density words (M = 91.45 %, SD = 6.6) and high 2-hop density words (M = 92.25 %, SD = 7.1), F(1, 39) < 1, p = .54, ηp² = .01.

The results from Experiment 2 mirrored those of Experiment 1: words with low 2-hop density were responded to more quickly than words with high 2-hop density. Together, the results indicate a processing advantage for words found in less interconnected 2-hop neighborhoods as compared to words found in more interconnected 2-hop neighborhoods.

General Discussion

The present findings showed that the connectivity of distant, indirect phonological neighbors influences language processing. Specifically, across both tasks, words with less interconnected 2-hop neighborhoods (i.e., low 2-hop density) were responded to more quickly than words with more interconnected 2-hop neighborhoods (i.e., high 2-hop density).

This finding of a processing advantage for words with low 2-hop density compared to words with high 2-hop density can be accounted for via a simple diffusion framework implemented in the phonological network described by Vitevitch (2008). In this framework, activation is spread back and forth between the target, its 1-hop and 2-hop neighbors, and other words in the phonological network. Over time, greater amounts of activation remain among the 1-hop and 2-hop neighbors of words that reside in more interconnected 2-hop neighborhoods (i.e., high 2-hop density words), rather than diffusing to the rest of the phonological network. On the other hand, relatively less activation remains among the 1-hop and 2-hop neighbors of words that reside in less interconnected 2-hop neighborhoods (i.e., low 2-hop density words), as most of the activation spreads to the rest of the phonological network. Based on this account, it is more difficult for a high 2-hop density word to “stand out” from its competitors (1-hop and 2-hop neighbors) than for a low 2-hop density word to do so. Note that an analogous explanation was used to account for the clustering coefficient effect (Chan & Vitevitch, 2009; see also the computer simulation reported in Vitevitch et al., 2011).

It is worthwhile to note briefly the limitations of the diffusion framework described above, particularly with regard to the phonological Levenshtein distance (PLD20) effects reported in Suárez, Tan, Yap, and Goh (2011), where lexical hermits (i.e., words with no phonological neighbors based on the traditional 1-phoneme similarity metric) with “close” Levenshtein neighbors were recognized less quickly than lexical hermits with “distant” Levenshtein neighbors (see also Yarkoni, Balota, & Yap, 2008). Within the context of the phonological language network these lexical hermits would not possess any direct connections, and it is unclear how the diffusion framework might account for PLD20 effects.
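The diffusion account described above can be illustrated with a minimal spreading-activation sketch. It is written in the spirit of the framework, but it is not a reproduction of the simulation in Vitevitch et al. (2011): the update rule (each node keeps a fixed proportion of its activation and passes the rest equally to its neighbors), the `retention` value of 0.5, the toy network, and all names are assumptions.

```python
def diffuse(network, target, steps=5, retention=0.5):
    """Spread one unit of activation from `target` for `steps` time steps.

    At each step every node keeps `retention` of its activation and
    divides the remainder equally among its neighbors (assumed rule).
    """
    activation = {node: 0.0 for node in network}
    activation[target] = 1.0
    for _ in range(steps):
        updated = {node: 0.0 for node in network}
        for node, amount in activation.items():
            neighbors = network[node]
            if not neighbors:
                updated[node] += amount  # an isolated node keeps everything
                continue
            updated[node] += retention * amount
            share = (1.0 - retention) * amount / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        activation = updated
    return activation


# Toy network (assumed): a1, a2 are 1-hop neighbors of the target;
# b1, b2 are 2-hop neighbors; "rest" stands in for the wider lexicon.
TOY = {
    "target": {"a1", "a2"},
    "a1": {"target", "b1"},
    "a2": {"target", "b2"},
    "b1": {"a1", "rest"},
    "b2": {"a2", "rest"},
    "rest": {"b1", "b2"},
}

result = diffuse(TOY, "target", steps=5)
# Activation remaining among the 1-hop and 2-hop neighbors after diffusion.
remaining = sum(result[node] for node in ("a1", "a2", "b1", "b2"))
```

To explore the account, one could run the same function on two networks matched on 1-hop and 2-hop degree but differing in the number of links among those neighbors, and compare how much activation remains in the neighborhood versus at the target.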

Nevertheless, the finding that the amount of interconnectivity among a word’s indirect neighbors influences processing poses a challenge to widely accepted models of spoken word recognition and speech perception, such as the Cohort Model (Marslen-Wilson, 1987; Gaskell & Marslen-Wilson, 1997), TRACE (McClelland & Elman, 1986), Shortlist B (Norris & McQueen, 2008), the Neighborhood Activation Model (Luce & Pisoni, 1998), and its computational instantiation PARSYN (Luce, Goldinger, Auer, & Vitevitch, 2000). Current models of spoken word recognition do not explicitly consider how phonological similarity beyond that of a word’s local neighbors influences processing. Simulations of jTRACE in Chan and Vitevitch (2009) showed that TRACE could not account for the clustering coefficient effect, which reflects the interconnectivity of a word’s local neighborhood. However, the diffusion network in Vitevitch et al. (2011) was able to account for the clustering coefficient effect. Therefore, it is unlikely that TRACE or other models of spoken word recognition could account for the present finding that the interconnectivity of a word’s distant neighborhood influences lexical retrieval.

The Network Science approach can advance the field of psycholinguistics in two key ways. From a theoretical standpoint, the Network Science approach compels language researchers to consider how the structure of the mental lexicon, the part of long-term memory where lexical representations are stored, influences various aspects of language processes. Within complex systems, researchers recognize that a complete understanding of how a complex system works is not possible without also considering the structural properties of that system (Strogatz, 2001). Language systems are no exception. However, current models of spoken word recognition have typically focused on delineating language processes without an explicit consideration of how the structure of the mental lexicon affects these processes.

From a methodological perspective, the Network Science approach provides language researchers with tools to measure the structure of the mental lexicon at various levels of the system. Specific to the phonological network, this approach can be used to model and quantify the phonological similarity of words at various levels of the network. Local similarity measures, such as 1-hop degree (number of immediate phonological neighbors) and C (which quantifies the internal structure of a word’s immediate neighborhood), and broader similarity measures, such as 2-hop degree (number of indirect phonological neighbors) and 2-hop density (which quantifies the internal structure of a word’s 2-hop neighborhood), can be generated, which can then be used to test specific hypotheses about how the structure of the mental lexicon affects language processes.

Earlier studies showed that the immediate structure of the lexicon influenced processing (Chan & Vitevitch, 2010; Goldstein & Vitevitch, 2014; Vitevitch et al., 2012). The experiments reported in this paper build on this body of work by demonstrating, for the first time, that the structure of a word’s distant neighborhood influences spoken word recognition as well, indicating that lexical processes are sensitive to the level of connectivity among a word’s distant neighbors.