Priming and models of language processing

As in most domains of cognitive psychology, understanding the processes that underlie language behavior is an empirical challenge because these processes cannot be observed directly. Instead, the field relies on analyses of performance in language-oriented tasks to shed light on these processes and then to formulate models of language representation and use. A classic example is the use of priming tasks, which have provided extensive insight into the ways that language knowledge is stored, activated, and used during comprehension and production (Pickering & Ferreira, 2008). Our previous work (Tooley, Konopka, & Watson, 2014) suggested that, unlike every other level of linguistic representation tested so far, one aspect of prosody—intonational phrase boundaries (IPBs)—is not amenable to priming. A further puzzle is that previous work has suggested that another aspect of prosody—speech rate—is primeable (Jungers & Hupp, 2009; Jungers, Palmer, & Speer, 2002). In this article, we first aim to replicate this prosodic priming asymmetry in one experiment; then we investigate priming for IPBs and pitch accenting in two further experiments, to assess the similarity of the underlying representations of these aspects of prosody. Below we first discuss the role of priming in language research and the types of inferences made from priming effects. Then we present three studies investigating the priming of intonational boundaries, speech rate, and pitch accenting.

Priming as a method

The term priming usually refers to a facilitation of a construction/retrieval mechanism that the language user deploys due to recent experience with similar representations. For example, in lexical priming studies, participants are normally faster in lexical decision tasks after exposure to related words (e.g., Meyer & Schvaneveldt, 1971). These findings helped to motivate models of semantic memory (Collins & Loftus, 1975; Collins & Quillian, 1970; see also Gabora, Rosch, & Aerts, 2008; Rosch, 1975) and lexical ambiguity resolution (e.g., Swinney, 1979), and they paved the way for more specific models of lexical access during comprehension (McClelland & Elman, 1986; Vigliocco, Vinson, Damian, & Levelt, 2002).

More recently, priming tasks have also been used to investigate complex representations like syntactic structure. Studies of syntactic priming show that speakers reuse recently processed structures when producing novel sentences, such as when describing pictured events or completing sentence preambles (Bock, 1986; see Pickering & Ferreira, 2008, for a review). Priming also occurs in dialogue, including alignment of terminology (Brennan & Clark, 1996; Schober & Clark, 1989) and syntactic structures (Branigan, Pickering, & Cleland, 2000) between conversation partners, as well as higher-level representations, like situation models (Schober, 1993). Such findings helped motivate the interactive alignment model (Garrod & Pickering, 2004), which posits a language mechanism that causes language users to adapt their linguistic representations to those of their conversation partners in order to facilitate communication.

Thus, priming tasks provide a fruitful avenue to study the mental representations and processing of language: The extent to which priming is observed for a specific aspect of language has implications for whether, and how, that information is represented during processing. Recently, priming tasks have also been used to investigate how different aspects of prosody are represented and planned during production (Jungers & Hupp, 2009; Tooley et al., 2014). This line of research provides novel and important evidence about the nature of prosodic representations and helps to establish where prosody fits in a process model of language production.

Priming for prosodic representations

Prosody refers to acoustic aspects of spoken language, such as rhythm, pitch, intonation, and speech rate, that are specific not to individual vowels or consonants but to larger units such as words or phrases in an utterance. Intonational phrasing refers to perceptual groupings of words within an utterance. Intonational phrases are separated by intonational phrase boundaries (IPBs), which are perceived as pauses, and can be recognized by a pause in sound energy and/or lengthening of the preboundary word and tonal movement at the end of the phrase (Pierrehumbert & Hirschberg, 1990; see Wagner & Watson, 2010, for a review). How this aspect of prosody is represented and planned during production processes is still unclear. However, if IPBs are a structuring property of spoken language (grouping words together in time) in the same way that syntax is a structuring property of language (grouping words into grammatical phrases), they may be expected to prime in the same way that syntactic structures can be primed.

Yet, our earlier work in this area found no evidence of priming for IPBs (Tooley et al., 2014). We manipulated the presence of a boundary at two syntactic locations in prime sentences that were presented to participants auditorily. Participants repeated the prime sentences, and then silently read and repeated a visually presented target sentence out loud, from memory. In three experiments, participants produced pauses at the primed locations in the prime sentences they repeated, but these effects did not carry over to the target sentences. This was the case whether participants repeated back the prime sentence (Exp. 2) or not (Exp. 3) before receiving the target sentence. These findings are markedly different from priming effects observed at other levels of linguistic representation, including syntax, where experience with a prime sentence reliably affects syntactic choice in a target sentence (e.g., Bock, 1986; Pickering & Branigan, 1998). To our knowledge, intonational boundary production may be the only aspect of linguistic representation reported thus far that is not amenable to priming.

One possible explanation for the lack of IPB priming is that speakers do not create a separate, abstract plan for when and where to produce boundaries in a sentence. Thus, there may be no representation to prime. Instead, boundaries may be triggered by cues from the syntactic and semantic processing stages (see Tooley et al., 2014, for a production model that incorporates this account of IPBs).

Although these findings offer an explanation for how one aspect of prosody may be represented during planning, they also present a puzzle. There have been a number of reports of robust priming effects for different aspects of prosody. For example, interlocutors’ F0 and intensity become more alike over the course of a conversation (de Looze, Scherer, Vaughan, & Campbell, 2014; Levitan & Hirschberg, 2011; Ward & Litman, 2007). This entrainment is linked to real-world behavior. For example, the amount of prosodic convergence that occurs in a conversation can be used to predict positive and negative affect in couples undergoing marriage counseling (Lee et al., 2010). Additional work suggests that prosodic entrainment can interact with the content of the conversation: Couples who entrain while discussing a conflict are less likely to resolve that conflict (Weidman, Breen, & Haydon, 2016). While this line of work reveals a strategic or communicative dimension to prosodic entrainment, it also points to the possibility of priming occurring for the underlying prosodic representation.

There is also clear evidence of prosodic priming for speech rate in nonconversational tasks that closely resemble traditional priming paradigms (Jungers & Hupp, 2009; Jungers et al., 2002). Jungers and Hupp auditorily presented participants with three prime sentences, spoken at a fast or slow rate, while they viewed a clipart image depicting the meaning of each sentence. Participants were then asked to describe a new target image. The rate of their productions for target sentences depended on the rate of the prime sentences they heard (target speech rate was faster after fast primes and slower after slow primes). The fact that encountered speech rate does prime speaking rate of later utterances suggests that speech rate may be planned separately from other representations.

If different aspects of prosody share a common type of underlying representation and are planned at a similar stage of processing, then in principle they should be equally primeable. Yet the evidence presented above suggests that this is not the case: Unlike with speech rate, we have no evidence that IPBs can be primed. We extend this work in the present study by testing whether the same type of linguistic representations underlie three different aspects of prosody: speech rate, intonational boundaries, and pitch accents. We use a priming paradigm again to test whether production of these aspects of prosody persists from one sentence to another.

The goal of Experiment 1 was to verify that the priming asymmetry between speech rate and IPBs exists when both are investigated within the same experiment. Assessing priming for speech rate and IPBs in one experiment is critical to eliminate the possibility that differences in participants and methodology across studies produced the observed differences in priming. Next, Experiment 2 tested whether priming for intonational phrase boundaries may occur only when these boundaries have communicative value. Finally, Experiment 3 tested the validity of our conclusions for IPBs by examining priming of an aspect of prosody that has not been investigated previously: pitch accenting. If the lack of priming for IPBs is due to processing constraints or representational constraints on this specific aspect of prosody, one might expect other aspects of prosody (i.e., pitch accenting) to show priming. However, if both IPBs and pitch accenting are found to be immune to priming manipulations, this would support the claim that not all aspects of prosody are subsumed under the same processing stage.

Experiment 1

Experiment 1 used a prime–target paradigm to test whether boundary placement and speech rate of prime sentences can influence the production of new target sentences. Participants listened to and immediately repeated back the prime sentences they heard. These sentences either had no intonational phrase boundaries or had a boundary spliced in at a syntactically preferred location. The sentences were then resynthesized to be either 10% faster or 10% slower than the originally recorded speaking rate (i.e., the naturally produced rate of the speaker). Primes were followed by target trials, in which speakers silently read a novel sentence and then repeated it aloud from memory. Durational and perceptual measures were used to determine whether participants persisted in producing the speech rate and IPBs in the target sentences that they had heard in the primes.

Method

Participants

In all, 64 students from the University of Illinois participated for course credit. Participants in all three experiments were native speakers of English with normal hearing and normal (or corrected-to-normal) vision.

Materials

We used the experimental sentences from Tooley et al. (2014). The experimental set consisted of 40 items: 20 sentences with relative clauses (e.g., The dolphin that tossed the ball wanted a reward for his trick) and 20 sentences with main clauses (e.g., The girl bought new clothes at the mall today; Appx. 1). Two sentences with the same syntactic structure were yoked together to create 20 prime–target pairs (ten main-clause and ten relative-clause pairs).

To create the boundary manipulation, two versions of each sentence were initially recorded by a native English speaker: one with and one without an IPB at the critical location. The critical boundary location followed the second noun (e.g., The dolphin that tossed the ball // wanted a reward for his trick), as the clause boundary and boundary between the noun and verb phrases make this a natural location for a boundary (e.g., Truckenbrodt, 1999; Watson & Gibson, 2004). For the purpose of our analyses, the critical boundary region includes the word immediately preceding the boundary, the boundary itself, and the word after the boundary. All experimental sentences were created by splicing critical regions from recordings of each condition into a neutral carrier sentence that had no prosodic boundaries. This ensured that the prosody of regions that were outside of the critical region did not influence perception of the critical region. This splicing procedure was used to create sentences in the control condition (with no boundaries) as well as the experimental condition (with a boundary at the critical region). On average, sentences with a boundary were approximately 400 ms longer than those with no boundaries. The sentences were then subjected to a rate manipulation using a rate-resynthesizing script in PRAAT that created two sentence versions that were 10% slower and 10% faster than the original sentences, respectively. (The stimuli are available at the following link: https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/LHQZDQ).

The boundary manipulation crossed with the rate manipulation yielded four conditions: fast sentences without boundaries, fast sentences with boundaries, slow sentences without boundaries, and slow sentences with boundaries. Both factors were counterbalanced within-participants and within-items, so each participant saw each sentence in only one of these conditions. Within lists, each participant received five items in each of the four conditions. Additionally, each sentence could appear both as a prime and as a target. Thus, we created eight lists of stimuli to counterbalance the four conditions as well as the prime/target status of each sentence (referred to as sentence position below) on that list. The experimental sentences were arranged such that no more than two items from the same condition followed one another. Targets always immediately followed primes, and three filler sentences intervened between all prime–target pairs.
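The counterbalancing scheme described above can be sketched as a Latin-square rotation. This is only an illustration under stated assumptions, not the authors' actual list-construction procedure: the condition labels, the sentence labels "A" and "B", and the function name `build_lists` are hypothetical, and filler placement and trial ordering are not modeled.

```python
from collections import Counter
from itertools import product

# Hypothetical labels for the 2 (rate) x 2 (boundary) design cells.
CONDITIONS = list(product(("fast", "slow"), ("no_boundary", "boundary")))
N_PAIRS = 20  # 40 sentences yoked into 20 prime-target pairs


def build_lists(n_pairs=N_PAIRS, conditions=CONDITIONS):
    """Build 8 lists: 4 condition rotations x 2 prime/target assignments."""
    lists = []
    for swap in (0, 1):  # which sentence of the pair serves as the prime
        for shift in range(len(conditions)):  # rotate conditions across pairs
            trials = []
            for pair in range(n_pairs):
                rate, boundary = conditions[(pair + shift) % len(conditions)]
                trials.append({
                    "pair": pair,
                    "prime": ("A", "B")[swap],  # hypothetical sentence labels
                    "rate": rate,
                    "boundary": boundary,
                })
            lists.append(trials)
    return lists


lists = build_lists()
# Number of pairs per condition cell in the first list.
cell_counts = Counter((t["rate"], t["boundary"]) for t in lists[0])
```

Under this rotation, every list contains five pairs per condition, and across the eight lists each pair occurs once in every combination of condition and prime/target assignment.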

Filler sentences included a variety of syntactic structures (e.g., cleft constructions, sentences with fronted prepositional phrases, sentences with that-complements, and sentences with fronted temporal phrases). To reduce the salience of the manipulations in the primes, the fillers also varied with respect to IPBs and speaking rates. Roughly, one half of the fillers had one boundary, one quarter had two boundaries, and one quarter had no boundaries. Boundaries were produced naturally by the speaker and did not include any splicing. Half of the filler sentences were presented at the original recording rate, one quarter were resynthesized to be 10% faster, and one quarter were resynthesized to be 10% slower.

Procedure

The procedure used was the same as in Tooley et al.’s (2014) second experiment. Participants were told that they would either hear recorded sentences or read sentences printed on the screen. After either hearing a sentence or silently reading a sentence, their task was to repeat the sentence back out loud. If the sentence was presented auditorily, the word LISTEN appeared and remained on the screen while the recording played. At sentence offset, the word REPEAT appeared on the screen to prompt participants to repeat the sentence. Participants then pressed the spacebar to advance to the next sentence. If the sentence was presented visually (i.e., if it was printed on the screen), participants first saw the word READ for 1 s, followed by the sentence. The sentence remained on the screen for an amount of time equal to 50 ms multiplied by the number of words in the sentence. After that amount of time had elapsed, the word REPEAT appeared on the screen, prompting participants to repeat the sentence aloud from memory. Participants then pressed the spacebar to advance to the next trial.

The prime sentences were always listen-and-repeat trials (as these recordings contained the manipulations), and the target sentences were always read-and-repeat trials (so they were prosodically neutral). Roughly half of the filler sentences were randomly assigned to be presented as listen-and-repeat trials, and half as read-and-repeat trials. The modality of fillers remained constant across all eight lists, and varied throughout the experiment to reduce the predictability of the trial type. The experiment started with a practice block of four listen-and-repeat and four read-and-repeat sentences, presented in a pseudorandom order.

Scoring and analyses

We excluded responses in which participants changed the syntactic structure of the sentence, paused for extended periods of time (average pause time of 1.36 s for excluded trials), produced disfluencies at or near the critical sentence region, or produced sentence fragments. Minor wording changes were acceptable. These exclusion criteria left 1,131 trials (out of 1,280 total trials) for analysis. Participants’ boundary productions were assessed in two ways: One coder (the first author) rated whether or not a boundary was discernible in the critical region, and a second coder (the second author) measured the duration of the preboundary word through the onset of the first postboundary word. Total speaking time of each sentence was also measured. Coders were blind to condition in all experiments.

Analyses were carried out in R (R Development Core Team, 2008) using logit mixed models for the measure of perceived intonational boundaries, and linear mixed-effects models for the analyses of word-and-pause durations and total sentence speaking durations (Baayen, Davidson, & Bates, 2008; Jaeger, 2008). Prime boundary (present vs. absent), speech rate (fast vs. slow), and sentence position (prime vs. target) were included as mean-centered fixed effects (along with all interactions), and all models estimated random effects for participants and items. In all experiments, the maximal version of the models (warranted by the design) was used unless this resulted in nonconvergence. In those cases, random effects were removed on the basis of the size of their variance components (smaller effects were removed first) until the model reached convergence. All effects were considered significant at α = .05.
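To illustrate what mean-centering a two-level factor involves, the following is a minimal sketch in Python rather than R. The 0/1 codes, variable names, and helper functions are hypothetical and are not taken from the authors' analysis scripts. With centered predictors, each main-effect coefficient is estimated at the average of the other factors rather than at their reference level, which is what makes main effects interpretable when interactions are in the model.

```python
def mean_center(codes):
    """Subtract the mean from a list of 0/1 dummy codes."""
    mean = sum(codes) / len(codes)
    return [c - mean for c in codes]


def interaction(x, y):
    """Interaction term as the elementwise product of two centered predictors."""
    return [a * b for a, b in zip(x, y)]


# Hypothetical trial-level codes: rate (0 = fast, 1 = slow),
# boundary (0 = absent, 1 = present).
rate = [0, 0, 1, 1]
boundary = [0, 1, 0, 1]

rate_c = mean_center(rate)          # balanced data: codes become -0.5 / +0.5
boundary_c = mean_center(boundary)
rate_by_boundary = interaction(rate_c, boundary_c)

# After exclusions the cells are rarely perfectly balanced; centering on the
# observed mean still yields a predictor that sums to zero.
unbalanced_c = mean_center([0, 1, 1, 1])
```

In a balanced design this is equivalent to coding the two levels as -0.5 and +0.5. After trial exclusions, the centered values shift slightly away from those values.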

Results and discussion

Sentence speaking duration

The listen-and-repeat (prime) sentences were spoken faster when the original recording was the fast sentence version, and slower when the original recording was the slow sentence version (Fig. 1, left panel). This effect carried over into the read-and-repeat (target) sentences (Fig. 1, right panel).

Fig. 1 Total sentence speaking durations: Mean total sentence speaking durations for (left) prime sentences and (right) target sentences across the boundary and speaking rate conditions. Error bars represent 95% confidence intervals

The analysis of overall speaking times showed significant main effects of speech rate and boundary, as well as an interaction between speech rate and sentence position (Table 1). Participants spoke faster and slower after hearing fast and slow prime sentences, respectively, but this effect was smaller in the targets than in the primes. Participants also spoke more slowly when they heard a prime sentence with a boundary. Follow-up analysis of these effects in target sentences alone revealed a significant effect of prime speaking rate, suggesting that the speaking rate of the prime sentences did influence participants’ speaking rates in the targets.

Table 1 Analyses of total sentence speaking duration for all sentences (primes and targets)

Production of intonational phrase boundaries

The repeated prime sentences were longer and contained a perceived boundary more often when the recorded prime sentence also had a boundary (Fig. 2, left panel). This effect, however, did not carry over into the read-and-repeat (target) sentences (Fig. 2, right panel).

Fig. 2 Perceived pauses: Proportions of perceived pauses in (left) the prime sentences and (right) the target sentences, across the boundary and speaking rate conditions. Error bars represent 95% confidence intervals

The overall analysis of perceived pauses revealed a main effect of boundary, a marginal effect of speaking rate, and an interaction between boundary and sentence position (Table 2). Participants were slightly more likely to produce a boundary after hearing a slow-rate prime. They also produced boundaries at the critical region more often when they were primed to do so, but this effect depended on sentence position: Participants reproduced the heard boundaries in the prime sentences but did not generalize these boundaries to the target sentences. A follow-up analysis restricted to the target sentences confirmed that the effect of speaking rate was significant (p = .037) but the effect of boundary was not (p = .10). In other words, hearing a slower prime increased the chances that participants would produce a boundary in the target, but hearing a boundary in the prime sentence did not.

Table 2 Analyses of perceived boundary production and word-and-pause durations at the critical sentence region of all sentences (primes and targets)

A similar pattern was observed with word-and-pause durations: Word-and-pause durations were longer in sentences produced after hearing slow primes than in sentences after fast primes, and after hearing primes with boundaries than after primes without boundaries (Fig. 3). This resulted in main effects of boundary and speaking rate but no interaction (Table 2). Importantly, we observed an interaction between boundary and sentence position, in which the effect of boundary on word-and-pause durations was limited to the repeated prime sentences. There was also a weak interaction between speaking rate and sentence position, in which the effect of speaking rate was again limited to the repeated prime sentences.

Fig. 3 Word-and-pause durations: Mean word-and-pause durations for the critical regions of (left) the prime and (right) the target sentences, broken down by boundary and speaking rate conditions. Error bars represent standard errors

Thus, Experiment 1 replicated previous studies (Jungers & Hupp, 2009; Jungers et al., 2002; Tooley et al., 2014): Participants persisted in their use of a faster or slower speaking rate when the recorded prime sentences, respectively, also had a faster or slower rate. However, they did not persist in their use of intonational boundaries at the critical target sentence location when the prime sentence contained a boundary at that location. This supports the observation that speaking rate is much more amenable to priming than is the production of IPBs. Thus, different types of underlying representations and/or processes may be involved in the production of IPBs and speaking rate.

Interestingly, when participants heard a slow-rate prime, their production of the target sentence was more likely to contain a boundary. Likewise, the presence of a boundary in the prime sentence resulted in participants taking longer to produce the target sentence. Participants may have perceived an overall speaking rate that was slower in a prime sentence with a boundary, leading to an overall reduction in their speaking rate. This is consistent with earlier work (Lass, 1970) suggesting that the perception of speech rate is influenced by the presence of a boundary. Thus, our results are consistent with earlier work showing a relationship between speaking rate and boundary production (e.g., Gee & Grosjean, 1983). Though not the primary focus of this study, this interplay between boundary production and speaking rate can provide novel insight into the relationship between the perception and production of prosody.

Naturally, there are limitations to these conclusions. The absence of a priming effect for boundaries does not necessarily mean that no effect was present, since the null finding could reflect an inability to detect such effects in the present paradigm. However, we have consistently found that participants are more likely to reproduce boundaries heard in prime sentences (Tooley et al., 2014). This implies that our manipulation is not too weak to influence production and that participants do in fact retain some prosodic information from the prime sentences. Our paradigm was also successful in showing variation in participants’ boundary production, but importantly, this effect was not driven by the boundary-priming manipulation.

One plausible alternative explanation for the lack of IPB priming concerns the optionality and information value of the boundaries in the prime sentences. In Experiment 1, as well as in previous studies, the boundaries produced in the primes were not strictly necessary and did not add syntactic or semantic information that might influence comprehension. Thus, primes may have been ineffective because the boundaries did not contribute to participants’ interpretation of the sentences. It is therefore plausible that priming might be observed in sentences with more “meaningful” boundaries. We tested this possibility in Experiment 2.

Experiment 2

Previous studies had used sentence structures in which IPBs were optional and did not add meaningful syntactic or semantic information to the sentence, which may have decreased the saliency of the boundaries. Thus, in Experiment 2 we used the same prime–target paradigm and the same measures of boundary production as in Experiment 1, but with new, ambiguous sentences in which boundaries supported disambiguation. We manipulated the presence of a boundary in the prime sentences (with no manipulation of speech rate). The target sentences were always ambiguous (e.g., She put the money in the basket on the table), so their structural interpretations could be influenced by the presence of a boundary in the critical location (i.e., between the phrases in the basket and on the table in the present example). If priming for IPBs is dependent on the saliency or meaningfulness of those boundaries to the listener, then participants should be more likely to produce a boundary at the critical location in target sentences when they have heard a boundary in that location in the primes.

Method

Participants

A total of 74 Texas State University undergraduates participated for course credit. One participant failed to follow the instructions and was excluded from the dataset.

Materials and design

The experimental stimuli consisted of a set of 40 sentences that described transfer-of-location events. The sentences were either ambiguous or unambiguous and either included a boundary at a critical location or did not (Examples 2a–2d). In the ambiguous conditions (2a, 2b), the sentences could be interpreted in two ways: Someone is putting money in a basket that is on a table, or someone is taking money that was in the basket and putting it on a table. The absence of a boundary (Sentence 2a) suggests the former interpretation, whereas a boundary after the word basket (Sentence 2b) suggests the latter.

(2a) She put the money in the basket on the table. (Ambiguous, No boundary)

(2b) She put the money in the basket // on the table. (Ambiguous, Boundary)

(2c) She put the money for the basket on the table. (Unambiguous, No boundary)

(2d) She put the money for the basket // on the table. (Unambiguous, Boundary)

However, it is also possible that an ambiguous prime with a boundary might reinforce a particular syntactic interpretation, which could then influence the syntactic interpretation (and its appropriate boundary) of the target, via syntactic priming. For example, it is possible that interpreting in the basket as a location in Sentences 2a and 2b, rather than as a modifier, would prime the same interpretation of this phrase in the target sentence. If so, participants might produce more boundaries at the critical location in the target merely due to the persistence of a syntactic frame rather than due to the meaningfulness of the boundary. Thus, to allow for interpreting the effects of the boundary manipulation, we crossed the boundary manipulation with the ambiguity manipulation. Each sentence in the set included two unambiguous versions (Sentences 2c and 2d), created by changing a single word (e.g., money in the basket vs. money for the basket). Critically, both the ambiguous and unambiguous sentences should have the same structural priming effect on an ambiguous target, which would control for effects of syntax.

This design allows us to test for effects of communicativeness on IPB priming. The boundary in Sentence 2d occurs in the same location as in Sentence 2b, but the former boundary provides information that is redundant with the syntax. Comparing these conditions allows us to examine priming in contexts in which IPBs are highly informative syntactically and in contexts in which they are less informative. If boundary priming depends on the boundary’s communicative value, effects of boundary meaningfulness should result in stronger priming in the ambiguous than in the unambiguous condition.

The boundary manipulation was achieved via the same cross-splicing method as in Experiment 1. Half of the stimulus set (20 sentences) had the critical boundary location after what was the first noun phrase, and half had it after what was the first prepositional phrase in these sentences (see Examples 2b and 2d). Thus, critical boundary location was manipulated between-items but within-participants. (The stimuli are available at https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/36OOH3.)

Each sentence in the set was yoked to another sentence to form a prime–target pair. The prime sentence was presented in one of the four conditions (as in Sentences 2a–2d), and the target sentence was always ambiguous (e.g., He threw the marble in the bucket in the yard; target sentences were presented visually once again, and thus had no prosody). Ambiguity, boundary, and sentence position (prime or target position) were counterbalanced within participants and within items to create eight experimental lists. Each participant received five items in each of the four conditions obtained by crossing ambiguity and boundary. Three filler sentences intervened between each prime–target pair. The filler sentences included relative clause sentences (like the target sentences from Exp. 1), main clause sentences, and ambiguous sentences such as He touched the plant with the leaf. The fillers also included naturally produced boundaries at varying syntactic locations.

Procedure

The experimental procedure was identical to that of Experiment 1.

Scoring and analyses

The boundary scoring procedures were the same as in Experiment 1. Trials in which the length of the critical region was three standard deviations above the mean (i.e., longer than 1.02 s) were eliminated from the dataset. Furthermore, applying the same exclusion criteria as in Experiment 1 resulted in a loss of 13% of the data, leaving 2,576 trials (out of 2,960 possible trials) for analysis.
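The duration-based trimming rule can be sketched as follows. This is a minimal illustration, assuming the cutoff is the grand mean plus three sample standard deviations computed over all critical-region durations; whether the authors used the sample or population standard deviation, or computed it per condition, is not stated. The function name and the durations are hypothetical, not data from the experiment.

```python
from statistics import mean, stdev


def trim_upper(durations, n_sd=3.0):
    """Drop durations more than n_sd sample SDs above the mean.

    Returns the retained durations and the cutoff that was applied.
    """
    cutoff = mean(durations) + n_sd * stdev(durations)
    kept = [d for d in durations if d <= cutoff]
    return kept, cutoff


# Hypothetical critical-region durations in seconds: 19 typical trials
# plus one very long trial.
durations = [0.40] * 19 + [2.00]
kept, cutoff = trim_upper(durations)
```

Because one extreme value inflates both the mean and the standard deviation, a rule like this only removes an observation when the sample is large enough for that value to stand out. With very few trials, no value can exceed the cutoff.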

The analyses were carried out as in Experiment 1, including the factors boundary (present vs. absent), ambiguity (ambiguous vs. unambiguous prime), and sentence position (prime vs. target), with all interactions.

Results and discussion

Participants were more likely to produce pauses at the critical region in their repetitions of the prime sentences when the recorded primes contained a boundary (Fig. 4, left panel). This effect was stronger in the repeated primes than in the targets (Fig. 4, right panel) and did not vary with prime ambiguity.

Fig. 4 Perceived pauses: Proportions of perceived pauses in the (left) prime sentences and (right) target sentences, broken down by the boundary and ambiguity conditions. Error bars represent 95% confidence intervals

The overall analysis of perceived boundaries (Table 3a) revealed significant main effects of sentence position, ambiguity, and boundary, and interactions between sentence position and ambiguity as well as between sentence position and boundary. The perceptual coder for this study was more likely to perceive a boundary at the critical location in prime than in target sentences. Furthermore, she was more likely to perceive a boundary in an unambiguous sentence than in an ambiguous sentence in the primes, but not in the targets. She was also more likely to perceive a boundary in a sentence in which the prime contained a boundary, and again this effect differed across primes and targets. Following up on these interactions, an analysis restricted to the target sentences revealed no effects of ambiguity or prime boundary. Thus, this coder’s perception of a boundary at the critical location in the targets was not affected by prime ambiguity or by prime boundaries.

Table 3 Results of analyses of perceived boundary production and word-and-pause durations at the critical sentence region of all sentences (primes and targets)

A similar pattern was observed again with word-and-pause durations (Fig. 5). The analysis of word-and-pause durations revealed significant main effects of sentence position and boundary, as well as an interaction between position and boundary (Table 3b). Participants produced the words in the critical region of the targets faster than the primes, although the effect was numerically small. Participants also spent less time producing the words in this region when the prime sentence did not contain a boundary, and the size of this effect was larger in primes (Fig. 5, left panel) than in targets (Fig. 5, right panel). A follow-up analysis restricted to the target sentences revealed a main effect of boundary but no interaction between boundary and ambiguity: Participants produced the words in the critical region of target sentences more slowly when the prime contained a boundary, but this effect was not larger when the prime sentence was ambiguous.

Fig. 5

Word-and-pause durations: Mean word-and-pause durations for the critical regions of the (left) prime and (right) target sentences, broken down by boundary and ambiguity conditions. Error bars represent 95% confidence intervals

In sum, when participants were exposed to a boundary that provided a means of disambiguating the syntax of the prime sentence, they still did not persist in using the boundary in the following target sentence. This replicated Experiment 1 and showed that IPB priming did not occur even under conditions in which the boundaries were highly salient and meaningful to the listener.

Our results did show an unpredicted effect of ambiguity in the perceptual measure of boundaries: Boundaries were produced more often in the repeated unambiguous prime sentences. It is possible that our coder was more likely to perceive a boundary that supported a particular syntactic interpretation when that syntax was not ambiguous, since internal prosody might have had some influence on this measure. Furthermore, the perceptual coder coded all the target sentences before coding the primes (to avoid practice effects and to help keep the coder blind to condition), which may explain why this subjectivity impacted primes more than targets. Since this was the only inconsistency with the objective duration measure, and since it does not change the interpretation of the boundary-priming effect, we do not discuss it further.

On the basis of the results from Experiments 1 and 2, it appears that speaking rate (which persists) and intonational boundaries (which do not) may have differing types of underlying representations or may be planned at different processing stages. However, what it is about the processes underlying boundary production that resists priming is an open question. Tooley et al. (2014) argued that the lack of a separable representation is responsible for this effect. Rather than people engaging an independent representation for intonational boundaries in speech production, direct connections between semantic/syntactic planning systems and articulators might simply trigger boundary production at points at which boundaries are needed. In that case, because there is no overt prosodic representation across the sentence during production, there would be nothing to prime.

It is also possible that there is an abstract representation for boundaries and that the relationship between intonational boundaries and other levels of linguistic representation inhibits priming. For example, it could be that the planning requirements for the syntactic and semantic systems that drive boundary placement overwhelm any impact of the boundary representation from the prime. If that is the case, one might expect other aspects of prosody that are also linked to higher levels of linguistic representation to be similarly immune to the effects of priming. We tested this prediction in Experiment 3 by investigating the priming of pitch accents.

Experiment 3

Pitch accents are signaled by a movement in the F0 contour, increased intensity, and lengthening, and like intonational boundaries they are tightly linked to semantic and syntactic structure. They are also related to focus and discourse structure (Wagner & Watson, 2010): Pitch accents can appear throughout an utterance to satisfy metrical requirements, but they are typically used in English to signal information that is new (or focused), unpredictable, or important (Bolinger, 1972). Although pitch accents are constrained by syntactic information, their placement is optional at times (Selkirk, 1984). Here we exploited this optionality to determine whether pitch accents are similar to IPBs in their resistance to priming.

There are several technical definitions of focus in the literature, but here we will use it to refer to words or syntactic phrases that are new or important in a sentence. If a syntactic phrase is focused, there is some optionality in where a pitch accent can occur (Gussenhoven, 1983; Selkirk, 1995). For example, in Sentences 3a and 3b, the head of the noun phrase book about the Greeks is book. The prepositional phrase that modifies book is an argument—that is, a word or phrase that satisfies a core semantic requirement of a head (see Gibson & Schütze, 1999, for a more precise definition of argumenthood). For example, all books have a topic, so the prepositional phrase “about the Greeks” specifies a semantic property of the head book. In Sentences 3c and 3d the prepositional phrase is a modifier—that is, a word or phrase that modifies the head but does not satisfy a core semantic property of a head (being next to some thing or some person is not an intrinsic property of the definition of a book).

(3a) The professor assigned the book about the GREEKS to the class.

(3b) The professor assigned the BOOK about the GREEKS to the class.

(3c) *The professor assigned the book next to the GREEKS to the class.

(3d) The professor assigned the BOOK next to the GREEKS to the class.

Arguments play a key role in the distribution of pitch accents in focused phrases. It is grammatical for a pitch accent to occur either on the argument of a head (3a) or on both the head and its argument (3b). In contrast, for modifiers, the pitch accent must occur on both the head and its modifier (3d). If it occurs only on the modifier (3c), the sentence sounds less acceptable. Thus, for focused phrases with arguments, there is optionality in where a pitch accent can occur (Selkirk, 1984).

In this experiment we investigated the priming of optional pitch accents. As in the previous experiments, participants listened to and immediately repeated back the prime sentences they heard. The sentences had pitch accents either only on the second of the two nouns (3a) or on both the head noun and the noun within the prepositional phrase (3b). On the following target trial, the participants silently read a novel sentence (with the same syntactic structure as the prime) and repeated it aloud.

It is highly likely that an abstract representation supporting pitch accent production does exist (e.g., Gussenhoven, 1983; Pierrehumbert & Hirschberg, 1990; Selkirk, 1995; and many others). The complex constraints that govern syntax, focus, and pitch accents would require an abstract representation for pitch accent structure, if only to track where accents have and have not occurred so that the speaker can ultimately produce a grammatical sentence. In addition, linguists have proposed a catalogue of pitch accent types that convey different semantic and pragmatic interpretations, such as introducing a new referent or signaling a contrast between a referent and something previously mentioned in the discourse (e.g., Pierrehumbert & Hirschberg, 1990). These would require an independent level of representation mapping the acoustic form to meaning.

Thus, the priming of pitch accents provides an ideal test case for the two hypotheses discussed in the previous section. One hypothesis is that prosodic elements that are determined by other linguistic levels do not prime. If a lack of priming is due to some aspects of prosody being controlled by higher-order linguistic levels of representation (such as discourse, syntax, and semantics), then we would expect to see no priming for either IPBs or pitch accents. The second hypothesis is that elements of prosody that are not represented abstractly do not prime. In previous work (Tooley et al., 2014), we proposed that intonational boundaries do not prime because they are not represented abstractly by speakers. Because we know that pitch accents must be represented abstractly, they serve as an ideal comparison case. If priming is only absent when a linguistic phenomenon is not represented abstractly, we would expect to find priming of pitch accents in this experiment.

Method

Participants

A total of 44 students from Texas State University participated for course credit.

Materials

The experimental items consisted of 40 main clause sentences in which the object of the verb was a noun followed by a prepositional phrase (e.g., The Mexican billionaire purchased the photo of the landscape at the auction; Appx. 3). Each sentence was randomly yoked to another sentence to create 20 prime–target pairs. The pitch accenting manipulation was created by having a trained speaker record two versions of each of the sentences: a control condition in which only the noun in the prepositional phrase was accented (e.g., The Mexican billionaire purchased the photo of the LANDSCAPE at the auction) and a priming condition in which both the head noun and the noun in the prepositional phrase were accented (e.g., The Mexican billionaire purchased the PHOTO of the LANDSCAPE at the auction). The critical word in these sentences is therefore the head noun (e.g., photo). Unlike in Experiment 1, the stimuli were not created via splicing, since attempts at cross-splicing yielded less than natural-sounding sentences. The accented version of the critical word across sentences had a longer mean duration (0.413 vs. 0.341 s), greater intensity (63.74 vs. 59.45 dB), lower minimum pitch (173.13 vs. 196.32 Hz), and higher maximum pitch (249.53 vs. 225.89 Hz) than the unaccented version (all ps < .01). (The stimuli are available at https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/9L8ABK).

The two experimental conditions, and the use of each sentence as a prime or as a target, were counterbalanced across four lists. Thus, each participant saw each sentence in only one condition, either as a prime or as a target. Within lists, each participant received ten items in each of the two experimental conditions. Three filler sentences intervened between all prime–target pairs. The filler sentences were the same as those used in Experiment 1 but were recorded by the same speaker who produced the experimental items and included at least one accented word.
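
One way such a four-list counterbalancing scheme could be constructed is sketched below. The rotation rule, the function name, and the sentence numbering are assumptions made for illustration; the text specifies only that condition and prime/target role were counterbalanced across four lists, with ten items per condition within each list.

```python
def build_lists(n_pairs=20):
    """Build four hypothetical lists crossing accent condition with sentence role.

    Sentences 2k and 2k+1 are assumed to form yoked pair k. Lists 0 and 1 use
    the first member as the prime; lists 2 and 3 swap the roles. Condition
    alternates over pairs and flips between adjacent lists.
    """
    conditions = ("unaccented", "accented")
    lists = []
    for list_id in range(4):
        swap_roles = list_id >= 2
        trials = []
        for pair in range(n_pairs):
            first, second = 2 * pair, 2 * pair + 1
            prime, target = (second, first) if swap_roles else (first, second)
            trials.append({
                "pair": pair,
                "prime": prime,
                "target": target,
                "condition": conditions[(pair + list_id) % 2],
            })
        lists.append(trials)
    return lists


lists = build_lists()
for list_id, trials in enumerate(lists):
    n_accented = sum(t["condition"] == "accented" for t in trials)
    print(list_id, n_accented, len(trials) - n_accented)
```

Under this scheme each list contains ten pairs per condition, and across the four lists every sentence serves once as a prime and once as a target in each condition.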

Procedure

The procedure for Experiment 3 was the same as that in Experiment 1.

Scoring and analysis

Responses were excluded from the analysis if participants changed the syntax of the sentence, did not produce two nouns in the critical prepositional phrase, paused for extended periods of time, or produced disfluencies at or near the critical region. Minor wording changes were acceptable. This left 1,282 trials (out of 1,760 total trials) for analysis. Participants’ pitch accenting was assessed in two ways: one based on subjective perception, and one based on objectively measured speech correlates of pitch accenting. The perceptual coder rated the perceived level of pitch accenting on the critical word relative to the other words in the sentence on a 4-point scale. Another coder annotated (marked) the onsets and offsets of the critical words using PRAAT. We then extracted measures of average pitch, intensity, and duration on the critical words.

Analyses were also implemented in R (R Development Core Team, 2008), using linear mixed-effects models for the subjective and objective measures of pitch accenting. All models included prime accenting (accented vs. unaccented) and sentence position (prime vs. target) as contrast-coded fixed effects, as well as participants and items as random effects.
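
The contrast coding of the two fixed effects can be illustrated with a short sketch. The specific code values (±0.5) and the function name are assumptions, since the text does not report which contrast values were used; the sketch shows only how the fixed-effect part of one design-matrix row, including the interaction term, would be formed. It does not fit the mixed-effects model itself, which was done in R.

```python
# Assumed contrast codes; the text does not report the values actually used.
ACCENT_CODES = {"unaccented": -0.5, "accented": 0.5}
POSITION_CODES = {"prime": -0.5, "target": 0.5}


def design_row(prime_accenting, sentence_position):
    """Return (intercept, accenting, position, accenting x position) for one trial."""
    a = ACCENT_CODES[prime_accenting]
    p = POSITION_CODES[sentence_position]
    return (1.0, a, p, a * p)


# The four cells of the 2 x 2 design.
for accenting in ACCENT_CODES:
    for position in POSITION_CODES:
        print(accenting, position, design_row(accenting, position))
```

With codes centered on zero in a balanced design, each main effect is estimated at the average of the other factor's levels, so a main effect of prime accenting alongside an interaction with sentence position can be interpreted together.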

Results and discussion

Perceptual measure of pitch accenting

The critical words in the reproductions of the prime sentences were rated as being more accented when the prime recording also contained accenting on this word (Fig. 6). This effect was not present in the target sentences.

Fig. 6

Perceived ratings of pitch accenting: Mean perceived ratings of pitch accenting on the critical word in both the prime (left) and target (right) sentences, broken down by the accenting condition (no accent vs. accent). Error bars represent 95% confidence intervals

The overall analysis of ratings of perceived levels of pitch accenting yielded a main effect of prime accenting and an interaction between prime accenting and sentence position (Table 4). Follow-up analysis of the target sentences revealed no significant effect of prime accenting. Thus, participants tended to accent the critical word in their repetitions of the prime sentences more when they had heard a prime with the critical word accented, but this effect did not persist into the target sentences.

Table 4 Results of analysis of perceived pitch accenting on the critical word in the prime and target sentences

Measures of pitch, intensity, and duration

Figure 7 shows the mean pitch, intensity, and durations of the critical words in the primes and targets. The pitch analysis showed no effects of prime accenting on pitch in either the primes or the targets (Table 5a). The analyses of intensity and duration showed main effects of sentence position (primes were louder and shorter than targets), but no effects of prime accenting and no interactions (Tables 5b, c). Thus, although the primes differed in intensity and duration from the targets, pitch accenting had little effect on acoustic measures for the critical words in target sentences.

Fig. 7

Mean measurements of the (a) pitch (in hertz), (b) intensity (in decibels), and (c) duration (in seconds) of the critical word in both the prime (left) and target (right) sentences, broken down by the accenting condition (no accent vs. accent). Error bars represent standard errors

Table 5 Results of the analysis of (a) pitch, (b) intensity, and (c) duration of the critical word in the prime and target sentences

In sum, the results from the analysis of pitch accenting suggest that pitch accenting is not readily amenable to priming. These results are remarkably similar to those for IPB priming in Experiments 1 and 2. Participants do appear to store the prosodic information that they hear in the prime sentence, since it clearly influences the prosody they use when repeating that sentence (even with minor wording changes). However, there is no evidence that their experience with the prime, or its stored prosodic information, affects the pitch accenting of target sentences. Notably, this is again quite distinct from priming effects observed for syntactic structure, word meaning, and even speaking rate, in which experience with the prime has an immediate, observable influence on the target. Theoretically, this result suggests that both pitch accenting and intonational phrase boundaries are planned in conjunction with other representations, such as syntax- and message-level representations, during sentence production.

General discussion

Three experiments showed that some aspects of prosody (i.e., intonational phrase boundaries and pitch accenting) are not amenable to priming, but another aspect of prosody (speaking rate) is. This lack of priming for IPBs and pitch accenting was observed despite retention of prosodic representations in repetitions of the primes, even when those repetitions involved minor content word changes. This was the case even when IPBs served a disambiguating function (Exp. 2). Thus, we propose that this difference in priming across experiments may reveal an important distinction in how more linguistic aspects of prosody (i.e., aspects of prosody that convey linguistic information to the listener, such as cues to syntax, semantics, and discourse focus) are represented relative to more paralinguistic aspects of prosody (such as speech rate). These findings are consistent with a model of production in which speaking rate is planned separately but pitch accenting and intonational phrase boundaries are planned together with other types of linguistic representations. We outline such a model below.

Incorporating prosody into a model of speech production

The model postulated by Tooley et al. (2014) suggested that boundary production is the result of interactions between different levels of representation (i.e., syntax, semantics, and discourse) as well as processing resource constraints of the speaker. As such, a separate, abstract level of representation for the prosodic phrasing of an entire sentence is not included in the model. Instead, boundaries are initiated as needed by other levels of representation—specifically, by “go” signals from those planning stages to the articulation stage. However, given that pitch accenting is also not amenable to priming (and that this aspect of prosody is widely assumed to be abstractly represented), it is possible that IPBs are also represented abstractly, but priming for these representations is not strong enough to survive the processes related to planning the linguistic structure of the subsequent target sentence. Furthermore, because speech rate can be primed, it is likely planned at a more global level, rather than in direct concert with other linguistic representations (e.g., syntax).

A model of speech production that incorporates prosodic planning would therefore need to include a global processing stage or controller that sends information to the articulators to modulate speaking rate. This model specification is consistent with a growing body of research that has shown persistence for speaking rate from one sentence to the next across various ages and tasks (Finlayson, Lickley, & Corley, 2010; Hupp & Jungers, 2009; Jungers & Hupp, 2009; Jungers et al., 2002). Such a model would also likely include an abstract processing stage for linguistic aspects of prosody. However, this processing stage would have direct communication with message-level and syntactic/semantic-level representations (Fig. 8), and would then send prosodic plans to the articulation processing stage. Signals from the message-level stage would convey information about the givenness and newness of referents to the prosodic planning stage, so that words could be accented or deaccented accordingly. Furthermore, information from the syntactic stage would need to be conveyed in order to plan IPBs that coincide with phrasal boundaries and pitch accenting on particular words that will maintain the hierarchy of pitch accenting produced across different clauses, phrases, and the entire sentence. Additionally, processing difficulty experienced during formulation could be communicated to the prosodic processor so as to initiate a boundary, to allow processing to “catch up.”

Fig. 8

Model of speech production, including prosodic planning.

Message and structural formulation systems send signals to a prosodic signaling system that initiates prosodic production in the articulators. However, speech rate is controlled at a separate processing stage (the speech rate controller).

Such a model can account for the results of the present studies. However, given the scarcity of research in this area, only a few dimensions of prosody have been investigated for potential priming effects. Additional priming research for other aspects of prosody, in different contexts, will be needed in order to determine whether such a model can account for a broader range of findings. Interestingly, the aspects of prosody that have been found to be least amenable to priming are those that are also inherently more linguistic in nature. In contrast, speaking rate, which is a paralinguistic aspect of prosody, does show robust priming. We propose that this distinction deserves further scrutiny.

Social influences on priming

A paralinguistic aspect of prosody, such as speaking rate, often conveys nonlinguistic information to the listener (Crystal, 1976; Fujisaki, 1997), such as the internal emotional state of the speaker (Frick, 1985; Williams & Stevens, 1981), or the speaker’s competence or benevolence (Brown, 1980). This implies that priming for aspects of prosody may depend on the extent to which those aspects interact with social variables.

Social factors have been shown to mediate priming at the phonetic and syntactic levels. For example, priming can influence pronunciation in conversational tasks (Pardo, 2006), and the degree to which a participant shows phonetic convergence with their conversational partner depends jointly on their gender and role in the conversational task (Bilous & Krauss, 1988; Pardo, Jay, & Krauss, 2010). Similarly, structural priming is stronger when participants have a positive social impression of their conversational partners and weaker when they have a negative social impression of their partners (Balcetis & Dale, 2005; but see Branigan, Pickering, Pearson, & McLean, 2010, and Heyselaar, Hagoort, & Segaert, 2017). Furthermore, recent studies of structural priming and alignment showed alignment of structure only when participants were interacting with other participants or a human-like avatar (Heyselaar et al., 2017), and not when they were interacting with a computer (Fehér, Wonnacott, & Smith, 2016, using an artificial language).

These findings raise the possibility that priming of more linguistic aspects of prosody might occur in the presence of an interlocutor or in a conversational context, neither of which was present in the current experiments. However, repetition of syntactic structure can be elicited reliably in both communicative and noncommunicative settings, because the role of a syntactic structure in conveying relational information in an utterance does not depend on the presence or absence of an interlocutor. Even if the communicative value of prosody is contingent on the production context in a way that the communicative value of syntax is not, priming effects of speaking rate have now been observed in multiple single-person studies (the present study, as well as Jungers & Hupp, 2009). Thus, if conversation is a prerequisite for IPB priming, this would make IPB representations entirely unlike other aspects of language that have been found to prime.

Conclusion

In the present study we investigated priming for three aspects of prosody. Speaking rate was found to persist from one sentence to a subsequent sentence, but boundary placement and pitch accenting were not. These findings replicate and extend previous work on the priming of aspects of language that are part of the spoken language signal (i.e., prosody) but are separate from the meanings of individual words (Jungers & Hupp, 2009; Tooley et al., 2014). Finding priming for a paralinguistic aspect of prosody (i.e., speaking rate) but not for more linguistic aspects of prosody (intonational phrase boundaries and pitch accenting) may suggest a difference in how these different aspects of prosody are represented and planned during language production.