Scrutiny of the psychological processes that underlie the aesthetic appreciation of art dates back to the very origins of empirical aesthetics (e.g., Fechner, 1876; cf. also Berlyne, 1974). However, only recently have these psychological processes sparked interest in cognitive psychology and the neurosciences. Neuroaesthetics has emerged as a new field of interest in the neurosciences (e.g., Chatterjee & Vartanian, 2014; Zeki, 1999) that attempts to specify the underlying (neural) mechanisms for specific types of aesthetic appreciation and experience. To date, researchers in this field have primarily explored the neural processes underlying visual art appreciation—for instance, of faces and figures (e.g., Cela-Conde et al., 2013; Chatterjee, Thomas, Smith, & Aguirre, 2009; Jacobsen, Schubotz, Höfel, & von Cramon, 2006). An increasing number of neuroscientific studies have also explored music and dance (music: see, e.g., Brattico & Jacobsen, 2009; also see related work on how music evokes emotions and “chills”—e.g., Grewe, Nagel, Kopiez, & Altenmüller, 2005; Salimpoor & Zatorre, 2013; Steinbeis & Koelsch, 2008; dance: Cross, Kirsch, Ticini, & Schütz-Bosbach, 2011; Kirsch, Drommelschmidt, & Cross, 2013). However, the aesthetic appreciation of words and texts, and specifically of poetic features of language (e.g., rhyme and meter), has so far received very little attention in neuroscience research. This is especially surprising because artistic uses of language that feature metrical patterns and phonological similarities of various types are found in many contexts, ranging from religious rites, to political and commercial advertisements, to the verbal arts. Therefore, it is of great interest to explore the neural mechanisms underlying the aesthetic appreciation of such uses of language.

Poetry and song are the most universal and conspicuous instances of poetic language. The theoretical study of poetic features in language can be traced as far back as Aristotle’s Poetics (Aristotle, 1932). This longstanding tradition has influenced studies of poetry and literature alike, as is evidenced in Jakobson’s seminal work on the “poetic function” of language (e.g., Jakobson, 1960). However, neither traditional poetics nor Jakobson’s propositions regarding the linguistic features of poetic language have addressed the psychological and neural mechanisms underlying these forms of language use and how they influence aesthetic appreciation.

Advocates of cognitive poetics (e.g., Tsur, 2008; Van Peer, 1990) have theorized that certain features of poetic language influence psychological processes, and consequently the aesthetic appreciation of poetry. In particular, meter and rhyme [Footnote 1], two characteristic features of poetry (see Jakobson, 1960), seem to influence the aesthetic appreciation of poetry (e.g., Obermeier et al., 2013). However, the underlying psychological processes of aesthetic poetry reception and their neural correlates have yet to be investigated. In the present study, therefore, we aimed to shed light on these processes (for a comparable approach in the visual domain, see Jacobsen, 2013).

Cognitive fluency theory (for a comprehensive review, see Reber, Schwarz, & Winkielman, 2004), which brought 19th-century psychological concepts such as the principle of minimizing processing expenses (Fechner, 1876) and processing ease (Stumpf, 1885) back to the agenda of empirical aesthetics, provides a theoretical framework for explaining the psychological processes underlying the appreciation of art in general, and of poetry in particular [Footnote 2]. The theory postulates that “the more fluently the perceiver can process an object, the more positive is his or her aesthetic response” (Reber et al., 2004, p. 365). Thus, cognitive fluency may influence how much we like a work of art or an object (for a similar view, see Ticini & Omigie, 2013) [Footnote 3]. Cognitive fluency derives from two distinct processes: perceptual fluency and conceptual fluency. Perceptual fluency relates to the “ease of identifying the physical identity of the stimulus” (Reber et al., 2004, p. 366); this is the type of fluency that has been previously tested in empirical aesthetics (see Schellenberg & Trehub, 1996, for auditory stimuli, and Palmer, 1991, for visual stimuli). More generally, feature recurrence and object familiarity seem to increase perceptual fluency. Conceptual fluency, on the other hand, relates to the stimulus meaning and semantics of a particular work of art (e.g., Reber et al., 2004; Whittlesea, 1993; Winkielman, Schwarz, Fazendeiro, & Reber, 2003). This type of higher-order, cognitive processing fluency is particularly important in the reception of modern and contemporary art and often relies less on recurring stimulus features and more on ideas and concepts stimulating the recipient’s search for meaning (Dewey, 1934) and interpretation (Kreitler & Kreitler, 1972; Leder et al., 2004; Martindale, 1984).
In summary, previous research has shown that cognitive fluency depends on the features of the object being processed, as well as the subjective processing experience of the recipient (e.g., both the production and the reception of poetry are influenced by expertise; Peskin, 1998; Tsur, 2008).

Turning back to the rhetorical and poetic features of language, behavioral studies have shown that rhyme affects word comprehension (e.g., Lea, Rapp, Elfenbein, Mitchel, & Romine, 2008) and contributes to the organization of lexico-semantic information in the mental lexicon (e.g., Allopenna, Magnuson, & Tanenhaus, 1998). Several event-related potential (ERP) studies have reported that the N400 component responds sensitively to rhyme manipulations. In visual-priming paradigms, nonrhyming target words elicit increased N400 responses relative to rhyming targets (e.g., Coch, Hart, & Mitra, 2008; Khateb et al., 2007; Kramer & Donchin, 1987; Rugg, 1984a, b; Rugg & Barrett, 1987). Similar effects have been observed for visually presented word pairs and sentences in different languages (English: e.g., Rugg, 1984a, b; Spanish: Perez-Abalo, Rodriguez, Bobes, Gutierrez, & Valdes-Sosa, 1994), for nonword targets (Rugg, 1984a), and for picture prime–target pairs (Barrett & Rugg, 1990). Rhyming effects have also been reported in the auditory modality (e.g., Coch, Grossi, Skendzel, & Neville, 2005; Davids, van den Brink, van Turennout, & Verhoeven, 2011; Praamstra, Meyer, & Levelt, 1994; Praamstra & Stegeman, 1993). For example, Davids et al. described an enhanced N400 response to auditory nonrhyming target words, as compared to the response to rhyming target words. Because rhyme manipulations do not lead to different effects for words and pseudowords (Rugg, 1984a), modality-independent rhyme effects have been attributed to phonological rather than lexico-semantic processes.

There is also evidence that regular meter benefits cognitive processing (see, e.g., Cutler & Foss, 1977). For instance, metrically regular structures are easier to remember and reproduce than metrically irregular structures (Essens & Povel, 1985; Menninghaus, Bohrn, Altmann, Lubrich, & Jacobs, 2014). Moreover, meter plays an important role in language acquisition (e.g., Jusczyk, 1999), as well as syntactic (e.g., Schmidt-Kassow & Kotz, 2009b) and semantic (e.g., Magne et al., 2007; Rothermich, Schmidt-Kassow, & Kotz, 2012) auditory language processing. Various studies have reported that the N400 response to single words and to words in sentences is reduced when the words are presented in a metrically regular context (e.g., Bohn, Knaus, Wiese, & Domahs, 2013; Magne, Gordon, & Midha, 2010; Magne et al., 2007; Rothermich et al., 2012; Rothermich, Schmidt-Kassow, Schwartze, & Kotz, 2010). For example, Magne et al. (2007) investigated how regular meter influences semantic processing in spoken French. Participants listened to short sentences ending in a metrically and/or semantically congruous or incongruous word. It should be noted that French relies on rather regular and predictable accent stress in sentence-final words, which creates strong predictions about the unfolding of metrical structure. The authors reported an N400 amplitude increase for metrically incongruous sentence-final words in a semantic task and suggested that regular meter is beneficial for lexico-semantic integration. Recent findings by Rothermich et al. (2012) corroborate these results and their interpretation. Finally, an ERP study by Bohn et al. (2013) that manipulated word stress in auditory language processing found that, whereas an irregular but possible meter was costly (i.e., they found an enhanced N400 response to an unexpected stress change), regular meter was not. This suggests that regular meter reduces processing costs.

Positive effects of metrical structure in auditory language processing have also been linked to the P600 component (e.g., Roncaglia-Denissen, Schmidt-Kassow, & Kotz, 2013; Schmidt-Kassow & Kotz, 2009b). For example, Roncaglia-Denissen et al. investigated whether a regular speech meter can facilitate the resolution of a syntactically ambiguous sentence structure (object–subject–verb [OSV], rather than subject–object–verb in German). OSV sentences elicited significantly reduced P600 responses in metrically regular sentence contexts, but not in metrically irregular ones, and the authors interpreted this as reduced processing costs. As compared to the N400 evidence regarding lexical stress at the single-word level and for single words in sentence contexts, the modulation of the P600 response seems to vary as a function of metrical context—that is, the prediction of when the next stressed syllable will occur. The P600 response may thus be indicative of the unfolding of stress patterns beyond single words and the integration of sentence constituents (e.g., Schmidt-Kassow & Kotz, 2009a). A biphasic N400/P600 pattern in studies investigating meter at the sentence level may also reflect different levels of integration or task-specific results (e.g., lexical stress vs. metrical stress patterns; Luo & Zhou, 2010; Marie, Magne, & Besson, 2011; McCauley, Hestvik, & Vogel, 2013; Ystad et al., 2007).

In summary, previous ERP studies have reported two components, the N400 and the P600, that respond sensitively to rhyme and meter manipulations, reflecting ease of cognitive processing. More specifically, regular meter accentuates speech events, such as syllables or words, in the incoming speech signal (e.g., Kotz & Schwartze, 2010).

In the present study, we presented participants with 60 lyrical stanzas taken from 19th- and 20th-century German poems, each consisting of four verses. We presented the stanzas to nonexpert poetry listeners in four linguistically controlled versions that differed in meter (metered vs. nonmetered) and rhyme (rhyming vs. nonrhyming) while recording their electroencephalograms (EEGs). Participants rated the stanzas for liking (which we understand as a proxy for overall aesthetic appreciation) and rhythmicity. On the basis of previous ERP evidence about ordinary language use, we expected a reduced N400 response to both regular meter and rhyme, and a reduced P600 response to meter. Whether meter and rhyme interact as a function of pattern recurrence remains an open question, because to date meter and rhyme have only been investigated separately. We expected that if these two poetic features do interact, they would enhance each other’s effects, since traditional German (and English) poetry employs both features conjointly. Consequently, we expected enhanced ease of processing for stanzas that are both metered and rhyming. In addition, we expected to replicate previous behavioral findings of higher liking and rhythmicity ratings for metered, rhyming stanzas (Obermeier et al., 2013). Finally, we predicted that ease of processing in the ERPs would correlate with aesthetic appreciation. If this were the case, we should find a significant correlation between reduced N400 and P600 ERP responses and the ratings for aesthetic liking. This would be compatible with the hypothesis that ease of processing and aesthetic appreciation of poetry influence each other.

Method

Participants

Eighteen native German speakers (11 male, seven female; 18–30 years of age, mean 24.9 years) participated in the study and signed a written informed consent form. All participants were right-handed (mean laterality coefficient = 96.3; Oldfield, 1971), had normal or corrected-to-normal vision, had no self-reported hearing deficits, and had not participated in the previous behavioral and norming studies using the same stimulus material. Participants were unaccustomed to reading or listening to poetry, but had had some exposure to poetry in school.

Stimuli

The stimulus material consisted of a subset of the lyrical stanzas used by Obermeier et al. (2013). The original stimulus set consisted of 100 four-line stanzas from 19th- and early-20th-century German poetry. All stanzas were German folk song stanzas, roughly comparable to English ballad stanzas and known as Volksliedstrophen. Given that this type of stanza underlies many, if not most, present-day German pop songs, we expected that even readers who barely read poetry would easily detect the meter and rhyme pattern. We controlled the stanzas for type of meter (iambic vs. trochaic), rhyme scheme (half of the stanzas contained alternating rhymes, and the other half contained rhyming couplets), stanza scheme (isometric), syntactic regularity, verse length (85 to 125 letters per verse), and the absence of syntactic ellipses and enjambments. Only nouns and verbs appeared in the rhyming position. We excluded well-known and frequently cited poems in order to ensure a low level of familiarity with the stanzas.

We constructed four different versions of each of the 100 stanzas, taking the poetic features meter (metered vs. nonmetered) and rhyme (rhyming vs. nonrhyming) into account. The first version was the original stanza (metered and rhyming). The second version was a metered but nonrhyming version of the stanza (metered and nonrhyming). The third version was nonmetered but retained the original rhyming structure (nonmetered and rhyming), and the fourth version was both nonmetered and nonrhyming (nonmetered and nonrhyming). We constructed the altered versions in accordance with the following principles. We retained the stanza’s original words and word order wherever possible, and obtained the nonmetered versions by adding one or two syllables to each verse—for example, by changing particles or function words, modifying adjectives, or replacing nouns with different ones that had a convergent meaning. For the nonrhyming stanza versions, we replaced the first word of each rhyme pair. Apart from modifying rhyme and meter, we took great care to retain other poetic features, such as metaphors and syntactical figures (e.g., repetitions of themes; see Table 1).

Table 1 Stimulus examples (Wordsworth: To Mary)

In summary, we created four versions each of the 100 original stanzas that included the poetic features of both meter and rhyme, resulting in an experimental set of 400 stanzas. In order to minimize possible interpretation effects, we had a professional actor recite all versions of the stanzas with reduced expressivity, relative to typical recordings of poetry. We recorded each stanza several times and subsequently chose the most homogeneous sound recording, which we then normalized to 78 dB to minimize differences in intensity between the stanzas. Furthermore, we ran separate acoustic analyses of duration, maximal pitch, minimal pitch, and mean pitch in order to ensure that the different versions of each verse did not differ from one another in terms of their acoustic properties.

Pretest

In order to verify and optimize the quality of the meter manipulation, we performed a rating study. We asked 40 native speakers of German to rate a subset of the stimuli (full stanzas, which were either metered and rhyming or nonmetered and rhyming) for rhythmic regularity on a 5-point semantic differential scale (1 very irregular to 5 very regular). The metered stanzas were judged to be significantly more regular than the nonmetered stanzas [F(1, 39) = 189.7, MSE = 0.154, p < .0001]. On the basis of the rating results, we selected 30 stanzas with rhyming couplets and 30 stanzas with alternating rhymes for the final stimulus set, which provided the largest difference in rhythmic regularity ratings for the metered and nonmetered stanza versions. Thus, the resulting final stimulus set included 240 stanzas (60 stanzas × 4 versions).
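The selection step described above can be sketched in a few lines. This is only an illustration under assumed data structures: the function name `select_stanzas`, the dictionary layout, and the toy ratings are hypothetical and are not taken from the study.

```python
from statistics import mean

def select_stanzas(ratings, per_scheme=30):
    """Pick, per rhyme scheme, the stanzas with the largest regularity gap.

    `ratings` maps a stanza id to a dict with the keys
      'scheme'     -> 'couplet' or 'alternating'
      'metered'    -> list of 1-5 regularity ratings (metered version)
      'nonmetered' -> list of 1-5 regularity ratings (nonmetered version)
    (hypothetical layout, for illustration only).
    """
    selected = []
    for scheme in ("couplet", "alternating"):
        candidates = [
            (mean(r["metered"]) - mean(r["nonmetered"]), sid)
            for sid, r in ratings.items()
            if r["scheme"] == scheme
        ]
        # Largest metered-minus-nonmetered difference first.
        candidates.sort(reverse=True)
        selected.extend(sid for _, sid in candidates[:per_scheme])
    return selected

# Toy data: four stanzas, keep one per rhyme scheme.
toy = {
    "s1": {"scheme": "couplet", "metered": [5, 4], "nonmetered": [2, 2]},
    "s2": {"scheme": "couplet", "metered": [4, 4], "nonmetered": [3, 4]},
    "s3": {"scheme": "alternating", "metered": [5, 5], "nonmetered": [4, 4]},
    "s4": {"scheme": "alternating", "metered": [4, 5], "nonmetered": [1, 2]},
}
chosen = select_stanzas(toy, per_scheme=1)
```

With `per_scheme=30` and the full pretest ratings, a procedure of this kind would return the 60 stanzas that enter the final set of 240 stimuli.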

Procedure

We seated the participants in front of a computer screen in a dimly lit, sound-attenuated chamber. We instructed them to listen attentively to all stanzas, and we asked them to rate the stanzas for rhythmicity and liking on 5-point semantic differential scales (rhythmicity ranged from 1 very irregular to 5 very regular, and liking ranged from 1 very bad to 5 very good) by pressing the corresponding response button. A typical trial started with a fixation cross presented on the computer screen for 300 ms, followed by the presentation of a stanza via loudspeakers. The rhythmicity and liking ratings immediately followed the stanza (through a scale that was visually presented on the screen) and were followed by a blank screen for 100 ms. We counterbalanced the order of the ratings across participants.

We presented each participant with a randomized list of 40 blocks. Each block consisted of six stanzas, chosen from among the four different conditions, resulting in 60 trials per condition and 240 trials overall. We interspersed short pauses after every ten blocks. An experimental session, which included a short training session, lasted approximately 90 min.
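The trial arithmetic (60 stanzas × 4 versions = 240 trials, split into 40 blocks of six) can be made concrete with a short sketch. The text does not state how stanzas were assigned to blocks, so the unconstrained shuffle below is an assumption, and `build_trial_list` is a hypothetical helper name.

```python
import random

CONDITIONS = [
    ("metered", "rhyming"),
    ("metered", "nonrhyming"),
    ("nonmetered", "rhyming"),
    ("nonmetered", "nonrhyming"),
]

def build_trial_list(n_stanzas=60, block_size=6, seed=0):
    """Return a list of blocks; each trial is (stanza, meter, rhyme).

    Assumption: trials are shuffled without further constraints.
    """
    rng = random.Random(seed)
    trials = [(s, m, r) for s in range(n_stanzas) for (m, r) in CONDITIONS]
    rng.shuffle(trials)
    return [trials[i:i + block_size] for i in range(0, len(trials), block_size)]

blocks = build_trial_list()
n_trials = sum(len(b) for b in blocks)
# Count how many trials each of the four conditions contributes.
per_condition = {
    c: sum(1 for b in blocks for (_, m, r) in b if (m, r) == c)
    for c in CONDITIONS
}
```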

EEG recording

We recorded the EEGs from 59 Ag–AgCl electrodes (Electro-Cap International, Eaton, OH, USA) according to the modified 10–20 system. We used a PORTI-32/MREFA amplifier (DC to 135 Hz) to amplify the EEG signal, and we digitized it at 500 Hz. The sternum served as ground and the left mastoid as reference. We kept electrode impedances below 5 kΩ, and the signal was band-pass filtered between DC and 140 Hz. Offline, we re-referenced the data to linked mastoids. We measured vertical and horizontal electrooculograms for artifact rejection purposes.

Analysis

Behavioral data analysis

The task required participants to rate each stanza for rhythmicity and overall liking. We entered the ratings of all trials in the statistical analysis using a repeated measures analysis of variance (ANOVA) with the within-subjects factors Meter (metered, nonmetered) and Rhyme (rhyming, nonrhyming).
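As a rough illustration of this analysis, the sketch below computes the three F ratios of a 2 × 2 fully within-subjects design. It relies on the fact that each effect in such a design has one degree of freedom, so its F value equals the squared one-sample t of the corresponding per-participant contrast. The sketch assumes that ratings have already been averaged per participant and condition; the function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def contrast_F(scores):
    """F(1, n - 1) for a one-df within-subject contrast (squared one-sample t)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)

def rm_anova_2x2(cells):
    """`cells` holds one tuple per participant, in the order
    (metered_rhyming, metered_nonrhyming, nonmetered_rhyming, nonmetered_nonrhyming);
    each entry is that participant's mean rating in the condition."""
    meter = [(mr + mn) / 2 - (nr + nn) / 2 for mr, mn, nr, nn in cells]
    rhyme = [(mr + nr) / 2 - (mn + nn) / 2 for mr, mn, nr, nn in cells]
    inter = [(mr - mn) - (nr - nn) for mr, mn, nr, nn in cells]
    return {
        "Meter": contrast_F(meter),
        "Rhyme": contrast_F(rhyme),
        "Meter x Rhyme": contrast_F(inter),
    }

# Toy ratings for four hypothetical participants (not study data).
toy_cells = [
    (4.0, 2.5, 3.5, 2.0),
    (4.5, 2.0, 4.0, 2.5),
    (3.5, 3.0, 3.0, 2.0),
    (4.0, 2.0, 3.0, 2.0),
]
result = rm_anova_2x2(toy_cells)
```

With 18 participants, each effect would be tested on (1, 17) degrees of freedom, which matches the degrees of freedom reported in the Results.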

ERP data analysis

Prior to the statistical analysis, we filtered the data offline with a band-pass filter ranging from 0.1 to 100 Hz and subjected the data to automatic artifact rejection, removing electrode and muscle artifacts using FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011). We then applied an independent component analysis to identify eye movements as well as other artifacts (e.g., heartbeats). We removed the corresponding components from the EEG data and subjected the corrected data to a manual artifact rejection procedure. Overall, we excluded 13.7 % of the trials from further analysis. Finally, we filtered the corrected data again with a band-pass filter ranging from 0.5 to 30 Hz, to correct for baseline drifts (rather than using a prestimulus baseline). We calculated single-subject averages for each condition at the final word of the last verse of each stanza.

We time-locked epochs to the onset of the final word; these lasted from 200 ms prior to the onset to 1,000 ms post-stimulus-onset. We calculated separate analyses for lateral and midline electrode sites. For lateral electrode sites, we defined four regions of interest (ROIs): anterior left (AL: FP1, AF3, AF7, F7, F5, F3, FT7, FC5, and FC3), posterior left (PL: TP7, CP5, CP3, P7, P5, P3, PO7, PO3, and O1), anterior right (AR: FP2, AF4, AF8, F8, F6, F4, FT8, FC6, and FC4), and posterior right (PR: TP8, CP6, CP4, P8, P6, P4, PO8, PO4, and O2). For the midline analysis, we used a single ROI (MID) comprising the following electrodes: FPz, AFz, Fz, FCz, Cz, CPz, Pz, POz, and Oz.

On the basis of visual inspection and our hypotheses, we selected two time windows within which to analyze the influences of meter and rhyme on poetry reception. The first time window ranged from 200 to 500 ms and corresponds to the classical N400 time window, and the second ranged from 700 to 850 ms to quantify a late positive response resembling the P600.

For both time windows, we calculated the mean ERP amplitude and subsequently performed a repeated measures ANOVA using the within-subjects factors Meter (metered, nonmetered), Rhyme (rhyming, nonrhyming), Hemisphere (left, right), and Region (anterior, posterior) for the lateral ROIs, and using Meter (metered, nonmetered) and Rhyme (rhyming, nonrhyming) for the midline ROI. We report only the effects that involved the factors Meter and Rhyme. We used a 7-Hz low-pass filter for graphical display purposes only.
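The mean-amplitude measure can be sketched as follows, assuming the 500-Hz sampling rate and the epoch from -200 to 1,000 ms described above. The data layout (a dictionary from electrode name to a list of samples), the helper names, and the inclusive treatment of the window end point are assumptions made for illustration.

```python
SAMPLING_RATE_HZ = 500   # digitization rate given in the recording description
EPOCH_START_MS = -200    # epochs begin 200 ms before final-word onset
WINDOWS_MS = {"N400": (200, 500), "P600": (700, 850)}
MIDLINE = ["FPz", "AFz", "Fz", "FCz", "Cz", "CPz", "Pz", "POz", "Oz"]

def ms_to_index(t_ms):
    """Convert a latency (ms relative to word onset) to a sample index."""
    return round((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)

def mean_amplitude(erp, roi, window):
    """Average amplitude across the ROI's electrodes and the window's samples.

    `erp` maps electrode name -> list of amplitudes, one per sample
    (hypothetical layout). The window end point is included (assumption).
    """
    start, stop = (ms_to_index(t) for t in WINDOWS_MS[window])
    values = [v for ch in roi for v in erp[ch][start:stop + 1]]
    return sum(values) / len(values)

# Toy ERP: 601 samples per electrode, flat except for a -2.0 plateau
# in the N400 window (not study data).
n_samples = ms_to_index(1000) + 1
toy_erp = {ch: [0.0] * n_samples for ch in MIDLINE}
for ch in MIDLINE:
    for i in range(ms_to_index(200), ms_to_index(500) + 1):
        toy_erp[ch][i] = -2.0

n400_mean = mean_amplitude(toy_erp, MIDLINE, "N400")
p600_mean = mean_amplitude(toy_erp, MIDLINE, "P600")
```

One such value per participant, condition, ROI, and time window would then enter the repeated measures ANOVA.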

Correlational data analysis

We applied a correlation analysis to test whether the impacts of meter and rhyme on ease of processing correlated with aesthetic liking. The N400 and P600 meter and rhyme effects (lateral and midline electrodes) were correlated with the meter and rhyme effects of aesthetic liking. For this purpose, we calculated (a) the ERP differences between the metered and nonmetered conditions, as well as between the rhyming and nonrhyming conditions, for the N400 and P600, and (b) a difference score of the liking ratings for both meter and rhyme. We calculated a Pearson correlation coefficient (one-sided) to explore the relation between the N400 and P600 effects of meter and rhyme and the corresponding behavioral rating effects. This procedure tested whether the ERP effects covaried systematically with aesthetic liking.
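A minimal sketch of this difference-score correlation is given below. The helper names and the toy values are hypothetical. The sketch stops at the t statistic for r, because turning t into a one-sided p value requires a t-distribution function from a statistics library (for example, `scipy.stats.t.sf`).

```python
from math import sqrt

def effect_scores(cond_a, cond_b):
    """Per-participant difference score (condition A minus condition B)."""
    return [a - b for a, b in zip(cond_a, cond_b)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Toy values for four hypothetical participants (not study data):
# mean N400 amplitudes and mean liking ratings per meter condition.
n400_metered = [-1.0, -2.0, -1.5, -3.0]
n400_nonmetered = [-2.0, -2.5, -3.0, -3.5]
liking_metered = [3.5, 3.0, 4.0, 3.2]
liking_nonmetered = [3.0, 2.9, 3.0, 3.0]

erp_effect = effect_scores(n400_metered, n400_nonmetered)
liking_effect = effect_scores(liking_metered, liking_nonmetered)
r = pearson_r(erp_effect, liking_effect)
t = r_to_t(r, len(erp_effect))
```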

Results

Behavioral data

Rhythmic regularity ratings revealed main effects of meter (metered, 3.36 ± 0.11 [standard error]; nonmetered, 3.00 ± 0.11) [F(1, 17) = 20.45, p < .001] and of rhyme (rhyming, 3.98 ± 0.12; nonrhyming, 2.38 ± 0.18) [F(1, 17) = 52.08, p < .001]. Both main effects were modulated by a significant interaction of meter and rhyme [F(1, 17) = 9.59, p = .007; see Fig. 1]. Resolving this interaction for meter, we found a main effect of rhyme for both metered (rhyming, 4.19 ± 0.12; nonrhyming, 2.52 ± 0.19) [paired t(17) = 7.08, p = .000002] and nonmetered (rhyming, 3.76 ± 0.13; nonrhyming, 2.25 ± 0.16) [paired t(17) = 7.29, p = .000001] stimuli. Similarly, when we resolved the interaction for rhyme, there was a significant main effect of meter for rhyming stimuli (metered, 4.19 ± 0.12; nonmetered, 3.76 ± 0.13) [paired t(17) = 5.04, p = .0001], as well as a smaller but still significant effect for nonrhyming stimuli (metered, 2.52 ± 0.19; nonmetered, 2.25 ± 0.16) [paired t(17) = 3.49, p = .003].

Fig. 1 Results for aesthetic liking and rhythmicity ratings. The left panel shows the main effects of meter and rhyme, as well as their interaction, for the liking rating. The right panel shows the main effects of meter and rhyme, as well as their interaction, for the rhythmicity rating.

Our statistical analysis of the liking ratings again showed main effects of both meter (metered, 3.11 ± 0.10; nonmetered, 2.92 ± 0.09) [F(1, 17) = 20.03, p < .001] and rhyme (rhyming, 3.38 ± 0.12; nonrhyming, 2.65 ± 0.12) [F(1, 17) = 26.05, p < .001], and a significant interaction of the two factors [F(1, 17) = 11.01, p = .004]. Step-down analyses for the factor Meter showed a significant effect of rhyme for both metered (rhyming, 3.52 ± 0.12; nonrhyming, 2.70 ± 0.12) [paired t(17) = 5.04, p < .001] and nonmetered (rhyming, 3.24 ± 0.12; nonrhyming, 2.59 ± 0.1) stanzas [paired t(17) = 4.65, p < .001]. Step-down analyses of the factor Rhyme resulted in a significant effect of meter for rhyming stanzas (metered, 3.52 ± 0.13; nonmetered, 3.24 ± 0.12) [paired t(17) = 5.17, p = .00007] but a less pronounced effect of meter for nonrhyming stanzas (metered, 2.70 ± 0.12; nonmetered, 2.59 ± 0.11) [paired t(17) = 2.35, p = .03], with both factors likely driving the interaction of meter and rhyme.

ERP data

N400

As can be seen in Fig. 2a and b, there was a negative ERP deflection for all conditions starting around 200 ms after the onset of the final word of the stanza. At the lateral ROIs, the statistical analysis revealed a marginally significant main effect of meter [F(1, 17) = 3.49, p = .08], a significant main effect of rhyme [F(1, 17) = 23.72, p < .001], and a significant three-way interaction of meter, hemisphere, and region [F(1, 17) = 7.38, p = .015]. The resolution of this three-way interaction did not yield any significant main effects or interactions with the factor Meter (all Fs < 3.5, all ps > .08). We also found a significant two-way interaction of meter and rhyme [F(1, 17) = 12.80, p = .002], which confirmed a significant main effect of rhyme in the metered [paired t(17) = 6.22, p = .000009], but not in the nonmetered [paired t(17) = 1.24, p = .23], stanzas. When we resolved the interaction with rhyme, we found a main effect of meter only for rhyming [paired t(17) = 4.44, p = .00036], but not for nonrhyming [paired t(17) = –1.53, p = .14], stanzas.

Fig. 2 I ERP effects for meter and rhyme, aligned to the onset of the last word of the stanzas. The gray shading shows the time windows for the N400 and P600 analyses (N400, 200–500 ms; P600, 700–850 ms). II Bar graphs of N400 effects (mean percentages of microvolt change and standard deviations) at lateral (A) and midline (B) electrode sites for all four conditions. Statistically significant differences are marked by *s. III Bar graphs of P600 effects (mean percentages of microvolt change and standard deviations) at lateral anterior (A), lateral posterior (B), and midline (C) electrode sites for all four conditions. Statistically significant differences are marked by *s.

At the midline ROI, a repeated measures ANOVA revealed significant main effects of meter [F(1, 17) = 5.76, p = .028] and of rhyme [F(1, 17) = 24.47, p < .001]. Both effects were qualified by a significant interaction of meter and rhyme [F(1, 17) = 8.19, p = .011]. Step-down analyses revealed a significant effect of rhyme for metered [paired t(17) = 6.16, p = .00001], but not for nonmetered [paired t(17) = 1.49, p = .15], stanzas. Comparable to the analysis at lateral electrode sites, we found a main effect of meter for rhyming [paired t(17) = 4.12, p = .00071], but not for nonrhyming [paired t(17) = –0.78, p = .44], stanzas.

Overall, the N400 effects indicated that the verse-final words of lyrical stanzas are processed with more ease when they are both metered and rhyming, whereas processing requires more effort when only one or neither of these features is involved.

P600

The ERP data also displayed a positive deflection starting 600 to 700 ms after the onset of the final word of the stanza (see Fig. 2a and c). On the basis of visual inspection, we defined a time window ranging from 700 to 850 ms for the statistical analysis. At the lateral ROIs, there was a marginally significant main effect of rhyme [F(1, 17) = 4.10, p = .06], as well as a significant three-way interaction of the factors Meter, Rhyme, and Region [F(1, 17) = 5.94, p = .026]. Step-down analyses revealed no significant main effects or interactions of the factors at anterior sites [all Fs(1, 17) < 1.99, all ps > .17], but a significant main effect of rhyme [F(1, 17) = 9.60, p = .007] and an interaction of meter and rhyme [F(1, 17) = 5.16, p = .036] at posterior sites. Resolution of the latter interaction confirmed an effect of rhyme for metered [paired t(17) = –3.97, p = .0009], but not for nonmetered [paired t(17) = –0.68, p = .51], stanzas.

At the midline ROI, the repeated measures ANOVA revealed significant main effects of meter [F(1, 17) = 5.38, p = .033] and of rhyme [F(1, 17) = 4.12, p = .05], but no interaction of the two factors [F(1, 17) = 1.53, p = .23].

Taken together, the analysis of the P600 time window confirmed a facilitating effect of rhyme in the metered versions of the lyrical stanzas at lateral posterior scalp sites.

Correlations

N400 effects and aesthetic liking ratings

The correlational analyses did not confirm significant correlations between the N400 rhyme effects at lateral and midline electrodes and the aesthetic liking ratings [lateral: r(18) = .10, p = .35; midline: r(18) = .07, p = .38]. However, we did observe significant correlations between the N400 meter effects and aesthetic liking ratings at both the lateral [r(17) = .44, p = .038] and midline [r(18) = .48, p = .026] ROIs (see Fig. 3) [Footnote 4]. Thus, increased processing ease (indicated by a larger N400 effect) correlated with higher aesthetic liking.

Fig. 3 Correlations between the ERP effect of meter and aesthetic liking ratings. The left panels show the correlations between the N400 effect for meter and the liking ratings for lateral (upper part) and midline (lower part) ROIs. The right panels show the correlations between the P600 effect and the liking ratings for lateral (upper part) and midline (lower part) ROIs.

P600 effects and aesthetic liking ratings

Similarly, the correlations of the P600 rhyme effects at lateral and midline ROIs and aesthetic liking were nonsignificant [lateral: r(18) = .07, p = .38; midline: r(18) = –.11, p = .34]. However, the P600 meter effects significantly correlated with the aesthetic liking ratings at both the lateral [r(18) = –.51, p < .0001] and midline [r(18) = –.78, p < .0001] ROIs (see Fig. 3). Again, increased ease of processing, as indicated by a larger P600 effect, correlated with higher aesthetic liking.

General discussion

Combining the behavioral and ERP data, in the present study we set out to explore whether (a) meter and rhyme enhance the ease of processing of auditorily presented lyrical stanzas and (b) ease of processing leads to higher aesthetic liking of the stanzas, as is proposed by cognitive fluency theory. First, we replicated previous behavioral results (Obermeier et al., 2013) in an independent group of listeners unaccustomed to poetry. Both regular meter and rhyme enhanced liking and rhythmicity ratings relative to the nonmetered and the nonrhyming variants of the stanzas, respectively. Moreover, our behavioral results yielded an interaction of meter and rhyme, confirming that both poetic features have a bearing on aesthetic appreciation as well as on perceived rhythmicity.

Extending these findings, we also found significant effects of meter and rhyme, and an interaction of the two, in the N400 range: Whereas there was a rhyme effect in the metered stanzas, no such effect was observable in the nonmetered stanzas. Note that rhyming but nonmetered verses (as presented in our study) are virtually nonexistent in traditional German poetry. Reflecting the rhyming practice observed in actual German poetry, only the combination of meter and rhyme led to a reduced N400 effect, suggesting that the poetic features jointly affect ease of processing. Furthermore, although it was less pronounced, we found a similar pattern in the P600 range. At the midline electrodes, we obtained effects of meter and rhyme, whereas at the posterior lateral electrode sites, we found a rhyme effect for metered but not for nonmetered language. Overall, this biphasic N400–P600 response reduction for metered and rhyming stanzas, relative to those that were nonmetered but rhyming or both nonmetered and nonrhyming, implies that both poetic features affect the reception of German lyrical stanzas in verse-final words. In addition, the N400 and P600 meter effects, but not the rhyme effects, correlated with the aesthetic appreciation ratings, indicating that rhyme does not modulate the ease of processing or aesthetic appreciation of poetry independently of meter [Footnote 5]. These findings are in accord with the fact that (Western) poetry as a whole has many metrical forms without rhyme (e.g., the vast majority of Greek and Latin poetry), but barely any rhymed poetry without meter, if there is any at all. This asymmetry suggests that end rhymes in poetry may be conceived of as nonmandatory reinforcers of the metrical patterning, highlighting as they do the conclusion (clausula) of the metrical verse pattern.

With these findings, we offer the first empirical evidence concerning the psychological and neural correlates of processing ease in poetry reception, supporting the propositions put forward in cognitive poetics and cognitive fluency theory. In the following sections, we discuss these novel findings in the context of nonartistic language use and consider their effects on theoretical frameworks such as cognitive poetics and cognitive fluency theory.

Ease of processing is enhanced by poetic language use

The primary aim of the present study was to test how the poetic features meter and rhyme affect ERP responses that are indicative of processing ease. Would these potential effects in poetry compare to the well-documented effects of processing ease in ordinary language use? The N400 rhyme effects in poetry are in line with previously reported auditory rhyme effects in common language (e.g., Coch et al., 2005; Davids et al., 2011; Praamstra et al., 1994; Praamstra & Stegeman, 1993). Similarly, the meter effect is in accordance with previous N400 and P600 findings concerning how regular meter facilitates auditory lexical, semantic, and syntactic sentence processing (for the N400, see, e.g., Bohn et al., 2013; Magne et al., 2007, 2010; Rothermich et al., 2010, 2012; for the P600, see, e.g., Roncaglia-Denissen et al., 2013; Schmidt-Kassow & Kotz, 2009b; for both the N400 and P600, see Luo & Zhou, 2010; Marie et al., 2011; McCauley et al., 2013; Ystad et al., 2007). The present results also align well with the results of Bohn et al. (2013). Those authors reasoned that irregular but possible metrical stress in spoken language enhances processing costs, whereas regular meter reduces processing costs. This result resembles ours, which is based on small but clearly disruptive changes in the meter of German lyrical stanzas. Thus, the influences of meter and rhyme on ease of processing in poetry reception parallel the impacts of both features on the processing of common language. This may not come as much of a surprise, since listeners most likely would rely on the same underlying mechanisms, irrespective of the language type: phonological processing in the case of rhyme, and dynamic allocation of attention based on the temporal regularities encoded in meter (e.g., Large & Jones, 1999).

However, the present data also significantly extend previous ERP evidence that concerns the isolated effects of meter and rhyme in processing common language. The present data clearly show that meter and rhyme influence the ease of processing of poetry in combination, rather than separately. This may suggest that the allocation of attention to salient events in stanzas, as induced by their metrical structure (e.g., the alternation of weak and strong syllables and the number of stressed syllables predicting the end of the line, and hence the occurrence of the rhyme), significantly affects how rhyme is encoded, and vice versa. This could, of course, be conceived of as a culture- and language-specific effect of German poetry, which, like English poetry, traditionally relies on a strong rhyme and meter interface and typically allows for dissociation of this interface only in an asymmetrical fashion (i.e., metered verses can well dispense with rhyme, but rhymed verses are almost invariably metered). Comparative and cross-linguistic work on poetry reception will therefore be needed to further substantiate the tentative claims we have put forward in the present research. Note also that although we report a significant interaction of meter and rhyme in a biphasic ERP pattern, it is still possible that meter and rhyme engage different neural sources and networks in the brain. In summary, the present findings are in line with, but also extend, previous results on meter and rhyme in auditory language processing.

Cognitive fluency and dysfluency in poetry reception

We hypothesized that meter and rhyme exert their influence on the aesthetic appreciation of poetry by enhancing processing ease or perceptual fluency, as is proposed by cognitive fluency theory (Reber et al., 2004). The correlation analyses provide the first evidence in favor of this hypothesis with regard to meter, but not rhyme. We were able to show that the larger the N400 and P600 effects—that is, the greater the increase in ease of processing for metrically regular as compared to metrically irregular stanzas—the greater the difference in aesthetic liking. Even though these results cannot explain how ease of processing and aesthetic liking are causally linked, it is plausible to assume that meter plays a role in the aesthetic appreciation of poetry via ease of processing (e.g., Ticini & Omigie, 2013). Therefore, the present results shed light on the underlying mechanisms and temporal structure of processing ease, thereby providing the first neuroscientific evidence that the propositions put forward in cognitive fluency theory (e.g., Reber et al., 2004) may also apply to poetry.
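The correlational logic described above relates, across participants, the size of the ERP meter effect to the size of the difference in liking. The sketch below shows one way such a relation could be quantified. It assumes Pearson's r, which this section does not name as the coefficient used, and all variable names and numbers are invented for illustration rather than taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and root sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    root_ss_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    root_ss_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if root_ss_x == 0 or root_ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cross / (root_ss_x * root_ss_y)


# Made-up values for four hypothetical participants (NOT study data).
# ERP meter effect: metered minus nonmetered amplitude, in microvolts.
erp_meter_effect = [0.5, 1.0, 1.5, 2.0]
# Liking difference: metered minus nonmetered rating.
liking_difference = [0.2, 0.4, 0.6, 0.8]

r = pearson_r(erp_meter_effect, liking_difference)
```

Because the invented liking differences are an exact linear function of the invented ERP effects, r is (up to floating-point error) 1 here. As noted above, a coefficient of this kind describes covariation only and cannot establish how ease of processing and aesthetic liking are causally linked.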

Importantly, our results do not imply that poetry is aesthetically liked just because it is easy to process. In fact, many poems are clearly more difficult to read than newspaper articles; semantic ambiguities are routinely expected and exploited, rather than suppressed, in poetry (cf. Galak & Nelson, 2011; Giora et al., 2004; Jakobson, 1960; Miall & Kuiken, 1994, 1998), and an entire tradition of poetry specializes in uncertainties, or even outright obscurity of semantic content. However, regardless of how demanding the understanding of their meaning may be, these verses can be, and apparently are, processed with more ease when they feature rather than lack rhyme and meter. Accordingly, a recent study using proverbs has shown that (a) rhyme and meter can enhance perceived beauty, succinctness, and persuasiveness, while also reducing semantic processing ease (due to implementing constraints on both word choice and word order), and (b) the total sum of the contradictory effects on perceptual (phonological, prosodic) and conceptual (semantic) processing is still positive (cf. Menninghaus et al., 2015). We therefore emphasize that the positive correlation we found between patterns of phonological recurrence (rhyme and meter), ease of processing, and aesthetic liking of poetic stanzas does not amount to the hypothesis that the appreciation and enjoyment of poetry is all about reduced processing demands. Furthermore, conceptual fluency may be especially prone to individual differences, since problem-solving mechanisms rely on personal experiences of art, including knowledge of art history, cultural knowledge, and the contexts in which works of art are encountered. For example, it has been argued that expertise may play an important role in poetry reception (Peskin, 1998; Tsur, 2008). All of these aspects of art reception have recently received increased attention in the study of empirical aesthetics (e.g., the psychohistorical approach of Bullot & Reber, 2013) and should be considered in future work.

Conclusions and outlook

In the present study, we set out to explore whether and how the poetic features meter and rhyme enhance ease of processing in the reception of poetry and whether this effect is related to a higher aesthetic appreciation of poetry. We found N400 and P600 effects for meter and rhyme that are indicative of processing ease for poetry, and the meter effects correlated significantly with aesthetic liking. We have thus provided the first neuroscientific evidence that these recurring phonological and prosodic patterns have a bearing on the aesthetic appreciation of poetry by enhancing ease of processing, as is proposed by cognitive poetics and cognitive fluency theory.

Future research will need to investigate how higher versus lower ease of perceptual processing interacts with higher versus lower ease of semantic processing, thereby facilitating a more sophisticated understanding of how different dimensions of heightened and reduced cognitive demand interact in poetry reception. Furthermore, poetry is just one of many forms of artful language use. Therefore, it would be interesting to explore whether meter is not only relevant for the aesthetic appreciation of poetry, but also plays a significant role in other special uses of language—for example, in religious, rhetorical, or promotional contexts.

In addition, poetry reception should be directly compared to music reception and its emotional and rewarding consequences (e.g., Brattico & Jacobsen, 2009; Grewe et al., 2005; Salimpoor & Zatorre, 2013; Steinbeis & Koelsch, 2008; Steinbeis, Koelsch, & Sloboda, 2006), since poetry and music share many structural properties. Finally, a neuroaesthetic approach is needed to identify and compare brain activation patterns in response to poetic language appreciation and non-language-related works of art (see Brown, Gao, Tisdelle, Eickhoff, & Liotti, 2011; Nadal, 2013).