Participants were randomly assigned to one of three learning conditions: speaking, rhythmic speaking, and singing. The participants heard 20 paired-associate phrases in English and an unfamiliar language (Hungarian) during a 15-min “listen-and-repeat” learning period, separated into three 5-min learning sessions. Participants practiced the 20 English–Hungarian paired-associate phrases one after another by first listening to the spoken English phrase, and then listening twice to the paired Hungarian phrase and repeating the Hungarian phrase aloud as best they could. The 15-min learning period was followed by a series of five different production, recall, recognition, and vocabulary tests for the English–Hungarian pairs. Measures of participants’ mood, background experience, and abilities in music and language were also administered in order to check that the randomly assigned groups were matched for these factors.
Hungarian was chosen because it was likely to be an unfamiliar language for native English-speaking participants. In addition, as compared to English (and, indeed, to the more frequently studied Germanic or Romance languages), Hungarian has different syntactic structures, few lexical cognates, and differences in the sound system. Using basic phrases in a foreign language, rather than using nonsense words that sound like possible native-language words, provided a strong test of whether singing can support foreign language learning. Importantly, the stimuli in the three learning conditions were also controlled for overall duration and presentation rate, reproducing an important feature of the experiments by Kilgour, Jakobson, and Cuddy (2000).
A group of 60 self-selecting adult students (30 male, 30 female) participated in the study. These participants were recruited through a university website advertising an auditory memory study to learn foreign language phrases. Their ages ranged from 18 to 29 years, with a mean age of 21.7 years. The 60 participants were randomly assigned to the three learning conditions, which were matched for gender (ten males and ten females in each group). Analyses of variance (ANOVAs) revealed no significant differences between the three groups in age, mood, phonological working memory, language learning experience, language learning aptitude, musical experience, or musical ability (see Table 1).
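The gender-matched random assignment described above can be illustrated with a short Python sketch. This is not the procedure used in the study, whose implementation is not reported; the function name, the participant IDs, and the seed are hypothetical. The sketch shuffles participants within each gender and deals them out to the three conditions in turn, which yields ten males and ten females per group for a sample of 30 males and 30 females.

```python
import random

CONDITIONS = ["speaking", "rhythmic speaking", "singing"]

def assign_conditions(participants, seed=None):
    """Randomly assign participants to conditions, balanced within each gender.

    `participants` is a list of (participant_id, gender) tuples. Within each
    gender, the shuffled participants are dealt out to the conditions in turn,
    so every condition receives an equal share of each gender.
    """
    rng = random.Random(seed)
    assignment = {}
    for gender in sorted({g for _, g in participants}):
        ids = [pid for pid, g in participants if g == gender]
        rng.shuffle(ids)
        for index, pid in enumerate(ids):
            assignment[pid] = CONDITIONS[index % len(CONDITIONS)]
    return assignment

# Hypothetical participant IDs: 30 male and 30 female.
participants = [(f"M{i:02d}", "male") for i in range(30)]
participants += [(f"F{i:02d}", "female") for i in range(30)]
assignment = assign_conditions(participants, seed=1)
```

Stratifying by gender before shuffling guarantees the balance by construction, whereas unrestricted random assignment would only balance the groups on average.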
The English and Hungarian stimuli were recorded by native speakers of each language in a soundproofed recording studio. Both the English and Hungarian phrase recordings were made by an experienced sound engineer using an omnidirectional microphone. Digital audio files were recorded onto a Windows computer using the SONAR 4 Studio Edition software.
The English stimuli were recorded by a native British English speaker who was given a list of 20 phrases, plus three practice phrases, and was asked to say them aloud at a normal speed. She then repeated the entire list to ensure that each English phrase had a clearly articulated recording. The recorded stimuli were later split into individual sound files, one for each English phrase, and these spoken English stimuli were used during the learning process in all three conditions and as the English prompts for the Hungarian production test.
The Hungarian stimuli were recorded by a native speaker who did not have extensive training in music or singing, but who felt comfortable singing for the recording sessions. The spoken phrases were recorded first; then the speaker said the rhythmically spoken phrases in time with a metronomic pulse (72 beats per minute [bpm]) using the rhythms of the written melodies; and finally, the stimuli were sung in time with the metronome, using a pitch range from A3 to F4. Musical notation of the sung stimuli is available in the Appendix, showing the rhythmic and melodic patterns employed. The same 20 Hungarian phrases were used in all three learning conditions and ranged from two to eight syllables in length. For the spoken stimuli, the Hungarian speaker was asked to speak slowly and clearly, as if she were talking to a nonnative speaker. Since the phrases were spoken very slowly, both the stressed and nonstressed syllables were pronounced clearly. The rhythms and melodies created for use in the rhythmic speaking and singing conditions were modeled on the natural prosody of the Hungarian language and on melodies found in Hungarian folk songs. The rhythms used for the rhythmic speaking and singing conditions were identical, in that the singing condition stimuli simply included the addition of a melodic line along with the rhythmic patterns used for the rhythmic speaking condition, and both were recorded at the strict tempo of 72 bpm. The final rhythmic speaking stimuli were thus very different from the speaking stimuli, since they had a clear, metrical, musical rhythm and were spoken in time with a metronome. The initial practice trials of participants in all three conditions were listened to by the experimenter, to confirm that participants were capable of repeating back the Hungarian phrases during the learning process at a reasonable level of accuracy.
An important consideration for this study was to control for the duration and rate of presentation of the foreign language phrases in the three learning conditions, since it has previously been argued that listening to a song is only facilitative when verbal materials are presented at a slower rate than normal speech (Kilgour et al., 2000). In this study, the duration of the Hungarian phrases was carefully controlled, with the shortest, two-syllable phrases lasting 1 s each, and the longest, eight-syllable phrases lasting for 4 s. The mean and range of durations for the Hungarian stimuli were almost identical across the three learning conditions (see Table 2). An ANOVA comparing the stimulus durations (in milliseconds) across the three learning conditions showed no significant difference in phrase durations between the conditions (p = .97). The English spoken stimuli were identical in all three learning conditions, with a mean duration of 1.0 s. The phrases were also presented in the same context in all three learning conditions: English Phrase 1, pause (1 s), Hungarian Phrase 1, pause (1 s), Hungarian Phrase 1, pause (8 s) for a participant to repeat the Hungarian phrase as best he or she could, followed by English Phrase 2, and so on, up to Phrase 20 (see Fig. 1 for an illustration of the learning procedure).
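The trial structure described above implies a fixed overhead per phrase, and the timing can be checked with a brief Python sketch. The sketch is illustrative only and uses just the values stated in the text (a mean English duration of 1.0 s, 1-s pauses, an 8-s repetition pause, and Hungarian durations between 1 s and 4 s). The individual phrase durations from Table 2 are not reproduced here, so the session length is bracketed rather than computed exactly; the function and constant names are hypothetical.

```python
ENGLISH_DURATION_S = 1.0  # mean English phrase duration stated in the text
SHORT_PAUSE_S = 1.0       # pause after the English phrase and after the first Hungarian playback
REPEAT_PAUSE_S = 8.0      # pause in which the participant repeats the Hungarian phrase
N_PHRASES = 20

def trial_duration(hungarian_duration_s):
    """One trial: English, pause, Hungarian, pause, Hungarian, repetition pause."""
    return (ENGLISH_DURATION_S + SHORT_PAUSE_S
            + hungarian_duration_s + SHORT_PAUSE_S
            + hungarian_duration_s + REPEAT_PAUSE_S)

def session_duration(hungarian_durations_s):
    """Total length of one pass through the list, in seconds."""
    return sum(trial_duration(d) for d in hungarian_durations_s)

# Bounds only: every phrase at the shortest (1 s) or the longest (4 s) duration.
lower_bound_s = session_duration([1.0] * N_PHRASES)  # 260.0 s
upper_bound_s = session_duration([4.0] * N_PHRASES)  # 380.0 s
```

The two bounds, 260 s and 380 s, bracket the nominal 5-min (300-s) length of each learning session.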
The order of stimulus presentation in both the learning and testing phases was generated using a pseudorandom number generator based on the Mersenne Twister algorithm (Matsumoto & Nishimura, 1998). The order of presentation was then checked to ensure that a phrase with a particular word was not placed directly before or after another phrase with the same word. The final Hungarian tests were also checked to ensure that the algorithm had not placed the test phrases in the same order of presentation as those phrases had appeared in during the learning sessions.
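A minimal Python sketch of this kind of constrained ordering follows. Python's standard `random` module is itself based on the Mersenne Twister, but the software actually used in the study is not reported, and the text describes the orders as being checked, not automatically regenerated. The reshuffle-until-valid loop, the function names, and the four-phrase list (taken from phrases quoted elsewhere in this section, not the full 20-item set) are therefore assumptions made for illustration.

```python
import random

def shares_word(phrase_a, phrase_b):
    """True if the two phrases have at least one word in common."""
    words_a = {w.strip(",.?!").lower() for w in phrase_a.split()}
    words_b = {w.strip(",.?!").lower() for w in phrase_b.split()}
    return bool(words_a & words_b)

def is_valid_order(order, forbidden_orders=()):
    """No neighbouring phrases share a word, and the order is not a forbidden one."""
    no_adjacent_repeats = all(
        not shares_word(a, b) for a, b in zip(order, order[1:])
    )
    return no_adjacent_repeats and all(order != list(f) for f in forbidden_orders)

def generate_order(phrases, rng, forbidden_orders=(), max_tries=10_000):
    """Shuffle until the constraints are met (simple rejection sampling)."""
    for _ in range(max_tries):
        order = list(phrases)
        rng.shuffle(order)
        if is_valid_order(order, forbidden_orders):
            return order
    raise RuntimeError("no valid order found within max_tries")

# Illustrative subset only; the study used 20 phrases.
PHRASES = ["Nem tudom", "Nem értem", "Marja vagyok", "Megismételné, kérem"]

rng = random.Random(2024)  # arbitrary seed, for reproducibility of the sketch
learning_order = generate_order(PHRASES, rng)
test_order = generate_order(PHRASES, rng, forbidden_orders=[learning_order])
```

Passing the learning order as a forbidden order mirrors the final check that the test phrases did not appear in the same sequence as during learning.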
Five Hungarian tests were developed to measure participants’ learning of the paired-associate English–Hungarian phrases, and several background measures and questionnaires were also administered to establish whether the three groups were well matched, as described below.
Multiple-choice Hungarian vocabulary test
This test consisted of 20 forced-choice, multiple-choice questions in which each Hungarian word was presented with four possible English meanings to choose from (chance performance was thus 25%). This measure was used as a pretest in order to assess whether participants had any prior knowledge of individual words in Hungarian, and again as a posttest in order to test whether the same individual words could be correctly identified after the learning sessions (which involved learning complete phrases in Hungarian). A score higher than 50% on the pretest resulted in the participant’s data being excluded, due to the possibility that the participant had basic knowledge of Hungarian prior to starting the study (four participants were removed for this reason). The same 20 multiple-choice items were used as the Hungarian vocabulary posttest after participants had finished all three learning sessions.
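The arithmetic behind the chance level and the exclusion rule can be sketched in Python as follows. The binomial calculation is an illustration added here and is not an analysis reported in the study; it assumes that a participant with no knowledge of Hungarian guesses independently on every item. The function names are hypothetical.

```python
from math import comb

N_ITEMS = 20
N_OPTIONS = 4
CHANCE = 1 / N_OPTIONS       # 0.25, i.e., 5 of 20 items expected by guessing
EXCLUSION_THRESHOLD = 0.50   # pretest scores above this led to exclusion

def is_excluded(pretest_correct):
    """Apply the stated rule: a pretest score higher than 50% excludes the participant."""
    return pretest_correct / N_ITEMS > EXCLUSION_THRESHOLD

def prob_at_least(k, n=N_ITEMS, p=CHANCE):
    """Binomial probability of k or more correct answers under pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability that a pure guesser would exceed the 50% threshold (11 or more correct).
p_false_exclusion = prob_at_least(11)
```

Under these assumptions, a participant who scores exactly 10 of 20 is retained, and the probability that pure guessing produces 11 or more correct answers is well under 1%.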
Hungarian production test
The participants heard the 20 English phrases from the learning sessions—presented in a different, randomized order—and attempted to recall and reproduce the equivalent Hungarian phrases as best they could. The written, on-screen instructions asked participants to say the Hungarian phrases normally (rather than using singing or rhythmic speaking). That is, although the participants in two conditions spoke rhythmically or sang during the learning phase, all participants were explicitly asked to speak normally during the test phase, and they all complied with this instruction.
English recall test
Participants heard the 20 Hungarian phrases as prompts (presented in a different, randomized order) and attempted to recall and reproduce the equivalent English phrase. For this test, participants heard the Hungarian stimuli in the same way as during the learning sessions (i.e., as spoken, rhythmically spoken, or sung phrases, depending on the group to which a participant had been assigned).
Hungarian recognition test
The participants were asked to make same/different judgments for accurate and inaccurate spoken versions of the 20 Hungarian phrases they had learned—again presented in a different, randomized order. Ten of the Hungarian phrases were presented with all syllables in the correct order. In the remaining ten items, two adjacent syllables within each phrase were swapped, resulting in new, incorrect Hungarian phrases (e.g., Megismételné, kérem was changed to Megistelméné, kérem). A native English speaker created the ten new, “inaccurate” Hungarian phrases, and then the same Hungarian speaker who was recorded for the other Hungarian stimuli was audio-recorded saying these ten “inaccurate” phrases. Because the ten inaccurate Hungarian phrases still had all of the same syllables, they sounded very similar (but not identical) to the phrases that participants had heard during the learning sessions.
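The construction of the inaccurate items can be illustrated with a small Python sketch. The inaccurate phrases in the study were created by hand, so this is only a demonstration of the adjacent-syllable swap; the syllable division of Megismételné and the function name are assumptions made for illustration.

```python
def swap_adjacent_syllables(syllables, index):
    """Return a copy of `syllables` with positions `index` and `index + 1` swapped."""
    if not 0 <= index < len(syllables) - 1:
        raise ValueError("index must point to a syllable that has a right-hand neighbour")
    swapped = list(syllables)
    swapped[index], swapped[index + 1] = swapped[index + 1], swapped[index]
    return swapped

# Assumed syllable division of the first word of "Megismételné, kérem".
syllables = ["Meg", "is", "mé", "tel", "né"]
foil = "".join(swap_adjacent_syllables(syllables, 2))  # "Megistelméné"
```

Swapping the third and fourth syllables reproduces the example given above, and the resulting item contains exactly the same syllables as the original phrase.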
Delayed-recall Hungarian conversation
Participants were asked to engage in a short conversation entirely in Hungarian, 20 min after the final learning session had been completed. They were informed that they would hear a series of Hungarian phrases on an audio recording and were instructed to respond by using a Hungarian phrase that would make sense in the context. The participants were encouraged to guess or to attempt to recall and reproduce the Hungarian phrases for “I don’t know” or “I don’t understand” if they were unsure of how else to respond. The recording consisted of five simple Hungarian phrases, separated by 8-s pauses, which functioned as one side of a brief conversation.
The participants also completed a number of additional measures and questionnaires relating to their musical and language learning abilities and experience. They reported their age and gender at the start of the experiment session. This was followed by an assessment of each participant’s phonological working memory, using the 20 low-wordlike items from the Children’s Test of Nonword Repetition (CNRep) developed by Archibald and Gathercole (2006, p. 514). Each participant also completed the 20-item self-report Positive and Negative Affect Schedule (PANAS) mood questionnaire (Watson, Clark, & Tellegen, 1988) at the beginning and end of the experimental session. A brief language aptitude test (adapted from the modern language aptitude test of Gilleece, 2006) and a short questionnaire about the participant’s language-learning experience were administered, in addition to a brief musical ability test (adapted from the musical ability tests developed by Overy, Nicolson, Fawcett, & Clarke, 2003) and a short questionnaire about the participant’s musical training and experience.
The experimental sessions were held in a quiet room at a comfortable temperature and with appropriate lighting. All of the participants completed an informed consent form and were treated according to the ethical research standards published by the American Psychological Association (2002). Sessions were completed on an individual basis, with each participant taking approximately 60 min to complete all sections of the experiment. The participants were compensated £6 for their time.
Participants first completed the phonological working memory test (CNRep) by repeating each nonword after the researcher. This test was followed by the brief, presession mood questionnaire (PANAS) and the multiple-choice Hungarian vocabulary pretest, both of which were presented on a Windows desktop computer. Because the Firefox Web browser was shown full-screen with the URL hidden, participants could neither return to a previous screen nor proceed to the next page until all of the required responses for each webpage were completed.
Before beginning the Hungarian learning sessions, participants were given spoken and written instructions that they should listen to the recording and repeat the phrases that they heard in the new language aloud, as best they could, and try to remember both the foreign phrases and the English meanings. The auditory stimuli and test items were played at a comfortable volume through noise-canceling headphones. Participants completed a practice session with three Hungarian phrases (which were never used again) while the researcher was present to answer questions. After establishing that the participant understood and was accurately repeating the practice phrases as instructed, the researcher went to a nearby room while the participant worked through the remainder of the session by following written on-screen instructions.
The 15-min learning period consisted of three 5-min aural/oral “listen-and-repeat” learning sessions. During the first learning session, the Hungarian phrases were displayed as text on the screen, as the 20 paired-associate phrases were presented. No text was displayed for the second and third 5-min learning sessions. This learning procedure gave participants time to learn and practice repeating the 20 Hungarian phrases in the complete paired-associate list three times, before performance on the material was evaluated.
At the end of the three 5-min learning sessions, participants first completed the Hungarian production test, followed by the English recall test, the Hungarian recognition test, and finally the multiple-choice Hungarian vocabulary posttest. For all tests, the participants were told that if they were not certain of the correct response, they should try to guess. They then completed the measures of language learning ability and experience and of musical ability and experience, as well as the brief mood postsession questionnaire (PANAS). Finally, participants completed the delayed-recall Hungarian conversation test.
At the end of the experimental session, the participants completed a four-item debriefing questionnaire about the study. They were also informed that Hungarian was the language that they had been learning during the experiment.
Digital audio recordings were made during each experimental session. Listening to the recordings confirmed that all of the participants had followed the instructions during the Hungarian learning sessions. Participants’ spoken responses to the oral test items were phonetically transcribed from the audio recordings, and the transcriptions were later analyzed by the experimenters without knowledge of the learning condition to which each participant had been assigned. These raw data were entered into a spreadsheet, and scores were calculated on the basis of the phonetic transcriptions. Responses to the Web-based items were collected separately via a MySQL database to reduce the need for paper tests that could introduce coding errors.
Multiple-choice Hungarian vocabulary test
Participants’ responses to these Web-based test items were scored with a correct answer receiving one point and an incorrect answer receiving zero points. A total score of 20 points was possible. Across all groups, the mean posttest performance was significantly above chance levels (p < .001), indicating that L2 learning had occurred (see the Results section).
Hungarian production test
All of the participants’ spoken utterances on this verbatim recall task were phonetically transcribed from the audio recordings. A point was only awarded if the participant produced the whole phrase in the new language correctly, with all syllables in the correct order. However, perfect pronunciation was not required; for example, the Hungarian phrase meaning “I don’t understand” ['nɛm 'ertɛm] was scored as being correct if the participant said ['nɛm 'erdɛm], and the phrase meaning “Yes, thank you” ['igɛn 'køsønøm] was scored as being correct if the participant said ['Igɛn 'køsønøm]. A total score of 20 points was possible.
English recall test
The participants’ English phrases spoken in response to the Hungarian prompts were transcribed from the audio recordings. One point was awarded if the participant produced the correct meaning of the phrase in English, for a total possible score of 20. Verbatim production of the original English phrase was not required; for example, the response “My name is Maria” was accepted as being correct for the Hungarian phrase Marja vagyok (“I am Maria”).
Hungarian recognition test
The same/different accuracy judgments for the spoken versions of the 20 Hungarian phrases were scored as being either correct (one point) or incorrect (zero points), for a total of 20 possible points. The absence of a response was also scored as zero.
Delayed-recall Hungarian conversation
Participants’ responses on this five-item test were phonetically transcribed from the audio recordings and scored out of a possible 10 points. Two points were awarded for an appropriate reply, spoken in Hungarian, to the previous statement. Responses of Nem tudom (“I don’t know”) or Nem értem (“I don’t understand”) received just one point, whereas incorrect Hungarian phrases and replies in English earned zero points.
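The scoring rule for the conversation test can be summarized in a short Python sketch. Judging whether a reply was appropriate required a human rater, so the sketch assumes that each of the five replies has already been coded into one of three categories; the category labels and the function name are hypothetical.

```python
# Points per coded reply, following the rule described above.
POINTS = {
    "appropriate": 2,  # appropriate reply, spoken in Hungarian
    "fallback": 1,     # "Nem tudom" or "Nem értem"
    "other": 0,        # incorrect Hungarian phrase or a reply in English
}
N_PROMPTS = 5

def score_conversation(coded_replies):
    """Sum the points for the five coded replies (maximum 10)."""
    if len(coded_replies) != N_PROMPTS:
        raise ValueError(f"expected {N_PROMPTS} coded replies")
    return sum(POINTS[code] for code in coded_replies)

# Hypothetical participant: two appropriate replies, two fallbacks, one error.
example_score = score_conversation(
    ["appropriate", "fallback", "other", "appropriate", "fallback"]
)  # 2 + 1 + 0 + 2 + 1 = 6
```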