Introduction

One of the defining features of episodic memory is that its content involves associating multiple elements, often including the who, what, and where of an event. As such, the content of episodic memory involves both items (people, places, objects, feelings) and the associations between those items (e.g., Cox & Criss, 2017, 2018; Kahana, Howard, & Polyn, 2008; Murdock, 1982; Tulving, 1972). Associations matter; our capacity to explicitly relate objects and their meaning and to relate context and items (events/people) is at the heart of learning and memory (e.g., Anderson & Bower, 1973; Zacks & Tversky, 2001). Moreover, one of the signatures of normal cognitive aging is a loss in the capacity to create new associations between items − a phenomenon known as the age-related associative deficit (Naveh-Benjamin, 2000). Although most agree on the importance of associative memory, our understanding of the nature of associations and how they relate to item information is still very limited (Cox & Criss, 2017, 2018; Cox & Shiffrin, 2017). It follows that a better understanding of associative memory will support the development of memory theory and contribute to our understanding of important issues such as normal and abnormal cognitive aging.

In the laboratory, episodic memory for associations or associative memory is often assessed by presenting pairs of stimuli. At test, pairs are presented for recognition or one pair member is presented, and participants attempt to produce or select the second pair member (cued recall). It has been established for some time that semantically related and frequently co-occurring items are more easily recalled than unrelated pairs (e.g., Dosher, 1984). However, Jones, Estes, and Marsh (2008) and Badham, Estes, and Maylor (2012) showed that novel word pairs can be recalled at the same level as semantically related pairs if they share an integrative relation. In an integrative relation, the first word of the pair specifies or classifies the second word of the pair (e.g., monkey-foot or horse-doctor). Badham et al. (2012) also showed that integrative relations very significantly reduced the age-related deficit in associative memory. Moreover, there is evidence that integrative relations facilitate processing in an obligatory fashion, without strategic control, contrary to what has often been assumed for semantic relations (see Badham et al., 2012; Estes & Jones 2009). The findings of Jones et al. (2008) and Badham et al. (2012) suggest that prior knowledge can support associative memory in a variety of indirect, not-so-explicit ways.

This view is echoed within recent models of item and associative memory. Cox and Shiffrin (2017; Cox & Criss, 2017, 2018) proposed models of associative memory in which any features shared by pair members facilitate encoding and recall of the pair. They suggest that, when presented with a new pair of stimuli, associative features can only be formed once participants accumulate enough information on the individual items presented. In other words, associative features between the items of a pair would be derived from the individual features of the items and would serve as a form of compound cue. As stated by Cox and Criss (2018): “Our dynamic account of associative encoding says that shared item features of any kind make it possible to encode more associative information in memory, leading to better recognition of intact pairs and better rejection of rearranged pairs” (p. 253). While this quote applies to recognition, the authors argued that similar mechanisms are involved in other types of tests, including cued recall (Cox, Hemmer, Aue, & Criss, 2018).

Here, we examined the impact of an unusual type of prior knowledge on associative memory, namely sound-symbolism. Studies of sound-symbolism have shown that some phonemes are inherently associated with specific features (Köhler, 1929; Ramachandran & Hubbard, 2001). For example, some phonemes such as /u/ as in who and /o/ as in hope are associated with roundness while other phonemes such as /k/ and /t/ are associated with sharpness (Maurer, Pathman, & Mondloch, 2006; Nielsen & Rendall, 2011). Empirically, these sound-symbolism relationships are highly reliable; if participants are presented with an angular shape as well as a more rounded shape and asked which is a “kiki” or a “bouba,” over 95% associate the shape with sharper edges with the non-word “kiki” and the rounder shape with the non-word “bouba” (Ramachandran & Hubbard, 2001).

In the work reported here, our goal was twofold. First, we wanted to verify if sound-symbolism could support associative memory. Second, we wanted to implement what can be thought of as a strict test of the view proposed by Cox and collaborators that any form of shared features should support item and associative memory (Cox & Criss, 2017, 2018; Cox & Shiffrin, 2017). In our experiment, a non-verbal item (random shape) and a verbal item (non-word) were combined to form an unfamiliar to-be-remembered pair; note that both the items and their associations are new. Moreover, neither the items nor their relationship can be supported by semantic knowledge of the type involved in semantic relatedness or integrative relations. We compared memory for associations relying on known sound-symbolism pairings with memory for associations that did not conform to these sound-symbolism links.

To the best of our knowledge, there has only been a single attempt to investigate the impact of sound-symbolism relationships on associative memory. In their study, Preziosi and Coane (2017) asked participants to study the definition of non-words that were either sound symbolically congruent or incongruent with the non-word. Results revealed a beneficial effect of sound-symbolism. However, the definitions were generated by another group of participants. This raises the possibility that in addition to congruency between sound qualities of the non-words and their definition, some additional associations were present. In addition, since the normative group spontaneously generated these definitions, participants in the memory experiment may have produced the correct answer by guessing the definition. Preziosi and Coane addressed this issue in a control experiment in which non-words and their definition were not participant-generated and the effect did not reach significance. Therefore, it remains to be seen if associative memory can benefit from sound-symbolism.

Method

Participants

Seventy participants (Mage = 26.53, SD = 3.80, 42 female, 28 male) were recruited from the online platform ProlificAC using the following inclusion criteria : (1) Native English speaker; (2) currently living in Canada, the USA, or the UK; (3) vision normal or corrected to normal; (4) no language disorder; (5) no cognitive impairment of any kind; (6) approval rating of at least 90% on prior submissions; and (7) age between 18 and 32 years. All participants were offered £1.50 for their participation. The complete procedure including consent, instructions, and debriefing lasted approximately 10 min.

Materials

Non-word selection

Non-words were selected from Westbury, Hollis, Sidhu, and Pexman (2018) database. Westbury et al. began with 21,220 non-morphologically inflected English words, which were input into the LINGUA software in order to create 629,797 non-words using a Markov chaining process to connect sets of three characters. All trigrams created this way also appeared in real words and were proportionately represented according to their probability of occurring in real words. This process led to the production of non-words beginning like real words, marked for one level of stress, and being syllabified. Westbury et al. then hand selected 12,556 non-words of three to eight letters judged to have an unambiguous pronunciation and a plausible ending. Using text-to-speech software, they created audio files for all selected non-words and removed problematic files until 8,000 remained. They then asked participants to make yes/no judgments on category membership for 20 categories. Of interest here, the stimuli were judged on how well they fit in either the “sharp” (i.e., spiky) or “round” categories; it was hence possible to compute the probability of assigning each non-word to the sharp and round categories.

For the current study, we used the following criteria for selecting non-words from Westbury’s et al. (2018) database. First, the probability of being characterized as round was subtracted from the probability of being selected as sharp. With this metric, words with the highest positive scores were the sharpest, words with the highest negative scores were the roundest, and words with the scores closest to zero were neutral. Second, in order to avoid over-representing some initial letters, a maximum of five non-words starting with the same letter were allowed in a category (sharp, round, and neutral). Each category was first populated with the five most representative non-words starting with all 26 letters of the alphabet, except for the letter X since only three non-words in the database start with the letter X; this generated a total of 128 non-words per category. If more than five non-words started with the same letter and had the same difference score, five of them were randomly selected from the tied items. In addition, when two non-words in a category were too similar to each other, one of the non-words was randomly deleted and replaced with the next most representative non-word. Two non-words were judged too similar if they (1) were orthographic neighbors (shared all letters at the same positions but one, i.e., moom and mool) or (2) if they were identical except one had an extra letter (i.e., mool and mooll or camulo and campulo). Thirdly, all non-words with a difference score smaller than .50 were removed from the sharp list, all non-words with a difference score larger than -.50 were removed from the round list, and all non-words with an absolute difference larger than .05 were removed from the neutral list. At this stage, there were 72, 121, and 88 non-words in the sharp, neutral, and round categories, respectively. Fourth, to further reduce the number of non-words in a list, the least representative non-words in each category were removed; however, in order to keep the number of non-words starting with different letters similar, the number of non-words starting with each letter was calculated and only non-words starting with the most common letter were removed. Consequently, if five non-words started with the letters A and C and only three started with the letter O, non-words starting with the letters A and C were removed before removing a non-word starting with the letter O. This procedure was applied until only 66 non-words remained in each category (total=198; see Fig. 1 for examples).

Fig. 1
figure 1

Illustration of two typical trials starting with a fixation cross, followed by item presentation and recall. A congruent trial is presented on the left side and an incongruent trial is presented on the right side

Shape stimuli

Shape stimuli were taken from Sidhu and Pexman (2017); there were 28 sharp, 28 round, and 28 ambiguous shapes (total= 84; see Fig. 1 for examples). Ambiguous shapes served as the neutral stimuli as they share both round and sharp characteristics. These shapes were created by drawing outlines, in black on a white background, with round edges for the round shapes, sharp edges for the sharp shapes, and a combination of round and sharp edges for the ambiguous shapes. Anti-aliasing was used on all images to avoid jagged stair-like contour lines, without losing jagged edges for the sharp and neutral images. Shapes in all three categories were matched in terms of area, height, and width. We standardized the image size to 300 × 300 pixels.

Procedure

The experiment was developed using the PsyToolkit (Stoet, 2010, 2017) and run online. Participants completed a paired-associate learning task comprising 12 trials. Each participant completed six congruent and six incongruent trials. Presentation order of the 12 trials was randomized independently for each participant. As shown in Fig. 1, on each trial, three pairs of stimuli comprising a non-word and a shape were sequentially presented. With each trial comprising three pairs, each trial generated three observations for a total of 18 observations per participant per condition (six trials/condition × three pairs). In the congruent condition, the three pairs included a sharp, round, and neutral pair, with each shape paired with an appropriate non-word. That is, the round shape was always paired with a round non-word, the sharp shape with a sharp non-word, and the neutral shape with a neutral non-word. In the incongruent condition, a given shape could be associated with two types of non-words to produce an incongruent pair. For instance, a round shape could be associated with a sharp or neutral non-word. To balance the design, over the six incongruent trials, shapes were paired three times with one incongruent non-word type and three times with the other type. Hence, for three trials, the round shape was paired with a sharp non-word, and on three trials it was paired with a neutral non-word, etc.

For each trial and each participant, an appropriate group of three shapes and three non-words were randomly selected without replacement, with the condition constraining how pairs were constructed (congruently or not). This implies that each participant only saw a subset of the possible shapes and non-words; each participant was presented with 36 shapes in total from the possible pool of 84 and 36 non-words from the overall pool of 198. This ensured that any observed effects were not due to stimuli-set effects.

Participants were instructed that, on each of the 12 trials, they would see three pairs composed of one shape and one non-word. They were further informed that their task was to memorize the non-word that was presented with each shape for a memory task in which the shapes would appear one at a time in a random order along with the three presented non-words. They were warned that their task would be to click on the appropriate non-word. As shown in Fig. 1, each trial began with the presentation of a fixation cross at the center of the screen for 500 ms, followed by the sequential presentation of the three pairs of shape/non-word stimuli. Pairs were presented at a rate of 3 s per pair with an inter-stimulus interval of 500 ms. The shape was always presented on the left side of the screen while the non-word was always on the right side of the screen, both centered within their respective side. Recall began 500 ms after the last pair of stimuli was presented. At recall, all three non-words were simultaneously presented in a vertical list format and in a random order on the right side of the screen. Non-words presented at recall were the same three non-words that appeared during the encoding phase. Each shape was sequentially presented on the left side of the screen. The presentation order at recall was pseudo-random with the constraint that the recall order could not be identical to the presentation order and that the last presented shape could not be the first one to be recalled. Participants had to select the appropriate non-word by clicking on it. Once an answer was selected, the next shape was presented following a 100-ms delay. No feedback was provided about the accuracy of the response.

Results

Mean proportion of correct responses were computed separately for each condition. Results revealed significantly better performance in the congruent (M = .78, SD = .16) than in the incongruent condition (M = .68, SD = .22), t(69) = 3.86, p < .001, Cohen’s d = 0.461.

We further investigated the contribution of sound-symbolism to associative memory by testing an alternative interpretation. According to this alternative interpretation, the benefit observed for the congruent pairs is not due to memory, but to a form of guessing whereby if participants cannot remember a response, they tend to select the item that satisfies the sound-symbolism relationship. In order to examine this possibility, we analyzed the errors in the incongruent condition as this is the only condition where selecting a congruent shape can produce an error. In this condition, a shape was always studied with one of the two incongruent non-word types. For instance, a round shape could either be paired with a neutral or a sharp non-word, but never with a round non-word. Consequently, two kinds of errors can be produced at the point of test; recall that each shape is presented along with the sharp, round, and neutral non-word studied in the trial. Two erroneous choices are hence possible; participants could select the congruent non-word (the round one in our example) or the incorrect incongruent non-word. If the beneficial effect of sound-symbolism observed in the congruent condition is not due to memory processes, but to a form of guessing, errors in the incongruent condition should show a similar bias. In other words, when an error is made − presumably this is more likely when memory fails − the congruent non-word should be selected more frequently than the incongruent non-word. Given the presence of two alternatives, a probability of .50 of selecting the congruent response would indicate no bias, a probability of 1 would indicate a perfect bias toward selecting the congruent response and a probability of 0 would indicate a perfect bias toward selecting the other incongruent response.

In total, in the incongruent condition, there were 406 errors representing 32.22% of responses. The mean probability of selecting the congruent alternative was .502 (SD = .173). After collapsing data across participants, the one-sample t-test revealed that this probability was not significantly greater than the expected chance level of .50, t(59) = 0.09, p = .93, Cohen’s d = 0.012. Given the theoretical importance of this null effect and the impossibility of arguing for the null effect with null hypothesis testing, we further analyzed the data using a Bayesian one-sample t-test analogue by means of the BayesFactor package (Morey & Rouder, 2018) in the R statistical software (Version 3.6.1; R Core Team, 2019). We used the default priors provided in the BayesFactor package. Results revealed a Bayes factor of 7.05 indicating positive evidence in favor of the null hypothesis according to Raftery’s (1995) labeling system.

Discussion

In this paper, we offer the first demonstration of the impact of sound-symbolism on episodic memory for associations. Based on the work of Cox and Criss (2017, 2018) and Cox and Shiffrin (2017), we expected that novel item pairs that conformed to known sound-symbolism pairings would lead to better memory for associations when compared to pairs that did not.

Results confirmed this expectation; pairs of non-words and abstract shapes that were congruent with known sound-symbolism associations were better recalled than incongruent pairs of the same stimuli. This finding is important because the relations between the items were novel; moreover, contrary to other types of relations, the benefit provided by sound-symbolism is not likely to depend on explicit processing of meaning or shared features (Sidhu & Pexman, 2018). Consider the case of the integrative word pair advantage (Badham et al., 2012; Jones et al., 2008). Integrative word pairs can only facilitate associative memory if the meaning of each item is accessed and the semantics of the words allow meaningful integration. Even if the pair and the relations were never encountered before, the integration still relies on meaning and prior knowledge of integration rules (e.g., metal-flower is a “compositional” relation; it implies that the flower is made of metal). The work presented herein extends these findings by showing memory facilitation for pairs that have no prior relations and are also entirely novel items (i.e., random shapes and non-words). Hence, our findings suggest that learning novel associations between unfamiliar items can benefit from forms of prior knowledge that do not rely on semantic integration or other forms of relatively elaborate semantic processing.

What is the nature of the knowledge embodied in sound-symbolic links? Although there is a wealth of research on the impact of sound-symbolism, much less is known about the mechanisms that give rise to these associations between the acoustic and articulatory properties of phonemes and other stimuli or properties. Sidhu and Pexman (2018) provided a review of the current thinking regarding potential causal mechanisms, which include: (1) a co-occurrence view whereby phoneme features and related stimuli features co-occur in the world, (2) the proposal that phonemes and associated stimuli share some properties at the perceptual, conceptual, affective, or linguistic level, (3) a hypothesis attributing the associations to structural properties of the brain, evolution, or language patterns. It seems possible that sound-symbolism and its impact on memory might be attributable to a special form of co-occurrence/associative learning. The latter would lead to obligatory, pre-awareness, but albeit weak links (see Heyman, Maerten, Vankrunkelsven, Voorspoels, & Moors, 2019) between phonemes and certain properties of objects, which − like other forms of co-occurrence-based associations − can support encoding and retrieval of new associations. However, this remains a speculative account; further research is needed to elucidate the nature of the sound-symbolism links.

Overall our findings fit well with the proposals of Cox and Criss (2017, 2018) and Cox and Shiffrin (2017), as they stipulated that any information shared by both items of a pair should facilitate the encoding and retrieval of associative links. In this case, the links embedded within sound-symbolism associations led to better recall performance in a cued-recall task where both the items and their associations were novel. Although results fit well with the model of associative memory put forward by Cox and colleagues, it can be argued that, because there were no interfering tasks between encoding and retrieval, the paired-associate memory test used here also called upon short-term memory. This possibility would fit nicely with Cowan’s model (2019) suggesting that short-term memory is formed from a subset of the elements in long-term memory that are in a temporarily heightened state of activation. Under this view, properties of associative memory should impact performance in a short-term memory task. Alternatively, it can be argued that short-term memory is distinct from associative memory with a separate copy of the information and different operating principles (e.g., Norris, 2017). In this case, if it is assumed that short-term memory is unable to capitalize on sound-symbolism associations, we would have underestimated the size of the sound-symbolism effect. This issue could be addressed in further studies by using a delayed task.

Conclusion

We found that pairs of novel stimuli congruent with sound-symbolism principles were better recalled than pairs of novel stimuli incongruent with the same principles. The major finding of this study was to show the facilitative role of subtle associations between very different types of stimuli − i.e., the acoustic and articulatory properties of phonemes and the visual properties of random shapes. These associations are not considered to be the result of semantic processing, nor are they reliant on meaning, as is the case for integrative relations or semantic relatedness – although they might be the result of a form of implicit learning based on co-occurrence in the environment. Overall, our results extend the influence of sound-symbolism to the domain of episodic memory and are congruent with the expectations derived from the work of Cox and Criss (2018) and Cox and Shiffrin (2017) on associative memory.

Author Notes

This research was supported by a Discovery grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) to Jean Saint-Aubin. While working on this article, R.-P. Sonier was supported by a graduate scholarship from the New Brunswick Innovation Foundation and D. Guitard was supported by a graduate scholarship from NSERC.

Data Availability Statement

The data and materials are available at: https://osf.io/84sfu/?view_only=f8bb27530644455fbb088e337167c956 and the experiment was not preregistered.