Face perception is often described as a domain of perceptual expertise. Our skill with faces is believed to be mediated by the acquisition of a holistic processing strategy (Richler, Cheung, & Gauthier, 2011). Unlike objects that are typically identified at the category level (e.g., “dog”; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976), it is often necessary to identify faces at the level of the individual (e.g., “Paulo”). But all faces consist of the same kinds of features (eyes, nose, and mouth) in the same configuration (eyes above nose, nose above mouth). Holistic processing is believed to facilitate individuation of such visually similar objects. Several measures and definitions have been proposed for holistic processing, such as the idea that faces are represented as a gestalt on which face parts are glued together into a global “face template” (e.g., Maurer, Le Grand, & Mondloch, 2002), better recognition for wholes rather than parts (Tanaka & Farah, 1993), sensitivity to configuration or the spatial relationships between face parts (Diamond & Carey, 1986; Leder & Bruce, 1998; Le Grand, Mondloch, Maurer, & Brent, 2001; Mondloch, Le Grand, & Maurer, 2002), or interactive processing of features and configurations (Amishav & Kimchi, 2010). Finally, holistic face processing may be a consequence of inflexibility in attentional weightings on face parts (e.g., Richler, Palmeri, & Gauthier, 2012; Richler, Wong, & Gauthier, 2011), wherein all parts of a face are obligatorily attended. These definitions are not necessarily mutually exclusive. One should also note that although configural processing may be seen as a definition of holistic processing, these are constructs that may be dissociated, being differentially affected by experimental manipulations and with different developmental trajectories (e.g., Richler & Gauthier, 2014).

Here, we focus on the sequential matching composite task (Farah, Wilson, Drain, & Tanaka, 1998) as providing evidence for holistic processing: Participants judge whether one face half (e.g., top) of two sequentially presented faces is the same or different while ignoring the other face half (e.g., bottom). Holistic processing is measured as a failure of selective attention/an interference from the task-irrelevant part, despite instructions to focus on only one part. A particular indicator, such as the composite effect, can reflect the different definitions of holistic processing. For example, the failure to ignore face parts measured by the composite effect may reflect inflexibility in attentional weightings on irrelevant information (e.g., Richler et al., 2012). However, configural processing can also explain interference effects in the composite task if configurations are computed across as well as within each half.Footnote 1 One should thus be aware that holistic processing is an operational construct (and somewhat tautological) and does not rely, on the majority of studies, on theoretically derived approaches to defining and measuring the processes underlying holism. It is important to note that, as already suggested (e.g., O’Toole, Wenger, & Townsend, 2001; Townsend & Wenger, 2014), that the empirical regularities that are interpreted as evidence for holistic processing can be produced in numerous ways, with only a subset of these corresponding to the manner in which perceptual holism is currently used in the literature. We will thus caution the reader that holistic processing is an operational construct (not theoretically derived in the majority of studies), measured in the present study, and in many studies, with the composite task.

In the so-called complete composite paradigm (cf. Richler & Gauthier, 2014, for a meta-analysis and review), there are congruent and incongruent trials. In the congruent trials, both the target and irrelevant parts come from the same (or different) individual/face image. In the incongruent trials, the two parts, target and irrelevant, are different. An alignment manipulation is often included, with the target and irrelevant parts either aligned or misaligned. Holistic processing is indexed by an Alignment × Congruency interaction: the better performance on congruent than on incongruent trials is attenuated in the misaligned condition. Wong, Palmeri, and Gauthier (2009) showed that such hallmarks of face processing can also be found for novel, artificial, nonface objects as people develop expertise in individuating specific objects. Considering naturally occurring objects, although these objects do not bear any low-level physical resemblance to faces (e.g., X-rays: Bilalic, Grottenthaler, Nagele, & Lindig, 2014; chessboards: Bilalic, Langner, Ulrich, & Grodd, 2011; fingerprints: Busey & Vanderkolk, 2005; cars: Gauthier, Curran, Curby, & Collins, 2003), experts have been found to show holistic processing of these objects.

A question of interest is whether holistic processing also serves as a general marker for expert word recognition. Although word recognition is among the most prevalent forms of expert object recognition observed in the population (Wong & Gauthier, 2007), there is no consensus yet on whether expert word recognition involves holistic processing (Wong, Wong, Lui, Ng, & Ngan, 2019). For example, there is the well-known word superiority effect (Reicher, 1969; Wheeler, 1970) which suggests that whole-word representations can have a top-down influence on the letter level (McClelland & Rumelhart, 1981; Rumelhart & McClelland, 1982). However, other authors argue for part-based recognition of visual words (Pelli, Farell, & Moore, 2003). In this vein, Farah and colleagues (e.g., Farah, 1991, 1992; Farah et al., 1998; Tanaka & Farah, 1993) conceived word and face recognition as relying on different representational capacities (part based vs. holistic based). One should note that the division of part-based processing for words and whole-based processing for faces may be an oversimplification. Indeed, learned attention to diagnostic parts has been shown to drive holistic processing of faces (Chua, Richler, & Gauthier, 2014). Face-like holistic effects appear to require that both the task-relevant and task-irrelevant parts have a history of being attended and that the parts be perceptually grouped, allowing this attentional effect to apply to the entire object (Chua, Richler, & Gauthier, 2015).

Applying the complete composite paradigm to words, the composite effect has been observed in English words (Wong et al., 2011), Chinese characters (Wong et al., 2012), and Portuguese words (Ventura et al., 2017). These results have led to the proposal that holistic processing is a marker of perceptual expertise for both faces and visual words (Wong et al., 2012). Not all work evaluating word holistic processing has reached the same conclusion. Indeed, Hsiao and Cottrell (2009) found that non-Chinese readers (novices) perceived Chinese characters more holistically than Chinese readers (experts) did. A possible explanation for the discrepant findings concerns the difference in performance level between participant groups in the composite paradigm. While in Wong et al.’s (2011) study the participants were expert and intermediate English readers who had a similar performance level in the composite matching task, the Chinese novices in Hsiao and Cottrell’s (2009) study performed significantly worse on the composite matching task than experts did . When performance level between the two groups was equated (Wong et al., 2012), experts’ holistic processing showed some sensitivity to the amount of experience with the characters, as it was larger for characters than for noncharacters. Novices, however, did not show a systematic difference, suggesting that their effects were more related to their inefficient decomposition of a novel, complex pattern into parts. The work of Tso, Au, and Hsiao (2014) raised the interesting hypothesis of a role of writing experience in holistic Chinese character processing.

Words and faces differ in multiple aspects, from their physical properties to their underlying cognitive and neural mechanisms (cf. e.g., C. Chen, Abbasi, Song, Chen, & Li, 2016). Face and visual word recognition involve different neural mechanisms with an opposite hemispheric lateralization (e.g., Dehaene & Cohen, 2011; Kanwisher, 2010; Mckone, Kanwisher, & Duchaine, 2007; but see some recent findings suggesting that they might overlap in their neural resources in both hemispheres: Nestor, Behrmann, & Plaut, 2012, and also evidence for variability in the lateralization of the visual word form area: e.g., Carlos, Hirshorn, Durisko, Fiez, & Coutanche, 2019; and fusiform face area, e.g., Gerrits, Van der Haegen, Brysbaert, & Vingerhoets, 2019). It is then possible that faces and words can both involve holistic processing in their own separate face and word processing systems, but using different mechanisms.

It is thus important to evaluate whether holistic processing in the face and word processing systems use similar or different mechanisms. So far, the evidence is scarce and contradictory. Indeed, H. Chen, Bukach, and Wong (2013) showed in an event-related potential (ERP) study that holistic processing of Chinese characters has an earlier neurophysiological correlate (P1) than that (N170) commonly found for face holistic processing (e.g., Jacques & Rossion, 2009). This suggests potentially different mechanisms behind holistic processing for words and faces. On the other hand, there are similarities between the holistic processing found for words and that found for faces and other domains of perceptual expertise. For example, holistic face processing, like that for Chinese characters, has been found for both sequential and simultaneous matching of the study and test stimuli (Richler, Tanaka, Brown, & Gauthier, 2008; Robbins & McKone, 2007; Wong et al. 2012). Also, holistic processing of faces and cars was larger for familiar than for unfamiliar stimuli (Bukach, Phillips, & Gauthier, 2010; Ellis, Shepherd, & Davies, 1979; Harris & Aguirre, 2008; Young, Hay, McWeeny, Flude, & Ellis, 1985), just like the larger holistic processing for words than pseudowords (Wong et al., 2011). The absence of duration effects for visual words and faces also suggest that the holistic processing in these two categories may share some common principles (C. Chen, Abbasi, Song, Chen, & Li, 2016). In a study on face recognition, Richler, Mack, Gauthier, & Palmeri (2009) parametrically varied the stimulus duration from 17 ms to 800 ms. The congruency effect was observed for exposure as brief as 50 ms and from 50 ms onwards it was affected neither by the duration of the study face, nor by the duration of the test face (Richler et al., 2009). Similarly, C. Chen et al. (2016) found that variation in the exposure duration did not bring about significant changes in the holistic word effect, at least when the stimuli were presented in the range of 170 ms to 600 ms. Recently, Ventura et al. (2018) investigated the lateralization of word holistic processing. We know from divided visual field studies coupled with a composite task (Ramon & Rossion, 2012) that holistic processing of faces seems to be mainly mediated by the right hemisphere. The results of Ventura et al. (2018) showed that the word composite effect was observed in the left visual field/right hemisphere, but not in the right visual field/left hemisphere. This does not contradict the usual LH advantage for words (but see above evidence for variability in the lateralization of language), it merely states that the initial stages of holistic processes of words are right-lateralized. Late stages of holistic processing might occur in the left hemisphere—in fact, in studies using fMRI adaptation, the left visual word form area show greater selectivity to whole words rather than to sublexical orthographic patterns, suggesting that the region may serve as a form of visual dictionary (Glezer, Jiang, & Riesenhuber, 2009; Glezer, Kim, Rule, Jiang, & Riesenhuber, 2015). In sum, the lateralization results (Ventura et al., 2018) suggest that holistic processing of words and faces are at least partly served by common (initial) mechanisms in the right hemisphere.

As face holistic processing has been studied more extensively than word holistic processing, it remains to be seen if characteristics of holistic expert processing can also be observed for words. In the present study, we considered the contextual influences that can be observed across faces and novel object categories within a single trial (Richler et al., 2009). In their Experiment 3, Richler et al. (2009) examined whether congruency effects for a novel artificial object category (Greebles) could be contextually induced across object categories within a single trial. They used a dual task in which a face composite task and a Greeble composite task were interleaved. The study face was placed between the study Greeble and the test Greeble (Greebles were always presented aligned). This so-called inducing face was either aligned (processed more holistically) or misaligned (processed less holistically). The authors found a congruency effect for Greebles only when the inducing face was aligned. This is particularly surprising given the clear differences in the geometries of the stimuli. These contextual effects at the scale of the trial were explained as resulting from hysteresis of the processing strategy engaged by the inducing face (Richler et al., 2009): Because participants have perceived the aligned face holistically, they also process the following Greeble more holistically. Other reports have also shown that processing style recruited by one task may influence processing on a subsequent task. Macrae and Lewis’s (2002) seminal study showed that face recognition ability was affected when observers were primed to process local information of Navon stimuli. In Gao, Flevaris, Robertson, and Bentin’s (2011) study, on each trial participants first matched two simultaneously presented Navon letters (Navon, 1977) and then matched the upper halves of two sequentially presented composite faces. Navon matching required attention to either the global or the local level in separate blocks. Prior orientation to the global features of Navon stimuli led to larger holistic processing of faces, whereas local priming did not interfere (Gao et al., 2011).

We were interested in evaluating whether similar contextual influences can be observed in a single trial across words and a novel object category for which participants were novices. Previous studies showed that aligned artificial objects (Greebles) do not give rise to a congruency effect in a composite task (Richler et al., 2009). We first evaluated (Experiment 1) whether the same pattern of results could be observed for aligned Ziggerins (Wong, Palmeri, & Gauthier, 2009). In Experiment 2, we used a dual task in which a word composite task and a Ziggerin composite task were interleaved. Critically, the study word (which appeared before the test Ziggerin) was aligned or misaligned. Aligned words are processed more holistically than misaligned words (Ventura et al., 2017; Wong et al., 2012; Wong et al., 2011). If the trial-by-trial contextual effect is similar to what has been found for faces and Greebles, we would expect that the trial context in the aligned word condition should induce more of a congruency effect for the interleaved Ziggerin trials than the misaligned word condition. However, a reduced contextual influence of a misaligned word might reflect the very unfamiliar format of a misaligned word in text processing, and thus it might not be related to holistic processing. More compelling evidence for the idea that a word processed holistically might influence the holistic processing of an artificial object would be to compare aligned words versus aligned pseudowords. Words are processed holistically (Ventura et al., 2017; Wong et al., 2012; Wong et al., 2011), whereas pseudowords are not (Ventura et al., 2017), and only the former would engage holistic processing (and hence induce a congruency effect on a subsequent artificial object). That was the main aim of Experiment 3.

Our study thus contributes to the evaluation of whether holistic processing in the two domains—faces and words—occur because of similar reasons and mechanisms. Our study is also interesting because it might give important information regarding the nature of contextual holistic processes. The contextual effects observed by Richler et al. (2009) are difficult to explain considering that failures of selective attention can arise because of representational constraints of a global face template. Indeed, it is difficult to imagine that a global, unified, perceptual representation of an aligned face can affect processing of a subsequent novel object that does not share the same configuration of features (Richler et al., 2012). These contextual effects suggest the role of a perceptual strategy that is automatically recruited for the aligned face stimulus and remains in play when a novel object is processed (Richler et al., 2012). The interpretation of the contextual effects by a perceptual strategy would be reinforced by the observation of the same type of effects between words and Ziggerins, which are even more different than faces and Greebles.

Experiment 1: Composite task with aligned Ziggerins

Method

Participants

To estimate sample size, we considered the congruency effect for Greebles reported by Richler et al. (2009) in their Experiment 2. Partial eta square is not provided in the paper. We computed the partial eta square as [sum of square effect / (sum of square effect + sum of square error)], with sum of square effect = F × MSE × df1, sum of square error = MSEdf2.

According to G*Power (Version 3.0; Faul, Erdfelder, Buchner, & Lang, 2009), a sample size of 27 would be required to detect an effect size of ηp2 = .33 at α = 0.05 with a power of 0.9 for a one-way within-subjects ANOVA.

We nevertheless included all the participants that enrolled for the experiment in a 2-month period. Fifty-one Portuguese readers and psychology students from Universidade de Lisboa participated voluntarily in exchange of a course credit (18 males, ages 18–34 years; median age = 19 years). They had normal or corrected-to-normal vision and no known history of a reading disorder.

This study followed the Declaration of Helsinki and Portuguese deontological regulation, and was approved by the Deontological Committee of Faculdade de Psicologia of Universidade de Lisboa. Participants provided oral informed consent.

Material

Ziggerins are artificial novel objects (Wong, Palmeri, & Gauthier, 2009), including six classes, defined by a unique part structure. Within each class there are 12 styles, each defined by variations in the parts’ cross-sectional shape, size, and aspect ratio. Ziggerins are vertical, 3-D colored, shaded objects. For the purpose of this experiment, Ziggerins were converted to grayscale shaded objects. Each composite was made from two different-style Ziggerins within the same class by combining the top half of one and the bottom half of the other. The composites were then rotated to the horizontal. Ziggerin composites were always aligned. Each composite Ziggerin was divided into a left and a right half by a vertical line (5.44° × 3.74° at a viewing distance of 90 cm).

Procedure

At the beginning of each trial, participants saw a fixation mark for 500 ms. The study Ziggerin was then presented for 1,500 ms, followed by a square bracket presented for 500 ms, cuing participants as to which part (left or right) of the test Ziggerin they would be asked to respond to, followed by the test Ziggerin, which remained on the screen for a maximum of 3,000 ms or until response (cf. Fig. 1). The timings we used correspond exactly to those in the Greeble composite task of the interleaved procedure of Richler et al. (2009).

Fig. 1
figure 1

Schematic of two trials in the aligned Ziggerins composite task. The study and test Ziggerins were always aligned

Participants were instructed to judge whether the cued part of the test stimulus was the same as or different from the corresponding part of the study stimulus, while ignoring the irrelevant part. Study and test Ziggerins were always aligned.

Four blocks of 48 trials were presented randomly, totaling 192 trials (the same number of trials as in Richler et al., 2009, Experiment 3—interleaved face and Greebles composite task), with 24 trials for each combination of cued part (left vs. right), congruency (congruent vs. incongruent), and correct response (same vs. different). A practice block of 16 trials preceded the experimental blocks.

Results and discussion

We analyzed all trials considering sensitivity (d'; see Macmillan & Creelman, 2005). Hits of 1 and false-alarms of 0 were replaced by 1 − 1/(2N) and 1/(2N), respectively, where N corresponds to the number of trials in the specific condition of the design. We found no evidence of a congruency effect: congruent trials (d' = 2.75, SEM = .96); incongruent trials (d' = 2.6, SEM = 1.04), F(1, 50) = 1.64, p = .21.

In Richler et al.’s (2009) original study with interleaved composite tasks for faces and Greebles, the study and test Greebles were always aligned because this condition did not lead to any congruency effects in Experiment 1 in that article. In the present study, the novel objects were Ziggerins, not Greebles, and we also observed no congruency effects for aligned Ziggerins. Thus, we can be sure that hypothetical congruency effects found in the following two experiments were strictly triggered by the preceding context and not dependent on the format of the study and test Ziggerin.

Experiment 2: Interleaved composite tasks with words (aligned vs. misaligned) and aligned Ziggerins

Method

Participants

To estimate sample size, we considered the interaction between congruency and face format reported by Richler et al. (2009) in their Experiment 3. Partial eta square is not provided in the paper. We computed the partial eta square as [sum of square effect / (sum of square effect + sum of square error)], with sum of square effect = F × MSE × df1, sum of square error = MSE × df2.

G*Power cannot deal adequately with repeated-measures ANOVA designs (personal communication from G*Power feedback team). We thus reduced the 2 × 2 design to a difference score, computing for each participant two numbers—a (aligned word context_congruent–aligned word context_incongruent) and b (misaligned word context_congruent–misaligned word context_incongruent)—and then test their difference with a one-way within-subjects ANOVA.

According to G*Power (Version 3.0; Faul, Erdfelder, Buchner, & Lang, 2009), a sample size of 42 would be required to detect an effect size of ηp2 = .21 at α = 0.05 with a power of 0.9 for one-way within-subjects ANOVA.

We nevertheless included all the participants that enrolled for the experiment in a 2-month period. Forty-eight Portuguese readers and psychology students from Universidade de Lisboa participated voluntarily in exchange of a course credit (14 males, ages 18–36 years; median age = 19 years). They had normal or corrected- to-normal vision and no known history of a reading disorder.

This study followed the Declaration of Helsinki and Portuguese deontological regulation, and was approved by the Deontological Committee of Faculdade de Psicologia of Universidade de Lisboa. Participants provided oral informed consent.

Material

Ziggerin composites were the same as in Experiment 1. The word composites were made from 24 sets of four (consonant-vowel.consonant-vowel) CV.CV Portuguese words (see the Appendix). Each word was divided into a left and a right half. Within a set, the left and right halves of each word could be interchanged to create the four words resulting from the orthogonal manipulation of response (same; different) and congruency (congruent; incongruent): for example: BIFE, BICO, SAFE, and SACO.

Each aligned composite word was presented in Courier font with 20 point type size (250 × 150 pixels; 3.44° × 1.04° at a viewing distance of 90 cm). The left and right halves of each word were separated by a vertical line. In the misaligned condition, the right half of the word was moved down by 100 pixels (3.44° × 1.66° at a viewing distance of 90 cm).

Procedure

Two composite tasks—one with horizontal Ziggerins and one with words—were interleaved (see Fig. 2).

Fig. 2
figure 2

Schematic of two trials in the interleaved composite task. A composite task with Ziggerins, in which the study and test Ziggerins were always aligned, was interleaved with a composite task with words, in which the study word was either aligned or misaligned, and the test word was always misaligned. The critical manipulation was whether the word that preceded the test Ziggerin was aligned or misaligned. The response of interest was the response to the test Ziggerin

As crucial aspects of the paradigm of Richler et al. (2009, Experiment 3) were changed (words for faces, Ziggerins for Greebles), we decided to maintain the exact same procedure in the interleaved tasks. At the beginning of each trial, participants saw a fixation mark for 500 ms. The study Ziggerin was then presented for 1,500 ms, followed by the study word, which was presented for 1,500 ms. Keeping with the terminology of Richler et al. (2009), this inducing word was either aligned or misaligned. Following the inducing word, a square bracket was presented for 500 ms, cuing participants as to which part (left or right) of the test Ziggerin they would be asked to respond, followed by the test Ziggerin, which remained on the screen for a maximum of 3,000 ms or until response. Then, another square bracket, cuing which part of the word (left or right) they would respond to, was presented for 500 ms, followed by the test word, which remained on the screen for a maximum of 3,000 ms or until a response.

Participants were instructed to judge whether the cued part of the test stimulus was the same as or different from the corresponding part of the study stimulus, while ignoring the irrelevant part.

Study and test Ziggerins were always aligned. The study (inducing) word was either aligned or misaligned, while the test word was always misaligned (cf. Richler et al., 2009, for a discussion of these choices)

Four blocks of 48 trials were presented randomly, totaling 192 trials (the same number of trials as in Richler et al., 2009, Experiment 3), with 12 trials for each combination of cued part (left vs. right), congruency (congruent vs. incongruent), correct response (same vs. different), and inducing word format (aligned vs. misaligned) for the Ziggerin composite task. A practice block of 16 trials preceded the experimental blocks.

Results and discussion

We analyzed all trials considering sensitivity (d'; cf. Table 1). Hits of 1 and false-alarms of 0 were replaced by 1 − 1/(2N) and 1/(2N), respectively, where N corresponds to the number of trials in the specific condition of the design. As mentioned earlier, we computed for each participant two numbers—a (aligned word context_congruent–aligned word context_incongruent) and b (misaligned word context_congruent–misaligned word context_incongruent)—where congruent and incongruent refer to the congruency of the Ziggerin parts, and then tested their difference with a one-way within-subjects ANOVA. The congruency effect for Ziggerins was higher in the aligned word context than in the misaligned word context, F(1, 47) = 2.0, p < .05.

Table 1 Mean d' scores (square error of the mean in brackets)

Thus, we found evidence that the congruency effect induced by words is stronger for aligned words than for misaligned words. It might be the case, however, that the disruption (or at least the smaller influence) of holistic processing of a subsequent artificial object reflects the disruption caused by processing a misaligned word, which is a very unfamiliar format and is not related to holistic processing. Although a similar manipulation of misalignment is common in the composite task with faces and in previous demonstrations of the composite effect for words (Ventura et al., 2017; Wong et al., 2012; Wong et al., 2011), more compelling evidence for the idea that a word processed holistically might influence the holistic processing of an artificial object would be to compare aligned words versus aligned pseudowords. Words are processed holistically (Ventura et al., 2017; Wong et al., 2012; Wong et al., 2011), whereas pseudowords are not (Ventura et al., 2017), and only the former would engage holistic processing (and hence induce a congruency effect on a subsequent artificial object). That was the main aim of Experiment 3.

Experiment 3: Interleaved composite tasks with aligned words versus pseudowords and aligned Ziggerins

Method

Participants

We considered the same power estimation as in Experiment 2. According to G*Power (Version 3.0; Faul, Erdfelder, Buchner, & Lang, 2009), a sample size of 42 would be required to detect an effect size of ηp2 = .21 at α = 0.05 with a power of 0.9 for one-way within-subjects ANOVA. We reduced the 2 × 2 design to a difference score, computing for each participant two numbers—a (word context_congruent–word context_incongruent) and b (pseudoword context_congruent–pseudoword context_incongruent)—and then test their difference with a one-way within-subjects ANOVA.

We nevertheless included all the participants that enrolled for the experiment in a 2-month period. Fifty Portuguese readers and psychology students from Universidade de Lisboa participated voluntarily in exchange of a course credit (20 males, ages 18–36 years; median age = 19 years). They had normal or corrected-to-normal vision and no known history of a reading disorder.

This study followed the Declaration of Helsinki and Portuguese deontological regulation, and was approved by the Deontological Committee of Faculdade de Psicologia of Universidade de Lisboa. Participants provided oral informed consent.

Material

Ziggerin composites were the same as in Experiments 1 and 2. The word composites with Courier stimuli were made from 12 sets of four (consonant-vowel.consonant-vowel) CV.CV Portuguese words (see the Appendix). The pseudoword composites with Courier stimuli were made from 12 sets of four (consonant-vowel.consonant-vowel) CV.CV pseudowords (see the Appendix). Each word and each pseudoword was divided into a left and a right half.

Each aligned composite stimuli were presented in Courier font with 20 point type size (250 × 150 pixels; 3.44° × 1.04° at a viewing distance of 90 cm). The left and right halves of each verbal stimuli were separated by a vertical line.

Procedure

Two composite tasks—one with horizontal Ziggerins and one with verbal stimuli—were interleaved (see Fig. 3).

Fig. 3
figure 3

Schematic of two trials in the interleaved composite task. A composite task with Ziggerins, in which the study and test Ziggerins were always aligned, was interleaved with a composite task with verbal stimuli, in which the study verbal stimuli was either an aligned word or an aligned pseudoword, and the test verbal stimuli was always misaligned. The critical manipulation was whether the verbal stimuli that preceded the test Ziggerin was an aligned word or an aligned pseudoword. The response of interest was the response to the test Ziggerin

We maintained the same procedure of Experiment 2 for the interleaved tasks. Participants were instructed to judge whether the cued part of the test stimulus was the same as or different from the corresponding part of the study stimulus, while ignoring the irrelevant part.

Study and test Ziggerins were always aligned. The study (inducing) verbal stimuli was either an aligned word or an aligned pseudoword, while the test verbal stimuli was always misaligned.

Four blocks of 48 trials were presented randomly, totaling of 192 trials (the same number of trials of Experiment 2), with 12 trials for each combination of cued part (left vs. right), congruency (congruent vs. incongruent), correct response (same vs. different), and inducing format (aligned word vs. aligned pseudoword) for the Ziggerin composite task. A practice block of 16 trials preceded the experimental blocks.

Results and discussion

We analyzed all trials considering sensitivity (d'; cf. Table 2). Hits of 1 and false-alarms of 0 were replaced by 1 − 1/(2N) and 1/(2N), respectively, where N corresponds to the number of trials in the specific condition of the design.

Table 2 Mean d' scores (square error of the mean in brackets)

As mentioned earlier, we computed for each participant two numbers—a (word context_congruent–word context_incongruent) and b (pseudoword context_congruent–pseudoword context_incongruent)—where congruent and incongruent refer to the congruency of the Ziggerin parts, and then tested their difference with a one-way within-subjects ANOVA. The congruency effect for Ziggerins was equivalent in the word context and in the pseudoword context, F < 1.

In the present Experiment, we had a stricter evaluation of contextually induced congruency effects by verbal stimuli, and we found no evidence that an aligned word (which is processed holistically) induces a stronger congruency effect on artificial objects than an aligned pseudoword (which is not processed holistically).

General discussion

Richler et al. (2009) had shown that contextually induced congruency effects can occur within a single trial between objects of different categories. Specifically, trials that contained aligned faces led to congruency effects for Greebles. We used a different type of artificial objects, Ziggerins (Wong et al., 2009), and in Experiment 1 we verified that there were no congruency effects for aligned Ziggerins. When comparing aligned words versus misaligned words as inducing context (Experiment 2), we found evidence that aligned words can induce a stronger congruency effect on interleaved Ziggerins than misaligned words. However, this difference might be simply because misaligned words are in a format that is not usual in reading. In a stricter test, we compared aligned words (which are processed holistically) to aligned pseudowords (which are not processed holistically), and we found no evidence that an aligned word induces a stronger congruency effect on artificial objects than aligned pseudowords. We thus found a dissociation between face and word holistic perception in terms of their contextual influences. This is relevant because all evidence to date (although scarce) points to similarities between the characteristics of holistic face processing and holistic word processing, with the exception of H. Chen et al. (2013) study showing that holistic processing of Chinese characters has an earlier neurophysiological correlate (P1) than that (N170) commonly found for face holistic processing (e.g., Jacques & Rossion, 2009). A similar event-related potential (ERP) study with words from an alphabet are nevertheless necessary to confirm, and eventually extend, this result.

Why would contextual influences of holistic processing be different for words and faces? We have evidence that holistic processing of words has at least two different loci: an earlier locus, probably reflecting basic perceptual processes, and a later, lexical orthographic locus. C. Chen et al. (2016) found that variation in the exposure duration did not bring about significant changes in the holistic word effect, at least when the stimuli were presented in the range of 170 ms to 600 ms. But holistic word processes occurring at an earlier versus a later stage may reflect the influence of different variables and mechanisms.

As mentioned above, holistic processing of Chinese characters has been found to be associated with the event-related potential (ERP) component P1 (H. Chen et al., 2013. A functional magnetic resonance imaging (fMRI) study also showed selectively higher activity for words over visually controlled objects not only in the visual word form area (VWFA) but also in the primary visual cortex (V1) (Szwed et al., 2011). The earlier neural involvement of word processing could contribute to holistic processing. However, we also have evidence for a later locus of holistic word processing. Indeed, Ventura et al. (2017) demonstrated that the word composite effect is immune to surface features of words: The same strong congruency effect, severely reduced in the misaligned condition, was found for handwritten and alternating-cAsE fonts as for a typical print font. The word composite effect thus also stems from access to abstract word representations. At this stage, holistic processing of words might help the workings of the VWFA. The VWFA, it is generally agreed, intervenes in the efficient identification of orthographic stimuli (Dehaene et al., 2001) and enables quick association of such stimuli with phonological and lexical information (Hashimoto & Sakai, 2004). In expert alphabetic readers, the VWFA is organized in a posterior-to-anterior hierarchy (Dehaene et al., 2004; Thesen et al., 2012; Vinckier et al., 2007): Posterior parts respond to individual letters, irrespective of case (Dehaene et al., 2004; Thesen et al., 2012), whereas anterior parts respond to letter combinations such as bigrams (Binder, Medler, Westbury, Liebenthal, & Buchanan, 2006; Vinckier et al., 2007). Holistic processing may intervene to bind together individual letters that activate the posterior part of the VWFA, providing the input that activates more anterior parts of the VWFA, responsive to whole words. Holistic processing of words, at this lexical/orthographic stage, might reflect a critical linguistic component. Linguistic regularities such as word frequency and transitional probabilities between sublexical units may lead to the construction of chunks at the whole-word level, according to statistical learning research (Orbán, Fiser, Aslin, & Lengyel, 2008). Such an organization of the complex visual display of letters into representational objects may be crucial in satisfying the highly demanding visual task of word recognition and reading. Holistic word processing at this late lexical/orthographic stage, and the consideration of all parts of a word together, may thus have at its origin linguistic regularities and variables. Holistic word processes occurring at this lexical orthographic locus might be influenced by other linguistic variables, including phonology. Although the phonological code lags slightly beyond the orthographic code, both are rapidly activated (cf. Van Orden & Kloos, 2005, for a revision), and hence the word composite effect could be modulated by orthography-to-phonology consistency. Note that in Portuguese, an orthographic syllable (e.g., <ca> in <cano> and <cave>; English translation: pipe and basement) can map into more than one phonological syllable (e.g., /k6/ in /k6nu/, and /ka/ in /kavə/, respectively). Recently, we found that phonology modulated the word composite effect (Ventura, Fernandes, Leite, & Wong, 2019). Indeed, the congruency effect for orthography-to-phonology inconsistent words (e.g., cano; cave) was significantly smaller than for consistent words (e.g., tive; tino; /tivə/ and /tinu/; English translation: (I) had; sense, respectively).

Given that the presentation times in our study were long (but in accordance with those used by Richler et al., 2009), the holistic word processes involved might depend strongly on a late lexical/orthographic stage, and the nature of these linguistic-dependent holistic processes may not be abstract enough to allow an influence on other, nonlinguistic categories.

An alternative explanation of our results, and in particular of Experiment 3, is that holistic word processing generalizes, but in Experiment 3 both words and pseudowords induced holistic processing to an equivalent degree, perhaps because they are equally pronounceable and orthographically matched. However, the Portuguese words and pseudowords used in the present Experiments 2–3 were previously used in Ventura et al. (2017) in studies using the composite task. In those studies, we obtained evidence that words were processed holistically: We found a composite effect signaled by the significant Alignment × Congruency interaction. For pseudowords, we observed only a congruence effect, given that performance was affected by the irrelevant part even in the misaligned condition. If anything, the composite effect for pseudowords was in the opposite direction than that signaling the usual composite effect. Thus, it seems unlikely that both words and pseudowords induced a congruency effect due to their being both processed holistically. The congruency effects observed for unfamiliar Ziggerin objects in our study do not seem to reflect the influence of holistic processing of linguistic stimuli, and might more likely reflect strategic influences and constraints of the task (e.g., Murphy, Gary, & Cook, 2017; Richler et al., 2011).

The fact that we have not observed higher contextual effects from words to Ziggerins does not seem to have direct consequences in understanding contextual effects for faces and Greebles. The interpretation of the contextual effects by a perceptual strategy would only be reinforced by the observation of the same type of effects between words and Ziggerins, which are even more different than faces and Greebles. As discussed above, the processes and mechanisms involved in late lexical/ orthographic holistic processes might lack the necessary level of abstractness to exert an influence on an artificial object. Our results do not allow a better understanding of the nature of contextual effects for faces and Greebles.

In conclusion, while the behavioral phenomenon of holistic processing can be a domain-general expertise marker, it can be supported by different mechanisms due to different types of prior experience as a function of the object category concerned, which in the case of words, may reflect linguistic factors.