The phenomenon of insight in problem solving and creative thinking (the sudden realization of the solution to a problem after a period of impasse, the sudden movement from befuddlement to understanding) has captured the interest of researchers and laypersons alike (e.g., Lehrer, 2008). Many refer to the story of Archimedes (ca. 287–212 BC), who, on immersing himself in his bath, suddenly realized the solution to the problem of how to measure the volume of the king’s crown, with the result that he leaped out of the bath and ran naked through the streets of Syracuse shouting “Eureka!” (“I have found it!”). The story of Archimedes’s discovery, although probably apocryphal (for a discussion, see www.math.nyu.edu/~crorres/Archimedes/Crown/CrownIntro.html), is still cited by researchers (e.g., Luo & Knoblich, 2007) as a paradigmatic case of insight in problem solving, and the terms eureka event and aha! experience are now used to refer to similar moments of sudden realization experienced by all of us.

Despite a long history of interest in the nature and phenomenology of insight experiences, debate continues as to the specific mechanisms that differentiate problem solving that is accompanied by insight from problem solving that is not. A commonly drawn distinction is that noninsightful solutions arise through methodical, goal-directed, and strategic (or analytic) processes that are available to conscious awareness, whereas solutions through insight do not (Ohlsson, 2011; Schooler, Ohlsson, & Brooks, 1993). However, evidence from several studies (Chein, Weisberg, Streeter, & Kwok, 2010; Chronicle, MacGregor, & Ormerod, 2004; Perkins, 1981) points to the involvement of analytic processes even in problems whose solution may be accompanied by a sudden feeling of insight, raising doubts about the utility of this distinction. For example, in our previous investigations of the 9-dot problem, a classic insight problem (Chein et al., 2010), we found that performance was predicted by interindividual differences in working memory (WM) capacity (the conscious mental workspace in which analytic processes are assumed to be carried out; Baars & Franklin, 2003; Baddeley, 1986). Thus, performance on a problem presumably requiring insight for its solution was predicted by individuals’ capacity to mentally store and manipulate information in the service of conscious information processing.

In the present research, to further explore the involvement of WM and analytic processes in insightful problem solving, we turned to a class of verbal problems, compound remote associate (CRA) problems, which have been used frequently in recent studies of problem solving through insight (Ansburg, 2000; Bowden & Jung-Beeman, 2003a, b; Chu & MacGregor, 2011; Cunningham, MacGregor, Gibb, & Haar, 2009; Jung-Beeman et al., 2004; Kounios et al., 2006; Ricks, Turley-Ames, & Wiley, 2007; Sandkühler & Bhattacharya, 2008). In order to set the stage for discussing the present study, we first examine the two current conceptions of insight, and then examine recent research investigating insight in CRA problems.

Two orientations toward insight

Most current researchers agree that solving a problem with insight depends on restructuring of the problem representation, which means that the solution to the problem will develop from a representation that is different from that with which the individual began (Ohlsson, 1992, 2011; Weisberg, 1995). Restructuring can bring with it a sudden solution, encapsulated as a eureka event or aha! experience. As we noted above, solving a problem through insight has been contrasted with solution as the result of analysis (Novick & Bassok, 2005; Weisberg, 2006), in which the problem representation does not change as the person works through the problem, and solution typically occurs gradually and without the strong emotional experience that sometimes accompanies insight (Metcalfe, 1986; Metcalfe & Wiebe, 1987).

Insight as a special process

Although there is general agreement among researchers concerning the overall structure and phenomenology of problem solution through insight, there has been disagreement concerning the particular processes underlying insight; that is, the processes intervening between the initial incorrect solution attempt(s) and the restructuring that results in solution and the aha! experience. In one view (Jung-Beeman et al., 2004; Luo & Knoblich, 2007; Ohlsson, 1992, 2011), which has been called the special-process view of insight (Ball & Stevens, 2009; Schooler et al., 1993), it is proposed that those initial failures lead to impasse, a situation in which the person is at a loss concerning how to proceed. The occurrence of impasse sets in motion what we can call restructuring processes. Examples would be spreading activation in semantic memory (Ohlsson, 1992, 2011) or switching from fine-grained left-hemisphere processing to more coarse-grained right-hemisphere processing (Jung-Beeman et al., 2004). Those processes, which are outside the conscious control of the person, may bring about a restructuring of the problem, resulting in solution accompanied by an aha! experience.

Several different sorts of evidence have been brought forth in support of the special-process view of insight (for reviews, see Chein et al., 2010; Fleck & Weisberg, 2004; Gilhooly, Fioratou, & Henretty, 2010; Gilhooly & Murphy, 2005; Weisberg, 2006). For example, Schooler et al. (1993; see also Schooler & Melcher, 1995) demonstrated verbal overshadowing of insight: Asking participants to think aloud interfered with solution of insight problems, but not with solution of analytic problems. That finding was interpreted as evidence that the processes underlying insight are, in contrast to those underlying solution through analysis, nonverbalizable and hence outside of the person’s conscious control.

The special-process view also suggests that WM should play little or no role in problem solving through insight, since the solution comes about as the result of processes outside the conscious planning abilities of the person. Accordingly, evidence indicating that performance on insight problems is not affected by the imposition of an additional WM load (Lavric, Forstmeier, & Rippon, 2000), and can be independent of individual differences in WM capacity (Ash & Wiley, 2006), has been offered as support for the special-process view. As we already noted, our previous results with a visuospatial problem, the 9-dot problem (Chein et al., 2010), contradicted that view.

Insight as business as usual

In contrast to the special-process view of insight is the business-as-usual view (Ball & Stevens, 2009; Chronicle et al., 2004; Chronicle, Ormerod, & MacGregor, 2001; Gilhooly et al., 2010; Perkins, 1981; Weisberg, 2006), which proposes that solving a problem through restructuring, per se, does not mean that some sort of special process, such as unconscious spreading activation in response to impasse, has occurred. Rather, the problem solver may change the representation of the problem in response to information that becomes available as a result of failures to solve the problem. In this vein, Fleck and Weisberg (2004) reported that the box solution to the candle problem, assumed to be the result of insight, resulted from restructuring that occurred when new information became available as the individual worked on the problem. An example would be when a participant, on attempting to tack the candle directly to the wall, realized that the tacks were not large enough to go through the candle. That realization led to a conclusion that a shelf was needed, which in turn led to the box.

In an important extension of the business-as-usual view, MacGregor, Ormerod, and Chronicle (2001; see also Chronicle et al., 2004; Chronicle et al., 2001) proposed an analysis of performance on the 9-dot problem (Scheerer, 1963; Weisberg & Alba, 1981), which claimed that solution depends on the utilization of heuristic methods, such as a strategy of drawing lines that cover the largest number of remaining dots. A critical component in carrying out that strategy is look-ahead: Mentally determining the state of the problem situation after a series of lines has been drawn. According to MacGregor et al., people with greater look-ahead capacity will be more likely to realize that drawing lines within the square formed by the dots is doomed to failure, and therefore will turn to the possibility of drawing lines “outside the box.” Furthermore, look-ahead capacity should play a role in development of the solution of the problem: people with higher look-ahead capacity will solve the problem more frequently and quickly than those with lower capacity.

MacGregor et al. (2001; see also Chronicle et al., 2001) did not directly examine the relation between look-ahead and 9-dot performance. Rather, they designed problems that they assumed would require differing degrees of look-ahead, and found that participants and computer models took longer to solve the problems they associated with greater look-ahead. To provide a more direct test of the role of look-ahead in 9-dot performance, Chein et al. (2010) operationalized look-ahead as spatial-WM capacity, and found support for the two hypotheses just outlined: (1) people with larger spatial-WM spans were quicker to draw possible solution lines outside the square formed by the dots; and (2) people with larger spatial-WM spans were quicker to solve the problem. In addition, performance on the 9-dot problem was positively related to spatial-WM capacity but not to verbal-WM capacity, which indicated that performance on the problem depended most heavily on the ability to visualize the outcomes of various moves. In sum, our previous work indicated that solution of the 9-dot problem, a classic insight problem, is related to WM involvement, and, particularly, the modality-specific component of WM involved in spatial representation and processing. Those results raised the question of whether other insight problems, particularly verbally based insight problems, might also involve planning in WM. To examine this question, we turned to CRA problems.

Compound remote associate problems

CRA problems were developed by Mednick (1962, 1968) for the remote associates test, a test of creative-thinking capacity. Each CRA problem requires that the individual provide a solution word that is related to three cue words presented in the problem, and the problems are designed so that the solution word is not a strong (that is, frequent) associate to any cue word. Thus, solution requires that the individual search through memory to find unusual or infrequent associations. For example, in one CRA problem, the cue words age, mile, and sand are presented, and the participant must discover their common associate, in this case stone (stone age, milestone, sandstone; see Appendix A for additional examples). One could characterize solving a CRA problem as requiring the individual to think outside the box, in the sense that deriving remote associations to the cue words in the problem requires that the person break away from the dominant or high-frequency associations produced by each cue word in the problem (Mednick, 1962, 1968).
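The structure of a CRA problem can be made concrete with a short sketch. The function name `is_solution` and the small `COMPOUNDS` lexicon below are hypothetical and serve only as illustration; the check assumes that a solution word must form a compound or two-word phrase with every cue word, in either order.

```python
def is_solution(candidate, cues, compounds):
    """Return True if `candidate` combines with every cue word.

    A cue is satisfied when the cue and the candidate form a known
    compound, closed ("milestone") or open ("stone age"), in either order.
    """
    def joins(cue):
        forms = {
            cue + candidate,
            candidate + cue,
            f"{cue} {candidate}",
            f"{candidate} {cue}",
        }
        return bool(forms & compounds)

    return all(joins(cue) for cue in cues)


# Hypothetical mini-lexicon; a real check would need a full compound list.
COMPOUNDS = {"stone age", "milestone", "sandstone", "sandbox", "mileage"}
CUES = ("age", "mile", "sand")

stone_ok = is_solution("stone", CUES, COMPOUNDS)  # satisfies all three cues
box_ok = is_solution("box", CUES, COMPOUNDS)      # satisfies only "sand"
```

The candidate "box" shows why a strong associate of a single cue is not sufficient: it pairs with sand but fails the other two cues.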

Aha! experiences when solving CRA problems

The research of Jung-Beeman, Bowden, Kounios, and colleagues (Bowden & Jung-Beeman, 2003a; Bowden, Jung-Beeman, Fleck, & Kounios, 2005; Jung-Beeman et al., 2004; Kounios et al., 2006) and that of Sandkühler and Bhattacharya (2008) supports the conclusion that CRA problems can be solved through insight. For example, in the study by Jung-Beeman et al., participants asked to provide subjective reports of aha! experiences after solution gave “insight” ratings for more than 50 % of problems. Moreover, those solutions rated as having evoked a subjective experience related to insight were associated with a distinct profile of brain activity, as compared with problems that did not evoke that experience (for a review of the neurocognitive findings, see Kounios & Beeman, 2009; for a critique of that research, see Weisberg, 2013).

Working memory in CRA problems

From the special-process view, one might expect that, since CRA problems are frequently accompanied by insight ratings, performance on this class of problem should not relate to an individual’s WM capacity. However, the involvement of both executive (modality-independent) and modality-dependent components of WM in CRA problem solving is indicated in prior research.

Some prior studies examining problems other than CRA problems (e.g., Ash & Wiley, 2006; Gilhooly & Murphy, 2005) suggest that executive components of WM contribute to insight problem solving by controlling the specific allocation of attention during the problem attempt. The ability to direct the focus of attention via executive WM may similarly be a critical aspect of CRA problem solving. Attention control mechanisms may act to prevent the solver from being unduly influenced by irrelevant semantic information encountered during the problem attempt, and may guide a controlled search through long-term memory to facilitate the identification of candidate solution words (Rosen & Engle, 1997; Unsworth, 2009). Evidence from CRA problems supporting this expectation was provided by Ricks et al. (2007). Though their study was focused on the interaction between domain-knowledge and problem performance, in two experiments, Ricks et al. found a significant relationship between individual differences in WM capacity—measured using complex-WM-span tasks (operation span in their Exp. 1; operation and reading span in Exp. 2)—and problem performance. Since performance on these complex-WM-span tasks is generally thought to index executive aspects of WM (Engle, 2002; Kane et al., 2004), the authors suggested that WM capacity contributes to CRA performance “through the executive ability to direct or focus one’s attention” (Ricks et al., 2007, p. 1461).

However, the Ricks et al. (2007) study included only verbal complex-WM-span measures (the memoranda in the reading and operation span tasks were letters and words, respectively), and therefore could not assess the relative contributions of modality-dependent versus modality-independent aspects of WM (Baddeley, 1986; Cowan, 1995). Modality-dependent components of WM (i.e., visuospatial and verbal WM) may provide an essential platform, specific to the domain in which a problem is couched, within which the participant can represent and manipulate the interim products of the problem attempt, and can plan or simulate alternative approaches to the problem (Gilhooly, Phillips, Wynn, Logie, & Della Sala, 1999). Both MacGregor et al.’s (2001) model of the 9-dot problem and our empirical findings with that problem (Chein et al., 2010) reflect such material-specific WM involvement in a spatial insight task. With the verbally couched CRA problems, identifying the common associate word (i.e., the solution) may demand simultaneous representation in verbal WM of the three cue words and their respective associates. In addition, solution may be facilitated by the ability to update verbal WM and maintain a running record of the set of associates that have already been eliminated as possible solutions. Support for the involvement of modality-specific WM mechanisms in CRA performance was provided by a study conducted by Ball and Stevens (2009), in which they found that articulatory suppression—the requirement to overtly repeat an irrelevant utterance (e.g., “1–2–3–4–5”) concurrent with the problem attempt—significantly impaired CRA performance. Since articulatory suppression has been claimed to selectively block the verbal rehearsal component of WM (Baddeley, Lewis, & Vallar, 1984; Farmer, Berman, & Fletcher, 1986), this finding is consistent with the view that verbally specific WM resources are deployed to support CRA problem solving.

Although the studies of Ball and Stevens (2009) and Ricks et al. (2007) point to the involvement of WM in CRA performance, an important limitation is that those investigators did not consider the specific involvement of WM in insightful solution of those problems. As has been indicated by several prior studies (Bowden et al., 2005; Jung-Beeman et al., 2004; Kounios et al., 2006; Sandkühler & Bhattacharya, 2008), when one asks participants to provide subjective aha! ratings for solved CRA problems, many of the solved problems are rated as having been accompanied by feelings of insight, but many are not. That is, CRA problems are heterogeneous with respect to the experience of insight, with a significant portion reported by participants to be solved through analysis (noninsightfully). Accordingly, the results from Ricks et al. (2007) and Ball and Stevens (2009), who made no attempt to determine which problems were accompanied by a feeling of insight, may reflect the involvement of WM in only the specific subset of problems that were solved through analysis, not those solved through insight. In the present study, we therefore followed the procedures used in earlier studies in obtaining subjective reports following solution of each problem. By separately analyzing the subset of problems solved with a feeling of insight, we were able to examine more directly the link between WM and insight in CRA problems.

Furthermore, the conclusions of Ricks et al. (2007) and Ball and Stevens (2009) are conflicting with respect to their emphasis on executive versus modality-specific aspects of WM. Thus, as an additional refinement, we sought to examine the separate involvement of modality-specific and executive WM processes in CRA performance, by including both verbal- and visuospatial-WM assessments. To address a confound between mnemonic and nonmnemonic aspects of attentional processing required in complex-WM-span tasks, we also included in our design a more direct assessment of attention control by examining individual differences in attention tasks that place few specific demands on memory, including the antisaccade (Munoz & Everling, 2004; Unsworth, Schrock, & Engle, 2004) and Stroop color-word interference (Kane & Engle, 2003; Stroop, 1935) tasks.

Further questions concerning performance on CRA problems

Verbal overshadowing of insight

As we noted earlier, Schooler and colleagues (Schooler & Melcher, 1995; Schooler et al., 1993) have reported that thinking aloud interferes with performance on insight problems, which has been brought forth as support for the special-process view of insight. However, a number of studies have since failed to replicate the finding of verbal overshadowing of insight (e.g., Chein et al., 2010; Fleck & Weisberg, 2004; Gilhooly et al., 2010), so it is of interest to examine that phenomenon further. Only one prior study has specifically examined verbal overshadowing in CRA problems (Ball & Stevens, 2009), with mixed results. For a set of CRA problems chosen because they were easy to solve, and which the authors argued were likely solved without impasse being reached, a think-aloud requirement interfered with performance. In contrast, for a difficult set of CRA problems, the think-aloud requirement actually significantly facilitated solution (i.e., the opposite of a verbal-overshadowing effect was observed). As was suggested by the authors, those results, which indicate the absence of a verbal-overshadowing effect for the set of problems that are most likely to have produced impasse, are difficult to interpret within the special-process view and seem to favor the business-as-usual view. However, since insight ratings were not obtained in the Ball and Stevens study, strong conclusions regarding the relationship between verbal overshadowing and the experience of insight in these problems could not be drawn. Accordingly, in the present study we assessed the verbal-overshadowing effect in association with problems that were, and were not, rated by participants as having been accompanied by a feeling of insight.

Insight and solution time

According to the special-process view, insight occurs in response to impasse, which means that all solution attempts arising from the initial analysis of the problem have been exhausted (Ohlsson, 1992, 2011). In contrast, noninsightful solutions presumably come about because one of those early solution attempts was successful. Therefore, all else being equal, insightful solutions should take longer to produce than solutions without insight. Some evidence from CRA problems contradicts that prediction. In both Jung-Beeman et al. (2004) and Kounios et al. (2006, Exp. 2), participants solved problems rated as being accompanied by insight more quickly than those rated as being noninsightful (but see Kounios et al., 2006, Exp. 1). The difference in solution times was not significant, but was in the direction opposite to the one that follows from the special-process view. More recently, Sandkühler and Bhattacharya (2008) found that problems solved with a strong feeling of suddenness were solved significantly faster than those accompanied by weak feelings of suddenness. The present study provided an opportunity to determine the robustness of the finding that insight ratings coincide with faster, rather than slower, solutions.

In sum, in the present study we examined CRA problems in order to determine the role of WM and attention in insight, and analytic, solutions to those verbal problems. We also examined additional aspects of performance to shed further light on how these problems are solved, and on the role that insight might play in solution.

Method

Participants

A group of 53 Temple University undergraduates, fulfilling an optional course requirement, served as participants. Participants were randomly assigned to one of two conditions: think-aloud (verbalization during problem solving; N = 27) versus silent (N = 26).

Materials

Sixty CRA problems (Appendix A) were chosen from the set of 144 problems that had been developed by Bowden and Jung-Beeman (2003b). In an effort to maximize individual differences (increase variance) on CRA performance, problems were chosen to represent a widely distributed range of difficulties, on the basis of the normative solution rates reported by Bowden and Jung-Beeman (2003b). This was accomplished by ordering all the problems in terms of normative solution rates. Then, starting with the most frequently solved problem, we chose every other problem, with the restriction that problems possessing solution words that were identical to those of previously selected problems were skipped. A small number of additional problems were excluded on the basis of agreement among the experimenters that the compound associates in the solution would be unfamiliar to our participant sample. This method resulted in our using problems from nearly the complete range of solution rates reported by Bowden and Jung-Beeman (2003b).
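The selection procedure can be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: the `Problem` record, the `select_problems` function, and the `step` parameter (the sampling interval) are hypothetical names, and the sketch assumes that a skipped problem is simply dropped rather than replaced by its neighbor in the ranking.

```python
from typing import NamedTuple


class Problem(NamedTuple):
    cues: tuple[str, str, str]
    solution: str
    norm_solution_rate: float  # normative proportion of solvers


def select_problems(pool, step, excluded=frozenset()):
    """Sample problems at a fixed interval along the difficulty ranking.

    Problems are ranked from most to least frequently solved. A sampled
    problem is skipped if its solution word was already selected or if the
    word is in `excluded` (e.g., judged unfamiliar to the sample).
    """
    ranked = sorted(pool, key=lambda p: p.norm_solution_rate, reverse=True)
    chosen, seen_solutions = [], set()
    for problem in ranked[::step]:
        if problem.solution in seen_solutions or problem.solution in excluded:
            continue  # assumption: skipped items are dropped, not replaced
        chosen.append(problem)
        seen_solutions.add(problem.solution)
    return chosen


# Toy pool with made-up cue words and rates, for illustration only.
toy_pool = [
    Problem(("a1", "a2", "a3"), "x", 0.9),
    Problem(("b1", "b2", "b3"), "y", 0.8),
    Problem(("c1", "c2", "c3"), "x", 0.7),
    Problem(("d1", "d2", "d3"), "z", 0.6),
    Problem(("e1", "e2", "e3"), "w", 0.5),
]
toy_selection = select_problems(toy_pool, step=2)
```

Sampling at a fixed interval along the ranked list keeps the selected set spread across the full range of normative difficulty, which is the stated goal of the procedure.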

Procedure

Participants were tested individually in a single session lasting 1 h. All of the participants completed the CRA problem set, two computerized tests of WM (automated verbal-WM and spatial-WM tests), and two computerized assessments of attention control (the antisaccade task and the Stroop color-word interference task).

CRA problems

Following instructions and practice, participants completed the set of 60 CRA problems. The order of problem presentation was randomized across participants. Each problem began with the central presentation of the three cue words, which remained on the screen throughout the problem attempt. Participants were given 45 s to solve each problem by identifying the target word, which they typed using the keyboard. In the event of an incorrect solution attempt, participants were prompted (“Sorry, wrong answer. Keep trying.”) to continue working on the problem until the time limit was reached. Response latency from the onset of the problem to the initiation of the response (first keystroke) was recorded for all correct solutions.

Following a correct solution, participants were prompted to provide a subjective insight rating. Borrowing heavily from the approach developed by Jung-Beeman, Bowden, and their colleagues (Bowden & Jung-Beeman, 2003a; Jung-Beeman et al., 2004), evidence for aha! experiences accompanying solution was obtained through a retrospective subjective-report scale. Specifically, after they solved a problem, participants were instructed (as described in Appendix B) to use a 4-point scale to report the experience accompanying solution. The ratings on the “insight” scale were counterbalanced, with half of the participants seeing a scale with highly strategic solutions rated as 1 and highly insightful solutions rated as 4, and the other half of the participants receiving the reverse scale.

Following “think-aloud” procedures used in prior studies, half of the participants also received instructions to verbalize their thoughts during CRA problem attempts (Fleck & Weisberg, 2004; Perkins, 1981). Instructions concerning production of protocols were taken from Fleck and Weisberg (2004). Periodic reminders to think aloud were given by the experimenter between problem attempts. Since we were interested only in the possible influence of thinking aloud on performance, we did not record the protocols.

Verbal-WM task

Verbal-WM capacity was measured using an automated version of the operation span (OSPAN) task (Unsworth & Engle, 2005). In a typical OSPAN task, the participant must retain a series of verbal items (e.g., letters, words) presented in an interleaved fashion between a series of simple arithmetic equations. The experimenter controls the stimulus timing to match the participants’ individual speed of processing. In the automated version of the task, the participants complete a number of practice arithmetic problems prior to WM assessment, and the mean time to solve those equations is used to automatically (computer-paced) titrate the rate of item presentation during WM testing. The automated version we used required letter recall, and the to-be-remembered letters were subsampled from a set of 12 English consonants. Participants completed three trials at each set size ranging from three to seven letters. Letter recall was tested at the end of each trial, by displaying the complete array of 12 possible letters and requiring the participant to identify (by mouse click) the presented subset, in serial order. The participant’s final score was calculated by summing the number of letters correctly identified (correct letter in the correct serial position) across all presented sets, with a maximum attainable score of 75.

Spatial-WM task

Spatial-WM capacity was measured using an automated version of the symmetry span (SSPAN) task adapted from Kane et al. (2004; see also Heitz & Engle, 2007), which followed the same structure as the automated measure of verbal WM. In the SSPAN task, participants attempted to retain a sequence of spatial locations (positions on a 4 × 4 grid) interleaved with a series of symmetry judgments (in which the participant must determine if the shaded regions of an 8 × 8 matrix are symmetric about the central vertical axis). The rate of item presentation was again automatically titrated to the individual’s speed at performing the symmetry judgment task alone. Participants completed three trials at each set size, with set sizes ranging from only two to five items, due to the relative difficulty of location memory (as compared with letter memory). Location recall was tested at the end of each trial, by displaying the 4 × 4 grid and requiring the participant to identify the presented subset of locations in serial order. The participant’s final score was the number of locations correctly identified (correct location in the correct serial position) across all presented sets, with a maximum attainable score of 42.
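The scoring rule shared by the two span tasks (credit for each item recalled in its correct serial position, summed over all trials) can be sketched as below. The function names are hypothetical, and the sketch assumes that omitted responses at the end of a recall sequence simply earn no credit.

```python
def span_score(trials):
    """Sum the items recalled in the correct serial position.

    `trials` is a list of (presented, recalled) pairs, each a sequence of
    items (letters or grid locations). Positions missing from `recalled`
    earn no credit because `zip` stops at the shorter sequence.
    """
    total = 0
    for presented, recalled in trials:
        total += sum(p == r for p, r in zip(presented, recalled))
    return total


def max_score(set_sizes, trials_per_size=3):
    """Highest attainable score: every item of every trial in position."""
    return trials_per_size * sum(set_sizes)


# Maximum scores implied by the two designs described in the text.
ospan_max = max_score(range(3, 8))  # set sizes 3 to 7 -> 75
sspan_max = max_score(range(2, 6))  # set sizes 2 to 5 -> 42

# One made-up trial: only the first letter is in its correct position.
example = span_score([(["F", "K", "T"], ["F", "T", "K"])])
```

The computed maxima agree with the reported ceilings of 75 for the verbal task and 42 for the spatial task.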

Antisaccade task

The antisaccade task requires participants to resist the reflexive orienting of attention toward a visual cue and to intentionally redirect their attention in the opposing direction. We used a variant, developed originally by Kane and colleagues (Kane, Bleckley, Conway, & Engle, 2001), in which participants attempted to identify a pattern-masked target stimulus that appeared very briefly on the side of the screen opposite from the visually presented cue (“=”). The target stimulus on each trial was one of three capitalized letters (“B,” “P,” or “R”), and participants responded on each trial by pressing a correspondingly labeled key (labeled with “B,” “P,” and “R” stickers) on the number keypad. Each trial was participant-initiated (by pressing the spacebar). After an unpredictable interval between 200 and 2,200 ms, the visual cue flashed on either the left or right side of the screen, followed 50 ms later by the presentation of the target stimulus in the analogous position on the opposite side of the screen. The target stimulus was masked after 100 ms (replaced by an “H” and then an “8”), so successful target detection required a rapid attentional shift away from the cue and toward the target location. Target identification accuracy was used as the primary dependent measure of performance on this task.
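The trial timeline just described can be summarized in a short sketch. The function and field names are hypothetical, and the uniform sampling of the pre-cue interval is an assumption: the text specifies only that the interval was unpredictable and fell between 200 and 2,200 ms.

```python
import random


def make_antisaccade_trial(rng):
    """Build one trial specification following the timeline in the text."""
    cue_side = rng.choice(["left", "right"])
    target_side = "right" if cue_side == "left" else "left"
    return {
        "pre_cue_interval_ms": rng.uniform(200, 2200),  # assumed uniform
        "cue_symbol": "=",
        "cue_side": cue_side,
        "cue_to_target_ms": 50,
        "target_side": target_side,  # always opposite the cue
        "target_letter": rng.choice(["B", "P", "R"]),
        "target_duration_ms": 100,
        "masks": ["H", "8"],  # shown in this order after the target
    }


def identification_accuracy(responses, trials):
    """Proportion of trials on which the reported letter was the target."""
    correct = sum(r == t["target_letter"] for r, t in zip(responses, trials))
    return correct / len(trials)


rng = random.Random(0)  # seeded only so that the sketch is reproducible
trials = [make_antisaccade_trial(rng) for _ in range(5)]
```

Because the target always appears opposite the cue, a participant who reflexively orients toward the cue has only about 100 ms of unmasked target exposure to recover.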

Stroop task

The Stroop task is a widely used measure of the ability to internally guide attentional processing toward one source of information (the font color) while suppressing a prepotent response to an irrelevant source of information (the written word). In our variant, participants viewed a series of color-name words written in varying font colors (e.g., “red” written in blue font), and were required to respond to only the font color (blue) by depressing a correspondingly labeled key on the keyboard. After a practice block consisting of 60 trials, participants completed three task blocks comprising 60 trials each, with an equal number of congruent (e.g., “red” in red font) and incongruent (e.g., “red” written in blue font) trials in the series. Since accuracy is generally very high in this task, individual differences in performance were indexed by calculating the difference in participants’ response times for congruent versus incongruent trials (i.e., the congruency effect), excluding trials for which an inaccurate response was given.
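The congruency effect can be computed as in the sketch below. Two details are assumptions rather than statements from the text: response times are summarized by the mean, and the effect is expressed as incongruent minus congruent so that larger values indicate more interference. The function name and the trial fields are hypothetical.

```python
from statistics import mean


def congruency_effect(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials.

    `trials` is an iterable of dicts with keys 'congruent' (bool),
    'correct' (bool), and 'rt_ms' (float). Error trials are excluded.
    """
    accurate = [t for t in trials if t["correct"]]
    congruent = [t["rt_ms"] for t in accurate if t["congruent"]]
    incongruent = [t["rt_ms"] for t in accurate if not t["congruent"]]
    return mean(incongruent) - mean(congruent)


# Made-up trials for illustration; the 2000-ms error trial is dropped.
demo_trials = [
    {"congruent": True, "correct": True, "rt_ms": 500.0},
    {"congruent": True, "correct": True, "rt_ms": 520.0},
    {"congruent": False, "correct": True, "rt_ms": 600.0},
    {"congruent": False, "correct": True, "rt_ms": 640.0},
    {"congruent": False, "correct": False, "rt_ms": 2000.0},
]
demo_effect = congruency_effect(demo_trials)  # 620.0 - 510.0 = 110.0
```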

Results

Measures of the proportion of participants solving each problem, and the average time to the solution for each problem, were available for the present study and for the normative study conducted by Bowden and Jung-Beeman (2003b). Correspondences between our results and the prior norms, presented in Table 1, provided confidence in our CRA problem results. As can be seen in the table, the measures were all highly and significantly intercorrelated, with the highest correlation being between the proportions of solutions across the two studies. Thus, our study provided a strong replication of the normative data.

Table 1 Correlations between CRA performance in the present study and published norms

Reports of insightful solutions

On average, our participants successfully solved approximately 25 (M = 24.75 problems, SD = 7.6) of the 60 CRA problems presented (41 %). As expected, and also consistent with prior findings (Jung-Beeman et al., 2004; Kounios et al., 2006; Sandkühler & Bhattacharya, 2008), our participants reported that insight accompanied solution a significant proportion of the time. Of the solved problems, ratings associated with either a strong feeling of insight (M = 35 %, SD = 23 %) or a partial feeling of insight (M = 28 %, SD = 16 %) were obtained for an average of 64 % (M = 15.83 problems, SD = 7.0) of the solved problems. Problems rated as having been solved either partially (M = 19 %, SD = 12 %) or entirely (M = 17 %, SD = 21 %) through analysis (strategy) accounted for the remaining 36 % (M = 8.92 problems, SD = 5.8) of solutions. In further analyses based on these subjective ratings, we collapsed the ratings into two categories: Problems with complete or partial insight ratings were treated as insightfully solved; problems with complete or partial strategy ratings were treated as analytically solved. Moreover, for all analyses making a distinction between insightfully and analytically solved problems, the problem classifications relied on each participant’s own ratings of the problems. That is, different sets of problems were treated as insightfully or analytically solved for different participants, according to the insight rating that each individual provided.
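The collapsing of ratings into two categories, taking the counterbalanced scale direction into account, can be sketched as follows. The function names are hypothetical, and the sketch assumes that the two middle scale points correspond to the partial-strategy and partial-insight ratings.

```python
def classify_solution(rating, insight_is_high):
    """Collapse a 1-4 rating into 'insight' or 'analytic'.

    `insight_is_high` is True for participants whose scale ran from
    1 (highly strategic) to 4 (highly insightful), and False for those
    who received the reversed scale.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError("rating must be an integer from 1 to 4")
    aligned = rating if insight_is_high else 5 - rating  # undo reversal
    return "insight" if aligned >= 3 else "analytic"


def proportion_insight(ratings, insight_is_high):
    """Share of one participant's solved problems classed as insight."""
    labels = [classify_solution(r, insight_is_high) for r in ratings]
    return labels.count("insight") / len(labels)


# Made-up ratings from one participant on the reversed scale.
demo_share = proportion_insight([1, 2, 4, 3], insight_is_high=False)
```

Because classification is applied to each participant's own ratings, the same problem can fall into the insight category for one participant and into the analytic category for another.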

Verbal overshadowing of insight?

Before proceeding to other analyses, we investigated the presence of verbal overshadowing of insight, because, beyond interest in that question per se, the outcome of that analysis informed how we would proceed in subsequent analyses conducted across individuals in the think-aloud and silent conditions. In order to examine the possible overshadowing effect of instructions to think aloud on CRA performance, we first compared the overall performance exhibited by participants asked to think aloud to that exhibited by participants who worked silently on the problems. Two variables were of interest: the number of problems solved, and the time to reach the solution for solved problems. As is shown in the first two lines of Table 2, the two experimental conditions produced nearly identical behavior on both measures. In other words, the results provided no reliable evidence for verbal overshadowing of overall CRA performance.

Table 2 Comparison of CRA performance in silent versus think-aloud protocol groups

The verbal-overshadowing hypothesis predicts specifically that verbalization should interfere with insight. Therefore, one might expect that when individuals solved problems while thinking aloud, they would do so without insight. Similarly, one might expect that, even when a problem was solved through insight in the think-aloud condition, it would take longer to do so. Thus, as further tests of the hypothesis of verbal overshadowing of insight, we examined several additional results, summarized in lines 3–6 of Table 2. The two groups were not different in their ratings of the problems (line 3) or in the proportions of problems that they solved with insight (line 4). Solution latencies also did not differ across the two protocol groups for problems of either rating type (lines 5 and 6). The solution latency data did, however, suggest a possible crossover interaction, which we tested via a two-way mixed-effects ANOVA, with Rating Type (insight, noninsight) as a within-subjects factor and Protocol Group (silent, think-aloud) as a between-subjects factor. The interaction did not approach significance: F(1, 49) = 1.37, p = .237. Thus, across a series of tests we found no evidence that the two groups differed, indicating no verbal overshadowing of insight in this problem set. Having obtained no significant differences between individuals given instructions to complete the problems in silence or while thinking aloud, we proceeded by combining the data from the two protocol groups and including protocol condition as a covariate in all of the subsequent analyses.

WM, attention control, and insight in CRA problems

Descriptive statistics for the WM and attention control measures included in this study are provided in Table 3.

Table 3 Descriptive statistics for working memory (WM) span and attention control measures

Partial correlations

Table 4 presents the partial correlations between CRA performance and the set of individual-differences measures included in the study, controlling for protocol group (see Footnote 1). To explore the consistency of our results with those obtained by Ricks et al. (2007), we first investigated the potential involvement of WM in overall CRA problem solving by constructing a composite index of WM capacity. The composite WM score was based on participants’ averaged verbal (OSPAN) and spatial (SSPAN) WM scores, which were correlated r = .66 in our sample. A test of the simple correlation between individual differences in this composite WM index and the proportion of CRA problems solved was highly significant, a result that closely parallels that obtained by Ricks et al.
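For readers unfamiliar with partial correlations, the following sketch shows one standard way to correlate two measures while controlling for a single covariate such as protocol group. It is an illustration under stated assumptions, not the analysis code used in the study: the function names, the 0/1 coding of protocol group, and all data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, covariate):
    """First-order partial correlation of x and y, controlling for one covariate."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, covariate)
    r_yz = pearson_r(y, covariate)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up illustration values, not data from the study.
wm_composite = [1.0, 2.0, 3.0, 4.0]       # hypothetical composite WM scores
proportion_solved = [2.0, 1.0, 4.0, 3.0]  # hypothetical CRA performance
protocol_group = [0, 1, 1, 0]             # 0 = silent, 1 = think-aloud (assumed coding)

print(partial_r(wm_composite, proportion_solved, protocol_group))
```

In this toy data set the covariate is uncorrelated with both measures, so the partial correlation equals the simple correlation. When the covariate is related to either measure, the two values diverge.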

Table 4 Correlations between CRA performance, working memory (WM), and attention control measures

To further elucidate the involvement of executive attention versus mnemonic processes in CRA task performance, we also examined the relationship between performance on the two attention control measures and CRA performance. Since the antisaccade and Stroop color-word interference tasks were not significantly correlated with one another in our sample (r = .24), each measure was considered separately (it is noteworthy that the internal consistency of our Stroop measure was somewhat poorer than that obtained for other measures—see Table 3—and this may explain why the two attention measures were not significantly correlated; cf. Kane et al., 2004). As is shown in Table 4, Stroop performance was not a strong predictor of CRA performance, but individual differences in antisaccade performance were highly predictive of the overall proportion of CRA problems solved, suggesting that the ability to control the focus of one’s attention might also be an important aspect of CRA performance.

One further aspect of the relationship between WM capacity and insightful solution of CRA problems was explored during this analysis step. Namely, one possibility is that individuals with higher WM solve more CRA problems (including those given insight ratings) simply because they reach impasse sooner, and not because WM supports the process of solution following impasse (i.e., WM relates to reaching impasse, not to the subsequent insightful solution of a problem). To explore this possible confound, we examined the correlations between WM (both verbal and spatial) and time to solution for all problems (collapsed) and for each problem type separately (insight, analytic). None of the correlations with time to solution approached significance (r < .20, p > .10).

Multiple regression

The preceding analyses did not discriminate problem solutions on the basis of whether they were or were not accompanied by a feeling of insight. Moreover, the correlations were confounded due to substantial shared variance between the individual performance measures included in the study. Those analyses therefore do not support strong conclusions regarding the unique contributions of modality-dependent aspects of WM, modality-independent aspects of WM, and nonmnemonic processes associated with the control of attention. Accordingly, in subsequent analyses we used a multiple-regression approach that allowed us to parcel out these potentially separate contributors to CRA performance.

Our initial multiple regression model tested the two WM measures, verbal and spatial WM, as simultaneous predictors of the overall number of CRA problems solved. Only verbal WM significantly predicted CRA solution rates (β = .40; t = 2.25, p = .03), and a significant bivariate relationship [r(52) = .28, p = .04] previously observed between spatial WM and CRA solution was completely eliminated (β = .04; t = 0.22, p = .83) when shared variance between the two WM measures was accounted for. This outcome suggests that although executive WM processes, reflected in the shared variance of the verbal- and spatial-WM measures, may be instrumental in CRA performance, verbal-WM processes also make a unique and significant contribution to CRA problem solving. This is an important, and novel, finding.
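The way a bivariate relationship can vanish once a correlated predictor enters the model can be illustrated with the closed-form expression for standardized coefficients in a two-predictor regression. The sketch below is not the model fitted in the study, which may also have included the protocol covariate. The function name is hypothetical, and the verbal-WM correlation of .42 is an assumed value chosen for illustration; only the predictor intercorrelation (.66) and the spatial-WM correlation (.28) come from the text.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a regression of y on two predictors.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly collinear")
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# r_verbal is a HYPOTHETICAL value for illustration; it is not reported in the text.
r_verbal = 0.42
r_spatial = 0.28  # bivariate correlation reported for spatial WM
r_between = 0.66  # OSPAN-SSPAN correlation reported for the sample

beta_verbal, beta_spatial = standardized_betas(r_verbal, r_spatial, r_between)
print(round(beta_verbal, 3), round(beta_spatial, 3))
```

Under these assumed inputs the spatial coefficient falls close to zero while the verbal coefficient stays near its bivariate value, which is the qualitative pattern described above. The exact coefficients should not be expected to match the reported ones.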

However, as noted above, the critical test of the involvement of WM in insightful solution of CRA problems called for separate analysis of problems whose solutions were accompanied by a feeling of insight. We used two methods to index the occasions of insightful CRA problem solving. First, we examined whether either WM measure predicted individual differences in the number of solved CRA problems that were given an insight rating (complete or partial), out of the 60 total problems. Repeating the multiple regression on only those insightfully rated problems, we found again that verbal WM was a significant predictor (β = .39; t = 2.22, p = .03), whereas spatial WM was not (β = −.01; t = −0.08, p = .94). Figure 1 shows the scatterplot for the data that produced this pattern of results.

Fig. 1 Scatterplot of the relationship between the number of CRA problems solved with an insight rating and individual participants’ scores on each of the two working memory (WM) span measures (verbal and spatial). The solid line represents the regression for the verbal-WM span measure (OSPAN), and the dotted line represents the regression for the spatial-WM measure (SSPAN)

Our second method took into account an assumption from the special-process view: that insight can occur only after attempts at analysis have failed and an impasse is reached (see Footnote 2). On this assumption, problems solved through analysis should be excluded from calculations concerning the occurrence of insight. Accordingly, a more precise measure of the likelihood of problem solving through insight would be the number of problems solved with insight as a proportion of the problems remaining after excluding those solved through analysis. Examining the proportions of CRA problems solved via insight after discounting those solved via analysis again indicated a significant relationship between verbal WM and CRA performance (β = .44; t = 2.49, p = .02) and no relationship with spatial WM (β = −.04; t = −0.21, p = .83).
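The second index can be written as a short function. This is a sketch of the measure as described in the text, using a hypothetical function name and made-up counts for a single participant.

```python
def insight_rate(n_insight, n_analytic, n_total):
    """Proportion of insight solutions among the opportunities for insight.

    Opportunities are the problems remaining once analytically solved
    problems are removed from the total.
    """
    opportunities = n_total - n_analytic
    if opportunities <= 0:
        raise ValueError("no opportunities for insight remain")
    return n_insight / opportunities


# Hypothetical participant: 16 insight solutions and 9 analytic solutions
# out of 60 problems, giving 16 / (60 - 9).
print(round(insight_rate(16, 9, 60), 3))
```

The first index would divide the same 16 insight solutions by all 60 problems, so the two indices differ most for participants who solve many problems analytically.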

To assess the relative contributions of mnemonic and nonmnemonic processes underlying CRA performance, we next tested an expanded multiple regression model that included the two WM measures (verbal and spatial WM) as well as the two attention control measures (antisaccade and Stroop) as simultaneous predictors of overall CRA problem solving and of insightfully rated CRA problems only (by both methods). If only WM-dependent processes are important in CRA performance, we would expect that the inclusion of nonmnemonic attention control measures would have no further impact on the adequacy of the model, and that neither attention control measure would account for a significant portion of unique variance in CRA performance. On the other hand, if nonmnemonic attentional processes are additionally relevant to CRA performance, we might expect these measures to be significant in the model and the overall fit of the model to improve as a consequence.

The results of the expanded multiple regression model support the latter expectation. Specifically, inclusion of attention control measures in the model substantially improved the overall model fit (from R² = .19 to R² = .39) for individual differences in the total number of CRA problems solved, with both verbal-WM (β = .48; t = 2.90, p = .006) and antisaccade task (β = .43; t = 3.11, p = .003) measures accounting for a significant portion of the variance. This same pattern of results was obtained when only CRA problems solved with an insight rating were considered as the dependent variable in the model. Once again, the overall model fit improved (from R² = .19 to R² = .30), and individual differences in both verbal WM (β = .43; t = 2.45, p = .02) and antisaccade task performance (β = .33; t = 2.25, p = .03) were significant factors. Finally, in a model treating the two WM measures and the two attention control measures as simultaneous predictors of the proportion of insightfully solved problems out of the opportunities for insight (after discounting problems solved through analysis), we again found the same pattern, with verbal WM (β = .50; t = 2.99, p = .005) and antisaccade performance (β = .40; t = 2.87, p = .006) as significant factors.
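An improvement in fit between nested regression models is conventionally evaluated with an F test on the change in R². The sketch below shows that computation. No such test statistic is reported in the text, so the function name, the sample size of 53, and the count of four predictors in the larger model (ignoring any covariate) are all assumptions made for illustration.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R-squared gain when k_added predictors join a model.

    n: number of observations; k_full: predictors in the larger model.
    Returns (F, df_numerator, df_denominator).
    """
    df_num = k_added
    df_den = n - k_full - 1
    if df_den <= 0:
        raise ValueError("not enough observations for this model")
    f_value = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f_value, df_num, df_den


# R-squared values are those quoted for total CRA problems solved.
# n = 53 and k_full = 4 are ASSUMPTIONS; the result is illustrative only
# and is not a statistic reported by the study.
f_value, df_num, df_den = f_change(0.19, 0.39, n=53, k_full=4, k_added=2)
print(round(f_value, 2), df_num, df_den)
```

If the fitted models also contained the protocol covariate, or if some participants were missing scores, the degrees of freedom and the resulting F would differ from this illustration.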

Insight and solution time

Solution times for the solutions rated as being insightful (M = 10.14, SD = 6.06) versus noninsightful (M = 21.74, SD = 6.16) differed significantly in the present study, t(51) = 8.0, p < .001, and this effect of rating on solution times did not interact with protocol condition, F(1, 49) = 1.37, p = .25. The direction of the result contradicts the prediction based on the special-process view: Insightful solutions were produced significantly faster than noninsightful solutions.
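A within-participant comparison of this kind is typically a paired-samples t test on each participant's mean solution time under the two rating types. The following sketch shows the computation with a hypothetical function name and made-up times. It assumes one mean per rating type per participant, which the text does not spell out.

```python
import math


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom for second minus first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1


# Made-up mean solution times for four hypothetical participants.
insight_times = [8.0, 10.0, 12.0, 9.0]
analytic_times = [20.0, 22.0, 25.0, 19.0]

t_value, df = paired_t(insight_times, analytic_times)
print(round(t_value, 2), df)
```

A positive t here indicates that analytic solutions took longer than insight solutions for the same participants, which is the direction of the effect described above.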

Discussion

Our results provide replications and extensions of several important findings in the literature. In addition, they raise questions concerning our understanding of insightful problem solving in CRA problems. In support of other studies, we found that a significant proportion of CRA problems were reported as being solved suddenly, in what participants subjectively regard as an aha! experience.

The most important new finding from the present study is that both modality-specific and executive components of WM (i.e., those associated with the control of attention) are important in insightful solution of CRA problems (i.e., problems rated as having arrived with a feeling of suddenness and certainty). We examined CRA problems in this study in order to extend the findings from Chein et al. (2010) that visuospatial aspects of WM were related to solutions of the 9-dot problem, but measures of verbal WM were not. It was expected that, in the present study using verbal problems, the modality-specific effect would be reversed. The findings were consistent with that expectation. Specifically, a measure of verbal WM accounted for unique variance in overall solution rates, and in the solution of problems rated as being solved through insight, whereas a weaker correlation between visuospatial WM and CRA performance was found to be nonsignificant once shared variance was taken into account. Extending those results, we discovered that individual differences on at least one measure of nonmnemonic attentional control (antisaccade task performance) were also correlated positively with individual differences in both overall and insightfully rated CRA solutions, and accounted for variation in CRA performance above that captured by the WM assessment instruments alone.

One unexpected result was the lack of a significant correlation between WM and the solution of problems rated as having been solved via analysis. From both the special-process and business-as-usual accounts, one would have expected a positive relationship between these variables. Indeed, one could argue that the business-as-usual account that we have advocated is undermined by an obvious difference between insight and analytic problem solving in this study (e.g., only the former correlated with WM and attention). We would emphasize, however, that a relationship between WM and analytic problem solving has been shown time and time again in prior research, and we suspect that the null outcome in the present sample (the absence of a correlation between WM and analytic CRA solutions) is an artifact of the procedure. Specifically, we note that on average the participants in this sample reported having solved only a small number of problems by way of analysis: about half as many as were solved with an “insight” rating. Thus, it is possible that there was simply too restricted a range in which to discover significant correlations between WM and solution through analysis (i.e., too little variance was associated with the number of analytic solutions reported across participants; we acknowledge, however, that a Levene’s test for homogeneity of variances did not indicate a statistically significant difference in the variances obtained for the number of insightfully vs. analytically solved problems).

In combination, the present findings lead us to a multifaceted account of how WM and attention might function in solution of CRA problems. First, verbal WM may serve as a temporary repository in which the problem solver can simultaneously represent and evaluate the problem cue words and their respective associates, and can keep track of already-tested candidate solutions (i.e., associates of one word that are not associates of the other words). Attention control mechanisms likely play an essential supervisory role in executing and monitoring those modality-specific WM processes (Baddeley, 1986; Cowan, 1995; Engle, 2002). In addition, attentional processes may further act by allowing the solver to focus a search through long-term semantic memory for candidate solutions, and to prevent the solver from being unduly influenced by potential sources of interference, such as dominant associative responses that introduce irrelevant semantic information during the problem attempt. The positive correlation observed between the capacity for attention control and the solution of CRA problems (even those accompanied by a feeling of insight) seems to contradict a view of insightful problem solving that argues that insight results from a loosening of one’s attentional focus (in that view, solution is prevented by an excessive focus of attention on an “inappropriate” problem representation; Jung-Beeman et al., 2004; Knoblich, Ohlsson, Haider, & Rhenius, 1999; Sandkühler & Bhattacharya, 2008).

The “loosening-of-attention” view of insight (Wiley & Jarosz, 2012) has been supported by several studies that have found that WM capacity interacts in various ways with aspects of insight problems. For example, Ash and Wiley (2006; see also Ricks et al., 2007) designed sets of insight problems with few versus many initial moves available and found that WM span predicted performance on the problems with many moves available, but not on the problems with few moves available. Those results were interpreted as supporting the idea that restructuring in problem solving (the main impediment in problems with few moves available) was not dependent on strategic or analytic processes. Similarly, Gilhooly and Fioratou (2009) found that performance on insight problems was not linked with executive functions of inhibition or switching but was linked positively to measures of verbal and visuospatial working memory capacities. Those results were also taken as evidence for nonstrategic factors in solution of insight problems.

However, it should be noted that none of the just-cited studies directly assessed the critical aspects of insight—that is, impasse and restructuring (and neither did the present study). It was assumed, on the basis of problem selection or problem design, that impasse and restructuring were critical in solving the insight problems that were investigated, but we have no direct evidence that that conclusion is warranted. Therefore, at present there is simply insufficient evidence to explain possible conflicts between the conclusions drawn here and those from other studies. For further discussion of this and related issues, see Ash, Cushen, and Wiley (2009).

One further plausible explanation for the observed relationship between WM and insightful solution of CRA problems deserves mention. Specifically, it has previously been demonstrated that proactive interference that builds across task trials may drive a task’s dependence on WM, and that elimination of such proactive interference can remove correlations with WM capacity (e.g., Bunting, Conway, & Heitz, 2004). Likewise, in the present study, materials or solution attempts associated with earlier CRA problems might have produced proactive interference that affected later trials, and caused the observed correlations between WM and CRA performance. Although no solution words were repeated in our problem set, some cue words were redundant across trials, and it is possible that mentally evoked associates of cue or solution words were overlapping from trial to trial. Thus, the buildup of proactive interference cannot be ruled out as a possible contributor to the present findings.

Another important finding from the present study was the lack of support found for verbal overshadowing of insight. Although this was a null finding, we have several reasons to have confidence in it. First, we carried out several different analyses, and none produced a significant difference in performance that was indicative of verbal overshadowing. Second, the findings are consistent with those of Ball and Stevens (2009), who conducted a similar assessment of verbal-overshadowing effects with CRA problems and concluded that verbalization does not negatively impact the emergence of solutions through insight. Third, the lack of verbal overshadowing found here supports negative results observed with other insight problems, such as those reported by Fleck and Weisberg (2004, 2013) using the candle problem and several other insight problems (see also Gilhooly et al., 2010). Thus, our verbal-overshadowing findings join those of several other studies in supporting the business-as-usual view of insight. Of course, the absence of a significant overshadowing effect might relate to statistical power, and thus, testing of a larger sample might have allowed small but reliable differences to be found. However, an estimate of the effect size of the verbal-overshadowing effects obtained in prior studies (e.g., Schooler et al., 1993) produced statistical power estimates ranging between .5 and .8, suggesting that with the present sample of 53 participants, we had a reasonably high likelihood of detecting such effects, had there been any.

As a final point, consistent with the results of Sandkühler and Bhattacharya (2008; see also Jung-Beeman et al., 2004, and Kounios et al., 2006, Exp. 2), we found that solutions rated as having been solved through insight were produced more quickly than were noninsightful solutions. This result would seem to contradict the special-process view, which assumes that insightful solutions occur only after analytic solutions have been exhausted and impasse is reached. However, it might be argued that the shorter average solution latency observed for problems receiving an insight rating signals a limitation of the self-report methods used to assess the occurrence of insight in these problems, not the inadequacy of the special-process view. That is, there is a possibility that the insight ratings provided by our participants, as well as those in prior studies (Bowden & Jung-Beeman, 2003a; Jung-Beeman et al., 2004; Kounios et al., 2006; Sandkühler & Bhattacharya, 2008), were in response to the suddenness of the solution, per se, rather than an outcome of the sequence postulated by the special-process view: failure ⇒ impasse ⇒ restructuring ⇒ aha! For example, analytic solutions to CRA problems may occur suddenly, and hence be given an “insight” rating, even though an impasse was never reached (and no subsequent insight occurred). If so, insight ratings in the studies conducted heretofore may simply have reflected fast (that is, “sudden,” as per the instructions) analytic CRA solutions. Indeed, in the studies conducted by Bowden, Jung-Beeman, and colleagues and by Kounios and colleagues, the occurrence of an insight rating was taken as prima facie evidence of the insight sequence, even though the speed of solution (e.g., median solution time of 5.8 s in Kounios et al., 2006) makes it unlikely that the participants had experienced impasse and restructuring in solving the problems. Sandkühler and Bhattacharya (2008), who provided data on the occurrence of impasse prior to CRA solutions, reported that participants took an average of 27 s before indicating (by buttonpress) the occurrence of impasse (data were not provided regarding the insight ratings or final solution times for problems on which an impasse was indicated).

Although such findings, considered in combination, suggest that the subjective reports of insight in CRA problems may be incompatible with the sequence postulated by the special-process view, we suggest that it will be necessary to ascertain in more detail the processes involved in CRA solutions before drawing firm conclusions concerning their status as vehicles for the study of insight. A critical need is more direct assessment of the insight sequence, and a move beyond reliance on self-report indices, perhaps through the use of feeling-of-warmth ratings and the analysis of verbal protocols (Fleck & Weisberg, 2004, 2013). In addition, other questions might be raised concerning fine points of the interpretations offered here for our results (for further discussion, see Fleck & Weisberg, 2013). Since protocols have not been analyzed from participants working on CRA problems, some of those points cannot be addressed at this time.

In summary, the present results suggest the involvement of WM and attentional processes in the solution of CRA problems, even (and perhaps especially) when that solution brings with it a subjective aha! experience. Together with solution-latency findings and the failure to obtain a verbal-overshadowing-of-insight effect, these findings are difficult to reconcile with the position that insight solutions arise through unconscious processes such as spreading activation or the loosening of attention. Although at present there is a critical lacuna in the study of possible insight in CRA problem solving, owing in part to a reliance on self-report measures, a reconsideration of what causes participants to give an insight rating to CRA problems has broad implications, in that it seems to force one of two significant conclusions. On the one hand, it could be concluded that the solution of CRA problems does not, in fact, involve insight. Such a conclusion would undermine the now large number of studies that have used this class of problems to characterize the nature of insight and its relationship to creativity. For example, much of the present literature on the neurocognitive study of insight (Bowden & Jung-Beeman, 2003a; Bowden et al., 2005; Jung-Beeman et al., 2004; Kounios et al., 2008; Kounios et al., 2006; Sandkühler & Bhattacharya, 2008) relies on this task and the subjective-report method as the basis for an examination of brain systems underlying insightful and noninsightful problem solving. If insight ratings reflect the speed of solution, rather than the occurrence of a meaningful aha! experience that accompanies solution arising from impasse and restructuring, those studies have been investigating brain areas involved in sudden analytic solutions to problems, not insight.

On the other hand, it could be concluded that CRA problems do involve insight, but that the type of insight indexed by subjective ratings is not of the type characterized by the special-process view. However, this conclusion raises a very difficult question: If the “insight experience” itself cannot be trusted as a marker of that special process, what observable phenomenon could be used in its place? As has been demonstrated by a growing number of studies, the present study among them, neither the presence of a verbal-overshadowing effect nor a limited reliance on WM provides a more satisfactory litmus test for the involvement of insight in solution. Using an individual-differences approach, we have now shown that with both spatial (Chein et al., 2010) and verbal (the present study) insight problems, problem solvers can come to see a problem in a new light, not through automatic and unconscious processes, but through the intentional and effortful deployment of those shared cognitive resources.

A final comment on insight versus analysis: a questionable dichotomy?

The difficulty that researchers have encountered in finding an approach that can reliably (and meaningfully) distinguish insight from analytic problem solving might be taken as evidence that these two “modes” of problem solving are in fact underpinned by a shared set of cognitive resources. However, as we noted earlier in passing, the present results, although not supporting the special-process view of insight, can also be taken as raising problems for a “single-process” interpretation of the business-as-usual view. According to the latter view, performance on insight and analytic problems should be affected in the same ways by various relevant variables, such as visual versus verbal WM spans, and so forth. That result was not found in the present study: Verbal WM span correlated with problems rated as being accompanied by insight, but not with those rated as having been solved through analysis. An alternative response to this inconsistent pattern of results would be to raise questions about the dichotomy suggested by the labels “insight” and “analysis” when they are applied to problem solving. We have to this point used those two terms without question. That usage, however, implies an either/or stance toward problem solving, and it might be more useful to question the assumption that problems can be solved either through analysis or through insight, with no overlap in the processes involved.

A number of previous researchers (e.g., Fleck & Weisberg, 2004, 2013; Jones, 2003; Weisberg, 2006) have proposed that a better way to approach problem solving is through synthesis of those seemingly opposed conceptions. Jones, for example, proposed that it might be useful to combine the neo-Gestalt view (e.g., Knoblich et al., 1999; Ohlsson, 1992, 2011) with the heuristically based approach of MacGregor et al. (2001). Fleck and Weisberg (2004, 2013) independently developed a similar idea, incorporated in a framework that assumes that all problems are dealt with through a series of stages, as the individual attempts to bring his or her knowledge to bear on the problem. This series of stages can be looked upon as encapsulating a range of solution methods that could be applied to any problem that an individual encounters. Different methods of solution are brought to bear, depending on the match between the individual’s knowledge and the problem. Fleck and Weisberg (2004, 2013) collected data from several classic insight problems and found that a range of solutions were applied to them. In a relatively small number of cases, people solved insight problems through the sequence proposed by the Gestalt psychologists: An impasse led to restructuring of the problem. In the great majority of cases, restructuring occurred, not in response to impasse, but as a result of new information that initiated a new search of memory. Thus, we have a circumstance in which what one can call a “hybrid” of analysis and insight occurred: Restructuring (assumed to be part of the “classic” insight sequence) occurred as a result of a new search of memory (part of the analysis sequence). Such findings show that a given problem can be solved in many ways and suggest that we should allow the data to shape our theories of problem solving, rather than vice versa.