Insight is a subjective experience of suddenly solving a problem when, previous to this “aha” experience, the solver felt blocked (Ohlsson, 1992). Research on insight problem solving has increased in the past 10 years (e.g., Bowden & Jung-Beeman, 2003; Chronicle, MacGregor, & Ormerod, 2004; DeYoung, Flanders, & Peterson, 2008; Fleck & Weisberg, 2004; Gibson, 2004; Helfenstein & Saariluoma, 2007; Jones, 2003; Knoblich, Ohlsson, Haider, & Rhenius, 1999; Schwert, 2007). Most of this research has been conducted with individuals, where participants attempt to solve problems individually, even when tested in a group setting. Yet, much real-world problem solving is a group endeavor (Larson, 2010; McNeese, Salas, & Endsley, 2001; Nijstad, 2009). Do experimental factors that facilitate or hinder insight operate at the group level? In the present study, we sought to extend previous research on the priming of insight in individuals (Gibson, 2004) to the small-group level, where 2–4 individuals worked together on an interactive insight problem to produce a solution. We compared group data with individually tested data in Experiment 1 in order to examine the effects of the group on insight, and, in Experiment 2, we examined whether the presence of the prime throughout the solving period is needed for insight.

Background on insight

The contemporary view of insight is that it requires a restructuring of the mental representation of a problem (e.g., Knoblich et al., 1999) or a change in heuristic when progress monitoring reveals an impasse (e.g., MacGregor, Ormerod, & Chronicle, 2001; Ormerod, MacGregor, & Chronicle, 2002). In the restructuring view, activation of information shifts from incorrect to correct information, and the mechanism for inducing a representational change is constraint relaxation or chunk decomposition (Jones, 2003; Knoblich et al., 1999; Sio & Ormerod, 2009; Wiley, 1998). The initial mental representation is formed by ideas holding the most activation, but once constraints are relaxed, other associative ideas receive enough activation to enter consciousness. In the progress-monitoring view, solvers use heuristics to narrow the possibilities in the problem space, where distance from goals drives the solving process (e.g., Chronicle et al., 2004); progress is evaluated against a criterion of satisfactory progress. The mechanism for insight is monitoring one’s progress and relaxing the requirements for maximizing progress.

Ash and Wiley (2006) looked at individual differences in working memory to examine whether restructuring a mental representation is a controlled or an automatic process. They showed that higher working memory span, an index of the ability to control attention, predicts insight problem solving only when the problem’s initial search phase matters to the solution. For problems that are solved by restructuring only, working memory span does not matter. Thus, it is the initial representation of the problem that involves the control processes of working memory, but restructuring happens more automatically, without demands on working memory resources.

Priming insight

Automatic or noncontrol processes are terms related to implicit memory. Hence, it is not surprising that priming paradigms have been used to study insight (e.g., Dorfman, Shames, & Kihlstrom, 1996; Gibson, 2004; Helfenstein & Saariluoma, 2007; Lockhart & Blackburn, 1993; Lockhart, Lamon, & Gick, 1988). The prior presentation of information (the prime) implicitly biases the solution for that information. For example, in the classic two-rope problem, Maier (1931) “accidentally” walked by and brushed the rope, so that it swung back and forth while participants sought a solution. More participants found the pendulum solution in this primed condition, as compared with a condition in which no prime was given. The implicit nature of the hint is inferred, because none of the participants stated that Maier’s swinging the rope affected their thinking. Thus, conscious thought or the explicit use of primed information may not be necessary for insight. If the environment can foster or hinder insight, this could be due to an automatic, implicit process. Implicit memory is sensitive to environmental information, as in modality and typography effects, but not to spatial/temporal information (e.g., McKone & French, 2001).

Neuropsychological studies support a distinction between implicit and explicit memory (Shimamura, 1993), and, more importantly, they also support a distinction between an explicit use of knowledge to solve problems and the implicit activation of remote associates needed for insight. The automatic spread of activation to remote information, caused by a prime, strengthens the probability that the remote information will affect behavior. Increased activation of appropriate remote associations could facilitate a restructuring of a problem that produces the sudden experience of insight (Knoblich et al., 1999). For example, Bowden and Jung-Beeman (2003) showed that a target’s naming latency is primed following unsolved trials in the Remote Associates Test. The words on the unsuccessful trial sent activation to all associated information, but the activation was not strong enough to produce a successful solution on the trial, yet it was strong enough to speed naming for it afterward. Such data support the ideas that the left hemisphere is responsible for focused, explicit attention and fine semantic coding (where solutions reach consciousness), that the right hemisphere shows activation for alternative and remote associations (where activation increases when one tries to resolve the impasse), and that activation of remote associations is inhibited by the left hemisphere (Fiore & Schooler, 1998; Rauch, 1977).

Birch and Rabinowitz (1951) showed that a prime prior to problem solving can affect the means of expressing insight in the two-rope problem. They gave participants an initial task that involved examining either a relay or a switch prior to solving the two-rope task. The control group received no initial task and was equally likely to use the relay or the switch as the weight for the pendulum solution. All of the participants in the group primed with the relay used the switch as the weight on the pendulum, and 78% of the group primed with the switch used the relay as the weight. Thus, the priming task reinforced the function of that object (its function is not as a weight) and hindered that object from being used in solving the problem. The presentation of information prior to a task biases the mental representation (e.g., Smith, 1995), which may hinder insight when one fixates on maintaining that biased representation (e.g., Duncker, 1945).

Gibson (2004) showed that an object that reinforced a nondominant, correct interpretation of an ambiguously worded problem facilitated insight in individuals, whereas an object that reinforced the dominant, incorrect interpretation of the problem hindered it. In her experiments, after writing sentences about a set of objects, the participants wrote as many solutions as they could think of to an insight problem. When the object related to the nondominant solution had previously been presented, 47% of participants included that solution, whereas when the object related to the dominant representation had been presented, only 17% listed the nondominant solution; the nondominant solution appeared 23% of the time in the control condition (no biasing objects). The present experiment extends Gibson (2004) by asking whether insight can be controlled by objects in the environment at the group level.

Group versus individual problem solving

We examined differences between groups and individuals by comparing data from group and individually tested participants in Experiment 1. Groups allow their members to attend to and reflect upon ideas (Paulus & Yang, 2000). Groups do better than individuals when a problem is complex (e.g., Brophy, 2006; Laughlin, Bonner, & Miner, 2002; Laughlin & Futoran, 1985; Moshman & Geil, 1998). Laughlin and Futoran (1985) noted that group discussions allow for the recognition of a good hypothesis when one is generated, not that groups produce solutions distinct from those produced by individuals. Individuals working alone have greater difficulty in recognizing correct or best answers, rejecting errors, or exploring multiple strategies (Laughlin et al., 2002). Individuals may also find hints/primes distracting (Maier & Casselman, 1970); in a group, some may find hints less distracting than do others and may be able to get the group to use the hint.

On the other hand, some members might censor themselves from contributing creative ideas, which could keep remote alternatives from being discussed (Maier & Casselman, 1970). A shared mental model, a characteristic that contributes to group performance (Mumford, Feldman, Hein, & Nagao, 2001), could, for the current problem, hold a group back from restructuring the dominant but incorrect representation that members share and thus hurt performance. Mumford et al. noted that shared mental models narrow the ideas being discussed, which can keep the group focused and help them evaluate the generated ideas. Mumford et al. primed the appropriate mental model for a problem by administering a survey related to the content of the problem; they found that priming helped individuals as much as groups but concluded that groups were better at developing good ideas once they had them.

Assuming that groups allow for the exploration of nondominant interpretations of a problem, research comparing individual and group problem solving (e.g., Brophy, 2006; Maier, 1970) suggests that when solving an insight problem, groups have a better chance than do individuals to restructure the problem, consider remote associations as alternative representations, and recognize potential threads of questions that help solve the problem.

The present study

We used Gibson’s (2004, Experiment 3) verbal problem in a small-group situation. This type of ambiguously worded problem had been used previously in Weisberg (1988), in which the paradigm that we extended to a group situation is illustrated. Weisberg (1988) presented a situation, and the participant asked the experimenter questions that could be answered by “yes” or “no” to find the solution. Weisberg’s (1988) problem was the following: “Dan came home and found Charlie dead on the floor and Tom in the same room. On the floor Dan saw some broken glass and water. How did Charlie die?” (p. 155). The participant arrives at the solution that Charlie was a fish that died from lack of oxygen when Tom the cat knocked over Charlie’s fishbowl, which broke. Weisberg (1988) noted that participants begin by asking about human deaths until they find that Charlie died of lack of oxygen, and they ask about a drinking glass until they find it was fishbowl glass. He noted that novel solutions are not the starting point but the endpoint in problem solving. Weisberg (1988) noted that the basis for restructuring is that human names may be used to name animals and classified this problem as requiring pure insight. In a similar vein, the problem we used requires the restructuring that a common noun (“bicycles”) can also be a brand name, Bicycles playing cards. The problem was the following: “A man is found dead, lying among 53 Bicycles. The only other objects in the room are a table and some chairs. What happened?” The solution is that the man was gambling using Bicycle brand cards, was cheating (had an extra ace up his sleeve), was caught, and was shot dead by another man, who left the room, leaving him lying with the 53 Bicycles, a table, and chairs in the room.

The present research examined whether the influence of an object that reinforced either the dominant but incorrect interpretation (bike) or the nondominant and correct interpretation (cards) of “Bicycles” would affect a group’s ability to find the cheating at poker with Bicycle brand cards solution. As a group, participants asked the experimenter questions that were framed for “yes” or “no” answers and freely talked with each other to resolve the impasse and find the solution. The groups, therefore, received information and feedback from the experimenter and from each other to help monitor their progress.

As has been noted by many who study insight and problem solving, hints do not always help as much as hindsight suggests (Kershaw & Ohlsson, 2004; Weisberg, 1988). Best (2000) provided feedback to participants’ guesses in a Mastermind-like game and noted that strategy shifting was dependent on familiarity with the feedback in previous examples. Thus, we may not realize that a new solution is required if the feedback looks like it fits our mental representations. The use of the yes/no paradigm to study insight problems with groups is new, and it is unclear how well groups are aided by yes/no feedback. Ash and Wiley (2008) noted that, with insight problems, if one is blocked, one knows it and continues to work toward the solution, which is different from cases in which one thinks that one has resolved an impasse but is incorrect. In the present paradigm, the experimenter’s answers let the group know that the impasse was not resolved correctly. Chronicle et al. (2004) noted that feedback or hints should “encourage strategies for recoding, remembering, and reusing solutions” (p. 26). Being told “yes,” “no,” or “irrelevant” does not suggest new strategies for the groups to use, and thus the feedback mostly informs groups that their representation is wrong but not how to restructure it.

Hypotheses

We hypothesized that the cognitive processes involved in problem solving would extend to the group level, showing the same pattern for individual insight problem solving as that reported in Gibson (2004) for our experimental manipulation: We expected that the experimenter’s bike would slow finding a solution and that the experimenter’s cards would speed finding a solution, relative to a control group. We reasoned that the bike in the room would reinforce the notion that the “53 Bicycles” in the problem referred to vehicles and thus would serve to fixate participants on the wrong mental representation and prevent activation of a card game gambling scenario. The cards would serve to activate remote associations to help restructuring, such as concepts related to gambling, possible activities at a table, the size and shape of the 53 “bicycles”, or why 53 might be a relevant number (52 cards in a deck). An additive activation of the remote solution would increase with any questions about these topics and the presence of cards. Furthermore, we hypothesized that the priming effect would occur regardless of whether the objects were in plain view for the solving period (Experiment 1) or were present just prior to the start of the task (Experiment 2).

For insight to occur, restructuring requires more than one important piece of information. For example, knowing that the glass was a fishbowl did not lead to the immediate realization that Charlie was a fish in Weisberg’s (1988) research. Similarly, we hypothesized that the knowledge that the men played cards or poker would not necessarily result in the realization that the 53 “bicycles” were cards. An analysis of the questions asked by participants during the solving period could help clarify what information participants used to achieve restructuring. Questionnaire data may also show that, as in Maier (1931), our participants did not believe that the bike in the room hurt them or that the cards in the room helped them in solving the problem.

Given that individuals have the disadvantage of not hearing other people’s questions and the answers, of not talking with others to clarify their own thinking, and of not receiving aid in evaluating the quality of threads of questioning and are less likely to recognize a good question or thread of questions (e.g., Laughlin & Futoran, 1985), we predicted that groups would solve the problem more quickly than individuals in all conditions. Comparing questions asked by groups and individuals may provide clues as to the similarities and differences between individuals and groups in reaching insight.

Experiment 1

Method

Participants

We recruited undergraduates up to 4 at a time. Altogether, there were 18 groups of size 2, 17 groups of size 3, and 10 groups of size 4, for a total of 127 participants; there were 15 groups in each of the three conditions. An additional 21 undergraduates were tested individually, 7 per condition. Participants received either course credit or cash in appreciation of their participation.
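The reported totals can be checked with a few lines of arithmetic. The sketch below uses only the counts given in this paragraph; the variable names are illustrative and not part of the study.

```python
# Group counts by size, as reported in the Participants paragraph.
groups_by_size = {2: 18, 3: 17, 4: 10}

# Number of groups and number of participants tested in groups.
n_groups = sum(groups_by_size.values())
n_group_participants = sum(size * count for size, count in groups_by_size.items())

# Individually tested participants, 7 per condition.
n_individuals = 21

print(n_groups)                  # groups in total (15 per condition x 3 conditions)
print(n_group_participants)      # participants tested in groups
print(n_groups + n_individuals)  # units of analysis when groups and individuals are pooled
```

The counts are internally consistent: 45 groups containing 127 participants, and 66 units of analysis once the 21 individuals are added.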

Materials

The primes were a 27-in. women’s 10-speed bicycle and a red-backed deck of 52 Bicycle playing cards (box not used). A tape recorder recorded the questions, and a stopwatch timed the duration of the solving period. A paper questionnaire, administered at the end of the questioning phase, collected information about whether each member had noticed the primes in the room and had known that Bicycles were a brand of playing cards, and about their attribution of the group’s success or failure.

Design

Condition was a between-subjects variable. In the bike condition, the bicycle was present at the front of the room, next to the experimenter, who sat at a table facing the participants. In the cards condition, the experimenter played solitaire at the table, under the cover story that this was done to relieve any pressure during silent moments by not having the experimenter stare at the participants. In the bike and control conditions, the experimenter kept her eyes on a sheet of paper on the table to avoid staring. Across all conditions, she kept responses as neutral as possible, so as not to be revealing hints through intonation or body language. The students sat within 10 ft of the experimenter’s table and could see the bike and cards clearly. In the control condition, the bicycle and cards were stored in a closet. Although the experimenter was not blind to the design and hypotheses, she attempted to respond truthfully and in similar neutral ways across conditions.

Procedure

After the consent form had been signed, the experimenter read the problem aloud and asked whether anyone had heard the problem previously and knew the solution; if so, the participant wrote down the solution for verification and was dismissed without penalty. The tape recorder and timer were started, the problem was read aloud again, and questioning began. After the problem had been solved or 30 min had passed, each participant completed the questionnaire, was debriefed, and was asked not to tell anyone the problem or the solution, in order to keep the participant pool unspoiled.

Results

Solution time

The number of seconds each group or individual took to solve the problem was tabulated; if a group or individual failed to solve the problem, we entered 1,800 sec. The problem proved very difficult for the individuals; of the 21 we tested, only 1 individual solved the problem (1,620 sec, cards condition). The mean solution times for the groups for each condition are displayed in Fig. 1. As was predicted, the bike hindered solving time and cards facilitated solving time, relative to the control condition. Given that it took the control group an average of 1,165 sec to find the solution, the problem is considered to be difficult for groups but solvable.

Fig. 1 Mean solving time (in seconds) as a function of condition, Experiment 1. Error bars represent the standard errors

A between-subjects ANOVA showed significant mean differences of condition, F(2, 42) = 6.53, p = .003, η2 = .24; the bike condition differed from the control, t(28) = 2.08, p = .046, and cards, t(28) = 3.74, p = .001, conditions. The cards condition was not statistically different from the control condition. However, if we exclude those groups that did not solve the problem—those with the artificial 1,800 sec entered into the data set—the cards condition did statistically differ from the control condition (M = 629.67, SD = 311.04 vs. M = 934.00, SD = 366.84, respectively), t(21) = 2.15, p = .043. In order to determine whether the groups’ mean solving time was significantly shorter than that of the one successful individual’s time of 1,620 sec, we performed one-sample t tests on groups for each condition. The mean solving time for the cards and control conditions was significantly shorter than 1,620 sec, t(14) = 5.25, p < .001, and t(14) = 3.50, p = .004, respectively, but the mean solving time for the bike condition was not statistically shorter than 1,620 sec.
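For a one-way between-subjects ANOVA, the effect size η2 can be recovered from the F ratio and its degrees of freedom, because η2 = SS_between / SS_total = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the omnibus test reported above; the function name is illustrative, and the calculation is a consistency check on the reported values rather than a description of how the analysis was run.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared for a one-way ANOVA, computed from F and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)

# Omnibus effect of condition on solution time: F(2, 42) = 6.53.
eta_sq = eta_squared_from_f(6.53, 2, 42)
print(round(eta_sq, 2))
```

The result rounds to .24, which agrees with the reported effect size.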

The effect of the primes also appears in the distribution across time for groups that solved the problem. Table 1 shows the number of groups that solved the problem within the first 300 sec and for every 300-sec interval thereafter. The last column, labeled “1,800 s,” shows the number of groups that did not solve the problem. In the bike condition, no group solved the problem in less than 600 sec, whereas in the cards condition, six groups (40%) solved the problem in less than 600 sec.
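The tabulation behind Table 1 can be sketched as follows. The example times are hypothetical and are not the study's data. The exact bin boundaries (for example, whether a time of exactly 300 sec falls in the first or the second interval) are not stated in the text, so the half-open intervals used here are an assumption.

```python
CEILING_SEC = 1800    # time limit; also the value entered for units that did not solve
BIN_WIDTH_SEC = 300   # width of each interval in Table 1

def interval_label(solution_time_sec):
    """Return the interval column for one solution time (assumed half-open bins)."""
    if solution_time_sec >= CEILING_SEC:
        return "1,800 s"  # did not solve within the time limit
    lower = (solution_time_sec // BIN_WIDTH_SEC) * BIN_WIDTH_SEC
    return f"{lower}-{lower + BIN_WIDTH_SEC - 1}"

def tabulate(times):
    """Count how many solution times fall into each interval."""
    counts = {}
    for t in times:
        label = interval_label(t)
        counts[label] = counts.get(label, 0) + 1
    return counts

# Hypothetical solution times in seconds, for illustration only.
example_times = [120, 450, 590, 1250, 1800, 1800]
print(tabulate(example_times))
```

Keeping the 1,800-sec entries in their own column, as Table 1 does, separates groups that ran out of time from groups that solved the problem late.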

Table 1 Number of groups that solved the problem by 300-sec intervals as a function of condition, experiment 1

Table 2 shows the distribution of groups by size, including individually tested participants, and includes the mean number of questions asked by condition. The conditions were randomly determined prior to participants’ attendance, because the cards or the bike was visible in the room at their arrival, and no-shows or participants knowing the problem resulted in fewer size 4 groups for the bike condition than for the other two conditions. An ANCOVA was performed on solution time for groups only, with group size as the covariate. The effect of size was not significant, and condition still was significant, F(2, 41) = 6.29, p = .004, η2 = .24. When the 1,800-s interval was removed from the data set, size was not significant, and condition remained significant, F(2, 30) = 15.52, p < .001, η2 = .51.

Table 2 Group size, mean number of questions asked, and number solving the problem as a function of condition, experiment 1

Questions asked

We examined the number of questions in order to see whether quantity mattered, but this variable must be interpreted with caution. Some questions were half-formed and were talked over more than others, and the number used is a best guess. Furthermore, we cannot tell whether more questions being asked would imply that there was a good group process. For example, some groups needed only one question (e.g., “Do they have wheels?”) in order to conclude that their representation of “bicycle” was in error, whereas other groups used many questions (e.g., “Do they have handlebars? Do they have seats? Do they have chains?”). We counted repeated questions as separate questions. Furthermore, the number of questions did not include conversation among group members.

An ANCOVA on the number of questions for groups, with group size as the covariate, revealed no effect of size and an effect of condition, F(2, 41) = 8.00, p = .001, η2 = .28. Planned comparisons (LSD) showed that the bike condition differed significantly from both the cards and the control conditions, each p < .001, but no significant difference was found between the control and cards conditions. An ANCOVA on number of questions that included the individually tested participants showed an effect of size, F(1, 62) = 4.39, p = .04, η2 = .07, and an effect of condition, F(2, 62) = 5.88, p = .005, η2 = .16. To further examine the effect of individuals versus groups on questions asked, a 2 (individual, group) × 3 (bike, cards, control) ANOVA revealed that the groups (M = 111.07, SD = 53.21) asked significantly more questions than did the individuals (M = 86.62, SD = 29.44), F(1, 60) = 4.74, p = .033, η2 = .07, and there was a group × condition interaction, F(2, 60) = 3.44, p = .038, η2 = .10. The pairwise comparisons from this ANOVA showed that the bike and control conditions significantly differed from each other (p = .046) but that the bike and cards conditions did not (p = .060). The group × condition interaction arose because the number of questions asked was flat across the three conditions for individually tested participants, whereas groups asked an elevated number of questions in the bike condition, as compared with the cards and control conditions.

Restructuring is considered to require more than one piece of crucial information, and therefore, we hypothesized that knowing that the men had played cards would not immediately lead to the realization that the “bicycles” were cards. Prior to solving the problem, 25 groups learned that the men had played cards or that cards were in the room. Of these, 8 groups (4 control, 2 bike, and 2 cards groups), or 32%, continued by asking more than 2 questions as if the Bicycles and the cards were separate objects (number of questions, M = 24.25, SD = 26.46; minimum = 3, maximum = 76). Furthermore, of these 25 groups, the knowledge that the men had played cards came late in the period for 24 of them, and all knew first that the “bicycles” were not vehicles. The one remaining group learned early that the men had played cards (question 19) and needed 40 additional questions, the last of which were about the “bicycles” not having the parts or the functions of a vehicle, before asking whether the “cards and bikes have something to do with each other.” Further support for the hypothesis was found with our individual participants. Only 2 of our 21 individually tested participants knew that the men in the problem had played a card game, but both continued asking questions; the individual who solved the problem needed an additional 20 questions before he realized that the “bicycles” were cards, and the other participant (also in the cards condition) asked an additional 36 questions, before time ran out, without realizing that the “bicycles” were cards. Thus, both the group and individual data provide support for the hypothesis that the main clue, that the men had played cards, does not in itself bring about restructuring for the insight that the “bicycles” are cards. Until the representation of “bicycles” changes from vehicle-like representations to the brand name of cards, it is possible for groups and individuals to hold both representations (vehicles and cards) separately.

We examined the questions asked for clues as to how the insight did or did not occur. We coded the questions for ten themes and calculated the proportion of questions for each theme for each group or individual; we report the average proportions by condition in Table 3 (for groups) and Table 4 (for individuals), organized by whether the problem was solved and by condition.
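The proportion measure described above can be sketched as follows: each group's (or individual's) questions are coded by theme, each theme's count is divided by that unit's total number of questions, and the resulting proportions are averaged across units. The theme labels and the example data are hypothetical and stand in for the coding scheme, which is not reproduced here.

```python
from collections import Counter

def theme_proportions(coded_questions):
    """Proportion of one unit's questions that fall under each theme."""
    if not coded_questions:
        raise ValueError("a unit must have asked at least one question")
    counts = Counter(coded_questions)
    total = len(coded_questions)
    return {theme: n / total for theme, n in counts.items()}

def mean_proportion(units, theme):
    """Average a theme's proportion across units; a theme never asked about counts as 0."""
    return sum(theme_proportions(u).get(theme, 0.0) for u in units) / len(units)

# Hypothetical coded question sequences for two groups (illustration only).
group_a = ["bike_parts", "bike_parts", "number", "misc"]
group_b = ["bike_parts", "misc", "misc", "misc"]

print(theme_proportions(group_a))
print(mean_proportion([group_a, group_b], "bike_parts"))
```

Averaging proportions rather than raw counts keeps a group that asked many questions from dominating the condition mean.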

Table 3 Mean (and standard deviation) proportions of questions asked by groups, theme, solving, and experimental condition, experiment 1
Table 4 Mean (and standard deviation) proportions of questions asked by individuals, theme, solving, and experimental condition, experiment 1

Questions concerning alternative interpretations of “bicycles” show the participants’ efforts to move from the initial representation of “bicycles” as bikes. Group solvers (M = .022, SD = .033) asked significantly more questions than did group nonsolvers (M = .002, SD = .004) about interpretations of “bicycles” other than as vehicles, F(1, 43) = 4.67, p = .036, η2 = .10. Also, the individual solver’s proportion of .03 was significantly higher than the individual nonsolvers’ proportions, t(19) = 12.34, p < .001.

The themes concerning the number of Bicycles, activity at the table, and manner of death potentially helped move participants toward a scenario in which the men had played cards and the dead man had been shot for cheating. No significant differences were found between group solvers and group nonsolvers for these themes, implying that this helpful information alone did not bring about restructuring of the problem. The individual solver, however, allotted a significantly larger proportion of questions to the number of Bicycles than did individual nonsolvers, t(19) = 12.34, p < .001.

The questions concerning the parts or riding function of the Bicycles, as well as who owned them (ownership is irrelevant to cards), provided evidence that the participants thought of a vehicle, as well as showed how dominant the focus on a vehicle was, as compared with other themes. Those who solved the problem asked a higher proportion of questions about bike parts and functions (M = .12, SD = .08) than did those who did not solve the problem (M = .08, SD = .06), but this difference did not reach significance. In the bike condition, particularly, those who solved the problem asked .13 of their questions about the bike, whereas the proportion for those who did not solve the problem was .06, but this difference fell just short of significance, t(13) = 2.52, p = .06. There were no significant differences between groups for the ownership theme or with individuals for these themes. We averaged all the other themes, except for those in the miscellaneous category, and compared the mean proportion of questions asked for all the other themes (M = .04, SD = .01) with the mean proportion of questions about bike parts/function (M = .09, SD = .08) in a 3 (condition) × 2 (problem solving) × 2 (groups or individuals) × 2 (all vs. bike themes) mixed ANOVA. There was an effect of themes, F(1, 56) = 6.22, p = .016, η2 = .10, with no interactions of themes with condition, problem solving, or groups, all Fs < 1.61, ps > .20.

Questions concerning the occupations, the room, and the relationship between the men were irrelevant themes for our solution but are helpful for solving death mysteries; we coded for them to see whether the groups that solved the problem asked as many questions about the solution-irrelevant themes as did those who did not solve it. No significant differences were found between solvers and nonsolvers for either groups or individual participants.

The nine themes mentioned above accounted for about a third of all questions asked; the remaining questions that groups or individuals asked were placed into a miscellaneous category (e.g., motivation, weather, or emotions). Solvers and nonsolvers did not differ on the proportion of miscellaneous questions for groups and individuals.

An examination of the questions asked indicated that the individually tested participants did not follow up on questions and often jumped around across themes, whereas the groups appeared more focused on a thread of questioning. For example, after learning that the number 53 was important to the solution, an individual’s next question was on an unrelated theme, whereas a group asked a related question. On average, the groups stayed on a theme more times (M = 8.38, SD = 4.72) than did the individuals (M = 5.33, SD = 3.69), t(64) = 2.60, p = .011. The proportion of chunked questions, as a function of total questions asked, was significantly higher for groups (M = .08, SD = .03) than for individuals (M = .06, SD = .02), t(64) = 3.44, p < .001. In addition, the total number of chunked questions was larger for groups (M = 25.93, SD = 16.31) than for individuals (M = 17.86, SD = 12.09), t(64) = 2.02, p = .047. The total number of questions within chunks divided by the number of remaining questions also revealed that groups were more focused (M = .12, SD = .05) than were individuals (M = .08, SD = .04), t(64) = 3.09, p = .003. Thus, individuals may have failed to solve the problem within 30 min because they did not have the benefit of others in a group to help them focus on lines of questioning.

In Tables 5 (groups) and 6 (individuals), we present sequential questions that were asked immediately prior to the question of whether the Bicycles were cards, ranked from fastest to slowest for each condition. An examination of the questions indicates no obvious piece of information that caused or prevented insight. For example, the 410-s cards group went from learning that the “bicycles” were not the kind that you ride to learning that the “bicycles” were cards without any indication of how they did it. The fast 300-s cards group asked about the chairs and suddenly made the change to cards. For the slower groups, the realization followed a question about the men’s emotions (bike, 1,575 s), the question about poker came after a question about selling bicycles (cards, 1,210 s), and the number clue after additional parts had been asked about (control, 1,190 s) helped form the realization. The did-not-solve groups asked questions that many of the problem-solved groups had asked earlier in questioning and may have acquired late information that the problem-solved groups acquired earlier. It is not clear how thinking about flowers (bike, 1,741 s) or remote-controlled objects (cards, 1,118 s) or thinking that the “bicycles” were somewhere else (control, 290 s) led to the realization that the “bicycles” were cards. However, a comparison between the questions asked by groups and individuals shows that immediately prior to realizing that the “bicycles” were cards, the groups asked about the number 53, the parts/function of Bicycles, and game/card-playing/poker, whereas few questions from the individuals were on these themes. The individuals’ questions that appear in Table 6 occurred earlier in most of the groups’ questioning. Like the did-not-solve groups, the individuals appear to have been unable to acquire or use, within the time limit, the information that helped with the groups’ restructuring. Nine of the 20 did-not-solve individuals asked questions that clearly showed that they had not yet learned that the “bicycles” were not bikes.

Table 5 Penultimate questions prior to realizing that bicycles were cards or when time was up for each condition ordered by the group’s solving time, groups, experiment 1
Table 6 Penultimate questions prior to realizing that bicycles were cards or when time was up for each condition ordered by the group’s solving time, individuals, experiment 1

Lastly, most groups who realized that the “bicycles” were cards provided a verbal exclamation common in insight problems (e.g., “oh,” “I got it!,” or “ding, ding, ding”). The exclamations support the penultimate question analysis that the problem’s solution appeared suddenly, without systematic, obvious deduction (Davidson, 1995). Also, most participants did not indicate in the questions asked that they had intentionally used the bike or cards in the room to influence their thinking: No group and 2 individuals asked whether the bike was in the room “to help them,” and two groups asked whether it was relevant that the experimenter was playing solitaire.

Questionnaire data

All but three groups contained at least 1 member who knew that Bicycles was a brand of playing cards, but of these three, one group was able to solve the problem by learning that the men had played cards and that the bikes had something to do with cards. Seven individuals reported not knowing that Bicycles was a brand of cards. Summaries of three focal questions from the questionnaire appear in Table 7. Groups that solved the problem attributed their success to good teamwork. The unsuccessful groups and individuals believed that the mental representation of a bike blocked the solution.

Table 7 Number of participants responding to questionnaire themes as a function of solving, for group and individually tested participants, experiment 1

Discussion

Our data are in accord with the view that remote activation to playing cards was blocked by the experimenter’s bike, because the groups in the bike condition took more time to solve the problem than did those in the control or cards conditions. Although the cards condition had the shortest mean solving time, it did not significantly differ from the control condition unless the 1,800-s interval was removed. However, the distribution across 300-sec intervals did show the hypothesized effect of our experimental manipulation. These data support the idea that the dominant prime blocked the activation to remote associates and that the experimenter’s cards helped somewhat in relaxing constraints, relative to the control condition. Group questioning was helpful in both the control and cards conditions in implicitly activating the alternative meaning of “bicycles” as cards. Unfortunately, the individuals did not show the expected pattern, because all but 1 (in the cards condition) could not solve the problem. The mean group solving time for both the control and cards conditions was significantly shorter than the 1 individual’s solving time. Thus, group members benefited by working together, and an individual with a card prime solved the problem at a time similar to that for the groups in the bike condition.

The analyses of the questions asked are in accord with observations made by Weisberg (1988) in that knowing the main clue does not necessarily cause insight. Just as more is needed to make Charlie a fish, because human Charlies can have a fishbowl on the table, we found that, of those who discovered through their questioning that the men had played cards prior to the murder, 32% of the groups and 100% of the individuals did not have the immediate insight that the cards were the “bicycles.” We also found that most solvers did not attribute their solving the problem to the prime in the room, similar to Maier’s (1931) findings. All the groups and individuals started with the dominant view (bicycles are vehicles), and novel solutions were not the starting point, consistent with the findings of Weisberg (1988).

The groups solved the problem more quickly than did the individuals. The questions asked also showed that the individuals chunked the lines of questioning less than did the groups, which, combined with other group process factors, may have hindered the individuals from solving the problem within 30 min. The penultimate questions do not show how insight occurred or why some groups were faster than others in solving the problem, but they do show that the solution appeared without deduction.

Our individuals found this problem hard to solve, yet Gibson (2004) reported that 25% of individually tested participants in the control group, 47% in the cards condition, and 17% in the bike condition produced the solution. The difference between experimental findings may lie in the paradigms used. In the previous study, Gibson’s (2004) participants were reading the problem and writing as many solutions as they could think of in a creative problem-solving study. In the present study, creativity might have been discouraged by self-imposed rules of ways to ask questions and by having to deal with unexpected answers to the questions. Further social constraints on the problem for the individuals may have been added because of the one-to-one interaction with the experimenter, which occupied resources of their working memory. Individually tested participants commented orally during debriefing that they felt that there were too many directions to go in and that they had had no help in knowing what to do with the information that they had learned, such as that 53 was relevant to the solution. Furthermore, Heslin (2009) discussed issues concerning why it may be better for individuals to write ideas (brainwriting) rather than orally brainstorming to generate ideas. In Experiment 1, groups likely were able to solve the problem in the present format because members helped to focus the exploration of questions and evaluate the utility of lines of questioning (Laughlin & Futoran, 1985).

Experiment 2

In Experiment 2, we explored the importance of the presence of the prime throughout the questioning period for group insight problem solving. When participants arrived at the lab, they saw the bike or the experimenter playing solitaire at the front of the room. As they signed consent forms, the bicycle or the cards were put away in a closet, and the problem was presented to them only after the objects had been put away. If it is the case that insight is caused by an implicit process, we should find data much like those in Experiment 1. Would the activation of the dominant solution persist if the prime initially was there but then was not present for up to 30 min?

Method

Participants

Twelve groups were run in each condition, recruited as in Experiment 1. We tested 16 groups of size 2, 17 groups of size 3, and 3 groups of size 4, for a total of 95 participants. The uneven number of the size 4 groups was due to chance.

Materials, Design, and Procedure

The only difference between this experiment and the previous one was that in Experiment 2, the primes were present in the room only when the participants arrived. The questionnaires confirmed that all participants remembered seeing the bike or the cards in those conditions and that the control group saw neither.

Results

Solution time

The mean time in seconds that each group took to solve the problem for each condition is reported in Fig. 2 and shows the pattern found in Experiment 1. A between-subjects ANOVA showed differences between means to be statistically significant, F(2, 33) = 3.91, p = .03, η2 = .19; the bike condition differed from the control condition, t(22) = 2.16, p = .042, and the cards condition, t(22) = 2.68, p = .014. As in Experiment 1, the cards condition was not statistically different from the control condition; in Experiment 2, this held even when the 1,800-s interval was excluded (M = 811.73, SD = 444.64 vs. M = 939.00, SD = 426.33, respectively), t(20) < 1. To compare whether the mean RTs in Experiment 2 differed from those for the groups in Experiment 1, a 2 (experiment) × 3 (condition) ANOVA on RTs was performed, and there was no significant effect of experiment, nor did it interact with condition, Fs < 1. Furthermore, with 27 groups in each condition, t tests on the combined data replicated Experiment 2’s results: The means for the bike and control conditions and for the bike and cards conditions statistically differed, t(52) = 3.02, p < .004, and t(52) = 4.65, p < .001, respectively, but the mean for the cards condition did not statistically differ from that for the control condition, t(52) = 1.59, p = .12. Without 1,800-s observations, the mean for the cards condition (M = 716.75, SD = 383.26) did not significantly differ from that for the control condition (M = 936.50, SD = 388.12), t(43) = 1.91, p = .063.

Fig. 2 Mean solving time (in seconds) as a function of condition, Experiment 2. Error bars represent the standard errors

Table 8 shows the distribution across time for the groups that solved the problem. As was found in Experiment 1, the distribution shifts to more groups solving the problem more quickly in the cards condition, relative to the control and bike conditions. We combined the distributions of Experiments 1 and 2 so that we could examine with a chi-square test whether the distributions across time intervals differed from chance across the bike, control, and cards conditions. Using 1,100 s as the dichotomy boundary, we observed 5 groups below and 22 above in the bike condition, 13 below and 14 above in the control condition, and 18 below and 9 above in the cards condition. This distribution differed from chance, χ2(2) = 12.90, p = .002.
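The chi-square statistic can be checked directly from the six counts given in the text. The following is a minimal sketch of a Pearson chi-square test of independence on the 3 (condition) × 2 (below vs. above 1,100 s) table; the function name is hypothetical, and the p value uses the closed form for the chi-square upper tail that applies only when df = 2.

```python
import math

# Observed counts from the text: (below 1,100 s, above 1,100 s) per condition.
observed = {
    "bike": (5, 22),
    "control": (13, 14),
    "cards": (18, 9),
}

def chi_square_independence(table):
    """Pearson chi-square for a rows-by-columns contingency table; returns (chi2, df)."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = chi_square_independence(observed)

# For df = 2 only, the upper-tail probability is exactly exp(-chi2 / 2).
assert df == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.3f}")
```

With 27 groups per condition, the expected counts are 12 below and 15 above the boundary in every condition, so the bike and cards conditions contribute almost all of the statistic, in opposite directions.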

Table 8 Number of groups that solved the problem by 300-sec intervals as a function of condition, experiment 2

Table 9 shows the distribution of groups by size. An ANCOVA was performed on solution time, with group size as the covariate. The effect of size was not significant, but condition was significant, F(2, 32) = 3.79, p = .033, η2 = .19. When the 1,800-s interval was removed from the data set, size was not significant, but neither was condition. When both experiments’ group data were combined, the ANCOVA results also showed no effect of size but an effect of condition, F(2, 77) = 9.005, p < .001, η2 = .19.

Table 9 Number of groups of size 2, 3, or 4 participants and mean number of questions for the group size as a function of condition, experiment 2

Questions asked

An ANCOVA on number of questions, with group size as the covariate, revealed no effect of size or condition. Table 10 reports the proportion of questions the groups asked, organized by the ten themes used in Experiment 1. A 3 (condition) × 2 (solve) ANOVA was run on each theme, and no significant effects or interactions were found, all Fs < 1.91, with the exception of an interaction on the proportions for the restructuring theme, F(2, 30) = 5.20, p = .012, η2 = .26, which appears to have been caused by the control groups that did not solve the problem producing more creative possible interpretations of “bicycles” than did the control groups who solved the problem, whereas the bike and cards conditions favored more ideas for the solved groups. A 3 (condition) × 2 (solve) × 2 (experiment) ANOVA was run on group data from both experiments, and no significant differences were found between experiments on each theme, with the one exception of a three-way interaction for other representations of the bicycles theme, F(2,69) = 3.25, p = .045, η2 = .09; this interaction involved the higher proportion of questions asked by the control groups in Experiment 2 that did not solve the problem.
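The η2 values reported in this Results section are consistent with an effect size recovered from each F ratio and its degrees of freedom. The following is a minimal sketch under the assumption that the reported η2 is partial eta squared, computed as F × df_effect / (F × df_effect + df_error); the text does not state which variant was used, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error, eta squared as reported in the text)
reported = [
    ("condition, ANOVA on solution time", 3.91, 2, 33, 0.19),
    ("condition, ANCOVA with group size", 3.79, 2, 32, 0.19),
    ("condition, ANCOVA, both experiments", 9.005, 2, 77, 0.19),
    ("condition x solve, restructuring theme", 5.20, 2, 30, 0.26),
    ("three-way interaction, other representations", 3.25, 2, 69, 0.09),
]

computed = [
    (label, round(partial_eta_squared(f, df1, df2), 2), eta)
    for label, f, df1, df2, eta in reported
]

for label, value, eta in computed:
    print(f"{label}: computed = {value:.2f}, reported = {eta:.2f}")
```

For a one-way design such as the between-subjects ANOVA on solution time, partial eta squared and classical eta squared coincide, so the assumption only matters for the ANCOVA and factorial analyses.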

Table 10 Mean (and standard deviation) proportion of questions asked by groups, by theme, solving, and experimental condition, experiment 2

The average number of chunks (at least two sequential questions coded as one of our ten themes) per group equaled 7.64 (SD = 3.76), accounting for an average of 8% (SD = 3%) of the questions being asked in an organized way, and the average number of questions in a chunk (25.72, SD = 14.42) accounted for 12% (SD = 6%) of the groups’ questions. These averages are not statistically different from those for the groups in Experiment 1 (all ts < 1), but they did differ from those for the individuals in Experiment 1, all ts > 2.10, ps < .05, providing support for the idea that group members helped keep the questions focused on a theme that then helped members “put it all together” or helped with restructuring.
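The chunk measure described above can be illustrated with a short sketch. It assumes that each question in a transcript has already been assigned one theme code and that a chunk is a run of at least two consecutive questions sharing the same code; the example transcript, the theme labels, and the function name are hypothetical, and the authors' actual coding procedure may have differed in detail.

```python
from itertools import groupby

def chunk_stats(theme_codes, min_run=2):
    """Count runs of at least `min_run` consecutive questions with the same theme code."""
    run_lengths = [len(list(run)) for _, run in groupby(theme_codes)]
    chunk_lengths = [length for length in run_lengths if length >= min_run]
    in_chunks = sum(chunk_lengths)
    return {
        "n_chunks": len(chunk_lengths),
        "questions_in_chunks": in_chunks,
        "remaining_questions": len(theme_codes) - in_chunks,
    }

# Hypothetical transcript: one theme code per question, in the order asked.
codes = [
    "men", "bicycles", "bicycles", "number_53", "men",
    "activity", "activity", "activity", "number_53",
]

stats = chunk_stats(codes)
print(stats)
```

In this hypothetical transcript, the two consecutive "bicycles" questions and the three consecutive "activity" questions form chunks, and the isolated questions count toward the remaining questions used as the denominator of the focus ratio.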

We examined the hypothesis that restructuring would not necessarily happen once the groups knew that the men had played cards. Eleven groups (2 control, 3 bike, and 6 cards groups) knew prior to producing the solution that the men had played cards. Of these, 4 (36%) asked more than two questions before asking whether the Bicycles were cards (number of questions, M = 9.0, SD = 7.8, minimum = 4, maximum = 18). Furthermore, all 11 groups learned late in their questioning that the men had played cards, already knew that the “bicycles” were not vehicles, and were looking for an alternative representation of the word “bicycles.” Thus, we find support for the hypothesis that restructuring the representation of “bicycles” as cards requires additional information beyond the main cue.

Table 11 shows the questions asked immediately prior to asking whether the “bicycles” were cards or prior to time expiring, ranked in order of time taken to solve the problem and organized by condition. We see that the groups vary on the information that was active prior to realizing that the “bicycles” were cards. Some groups asked about the men, the Bicycles, the number 53, or the activity. These same foci were present in the fastest, slowest, and did-not-solve groups. The variety of questions asked by the groups that solved the problem immediately before asking whether the “bicycles” were cards indicates that restructuring likely did not occur through a deductive process. The presence of the bike slowed the groups down, but it did not appear to change the nature of the questions or how groups realized that the “bicycles” were cards.

Table 11 Penultimate questions prior to realizing that bicycles were cards or when time was up for each condition ordered by the group’s solving time, groups, experiment 2

Questionnaire data

As can be seen in Table 12, the questionnaire data show that the problem was judged difficult to solve, but if the group solved it, 30% of participants judged it as easier. Teamwork, persistence in asking lots of questions, and being able to piece together the information are the common attributions for success, and the fixation on a bike representation the most common attribution for failure (50% noted that the idea that the “bicycles” were bikes prevented them from solving the problem). Also, as in Experiment 1, three clues that fit our themes (the number 53, “bicycles” that could not be ridden, and activity at the table) were noted as helpful cues by solvers, but those who did not solve the problem did not choose to write those clues as helpful in getting them close to the solution.

Table 12 Number of participants responding to questionnaire themes as a function of solving, experiment 2

Discussion

When the priming objects were no longer present throughout the questioning phase, their effects remained similar to the effects when they were present. The mean pattern and dispersion of groups’ solving times were similar to those in Experiment 1. In the bike condition, 40% of the groups failed to solve the problem within 30 min, whereas 8% failed in the control and cards conditions. It is clear that seeing a bike prior to solving the problem biased the interpretation that the 53 “bicycles” in the problem were vehicles more strongly than for the other groups, who also began with a bike interpretation of “53 bicycles.” Our data support the interpretation that the impasse was harder to resolve because of that extra activation to the dominant meaning of “bicycles” by the presence of the prime, which prevented the spreading of activation to remote associates. Given that the solving times were similar between experiments, staring at a real bike as one tried to figure out how 53 “bicycles” were not bikes did not hurt one more than having the concept activated prior to starting the questioning. One implication of this finding is that if something in the environment is hindering insight, simply removing the object is not enough to prevent its effects. We would expect the priming effect to decrease over time, but future research is needed to know how long the effect on insight problem solving lasts.

General Discussion

Groups often work on problems together, and it is important to know whether findings on priming insight problem solving with individuals generalize to group problem solving. Our findings indicate that they do. As was hypothesized, we found that the prime biases groups’ solving time: Both experiments showed that the participants in the bike condition took significantly longer to solve the problem than did those in either the control or the cards condition. The cards descriptively speeded solution time in both experiments, as compared with the control condition. The similar pattern of the means in the two experiments suggests that the priming of insight does not require the physical presence of the prime during the problem-solving phase.

We hypothesized that the experimenter’s cards would activate remote associations of “bicycles” that would help restructure the mental representation from vehicles to cards. The dispersion of solving times supported the hypothesis that the experimenter’s cards facilitated restructuring, but solution time was significant in Experiment 1 only when the 1,800-s interval was not included in the analyses. Gibson (2004) and Lockhart et al. (1988) noted the value of conceptual processing of primes in insight problem solving. Failure of the cards condition to be significantly faster than the control condition might be explained as a lack of conceptual processing of the cards, whereas participants meaningfully thought about bikes in the bike condition. Groups may also have found the solitaire playing distracting and may have dismissed any conscious thoughts about cards to stay focused on the “bicycles” in the problem.

We hypothesized that having the insight that the 53 “bicycles” were cards would require more than the one critical piece of information—that the men in the problem had played a game of cards. We found that, of those groups who knew that the men had played a game of cards, about a third continued to maintain separate identities for the cards and the bikes. The clue itself does not require restructuring to occur. The fact that the majority of groups knowing that the men had played a game of cards did, within two questions, ask whether the “bicycles” were cards provides evidence that the activation of remote associations between Bicycle cards and bicycles was helped by the activation of playing cards from the group’s questions. We also found, as was hypothesized, that most groups did not overtly state, in their questions or on the questionnaires, that they believed that the experimenter’s cards or the bike in the lab were part of the experiment, similar to the results reported by Maier (1931).

Individuals were seriously disadvantaged relative to groups. The comparison of individuals with groups shows that groups could reach insight sooner and focus their lines of questioning. Our data support research comparing individual and group problem solving using noninsight problems (e.g., Brophy, 2006; Laughlin et al., 2002; Laughlin & Futoran, 1985; Moshman & Geil, 1998). People benefit from other group members’ sharing the memory and attention load, evaluating the threads of questions to ask, helping to reject errors of inference, and exploring multiple strategies for exploration. We found evidence of this benefit in the number of questions asked sequentially on a coded theme: The groups in both experiments chunked their questioning more than did the individually tested participants. Furthermore, individuals’ prior experience or domain knowledge can sometimes inhibit creative problem solving (Wertheimer, 1945/1959), encourage einstellung strategies (Luchins, 1942), instill functional fixedness (Duncker, 1945), or automatically lead to following a path of expectations (Finke, 1995). Groups provide opportunities to diversify these inhibitors and increase the chances for restructuring to occur. Forty-three percent (Experiment 1) and 23% (Experiment 2) of our participants valued teamwork for helping them solve the problem, and this evaluation included comments on how varied viewpoints and lines of questioning allowed for a mix of knowledge to help the group know what questions to ask and what to do with the answers.

The transcript analyses revealed that insight that the Bicycles were cards seemed to appear suddenly in the questioning for most groups who solved the problem, and systematic questioning about the number 53, the activity involved, or that the Bicycles were not vehicles appeared to precede solution of the problem as much as continue to stump groups, so that it is difficult to call the realization that the “bicycles” were cards a deduction. However, group process helped to narrow and focus the possibilities.

Fleck and Weisberg (2004) noted seven levels of impasse in problem solving, and our groups and individuals demonstrated six of them: (1) repetition of an incorrect solution (e.g., “It was not a robbery”); (2) rereading of the problem (12 groups asked to hear the problem again); (3) mind is blank (e.g., “What else can they be?”); (4) guessing (e.g., “Was death caused by eating the bicycles”); (5) emotional frustration (“Is this a trick question?” or laughter when asking “they are intact but have no wheels?”); and (6) demonstrating fixation (e.g., “If it didn’t have wheels, did it have spokes?” “If he could not ride the bicycles, did he want to ride them?”). One level, cessation of behavior, did not occur. Groups and individuals who went the full 30 min asked questions within the last several minutes. The interactive presence of the experimenter and knowing there was a 30-min time limit likely discouraged participants from giving up.

There are four main implications of our study. First, priming effects found with individual insight problems extend to group problem solving. We expect that many variables that affect individuals’ insight will extend to groups, such as findings involving training groups (Ansburg & Dominowski, 2000) to increase their ability to re-encode, elaborate, or relax constraints (Ohlsson, 1992) through practice, feedback, problem comparison, and strategic instructions. Second, groups who work in an environment that biases them to a dominant but incorrect solution are less likely to find insight for the correct solution soon after they change environments (Experiment 2). Third, groups in a negatively biased environment can still restructure a problem, and, conversely, groups in a favorably biased environment can still remain at an impasse. Fourth, verbally ambiguous word problems used with the yes/no paradigm, such as the Bicycles and Charlie problems, provide an ideal situation for groups to work interactively, as well as for experimenters to manipulate the conditions under which they work. In the present study, we did not control group dynamics, individual differences, leadership, or whether a certain number of members had prior experience with such problems, variables known to affect group problem solving (Maier, 1970). Our data suggest that the priming effect was strong enough to be measured despite these complexities being left free to vary; however, research on such group variables could tell us whether they strengthen or weaken the priming effects observed in our experiments. One future direction would be to explore the effects of bringing back unsuccessful groups or mixing members of unsuccessful groups to try again in order to examine whether past experience with working on the problem will affect the current solution, crossed with priming conditions of the first and second attempts. 
Maier and Thurber (1969) elaborated on the influence of solvers with previous experience on group interaction; it would be interesting to study the influence of unsuccessful group members on a second attempt: Would insight still be affected by the initial priming condition?

We identified the following limitations of our study. First, researchers use a variety of problems to study insight, and we used only one verbal insight problem in a rarely used paradigm. Pilot work with other, similar verbal problems failed to show appropriate primes that hindered or facilitated solutions. For example, one problem we tested involved a housekeeper who turned off a light before going to bed and was upset the next day to read about a shipwreck. We used a lamp as the dominant object and a miniature lighthouse knickknack as the nondominant prime. We found that both groups were no different from those in the control condition. It is possible that the prime needs to be very salient or distinctive and that items sitting on a desk are inadequate. Perhaps the conceptual processing that occurs when one is solving a problem needs to incorporate salient or not-to-be-ignored primes for effectiveness (Gibson, 2004; Lockhart et al., 1988). Additional limitations appeared in pilot work as well. Giving participants less time to solve the problem, thereby increasing motivation or pressure to find the solution quickly, resulted only in more groups not solving the problem. Furthermore, we tried a procedure that modeled for the group the kinds of questions to ask; we presented a video of an experimenter and a participant solving the Charlie problem in several minutes with the yes/no procedure, but this did not improve group solving efficiency. In another study, we tested older adults in groups with the Bicycles problem but found no significant effect of our manipulations, although the means were in the predicted pattern (Grimm, 2001). Despite these limitations, we believe that our findings should extend theoretically across insight problems. Future research is needed to understand the boundary conditions that influence priming insight in groups and whether these differ between groups and individuals.

To close, researchers are very interested in the cognitive processes involved in insight problem solving, particularly the overcoming-the-impasse phase when insight occurs. The literature suggests that insight is a process caused by a shift in activation to a remote alternative from the focused, dominant representation of a problem. Our data on priming group insight are consistent with the view that insight may be an implicit process. Groups working to solve ambiguous or vague problems may acquire insight even when individuals working on the problem may not. Groups provide a context for sharing multiple viewpoints and domain knowledge, allowing remote alternatives to be considered, recognizing correct lines of thinking, and sharing cognitive resources (memory load, attention, and evaluation of ideas).