Educational Psychology Review, Volume 19, Issue 2, pp 91–110

The Implications of Research on Expertise for Curriculum and Pedagogy

Department of Educational Studies, College of Education, University of South Carolina

Original Paper

DOI: 10.1007/s10648-006-9009-0

Cite this article as:
Feldon, D. F. Educ Psychol Rev (2007) 19: 91. doi:10.1007/s10648-006-9009-0


Instruction on problem solving in particular domains typically relies on explanations from experts about their strategies. However, research indicates that such self-reports often are incomplete or inaccurate (e.g., Chao & Salvendy, 1994; Cooke & Breedin, 1994). This article evaluates research on experts’ cognition, the accuracy of experts’ self-reports, and the efficacy of instruction based on experts’ self-reports. Analysis of this evidence indicates that experts’ free recall of strategies introduces errors and omissions into instructional materials that hinder student success. In contrast, when experts engage in structured knowledge elicitation techniques (e.g., cognitive task analysis), the resultant instruction is more effective. Based on these findings, the article provides a theoretical explanation of experts’ self-report errors and discusses implications for the continued improvement of instructional design processes.


Keywords: Expertise · Self-report · Knowledge elicitation · Instruction · Automaticity


Subject-matter experts inform curriculum development and instruction through explanations of the ways they perform relevant tasks. The experts themselves may present this information in lectures and seminars, or it may be incorporated into textbooks and training manuals. Thus, whether it reaches students directly from the instructor, through course materials, or through a combination of the two, an explanation of the process for achieving a target performance represents the reflections of experts describing their own practice. Regardless of the method of dissemination, the goal of such instruction is mimetic: the primary criterion for student success is the consistent replication of the presented skills on assigned tasks (Jackson, 1985).

Likewise, the central role of experts’ explanations is evident in informal instruction. Mentors model and explain their own approaches to solving problems during one-on-one interactions with mentees. Successful learning in these situations requires mentors to explain their strategies accurately and comprehensibly to their students during collaborations on authentic tasks (Radziszewska & Rogoff, 1988, 1991). However, the learner cannot directly observe the cognitive strategy selections and decision-making skills that are essential to performing complex tasks. Therefore, experts’ articulations of these processes are especially important.

The current review argues that, to the extent that the articulation of strategies by the expert is inaccurate or incomplete, the resultant instruction will hinder learners’ success. Experts do demonstrate excellent recall of episodic memories of events and problem states (e.g., Gobet & Simon, 1998). However, relatively little research has attempted to validate the accuracy and completeness of expert recall for the processes by which they determine and execute solutions to problems in their domain. This article reviews available evidence from the empirical literature in three relevant areas. They are (1) the cognitive architecture of expertise, (2) behavioral accuracy of experts’ self-report, and (3) the efficacy of instruction from guided and unguided self-reports of experts.

Definitions of expertise

Intensive debate exists in many disciplines regarding the appropriate criteria for the identification of experts (Cooke, 1992). Some authors argue that the term “expert” should refer only to individuals whose performance is consistently superior to that of non-experts in the essential skills of a domain (e.g., Dawes, 1994; Ericsson & Smith, 1991). Others suggest that discrete sets of competencies are inauthentic and do not capture the true nature of expert performance in authentic settings (Sternberg & Horvath, 1998). Furthermore, criteria for evaluating the identified performances also differ between domains and tend to be applied inconsistently between cases (Cellier, Eyrolle, & Mariné, 1997; Sternberg, 1997).

Moreover, decision makers in professional settings do not necessarily use psychological definitions of expertise when selecting subject-matter experts (Hoffman, 1998; Mullin, 1989; Schraagen, Chipman, & Shute, 2000). Instead, they may consider track records of strong performance, status within a professional community, or number of years working in the domain of interest (e.g., Glaser & Chi, 1988). Therefore, empirical work cited in this review includes studies conducted with participants that meet one or more of these criteria (i.e., performance, status among professional peers, or experience).

The cognitive architecture of expertise

Research in the late 1970s and 1980s produced several robust characteristics of expertise. Glaser and Chi (1988) summarized this work in seven general statements. Experts (1) excel primarily within their own domains, (2) perceive meaningful, interconnected patterns in their domain, (3) are faster and more accurate than novices when performing the skills of their domain, (4) have better short- and long-term memory than novices, (5) perceive problems in their domain at a deeper (more principled) level than novices, (6) spend a larger proportion of time qualitatively analyzing problems, and (7) self-monitor effectively during problem solving. Experts in many domains exhibit these characteristics. However, research has not identified the extent to which these skills might be necessary or sufficient elements of a specific expert’s performance.

In more recent research, Ericsson and his colleagues identified two additional characteristics of expertise. First, advanced expertise typically requires a minimum of ten years of “deliberate practice” to develop (Ericsson, Krampe, & Tesch-Römer, 1993, p. 363). These researchers describe deliberate practice as an individualized training regimen that includes extensive coached practice with corrective feedback and sustained effort dedicated to skill improvement (Ericsson & Charness, 1994; Starkes, Deakin, Allard, Hodges, & Hayes, 1996). Simon and Chase (1973) suggested that chess mastery requires a minimum of a decade of experience. However, subsequent studies in a number of domains found that time alone is insufficient; sustained deliberate practice throughout that period is necessary to yield expertise (e.g., Charness, Krampe, & Mayr, 1996; Simonton, 1999).

Second, consistent expert performance requires a “maximal adaptation to task constraints” (Ericsson & Lehmann, 1996, p. 277). Task constraints limit the number of viable pathways through a problem space. These include the domain-specific rules that govern the execution of a particular task (e.g., the rules of chess, established flight paths for pilots) as well as the laws of physics and the physiological limitations of the human body (Casner, 1994; Vicente, 2000).1 Experts consistently demonstrate optimal performances and highly refined skills that maximize the efficiency and effectiveness of their solutions under these constraints.

The role of knowledge in expert cognition

Chase and Simon’s (1973) classic work on the memory performance of chess masters suggests that the foundational component of expertise is the quantity and accuracy of experts’ knowledge. This work compared the recall of experts and novices for the locations of realistically placed pieces on a chessboard. The results indicated that experts’ recall of the briefly presented stimuli was vastly superior to that of novices. However, experts did not demonstrate the same advantage for randomly placed pieces or chess-unrelated stimuli under equivalent conditions. The authors concluded that expert performance depends on two factors. First, the selected tasks must reflect a specific domain of mastery and be representative of the tasks performed during normal participation in the activity. Second, experts’ recognition of prior relevant experiences in the domain must generate the high-speed performance and large memory capacity demonstrated by the participants in their study. Thus, expert performance is a product of experience-based knowledge that can be recalled quickly and consistently and then deployed. Subsequent studies of expertise in various domains have found similar results regarding the role of prior knowledge in performance (e.g., Alberdi, Sleeman, & Korpi, 2000; Beilock, Wierenga, & Carr, 2002).

Other knowledge differences between experts and non-experts that impact performance quality include levels of detail, differentiation, and levels of principled abstraction. Chi, Feltovich, and Glaser (1981) examined expert and novice performance in physics problem-sorting tasks. They observed that the categories identified by experts reflected fundamental principles on which the problem solutions relied (e.g., conservation of energy). In contrast, novices conceptualized the problems from surface-level details, such as the presence of pulleys or inclined planes. Similarly, when categorizing lines of computer code, programming novices classify according to syntax, whereas experts use functional or semantic characteristics (Adelson, 1981).

The knowledge structures of experts also provide an advantage in recalling problem states. Elaborate schemas maintain the detailed relationships among problem-relevant details. Consequently, increasing the quantity of available conceptual information further improves experts’ recall of specific problem states. For example, chess experts’ memory for specific piece locations during games improved when conceptually descriptive information was provided before or after the visual presentation of the chessboard (Cooke, Atlas, Lane, & Berger, 1993). Without that information, experts’ recall exceeded that of novices, but to a lesser degree than in the knowledge-rich condition. Thus, the memory performance of experts improves when they can leverage their extensive abstract knowledge in relation to specific events.

The level of conceptual abstraction in experts’ knowledge structures embodies an efficient compromise between representations of concrete elements of a particular problem and more general concepts and principles acquired through experience (Zeitz, 1997). This arrangement facilitates an expert’s ability to recognize sophisticated patterns, and it also enhances performance for recall of salient details in domain-relevant situations. However, studies investigating these phenomena examine experts’ memories of episodic information (i.e., problem states and events). They do not examine experts’ recall of their own decision-making processes.

The role of strategy in expert cognition

The second framework for expert cognition addresses performance in terms of qualitative differences between the problem-solving strategies of experts and novices. Consistent findings indicate that experts engage in forward reasoning processes based on their domain knowledge. Experts leverage their highly structured knowledge of relevant concepts and principles within the domain to efficiently generate effective strategies (Chi, Feltovich, & Glaser, 1981; Chi, Glaser, & Rees, 1982; Singley & Anderson, 1989).

Experts solve problems deductively by manipulating their mental models to identify optimal solutions based on the requirements of the task and task constraints. For example, physics experts initiate the problem-solving process by representing a situation on the basis of physics principles and relevant available data (Larkin, McDermott, Simon, & Simon, 1980a, 1980b). They use theoretically driven strategies and conceptual schemas to integrate both the provided relevant information and the abstract relationships between problem elements (Dhillon, 1998; Larkin, 1985).

In contrast, physics novices reason backwards from the required solution to determine their strategy. They classify problems on the basis of surface-level details that are not relevant to the operational principles of the task (Larkin, McDermott, Simon, & Simon, 1980a, 1980b). Novices then determine which equations will yield an answer that responds appropriately to the presented prompt. Consequently, they reason inductively to identify that solution through trial-and-error tests of constantly changing hypotheses (Lamberti & Newsome, 1989). Such heuristics are typical of novices’ problem solving across many domains (Lovett & Anderson, 1996).

These differences between expert and novice strategies during performance are robust, even when novices are instructed to develop a definite strategy before attempting a solution. Phillips, Wynn, McPherson, and Gilhooly (2001) found that the presence or absence of preplanning had no significant effect on the speed or accuracy of novices’ problem-solving performance. Evidence also suggests that when novices do perceive the deeper principles underlying a problem, their solution strategies do not resemble those of experts. They continue to select their strategies based on surface-level features (Sloutsky & Yarlas, 2000; Yarlas & Sloutsky, 2000).

The role of working memory in expert cognition

A third account of expertise emphasizes the superior working memory of experts when they are performing in their domain. Extensive evidence indicates that experts are able to attend to and process much more domain-relevant information in working memory than is possible for non-experts (Ericsson & Kintsch, 1995; Masunaga & Horn, 2000). This advantage remains robust even when experts perform both domain-relevant and distracting secondary tasks simultaneously (Gobet, 1998; Vicente & Wang, 1998). Several current theories account for this exceptional memory performance. Long-term working memory theory (Ericsson & Kintsch, 1995), template theory (Gobet & Simon, 1996), the constraint attunement hypothesis (Vicente & Wang, 1998), and expertise working memory theory (Masunaga & Horn, 2000) all suggest that schematic structures within long-term memory functionally augment the limited capacity of short-term memory in relation to domain-relevant problems.

Strong consensus exists across theories that rapid encoding, manipulation, and decoding of relevant information in working memory are essential elements of expertise (Ericsson & Kintsch, 2000; Masunaga & Horn, 2000). In working memory, experts attend to and process task-relevant information on the basis of highly refined schemas that serve as structures or templates to facilitate rapid processing. Therefore, experts perceive situations in their domain through the filter of their extensive experience. In contrast, novices’ schemas are not refined with regard to domain tasks.

For example, expert and novice pilots attempted to assess rapidly the effectiveness of specific cockpit actions for achieving stated flight goals (Sohn & Doane, 2003; also see Wickens, 2002). When the cockpit instrument readings were consistent with the presented scenarios and compatible with the stated goals, expert pilots responded significantly faster than did novices. Separate measures of long-term working memory and short-term memory strongly predicted response times for expert and novice participants, respectively. However, when the instrument readings were incompatible with the other information, the two groups performed equivalently. Further, the long-term working memory of the experts no longer predicted their response times. Instead, short-term memory span accounted for both experts’ and novices’ performance equivalently.

The role of skill automaticity in expertise

In addition to the three frameworks described above, some researchers consider automaticity to be a hallmark of expertise (e.g., Bereiter & Scardamalia, 1993). However, others disagree (e.g., Ericsson, 1998). Automaticity is the execution of effortless cognitive procedures that are acquired through the consistent, repeated mapping of stimuli to responses (Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977). It occurs when individuals with extensive practice in a procedure perform it increasingly quickly and with diminishing levels of mental effort (Anderson, 1982; Logan, 1988). In addition to perceptual stimuli, goal- and rule-based cues also can trigger automated routines (Anderson, 1992; Bargh & Ferguson, 2000). In complex skills, automaticity entails a reduction in the number of intermediate decision points that require conscious resolution (Anderson, 1995; Blessing & Anderson, 1996; Fitts & Posner, 1967; Logan, 1988).

Procedures that become automated are deeply ingrained and difficult to change. In addition, problem solvers are more likely to apply automated procedures unconsciously, even in the attainment of consciously selected goals (Aarts & Dijksterhuis, 2000, 2003). For these reasons, automated skills are not subject to conscious monitoring and tend to run to completion without interruption2 (Wheatley & Wegner, 2001). These characteristics prevent automated routines from being easily modified. However, they also allow an individual to maintain high performance levels during task execution while engaging in a simultaneous, non-automated task (Brown & Bennett, 2002; Logan & Cowan, 1984).

Automated interpretive procedures also can occur before the conscious selection of goals. Automaticity in this form impacts the judgments and situational assessments of the individual, as well as his or her selection of goals (Bargh, 1999a, 1999b). Evidence suggests that habitual approaches to problems are goal-activated. This activation significantly limits the solution search (Bargh, Gollwitzer, Lee-Chai, Barndollar, & Trötschel, 2001). For example, Aarts and Dijksterhuis (2000) primed habitual bicycle riders with information about the locations to which they typically traveled. The researchers then measured participants’ response time latencies for cycling-related words. Participants responded more quickly to those stimuli than to stimuli that were not cycling-related. This effect persisted, even after the researchers presented alternative travel goals that would preclude cycling (e.g., traveling internationally). The response patterns indicated a strong predisposition to rely on habitual modes of travel (i.e., bicycling) rather than alternative means. In a related study, participants viewed photographs of a library and adopted a goal to visit it. Subsequently, participants unintentionally spoke at a lower volume with the researchers. The effect remained even after controlling for gender, mood, and arousal (Aarts & Dijksterhuis, 2003). Thus, environmental stimuli can activate automated assessment and goal-setting procedures in familiar situations. They, in turn, trigger behavioral plans and subsequent actions without the intention of the individuals.

Adaptivity and expertise

Ericsson (1998, 2004) argued that the development of automaticity impairs the development of expertise. This argument has two major premises. The first premise is that successful adaptation to atypical conditions is the “essence of expert performance” (Ericsson, 1998, p. 92). The second premise is that such adaptations require conscious control of one’s actions to modify performance. If these premises are correct, then the ballisticity of automated procedures would be maladaptive. Specifically, Ericsson (1998) argued, “the key challenge for aspiring expert performers [would be] to avoid the arrested development associated with automaticity” (p. 90). However, other studies in expertise and automaticity suggest that each of these premises may not hold.

The first premise does not consider that experts may vary in adaptivity. Hatano and his colleagues (Hatano, 1982; Hatano & Inagaki, 1986, 2004) distinguish between routine expertise and adaptive expertise. The performance of adaptive experts remains robust in the face of changing conditions. In contrast, routine experts demonstrate high proficiency within stable and predictable environments in which new adaptations are unnecessary. However, when task constraints change or unusual events occur, these individuals fail to maintain their high levels of performance, a failure analogous to functional fixedness in problem solving (Duncker, 1945; Gick & Holyoak, 1980, 1983).

An expert-novice study of bridge playing exemplifies routine expertise (Frensch & Sternberg, 1989). The researchers manipulated features of the game to examine the impact of surface-level and structural changes on performance. Both experts and novices adapted equally well to the surface-level changes (i.e., changing the assigned suits). However, changes in deeper levels of the game structure disrupted the experts’ performance to a greater extent than that of novices. In this case, the participating experts demonstrated routine expertise through their failure to adapt effectively. The outcome may be indicative of ballisticity, wherein the experts’ automated processes resisted mid-process modification.

Similarly, in a study of troubleshooting skills in electronics, experts frequently failed to examine relevant components when attempting to diagnose an improbable flaw (58.7% of trials). In contrast, troubleshooting novices tested the component in 75.1% of trials (Besnard, 2000; Besnard & Bastien-Toniazzo, 1999). Ultimately, five of the ten expert participants correctly identified the cause of the fault, compared with only two of the nine novice participants. The five unsuccessful experts demonstrated routine expertise, because they were unable to modify their automated procedures. The five successful experts, in contrast, demonstrated adaptive expertise; their automated skills ultimately did not prevent successful outcomes (Besnard & Cacitti, 2001).

The second premise assumes that automated processes cannot be transferred. However, careful empirical studies of acquisition and transfer for automated skills demonstrate that limited transfer of automated procedures to novel cues and circumstances can occur (e.g., Anderson, 1987; Cooper & Sweller, 1987; Fisk, Lee, & Rogers, 1991; Kramer, Strayer, & Buckley, 1990; Schneider & Fisk, 1984). Further, because complex skills are inherently compilations of many distinct subskills, any particular performance may represent one of three possible paths: (1) fully automated processes, (2) serial execution of automated and consciously mediated subskills, or (3) simultaneous execution of both automatic and conscious elements (Anderson, 1995; Bargh & Chartrand, 1999; Cohen, Dunbar, & McClelland, 1990; Hermans et al., 2000; Shiffrin & Dumais, 1981). Thus, when experts engage in “step-skipping behavior” (Koedinger & Anderson, 1990, p. 511) indicative of automated subprocesses, the cognitive outcomes of those subprocesses may be processed consciously at later points in the overall complex skill sequence (Blessing & Anderson, 1996).

From this perspective, two factors may enhance or limit the adaptivity of experts. One factor is the general quality and effectiveness of the individual’s declarative and procedural knowledge as they relate to the situation requiring adaptation. For example, two expert historians with different subspecialties independently analyzed a set of historical documents to identify Abraham Lincoln’s perspectives on race (Wineburg, 1998). One expert had extensive knowledge of content relevant to the documents, whereas the other possessed only a very broad understanding. The first expert generated a rich interpretation of the documents without difficulty, whereas the latter initially had extreme difficulty drawing appropriate inferences about the material. After he acquired the necessary knowledge, however, he provided interpretations that were qualitatively equivalent to the initial conclusions of the first expert. Thus, he initially lacked sufficient knowledge to transfer his skills to a task atypical for his subspecialty; once he overcame that obstacle, he demonstrated the adaptivity the task required.

The other factor is the location of conscious decision points within the overall sequence of subskills that comprise a complex skill. Changes in the characteristics or task constraints of a problem require adaptation at specific points in a procedure. Therefore, conscious mediation at these points may facilitate adaptive performance when differences exceed the transfer range of automated skills. Other subskills that do not require modification to process information effectively can continue to operate in an automated manner. Further, they do not impair an adaptive expert’s ability to maintain high levels of performance. In this way, adaptive experts make optimal use of their cognitive resources; they dedicate maximum working memory resources to novelty without allocating significant portions to routine or irrelevant elements. Thus, experts’ adaptivity for automated skills requires two considerations. First, the overall proportion of automated subskills in a relevant complex skill must be great enough to allow sufficient working memory space for conscious adaptation. Second, the appropriate decision points in the procedure must be available for conscious mediation to adapt to atypical tasks successfully.

Expert sight-reading performance in music is a clear example of this process. While playing music with typical features, expert pianists rely on automated skills to recognize patterns and strike the appropriate keys in sequence (Lehmann & McArthur, 2002). Concurrently, they dedicate their conscious processing to dynamic synchronization with other performers. When the novelty or visual complexity of the sheet music exceeds the threshold of transfer for automated sight-reading skills, the musician engages in effortful, deliberate encoding to mediate the execution of the necessary subprocesses.

In another example, Lee and Anderson (2001) reanalyzed learning data collected from the Kanfer-Ackerman Air-Traffic Controller Task (Ackerman, 1988; Ackerman & Kanfer, 1994). They then replicated the task with specific attention to keystroke latencies and eye tracking process data. The overall execution speed of the necessary complex cognitive skills increased according to the power law of practice. However, aggregating the acceleration functions of the individual subskills explained more of the variance in performance than fitting a single function to the overall task. Therefore, the development of automaticity in component subskills was a necessary and sufficient condition for optimal performance (i.e., maximum adaptation to task constraints).

These findings are compatible with the suggestion by Bereiter and Scardamalia (1993) that acquired automaticity facilitates the development of expertise. It functions in this way by allowing the strategic reinvestment of conscious attention where it will maximally enhance performance. The key role of automaticity as an element of both adaptive and routine expertise indicates that important components of experts’ performance occur outside their ability to monitor consciously.


The manifestation of expertise through the frameworks discussed in the previous section (knowledge, strategy, working memory, and automaticity) yields a common result. Ultimately, each aspect of expert performance improves the effectiveness and cognitive efficiency of the problem-solving process. The effectiveness and efficiency emerge not only as consequences of acquired expertise within a domain. They also further improve performance by freeing up limited cognitive resources to accommodate atypical features or other added cognitive demands that may arise within a task (Bereiter & Scardamalia, 1993; Sternberg & Horvath, 1998).

These properties require consideration when analyzing instructionally-relevant information obtained from experts. First, automated knowledge may be omitted from procedural explanations (e.g., Cooke & McDonald, 1987; de Groot & Gobet, 1996; Gruber, 1989; Hinds, Patterson, & Pfeffer, 2001). Second, novices process task-relevant information in a fundamentally different way than do experts. These facts indicate that the application of expert-appropriate information during instruction requires modification to be effective for novices (e.g., Kalyuga, Ayres, Chandler, & Sweller, 2003; Kalyuga & Sweller, 2004).

The accuracy of experts’ self-report

Careful analyses of self-report accuracy became a focus in the psychological literature in the late 1970s and early 1980s (e.g., Ericsson & Simon, 1980; Nisbett & Wilson, 1977). Most of the evidence pertained to the accuracy of participants’ descriptions of how they completed novel problem-solving and artificial grammar tasks. These tasks intentionally precluded the use of expertise in order to prevent the confounding of veridical self-reports with pre-existing theoretical explanations (e.g., Berry & Broadbent, 1984; Broadbent, 1977; Maier, 1931; Reber, Kassin, Lewis, & Cantor, 1980). However, the patterns in the self-report data indicated that errors and omissions increased as skills improved. Discussions of this trend attribute it to the cognitive dissociation between procedural and declarative knowledge (e.g., Squire, Knowlton, & Musen, 1993). Because the dissociation is robust, and because there is no evidence that the relationship between declarative and procedural knowledge differs between experts and non-experts, there is no reason to expect the accuracy of introspection to increase when experts perform tasks within their domain. Nevertheless, many current studies accept experts’ explanations of their problem-solving processes at face value (e.g., O’Hare, Wiggins, Williams, & Wong, 1998; Seamster, Redding, & Kaempf, 2000; Taylor & Dionne, 2000). The following section reviews studies of inconsistencies in self-reports from both experts and non-experts for attributions, assessments of problem states, and problem-solving strategies.

Attributions for performance outcomes

Cognitive theorists suggest that the majority of cognitive task performance is automated (Bargh & Chartrand, 1999; Bateson, 1972; Baumeister, Bratslavsky, Muraven, & Tice, 1998; Miller, Galanter, & Pribram, 1960; Nørretranders, 1998). However, individuals tend to attribute most, if not all, of their actions to intentional decision making processes (Wegner, 2002). The strength of this belief can lead them unintentionally to fabricate consciously reasoned explanations for their automated behaviors. This situation occurs even when their explanations are incompatible with the reality of events as they occurred (Thompson et al., 2004). Experts frequently provide process explanations for instructional purposes. Therefore, the pressure to attribute successful performance to deliberate and fully controlled processes may be even greater than that experienced by non-experts (McAdams, 2001; McAllister, 1996).

Empirical evidence of such attribution errors by both experts and non-experts is available in a variety of studies (e.g., insight problem solving, Maier, 1931; judgment biases, Tversky & Kahneman, 1974). Maier’s (1931) classic study of insightful problem solving found a consistent pattern of incorrect attributions by participants. When participants reached an impasse and ceased all activity, the experimenter entered the room and provided a very subtle clue. Invariably, after the experimenter again left the room, the participants realized the solution and solved the problem successfully. Nevertheless, during retrospective reports they described solving the problem in a single step following the impasse, and more than two thirds failed to attribute their insight to any action by the researcher. Instead, they identified their continued thought on the problem as the sole source of success. For example, one participant claimed that “having exhausted everything else, the next thing was to swing [a string that was an element of the problem]. I thought of the situation of swinging across a river. I had imagery of monkeys swinging from trees. This imagery appeared simultaneously with the solution. The idea appeared complete” (Maier, 1931; as cited in Nisbett & Wilson, 1977, p. 241).

Experts often attribute their domain-related decisions to the properties of the particular situation. However, empirical studies in many fields indicate that such evaluations are not stable over time, despite unchanging situation properties. In medicine, for example, expert physicians’ diagnoses of identical symptom profiles presented at different times correlated only between .40 and .50 (Einhorn, 1974; Hoffman, Slovic, & Rorer, 1968). Thus, the specifics of the information in each case alone could not have determined the diagnoses. Further, the participants typically attended to only one to four symptoms as cues for diagnosis. Neither the number of decision-relevant details presented to participants, nor the number of details that participants claimed to have considered in making their diagnoses, affected the number of cues to which participants actually attended (Einhorn, 1974).

Similarly, expert clinical neuropsychologists estimated premorbid intellectual function (i.e., IQ) by evaluating hypothetical patient profiles and explaining their reasoning processes (Kareken & Williams, 1994). Participants first reported the correlation between various predictor variables (e.g., education, occupation, gender, age, and race) and IQ. Then, they generated estimates of IQ scores from values of the predictor variables provided for a set of fictitious patients. The experts stated explicitly that IQ would correlate at specific levels for each variable. However, their estimates for the supplied cases demonstrated significant departures from those values. Many were completely uncorrelated. In these cases, performance clearly relied on processes that were dissociated from participants’ articulated beliefs. Thus, any instruction based on their expressed attributions would not have assisted students to replicate the experts’ performances.

Assessments of problem states

Schemas derived from extensive experience serve as stable mental models for efficient evaluation and encoding of relevant events (Anzai & Yokoyama, 1984; Bainbridge, 1981; Biederman, 1995; Clement, 1988; Larkin, 1983). Although these refined mental models are highly adaptive for problem solving, they can subsequently interfere with the accurate recall of problem-solving situations. Long-term memory may not retain details that do not readily map onto those models (Wigboldus, Dijksterhuis, & van Knippenberg, 2003). As a result, errors of generalizability and rationalization may surface in a retrospective account of the event (Nisbett & Wilson, 1977; Wilson & Dunn, 2004). Reports may be inaccurate when participants rely on incorrect preexisting causal theories to explain their processes (Wilson & Nisbett, 1978). Several empirical studies provide direct evidence of incorrect beliefs when mental models do not match actual task parameters.

Non-expert participants did not recall information in a task environment that was irrelevant to their problem solving approach (Logan, Taylor, & Etherton, 1996). Despite performing at better than chance levels on a recognition task for that information, their free recall did not reflect awareness of the exposure during training. Participants attended to the information during the training, as evidenced by relative success in the recognition task. However, they did not encode it as part of an episodic memory for the problem or solution. Information that fell outside the structure of the participants’ mental models went unreported, regardless of its actual relevance to the task.

Studies of implicit learning also report that participants commonly fail to articulate relevant cues presented in problem states. A variety of tasks presenting subtle, consistent patterns in stimuli demonstrate that these patterns trigger procedural responses. When the patterns were altered, performance levels dropped significantly, despite participants’ inability to articulate the strategies that relied on the patterns (e.g., Lewicki, Hill, & Czyzewska, 1992; Ling & Marinov, 1994). Examples include learning to control simulated economies and sugar refineries: following task mastery, participants’ ability to articulate the relevant governing principles correlated negatively with their skill levels (Berry & Broadbent, 1984; Broadbent, 1977; Broadbent, Fitzgerald, & Broadbent, 1986). Thus, skill level does not necessarily predict participants’ explicit knowledge.

Similarly, the metacognitive selection of strategies during non-experts’ problem solving can occur implicitly (Reder & Schunn, 1996; Schunn, Reder, Nhouyvanisvong, Richards, & Stroffolino, 1997). Previous exposure to similar problems predicted participants’ strategy selection more accurately than their self-reported attempts at deliberate strategy selection. This pattern was robust, even when participants were unaware of acquiring knowledge during previous problems.

Research on the articulation of problem-solving processes by expert nurses suggests that this phenomenon remains stable at expert levels of knowledge and skill (Crandall & Getchell-Reiter, 1993). Participants were 17 registered nurses who were highly trained and experienced in neonatal intensive care. Participants averaged 13 years of overall experience and 8.1 years specializing in neonatal patients. In unstructured interviews eliciting free recall, researchers asked the participants to provide highly detailed accounts of critical incidents in which the nurses believed they had significantly impacted a patient’s medical outcome. Interviewers emphasized the need for the nurses to be as specific as possible about the assessment parameters, diagnostic cues, and clinical judgments that they used. After the nurses described the events and their decision-making processes in as much detail as they could, the researchers utilized semi-structured knowledge elicitation probes to identify additional relevant information that was not articulated. Analysis of the transcripts revealed that the structured probes elicited more indicators of medical distress in the patients. Before the use of the probes, the cues the nurses relied on were either omitted from their accounts or described only vaguely as “highly generalized constellations of cues” (p. 50).

Comparison of the elicited cues to those described in the medical and nursing literature provided strong evidence that the nurses’ statements were not confounded by their theoretical knowledge. More than one-third of the individual cues (25 out of 70) used to correctly diagnose infants across the most commonly reported form of infant distress were not listed in any of the existing medical research or training literature. These cues comprised seven previously unarticulated categories that were subsequently incorporated into training for nurses entering neonatal intensive care (Crandall & Gamblian, 1991).

Selection and use of strategies

Extensive practice using procedures to solve problems in a specific domain may lead experts to automate significant portions of their skills. Consequently, the most frequently employed elements—presumably those of greatest utility within a domain of expertise—would be the most difficult to articulate through recall. For example, expert physicists provided predictions of object trajectories accompanied by written explanations of the means by which they reached their conclusions. However, when researchers used the explanations in an attempt to replicate the physicists’ predictions, their results differed significantly from the predictions of the experts in the study (Cooke & Breedin, 1994).

Similarly, a team of engineers and technicians with expertise in the assembly of a particular laser failed in their attempt to generate a complete set of assembly instructions. This occurred despite extensive and repeated efforts to include every relevant fact, process, and heuristic (Collins, Green, & Draper, 1985). When scientists attempted to assemble the laser according to those instructions, they were unable to produce a functional result. After multiple consultations, the scientists eventually discovered that the expert team had omitted a necessary step from the instructions. The step turned out to be a universally implemented practice among the engineers and technicians that they had unintentionally failed to articulate.

In a more conventional laboratory study, six computer programming experts solved a series of challenging debugging tasks. They reported their problem-solving processes using a variety of prescribed self-report strategies. Regardless of the approach used for self-report, no single expert successfully articulated more than 53% of the problem solving steps recorded on videotape during their task performances (Chao & Salvendy, 1994).

Some experts freely acknowledge that they are unable to accurately recall aspects of their problem-solving strategies. One researcher observed significant discrepancies between an expert physician’s actual diagnostic technique and the technique that he articulated to medical students. The researcher interviewed the physician to explore the issue. The physician’s explanation for the contradiction was, “Oh, I know that, but you see, I don’t know how I do diagnosis, and yet I need things to teach students. I create what I think of as plausible means for doing tasks and hope students will be able to convert them into effective ones” (Johnson, 1983, p. 81).

In summary, those studies that have examined the accuracy of self-reports as explanations of problem-solving processes have found errors to be prevalent. Therefore, a critical re-examination of the expert’s role as a direct source of knowledge for instruction is necessary. The assertion that inaccuracies or omissions in instructional materials limit their instructional efficacy is generally accepted (Clark & Estes, 1996; Jonassen, Tessmer, & Hannum, 1999). However, the extent to which these errors are avoidable through knowledge elicitation methods other than free recall requires further exploration.

The efficacy of instruction from guided and unguided experts’ self-reports

We are only beginning to understand the impact of guided knowledge elicitation on the accuracy of self-report and its subsequent impact on instructional outcomes (Hoffman, Crandall, & Shadbolt, 1998; Schraagen, Chipman, & Shute, 2000). Currently, very few published studies utilize rigorous experimental or quasi-experimental designs and report statistical analyses (Lee, 2003). However, studies that are available indicate that the use of structured knowledge elicitation techniques (e.g., cognitive task analysis) does yield more effective instruction.

In a study of surgery instruction in a top medical school, an expert surgeon taught a foundational medical procedure (central venous catheter placement and insertion) to first-year medical interns in a lecture/demonstration/practice sequence (Maupin, 2003; Velmahos et al., 2004). The control condition and the experimental condition differed only in the method of knowledge elicitation used to generate the lecture. The control group’s lecture was an unstructured self-report by the expert that is typical of instructional practice in medical schools. For the experimental condition, a cognitive task analysis (CTA) conducted by the researchers provided the contents of the lecture. Both conditions allotted equal time and access to equipment during the lecture, demonstration, and practice segments. The students in each condition completed a written posttest and performed the procedure on multiple human patients during their subsequent hospital work. The performance difference between the mean scores of the two groups was striking. Students in the CTA-based condition improved more from pretest to posttest than those in the control group (3.67 vs. 0.64). They also outperformed the control group on every measure of performance when using the procedure on patients: an observational checklist of steps in the procedure (12.6 vs. 7.5), the number of needle-insertion attempts needed to insert the catheter into patients’ veins (3.3 vs. 6.4), the frequency of required assistance from the attending physician (0% vs. 50%), and time-to-completion for the procedure (15.4 min vs. 20.6 min).

Similarly, Schaafstal, Schraagen, and van Berlo (2000) compared the effectiveness of a pre-existing training course in radar system troubleshooting with a new version generated on the basis of cognitive task analysis. Participants in both versions of the course earned equivalent scores on knowledge pretests. However, after instruction, participants in the CTA-based instructional group solved more than twice as many malfunctions, in less time, as those in the traditional instruction group. In all subsequent implementations of the CTA-based training design, the performance of every student cohort replicated or exceeded the performance advantage over the scores of the original control group.

Another study examined the relative efficacy of three instructional formats for spreadsheet software training (Merrill, 2002). The first condition provided training in a discovery learning format that presented three authentic posttest problems to be solved. An instructor was available to answer questions when asked. The second condition provided direct instruction that explained necessary concepts and procedures, then offered guided demonstrations of each step necessary to complete the authentic problem set. The third condition provided direct instruction based on CTA-elicited strategies of spreadsheet experts. Scores on the posttest problems favored the CTA-based instruction group. Mean performance scores were 34% for the discovery condition, 64% for the guided demonstration condition, and 89% for the CTA condition. Further, the average times-to-completion also favored the CTA group. Participants in the discovery condition required more than the allotted 60 minutes. The guided demonstration participants completed the problems in an average of 49 min., whereas the participants in the CTA-based condition required an average of only 29 min.

In summary, initial results indicate promise for the further improvement of instructional design and curriculum development. However, there are relatively few published studies in this area to date. In each case, the primary independent variable was the manner in which course designers elicited procedural content from subject-matter experts. When researchers utilized a structured knowledge elicitation method to capture a complete and accurate process, the resulting instruction was consistently more effective than alternative approaches to conveying expert knowledge. This trend suggests that unstructured self-reports lack information important for novices and significantly limit their acquisition of new skills.


What are the cognitive properties of expertise, and how do they impact the development of instructional materials predicated on experts’ explanations of their decision-making processes? Experts’ cognition generates outstanding performance in several ways. First, extensive conceptual and strategic knowledge provides an effective framework for understanding relevant principles and task constraints. Second, effective automated procedures allow experts to solve problems quickly and with little effort relative to non-experts. Third, expanded working memory enables experts to consider problems at a higher degree of complexity. These adaptations enhance experts’ abilities to recall accurately the salient details of problem states. However, these elements also inhibit experts’ abilities to articulate completely and accurately the processes they use. Process explanations that are freely recalled often omit information important for novices’ success. When structured knowledge elicitation processes scaffold experts’ explanations, however, the resulting instructional content yields improved student learning outcomes.

These findings suggest a need for several new lines of research. First, the results of the instructional studies reviewed in this paper are dramatic, but limited in scope. Current studies have focused exclusively on small samples of adult learners attempting to master sophisticated skill sets. Further investigations must determine whether these results hold across larger, more diverse samples of learners in a variety of settings (e.g., K-12 and postsecondary). Additionally, future studies may reveal differences in effect size related to the complexity of the instructional content delivered.

Second, future research must identify possible interactions between specific methods of structured knowledge elicitation and the content of experts’ knowledge. Previous reviews (e.g., Cooke, 1994; Ericsson & Simon, 1980; Schraagen, Chipman, & Shute, 2000) have suggested that the specific tools for eliciting knowledge may differentially impact completeness and accuracy. However, only a few studies (e.g., Chao & Salvendy, 1994) have evaluated and compared the effectiveness of specific techniques in a systematic way. A thorough understanding of the strengths and weaknesses of knowledge elicitation methods will maximize the educational benefits of instructional content generated by experts.

Third, the causal mechanisms underlying the instructional impact of errors in experts’ self-reports must be explicated and linked to other research efforts in learning and instruction. For example, errors of omission may hinder instruction through the same mechanism that impedes student learning in discovery-based instruction. By definition, discovery learning requires students to discover or induce principles relevant to the learning objectives that they do not yet know. This class of pedagogical techniques imposes very high levels of cognitive load that can prevent learners from successfully acquiring problem-solving strategies (Chandler & Sweller, 1991; Sweller, Chandler, Tierney, & Cooper, 1990; Tuovinen & Sweller, 1999). Thus, withholding necessary information from students can undermine the effectiveness and efficiency of instruction (Kirschner, Sweller, & Clark, in press; Mayer, 2004). When instruction is based on an expert’s self-report, unrecognized errors of omission may embed unintended discovery learning scenarios in the instructional materials presented to the student.

Similarly, when instruction includes experts’ errors of commission, students can acquire misconceptions. Past research indicates that student misconceptions impair both ongoing performance and future learning (Lohman, 1986; Schwartz & Bransford, 1998). Misconceptions are robust and often resist subsequent correction (Bargh & Ferguson, 2000; Chinn & Brewer, 1993; Thorley & Stofflet, 1996). Thus, preventing the inadvertent communication of misconceptions to students is also essential for effective instruction.

Subject-matter experts are an important source of information for instructional materials. Their extensive knowledge and experiences provide the basis for advancement in every field. However, learners benefit fully from experts’ knowledge only when it is conveyed completely and accurately. Structured knowledge elicitation provides a promising means by which to maximize the benefits of expertise in the development of instruction to enhance students’ problem solving skills.


Task constraints do not include an individual’s intelligence. Data from a number of studies indicate that expert performance is not significantly correlated with measures of general or fluid ability (Ceci & Liker, 1986; Doll & Mayr, 1987; Ericsson & Lehmann, 1996; Hulin, Henry, & Noon, 1990; Masunaga & Horn, 2001).


In the automaticity literature, this property is commonly referred to as ballisticity (Hermans, Crombez, & Eelen, 2000; Logan & Cowan, 1984).


Lee (2003) reports that only 8 of 318 studies met these criteria from a search of the following databases: Applied Science Technology, ArticleFirst, CTA Resource, Dissertation Abstract Index, ED Index, ERIC, IEEE, INSPEC, PsycINFO, and Elsevier ScienceDirect.



The author gratefully acknowledges Dr. Margaret Gredler for her assistance in revising this manuscript. The considerable time and effort she invested was essential to its completion.

Copyright information

© Springer Science + Business Media, Inc. 2006