Introduction

ITS authoring tools aim to make the creation of ITS more efficient and easier to learn. They also often aim to lower the skill level needed to build a tutor and sometimes even to provide design guidance for effective tutors. By now, ITS authoring tools have a long history in ITS research. Important early work was presented in Murray’s (1999) influential overview paper in IJAIED and in a book edited by Murray et al. (2003). More recent work appeared in a special issue of IJAIED (Koedinger and Mitrovic 2009) and a series of workshops organized by the Army Research Laboratory and the University of Memphis (Sottilare et al. 2013, 2014, 2015). There are many ITS authoring tools, each with its own underlying ITS technology, such as ASPIRE (Mitrovic et al. 2009), ASTUS (Paquette et al. 2015), ASSISTments Builder (Razzaq et al. 2009), AutoTutor tools (Nye et al. 2014), CTAT (Aleven et al. 2009b; Koedinger et al. 2004), GIFT (Sottilare 2012), SDK (Blessing et al. 2009b), SimStudent (MacLellan et al. 2014; Matsuda et al. 2015), and xPST (Blessing et al. 2011; Blessing et al. 2009a; Kodavali et al. 2010). These tools have made ITS development easier and more cost-effective and have lowered the skill threshold for building tutors. Some studies have provided estimates of cost savings (Aleven et al. 2009b; Razzaq et al. 2009) or have empirically evaluated the authoring efficiency of ITS authoring tools (Aleven et al. 2006b; Blessing and Gilbert 2008; Devasani et al. 2012; Kodaganallur et al. 2005; MacLellan et al. 2014; Paquette et al. 2010; Razzaq et al. 2009).

Although it is hard to generalize from the current crop of ITS authoring tools, one broad trend we see is that non-programmer approaches to tutor building appear to be winning the day. Given that ITS development has traditionally required specialized programming skill that has hampered widespread development and use of tutors, this trend is not surprising. Many tool sets, including ASPIRE (Mitrovic et al. 2009), ASSISTments (Razzaq et al. 2009), CTAT (Aleven et al. 2009b), SimStudent (Matsuda et al. 2015), and xPST (Kodavali et al. 2010), do not require advanced programming or any programming at all. While it is clear why easy-to-use and easy-to-learn tools are popular, we do need to ask whether non-programmer ITS authoring tools are capable of capturing sophisticated tutoring behaviors that are effective in helping students learn in a wide range of domains.

We focus on a set of authoring tools called CTAT, which stands for Cognitive Tutor Authoring Tools. The project started in 2002 with the goal of making a particular type of ITS, namely, Cognitive Tutors, easier to develop. A second goal was to lower the skill threshold for building these kinds of tutors. At the time, a substantial number of Cognitive Tutors had been built and were in regular use in many schools in the United States. They had been shown to lead to substantial increases in students’ learning (Anderson et al. 1995; Koedinger et al. 1997), a finding that, with a few exceptions, was reproduced in later classroom studies (Koedinger and Aleven 2007; Ritter et al. 2007), including one conducted by a third party with 17,000 students in 150 schools (Pane et al. 2013). Although our initial goal was to facilitate development of Cognitive Tutors (Koedinger et al. 2004), along the way the project took an unexpected turn, described in our 2009 article in IJAIED entitled “A new paradigm for intelligent tutoring systems: Example-tracing tutors” (Aleven et al. 2009b). We created a novel tutor type, together with a novel non-programmer approach to tutor authoring. We called the new tutor type example-tracing tutors, by analogy to model-tracing tutors (Anderson et al. 1995). Whereas model-tracing tutors use a generalized rule-based cognitive model to interpret student behavior, example-tracing tutors use generalized examples of problem-solving behavior. This choice sets them apart from many other ITS, as it is more common to use representations of general domain knowledge, such as rules (Anderson et al. 1995; Koedinger and Corbett 2006) or constraints (Mitrovic and Ohlsson 1999). Example-tracing tutors can be created without programming and – as we argued in our 2009 IJAIED article and continue to maintain today – sophisticated tutoring behaviors can be authored for a broad range of task domains. Further, we presented evidence that even in relatively small tutor-building projects, building example-tracing tutors can be 4–8 times as cost-effective as suggested by estimates from tutor-building projects reported in the literature. Across projects, we saw roughly a 1:50 to 1:100 ratio of instructional to development time, compared to earlier estimates of 1:200 to 1:300. Additional cost savings come from the fact that tutors can be built without employing expensive programmers. A similar study with the ASSISTments Builder, another non-programmer ITS authoring tool, demonstrated development ratios of 1:28 to 1:40 (Razzaq et al. 2009). (These estimates are not directly comparable across the two tool sets, because the tools support different types of tutors and because the CTAT study included a wider range of development activities.) These studies show that ITS authoring tools for non-programmers can make authoring dramatically more efficient and cost-effective.

In the current article, we first give an overview of example-tracing tutors and of new extensions added to CTAT since 2009. We then revise and bolster the argument first presented in our 2009 IJAIED article that example-tracing tutors are ITS, along the way highlighting the many ways in which they are “adaptive” to learner needs and differences. To illustrate the scope of systems built with CTAT, we present 18 examples of example-tracing tutors, all of which have been used in real educational settings. (Screenshots of these tutors are shown in the Appendix.) We hope that together these arguments present a convincing case that example-tracing tutors are a useful and effective ITS paradigm and, more generally, that non-programmer authoring tools can be used to develop genuinely sophisticated tutors.

Overview of Example-Tracing Tutors and CTAT

CTAT supports two tutor paradigms: example-tracing tutors (Aleven et al. 2006c, 2009b; Koedinger et al. 2004) and rule-based Cognitive Tutors (Aleven 2010; Aleven et al. 2006b; Anderson et al. 1995; Koedinger et al. 1997). In this article, we focus on example-tracing tutors, as they were the focus of our IJAIED 2009 article. The vast majority of tutors built with CTAT have been example-tracing tutors, because they are easier to author and debug than rule-based Cognitive Tutors. Using CTAT, example-tracing tutors can be built and deployed entirely without programming. Table 1 gives an overview of the functionality supported by CTAT and its associated learning management system, the Tutorshop. To give a sense of the scope of the project: we started building CTAT in 2002. Over the years, at least 54 people have contributed to CTAT and the Tutorshop; these two systems together represent approximately 60 man years’ worth of work, the large majority of it by professional software engineering staff. Today, the CTAT/Tutorshop code base comprises 750,000 lines of code (including blank lines and comments).

Table 1 Overview of CTAT/Tutorshop functionality; all functions are supported without programming, except where otherwise indicated

Authoring Process

The process of authoring an example-tracing tutor has six key parts, often carried out in iterative cycles (and not necessarily in sequence). Our 2009 article in IJAIED (Aleven et al. 2009b) describes the process in greater detail. First, the author or authoring team investigates student thinking and learning in the given task domain, using cognitive task analysis (de Baker et al. 2007; Clark et al. 2008; Lovett 1998) and educational data mining methods. Second, informed by the results of the first step, the author designs and creates one or more tutor interfaces. These interfaces tend to be specific to the problem types for which the tutor will provide tutoring; they break down complex problem solving into steps. Using CTAT, tutor interfaces can be built through drag-and-drop techniques within an existing interface builder, such as the Flash IDE (Fig. 1, right). Other options for the tutor front end are supported as well (see Table 1).

Fig. 1 Authoring an example-tracing tutor with CTAT

Third, the author creates the generalized examples needed by the tutor. She demonstrates problem-solving steps on the interface (Fig. 1, middle), which CTAT’s Behavior Recorder tool records in a behavior graph (Fig. 1, left); the links in the graph represent problem-solving steps. During tutoring, the tutor evaluates student actions by comparing them against the graph. The graph is therefore expected to capture all solution strategies that are reasonable within the given interface, so that the tutor can recognize all of them as correct student behavior. Different ways of solving the problem are captured as different paths in the graph. Once the relevant paths have been recorded, the author generalizes the behavior graph to indicate the range of student behavior that the graph stands for, beyond just the literal paths recorded in the graph, with steps in exactly the order demonstrated. For example, an author can provide input matchers that specify a range of values or notational variations to be accepted for a given step. She can also attach formulas (akin to Excel formulas) that specify how steps depend on each other, or can define ordered or unordered groups of steps, nested if needed, to indicate which step orders are acceptable. Solution paths that involve the same steps but different values can be captured with a single path with attached formulas, as a way of keeping the graph manageable.
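
To make the structure of a generalized behavior graph concrete, the following sketch shows how such a graph might be represented as a data structure. It is only an illustration: the type names, matcher kinds, and fields are our own shorthand, not CTAT’s actual internal or file format.

```typescript
// Illustrative sketch of a generalized behavior graph (names are not CTAT's own).

// Ways an author might generalize what counts as matching input for a step.
type InputMatcher =
  | { kind: "exact"; value: string }                 // match the demonstrated value literally
  | { kind: "range"; min: number; max: number }      // accept a range of numeric values
  | { kind: "anyOf"; values: string[] }              // accept notational variants, e.g., "1/2" or "0.5"
  | { kind: "formula"; expression: string };         // value computed from other steps (akin to an Excel formula)

// One problem-solving step, recorded as a link in the graph.
interface StepLink {
  id: string;
  selection: string;       // the interface component the step is performed on
  action: string;          // what is done to it, e.g., "UpdateTextField"
  input: InputMatcher;     // which student inputs count as this step
}

// Ordered or unordered groups of steps, nested if needed, declare acceptable step orders.
interface StepGroup {
  ordered: boolean;
  members: (StepLink | StepGroup)[];
}

// A generalized behavior graph: alternative solution paths for one problem.
interface BehaviorGraph {
  problemName: string;
  paths: StepGroup[];      // different ways of solving the problem
}
```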

Fourth, an author annotates the graph. She can attach hints to links in the graph, which recommend the action on the given link as an appropriate next action to take; hints also typically explain why that action is a good next action, in terms of the domain’s problem-solving principles. During tutoring, these hints will be displayed at the student’s request. To make the tutor display feedback messages in response to specific student errors, an author can insert incorrect action links into the graph, each with a feedback message specific to the given error. To support assessment of student knowledge through Bayesian Knowledge Tracing (Corbett and Anderson 1995), an author may attach knowledge component (KC) labels to the links in the graph. Knowledge components are the smallest units into which the knowledge to be learned can be decomposed (Aleven and Koedinger 2013; Koedinger et al. 2010, 2012).
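
The annotations described above can be pictured as additional fields on the links of the graph. Again, this is a hypothetical sketch rather than CTAT’s actual representation.

```typescript
// Illustrative sketch of an annotated behavior-graph link (field names are hypothetical).
interface AnnotatedLink {
  selection: string;
  action: string;
  input: string;
  correct: boolean;        // false for an inserted incorrect-action link
  hints?: string[];        // ordered hint levels, shown at the student's request
  errorFeedback?: string;  // error-specific message, for incorrect-action links
  kcLabels?: string[];     // knowledge components, used for Bayesian Knowledge Tracing
}

// Example: a correct step with two hint levels and one KC label.
const addNumerators: AnnotatedLink = {
  selection: "sumNumerator",
  action: "UpdateTextField",
  input: "5",
  correct: true,
  hints: [
    "You have found a common denominator. What goes in the numerator of the sum?",
    "Add the converted numerators: 2 + 3.",
  ],
  kcLabels: ["add-fractions-numerator"],
};
```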

Fifth, to make it easier to create multiple practice problems of the same type, CTAT supports a template-based feature called “Mass Production.” To use it, an author first turns a behavior graph into a template by replacing problem-specific values with variables. She can then define many problems by specifying their values for the template variables in a spreadsheet. A final merge step automatically substitutes the spreadsheet values into the template’s variables and generates a behavior graph for each problem.
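
The following sketch illustrates the Mass Production idea of merging spreadsheet values into template variables. The %(variable)% syntax and the function shown are illustrative assumptions, not CTAT’s actual implementation.

```typescript
// Minimal sketch of a Mass Production merge step (syntax and names are assumptions).
function mergeTemplate(template: string, row: Record<string, string>): string {
  // Replace every %(variableName)% in the template with the value from one spreadsheet row.
  return template.replace(/%\(([^)]+)\)%/g, (_match, name) => {
    const value = row[name];
    if (value === undefined) {
      throw new Error(`No value for template variable "${name}"`);
    }
    return value;
  });
}

// One template, many problems: each row yields one problem-specific behavior graph.
const template = "Add the fractions %(num1)%/%(den1)% + %(num2)%/%(den2)%.";
const rows = [
  { num1: "1", den1: "4", num2: "1", den2: "2" },
  { num1: "2", den1: "3", num2: "1", den2: "6" },
];
const problems = rows.map((row) => mergeTemplate(template, row));
```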

Finally, when the interface and behavior graph are ready for initial testing with students, the author uploads the tutor to a deployment environment, for example, the Tutorshop, described below. After one or more rounds of pilot testing and revision, preferably with students from the target population, the author performs the final edits and uploads the tutor to the final production environment.

Inner Loop: Using a Behavior Graph to Provide Tutoring

At student run time, CTAT’s built-in example-tracing algorithm (Aleven et al. 2009a) takes care of the tutor’s inner loop (a.k.a. step loop; VanLehn, 2006; this issue). That is, it provides step-level guidance within each problem, including step-level correctness feedback, on-demand hints, error-specific feedback messages, and assessment of knowledge based on Bayesian Knowledge Tracing (Corbett and Anderson 1995) (see Table 1). Essentially, the example-tracing algorithm evaluates whether the student’s solution steps conform to one or more solution paths implied by the generalized behavior graph. CTAT now provides a number of feedback policies, including on-demand feedback and no-feedback (for implementing online tests). At the student’s request, the algorithm provides hints by looking for an appropriate choice of next step within the graph, taking into account the solution path(s) the student may be on. CTAT also supports a number of hint selection policies, which give an author some control over how the tutor decides which of multiple possible next steps to recommend in a hint. It provides an error feedback message when the student’s action matches one of the incorrect action links in the graph. Finally, CTAT supports input substitution, meaning that the tutor can transform student input in an author-specified way and echo the transformed version back to the student; this feature is useful for arithmetic calculations, spell checking, etc.
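
A drastically simplified sketch of this inner-loop behavior is shown below: correctness feedback and on-demand hints computed from the paths of a behavior graph. Real example tracing also handles unordered step groups, formulas, value ranges, and error-specific feedback (Aleven et al. 2009a); the sketch assumes literal paths whose steps are taken in order.

```typescript
// Simplified sketch of inner-loop evaluation against a behavior graph (assumes
// literal, in-order paths; the real algorithm is considerably more general).
interface Step { selection: string; action: string; input: string; }
interface Link extends Step { hints: string[]; }
type Path = Link[];

const sameStep = (a: Step, b: Step) =>
  a.selection === b.selection && a.action === b.action && a.input === b.input;

// The solution paths still consistent with the steps completed so far.
const livePaths = (paths: Path[], completed: Step[]) =>
  paths.filter((p) => completed.every((done, i) => p[i] !== undefined && sameStep(p[i], done)));

// Correctness feedback: the attempt must match the next link on some live path.
function traceStep(paths: Path[], completed: Step[], attempt: Step): boolean {
  const next = completed.length;
  return livePaths(paths, completed).some((p) => p[next] !== undefined && sameStep(p[next], attempt));
}

// On-demand hint: recommend a next step on one of the paths the student may be on.
function nextHint(paths: Path[], completed: Step[]): string | undefined {
  const next = completed.length;
  const link = livePaths(paths, completed).find((p) => p[next] !== undefined)?.[next];
  return link?.hints[0];
}
```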

Beyond these basic behaviors, CTAT’s example-tracing technology supports a range of additional inner loop behaviors (see Table 1). Whereas CTAT originally supported tutors for individual learning only, we recently added support for authoring collaborative example-tracing tutors (Olsen et al. 2014b), described in more detail below. We also added support for gradually fading worked examples, based on research that demonstrated the effectiveness of this way of transitioning from examples to problem solving (e.g., Atkinson et al. 2003; Renkl et al. 2003; Salden et al. 2010a). Further, an author can create dynamic interfaces whose components change depending on the problem state or in response to specific student actions. An author can do so by inserting links in the behavior graph that represent tutor-performed actions such as updating, showing, moving, or hiding interface components. This basic functionality can be used to craft a wide range of interface behaviors, such as revealing steps gradually as the student progresses through the problem, creating dynamically linked representations, or breaking down a problem into steps when the student makes an error (cf. Heffernan and Heffernan 2014).
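
As an illustration, a tutor-performed action link might be represented roughly as follows; the names and action vocabulary here are hypothetical, not CTAT’s actual component or action names.

```typescript
// Illustrative sketch of a tutor-performed action link in a behavior graph.
interface TutorPerformedAction {
  performedBy: "tutor";                 // performed by the tutor, not the student
  selection: string;                    // the interface component to act on
  action: "Show" | "Hide" | "Update" | "Move";
  input?: string;                       // e.g., a new value for an Update action
}

// Example: after the student completes the preceding step, reveal the next panel.
const revealSimplifyPanel: TutorPerformedAction = {
  performedBy: "tutor",
  selection: "simplifyFractionPanel",
  action: "Show",
};
```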

Learner Modeling

CTAT and the Tutorshop support a learner model that records the probability that the given student has mastered each of the KCs targeted in the given problem set. An author defines a KC model for an example-tracing tutor by annotating the links of a behavior graph with KC labels, as described above. The learner model is updated (in the inner loop) through Bayesian Knowledge Tracing (Corbett and Anderson 1995) and is used (as one of the outer loop options) for adaptive problem selection and cognitive mastery. As a form of open learner modeling (Bull and Kay 2010), CTAT provides an interface component that displays the learner model in the form of a skill meter (see Fig. 1, middle, in the bottom right corner of the tutor interface). An author can define an initial KC model based on cognitive task analysis and refine it later based on analysis of tutor log data, for example using tools for learning curve analysis made available in the DataShop (Koedinger et al. 2010). Refined KC models can lead to more effective or efficient instruction (for an overview, see Aleven and Koedinger 2013). As a way of generalizing CTAT, we are exploring support for plugging in custom student models.
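
For reference, the standard Bayesian Knowledge Tracing update (Corbett and Anderson 1995) that drives this learner model can be sketched as follows; the function and parameter names are ours, but the formula is the standard one, applied per KC after each first attempt at a step.

```typescript
// Standard Bayesian Knowledge Tracing update for one knowledge component.
interface KcParams {
  slip: number;     // P(incorrect answer | KC known)
  guess: number;    // P(correct answer | KC not known)
  transit: number;  // P(KC learned at this opportunity)
}

function bktUpdate(pKnown: number, correct: boolean, p: KcParams): number {
  // Posterior probability that the KC was already known, given the observed attempt.
  const conditional = correct
    ? (pKnown * (1 - p.slip)) / (pKnown * (1 - p.slip) + (1 - pKnown) * p.guess)
    : (pKnown * p.slip) / (pKnown * p.slip + (1 - pKnown) * (1 - p.guess));
  // The student may also have learned the KC at this opportunity.
  return conditional + (1 - conditional) * p.transit;
}

// Example: one correct first attempt on a step labeled with a single KC.
const updated = bktUpdate(0.3, true, { slip: 0.1, guess: 0.2, transit: 0.15 });
```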

Outer Loop

The Tutorshop supports a number of outer loop (a.k.a. task loop) options. First, it supports cognitive mastery learning based on Bayesian Knowledge Tracing, a form of personalized problem selection that has proven to be very effective (Corbett et al. 2000). It also supports randomized problem selection, which is sometimes useful for research purposes. For example, when problems are presented in random order (randomized separately for each student), then in offline analyses of tutor log data, the difficulty of problem steps can be assessed without being confounded by the order in which the problems were presented. One of our goals for the near future is to extend CTAT so it can support the easy plugging in of custom problem selection policies and mastery criteria.
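
A minimal sketch of cognitive mastery problem selection is shown below. The 0.95 mastery threshold is the value conventionally used with Bayesian Knowledge Tracing; the selection heuristic itself (prefer the problem with the most unmastered KCs) is an illustrative simplification, not necessarily the Tutorshop’s actual policy.

```typescript
// Illustrative sketch of mastery-based problem selection.
const MASTERY_THRESHOLD = 0.95;

interface Problem { name: string; kcs: string[]; }

function selectNextProblem(
  problems: Problem[],
  pKnown: Map<string, number>   // learner model: KC name -> probability of mastery
): Problem | undefined {
  const unmastered = (p: Problem) =>
    p.kcs.filter((kc) => (pKnown.get(kc) ?? 0) < MASTERY_THRESHOLD).length;
  // Only consider problems that still practice at least one unmastered KC.
  const candidates = problems.filter((p) => unmastered(p) > 0);
  if (candidates.length === 0) return undefined;   // cognitive mastery reached
  // Prefer the problem that practices the most unmastered KCs.
  return candidates.reduce((best, p) => (unmastered(p) > unmastered(best) ? p : best));
}
```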

Tutor Front End

CTAT currently supports three options for the tutor front end (i.e., the interface in which the students solve problems, with the tutor’s guidance), namely, Java, Flash/ActionScript, and HTML5/Javascript. For the first two options, drag-and-drop interface building is supported. For the last of these, HTML5/Javascript, interfaces need to be created by writing code, although we hope to support drag-and-drop interface building in the near future. A great deal of our effort in recent years went into CTAT’s front-end technology, mainly to keep up with changes in web-based technologies. Since 2009, we have revised or re-implemented our interface technologies three times. First, we updated the look and feel of the tutor interfaces in our ActionScript 2 code base, in line with the design aesthetic for Mathtutor (Aleven et al. 2009a). Second, we moved to ActionScript 3, a substantial reimplementation effort, as ActionScript 3 provides important enhanced functionality but is not backwards compatible. Finally, we completed an HTML5/JavaScript implementation, which enables us to offer a truly cross-platform ITS approach. We expect this to become the go-to option for CTAT tutors, given that Flash is not supported on all web client platforms and is on the decline. Keeping up with changing interface technology would not have been possible without a factored architecture that strictly separates the tool and tutor modules (Ritter and Koedinger 1996), described below.

Delivery and Deployment

Much of our effort since 2009 has also gone into supporting ways to deliver CTAT tutors in a variety of e-learning platforms, including MOOCs. We view this capability as important for ITS authoring tools and for making ITS technology widespread (Aleven et al. 2015b).

So far, the Tutorshop has been the go-to platform for deploying CTAT tutors. The Tutorshop is a web-based content management and learning management system we created for tutor use in classrooms and other settings. It is implemented in Ruby on Rails. Tutorshop’s learning management facilities include management of class lists, student and teacher accounts, assignments, and a wide range of reports for students and teachers regarding student progress and learning. Tutorshop has been used in many research projects. Also, it functions as the backbone of the Mathtutor website (Aleven et al. 2009a) as well as the Genetics Tutor (Corbett et al. 2010). We currently host the Tutorshop as a service to the research community. For large-scale studies, we sometimes run Tutorshop in the Amazon cloud (AWS). Although Tutorshop has not been used to deliver non-CTAT tutors, this would be a useful way in which the CTAT architecture could be generalized. We have also made it possible to embed example-tracing tutors in MOOC and e-learning platforms that adhere to the SCORM and LTI e-learning interoperability standards, as many do. To this end, the CTAT/Tutorshop platform now implements the provider/content side of SCORM and LTI. As evidence of this capability, we have embedded CTAT tutors in Moodle (Rice 2011) and in two edX MOOCs (Aleven et al. 2015b). We have also achieved custom integration with edX through edX’s XBlock API and with the Open Learning Initiative (OLI)’s learning management system (http://oli.cmu.edu).

As part of our effort to support tutoring at scale, we have moved the example-tracing tutor engine (which takes care of the tutor’s inner loop) from the server to the client, a change of direction since our 2009 IJAIED paper. Although there are a number of advantages to having the tutor engine on the server, discussed in the 2009 paper, doing so complicates the embedding of tutors in external e-learning platforms or learning management systems (LMSs). With a server-side tutor engine, a tutor is not a fully self-contained learning object. Further, with very large numbers of users, a server-based tutor engine could incur severe server load. Therefore, we now support a variety of options for running the tutor engine on the client, including a Java Web Start option, a Java applet option, and, most recently, a JavaScript version of the example tracer (Aleven et al. 2015b). We expect the latter to become the go-to option for example-tracing tutors in a variety of deployment options.

We see this work as an encouraging first step towards tutoring at scale (e.g., in MOOCs; Aleven et al. 2015b; Cook et al. 2015; Kay et al. 2013), although work remains to be done. For example, with the current versions of SCORM or LTI, the rich analytics produced by the tutor cannot easily be sent to a learning management system. This information is therefore not displayed in the grade book of the online course or integrated in existing student or instructor dashboards, a lost opportunity to leverage the advanced capabilities of ITS. Fortunately, newer versions of SCORM and LTI are moving toward richer data exchange between a tutor and the LMS, so this limitation may be addressed in the near future. Also, work remains to be done to make CTAT’s learner model, as well as adaptive task selection, available within a MOOC.

Support for Research

CTAT tutors, typically running out of the Tutorshop, have been used in many dozens of scientific experiments to investigate questions regarding how tutors can most effectively support learning or other desirable educational outcomes. CTAT and Tutorshop offer some functionality that facilitates the use of tutors in such research. First, CTAT is fully compatible with DataShop (Koedinger et al. 2010), a large, open repository for educational technology data sets that supports offline analysis of tutor log data. All CTAT tutors log in DataShop format, without any additional effort from authors. DataShop supports many analyses geared towards data-driven refinement of the KC models underlying tutors (see e.g., Aleven and Koedinger 2013). As of July 2015, DataShop contains 290 data sets generated by CTAT tutors, approximately 40 % of the total number of data sets in DataShop, with roughly 48,000,000 transactions by 44,000 students, working for a total of 62,000 student hours. Second, Tutorshop supports assigning students to experimental conditions (although not yet automatically) and recording experimental conditions in the log data. Finally, we are completing a Log Replayer capability, which will make it possible to replay logs through (typically) an extended version of the tutor that generated them, so as to write new information into the logs. This replay will support new analyses of log data not originally foreseen by the creators of the given tutor, as seems to be commonplace in educational data mining (e.g., Aleven et al. 2006a; Harpstead et al. 2015).

Architecture

CTAT is conceived not just as a tool, but as a factored architecture for tutoring, with well-defined components and interfaces between those components (Aleven et al. 2009a, b, 2015a, b). In particular, CTAT and the Tutorshop enforce a strict separation between “tool” and “tutor.” Here “tool” means the problem-solving interface or environment in which the student solves problems; “tutor” means the tutor backend, both the inner and outer loop (Koedinger et al. 1999; Ritter and Koedinger 1996). In CTAT, the tool and tutor communicate through a message protocol that derives from the work of Ritter and Koedinger (1996) (http://ctat.pact.cs.cmu.edu/index.php?id=tool-tutor). This aspect of the factored architecture has been very valuable. It has made it possible to mix-and-match options for the tutor engine (example-tracing or model-tracing tutor) with options for the tutor interface (Java, Flash/ActionScript, or HTML5/Javascript) or problem-solving environment (e.g., simulators). Critically, it has also made it possible to keep up with interface and web technology changes, and it has made it easier to extend the front-end technology with new interface components. Finally, the tool/tutor separation has made it easier to deploy tutors in a wide range of environments, as discussed above.
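
The flavor of this tool/tutor message exchange can be sketched roughly as below. The message shapes are simplified assumptions for illustration (the actual protocol is documented at the URL above); the key idea is that the tool reports student actions as selection-action-input triples and the tutor replies with feedback, hints, and tutor-performed actions.

```typescript
// Hypothetical, simplified message shapes across the tool/tutor boundary.
type ToolToTutor = {
  type: "StudentAction";
  selection: string;   // which interface component was acted on
  action: string;      // what was done, e.g., "UpdateTextField"
  input: string;       // the value entered or chosen
};

type TutorToTool =
  | { type: "CorrectnessFeedback"; selection: string; correct: boolean }
  | { type: "ShowHint"; text: string }
  | { type: "ShowErrorFeedback"; text: string }
  | { type: "TutorPerformedAction"; selection: string; action: string; input?: string };
```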

A second aspect of modularity in the CTAT/Tutorshop architecture is the separation between inner loop (within-problem tutor guidance) and outer loop (between-problem tutor guidance). The inner loop and outer loop communicate strictly through the student model (Aleven et al. 2015a). That is, at the beginning of each problem, the outer loop passes the learner model to the inner loop. As the student works on the problem, the inner loop updates the learner model (displaying it as a skill meter, if the author so chooses). At the end of the problem, it passes the updated student model back to the outer loop, so it can be used for adaptive problem selection or stored in the Tutorshop database. This separation facilitates the plugging in of different options for student modeling and task selection, an area where we are just beginning to gain experience.
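
The contract implied by this separation can be summarized in a small interface sketch (hypothetical names): the outer loop hands the learner model to the inner loop when a problem starts and receives the updated model back when the problem ends.

```typescript
// Illustrative sketch of the inner/outer loop handoff via the student model.
type LearnerModel = Map<string, number>;   // KC name -> probability of mastery

interface InnerLoop {
  startProblem(problemName: string, model: LearnerModel): void;
  finishProblem(): LearnerModel;           // returns the updated model to the outer loop
}

interface OuterLoop {
  selectNextProblem(model: LearnerModel): string | undefined;
  saveModel(studentId: string, model: LearnerModel): void;   // e.g., store in the Tutorshop database
}
```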

We see a number of ways in which the CTAT architecture could be generalized further, so that it can be more versatile and support greater interoperability. For example, the Behavior Recorder has a range of functionality that supports cognitive task analysis and tutor testing in a way that is not specific to example-tracing tutors. A generalized behavior recorder API would make this tool available for use with other tutor types (Aleven et al. 2015a). Likewise, the Tutorshop could be useful for other tutors. A further promising direction is to make it possible to plug in different student models, different algorithms for updating the student model, and different policies for task selection. Finally, additional extensions for supporting research (e.g., A/B testing) would be useful as well.

ITS Authoring Tool Development Philosophy

In developing CTAT we took, and continue to take, a use-driven design approach. We credit this approach with making CTAT useful and usable. The essence of this approach is that we have made it a high priority to promote and support use of CTAT by others, to learn from users’ experiences, and to make sure that what we learned helped shape the tools. We have regularly solicited feature requests and feedback from users. When planning for new releases of CTAT, we invariably prioritized features based on the question: “Who is going to use it?” If we could not identify specific users, the feature would not make it into the release. We provide online documentation and tutorials (http://ctat.pact.cmu.edu). Further, we have made efforts to build a user community, by setting up an online user forum (http://groups.google.com/groups/ctat-users) and by holding yearly summer schools where people can learn about ITS with hands-on work in CTAT. We have also used the tools extensively ourselves to build tutors used in classrooms, primarily in research projects (Aleven et al. 2009a; Forlizzi et al. 2014; Long and Aleven 2013a, b; McLaren et al. 2008, 2011a, b, 2012, 2014, 2015a, b, 2016; Olsen et al. 2014a, b; Rau et al. 2013, 2014, 2015a; Stampfer and Koedinger 2013; Waalkens et al. 2013; Wylie et al. 2011). “Eating our own dog food” (i.e., using our own tools for our own ITS research projects) has helped us spot opportunities for improvement and has driven development of a number of new CTAT features. We have always kept the cost-effectiveness of authoring in mind. Before deciding to implement a new feature, we typically ask: How often will this feature be used and how much time will it save authors? Further, to support cost-effective authoring, CTAT takes advantage of existing tools, including Flash and Eclipse Window Builder for building interfaces and Microsoft Excel in CTAT’s Mass Production process, described above.

Should Example-Tracing Tutors be Considered ITS?

In our 2009 paper (Aleven et al. 2009b), we argued that example-tracing tutors should be viewed as ITS. This argument was based on VanLehn’s (2006, 2011) criterion that what distinguishes ITSs from other forms of computer tutors is that they have an inner loop (i.e., provide step-level guidance to learners, rather than feedback only at the end of each problem). We update and bolster this argument for two reasons: First, a recent chapter by Pavlik et al. (2013) questioned whether example-tracing tutors should be viewed as intelligent tutors, on the grounds that they might be similar to “programmed instruction.” Second, although VanLehn’s (2006) criterion has much going for it, we now more clearly recognize the limitations of this criterion. The issue is important, because how we position our systems vis-à-vis other kinds of learning technologies may influence public perception, acceptance, and eventual widespread adoption of ITS.

Although experts do not agree about how to define ITS (Woolf 2009, p. 21), the crux may be how adaptive to student needs and student differences (e.g., in knowledge or motivation) the tutoring system is in ways that enhance student learning, motivation, or other desirable outcomes. In defining adaptivity, we believe it is appropriate to look at the system’s behavior and the effect that the system’s behavior has on the student experience, in line with how Newell and Simon (1976) view intelligence:

By ‘general intelligent action’ we wish to indicate … that in any real situation behavior appropriate to the ends of the system and adaptive to the demands of the environment can occur …

A strength of VanLehn’s (2006, 2011) criterion (i.e., an ITS is a tutoring system that has an inner loop) is that it emphasizes adaptive behavior as a hallmark of intelligence, in line with Newell and Simon (1976). Also, it aligns with key empirical evidence, namely, that systems with an inner loop tend to have a stronger positive effect on student learning than systems without (VanLehn 2011). On the other hand, this criterion is not without its shortcomings. Step-based guidance may not be very adaptive if the tutor can recognize only one particular set of steps for each problem. Also, certain desirable forms of adaptivity may not easily be viewed as step-level support, such as reacting to student affect or adaptively selecting problems in the system’s outer loop. Furthermore, a number of systems that have a legitimate claim to being adaptive and intelligent do not have a very elaborate inner loop, for example, ASSISTments (Heffernan and Heffernan 2014) and Wayang Outpost/Mathsprings (Arroyo et al. 2014). Instead, they have other features that warrant viewing them as ITS, such as being designed with a fundamental and sound understanding of student learning and the specific difficulties that students face in the given task domain, or having an outer loop that is adaptive to student metacognition and affect (Arroyo et al. 2014).

Therefore, we offer an alternative definition (cf. Aleven 2015; Aleven et al. 2013, forthcoming):

A learning environment is adaptive to the degree that:

  a. Its design is grounded in a thorough empirical understanding of learners in the given task domain;

  b. it takes into account, in its pedagogical decision making, how individual learners measure up along different psychological dimensions; and

  c. it is appropriately interactive and responsive to learner actions.

Specific to the concerns of the field of AIED, the first part of this definition emphasizes (implicitly) the use of cognitive task analysis and data mining to guide tutor design and cyclical improvement of tutors (Aleven 2010; Aleven and Koedinger 2013; de Baker et al. 2007; Clark et al. 2008; Lovett 1998). The second part emphasizes adaptive individualization across a host of learner variables both in the inner (or step) loop and in the outer (or task) loop, consistent with Woolf’s (2009) emphasis on having a student model and using it to adapt instruction. The third part of the definition emphasizes interactivity, consistent with VanLehn’s inner loop or step loop.

Example-tracing tutor technology supports the building of tutors that exhibit all three factors that make up this definition. Regarding the first factor, example-tracing tutors vary in the depth of cognitive task analysis and other forms of data analysis that underlies their design. While the degree of cognitive task analysis depends ultimately on the efforts and procedures of the designers and developers, we note that CTAT’s Behavior Recorder tool supports this activity, for it permits designers to model and visualize a solution space and to rapidly prototype different tutor behaviors (Aleven et al. 2015a). Furthermore, example-tracing tutors are an offshoot of Cognitive Tutors, which have always been grounded in cognitive task analysis and cognitive modeling. Therefore, example-tracing tutors can meet the first requirement. They also meet the second factor: As mentioned, example-tracing tutors and the Tutorshop support individualized problem selection based on Bayesian Knowledge Tracing (Corbett and Anderson 1995; Anderson et al. 1989; Corbett et al. 2000; VanLehn 2011).

To see how example-tracing tutors address the third factor, we briefly review how they can be adaptive in their inner loop. Example-tracing tutors support basic inner loop (or step loop) functionality (VanLehn 2006) with next-step hints, correctness feedback on steps, and error feedback messages for common errors. Step-level feedback is strongly supported in the empirical ITS and education literature as enhancing student learning (Arroyo et al. 2000; Kleij et al. 2015; McKendree 1990; VanLehn 2011). Beyond basic inner loop functionality, example-tracing tutors can follow students within a problem with respect to multiple strategies, regardless of which strategy the student decides to follow. They do so by virtue of having multiple paths in their behavior graph (Waalkens et al. 2013) or by virtue of using formulas to capture commonalities among solution paths. By contrast, the simplest approaches for building interactive e-learning software (e.g., authoring questions one at a time, independent of each other) do not capture dependencies among steps (i.e., do not capture multiple paths). Capturing multiple solution paths also supports other adaptive behaviors. For example, a multi-path example-tracing tutor has the ability to respond differently to the same student input depending on context, a hallmark of adaptive, intelligent behavior. That is, whether a certain student step is marked correct depends on which path(s) the student is deemed to be on, based on the student’s prior steps in the problem. Similarly, the tutor’s hints can be sensitive to what solution path the student is on, with different hints being given for the same step, dependent on the student’s prior path through the problem. When a student revisits a step she previously worked on but did not complete, an example-tracing tutor may change the hint it gives for this step, given that more information about the student’s solution may be available the second time around (i.e., upon revisiting the step). The context-sensitivity of the tutor’s hints is a form of flexible, adaptive behavior. It is also possible to create a tutor whose hints depend not only on the solution strategy the student is following, but also on the steps within the given strategy the student has completed already, as a way of making hints even more context sensitive. In sum, many adaptive behaviors are possible in an example-tracing tutor’s inner loop. All behaviors described above emerge from the basic example-tracing algorithm.

In spite of the many adaptive behaviors that example-tracing tutors can support, Pavlik et al. (2013) question whether this type of tutor ought to be considered ITS. They view example-tracing tutors as analogous to “programmed instruction” or “branched instruction” and state: “Perhaps these systems [i.e., example-tracing tutors] do not qualify as ITS, considering the student state space is small and that pedagogical options are few.” The analogy between behavior graphs and branched instruction is problematic, however. Both types of graphs are used to provide instruction, but that is where the similarities end. Behavior graphs do not represent instructional sequences or pedagogical decisions, as do the branches of branched instruction. Rather, behavior graphs capture problem-solving processes and their variations (Newell and Simon 1972). More broadly, we reject the notion that only tutoring systems capable of handling elaborate solution spaces should be viewed as adaptive or intelligent. (This would also rule out ASSISTments or Mathsprings, for example, whereas we argued above that they have a number of features that support viewing them as ITS.) In principle, even within a small solution space, many pedagogical decisions are possible, so there is room for adaptivity to be effective. Perhaps more fundamentally, even a limited amount of adaptivity within a small solution space might be just the right amount to support an effective and efficient student experience. Pavlik et al. (2013) may look at the system’s internal structures as a key criterion for intelligence or adaptivity, while overlooking the many adaptive behaviors that behavior graphs enable.

We see a number of limitations in example-tracing tutors. First, example-tracing tutors currently do not have features for responding adaptively to student self-regulation, metacognition, or affect, although individual projects have added various forms of adaptive support for these aspects, including machine-learned detectors for help avoidance and for fast actions (Corbett, personal communication), simple support for self-explanation (Rau et al. 2015a; Wylie et al. 2011), self-assessment (Long and Aleven 2013b; Roll et al. 2011), and (with custom-programmed software modifications) shared student/system control over problem selection (Long and Aleven 2013b). Also, one project (AdaptErrEx) added an external module for adaptively responding to student misconceptions in the outer loop (Goguadze et al. 2011; McLaren et al. 2012). Other limitations discussed in our 2009 IJAIED paper (and still apropos) are that the example-tracing technology is not particularly efficient when it comes to authoring simple interactive items (e.g., multiple-choice questions with feedback). Also, it is not geared toward supporting tutorial interactions in natural language or toward building systems with large domain ontologies or that draw on large stores of factual or conceptual knowledge. Also, CTAT provides no support for integrating (without programming) animated pedagogical agents. Additionally, example-tracing tutors have been proven only to a limited degree in open-ended domains, where each problem tends to have its own structure (but see Ogan et al. 2009). However, an example-based approach to authoring such as that supported by example-tracing tutors may be a good option when problem solutions differ on a problem-by-problem basis. We look forward to more work in this area. Finally, example-tracing tutors support tutoring only for problems with a moderately branching solution space or problems for which the similarity among different solution paths can be expressed using formulas; otherwise, the behavior graph becomes unwieldy. As discussed below, this limitation has occasionally, but not frequently, been an obstacle in practice. We note further that in domains in which problems have large solution spaces, the second type of tutor supported by CTAT, namely, rule-based Cognitive Tutors, can be a good option (Aleven 2010; Aleven et al. 2006b).

Evidence of the Effectiveness of Example-Tracing Tutors

To consider whether example-tracing tutors are an effective ITS paradigm and to illustrate the level of maturity that CTAT has reached, we look at example-tracing tutors built with CTAT since our 2009 IJAIED paper (Aleven et al. 2009b). In that paper, we reported that example-tracing tutors had been used in 26 research studies in real educational settings. The domains for which example-tracing tutors had been built included mathematics (at the elementary, middle, and high-school levels), science (chemistry, genetics), engineering (thermodynamics), language learning (Chinese, French, and English as a Second Language), and learning of intercultural competence (references provided in the original paper). Since 2009, a substantial number of new tutors have been built with CTAT. We review 18 such tutors in this section, all of which were used in real educational settings, most of them in research studies. For this review, we informally clustered these tutors based on key aspects of their pedagogy. We label these clusters as problem-solving tutors, tutors that use worked examples or erroneous examples, tutors that emphasize the use of interactive graphical representations, tutors that use pedagogical approaches other than standard tutored problem solving, and (in a singleton cluster) a tutor for language learning. We discuss each cluster, focusing on (a) the degree to which the tutors could be built entirely “within” the tools, (b) use in real educational settings, (c) evidence that students learned from the tutors, and (d) applicability and limitations of example-tracing tutors. Screenshots of the 18 tutors are shown in the Appendix.

Problem-Solving Tutors

A number of example-tracing tutors built with CTAT can be viewed as “traditional” problem-solving tutors, meaning they provide step-by-step guidance as students practice the solving of recurrent complex problems. The two most comprehensive such tutors are Mathtutor and the Genetics Tutor. Mathtutor covers mathematics topics for grades 6 through 8 in the American school system (https://mathtutor.web.cmu.edu) (Aleven et al. 2009a). It is a re-implementation, as example-tracing tutors, of a set of Cognitive Tutors for middle-school mathematics that were created in our lab prior to CTAT (de Baker et al. 2007; Koedinger 2002; Koedinger and Terao 2002; Rittle-Johnson and Koedinger 2005). Each of these tutors had seen multiple rounds of classroom use, and the curricula of which they were part had been shown to improve student learning, compared to curricula without tutoring software. The Genetics Tutor supports problem-solving and reasoning tasks for high school and college level genetics (Corbett et al. 2010). It has more than 25 units covering topics in Mendelian Transmission, Pedigree Analysis, Gene Mapping, Population Genetics, and Genetic Pathways Analysis. Various tutor units have been evaluated in 15 colleges and universities and in 4 high schools. In a total of 45 single-unit in-course evaluations, pretest-to-posttest learning gains averaged about 18 percentage points (equivalent to almost 2 letter grades) across topics at both the post-secondary and high school levels. Like Mathtutor, the Genetics Tutor was originally implemented as a rule-based Cognitive Tutor and later reimplemented as an example-tracing tutor.

In addition to these two tutors, several tutors with smaller domain scopes have been implemented that we also categorize as problem-solving tutors. Lynnette is a tutor for basic equation solving (Long and Aleven 2013a, b; Waalkens et al. 2013). It was originally implemented as an example-tracing tutor, and later re-implemented (also using CTAT) as a rule-based Cognitive Tutor (Long and Aleven 2014), so as to be more flexible in recognizing students’ major and minor solution variations. In four classroom studies with a total of 487 students in grades 6 through 8, the example-tracing version of Lynnette led to pre/post gains in basic equation solving skill with medium to large effect sizes (d = .69, d = 1.65, and d = 1.17); in one experiment, the gains were not significant, due to a ceiling effect.

Our final example of a problem-solving tutor is the Tuning Tutor, an example-tracing tutor developed by Carolyn Rosé for use in her course at Carnegie Mellon University entitled “Applied Machine Learning.” This tutor teaches students how to apply general principles of avoiding overfitting in cross-validation to the case where parameters of a model need to be tuned. It was used during two semesters and was well received by students, many of whom completed more than the required number of tutor problems. Informally, students expressed their appreciation for the opportunity to practice with feedback and suggested that other concepts in the course include CTAT exercises as well. The incidence of students attending office hours during the unit on tuning, which used to be the most difficult unit in the second half of the course, dropped to nearly zero. These examples confirm that example-tracing tutors can effectively support learning at a variety of educational levels, including advanced college courses.

The Mathtutor, Genetics Tutor, and Lynnette projects help us better understand the practical import of the fact that example-tracing tutors can handle only problem types with a limited number of structurally dissimilar solution paths. As described, both Mathtutor (Aleven et al. 2009a) and the Genetics Tutor (Corbett et al. 2010) were originally developed as Cognitive Tutors (i.e., having a rule-based cognitive model) (Aleven 2010) and later re-implemented as example-tracing tutors. In both instances, the motivation for doing so was to make these tutors available over the Internet. (At the time, CTAT did not support web-based delivery of rule-based Cognitive Tutors. It does now, with a server-side model-tracing engine.) These example-tracing tutors have been used extensively in schools and (in the case of the Genetics Tutor) in colleges, evidence that they are bona fide, real-world tutors. In both projects, the problem types, tutor interfaces, and tutoring behavior were kept largely the same when the tutors were re-implemented as example-tracing tutors. In both projects, the authors were able to capture, as example-tracing tutors, a large proportion of the problem sets of the original tutors, without simplifying their solution spaces or tutoring behavior (roughly 95 % of the problem sets in both projects). On the other hand, in both projects, there was a small residue of problem types whose solution space was too large to be feasibly implemented as example-tracing tutors (namely, a unit on abductive problem solving in genetics and units on equation solving in middle-school mathematics). This finding was confirmed in the Lynnette project, in which a tutor for basic equation solving initially implemented as an example-tracing tutor (Long and Aleven 2013a, b; Waalkens et al. 2013) was re-implemented as a rule-based Cognitive Tutor (Long and Aleven 2014), in spite of the example-tracing tutor’s proven effectiveness in multiple classroom studies (see above). The purpose of the reimplementation was to make it easier to create new problem types for the tutor but also to have greater flexibility in recognizing solution variations within each problem. These three projects thus provide key evidence that example-tracing tutors are often an excellent option for tutor development. The requirement of having a moderately branching solution space turned out to be an obstacle at times, but only infrequently. This finding is especially interesting if one considers that the problem types that were transitioned from rule-based Cognitive Tutors to example-tracing tutors were not created or selected with example-tracing tutors in mind. Evidently, in many domains, good practice problems need not have widely branching solution spaces.

Tutors that Use Worked-Out Examples or Erroneous Examples

Several CTAT-built tutors have been created for research that investigates benefits of worked examples and erroneous examples in ITS, a topic that has seen increased interest since 2009 (e.g., Salden et al. 2010b; McLaren et al. 2015a).

The Stoichiometry Tutor (McLaren et al. 2014, 2015b, 2016) was extended so that students watch the step-by-step narrated playback of worked and erroneous examples in the tutor’s problem-solving interface. They are prompted to explain the solutions, in the case of standard worked examples, or fix the errors, in the case of erroneous examples. In two classroom studies (McLaren et al. 2014, 2015b), involving 295 10th and 11th grade students across the studies, erroneous examples and worked examples yielded the same learning outcomes as tutored and untutored problems but were more efficient in terms of time (d ranging between 1.76 and 3.31) and mental effort (d ranging between 0.89 and 1.04) (McLaren et al. 2014). The AdaptErrEx project used erroneous examples to help students learn decimals. Working with this tutor, students find, explain, and fix errors in decimal problems. In two studies, one with 208 (Adams et al. 2014) and another with 390 students (McLaren et al. 2015a), students who worked with erroneous examples performed significantly better on a delayed test (d = .62 and d = .33, respectively) than students in a tutored problem-solving condition with explanation steps. Building on this work and on the example-tracing technology, McLaren and colleagues created a suite of educational games, Decimal Point, that use erroneous examples as the core instructional technique (Forlizzi et al. 2014). Although a considerable amount of custom ActionScript programming was needed, CTAT provided a valuable foundation. As a final project that employed worked examples in an example-tracing tutor, the Proportional Reasoning Tutor (Earnshaw 2014) was created and used in a classroom study with 143 middle-school students that examined effects of worked examples and tutored problems. Learners in the worked example condition took less time and scored higher on the post-test than learners in the two other conditions.

With more and more studies showing that worked examples enhance learning with an ITS or make it more efficient, we expect to see worked examples become a staple of ITS, combined with self-explanation prompts. We see a need for more research that reconciles the findings of studies focused on worked examples in the ITS literature with those in other literatures (e.g., Van Gog and Rummel 2010; Renkl 2013). For example, it would be useful to test whether the typical expertise reversal effect (studying examples is more effective early on, problem solving is more effective later on) (Kalyuga 2007; Kalyuga et al. 2003) occurs in the context of ITS.

As the projects reviewed above illustrate, worked-out examples or erroneous examples can often (as in the Proportional Reasoning Tutor) be authored entirely within CTAT (i.e., without programming). The same can be said about prompts for self-explanation, which often accompany worked examples (Booth et al. 2013; McLaren et al. 2014; McLaren et al. 2015a, b; Rau et al. 2015a; see also Conati and Vanlehn 2000; Renkl 2013). At other times, tool extensions were needed to support examples (e.g., the Stoichiometry Tutor and AdaptErrEx). Regarding how examples and problems can be sequenced, as mentioned, CTAT now supports the gradual backward fading of worked examples, shown to be an effective method of transitioning from worked examples to problems (Atkinson et al. 2003; Salden et al. 2010a). This functionality is used in all problem sets of Mathtutor (Aleven et al. 2009a). As an alternative strategy, an author could decide to interleave worked-out examples and fully open problems (e.g., Paas and Van Merriënboer 1994), which she could do using the standard way of ordering problems within a problem set in the Tutorshop. A third strategy is fixed or adaptive fading of example steps at the knowledge component level, which was shown to be effective in one (non-CTAT) project (Salden et al. 2010a). CTAT provides building blocks for tutor authors to implement adaptive example fading in this manner, although we do not know of any CTAT tutors using this capability.

Tutors with Interactive Graphical Representations

Since 2009, a number of example-tracing tutors have been created with CTAT that feature the use of interactive graphical representations of learning materials. This work shows that interactive graphical representations can be a key way of leveraging ITS technology.

For some of these projects, special-purpose graphical interface components were developed (which required programming). For example, the Fractions Tutor (https://fractions.cs.cmu.edu) focuses on conceptual learning with multiple, interactive graphical representations including number lines, fraction circles, and rectangles (Rau et al. 2013, 2014, 2015a). This tutor was used in five classroom studies with over 3000 students. In the last study, learning gains relative to the pre-test were d = .40 for the immediate post-test and d = .60 for a delayed post-test (Rau et al. 2012). The Fractions Tutor project illustrates that ITS, and example-tracing tutors specifically, can be effective with elementary school students. The only other ITS work we know of at the elementary school level is a set of studies by Stankov et al. (2007). To build the Fractions Tutor, special interface components for the interactive graphical representations of fractions (i.e., an interactive number line, circle, and rectangle) were developed and added to CTAT’s component class hierarchy. They became part of the standard CTAT release package and were used in other tutors. For example, in a different project, also dealing with elementary school fractions learning (Stampfer and Koedinger 2013; Wiese and Koedinger 2015), they were used to implement, entirely within the tools, an instructional approach called grounded feedback (Mathan and Koedinger 2005; Nathan 1998).

Another project in which new interface components were created is Chem Tutor (Rau et al. 2015b). This tutor features domain-specific interactive graphical representations, such as Lewis structures, Bohr models, and energy diagrams. Students receive step-by-step guidance for planning and constructing representations and are prompted to self-explain differences between representations so as to reflect on the limitations of each. New interface components for these interactive representations were built first. Chem Tutor led to large learning gains in a field study with 74 undergraduate students enrolled in an introductory course for science majors (d = 1.44) (Rau et al. 2015b) and in a lab experiment with 117 undergraduates (d = .78) (Rau and Wu 2015). Chem Tutor has been used as a research platform to investigate support for representational competencies (Rau and Wu 2015), effects of students’ spatial abilities on their interactions with graphical representations (Rau 2015), and visual attention behaviors (Peterson et al. 2015; Rau et al. 2015b).

Some projects built interactive graphical representations using standard CTAT interface components, bypassing the need to first create new custom interface components; these include the Genetics Tutor (discussed above) and the RedBlackTree Tutor. The latter is an example-tracing tutor that aims to help students in a college-level introductory data structures course understand (i.e., “hand simulate”) a key algorithm for red-black trees (Liew and Xhakaj 2015; Xhakaj 2015; Xhakaj and Liew 2015), a data structure with many applications (e.g., Weiss 2010). Interactive red-black tree structures in the tutor interface were built out of standard CTAT components such as text boxes, radio buttons, and drop-down components, combined with standard Flash elements and a small amount of custom ActionScript. In two small evaluation studies (Liew and Xhakaj 2015; Xhakaj and Liew 2015; Xhakaj 2015), approximately one hour of work with the RedBlackTree Tutor led to large learning gains (d = 1.66 and d = 3.06, respectively).

Thus, CTAT offers various ways of supporting interactive graphical representations. Sometimes (as in the Grounded Feedback Tutor), CTAT already offers an interactive interface component for the given representation, so the tutor can be built without programming. At other times, a new interactive graphical representation can be assembled from standard CTAT-supported interface components (as in the Genetics Tutor), in some cases with a small amount of custom ActionScript (as in the RedBlackTree Tutor). Sometimes, when moving into a new domain, it is necessary to create new interactive interface components for the given graphical representations (e.g., Mathtutor, the Fractions Tutor, and Chem Tutor), which can then become part of CTAT. This experience illustrates that it is important that an ITS authoring tool (even one for non-programmers) be “open,” so that it is easy to extend its collection of interface components and to extend what tutors built with the tool can do.

Tutors that Support Other Pedagogical Approaches

A number of example-tracing tutors have been built that use pedagogical approaches other than tutored problem solving: educational games (an example was discussed above), collaborative learning, learning with an external problem-solving environment, activities that target sense-making and fluency building, and guided invention activities.

CTAT’s example-tracing tutor technology has been extended so that it now supports the authoring of tutors that provide simple support for collaborative learning (Olsen et al. 2014b), similar to earlier work with constraint-based tutors (Baghaei et al. 2007). Using CTAT, an author can build tutor activities that combine tutored problem solving with embedded, simple collaboration scripts. In these activities, networked small teams of students work synchronously on tutor problems. The students can have a shared but, if the author so chooses, differentiated view of a joint problem and can have different actions available, so collaborating students can take on different roles. Synchronized tutor engines provide tutoring for each student. This form of collaboration support is quite flexible, although it has a number of limitations. For example, it is not straightforward to provide feedback on students’ dialogue (e.g., Adamson et al. 2014) or on how students are collaborating (e.g., Rummel et al. under review; Walker et al. 2014). Further, collaboration scripts have to be built from low-level building blocks.

CTAT’s collaborative features were used to author collaboration scripts that support proven ways of scripting collaboration, such as roles, cognitive group awareness (Janssen and Bodemer 2013), and individual accountability (Slavin 1996). In a pull-out study conducted in schools with 56 students, these tutors were shown to help elementary students learn collaboratively (Olsen et al. 2014a), with gains no different from those of students using equivalent tutors individually. In a second, classroom study with 189 participating students, there again were no differences in learning gains between students working individually and collaboratively, but students working collaboratively spent less time on the tutor (Olsen et al. under review). This line of work may help bridge ITS research and research in Computer-Supported Collaborative Learning (CSCL) (see also Baghaei et al. 2007; Kumar and Kim 2014; Rummel et al. under review; VanLehn this issue; Walker et al. 2014).

CTAT has also been used to provide tutoring within external problem-solving environments, including simulators for thermodynamics and chemistry (Aleven et al. 2006c; Borek et al. 2009). In a new project, McLaren and colleagues embed a CTAT example-tracing tutor within Google Sheets. This tutor guides college students through business modeling problems represented in spreadsheets. This work builds on earlier work on plug-in tutoring agents that pre-dates CTAT (Koedinger et al. 1999; Mathan and Koedinger 2005; Ritter and Koedinger 1996). Hooking up an external problem-solving environment is facilitated by CTAT’s strict separation between interface and tutor functionality (http://ctat.pact.cs.cmu.edu/index.php?id=tool-tutor). This notion is also being addressed in the xPST authoring tools project (Blessing et al. 2009a, b; Kodavali et al. 2010).
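To make the tool/tutor separation concrete, the sketch below shows, in TypeScript and purely for illustration, the kind of message passing such a separation enables: the external environment reports what the student did, and the tutor engine returns an evaluation. The type and field names are our own simplification, not CTAT’s actual protocol or API.

```typescript
// Illustrative sketch of message passing under a strict tool/tutor
// separation; names and fields are hypothetical simplifications.
interface StudentTransaction {
  selection: string; // which interface component the student acted on
  action: string;    // what the student did (e.g., "UpdateTextField")
  input: string;     // the value the student entered or chose
}

interface TutorResponse {
  evaluation: "correct" | "incorrect" | "hint";
  feedbackMessage?: string;
}

// The tutor engine evaluates transactions; it knows nothing about the
// interface technology that produced them.
interface TutorEngine {
  evaluate(tx: StudentTransaction): TutorResponse;
}

// The external environment (e.g., a spreadsheet plug-in) only needs to
// translate user events into transactions and render the responses.
function onCellEdited(engine: TutorEngine, cell: string, value: string): void {
  const response = engine.evaluate({
    selection: cell,
    action: "UpdateTextField",
    input: value,
  });
  if (response.evaluation === "incorrect" && response.feedbackMessage) {
    console.log(`Feedback for ${cell}: ${response.feedbackMessage}`);
  }
}
```

Under this factoring, embedding the tutor in a new environment, such as a spreadsheet, mainly requires writing the thin translation layer on the interface side.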

A new project with the Fractions Tutor investigates whether an ITS can be made more effective by adaptively targeting a broader range of learning mechanisms than ITS typically do. The work is grounded in the Knowledge-Learning-Instruction (KLI) framework (Koedinger et al. 2012), which links cognitive theory and instructional design. The tutor aims to support three key classes of learning mechanisms identified by KLI: verbal sense-making (SM), induction and refinement (IR), and fluency-building (F) processes. SM activities in the Fractions Tutor include instructional videos that explain fractions topics, interleaved with brief problem-solving exercises and opportunities to self-explain. IR activities support tutored problem solving, as typically found in ITS. F activities emphasize procedural practice with design features that may encourage students to work more quickly (e.g., bigger problem steps, short hints, and on-screen timers). In a classroom study with 1068 fourth and fifth grade students across 12 schools, the tutor led to significant pre- to post-test learning gains (d = .47) (Doroudi et al. 2015); students who did relatively more fluency-building activities learned more, suggesting (without definitively proving) that extending the range of learning mechanisms can be effective. Ongoing work uses machine learning techniques to create policies for adaptive activity selection.
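As a purely hypothetical illustration of the kind of activity-selection decision such a policy must make (this is not the project’s actual policy, nor its actual thresholds), consider a simple rule that chooses among the three KLI activity types based on a student-model estimate:

```typescript
// Hypothetical sketch of an activity-selection policy over the three KLI
// activity types; an illustration of the decision being learned, not the
// project's actual policy.
type ActivityType = "sense-making" | "induction-refinement" | "fluency";

interface SkillState {
  masteryEstimate: number;     // e.g., from a student model, in [0, 1]
  meanStepTimeSeconds: number; // how quickly the student executes steps
}

function nextActivity(state: SkillState): ActivityType {
  // Low mastery: (re)build conceptual understanding first.
  if (state.masteryEstimate < 0.4) return "sense-making";
  // Moderate mastery: tutored problem solving for induction and refinement.
  if (state.masteryEstimate < 0.8) return "induction-refinement";
  // High mastery but slow execution: build fluency.
  return state.meanStepTimeSeconds > 10 ? "fluency" : "induction-refinement";
}
```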

CTAT has also been used to implement tutors for guided invention activities, in which the goal is for students to develop a quantitative method that captures a mathematical or physics concept, guided by carefully designed sets of contrasting examples. Prior research shows that these kinds of activities prepare students well to learn and transfer from a subsequent, more traditional lesson (Schwartz and Martin 2004; Schwartz et al. 2011). Over the years, three different tutors for guided invention activities have been built with CTAT. The first one, by Roll et al. (2010), relied on rules and constraints (i.e., it was not an example-tracing tutor). A second tutor was an example-tracing tutor that provided domain-general guidance during invention activities (Holmes et al. 2014). In an evaluation study in which 87 undergraduate students in a first-year physics lab course at the University of British Columbia used the system for two activities of roughly 30 min each (Holmes et al. 2014), domain-general guidance during invention activities enhanced students’ conceptual understanding. In addition, the system was used as part of the regular physics instruction at the University of British Columbia until last year. A third tutor for invention activities, for middle-school physics, is currently under development. It builds on example-tracing tutors, although with substantial custom programming (Chase et al. 2015a, b).

A Tutor for Language Learning

The remaining cluster comprises a single tutor, the Article Tutor, an example-tracing tutor that teaches students the English article system (when to use a, an, the, or no article). Other tutors for language learning have been built with CTAT as well (e.g., Guan et al. 2011; Liu et al. 2011; Ogan et al. 2009). The Article Tutor was built to be part of a course for English as a Second Language (ESL). It was used in research to investigate whether self-explanation can be effective as a learning strategy for ESL. In total, six tutor versions were built, including an adaptive version that prompted students to self-explain only if they got the question wrong. This form of adaptivity could be authored entirely within CTAT’s non-programmer authoring paradigm. In four classroom studies (390 students total), all conditions learned from the tutor, but the practice-only (no self-explanation) condition was consistently the more efficient form of instruction (Wylie et al. 2009, 2010a, b, 2011). This work refines the conditions under which self-explanation is understood to be effective (e.g., Koedinger et al. 2012).

Discussion and Conclusions

The main contributions highlighted in our IJAIED 2009 paper were: First, the CTAT project pioneered a non-programmer paradigm for ITS authoring that involves (a) generalized examples of problem-solving behavior as the tutor’s representation of domain knowledge, (b) tools for creating, without programming, tutors that use these generalized examples to provide tutoring, and (c) an algorithm for flexibly using generalized examples to interpret student behavior and provide step-based tutoring. A second intellectual contribution claimed at the time was a demonstration, across a range of tutor research projects, that this paradigm can be widely useful and effective. A third contribution was evidence of substantial cost savings: Building example-tracing tutors was shown to be 4–8 times as cost-effective, compared to estimates in prior literature.

We see six novel scientific contributions of our project since 2009. First, we update and bolster our argument that example-tracing tutors should be viewed as first-class citizens in the world of ITS. We now focus on adaptive behavior as a hallmark of intelligence, following Newell and Simon (1976). We provide a definition of adaptivity based on three factors (cf. Aleven 2015; Aleven et al. 2013, forthcoming) and highlight many elements of adaptivity in the behavior of example-tracing tutors.

Second, we provide additional evidence that example-tracing tutors are an effective and mature paradigm for developing intelligent tutors. We describe 18 example-tracing tutors built since 2009 and used in real educational settings, many with statistically significant pre/post learning gains. Most of these tutors were for STEM domains (science, technology, engineering, and mathematics), but we also see tutors for business modeling and language learning. The tutors were used by students ranging from late elementary school to graduate school. As evidence of widespread use, CTAT-built tutors were used by 44,000 students and account for 40% of the data sets in DataShop. This work thus supports the notion that a non-programmer approach to ITS authoring can yield effective tutors. It also illustrates that CTAT has reached a state of maturity in which tutors built with CTAT routinely withstand the rigors of classroom use and even use within MOOCs.

Third, the 18 reviewed example-tracing tutors illustrate a range of pedagogical approaches, including (standard) tutored problem solving, the use of worked-out and erroneous examples, interactive graphical representations, and collaborative learning. CTAT supports these features to a substantial degree; however, some amount of programming is sometimes necessary. Thus, we see that an ITS authoring tool, in the hands of creative authors, can be used in unanticipated ways. We also see that an ITS authoring environment, even one that supports a non-programmer approach to authoring, should be easily extensible and accommodate custom programming.

Fourth, the strengths and limitations of example-tracing tutors are now better understood. In particular, we better understand the extent to which it is limiting that example-tracing tutors support only problems with a no more than moderately branching solution space (unless many branches are isomorphic, so that an author can collapse them into a small number of branches using formulas, as illustrated below). Experience across a range of tutor development projects suggests that this limitation occasionally precludes use of the example-tracing paradigm but that, more frequently, example-tracing tutors are a viable approach. Other limitations are that adaptivity in response to affect, metacognition, and motivation is not currently supported and that the number of example-tracing tutors demonstrated in domains outside of STEM remains relatively small.
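To illustrate what it means to collapse isomorphic branches with a formula (the example and names below are ours and are not written in CTAT’s formula language), consider accepting any common denominator in a fraction-addition step rather than authoring one branch per acceptable value:

```typescript
// Minimal sketch of collapsing isomorphic solution branches with a formula.
// Rather than authoring one branch per acceptable value, the author states
// a predicate that all isomorphic branches satisfy.
interface ProblemContext {
  denominator1: number;
  denominator2: number;
}

// Accept any common denominator, not just the least one, so the branches
// "use 12", "use 24", "use 36", ... collapse into a single check.
function isValidCommonDenominator(ctx: ProblemContext, studentValue: number): boolean {
  return (
    studentValue > 0 &&
    studentValue % ctx.denominator1 === 0 &&
    studentValue % ctx.denominator2 === 0
  );
}
```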

Fifth, we have learned how an ITS architecture can be factored so that it supports the flexible re-use of tutor components. First, the notion of separating tool and tutor pre-dates CTAT (Ritter and Koedinger 1996), but CTAT demonstrates advantages of this separation that had not been shown before. Most importantly, the changes in web technologies that twice forced us to revamp our tutor front-end technology would have spelled doom for CTAT had it not been for the strict tool/tutor separation. The tool/tutor separation has also made it possible to mix-and-match tutor engines and interface technologies, and it makes it easier to extend the tutor interface by creating new components (rather than project-specific interfaces). We recommend this separation for any ITS project. The second key way of factoring is to separate the inner loop and the outer loop, with the student model as the sole means of communication between the two loops. This separation has proven useful (e.g., it has been relatively easy to plug in an alternative student model and outer loop). It will be interesting to see whether this separation holds up as we extend the range of student models and task selection policies. Sixth and finally, we have created ways of embedding CTAT tutors in a range of e-learning environments. We continue to work on extending the range of environments in which these tutors can be embedded. We also plan to extend the range of advanced tutoring functionality (including learning analytics and student modeling) that can be made available to these environments.
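As a minimal sketch of the inner-loop/outer-loop factoring described above, with the student model as the only shared state between the two loops, the interfaces below illustrate the idea; the names are hypothetical, not CTAT’s actual API.

```typescript
// Illustrative sketch of factoring a tutor into an inner loop (step-level
// tutoring within a problem) and an outer loop (task selection across
// problems), with the student model as the only shared state.
interface StudentModel {
  getMastery(skill: string): number;           // e.g., estimated P(known) per skill
  update(skill: string, correct: boolean): void;
}

interface InnerLoop {
  // Tutors one problem, updating the student model as steps are evaluated.
  runProblem(problemId: string, model: StudentModel): void;
}

interface OuterLoop {
  // Selects the next problem based only on the student model.
  selectNextProblem(model: StudentModel): string | null; // null = session done
}

function runTutorSession(inner: InnerLoop, outer: OuterLoop, model: StudentModel): void {
  for (
    let next = outer.selectNextProblem(model);
    next !== null;
    next = outer.selectNextProblem(model)
  ) {
    inner.runProblem(next, model);
  }
}
```

Because the outer loop sees only the student model, an alternative task-selection policy or student model can, in principle, be swapped in without touching the inner loop.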

We see interesting days ahead for ITS authoring tools. Our ongoing work focuses on generalizing CTAT and supporting tutoring at scale. For ITS technology to spread, it is critical not only that authoring tools support cost-effective authoring of sophisticated tutor behaviors without programming, but also that the resulting tutors interface with popular e-learning and MOOC environments across the range of popular client platforms.