Introduction

Constraint-based modelling (CBM) was proposed by the second author as a way to overcome limitations of other student modeling approaches existing at the time (Ohlsson 1992). As discussed in another paper in this volume (Ohlsson 2015), CBM provided a way to eliminate the need for bug libraries. A constraint-based tutor only needs a set of constraints that describe features of correct solutions. In addition, CBM allowed us to expand the types of instructional tasks that can be taught by an intelligent tutor. CBM does not require a runnable domain module (i.e., a problem solver), so it can be applied to open-ended tasks for which there is no algorithmic solution or executable expert model.

The original paper on CBM attracted a lot of attention, as illustrated by the early citations. However, at the time of its publication, and for several years afterwards, there was no system based on the proposed approach. The first author of this paper completed her Ph.D. in 1994, in which she proposed another student modelling approach particularly suited to procedural tasks, implemented in a system called INSTRUCT. It combined advantages of model tracing (Anderson et al. 1990) with those of reconstructive modelling as implemented in the ACM system (Ohlsson and Langley 1988). In the process of revising a paper about INSTRUCT (Mitrovic et al. 1996), Mitrovic learnt about CBM and found it extremely exciting! Ohlsson proposed solutions to some of the problems with previous student modelling approaches (including the one used in INSTRUCT). The motivation for implementing SQL-Tutor was to investigate whether CBM was as promising as it seemed.

The choice of instructional task was influenced by the Computer Science courses Mitrovic was teaching. She observed over the years that students found the SQL database query language very challenging: although the language itself is well-defined in terms of its grammar, the task of writing queries in SQL is demanding. The queries to be written are posed in English; they require a good understanding of the database which serves as the context, and are often ambiguous, requiring common-sense and background knowledge. Additionally, the student needs to be familiar with the relational data model and the Database Management System (DBMS) in which they practise composing SQL queries. DBMSs can be difficult to learn; they typically provide cryptic error messages in response to syntax errors and are not capable of dealing with semantic errors. Finally, writing SQL queries is a design task, a simplified version of programming, and there is no algorithm that students can apply to convert the natural language description of a query into an SQL Select statement (Mitrovic and Weerasinghe 2009). Because SQL has a lot of redundancy built into it, there are often several correct solutions for one and the same problem. For all these reasons, students find it hard to master the art of writing SQL queries.

Consequently, SQL was an ideal domain in which to develop the first constraint-based tutor: a demanding instructional task, included in real courses with real learners who needed to learn SQL. Mitrovic started developing SQL-Tutor at the end of 1995. In 1998, while on sabbatical, she visited Ohlsson at the University of Illinois at Chicago and demonstrated SQL-Tutor. That was the start of our very productive collaboration, which we still enjoy wholeheartedly! Numerous later constraint-based tutoring systems have been developed within the Intelligent Computer Tutoring Group (ICTG; footnote 1), led by Mitrovic at the University of Canterbury in Christchurch, New Zealand.

We start by briefly discussing the two highly cited IJAIED papers on SQL-Tutor. The subsequent section presents a brief history of later projects involving SQL-Tutor, and our future plans. We conclude by presenting the lessons learnt from SQL-Tutor.

Early Versions of SQL-Tutor

SQL-Tutor was designed as a complement to database courses. It assumes that the student has already acquired some knowledge via lectures, labs and demonstrations. The system provides numerous problem-solving opportunities in the context of various databases. Students can freely choose among the latter. The system covers only the SQL Select statement (more precisely, a subset of language constructs used in the Select statement). The system is focused on querying, because queries cause most of the student misconceptions. Additionally, many concepts used in querying are directly relevant to other SQL statements and even to other relational database languages. SQL-Tutor was developed in Allegro Common Lisp (footnote 2), with the first version being implemented on Solaris workstations, and a later version as a stand-alone MS Windows application.

Mitrovic’s team faced several challenges and issues in the design and development of SQL-Tutor. These included the nature of constraints, the application of constraints, long-term modeling, interface design, pedagogical approach, and evaluation methodology.

Nature of constraints

Because SQL-Tutor was the first constraint-based tutor, the first problem was to operationalize the theoretical definition of constraints. This required several conceptual developments. Mitrovic classified constraints into syntax and semantic constraints: the former focused on the syntactical correctness of the student’s solution, while the latter made sure that the student’s solution was correct for the problem at hand (Mitrovic 1997; Mitrovic 1998c). Both types of constraints were problem-independent; none of the constraints included any problem-specific elements. This feature makes it easy to add new practice problems to SQL-Tutor (and other constraint-based tutors). The system builder only needs to provide the text of the problem and one correct solution. In the case of a new database, the author need only provide the description of the database. Because most problems have several correct solutions, semantic constraints are needed to check for all correct ways of specifying the elements of a query. In addition, Mitrovic’s work brought into clearer focus the distinction between state constraints and path constraints. The former refer to a single problem state, while the latter refer to a sequence of events. Although Ohlsson’s original 1992 formulation emphasized state constraints, in practice path constraints are needed to catch all student errors.

Application of constraints

The next challenge was related to implementation of short-term modelling. To analyse a student’s solution, it is first necessary to match the relevance conditions of all constraints to find those that are relevant to the solution. In the next step, the satisfaction conditions of relevant constraints are used to determine whether they are violated or satisfied. At the time when the first version of SQL-Tutor was developed, sequential matching of constraints was not feasible due to the size of the knowledge base (footnote 3), so Mitrovic implemented a modification of the RETE pattern matcher often used in rule-based AI systems (Forgy 1982) to speed up matching. This resulted in relevance and satisfaction networks, providing information about relevant and satisfied/violated constraints (Mitrovic 1997).
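The two-step evaluation described above can be illustrated with a minimal sketch. It uses plain sequential matching rather than the RETE-style relevance and satisfaction networks of SQL-Tutor, and everything in it is an assumption made for illustration: the `Constraint` class, the `diagnose` function, the dictionary representation of a solution, and the two example constraints are hypothetical and are not taken from the system's knowledge base.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a solution is a dict mapping clause names to clause text.
Solution = Dict[str, str]
Condition = Callable[[Solution, Solution], bool]  # (student, ideal) -> bool


@dataclass(frozen=True)
class Constraint:
    """A state constraint: a relevance condition paired with a satisfaction condition."""
    cid: int
    relevance: Condition
    satisfaction: Condition
    feedback: str


def diagnose(student: Solution, ideal: Solution,
             constraints: List[Constraint]) -> Tuple[List[int], List[int], List[int]]:
    """Return the ids of relevant, satisfied and violated constraints."""
    relevant, satisfied, violated = [], [], []
    for c in constraints:
        # Step 1: skip constraints whose relevance condition does not match.
        if not c.relevance(student, ideal):
            continue
        relevant.append(c.cid)
        # Step 2: check the satisfaction condition of each relevant constraint.
        if c.satisfaction(student, ideal):
            satisfied.append(c.cid)
        else:
            violated.append(c.cid)
    return relevant, satisfied, violated


# Hypothetical, problem-independent constraints (illustration only).
CONSTRAINTS = [
    # Syntax constraint: always relevant; the SELECT clause must not be empty.
    Constraint(1,
               lambda s, i: True,
               lambda s, i: bool(s.get("select", "").strip()),
               "The SELECT clause must not be empty."),
    # Semantic constraint: relevant only if the ideal solution has a WHERE clause.
    Constraint(2,
               lambda s, i: bool(i.get("where", "").strip()),
               lambda s, i: bool(s.get("where", "").strip()),
               "This problem requires a search condition in the WHERE clause."),
]

student = {"select": "title", "from": "movie", "where": ""}
ideal = {"select": "title", "from": "movie", "where": "year > 1990"}
relevant, satisfied, violated = diagnose(student, ideal, CONSTRAINTS)
print(relevant, satisfied, violated)
```

Note that the semantic constraint refers to the ideal solution only through its structure, not through problem-specific values, which is what keeps it problem-independent in this sketch.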

Long-term modeling

The next step was to decide on the long-term modelling approach. CBM as proposed by Ohlsson was focused on short-term modelling (or error diagnosis), but ITSs need to track the progress of students over longer time periods. Because constraints are rather different in character from other types of knowledge representations (networks, rules, etc.), these issues had to be thought through from scratch. The first version of the student model in SQL-Tutor was an overlay on the constraint base (Mitrovic 1997). This student model tracks student learning by storing the history of each constraint (and some additional information). This information can be analyzed in many ways. One simple approach is to identify the set of situations within a recent time frame in which some constraint C was relevant, and calculate the proportion of those situations in which it was also satisfied.
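As a rough illustration of the overlay idea and of the simple proportion measure just mentioned, the sketch below stores, for each constraint, the outcome of every occasion on which it was relevant. The class name `OverlayStudentModel`, its methods, and the default window size are assumptions for illustration and do not describe the data structures actually used in SQL-Tutor.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


class OverlayStudentModel:
    """Overlay on the constraint base: one outcome history per constraint."""

    def __init__(self, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window  # assumed size of the "recent time frame"
        # history[cid] lists outcomes in order; True means the constraint was satisfied.
        self.history: Dict[int, List[bool]] = defaultdict(list)

    def record(self, satisfied: Iterable[int], violated: Iterable[int]) -> None:
        """Store the outcome of one submitted solution for every relevant constraint."""
        for cid in satisfied:
            self.history[cid].append(True)
        for cid in violated:
            self.history[cid].append(False)

    def recent_satisfaction_rate(self, cid: int) -> Optional[float]:
        """Proportion of the most recent relevant occasions on which cid was satisfied."""
        outcomes = self.history.get(cid, [])[-self.window:]
        if not outcomes:
            return None  # the constraint has never been relevant for this student
        return sum(outcomes) / len(outcomes)


model = OverlayStudentModel(window=3)
model.record(satisfied=[1], violated=[2])
model.record(satisfied=[1, 2], violated=[])
model.record(satisfied=[], violated=[1])
model.record(satisfied=[1], violated=[])
print(model.recent_satisfaction_rate(1), model.recent_satisfaction_rate(2))
```

Keeping the full history and slicing it at query time leaves other analyses open; the window only affects how the proportion is computed.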

Interface design

Another challenge, which is faced by all ITSs, was to design the interface, which should support a natural way of solving problems in the domain and at the same time provide problem-solving support to the student. Figure 1 shows the interface used in the first evaluation study performed in 1998 (Mitrovic 1997). The interface of SQL-Tutor reduces the student’s working memory load by displaying the database schema (the bottom part) and problem text (at the top), by providing the basic structure of the query and also by providing explanations of various elements of SQL. In addition, SQL-Tutor provides feedback on student solutions at several levels of specificity, ranging from telling the student whether the solution is correct, letting the student know which part of the solution is wrong (error flag), providing hint-level messages about one or all violated constraints, to providing a partial or even a full solution.

Fig. 1 Screenshot of the Solaris version of SQL-Tutor

Pedagogical approach

Other challenges included deciding on the pedagogical approach to be used, and how to exploit the constraint formalism to support that approach. This included feedback (both content and timing) and problem selection strategies. SQL-Tutor allows the student to select a database as a context for problem solving. After that, the student can select a problem, or ask the system to select the problem on the basis of the student model (Mitrovic 1997). In the 1998 study, the system selected problems based on constraints that a student had difficulties with, and/or constraints that had not been relevant for any of his or her problems or solutions so far (Mitrovic and Ohlsson 1999). In this way, the constraint base served as a tool for moving the student through the curriculum in a way that is sensitive to what he or she has or has not yet learned.
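One way such a selection strategy could look is sketched below. The source states only that problems were chosen on the basis of constraints the student had difficulties with and constraints that had not yet been relevant; the scoring rule (counting such constraints per problem), the `threshold` parameter, the tie-breaking by problem name, and the function name `select_problem` are all assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Set


def select_problem(problems: Dict[str, Set[int]],
                   history: Dict[int, List[bool]],
                   solved: Set[str],
                   threshold: float = 0.5) -> Optional[str]:
    """Pick the unsolved problem that exercises the most target constraints.

    problems maps a problem name to the ids of constraints relevant to it
    (assumed to be known in advance, e.g. from the problem's ideal solution).
    history maps a constraint id to its outcomes, True meaning satisfied.
    """

    def is_target(cid: int) -> bool:
        outcomes = history.get(cid, [])
        if not outcomes:
            return True  # never relevant so far, so not yet practised
        # Difficult constraint: satisfied on fewer than `threshold` of occasions.
        return sum(outcomes) / len(outcomes) < threshold

    best_name: Optional[str] = None
    best_score = 0
    for name in sorted(problems):  # sorted for a deterministic tie-break
        if name in solved:
            continue
        score = sum(1 for cid in problems[name] if is_target(cid))
        if score > best_score:
            best_name, best_score = name, score
    return best_name  # None if no unsolved problem exercises a target constraint


problems = {"p1": {1, 2}, "p2": {2, 3, 4}, "p3": {1}}
history = {1: [True, True], 2: [False, False, True], 3: [True]}
print(select_problem(problems, history, solved={"p3"}))
```

In this example constraint 2 counts as difficult and constraint 4 as not yet encountered, so the problem covering both is preferred.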

Evaluation methodology

The 1999 paper presented the findings from the first evaluation study conducted in 1998. Although the study was uncontrolled, with volunteers using the system, it confirmed that constraints are an appropriate formalism for representing the task knowledge, and also that students learned from the system and found it useful and easy to use. In the course of analyzing the data, we had to figure out how to relate the data collected by a CBM system to classical representations of learning, such as learning curves. We displayed the results by plotting the probability that a constraint is violated as a function of the number of situations in which that constraint was relevant, averaged over constraints and students. This type of plot turned out to yield smooth learning curves that followed the power law of learning.
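The kind of analysis described above can be sketched as follows. The sketch assumes that the log data have already been reduced to one trace per (student, constraint) pair, listing in order whether the constraint was violated on each occasion it was relevant. The function names, the trace format, and the log-log least-squares fit are assumptions for illustration; the source does not specify how the curves were fitted.

```python
import math
from typing import List, Tuple


def learning_curve(traces: List[List[bool]], max_occasion: int) -> List[float]:
    """Proportion of violations on the n-th relevant occasion, averaged over traces.

    Each trace belongs to one (student, constraint) pair; True means violated.
    """
    curve = []
    for n in range(max_occasion):
        outcomes = [trace[n] for trace in traces if len(trace) > n]
        if not outcomes:
            break  # no trace reaches this occasion
        curve.append(sum(outcomes) / len(outcomes))
    return curve


def fit_power_law(curve: List[float]) -> Tuple[float, float]:
    """Fit P(n) = a * n**(-b) by least squares in log-log space; return (a, b)."""
    # Occasions with zero observed violations are skipped because log(0) is undefined.
    points = [(math.log(n), math.log(p))
              for n, p in enumerate(curve, start=1) if p > 0]
    if len(points) < 2:
        raise ValueError("need at least two non-zero points to fit a power law")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Made-up traces for illustration only; these are not data from the 1998 study.
traces = [
    [True, True, False, False],
    [True, False, False, False],
    [True, True, True, False],
    [True, False, False, True],
]
curve = learning_curve(traces, max_occasion=4)
a, b = fit_power_law(curve)
print(curve, a, b)
```

A positive exponent `b` indicates that the probability of violating a constraint decreases with practice, which is the pattern expected under the power law of learning.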

In short, the translation from the idea of constraint-based student modeling to a working tutoring system presented the Mitrovic team with multiple issues and challenges. Some issues had to be thought through or re-thought from scratch because the constraint formalism that is at the center of the CBM approach is different in character and operates differently from other, more traditional knowledge representations. In the end, all the issues were resolved and SQL-Tutor was, and remains, a successful system. It also remains true that the constraint formalism is difficult to grasp, and it is the aspect of CBM that is most often misunderstood (footnote 4).

What Happened Later?

In the years after the 1999 paper, SQL-Tutor was improved, expanded, and used as a research platform in multiple ways. One important aspect of these developments was re-implementing the system to make it more universally accessible. ICTG developed an MS Windows version, which was downloaded 1946 times from May 1999 to January 2001. Mitrovic (2003) described the web-enhanced version of SQL-Tutor, which has been used in studies since 1999. A web-enabled version was initially developed using CL-HTTP, and later the Allegro Serve Web server. Since 2003, SQL-Tutor has also been available on Addison-Wesley’s DatabasePlace Web portal (footnote 5). The pedagogical effectiveness of SQL-Tutor has been confirmed in 16 different studies, and the system has been in regular use in the University of Canterbury database courses, as well as in courses at other universities worldwide.

Several studies focused on the content of the system’s feedback messages. An early study (Mitrovic and Suraweera 2000) compared the effectiveness of feedback provided in the textual form to the identical feedback being presented by an animated pedagogical agent, showing the persona effect. We also investigated the effectiveness of various levels of feedback (Mitrovic and Martin 2000); the results showed that feedback presenting information about the violated domain principles (e.g., Hint and All errors) was superior to other feedback levels, and resulted in faster and more effective learning. Originally SQL-Tutor only provided negative feedback (i.e., feedback on errors); when we added positive feedback, students were able to complete the same learning tasks in only half of the time while achieving the same learning improvement (Mitrovic et al. 2013).

Problem selection has also been investigated in several studies. As stated previously, early versions of SQL-Tutor selected problems based on constraints that students found difficult (i.e., frequently violated), or that they had not yet encountered (the default strategy). We experimented with other representations of the student’s long-term knowledge based on Bayesian networks, and used them to select problems adaptively (Mayo and Mitrovic 2001). A small-scale study of that approach showed that the Bayesian problem-selection strategy resulted in problems that were better tailored to students’ knowledge, in terms of the coverage of domain concepts and problem-solving effort required, compared to the default strategy (Mitrovic et al. 2002). Additionally, we investigated using artificial neural networks for problem selection (Wang and Mitrovic 2002); we trained a simple feed-forward network to predict the number of errors a student will make, and used such predictions to select the next problem. We also developed an approach to generating problems automatically, from the domain model; the results showed that such problems doubled the learning rate compared to the manually defined problems (Martin and Mitrovic 2002).

We have developed several versions of Open Student Models (OSM) as a vehicle to support students in self-assessment and reflection (Mitrovic and Martin 2007). The results showed that the students became more aware of deficiencies in their knowledge, and they made better decisions during learning. In addition to pure problem solving, we also explored the effect of learning from examples; the results showed that students learn significantly more from alternating presentation of examples and problems than when only examples are presented, and that they also acquire significantly more conceptual knowledge compared to problem-solving alone (Shareghi Najar and Mitrovic 2013). Furthermore, adaptive selection of examples and problems resulted in significantly improved learning compared to a fixed sequence of example/problem pairs (Shareghi Najar et al. 2014).

The work on SQL-Tutor opened the way for other projects. Some of these were designed to demonstrate that CBM is an effective way of modelling domain and student knowledge in a variety of instructional tasks. We developed successful constraint-based tutors for other design tasks, such as database design (Suraweera and Mitrovic 2004; Zakharov et al. 2005), UML class diagrams (Baghaei and Mitrovic 2006) and Java programming (Holland et al. 2009). We also developed constraint-based tutors for procedural tasks, such as data normalization (Mitrovic 2005). Having proved that CBM can be used in a wide range of tasks, we turned to other interesting research problems. These include providing feedback on collaboration for pairs of students (Baghaei et al. 2007) and supporting students’ affective state (Zakharov et al. 2008). SQL-Tutor also provided motivation and foundations for our work on authoring support for constraint-based tutors (Martin et al. 2008; Mitrovic et al. 2009). The interested reader is referred to (Mitrovic 2012) for coverage of ICTG projects other than those directly related to SQL-Tutor.

The 1999 and 2003 IJAIED papers on SQL-Tutor also contributed to the popularization of CBM. Since those early days, CBM has been used by many researchers in addition to those coming from ICTG; see, e.g., (Billingsley et al. 2004; Billingsley and Robinson 2005; Rosatelli and Self 2004; Riccucci et al. 2005; Petry and Rosatelli 2006; Menzel 2006; Mills and Dalgarno 2007; Siddappa and Manjunath 2008; Oh et al. 2009; Faria et al. 2009; Galvez et al. 2009a, b; Le 2006; Le et al. 2009; Roll et al. 2010; Poitras and Poitras 2013; Zinn 2014).

Our work on SQL-Tutor and other constraint-based tutors has demonstrated the wide applicability of CBM; it can be applied to both well-defined and ill-defined domains and tasks (Mitrovic and Weerasinghe 2009). In the future, we will continue developing constraint-based tutors to provide further evidence of the strengths of this methodology. Furthermore, we believe that, in order to reach the effectiveness of expert human tutors, ITSs need to support multiple learning mechanisms (Ohlsson 2008). Our future research plans are organized around this belief; we are currently enhancing SQL-Tutor to support new instructional strategies in addition to problem solving and learning from examples.

Reflections

The development of SQL-Tutor and the entire line of CBM-based tutoring systems is a prototypical example of a successful research and development process, with a heavy emphasis on “and.” Without basic inquiry into the nature of cognitive skills and how they might be acquired, the hypothesis of constraint-based skill specialization might never have been discovered. On the other hand, without the willingness of technologists to implement systems, the problems and weaknesses of bug library based tutoring systems might not have been seen clearly enough to warrant searching for an alternative approach. In the end, theoretical concepts and efforts at implementation came together because the two authors of this paper reached out to each other across the theory-technology gap. It is often said that such bridges are necessary and productive, but institutional disincentives to gap-bridging all too often get in the way. The first lesson of SQL-Tutor in particular and the CBM approach in general for ITS researchers is thus simple to state but sometimes hard to execute: Keep talking.

The entire process since the 1999 paper follows a radiating pattern. SQL-Tutor served as a platform for research on a wide variety of tutoring-related problems. In addition, it served as a paradigm and template for other tutoring systems operating in other instructional domains. These latter systems have in turn served as platforms for yet other studies and projects. Looking down the history of technology, we can see other examples of this core + variants pattern: Think of the Wright Brothers’ first biplane and all the different types of airplanes that followed; the first vaccine and the many that followed; the first skyscraper and the many variants that now stand tall all over the world. In technology, exploring a design space by building gizmos that are variants of existing, successful gizmos is standard operating procedure. The field of ITS research might benefit from keeping this pattern in mind.