Introduction

In this study the role of the tutor in problem-based learning (PBL) of statistics is investigated. The benefit of human tutoring has been demonstrated in many studies (Chi 1996; Chi et al. 2001; Graesser et al. 1995). Positive aspects of tutor performance in problem-based learning (PBL) have been studied extensively too (Barrows 1988; Dolmans et al. 2002; Dolmans and Wolfhagen 2005; Schmidt and Moust 1995). Summarising these findings it can be concluded that tutoring is effective:

  1. When tutors and students interact; tutors’ actions that prompt for the co-construction of knowledge are positively correlated with deep understanding (Chi 1996; Chi et al. 2001).

  2. When students are activated; understanding will be improved by answering why, how, and what-if questions (Graesser et al. 1996).

  3. When self-study is stimulated; self-study will induce higher achievement (Schmidt and Moust 1995, 2000).

One-on-one tutoring can be defined as a form of instruction that is characterised by an interactive and continuous stream of exchanges between a tutor and a student (Chi 1996). Although tutors in PBL operate in a group, their role is more or less the same as in one-on-one tutoring. Instead of dispensing knowledge they should try to activate students, stimulate group processes, create an atmosphere in which students can optimally participate in the discussions, help students to monitor their own learning, and stimulate self-study (Schmidt and Moust 2000; De Grave et al. 1999). In PBL both content expert faculty tutors and non-content student tutors are used. Content expert tutors tend to make more subject matter contributions to the discussions than non-content expert tutors. This has been shown to improve the performance of novice students in particular. It seems that a tutor’s expertise can compensate for students’ lack of prior knowledge (Schmidt et al. 1993; Dolmans et al. 2002). Research has also shown that tutorial groups with relatively low levels of productivity require more input from a tutor (Dolmans and Wolfhagen 2005). To be able to make subject matter contributions the tutors in PBL usually have tutoring instructions at their disposal. These instructions consist of general information about the course and its main topics, and more specific information about the goal of the problems, the subject matter that is supposed to be discussed, and how to tackle the problems. In the intervention of the current study the tendency of tutors to intervene in the content of the discussion was taken a step further.

The intervention consisted of additional tutor guidance during the meetings. That is, the tutors guided the discussions more actively and more directively than is usual in PBL. The research question of our study concerned the effect of this extra directive tutor guidance on students’ achievement and on their subjective perception of several course aspects, group functioning, the relevancy of tutor contributions, general tutor functioning, and the quality of the problems that were used for the discussions in the tutorial sessions.

In the following we will first give a short description of PBL. Next, we will describe some characteristics of statistical knowledge in relation to the learning of statistics and we will describe our intervention. After that, we will present our study.

Problem-based learning

PBL refers to a variety of approaches to instruction, which all have in common that much of the learning and instruction is anchored in concrete problems (Hmelo and Evensen 2000). A problem can be anything that raises questions germane to the subject matter and affords free inquiry by students (Barrows 1986). For example, a problem can be a patient’s case (in medical education), the outcome of a study, a hypothesis about a real-life phenomenon, or a statistical problem that students need to solve. A problem that could be used in PBL of statistics is the question of what an appropriate analysis technique would be for a given dataset. While exploring and discussing the problem in an initial tutorial session, students extract key information about the problem and discover deficiencies in their knowledge (Hmelo and Evensen 2000). Based on these knowledge gaps and out of an intrinsic curiosity, students formulate their own learning goals and decide on what they are going to study. Instead of merely being exposed to information, the students in PBL are actively engaged in gathering information that is processed in relation to the presented problem. After the initial tutorial session the students in most PBL courses individually study relevant literature and in the following session they report back to the group what they have learned.

During the discussions in the tutorial sessions there are two kinds of guidance. Firstly, the problems used in PBL usually provide a fair amount of guidance to the students. Ideally, a problem directs students to the main topics and focuses on important issues. As a consequence the effectiveness of PBL is closely related to the quality of the problems (Albanese and Mitchel 1993; Schmidt and Moust 2000). Secondly, effective tutoring, as explained in the introduction, also involves appropriate guidance from the tutors in the discussions of the subject matter. This guidance usually consists of giving hints to the students, pointing to relevant topics, and helping students to monitor their own learning. Tutors traditionally ask questions such as: can you explain this? Do you understand that? Can you see why that is important? However, if a tutor contributes too much to the discussion in PBL, self-study time decreases (Schmidt and Moust 2000). Moreover, research has shown that education in general should not be too directive. Otherwise, students may lose the sense of being in control, which may cause autonomy and self-regulated learning to decrease, which in turn may cause students to become less motivated and more passive (Ames 1992; Deci and Ryan 1985; Lepper et al. 1997; Pintrich 2003). Passiveness is unfavourable, because the best way to acquire knowledge is to actively process the subject matter (see e.g. Anderson 1983; Graesser et al. 1996; Chi et al. 1989).

In spite of these findings we instructed the tutors in the intervention condition of our study to guide the discussions in a directive way. We think that in PBL of statistics such an approach might have beneficial effects on students’ understanding of the subject matter, for two reasons. First, students have little prior knowledge of the subject. Second, statistics is a special discipline and requires special teaching, as will be explained in the next paragraph.

Characteristics of statistical knowledge

University teaching shows differences among disciplines, which may be due to genuine differences among fields of learning, such as the natural sciences and social sciences (Smeby 1996). Students in natural sciences seem to receive more guidance and structured supervision than students in the social sciences (Acker et al. 1994).

Statistics is a field of learning in which the concepts are relatively abstract and interconnected to a great degree, which makes it difficult for students to understand (Gal and Garfield 1997; Schau and Mattern 1997). For example, the fact that the mean is sensitive to outliers can affect the interpretation of the results of an analysis of variance. This example also illustrates that statistical knowledge is hierarchical. A student first has to know what the mean is, then has to understand the concept of deviation from the mean before variance can be understood, and variance in turn has to be comprehended before analysis of variance can be understood. Statistical knowledge is said to be highly structured, because of this interconnectivity of the concepts and the hierarchy of the knowledge.
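The hierarchy described above can be made concrete with a small sketch in Python. It is an illustration only: the scores are invented and the function names are hypothetical. Each function builds on the previous one (mean, deviation from the mean, variance, one-way analysis of variance), and the last lines show how a single outlier changes the F statistic.

```python
from statistics import mean

def deviations(scores):
    """Deviation of each score from the mean of its group."""
    m = mean(scores)
    return [x - m for x in scores]

def variance(scores):
    """Sample variance: sum of squared deviations divided by n - 1."""
    return sum(d * d for d in deviations(scores)) / (len(scores) - 1)

def anova_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum(d * d for d in deviations(g)) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented scores, for illustration only
group_a = [4, 5, 6]
group_b = [6, 7, 8]
f_clean = anova_f([group_a, group_b])

# The same comparison after one score in the second group becomes an outlier
f_outlier = anova_f([group_a, [6, 7, 20]])
```

With these invented numbers the outlier raises the mean of the second group, yet the F statistic drops, because the within-group variance grows even more. This is the kind of relation between concepts that the paragraph refers to.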

In abstract and highly structured knowledge domains (statistics, computer programming, and mathematics) research has shown that learning can be improved when teaching is more directive. Examples are the use of worked examples in mathematics (Carroll 1994; Sweller 1999; Sweller and Cooper 1985) and computer programming (Tuovinen and Sweller 1999), guided discovery in computer programming (Debowski et al. 2001; Fay and Mayer 1994; Lee and Thompson 1997), and the explication of strategies in statistics (Paas 1992). The interventions in all these studies acted as a guide, showing the students the most effective way to achieve their aim. In all studies it was shown that the performance of students improved. Our intervention was intended to have the same effect.

Directive tutor guidance

As explained in the section on PBL, students in PBL are directed by the problems and the tutors. It was expected that for statistics education extra directive guidance provided by the tutor would promote students’ understanding and that students would gain a better insight into the content of the course. In the current study the tutors in our intervention condition provided directive guidance by asking questions. They received a detailed list of specific questions, in addition to the more general tutoring instructions with background information that tutors usually have. By asking questions, the tutors in the intervention condition initiated the discussion and directed it in a predetermined way. The tutors intervened as soon as the students went astray or omitted an important subject. The questions were based on how experts would deal with the presented problems. Step by step, in a prescribed order, the tutoring questions guided the students through the subject matter. The guiding questions focused the discussions on relevant issues, indicating the direction of reasoning without hampering the active learning processes. Tutors received written directions and training in how to use the instructions. They were instructed to specifically ask for the relations between the concepts, in order to make the interconnectivity of the concepts more explicit. The tutors were also instructed to make sure that the topics were discussed in a prescribed order. This was done to adhere to the hierarchy of the knowledge. We will refer to this intervention as directive tutor guidance. In this study a condition in which tutors provided this directive guidance was compared to a standard (not guided) condition. In the standard condition the tutors were involved in facilitating the group and the discussion, but not in directing it.

It was hypothesised that the directive tutor guidance would have a positive effect on students’ achievement and subjective perception of the course, group functioning, and the relevancy of tutor contributions. With regard to students’ subjective perception of general tutor functioning, a negative effect was expected due to the reduced autonomy of the students. No effect was anticipated with regard to the quality of the problems that were used for the discussions in the tutorial sessions.

Method

Participants

Two hundred and six students, enrolled in a bachelor statistics course of Health Sciences at the University of Maastricht, participated in this study. They were randomly assigned to 24 tutorial groups. These 24 tutorial groups were randomly assigned to 14 tutors (ten tutors had two groups, four had one group). Finally, the tutors were assigned to either the guided condition or the control condition using blocked randomisation. This resulted in 12 groups (N = 102) being assigned to the guided condition and 12 groups (N = 104) to the control condition. The tutors were members of the department of Epidemiology and the department of Methodology and Statistics. All tutors had sufficient knowledge of the subject matter and had at least 3 years of experience in tutoring.
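As an illustration of how a blocked randomisation of tutors could yield 12 groups per condition, consider the following Python sketch. It assumes that the blocks were formed by the number of groups a tutor taught; the text does not state how the blocks were defined. The tutor identifiers, the function name, and the seed are hypothetical.

```python
import random

def blocked_assignment(tutor_groups, seed=None):
    """Assign tutors to conditions, splitting each block in half.

    tutor_groups maps a tutor id to the number of tutorial groups taught.
    Blocks are formed by number of groups (an assumption, not stated in the text).
    """
    rng = random.Random(seed)
    blocks = {}
    for tutor, n_groups in tutor_groups.items():
        blocks.setdefault(n_groups, []).append(tutor)

    assignment = {}
    for tutors in blocks.values():
        rng.shuffle(tutors)
        half = len(tutors) // 2
        for position, tutor in enumerate(tutors):
            assignment[tutor] = "guided" if position < half else "control"
    return assignment

# Hypothetical tutor ids: ten tutors with two groups, four tutors with one group
tutor_groups = {f"T{i:02d}": 2 for i in range(1, 11)}
tutor_groups.update({f"T{i:02d}": 1 for i in range(11, 15)})

assignment = blocked_assignment(tutor_groups, seed=1)

# Count tutorial groups per condition
groups_per_condition = {"guided": 0, "control": 0}
for tutor, condition in assignment.items():
    groups_per_condition[condition] += tutor_groups[tutor]
```

Under this assumption five two-group tutors and two one-group tutors end up in each condition, which gives 12 tutorial groups per condition regardless of the seed.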

Materials and procedure

The bachelor statistics course in which this study was conducted took seven weeks. Students discussed the topics in a weekly tutorial group meeting on the basis of a list of problems. The main topics were methodological and statistical subjects. The methodological topics were randomised clinical trials and quasi-experimental designs. The statistical topics were the central limit theorem, t-tests, ANOVA, and linear regression analysis. For an example of a problem see Appendix A. In the control condition, tutors followed the usual PBL tutoring procedure, facilitating the group and learning processes, but without directing the discussions. In the intervention condition, the tutors used a list of questions to guide the discussion in the tutorial group meetings in a directive way. The list consisted of questions for each statistical problem that was discussed in the tutorial meetings. See Appendix B for an example. The questions were in a prescribed order, oriented toward the main topics and their relations. Typically, in PBL the discussion starts with the problem definition. The tutor would therefore start by asking what the problem is about and then successively guide the students through the subject matter. The questions were used in the initial discussion to make sure that all relevant topics were covered in the correct order and that students would see what they had to know about the subject matter and what they did not yet know. This enabled the formulation of correct learning goals. After individual study, in the reporting phase, the questions were used during the discussions to help students grasp all the concepts and their relations.

Tutors received training in handling the tutor instructions. It was stressed that they should only ask questions, in order to keep the students actively engaged, and should not explain the issues that were being discussed. Moreover, tutors were instructed to ask a question only when students were in danger of wandering off, when omissions were made, or when students did not know how to proceed. Finally, they were instructed to specifically emphasise the relations between the discussed concepts. For example, they were supposed to ask how sample size is related to the standard error.
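The relation that this example question targets can be sketched briefly. The Python snippet below uses the textbook formula for the standard error of a mean (standard deviation divided by the square root of the sample size); the standard deviation and the sample sizes are hypothetical values chosen for illustration and do not come from the course materials.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

sd = 10.0  # hypothetical standard deviation
standard_errors = {n: standard_error(sd, n) for n in (25, 100, 400)}

for n, se in standard_errors.items():
    print(f"n = {n:3d}  standard error = {se:.2f}")
```

Each fourfold increase in sample size halves the standard error, which is the kind of relation between concepts that students were expected to articulate themselves.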

At the end of the course the students evaluated the course on various aspects in a questionnaire. All courses are evaluated as a standard procedure, so students are used to filling out evaluation questionnaires after the final course exam. The questionnaire used in this study had 19 items consisting of statements that students had to rate on a five-point Likert scale. The statements covered the course itself, the statistical problems used in the meetings, the tutorial meetings, the performance of the tutor in stimulating understanding, and three more traditional aspects of the general functioning of the tutor, with respect to facilitation of the group and learning processes. The statements are presented in Appendix C. To assess students’ achievement, the final course exam was used. This exam consisted of 30 multiple choice questions about the statistical and methodological subject matter of the course.

Analysis

The raw scores of the questionnaire and the final course exam scores were used for the analysis. The item scores were inspected with respect to the mode, skewness and kurtosis. A factor analysis was done on the individual items to distinguish the subscales. This resulted in five subscales, making up five dependent variables. Cronbach’s alpha was computed for each variable. This study had a hierarchical design. First, the students were randomly assigned to the tutorial groups. Next, tutors were randomly assigned to the guided and the control condition of this study. Because of this hierarchical design, the comparison between the two conditions, with respect to the five dependent variables, was done by means of multi-level analyses. Random intercept models were used for all five analyses, with the students as the first level and the tutors as the second level. Deviance tests were used for the random effects, because of the rather small sample size (Snijders and Bosker 2003).
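A random intercept model of the kind described above can be written out as follows. The notation is an assumption that follows common multilevel conventions and is not taken from the study itself: student i is nested within tutor j, Guided is a dummy variable (1 for the guided condition, 0 for the control condition), and the second expression is the deviance difference used to test the tutor-level variance.

```latex
% Random intercept model: student i nested within tutor j
Y_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{Guided}_{j} + u_{0j} + e_{ij},
\qquad u_{0j} \sim N(0, \tau_{0}^{2}), \quad e_{ij} \sim N(0, \sigma^{2})

% Deviance test for the tutor-level variance \tau_0^2
D = \bigl(-2 \ln L_{\text{without } u_{0j}}\bigr) - \bigl(-2 \ln L_{\text{with } u_{0j}}\bigr)
```

In this notation the fixed effect of the condition is captured by the coefficient of Guided, while the tutor-level variance captures differences between tutors. The deviance difference D is compared against a chi-square reference distribution.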

Results

The evaluation questionnaire

Inspection of the item scores with respect to the mode, skewness and kurtosis showed no indication of violation of the normality assumption. Factor analysis resulted in two possible solutions. The scree plot indicated a three-factor solution (see Fig. 1).

Fig. 1 Scree plot of the factor analysis on the individual items

The “eigenvalues greater than one criterion” indicated five factors. The eigenvalues for the five factors ranged from 7.00 to 1.02. We have opted for the five-factor solution, because after oblique rotation a clear pattern emerged that could directly be interpreted in relation to the content of the resulting five subscales (see Appendix C). The pattern matrix after oblique rotation is shown in Table 1. The highest factor loadings are in bold type. Items that related to the course itself loaded highly on the first factor (items 1–3). The sum score of these three items comprises the variable course. The first two items that related to the problems (items 4–5) loaded highly on the second factor; their sum comprises the variable problems. The other two items about the problems (items 6–7; referring to what the problems added to the discussions) together with three items with respect to the discussions in the tutorial meetings (items 8, 10–11) loaded highly on the third factor. One item referring to these discussions (item 9) loaded equally on three different factors. Based on the content of the item it was categorised under the third factor. The sum of these six items (items 6–11) comprises the variable elaboration. Five items loaded highly on the fourth factor (items 12–16). These items were related to those aspects of the tutor functioning that were supposed to stimulate students’ understanding of the subject matter. The sum of these five items forms the variable tutor guidance. Finally, the three items with respect to the general functioning of the tutor (items 17–19) loaded highly on the fifth factor. Their sum forms the variable tutor general. Together the five factors explained almost 68% of the variance. Cronbach’s α for the five variables ranged from .60 to .90: course (α = .72), problems (α = .60), elaboration (α = .85), tutor guidance (α = .90), tutor general (α = .73).
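For readers less familiar with the reliability coefficient reported here, the following Python sketch shows how Cronbach’s α can be computed from item ratings with the standard formula. The ratings are invented for illustration and are not data from this study; the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a subscale.

    rows holds one list of item ratings per respondent; all lists must
    have the same length (the number of items in the subscale).
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Invented ratings of four respondents on a three-item subscale
ratings = [
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
    [2, 2, 3],
]
alpha = cronbach_alpha(ratings)
```

The coefficient increases when the items of a subscale vary together across respondents, and it also depends on the number of items, which is worth keeping in mind for a two-item subscale such as problems.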

Table 1 Results of factor analysis on the items of the questionnaire

Fixed and random effects on the five dependent variables

The results of the multilevel analyses showed that in the guided condition the means of course, elaboration, and tutor guidance were significantly higher than in the control condition. No differences were found regarding problems and tutor general. The difference at the tutor level was not significant in the analysis of course. In the other four analyses these differences were significant. The results are presented in Table 2.

Table 2 Random intercept multi level analyses on the five dependent variables with tutor as the second level

The means of the five variables for the two conditions are presented in Table 3.

Table 3 Means of the five dependent variables for the guided and control condition

The mean grade on the final course exam was higher for the students in the guided condition, as hypothesised, although the difference was only marginally significant (M control = 5.85, M guided = 6.20; t = 1.47; p = .072).
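The reported p-value can be put in context with a rough check. The Python sketch below uses a normal approximation to the t distribution, which is an assumption because the degrees of freedom are not reported. Under that approximation a one-sided test of t = 1.47 gives a p-value of roughly .07 and a two-sided test roughly .14, so the reported value appears to correspond to a one-sided test of the directional hypothesis. This reading is an inference and is not stated in the text.

```python
from statistics import NormalDist

t_value = 1.47  # test statistic reported in the text

# Normal approximation to the t distribution (an assumption; the text
# does not report the degrees of freedom)
p_one_sided = 1 - NormalDist().cdf(t_value)
p_two_sided = 2 * p_one_sided

print(f"one-sided p = {p_one_sided:.3f}, two-sided p = {p_two_sided:.3f}")
```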

Discussion

In this study, it was examined whether directive tutor guidance in problem-based learning of statistics improved the subjective perceptions of the students regarding the course, the tutor, the discussions in the tutorial meetings, the quality of the problems used in the meetings, and general tutor functioning. Directive tutor guidance aimed at stimulating the students to link together the statistical concepts in a structured way. It was expected that this would increase students’ understanding of the topics and that students would gain a better insight into the content of the course.

The final course exam served as a measurement of achievement. The grades of the exam were used as an indication of students’ understanding. We expected all students to be able to pass the exam at the end of the course, but we expected the students in the guided condition to do better. The results were in line with this hypothesis, although the effect was only marginally significant. Students pass the exam if they get a grade of 5.5 (out of 10 points). The means of both conditions were above this cut-off score. It should be noted, however, that the difference of 1.5 times the standard error in favour of the guided condition is very close to this cut-off score. In the guided condition 67% of the students passed the final course exam, while 60% passed in the control condition. These proportions suggest that for a small group of students the positive effect of our intervention has been crucial. This means that for the individual student, the difference between the conditions may be quite relevant in practice.

With regard to the questionnaire the results showed that most hypothesised effects occurred. The course was valued more positively by the students in the guided condition. These students also rated the discussions in the tutorial meetings higher. Those aspects of the tutor functioning that were supposed to stimulate students’ understanding of the subject matter were also better evaluated in the guided condition. In both conditions the problems were judged of equal quality. Furthermore, general functioning of the tutors was judged as similar in both conditions.

The items of the questionnaire referring to the course consisted of statements regarding the clearness of the goal of the course, the instructiveness of the course, and the organisation of the course. Students in the guided condition evaluated the course more positively than the students in the control condition. The more positive evaluation indicates that the students in the guided condition had better insight into the content of the course and had a more positive overall impression, as hypothesised. This result suggests that students understood better what was expected from them with respect to the objectives of the course and what and how they had to learn, and that what they did learn was clearer to them. All statements in the evaluation questionnaire had to be rated by the students on a five-point Likert scale. Because the variable course is the sum of three items, a neutral opinion with respect to the course (a rating of three on each item) would have resulted in a mean score of nine, i.e. nine is the centre of the scale. The means of the two conditions show that students in the guided condition were slightly above this centre of the scale (i.e. a positive rating), whereas students in the control condition rated the course slightly negatively.

Items constituting the variable elaboration included statements concerning the integration of the subject matter in the discussions and the productiveness of the meetings. Items referring to particularly those aspects of tutor behaviour that were supposed to stimulate students’ understanding constituted the variable tutor guidance. These items included statements like: the tutor helped to structure the subject matter and the tutor’s contributions were relevant. The higher ratings of the students in the guided condition on both elaboration and tutor guidance indicate that more directive tutor behaviour also had a positive effect on the instructiveness of the discussions. A neutral stance toward elaboration would have resulted in a mean score of 18. The means for the variable elaboration show that students in the guided condition rated the discussions slightly positively and the students in the control condition negatively. Tutor guidance was judged positively in both conditions, as students in both conditions rated tutor guidance above 15, the centre of the scale.
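The scale centres mentioned in the discussion (9 for course, 18 for elaboration, 15 for tutor guidance) follow directly from the number of items per subscale. The short Python sketch below reproduces this arithmetic. It assumes that the five-point ratings were coded 1 to 5, which the text implies but does not state; the variable names are hypothetical.

```python
# Number of items per subscale, as reported in the Results section
ITEMS_PER_SUBSCALE = {
    "course": 3,
    "problems": 2,
    "elaboration": 6,
    "tutor guidance": 5,
    "tutor general": 3,
}

# Assumed coding of the five-point scale (not stated explicitly in the text)
SCALE_MIN, SCALE_MAX = 1, 5
NEUTRAL_RATING = (SCALE_MIN + SCALE_MAX) // 2  # 3

# A neutral respondent rates every item at the midpoint of the scale
scale_centres = {
    name: n_items * NEUTRAL_RATING
    for name, n_items in ITEMS_PER_SUBSCALE.items()
}
```

A subscale mean above its centre can then be read as a positive rating and a mean below it as a negative rating, which is how the means of the two conditions are interpreted in the text.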

Directive guidance in education may not only have positive effects, but it may also lead to a decrease of self-study time, autonomy and self-regulated learning, so students may become less motivated and more passive (Ames 1992; Deci and Ryan 1985; Lepper et al. 1997; Pintrich 2003; Schmidt and Moust 2000). Therefore, directive tutoring might have led to a more negative evaluation of the category of items referring to the general functioning of the tutors. These items consisted of statements regarding the stimulation of autonomy, the activation of the students, and the evaluation of the group processes. However, no differences were found between the two conditions. It can be concluded that directive tutor guidance did not have a negative effect in this respect.

In both conditions the same problems were used to initiate the discussion in the tutorial group meetings. Therefore, no differences between the two conditions were hypothesised in students’ judgments about the problems. This is exactly what we found. These findings together with the equal evaluation of the general functioning of the tutor in both conditions supported the other findings.

The results also show (except for the variable course) that tutors differ significantly from each other. The influence of the tutors on students’ perception of the course is relatively small. Their influence on the discussions and their own functioning is obviously much bigger. Multi-level analyses of those variables that are influenced by the tutors showed differences between tutors. For the course no differences were found. These results may seem trivial. However, for the uniformity of education it is important to try to increase similarity in the way tutors interact with the students. Our intervention tried to do exactly this. Tutors received specific questions and received training in how to use the instructions. This may have reduced the individual differences in tutoring style. As we did not directly observe tutor behaviour in the groups, but inferred it from students’ responses on the evaluation questionnaire, we do not know how the instructions were carried out in practice. Tutors who guide the discussion, as in our intervention, are more prominent in the meetings. As a consequence, tutors who guide directively might be inclined to explain some of the topics, although they were specifically instructed not to do so. Future research could be aimed at how the instructions influence differences between tutors and tutor behaviour per se in practice.

Future research could also be directed at the underlying mechanisms for the results that we found. Our results are in line with cognitive load theory. Providing guidance may have reduced extraneous cognitive load (van Merriënboer and Sweller 2005; Paas et al. 2004). Asking specifically for explanations of the relations between the statistical concepts may have increased germane cognitive load (Paas and van Merriënboer 1994; Sweller et al. 1998). It could be measured whether our approach has such an effect on the perceived cognitive load.

Finally, it is unclear which kind of students have profited most from the extra guidance. We assume that specifically poor students, who have had difficulty in mastering the subject matter, may have had the most benefit from the extra guidance. Future research is needed to confirm this.

This study was done in a field setting. On the one hand this limits the scope of our conclusions. We used a standard questionnaire and the regular final course exam as measurements. Moreover, we could not control all factors. For example, we do not have a complete view of how tutors behaved in the meetings, nor do we know how students studied in the different conditions. On the other hand, because the studied conditions were embedded in realistic learning situations, we think that the outcomes are relevant in practice.