Introduction

There has been an increasing emphasis in the past few decades on the social dimension of education. Citizenship education has been introduced into the education systems of almost all European countries. In general, citizenship is concerned with people’s willingness and capacity to participate actively in a community (Educational Council 2003; Westheimer and Kahne 2004). Scholars have outlined the demands a democratic and diverse society makes on citizens (e.g. Haste 2004; Naval et al. 2002; Torney-Purta 2004). An important component of contemporary citizenship is, among other things, the ability to form one’s own opinions about matters concerning justice and the public interest. Citizens in Western societies need to be able to take their own moral decisions and be accountable for those decisions. An important aim of citizenship education is, therefore, to enhance the capacity of students to develop personal viewpoints on value-related matters and to justify their opinions to others (Veugelers and Vedder 2003). Moreover, the fact that democracy is about plurality implies that students need to understand that there are multiple perspectives on moral and social issues and that their own view is only one of many possible perspectives (Banks 2004).

This study focuses on two aspects that should be given systematic attention in an education that aims to foster citizenship. First, students need to be able to reflect on the moral values that are at stake and take them into account when justifying their viewpoints to others. What we mean by moral values are articulations of ideas about a good life and how people should live together (Rokeach 1973; Veugelers 2007). Although embedded in social and political contexts, moral values are different from social conventions and norms in that moral values transcend the specific context and have a more general and abstract significance (Killen 2007; Oser 1996; Smetana 2006; Turiel 1983). Second, while developing their own point of view, students need to reflect on multiple perspectives and to take them into account (Banks 2004).

According to the literature, stimulating dialogue in the classroom would seem to be an effective teaching method to enhance students’ ability to take account of moral values and multiple perspectives (Solomon et al. 2001). The importance of dialogue in the classroom is emphasised from the perspective of citizenship education (Schuitema et al. 2008). Discussion is considered to be an essential component of democratic living (Althof and Berkowitz 2006; Gutmann and Thompson 2004; Haste 2004). In a democratic society, moral values, such as justice or fairness, are continuously debated. Democratic society necessitates being able to communicate with different social groups that have different points of view. Schools are one of the few places where people can engage in this kind of discussion and learn the attitudes and skills required (Hess 2009; Parker 1997). In addition, dialogue is advocated as a means to a variety of outcomes that are believed to be important for citizenship education (Hess and Avery 2008; Parker and Hess 2001; Burbules and Bruce 2004). When students engage in dialogue, they are encouraged to consider the perspectives of others and to reason and explain themselves to others. In this way, dialogue is assumed to stimulate the development of critical thinking as a crucial aspect of the competences citizens require to participate in society (Ten Dam and Volman 2004). Furthermore, it supposedly stimulates the development of attitudes such as tolerance, respect, ‘open-mindedness’ and autonomy (Grant 1996; Hess 2009; Saye 1998). In addition, in the domain of moral education, a great deal of research based on the work of Kohlberg (Blatt and Kohlberg 1975) has focused on the effects of moral dilemma discussions on the moral development of students (e.g. Berkowitz et al. 2008; Berkowitz and Gibbs 1983; Damon and Killen 1982).

There are different ways to promote dialogue in the classroom (Hess 2009; Webb 2009). A widespread approach is to have the students work in small groups and to stimulate dialogue between students. An advantage of this approach over, for example, whole-class discussion is that more students can actively participate in the dialogue. Salomon and Perkins (1998) suggest that it is through active participation in social activities that learners transform their understanding and skills. From a Piagetian perspective, peer collaboration provides opportunities for experiencing different opinions, which can elicit cognitive conflict, a mechanism that causes cognitive change in individuals (Blatt and Kohlberg 1975; De Lisi and Golbeck 1999; Doise and Mugny 1984). From a Vygotskian perspective, it is argued that learners appropriate or internalise key aspects of co-constructed ideas and reasoning, which will improve performance in a context without social support (De Lisi 2005).

In line with these ideas, many studies involving teaching methods for citizenship education advocate instructional designs in which students have to work together in small groups and are stimulated to engage in dialogue with each other (see Schuitema et al. 2008). However, only a few of these studies actually elaborate on the qualities of student dialogue that are required to achieve the various goals that are set for citizenship education. Most studies go no further than claiming ‘dialogue makes a difference’. It is questionable, however, whether every kind of interaction will create productive learning opportunities to the same extent. The quality and depth of the dialogue are assumed to determine the quality of the learning process (Webb 2009). Student dialogue should have specific characteristics in order to facilitate learning (Berkowitz et al. 2008; Kumpulainen and Kaartinen 2003; Rojas-Drummond and Mercer 2003; Saab et al. 2007).

Dialogue quality

Which characteristics of student dialogue do we consider important for taking into account and reflecting on moral values and multiple perspectives while developing a personal point of view? We consider that both the structural features and the specific content of student dialogue are important.

With respect to the structural features of student dialogue, many researchers have stressed the importance of processes of co-construction (Berkowitz et al. 2008; Van Boxtel 2004; Rojas-Drummond and Mercer 2003; Webb 2009). Students form their own opinions by using the input of others, and they also contribute towards the opinions of others. They co-construct meanings and ideas when they reflect and elaborate on the contributions of others (Berkowitz et al. 2008; Kumpulainen and Kaartinen 2003; Rojas-Drummond and Mercer 2003; Van Boxtel 2004). There is some empirical support for the value of co-construction processes for moral reasoning. Berkowitz and Gibbs (1983) analysed 30 dialogues of students (mean age 20.7) discussing a moral dilemma in dyads. They found that statements that ‘transform or operate on’ the reasoning of their partners were more frequent in dialogues of students whose moral reasoning ability improved.

A requirement for processes of co-construction is that the participants are involved in the interaction and actively exchange opinions and ideas (Brown and Renshaw 2000; Mercer et al. 1999). Students can only experience a variety of perspectives and moral values when they feel their ideas will be valued and respected and verbalise their perspectives and values. Kumpulainen and Kaartinen (2003) suggest that contributions to the dialogue should be equally distributed among the participants, something they consider to be an important feature of collaborative processes in dialogue. This contrasts with unequal participation that indicates an imbalance in social status and power.

In addition, in order to achieve processes of co-construction, it is essential that there is some degree of mutual understanding (Baker et al. 1999). Participants need to ‘check’ new information in order to maintain common ground (Erkens et al. 2005). Checking behaviour includes verifying questions and all types of confirming, accepting or denying responses. Damon and Killen (1982) coded the dialogues of 69 first-, second- and third-grade students. The ability of students with a relatively low initial ability to reason on issues of justice and fairness improved most when they displayed both transformative statements and statements of direct agreement or repetition. In addition, Erkens et al. (2005) found that checking behaviour in students’ interaction had a positive effect on the overall argumentation in a collaboratively written text.

Finally, besides structural features, the content of student dialogue is important for students to develop their own personal point of view. The different ideas and views students contribute to the dialogue should be supported with arguments (Chinn et al. 2000). The reasoning must be made explicit in the talk (Mercer et al. 1999). Erkens et al. (2005) found that students who participated in groups that displayed more argumentative statements in their interaction wrote essays with better overall argumentation. In the context of citizenship education, ideas and views that students bring into the dialogue should be appraised and validated from the perspective of moral values (Veugelers 2000). Hence, the moral values that are at stake must be made explicit in the dialogue.

In sum, the characteristics of student dialogue that we assume to be important for citizenship education are elaboration on the contributions of others, equal participation, checking behaviour and the explication of moral values.

Instructional strategies for dialogic citizenship education

Achieving a dialogue that meets the characteristics discussed above requires specific skills and attitudes. Several studies have shown that instructional strategies with an explicit focus on the skills and attitudes students need for a productive dialogue have a positive effect on students’ reasoning skills (e.g. Mercer et al. 1999). Considerably less research has investigated the effects of such instructional strategies in the field of citizenship education (Schuitema et al. 2008).

A previous study (Schuitema et al. 2009) investigated the effects of an instructional strategy for dialogic citizenship education on students’ ability to take moral values and multiple perspectives into account when justifying their viewpoints. We designed a curriculum unit for history education in which we integrated dialogic citizenship education. The aim of the unit was to improve students’ ability to support their points of view on moral issues related to the learning content. Students worked together in small groups. We focused in the unit on the skills and attitudes that students need to engage in a dialogue as discussed above (see also Frijters et al. 2008).

We compared the learning outcomes of the students who participated in the curriculum unit with those of students in a control group who followed the same history course without an explicit focus on moral values and dialogue. We investigated the effect of the curriculum unit on the ability to reflect on moral values and multiple perspectives in an essay. Prior to writing an essay, students discussed a statement about a moral issue. After this discussion they wrote a short individual essay about this statement. Analyses of these essays revealed that students who participated in the curriculum for dialogic citizenship education tended more often to take multiple perspectives into account (Schuitema et al. 2009). Similarly, Frijters et al. (2008) found a positive effect of a dialogic instructional design for value-loaded critical thinking on students’ ability to reflect on moral values.

However, our previous study and that of Frijters et al. (2008) leave several questions unanswered. To what extent did the student dialogues display the characteristics we assume to be important for citizenship education? What was the contribution of the quality of student dialogue to the learning effects found in the studies? To answer these questions, in the present study we investigate the relationship between the quality of the dialogues in small groups and students’ individual ability to reflect on moral values and multiple perspectives. The question that this study aims to answer is:

How does the quality of the dialogue relate to students’ ability to take moral values and multiple perspectives into account when justifying their viewpoints?

We expected that students who participated in a group discussion that displays the structural features and the specific content as outlined above would be better able to reflect on moral values and multiple perspectives when supporting their own opinion in their essay.

Method

Instructional materials for dialogic citizenship education

The students in our study participated in a curriculum unit for dialogic citizenship education as an integral part of their history lessons (developed in the previous study described above). The unit was designed for 8th grade, comprised thirteen 45-min lessons and covered the history of the USA from the first settlers to the early twentieth century. The curriculum unit discussed the founding of the USA, the position of the Native Americans, immigration to the USA, slavery and the Civil War. We used dialogue as a potentially adequate instructional strategy aimed at enhancing the ability of students to take moral values and multiple perspectives into account.

Systematic attention was paid throughout the curriculum unit to moral values, described to the students as “opinions, wishes or ideals on how people should live together”. Students learned to recognise and identify moral values covered in the learning materials. They studied, for instance, the text of the 1776 American Declaration of Independence and parts of the 1788 Constitution and had to indicate which values these texts incorporated. They also investigated the values they themselves and their fellow students considered to be important. Another important focus of the instructional materials was multiple perspectives on moral issues and values. The students were provided with several sources reflecting different perspectives on a historical event or situation. They also worked on tasks in which they were required to empathise with the perspectives of particular groups.

Instructions for dialogue

The students were prompted to engage in dialogue with each other during the lessons. They worked in small groups and discussed statements that involved moral issues. Explicit attention was given in the curriculum unit to the skills and attitudes students need to engage in dialogue with each other. From the outset, students were encouraged to share their opinions with others, by, for example, writing down each other’s opinions without the need for immediate agreement. Activities aiming at co-construction and validating were gradually added. Processes of co-construction were stimulated through assignments in which students had to write down a collective point of view. They had to arrive at an agreement or, where there was disagreement, they had to write down what it was they disagreed about and why. To stimulate students to validate their opinions and those of others, they had to determine the moral values that had been involved in the forming of their collective point of view. Teachers were instructed to give as little help as possible with the subject content and were explicitly told to guide the process of collaboration by helping the group as a whole only when students asked for assistance. The teachers kept a daily log in which they recorded how much time had been spent on group work. These logs indicate that, on average, students worked in small groups 59% of the time (varying between 55% and 64%).

Design and procedure

This study focuses on an essay assignment in the last lesson of the curriculum unit. In our view, students should also be able to apply what they have learned outside the context in which it was acquired; they should be able to transfer their skills to new moral issues that they will encounter in daily life. We, therefore, opted for a topic that was not part of the history lessons. The students worked on an assignment that included a brief introduction to a moral dilemma and a statement. The statement was: “School uniforms back in the classroom!” It was suggested that school uniforms should be introduced in the classroom to avoid students being bullied for the way they dress.

The students were given 10 min to discuss the statement in self-selected groups of four. They all then wrote a short, individual essay in which they expounded their personal opinion about the statement. In this study, we investigated the relationship between the quality of the dialogue during the first part of the assignment and the essays students wrote individually in the second part. To control for relevant variables that could affect the performance on the essay assignment, data were collected during the first lessons of the curriculum unit for Reasoning skills and Attitude towards dialogue (see Section 2.5 for details).

We randomly selected four classes from the eight classes that participated in the curriculum unit for dialogic citizenship education. In each of the four classes, we randomly selected three or four groups and recorded the dialogues of these groups. This resulted in a selection of 14 groups from the 65 that participated in the curriculum unit. The selection included 50 students at four different schools. Twenty-seven of the students (54%) were female, and five students (10%) considered their ethnic identity to be non-Dutch (i.e. Moroccan or Surinamese). All students were from the 8th grade of pre-university education (age 13 to 14).

Analysis of the essays

Individual essays were used to assess students’ ability to take moral values and multiple perspectives into account when justifying their opinions. We scored the essays on the use of moral values and multiple perspectives.

The score on the first variable, moral values, was based on the number of arguments that referred to moral values and on the extent to which the students explicitly referred to a moral value. In the scoring of the essays the degree of explicitness was given more weight than the number of references. The more clearly a student refers to a general value that transcends the specific context, the higher the score. An essay on school uniforms in which a student indicates, for instance, that everyone should be able to decide for themselves what to wear, scores higher than an essay in which a student states that she wants to be able to make a decision herself. Even higher scores are given to an essay in which a student links choosing your own clothing with freedom of speech.

The multiple perspectives variable concerns the extent to which students discuss varying perspectives in their essays. Attention was paid in the scoring not only to the number of perspectives, but also to the extent to which the perspectives were elaborated upon. The number of (sub)arguments for each perspective was checked. Most of the time arguments for and against the statement corresponded to two perspectives, but this was not always the case. When a student indicates that a school uniform cannot prevent bullying and that a school uniform conflicts with the freedom of expression, these are both statements against a school uniform. This student, however, elaborates on two different perspectives on the introduction of a school uniform; the second statement can be interpreted as value-related and the first not.

Independent raters scored each essay separately for each variable as a whole, according to a method derived from the comparison method (Blok 1986). The raters were provided with anchor texts, which are model essays with which the raters compare the other essays. We chose (per variable) three typical and representative model essays: a weak, an average and a good essay. These three anchor texts had fixed scores of 50, 100 and 150, respectively. We then asked the raters to assign a score of between 0 and 200 to each essay, compared with the anchors. When a rater assesses an essay to be, for instance, better than the average model essay, but not as good as the good model essay, then this essay will receive a score of between 100 and 150 points.

Different teams of five raters were formed for each variable to prevent one variable influencing another. We implemented the ‘snake method’ for the most efficient assignment of raters to essays (Van den Bergh & Eiting 1989). The essays were randomly divided into as many samples as there were raters. Each rater scored two samples. Rater 1 scored samples 1 and 2, rater 2 scored samples 2 and 3 and so on. The last rater scored the essays in the first and last sample. This meant that each essay was assessed by two raters, making it possible to estimate the reliability of each rater based on all the other raters.
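The overlapping assignment described above can be illustrated with a short sketch. This is not the procedure of Van den Bergh and Eiting (1989) itself, only a minimal reconstruction of the rater-to-sample pattern stated in the text; the function names `snake_assignment` and `raters_per_sample` are hypothetical.

```python
def snake_assignment(n_raters):
    """Map each rater to the two samples he or she scores (1-based).

    Rater k scores samples k and k + 1; the last rater wraps around
    and scores the last and the first sample, as described in the text.
    """
    assignment = {}
    for rater in range(1, n_raters + 1):
        next_sample = rater % n_raters + 1  # wraps n_raters back to 1
        assignment[rater] = (rater, next_sample)
    return assignment


def raters_per_sample(assignment):
    """Invert the assignment: which raters score each sample?"""
    coverage = {}
    for rater, samples in assignment.items():
        for sample in samples:
            coverage.setdefault(sample, []).append(rater)
    return coverage


# Five raters per variable, as in the study.
plan = snake_assignment(5)
coverage = raters_per_sample(plan)
for sample in sorted(coverage):
    print(f"sample {sample}: raters {coverage[sample]}")
```

Under this pattern every sample, and hence every essay, is scored by exactly two raters, and each rater shares a sample with two other raters, which is what links all raters into a single chain for the reliability estimation.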

The reliabilities of the essay assessments were estimated using a LISREL model described by Van den Bergh and Eiting (1989). The scores of each rater were modelled as indicators of a latent variable. The estimated standardised regression coefficients of the latent variables of the indicators are the reliabilities of the raters. In the end, each essay was scored by two raters independently of each other. The final score was determined by calculating the average of these two scores. The Spearman–Brown formula for test length was used to calculate the reliability of these average scores on the basis of the two reliabilities of the raters of the sample in question. The reliability of the average scores varied from .76 to .93 (see Schuitema et al. 2009).
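For reference, the general Spearman–Brown formula for a test lengthened by a factor k is k·r / (1 + (k − 1)·r), which for an average of two ratings reduces to 2r / (1 + r). The sketch below implements this standard formula for a single reliability r; how the two rater reliabilities of a sample were combined before applying it is not specified in the text, so the input value in the example is purely illustrative and not a result of this study.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Spearman-Brown prediction of the reliability of a lengthened test.

    reliability   -- reliability of a single rating (between 0 and 1)
    length_factor -- number of parallel ratings that are averaged
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Illustrative value only: a single-rater reliability of .60
# yields a reliability of .75 for the average of two raters.
example = spearman_brown(0.60, 2)
print(round(example, 2))
```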

Coding of the dialogues

The taped dialogues were transcribed and coded in three phases. In the first phase, we focused on the function of the communication. In phases 2 and 3, we coded the content of what was being communicated. We focused on utterances in which a moral value was expressed and on the number of themes that were discussed. We used the turn shifts of the speakers to mark off the unit of coding. An utterance was defined as the speech of a single speaker without interruption from other speakers.

Communicative acts

The communicative act coding indicates the communicative function of an utterance. Table 1 shows the codes we used and the descriptions. The main aim in this first phase was to identify utterances that indicate processes of co-construction. It is important for the process of co-construction that students contribute to the content of the task by bringing in their viewpoints and that students react to each other. We, therefore, distinguish utterances that are a contribution to the content of the task from utterances with the aim of regulating the dialogue. Contributions to the content of the task can be made by bringing in a new viewpoint or argument (informative), by transforming or operating on the contributions of others (transformative) or by checking new information. We consider these three communicative acts to be indicators of processes of co-construction. The remaining codes in Table 1 refer to utterances aimed at the regulation of the content or process of the dialogue or utterances related to the writing process. Interrater reliability of the coding was determined by comparing the ratings of two independent raters for two dialogues (294 utterances). Analyses showed the communicative act coding to be reliable with a Cohen’s Kappa of .73 and an interrater agreement percentage of 77%.
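To make the two reliability indices concrete, the following sketch computes the proportion of agreement and Cohen’s Kappa for two raters who each assign one code per utterance. It uses the standard definition, kappa = (p_o − p_e) / (1 − p_e), with the chance agreement p_e derived from each rater’s marginal code frequencies. The function name `agreement_and_kappa` and the ten example codes are invented for illustration; they are not data from this study.

```python
from collections import Counter


def agreement_and_kappa(codes_a, codes_b):
    """Return (proportion of agreement, Cohen's kappa) for two raters."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same, non-empty set")
    n = len(codes_a)

    # Observed agreement: share of utterances given the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Chance agreement from the marginal frequencies of each rater.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical codes for ten utterances (I = informative,
# T = transformative, C = checking); not taken from the study.
rater_1 = ["I", "I", "I", "I", "T", "T", "T", "C", "C", "C"]
rater_2 = ["I", "I", "I", "T", "T", "T", "C", "C", "C", "C"]
agreement, kappa = agreement_and_kappa(rater_1, rater_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than the raw agreement because part of the agreement would be expected by chance alone, which is why both indices are reported.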

Table 1 Codes, descriptions and examples of communicative acts

Value-related utterances

To assess the quality of the dialogue content, we made a distinction between value-related utterances and utterances in which no moral value was expressed. The coding in this second phase involved only those utterances that were coded as informative or transformative in the first phase. Students usually did not explicitly express moral values. However, we consider an utterance to be value-related if we can reasonably assume that an appeal is being made to a general moral value which transcends the specific context. For instance, when a student states: “there are differences in religions and people should be able to express their religion by wearing a headscarf”, she is appealing to a general notion that differences between people must be respected. Table 2 gives more examples of value-related utterances. Three dialogues (171 utterances) were double coded. The coding of value-related utterances appeared to be reliable (Cohen’s Kappa = .75; interrater agreement = 93%).

Table 2 Value-related themes: description and examples of value-related utterances per theme

Value-related themes

The quality of the dialogue is determined not only by the number of references to moral values but also by the variety of value-related themes. In the third and last phase we therefore coded the dialogues according to the different value-related themes. Table 2 presents an overview of the themes that students discussed in the dialogues. Reliability analyses revealed a Cohen’s Kappa of .93 and an interrater agreement of 94%. Based on the theme coding we calculated, for each group, the total number of different themes that were discussed.

Asymmetry of participation

To assess the extent to which students participated equally in the dialogues, we calculated the level of participation asymmetry. We first calculated the number of informative and transformative utterances made by each participant as a percentage of the total made by the whole group. Subsequently, we calculated for each member how much this percentage deviates from the ideal situation of perfect participation symmetry. For example, in a group of four, participation is perfectly symmetrical if each member makes 25% of the total number of informative and transformative utterances. The final score for participation asymmetry is determined by calculating the group mean of these deviation scores.
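The calculation can be sketched as follows. The text does not state whether absolute deviations were used; the sketch assumes they were, because signed deviations from the equal share always sum to zero and would yield a group mean of zero for every group. The function name `participation_asymmetry` and the utterance counts in the example are hypothetical.

```python
def participation_asymmetry(utterance_counts):
    """Mean absolute deviation (in percentage points) from equal shares.

    utterance_counts -- number of informative plus transformative
                        utterances made by each group member
    Assumption: deviations are taken as absolute values.
    """
    total = sum(utterance_counts)
    if total == 0:
        raise ValueError("the group made no utterances")
    n_members = len(utterance_counts)
    ideal_share = 100.0 / n_members  # e.g. 25% in a group of four

    shares = [100.0 * count / total for count in utterance_counts]
    deviations = [abs(share - ideal_share) for share in shares]
    return sum(deviations) / n_members


# Hypothetical group of four: shares of 50%, 25%, 15% and 10%.
uneven = participation_asymmetry([20, 10, 6, 4])
even = participation_asymmetry([10, 10, 10, 10])
print(uneven, even)
```

A score of 0 indicates perfectly symmetrical participation; higher scores indicate that the content-related contributions were concentrated in fewer group members.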

Control variables

At the start of the study, two variables were measured which were expected to correlate with the performance on the essay assignment.

General reasoning skills

It can be expected that reasoning skills play an important part in the ability to formulate an opinion. To estimate the level of students’ reasoning skills, three relevant scales (21 items) of the cross-curricular skills test were used (Meijer et al. 2001): forming opinions; notions and beliefs; and distinguishing facts and opinions. Cronbach’s alpha over the three scales was .63.

Attitudes towards dialogue

Students’ attitudes towards dialogic learning may influence the extent to which students learn from the dialogue. We used the ‘attitudes to dialogic learning scale’ (Frijters et al. 2008) to estimate students’ attitudes to the exchange, co-construction and validation of opinions (α = .82).

Statistical analyses

The research question concerns the relationship between the quality of the dialogues and the individual ability to take moral values and multiple perspectives into account, measured by the essay assignments. There are six variables that represent the quality of the dialogues, including three types of communicative acts that we consider to be important indications of co-construction processes: informative utterances, transformative utterances and checking utterances. The other three variables are value-related utterances, number of value-related themes and the level of participation asymmetry. The essay scoring produced a score for moral values and a score for multiple perspectives for each student. A multivariate multilevel analysis was performed (MLwiN 2.02: Rasbash et al. 2005) with the two essay scores for moral values and multiple perspectives as dependent variables. The essay scores were measured at the student level and students were nested in groups. Therefore, we fitted multilevel models with two levels: student and group. To control for individual differences in reasoning skills and attitude towards dialogue, these variables were included in the analyses as covariates. A previous study revealed gender differences in students’ ability to justify an opinion (Schuitema et al. 2009). We, therefore, also controlled for gender. Because there were only five students with a non-Dutch ethnic identity, and to keep the model parsimonious, we decided to exclude ethnic identity from the analyses. Three models were fitted and compared. The first model was an empty model and included only the dependent variables. The covariates were added in the second model. In the last model, the six variables representing the quality of the dialogue were added to the model. We used a forward stepwise procedure in which only significant variables were included in the model (see Snijders and Bosker 1999).
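As an orientation for readers, a two-level multivariate model of this kind can be written in a generic form. The notation below is our own shorthand and an assumption about the general model structure, not the exact specification fitted in MLwiN: h indexes the two essay scores, i the student and j the group; x denotes the student-level covariates (reasoning skills, attitude towards dialogue, gender) and z the group-level dialogue variables retained by the stepwise procedure.

```latex
% Generic two-level multivariate model (illustrative notation, assumed form)
y_{hij} = \beta_{0h}
        + \sum_{p} \beta_{ph}\, x_{pij}
        + \sum_{q} \gamma_{qh}\, z_{qj}
        + u_{hj} + e_{hij},
\qquad h \in \{1, 2\}

\begin{pmatrix} u_{1j} \\ u_{2j} \end{pmatrix} \sim N(\mathbf{0}, \Omega_u),
\qquad
\begin{pmatrix} e_{1ij} \\ e_{2ij} \end{pmatrix} \sim N(\mathbf{0}, \Omega_e)
```

In this form the empty model contains only the intercepts and the two sets of random terms, the second model adds the x terms, and the third model adds the z terms.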

Results

Descriptive data

Before we discuss the results of the statistical analyses, we first take a closer look at the dialogues. The dialogues lasted an average of 12.4 min (SD = 2.2) and the average number of utterances per group was 126 (SD = 34.6). Table 3 presents an overview of the results of the communicative act coding.

Table 3 Communicative acts: mean, percentage of total number of utterances and standard deviation (N = 14)

About 60% of the utterances (informative, transformative and checking) were directly related to the topic of school uniforms. The ratio between informative utterances on the one hand and transformative and checking utterances on the other suggests that an informative utterance was followed, on average, by one or two transformative utterances and one or two checking utterances. A comparison of the standard deviations shows that the number of informative utterances varied less between groups than the numbers of transformative and checking utterances. Thus, the groups differed most in the extent to which students elaborated on new contributions. Students used about 25% of the utterances for regulation, the great majority of which focused on the writing of the essays.

Table 4 shows the value-related themes and the average number of value-related utterances each group made about that particular theme. The total number of value-related utterances per dialogue was an average of 8.71. This is about 18% of the informative and transformative utterances and about 6% of the total number of utterances. The table also shows that the most frequently discussed themes were ‘freedom’, ‘bullying’, ‘equality’ and to a lesser extent ‘diversity’.

Table 4 Value-related utterances per theme: mean, percentage of the total number of value-related utterances and the standard deviation (N = 14)

Relationship between the dialogues and the essays

In this section, we present the results of the multilevel analysis concerning the relationship between the quality of the dialogues and the use of moral values and multiple perspectives in the individual essays. We performed a multilevel analysis, with the essay scores on values and multiple perspectives as dependent variables (see Table 5).

Table 5 Results of the multilevel analyses on the essay scores (N = 50)

In the second model, we added the covariates to the empty model. This resulted in a significant improvement of the fit of the model. As shown in Table 5, girls scored significantly higher on moral values in the essays than boys. In addition, attitudes towards dialogue had a significant effect on the essay score for moral values, and the score for reasoning skills was related to the essay score for multiple perspectives. To construct the third model, we added in a forward stepwise procedure the six variables that represent the quality of the dialogue: informative, transformative, checking and value-related utterances, the number of themes and the asymmetry of participation. Only value-related utterances proved to be significant and were included in the model. The effects of the other five variables were not significant and these variables were, therefore, not included in the model. Value-related utterances appeared to be significant for the essay score on values as well as for the essay score for multiple perspectives. Students in groups that made more value-related utterances were better able to discuss moral values and the different perspectives in their individually written essays.

A closer look at the dialogues: transformative and value-related utterances

Contrary to our expectations, the results indicated that processes of co-construction did not seem to be related to students’ individual performance on the essay assignment. There was no relationship between transformative utterances in the dialogues and essay scores. We did find, however, a relationship between value-related utterances in the dialogues and essay scores. This indicates that students used the content of the dialogue to write their essays. But the extent to which students elaborated on the arguments contributed by other group members did not seem to matter. To gain a better understanding of these results, we take a closer look at what students actually did during the dialogues. We focus particularly on the number of transformative utterances in comparison with the number of value-related utterances and the scores for moral values and multiple perspectives on the essays.

We selected three groups on the basis of the number of transformative utterances and the scores on the essays. The first group we selected made many transformative utterances and its members had high essay scores. The performance of this group was therefore in line with our expectations. Subsequently, we selected two groups whose results did not correspond with our expectations: one group with many transformative utterances but low essay scores for moral values, and one group with a relatively small number of transformative utterances but nonetheless high scores on the essays.

The first example (Table 6) presents parts of a dialogue in a group of four girls with many transformative utterances (37). All four girls had high essay scores. The average essay score was 144 for values and 138 for multiple perspectives. As this dialogue shows, there are many indications of processes of co-construction. The girls elaborate on the contributions of others and finish each other’s sentences. They seem to know when they are on the right track, and they collaboratively elaborate on the value-related themes. Moreover, they recognize moral values in the dialogue: “that’s a value! You should decide for yourself what you want to wear” (line 9). As a result of this collaborative elaboration, they make a relatively large number of value-related utterances (18).

Table 6 Example of a dialogue with many transformative utterances and many value-related utterances

The second group consists of two boys and two girls. They all had very low scores for the use of values in their essays (36) and a moderate score for multiple perspectives (90). As expected, they made only a moderate number of value-related utterances during the dialogue (6). Contrary to our expectations, however, this group’s dialogue did show many indications of processes of co-construction. They made 45 transformative utterances. Why did this not lead to more value-related utterances and higher essay scores? Table 7 presents parts of their dialogue. The students elaborate on the contributions of others just as the first group did. But they focus on practical issues that are not value-related, such as the fact that the school uniform has to be washed.

Table 7 Example of a dialogue with many transformative utterances and a moderate number of value-related utterances

Later, when they bring up an issue in which moral values are involved, the issue of bullying, the problem is dismissed more than once as a non-issue: “Nobody gets bullied because of the way they dress at this school, unless you wear something really stupid” (line 34). The dialogue continues on this issue without making progress for a while, until the students finally arrive at the next step. They bring up the idea that if there are students who are bullied because of the way they dress, then school uniforms might still be a good idea (line 66).

The last group we selected had very high essay scores for moral values and multiple perspectives (Table 8). Their average score was 152 for moral values and 158 for multiple perspectives. The group comprised two girls and two boys, and their dialogue shows many value-related utterances (15). However, they made only 23 transformative utterances, which is a relatively small number. This indicates that the students were not heavily involved in a process of co-construction. Nevertheless, the transcript of their dialogue shows that they were very cooperative. They seem to use the dialogue as a brainstorming session. In order to benefit from each other, they try to come up with as many arguments as possible. They use many regulative utterances to direct the content of the dialogue (19). At a certain point, for example, they think they have enough arguments against school uniforms, so they decide that they should concentrate on arguments in favour of school uniforms (line 52). This results in more value-related utterances.

Table 8 Example of a dialogue with a moderate number of transformative utterances and many value-related utterances

The three examples show that students do not need to make many transformative utterances in order to use the dialogue to achieve high essay scores. Conversely, many transformative utterances do not appear to guarantee high essay scores. Nevertheless, a closer look at the dialogues indicates that the number of value-related issues discussed by the group is not the only factor that matters for students’ individual performance on the essays. The structural features of a dialogue, as seen here, and the content of the dialogue jointly influence students’ ability to take moral values and multiple perspectives into account when justifying their personal opinion on a moral issue.

Conclusions

We focused in this study on the quality of student dialogue in citizenship education. How does the quality of the dialogue relate to students’ individual ability to take moral values and multiple perspectives into account? We expected that students who participated in a high quality group discussion would be better able to reflect on moral values and multiple perspectives when supporting their own opinion in their essay than students who participated in a low quality group discussion. The quality of student dialogue referred both to the structural features of a dialogue and to the content. We expected the following characteristics of student dialogue to be important for citizenship education: equal participation, elaboration on the contributions of others, checking behaviour and the explication of moral values.

The results partly confirmed our expectations. The quality of the content of the dialogues appeared to be related to students’ ability to take moral values and multiple perspectives into account when justifying their viewpoints. Students who participated in groups that made more value-related utterances in their dialogue also referred more often and more explicitly to moral values in their individually written essays and were better able to validate the different perspectives. The results indicate that students ‘used’ the dialogue with others to write their own essays. This applies to all students, regardless of their contribution to the quality of the dialogue. To evaluate the validity of this finding, we must consider the fact that the number of value-related utterances in the dialogue was measured at the group level, while the performance on the essays was estimated at the individual level. It is, therefore, possible that students who are less skilled in using moral values and multiple perspectives to justify their opinions may attain high essay scores by participating in a dialogue with many value-related utterances. It would be interesting for future research to investigate the quality of the contributions of individual students to the dialogues and compare them with their performance on the essays.

In general, our study indicates that the quality of the content of the dialogue is important for students’ ability to substantiate an opinion. We have to be cautious, however, with conclusions about causality based on correlational research. An alternative explanation for the relationship between the quality of the content of the dialogues and the essays is that students who are able to express moral values can do this both verbally in a dialogue with others (and thus contribute to a high quality dialogue) and in writing in an individual essay.

The other dialogue characteristics that we examined in this study, informative utterances, transformative utterances, checking utterances, the number of value-related themes and the level of participation asymmetry, did not appear to be related to the ability of students to take moral values and multiple perspectives into account. These results contrast with our expectations about the processes of co-construction during the dialogues. The extent to which students elaborated on the contributed arguments did not seem to matter; only the exchange of value-related arguments was positively related to the individual performance on the essay assignment. Apparently, it was enough to repeat what had been said in the dialogue to receive a higher score for moral values on the essay. Students might have considered it unnecessary to elaborate on the various arguments. This is consistent with the finding that most regulative communication and on average 16% of the dialogues focused on the writing of the essays (Table 3). It indicates that at least some of the groups were very much focused on how to write the essay and might have put the dialogue primarily at the service of this focus.

A closer look at the content of some of the dialogues has indeed shown that students do not need to make many transformative utterances in order to use the dialogue to achieve high essay scores. Some of the students seemed to have used the dialogue as a brainstorming session, rather than as a discussion in which they formed and changed their opinions. At the same time, many transformative utterances are no guarantee of high essay scores. If students focus too long on one issue or spend their time on issues that are not value-related, the discussion might not help them to write better essays.

One possible explanation for these results may be our conception of processes of co-construction, operationalized as the number of transformative utterances. It is possible that other indicators of co-construction are also important for students’ ability to justify their opinion. Mercer et al. (1999), for instance, argue that during an effective dialogue it is not only important that students react to each other’s contributions but also that students seek to reach agreement. A line of argumentation may be more easily reconstructed in an individual task setting following the group work when it is closed with a joint conclusion. Future research must show the extent to which this characteristic of an effective dialogue also holds true in the moral domain of citizenship education in which ‘room for differences in opinions’ is highly valued.

Another factor that might influence our results regarding the relationship between student dialogue and students’ individual essays is the group constellation. Mischo (2005), for example, found an effect of an explanation-oriented discussion style during a group discussion on perspective coordination, but only in groups that consisted of individuals with a consensus orientation, that is, individuals willing to find the best argumentative solution in a rational and cooperative manner. The results of other studies also show a relationship between group constellations and discussion styles. For instance, children have more transactive conversations with friends than with non-friends (see Berkowitz et al. 2008). The groups in our study were self-selected, and it is therefore possible that some groups consisted of friends and others did not. In addition, gender differences might also induce different cooperation and discussion styles. In this study, we controlled for gender differences as an individual trait. However, the gender composition of the groups may also have an influence (see e.g. Myaskovsky et al. 2005; Hunter et al. 2005). In sum, it is conceivable that the answer to the question ‘what makes a dialogue effective’ depends partly on the constellation of the group. To put it differently, the characteristics of a dialogue that are the most important for students to learn how to justify their viewpoints might differ between different types of groups. This issue calls for further empirical research.

Although there might be plausible explanations, in our study we did not find a significant relationship between the ability to reflect on moral values and multiple perspectives and the transformative nature of the dialogue. This result is not in line with those of other empirical studies, such as those by Berkowitz and Gibbs (1983). A final explanation that we would like to mention refers to the task we used to measure students’ ability to justify an opinion. In order to get a high score for the essays, it might have been sufficient to repeat what had been said in the dialogue. It is possible that transformative utterances in the dialogues led to a deeper understanding of the moral dilemma and the related values and perspectives without resulting in higher essay scores. Moreover, the students in our study usually agreed that they did not support the introduction of a school uniform, the topic of the assignment. A more controversial topic that evoked more debate between students might have led to different results.

To conclude, our study emphasises the importance of analysing student dialogues in more detail to understand how student dialogue can improve students’ ability to form and substantiate their opinions on moral issues. Although learning through dialogue is a central element in many instructional formats for citizenship education, empirical studies that elaborate on the content and structural features of such dialogue are scarce (Schuitema et al. 2008). Our study indicates that the quality of the content of students’ dialogue is important for students’ ability to substantiate their opinion on moral issues with value-laden argumentation. An instructional design for citizenship education in which dialogue is a central element should, therefore, aim to enhance the quality of the content of students’ dialogues. Attention should be given, in particular, to the validation and the invalidation of ideas and views from the perspective of moral values. In addition, this study shows that the results with regard to the effect of co-construction on students’ ability to substantiate their opinions are not unequivocal. Greater insight is clearly needed into the role of co-construction in students’ dialogue, compared with other characteristics of that dialogue and taking into account context variables such as group composition.