1 Introduction (Problem Definition and Research Questions)

Valuing and decision-making competencies in the context of Education for Sustainable Development (ESD) play an essential role in the national educational standards (NBS) for biology. Students should be prepared to recognize biological facts, evaluate them, and justify their own or third-party judgements. On this basis, they should develop their point of view, considering individual and socially negotiable values. This implies multi-perspective thinking concerning the ESD dimensions: ecology, economy, and social issues (KMK, 2005). However, teachers perceive the diagnosis of these competencies as a challenge. Empirical data show that German teachers find the openness of evaluation processes and the associated performance assessment difficult (Alfs, 2012). Therefore, the goal of this study is to develop, test and evaluate a diagnostic grid for valuing and decision-making competencies. In addition, a teaching unit to promote these competencies in heterogeneous classrooms should be developed.

The overarching research question (RQ) of this Design-Based-Research (DBR) project is:

  • (RQ.DBR): How must a lesson be designed to comply with the conditions of the cooperating school and address the school-specific requirements?

Explanation: The focus of the paper will be to describe how the diagnostic grid was developed, tested, and evaluated. In this sense, “designing a lesson” means mainly the aspect addressed by the diagnostic grid.

In addition, cycle-specific questions are defined as:

  • (RQ. Cycle.1): Which aspects of the design prototype show the most significant potential for development (practical/theoretical)?

  • (RQ. Cycle.2): Which criteria must be modified to improve the selected methods for promotion and diagnosis of the decision-making competence (cf. material-based writing/competence grid)?

  • (RQ. Cycle.3): How suitable is the modified teaching method (cf. material-based writing) to promote decision-making competencies? How suitable is the developed diagnostic grid (PARS model; short for: “In Partnership with Competences Diagnosing”) for diagnosing decision-making competence?

Explanation: The research question for the first cycle was intentionally formulated in an open-ended way, since at that time (2016–2017) it was not yet clear whether the promotion (and diagnosis) of decision-making competence would be the focus of the research project. From cycle.2 onwards, the methods for promotion were defined, but it was also clear that these would have to be modified to meet the needs of the target group. In cycle.3, the focus was on the important testing of the modified methods in practice.

2 Theoretical Framework

2.1 Design-Based Research (DBR)

Didactic development research or design research starts from a concrete problem in educational practice. A solution is developed and validated by a theory developed in the context of this problem. A solution usually involves the further development of a method, selection of materials or medium that addresses the problem. To verify this solution, evaluation methods must then be selected and adapted to measure the effectiveness of the solution. DBR does not offer a systematic repertoire of methods, but falls back on the evaluation methods that are common in subject didactics. They can be qualitative or quantitative, depending on the needs of the research design. DBR works in iterative cycles, which is why data are collected and evaluated several times and the design is further developed based on the knowledge gained, so that an optimized prototype is available at the end of the development (see Fig. 12.1) (McKenney & Reeves, 2012).

Fig. 12.1
A flow diagram of an exemplary DBR cycle based on the model of McKenney and Reeves. The three main steps are problem analysis and exploration, development, and implementation leading to re-design. Further development of each step occurs after the re-design.

Exemplary DBR cycle based on the Model of McKenney and Reeves (2012). The numbers ➊ → ➌ refer to the chronological order in which a design should be developed, implemented, evaluated, and later revised

2.2 Material-Based Writing

The ability of students to write their own judgements based on existing content knowledge is a competence that is required in many contexts today (Schüler, 2017). Different sources are used for writing these texts. These can be continuous texts, statistics, explanatory graphs, or images (Abraham et al., 2015). The method of material-based writing transfers the usual combination of reception and production phases into a new task type (target text) and is thus a demanding form of reading and writing promotion that serves the purpose of (subject matter) learning (see Fig. 12.2) (Phillip, 2017).

Fig. 12.2
A double bar graph of total number of tasks solved and tip cards used. The total number of tasks solved percentage is the highest for cycle 3. The total number of tip cards used is the highest for cycle 1.

Comparative representation of the total number of tasks solved and tip cards used over the course of cycles 1–3. Since the total number of possible tasks to be solved decreased from cycle 1 → 2, the figures were converted into percentages for the purpose of normalization. This does not apply to the “tip card selection”, where the real numbers were taken over unchanged. “Linear” represents the linear trend developments based on the numbers
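The normalization and the linear trend mentioned in the caption can be illustrated with a minimal sketch. The function names and the example numbers are hypothetical and do not come from the study data; the sketch also assumes that “Linear” denotes an ordinary least-squares line fitted over the cycle numbers.

```python
def percent_solved(solved, possible):
    """Convert an absolute number of solved tasks into a percentage."""
    return 100 * solved / possible


def linear_trend(values):
    """Least-squares line through (1, v1), (2, v2), ...

    Returns (slope, intercept). Needs at least two values.
    Assumption: the x-axis is simply the cycle number 1, 2, 3, ...
    """
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical example values (not the study's data):
# (solved tasks, possible tasks) per cycle.
cycles = [(40, 80), (36, 60), (42, 60)]
percentages = [percent_solved(s, p) for s, p in cycles]
slope, intercept = linear_trend(percentages)
```

A positive slope indicates that the share of solved tasks rises from cycle to cycle. Because the tip-card counts are reported as absolute numbers, they would be passed to `linear_trend` without the percentage conversion.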

Regarding the relevance of material-based writing for education, Abraham et al. (2015) state that the skills required of students to meet the objectives of this method are central to school, extracurricular and future professional activities. However, this alone does not yet legitimize the use of the method in biology lessons from a didactic perspective. Previous studies on the handling of situational writing tasks in biology lessons lead to the conclusion that the writing of subject-specific texts is largely outsourced from the classroom in the form of homework (Thürmann et al., 2015). It could be observed that in biology lessons themselves, only about 6% of the class time is spent on written work phases. In addition, the writing tasks ultimately used are mainly reproductive forms of writing. This also applies to subject-specific text types, such as experimental protocols (Thürmann et al., 2015). From this, it can be concluded that although writing activities are a prerequisite for teaching science lessons, they are hardly ever integrated into subject-specific work (Gebhard et al., 2017). Many of the special topics of biology lessons defined in the curriculum touch on bioethical issues of society, which are of relevance to the students. Material-based writing offers a methodical approach that enables the students to deal with these topics in depth and independently. This has the potential that students learn to position themselves on controversial socio-scientific issues in a controlled way. In this way, they are gradually introduced to a scientific language and discourse culture (Schüler, 2017).

3 Methodology for Data Sources, Collection, and Analysis

The participants of this study are ninth graders from two biology classes and two German language classes. In the biology classes, the course lasted 360 min (for the design see Table 12.1, topics 1–3) and in the German language classes 180 min (only topic.3). The procedure in topic.3 always includes the reception phase (collecting arguments), the production phase (writing a statement) and a written survey. In addition, in the biology classes, the students were sensitized to the topic beforehand. A total of 181 students took part in the study (Ø age: 14.5). The students’ performance was defined by their last report card grade in biology.

Table 12.1 Overview of the lesson design (post-cycle.3)

The basis for this paper was the analysis of students’ written judgements, as well as teacher interviews (Table 12.2). The transcription and subsequent analysis of the collected writing products were carried out with F4 analysis (Version 2.5), following the method of content-structuring qualitative content analysis according to Mayring (2015). Cohen’s Kappa was calculated using the online tool ReCal2. The mean (Ø) coefficient is 0.79. The competence levels of the students’ written judgements were analysed using the diagnostic grid (see Table 12.3).
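For readers unfamiliar with the agreement measure, the following minimal sketch shows how Cohen’s Kappa for two coders can be computed. It is not the ReCal2 implementation; the function name and the example codings are hypothetical and only illustrate the standard formula κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who rated the same units."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units coded identically.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    codes = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in codes) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of six text segments (not the study's data).
coder_1 = ["Perspective", "Scope", "Knowledge", "Solutions", "Scope", "Knowledge"]
coder_2 = ["Perspective", "Scope", "Knowledge", "Scope", "Scope", "Knowledge"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this invented example the coders agree on five of six segments (p_o ≈ 0.83), chance agreement is 11/36, and the resulting kappa is 0.76.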

Table 12.2 Overview of the data used in this paper
Table 12.3 PARS-model – diagnostic grid – characterization of the quality of argumentation structures in the context of ESD science discussions in biology lessons (post-cycle.3)

All judgements were evaluated using the PARS model (see Table 12.3 again). When working with the model, it quickly became apparent that only four out of five categories (cf. Perspective, Scope, Knowledge, and Solutions) were suitable for evaluation, as the category Values was not explicitly part of the task, and was therefore not designated in the judgement. In this way, the use of this category was only possible if the evaluator had knowledge of the authors’ value system. The teachers were involved in the evaluation through the interviews (in cycle.3). They were asked to evaluate two previously selected judgements with the PARS model before the interviews. Care was taken to ensure that the two judgements differed in quality to create a contrast. This was to ensure that the differences in quality were also independently confirmed. The actual interviews then dealt with the evaluation as well as the work with the model and possible suggestions for improvement.

3.1 Diagnostic Grid (PARS-Model)

The results of cycle.2 made clear that there is a lack of judgement competence grids that are both practicable in a teacher’s everyday work and context-specific with regard to the learning products.

Christenson and Chang Rundgren (2015) presented a framework for evaluating argumentation structures in the context of socio-scientific issues, which aims to capture both the content of a learning product and its argumentation structure. The aim is to provide teachers with a tool to identify quality-producing and defining features of arguments in the context of problem-oriented judgement. Against this background, a grid was developed that additionally includes sub-elements from other models to take the ESD perspective relevant to biology lessons more strongly into account. The basis of this grid is the SEE-SEP model, which relates the thematic complexes sociology/culture (S), environment (E), economy (E), science (S), ethics/morality (E) and politics (P) to the personal aspects of knowledge, values, and personal experiences. In addition, the SOLO classification was used to differentiate the levels (Biggs & Collis, 1982). The claim of complexity reduction is expressed above all in the approach postulated by Christenson and Chang Rundgren (2015) of a concrete judgement or evaluation and the associated justification, whereby both pro and contra arguments are considered. The arguments can be value-based or technical. Value-based arguments indicate how something should or may be, i.e., they are normative. Technical arguments are mainly descriptive. In the PARS model, the value-based arguments are considered, but unlike in the SEE-SEP model, they are not equally weighted and are only represented in one category (see Table 12.3 – Values). The technical arguments are taken into account to a greater extent: an entire category is dedicated to the ESD aspect mentioned above (cf. perspective ESD) and, following the Göttingen model of evaluation competence, the spatial dimension (local/global) and the time dimension (short-/long-term consequences) are considered in a separate category (cf.). In addition, the category of solution finding was taken over from the Göttingen model. The latter two categories had to be modified because the original grid was not based on the specific characteristics of the construct of judgement competence described in the German educational standards for biology (KMK, 2005). However, since the approach to socio-scientific issues is comparable, if not identical, to judgement competence, only slight modifications were necessary (Hostenbach et al., 2011). For application in everyday school practice, the levels were additionally operationalized according to the curriculum of the federal state of Bremen for secondary schools, using operators of increasing difficulty from level to level (cf. Level.0 → Level.4; see Table 12.3 again).

4 Results and Discussion

The following three sub-chapters are structured based on the questions mentioned at the beginning, with the three cycle-specific questions being answered first and the core question in the conclusion.

4.1 Results: Cycle.1

  • (RQ.): Which aspects of the design prototype show the most significant potential for development (practical/theoretical)?

The results show that the first prototype of the lesson design only partially worked in the given framework conditions of the cooperation school and that there is a lot of potential for improvement.

(…) And that discouraged and that’s why it didn’t work and because the methods built on each other the next one didn’t work either and that’s why the discussion was heavily moderated. (…) The students told me in feedback to the (.) about these methods, the reason why they didn’t adopt them is that it would have required text work, so the implementation.

Source: Interview.1_Teacher.1_Cycle.1_Paragraph.341 (Translation: German → English)

A more general criticism that applied to the entire prototype was that the information texts used were too extensive and complex to be processed in the time provided. The biggest problem, however, was the third part of the lesson, since the promotion of decision-making competence was not successful. The central problem was that the transfer of information (arguments) from the educational film to the list of arguments did not take place, which is why the subsequent steps of classification into pro/con protection and the discussion in plenary did not work. Regarding the research design, there was also the problem that only the argument lists were available as a data source and that, for example, no video recording of the discussion was made. There was also a lack of a suitable tool to analyze the quality of this discussion in relation to ESD.

On the practical side, accessibility must be increased through simpler texts and more explanatory illustrations, and the difficulty of the lesson must be significantly reduced. On the theoretical side, the promotion of decision-making competence in the context of the chosen topic shows the greatest potential for development. This is also confirmed by the teachers interviewed in the first cycle.

(…) So, in relation to the ecosystem, I think decision-making competence is very, very important, but evaluation is not possible without content. Basically, you must have worked out something properly to present the problem, (...) you can only love something that you also know. (…)

Source: Interview.1_Teacher.1_Cycle.1_Paragraph.114 (Translation: German → English)

4.2 Results: Cycle.2

  • (RQ.): Which criteria must be modified to improve the selected methods for promotion and diagnosis of the decision-making competence (cf. material-based writing/competence grid)?

Compared to the first cycle, the revised lesson worked better, which is reflected, among other things, in the lower number of tip cards used (see Fig. 12.2). In particular, the reduction of the amount of text and the accompanying reduction of technical terms, as well as the replacement of text content with explanatory illustrations, seem to have improved the processing behavior. Here, however, much more needs to be reduced or simplified for low-performing students with German as a second language. The tip cards worked better after the revision in the sense that there were fewer comprehension problems. At the same time, they were used less often, which indicates that the need for this form of assistance was not quite as high.

The promotion of decision-making skills through the method of material-based writing in the third part of the lesson was successful. The students worked out arguments from information material and arranged them in a list (of arguments). They were then able to use these arguments in a reasoned judgement on the question of whether the swamp should be protected. However, the task posed enormous problems, especially for the low-performing students, as formulating the judgements requires linguistic skills that were not yet present in this target group. A further problem was that the diagnostic grid used focused on precisely these linguistic skills.

Regarding its use in biology lessons, the focus of the grid in cycle.3 should be shifted to the subject-specific ESD aspects and the task for formulating a judgement should be better supported.

4.3 Results: Cycle.3

  • (RQ.): How suitable is the modified teaching method (cf. material-based writing) to promote decision-making competence? How suitable is the developed competence grid (PARS model) for diagnosing decision-making competence?

The learning effectiveness of the design could be further improved compared to cycle.2. The triple differentiated working materials (differentiated according to the reduced amount of text, shorter sentences, fewer technical terms, and more explanatory illustrations), as well as the possibility for the students to choose them independently, have further improved the processing behavior, especially of the students who tend to be low performing. The tip cards were used even less frequently compared to the previous cycles, which is seen as a positive sign that the need for such support has decreased, even though it is still there (see Fig. 12.2 again).

The third part of the lesson works better with the revised methods of material-based writing compared to the second cycle. Through the support of judgement formation by a guideline, as well as the use of partner work, even low-performing students can be led to write a reasoned judgement in the ESD context. The results show that the students always argue in favor of swamp protection, whereby the ecological and economic aspects are particularly important to them. The social aspect only plays a minor role, and its relevance seems to be linked to the students’ personal experiences. In about half of the cases, the approaches to solutions are compensatory: the students mostly try to offer compromises between the different parties of interest. Most of the students think in very long-term time frames and global, spatial dimensions. They perceive swamp protection as a task for society (Fig. 12.3).

Fig. 12.3
A stacked bar graph of the levels achieved by the students, in percent. Cycle 2 students: level 4, 0; level 3, 26.66; level 2, 16.66; level 1, 56.66. Cycle 3 students: level 4, 0; level 3, 13.33; level 2, 63.33; level 1, 23.33. University students: level 4, 16.66; level 3, 72.22; level 2, 5.55; level 1, 5.55.

Comparative presentation of the levels achieved between the students of cycle.2 and cycle.3. Contrast this with the levels achieved by the expectably higher-performing university students. Since the number of participants differed between the various groups, the values were converted into percentages for normalization. Levels 1–4 are derived from the average scores of the four categories (cf. perspective, scope, knowledge, and solutions) achieved by each student
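The derivation described in the caption (an overall level from the average of four category scores, then percentages per group) can be sketched as follows. The rounding rule (half up), the function names, and the example scores are assumptions for illustration only; the source does not state how averages between two levels were resolved.

```python
from collections import Counter

# The four categories used for evaluation (Values was excluded).
CATEGORIES = ("perspective", "scope", "knowledge", "solutions")


def overall_level(scores):
    """Average the four category scores and round to a whole level.

    Assumption: averages are rounded half up (e.g. 2.5 -> 3).
    """
    mean = sum(scores[c] for c in CATEGORIES) / len(CATEGORIES)
    return int(mean + 0.5)


def level_distribution(levels):
    """Share of students per level in percent, so that groups of
    different size can be compared."""
    counts = Counter(levels)
    total = len(levels)
    return {level: 100 * counts[level] / total for level in sorted(counts)}


# Hypothetical category scores for four students (not the study's data).
group = [
    {"perspective": 1, "scope": 1, "knowledge": 2, "solutions": 1},
    {"perspective": 2, "scope": 3, "knowledge": 2, "solutions": 3},
    {"perspective": 1, "scope": 1, "knowledge": 1, "solutions": 1},
    {"perspective": 2, "scope": 2, "knowledge": 2, "solutions": 2},
]
distribution = level_distribution([overall_level(s) for s in group])
```

With these invented scores, two students reach level 1, one reaches level 2, and one reaches level 3, which gives shares of 50%, 25%, and 25%.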

5 Conclusions

  • (RQ.): How must a lesson be designed to comply with the conditions of the cooperating school and address the school-specific requirements?

To measure the (learning) efficiency over the course of the three cycles using the designated designs, five markers were defined (repeated data acquisition in each cycle at predefined points). These markers are (I) the process log (a competency grid to record the progress of individual students), (II) the complexity of the food web, (III) the quality of the decision, (IV) the lines of argumentation, and (V) the revision of the PARS model. The overall results show that the advancement of students in highly heterogeneous groups can be successful if the educational material is differentiated on multiple levels. These materials must use suitable language (simplified language with reduced use of specific terminology) and a higher proportion of descriptive illustrations relative to complex text passages. Further support for students can be provided through preparatory tasks, with time treated as an additional crucial differentiation factor. The study shows that low-performing students can successfully produce a qualified opinion that meets the requirements of the ESD context by using supporting guidelines and/or partner work. The results also show that the students always argue in favour of swamp protection, whereby they describe ecological and economic aspects as the most important ones. However, students with personal experiences regarding swamps regard social aspects as another critical factor, unlike students without such experiences. Approximately half of the students try to use compromises to negotiate between the interested parties. Most of the students consider extended periods and global dimensions and classify swamp protection as a challenge to be addressed by the whole society. To further improve the developed materials and methods of the teaching units, it is recommended that the use of specific terminology be included in the evaluation of the opinions.
Furthermore, the linguistically improved materials should be integrated into a longer teaching unit on an ESD-approved topic. This should ensure a continuous improvement of the existing lesson. Continuously improving how lessons are taught is of significant importance for differentiated educational materials for Oberschulen (high schools) in Bremen in the field of biology and science. To further improve the PARS model in future projects, additional empirical studies with a higher number of participants and different school types, such as grammar schools, are recommended. What remains is the problem described at the beginning with the category Values, which requires an understanding of the students’ values and can therefore only be assessed by teachers who have knowledge of these values. Level.4 of the PARS model was only achieved by university students. This was not surprising given the requirements, but for the purposes of clarity, it should be mentioned that the task for ninth graders did not require them to reflect on the topic, and level.4 thereby set requirements that the students were not supposed to achieve at all. A further development of the PARS model could therefore be to reduce the requirements at this level, to use the model only up to level.3, or to revise the assignment for the students with regard to the ability to reflect. Overall, the exchange within a community of practice, including science educators, to create a lesson that fulfils the school-specific requirements of an integrated comprehensive school, was a success.