Introduction

During the past decade, there has been evident progress in applications of digital materials in learning mathematics, as educators have become more aware of the importance of such instructional tools in educational practice. In line with this progress, teaching equipment (both hardware and software) has been modernized at all levels, from primary schools to higher education (De Witte & Rogge, 2014; Takači, Stankov, & Milanovic, 2015). Furthermore, extreme situations such as the COVID-19 pandemic highlight the importance of having digital versions of learning materials available. Digital learning materials span a wide range in terms of form and content, but a question common to all is how effective they are.

Although learning environments have changed over the years, the long-standing problem of mathematics education is that students often apply routine thinking, while educators expect them to develop greater understanding, make complex conceptual connections, and be problem solvers (Bergqvist & Lithner, 2012; Hiebert, 2003; Lampert, 1988; Lithner, 2008). The development of different types of student mathematical reasoning is an important aim of contemporary mathematics education. Special attention is given to creative mathematical reasoning that will prepare students for solving non-routine mathematical problems and life-long learning (Freudenthal, 1987; Granberg & Olsson, 2015; Norqvist, Jonsson, Lithner, Qwillbard, & Holm, 2019). In one of the early studies, Lithner (1998) stresses that students often focus on what is familiar to them and on what they can easily recall at a superficial level, avoiding the use of creative mathematical reasoning (perceived as a complex thinking activity). It is imperative to foster and improve students’ creative reasoning, but at the same time, it is the most difficult task for educators (Freudenthal, 1987; Norqvist, 2018; Norqvist et al., 2019).

Researchers have shown that students have more opportunities to learn facts and practise simple procedures to solve problems than to engage in more extensive learning processes and deeper cognitive strategies (Boesen et al., 2014; Hiebert, 2003; Jonassen, 2000). As a consequence, students encounter difficulties when trying to solve complex mathematics tasks and tasks that require creative mathematical reasoning (Boesen, Lithner, & Palm, 2006; Lithner, 2004; Palm, Boesen, & Lithner, 2005). However, experimental studies have shown that students who were involved in solving complex tasks and developing creative mathematical reasoning outperform students who were exposed to routine problems only (Jonassen, 2000; Kapur, 2011). These results indicate that high-level mathematical reasoning can be learned and complex mathematical skills can be developed, but this is a multifaceted and complex educational challenge that goes beyond one educational system, and more comprehensive research on the effectiveness of different teaching strategies is needed (Freudenthal, 1987; Lampert, 1988; Lithner, 2015). Moreover, insights about the effects of interactive learning materials on student success in solving mathematics tasks that require different types of reasoning (as described below) are still lacking.

Bearing in mind the above-mentioned issues, the purpose of this study is to investigate the effects of an interactive learning environment that is a digital version of standard school learning materials in Serbia. Our interactive learning environment is based on a particular instructional design model and use of available tools (mostly GeoGebra), with the purpose of helping students to better understand mathematical concepts and properties. We will investigate to what extent the design of such a learning environment supports students in solving tasks that require different types of reasoning.

Theoretical Framework

In an extensive body of work, Lithner (1998, 2000, 2004, 2015) explored the relationship between the mathematical tasks that students encounter and the type of reasoning they need to solve these tasks. He defined mathematical reasoning as “the line of thought, ways of thinking adopted in order to produce assertions and reach conclusions” (Lithner, 2000, p. 166). Lithner (2008) gives two basic types of reasoning, imitative reasoning (IR) and creative mathematical reasoning (CMR), and then divides these into subtypes. IR is defined as reasoning connected to routine tasks, where the respondent imitates procedures or facts memorized from the learning materials. IR includes two subtypes: memorized reasoning (MR) and algorithmic reasoning (AR).

MR takes place when recalling a complete answer or when the strategy implementation consists only of writing it down. In this kind of reasoning, previously learned material is simply recalled. For example, for the question: “How many dm² are in m²?”, a student can give the correct answer merely by recalling information from the book without deeper insight into the concept of area. Similarly, this type of reasoning is usually present when students memorize mathematical proofs without being able to explain them.

AR is reasoning based on the repetition of familiar, already learned procedures that lead to the conclusion. This kind of reasoning is based on recalling the solution algorithm and is usually applied in calculation tasks or tasks that do not require creating a new solution. In this kind of task, the major challenge for the student is to choose the appropriate algorithm. For example, in the task: “Find the area of a triangle if a = 5 and hₐ = 3”, a student can easily find the area only by applying an algorithm without really understanding what hₐ, a, or area are. Hence, AR may be applied even with weak understanding of a topic (Lithner, 2008).

While MR and AR are characterized by a lack of conceptual thinking and deep insight into the intrinsic mathematical properties of elements, such thinking and insight are indispensable in the case of CMR. Students usually apply CMR when they are faced with a new task and need to create a new reasoning algorithm (novelty). Moreover, CMR involves both plausibility (arguments which support strategy choice) and mathematically founded arguments (based on understanding of mathematical properties of the objects involved in the reasoning).

In an attempt to design an outline for task classification according to appropriate mathematical reasoning, Boesen, Lithner, and Palm (2010) present a framework for classifying tasks, which relies on the study of Palm et al. (2005). The classification is made according to the similarity between tasks that students have to solve, and tasks and information that can be found in textbooks and other learning materials. It includes the following steps: (a) defining the task solution; (b) describing tasks through variables such as information about mathematical components and various hints; (c) identifying the rules, theorems, described facts, exercises, and tasks in learning materials which have the same (or similar) solutions; and (d) drawing a conclusion about the relatedness between tasks and learning material. If the learning materials contain at least three exercises, examples, rules, etc. with characteristics similar to those of the task, the task is classified as a high relatedness task, while in all other cases, it is classified as a low relatedness task (Boesen et al., 2010).
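The decision rule in step (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the framework of Boesen et al. (2010): the function name `classify_relatedness` and the constant name are hypothetical, and the number of similar items is assumed to have been counted by hand in steps (a) to (c).

```python
# Threshold taken from the text: at least three similar exercises,
# examples, rules, etc. in the learning materials.
HIGH_RELATEDNESS_THRESHOLD = 3  # hypothetical constant name


def classify_relatedness(similar_items: int) -> str:
    """Return 'high' or 'low' relatedness for a task.

    `similar_items` is the number of items in the learning materials
    judged (manually) to have similar characteristics to the task.
    """
    if similar_items < 0:
        raise ValueError("similar_items must be non-negative")
    if similar_items >= HIGH_RELATEDNESS_THRESHOLD:
        return "high"
    return "low"
```

For instance, a task with two similar textbook exercises would be classified as low relatedness, whereas a task with three or more would be classified as high relatedness.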

Within high relatedness tasks, two subtypes are defined: high relatedness answer (HRA1) and high relatedness algorithm (HRA2). HRA1 are tasks whose solutions require textual answers or facts that can be found in learning materials. HRA2 are tasks whose solutions are reached through implementation of algorithms available in learning materials. HRA1 and HRA2 correspond to MR and AR, respectively (Boesen et al., 2010).

Low relatedness tasks are divided into two subtypes, local low relatedness (LLR) and global low relatedness (GLR). LLR are tasks whose solutions can be reached through modification or combination of different algorithms available in learning materials. GLR tasks require creating a new (for the student) solution, which is to a lesser extent based on previously seen algorithms and facts. Usually, these tasks are new for the reasoner and are often presented in a real-life context. The framework from Boesen et al. (2010), together with connections to Lithner’s (2008) types of reasoning, is presented in Table 1. We note that the above categorization is closely aligned to levels of cognitive demand (National Council of Teachers of Mathematics [NCTM], 2014), where high relatedness corresponds to low-level tasks, and low relatedness corresponds to higher-level tasks. Although there are various task categorizations according to students’ reasoning, we used the classification given by Boesen et al. (2010) in our research. Their work presents a task taxonomy based on the similarity between tasks and information that can be found in textbooks, which best fits the aims of the present study.
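The correspondence between task types and reasoning types described above can be written down as a small lookup. The sketch below is an illustration only: the names `TASK_TO_REASONING` and `reasoning_family` are hypothetical, and both low relatedness subtypes are mapped to CMR as an assumption based on how this paper groups LLR and GLR tasks.

```python
# Task type -> reasoning subtype, following the text:
# HRA1 and HRA2 correspond to MR and AR; LLR and GLR are assumed
# here to call for creative mathematical reasoning (CMR).
TASK_TO_REASONING = {
    "HRA1": "MR",
    "HRA2": "AR",
    "LLR": "CMR",
    "GLR": "CMR",
}

# MR and AR are the two subtypes of imitative reasoning (IR).
IMITATIVE_SUBTYPES = {"MR", "AR"}


def reasoning_family(task_type: str) -> str:
    """Return 'IR' or 'CMR' for one of the four task types."""
    subtype = TASK_TO_REASONING[task_type]  # KeyError for unknown types
    return "IR" if subtype in IMITATIVE_SUBTYPES else "CMR"
```

Under this mapping, `reasoning_family("HRA2")` gives `"IR"` and `reasoning_family("GLR")` gives `"CMR"`.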

Table 1 The connection between types of tasks and corresponding reasoning

Conclusions of Previous Studies

The type of a task can be an important trigger, which initiates the kind of reasoning students apply (Jäder, Sidenvall, & Sumpter, 2017). When students are faced with a task that has similarities to a task they have already seen in the textbook or in class, they will try to recall the ways of solving it or relate to a suitable algorithm, i.e. students will apply MR or AR (Boesen et al., 2010). However, when students are faced with a completely unfamiliar problem, they usually reason in a way that can be characterized as CMR. Furthermore, in such a situation, students sometimes apply IR but without success (Jäder et al., 2017). In the conclusion of their paper, Boesen et al. (2010) write that students in their study achieved higher scores on problems that required the use of IR.

Bergqvist (2007) points out that school tests in mathematics in Sweden are created so that over 70% of the tasks can be solved by applying only MR and AR, while tasks that require CMR are often neglected. In addition to this, Bergqvist (2007) notes that many teachers create tests that rely heavily on IR. This phenomenon emphasizes the important role of learning environments and learning content, i.e. ways in which the material is presented and the type of outcome that is expected of students. Lithner and Bergqvist (Bergqvist, 2007; Lithner, 2000, 2004, 2008) stress the importance of the approach in presenting mathematical content. They acknowledge that difficulties in learning CMR stem from a lack of (a) reflection on learning, (b) mathematical arguments, and (c) mathematical foundation during the learning process (Bergqvist & Lithner, 2012). In a study by Jonassen (2000), one group of students was exposed to content that requires AR, while another group of students was engaged with materials that require more CMR. Students who were oriented toward more CMR content performed better on the final exam.

Granberg and Olsson (2015) present the advantages of using the software GeoGebra for developing CMR in students. As one of the main advantages of GeoGebra, the authors stress the importance of immediate feedback (the so-called creative feedback). Creative feedback does not give simple clues for the next step or the final answer, but helps students visualize their ideas, enabling them to connect mathematical properties with reasoning. In such a case, a direct answer is not given, but every move in the applet is visualized to allow students to test and verify/falsify their own ideas in solving tasks. Creative feedback (1) encourages students to experiment with and evaluate their strategies in order to construct understanding (Olsson, 2017), and (2) allows students to evaluate and interpret their conclusions (Olsson, 2017). Earlier findings indicate that GeoGebra software, due to these advantages, is a suitable environment for developing CMR (Granberg & Olsson, 2015). Dragging, another advantage of GeoGebra, allows discovering invariant properties and investigating geometry problems using various strategies, such as exploring, conjecturing, validating, and justifying (Arzarello, Olivero, Paola, & Robutti, 2002). Dragging enables students to feel “motion dependency”, which can be related to logical dependency within the mathematical context (Baccaglini-Frank & Mariotti, 2010), and contributes to the formation of abstract knowledge (Leung, 2008). Finally, GeoGebra supports students in linking together geometric interpretation, algebraic properties, and arithmetic interaction through the interactive representation of mathematical objects and concepts. To the best of our knowledge, there is a lack of investigations of the effects of learning materials with GeoGebra applets on imitative reasoning.

Present Study

This study is the continuation of a project that analysed the influence of the Interactive Learning Materials Triangle (iLMT) on students’ attitudes toward mathematics, mathematics classes, and textbooks (Radović, Radojičić, Veljković & Marić, 2020). In that previous study (Radović et al., 2020), we emphasized that students recognize interactive learning environments such as iLMT as a positive and welcome material in mathematics learning. Although the overall impact of iLMT was highly positive, we noted that a more detailed study was needed, which would give better insight into students’ understanding of the material and students’ reasoning.

The iLMT content is similar to the standard learning environment, i.e. official mathematics textbooks for sixth grade (same definitions, theorems, and tasks), but the material is presented in digital and interactive form. In particular, iLMT is based on certain instructional design models and utilizes information-communication technologies such as GeoGebra, which can contribute to the development of different forms of mathematical reasoning in students (Arzarello et al., 2002; Baccaglini-Frank & Mariotti, 2010; Bergqvist & Lithner, 2012; Granberg & Olsson, 2015).

Mathematics classrooms in Serbia are still, to a large extent, knowledge-oriented, with the practices being largely teacher-centred (Radišić & Baucal, 2018). The students’ role is, for the most part, receptive, as lecturing is the dominant form of teaching (Mincu, 2009; Radišić & Baucal, 2018). The official curriculum for the sixth grade in Serbia states that mathematics should be taught so that it enables students to develop abstract and critical thinking and apply the acquired knowledge and skills in further education. However, the prescribed learning outcomes related to triangles can only be met by activating IR, and it is noticeable that the largest number of tasks in textbooks is related to specific calculations and implementation of known algorithms for solving tasks. Only about 30% of tasks in regular textbooks can initiate the development of students’ CMR (in regard to the content used in this study).

Having in mind the advantages of GeoGebra applets and digital learning materials, the aim of this study is to investigate the effect of the developed interactive learning environment iLMT on student success in solving mathematical tasks that require different types of mathematical reasoning. Considering that iLMT presents a digital, student-oriented, and interactive version of textbook materials used in the standard environment (i.e. control group), but grounded in a particular instructional design model and the use of available tools (mostly GeoGebra), this study provides a new and different contribution from similar previous studies (Granberg & Olsson, 2015; Jonassen, 2000). Unlike other studies that tested materials specifically created to foster a certain type of reasoning, we investigated the effect of digital versions of standard learning materials presented with available advantages of information-communication technologies. In this paper, we will explore the influence of iLMT on four groups of tasks introduced earlier in this paper: HRA1, HRA2, LLR, and GLR. Thus, the main research questions underlying this article are as follows:

  1. Are there any significant effects of group (control or test), gender, or their interaction on student success on HRA1, HRA2, LLR, and GLR tasks, on knowledge and knowledge retention tests?

  2. Are there any significant effects of group, gender, or their interaction on student success in solving tasks that belong to two different types of reasoning (IR and CMR) on knowledge and knowledge retention tests?

Design of the Interactive Learning Materials Triangle

The iLMT environment was developed as a supplementary teaching material in digital and interactive form. The main principles for the design of interactive learning environments, on which iLMT is grounded, are indicated by Chiu and Churchill (2016) and Radović et al. (2020). The instructional design model for interactive learning of mathematics consists of providing a learning environment encouraging and valuing the following: (1) experimentation with ideas and receiving feedback, (2) making students’ mathematical thinking and understanding of problem situations public, (3) giving students autonomy for making and testing conjectures and playful exploration of mathematics ideas, (4) allowing direct manipulation of objects and linking theory with applications, and (5) linking the symbolic to the visual. iLMT is based on the regular teaching material (textbook) used by students, coupled with the main advantages of GeoGebra applets and interactivity, which should empower students to explore mathematical concepts and foster the use of different types of reasoning (Lampert, 1988). The applets are used as a supporting tool in the process of developing mathematical ideas and reasoning. Moreover, with creative feedback, GeoGebra helps students to visualize their ideas and supports them in developing understanding of mathematical properties and in algebraic interpretation. This kind of visual creative feedback allows students to interpret their conclusions with respect to the cognitive models and reasoning they develop.

The content of iLMT is in line with the national curriculum (Radović et al., 2020). In creating iLMT, special attention was also paid to the recommendations given in Lithner’s extensive research on mathematical reasoning (Lithner, 2000, 2004, 2008). The creation of the materials was supervised and checked by experts in the field of teaching methodology and educational technology from the Faculty of Mathematics at the University of Belgrade, and by elementary school teachers. After approval by the experts, final versions of the instructional materials were presented in a digital format.

The Content of iLMT

The units in iLMT are designed for elementary school students aged twelve. The teaching materials cover triangle geometry, which involves the following teaching units: (1) the concept of a triangle, (2) elements and types of triangles based on side lengths, (3) the sum of the interior angles of triangles and triangle types based on angle size, (4) the sum of exterior angles and the ratio of sides and angles in the triangle, and (5) the triangle inequality. There were 5 lessons that introduced new mathematical concepts and 4 lessons that contained examples of and exercises for those concepts. The content of the learning materials in iLMT presents a digital and interactive version of the materials that were used by students in the control group. We have calculated that 17% of the iLMT tasks can be classified as HRA1, 30% as HRA2, 30% as LLR, and 23% as GLR.

Each iLMT learning unit consists of three sections: introductory, main, and final part. The introductory part of a lesson serves to remind students of essential concepts and relationships they learned in previous lessons. It is combined with GeoGebra applets to review previous material and to give motivation for the new content (Hohenwarter, Hohenwarter, & Lavicza, 2008). Various types of materials are used to address IR: students repeat definitions, annotate triangles, fill in the gaps in sentences, or make appropriate moves in applets. For example, in the material helping the acquisition of the definition of triangle, in the test group, students have the opportunity to fill in the gaps in the definition, choose the correct definition from a few options, answer questions about triangles, and mark correct statements about triangles. In all cases, students can repeat tasks as many times as they want, and have immediate feedback on whether a given answer is correct or not. There is also a “check” button in this type of task, which allows displaying the correct answer.

The introductory part of the second lesson also contains interactive tasks about triangle annotation where MR is required, like the ones in Fig. 1. In the first task in Fig. 1 (upper picture), students annotate a triangle by moving letters to their appropriate places (HRA1, i.e. MR required). If a student moves the letter to the appropriate place in the triangle, the letter fits perfectly and then blinks green. But if the student moves the letter to the wrong place, the letter does not fit. As this is practising essential concepts they learned in previous lessons, immediate feedback helps them visualize their activity and increase their knowledge after each mistake. In that manner, through assisting students in remembering how to annotate a triangle, MR should be triggered. In contrast to iLMT, in the regular school context, students do similar tasks on paper. Students try to solve the task from the textbook, and then the teacher gives feedback on the blackboard.

Fig. 1 Interactive tasks in the introductory part of the lesson: The sum of the interior angles and types of triangles depending on the size of sides

In the second task (lower picture), students in the test group should classify triangles by their side lengths (MR and AR required). In this type of task, students have to know the classification of triangles by side length (HRA1, i.e. MR required) and to apply that definition (HRA2, i.e. AR required). They drag triangles to the appropriate group of equilateral, isosceles, and scalene triangles. If a student makes a mistake (the triangle does not fit into the group), the triangle resets and an additional theoretical hint appears. Such hints remind students of the previously learned definitions of different types of triangles, which contributes to MR development. Such feedback does not give simple clues for the next step or the final answer, but helps students to repeat the experiment and reach a correct conclusion. During the entire time, the student has information about whether something is correct or incorrect. Students may try until they find the correct answer (which should foster AR through an HRA2 task). As in the first task, in the regular school context, students solve the problem on paper and the teacher presents correct solutions at the blackboard.

In the main part of the lesson, the goal of the unit is achieved with the assistance of appropriate illustrations, texts, and interactive applets. These three types of learning aids help students to acquire new concepts, to reveal relationships that are presented in the unit, and to develop different types of reasoning. For example, in one of the interactive applets, students can drag angles to one vertex and form a straight angle (Fig. 2). Students might reshape the triangle and repeat the experiment with different angle sizes, to reveal that the sum of interior angles in a triangle is constant. During a learning period, students can repeat the experiment as many times as they want. Here, students have to form a conclusion with the support of GeoGebra, and get creative feedback on every move from the applet. This approach aims to enhance the development of CMR (through LLR and GLR tasks), as students are faced with an unfamiliar situation where they have to discover a rule and make an abstract conclusion. While students using iLMT form conclusions about the sum of interior angles of a triangle by experimenting, in the regular school context, the teacher presents the theorem of the sum of interior angles of a triangle on the blackboard from the regular textbook.

Fig. 2 GeoGebra interactive applet in the main part of the lesson: The sum of the interior angles and types of triangles depending on the size of angles

The final part of each iLMT unit includes exercises, quizzes, questionnaires, and games. The main objective of the final part of the lesson is to review what was just learned. Each teaching unit contains examples placed in an authentic context and adjusted to the age and background of the students.

Overall, the learning design (within introductory, main, and final parts of a lesson) in iLMT is student-oriented, which encourages students to explore materials, interact with GeoGebra applets, and enter answers in different forms. Although students are faced with multifaceted and complex mathematics properties, they are supported with corresponding mathematical visualization and creative feedback suitable for the development of CMR (Granberg & Olsson, 2015; Lithner, 2008). For example, the applet shown in Fig. 3 prompts students to move vertices and angles’ sides to conjecture when it is possible to create a triangle depending on angle size and distances between points. Students can move vertices on the base and change angle sizes by using sliders or just by moving sides with a mouse. They can repeat the experiment as many times as they want but conceptual knowledge should be the outcome of their activities.

Fig. 3 GeoGebra interactive applet that helps discover when it is possible to form a triangle depending on angle size and side length

Research Method

Participants

Participants in this study were sixth-grade students. The study included 633 students (320 boys and 313 girls) and 13 teachers from 11 primary schools in Serbia. Nine schools participated with one teacher, and two schools participated with two teachers. Students were divided into two groups at random—test (320 students: 165 boys and 155 girls) and control (313 students: 155 boys and 158 girls) groups.

Procedure

The experimental period lasted twelve school classes (a period of three weeks), taking into account that the students had 4 classes of mathematics per week. Every school class lasted 45 minutes. In the test group, students used iLMT during regular classes in a digital classroom where every student had access to a computer. iLMT materials were used during the entire time, with the teacher guiding students through the materials from the introduction to the final part of the lesson. Students were introduced to the iLMT materials, were supervised in their work by the teacher, and were given instructions when needed. Students in the control group learned about the topic using regular textbooks in the regular classroom setting, where the teacher manages the class and directs students to work. Students in the control group were learning the same content (theorems, definitions, similar tasks, similar exercises, examples) as the test group, but without the assistance of interactive learning materials. In the control group, the teacher guided students through teaching materials given in regular textbooks. For example, when learning new content, students in the test group first did experiments using interactive applets, and then drew conclusions together with the teacher. In the control group, the teacher presented the same content through discussion with students, with appropriate explanations and proofs on the blackboard. Also, while exercising, students in the test group got creative feedback from iLMT and could try solving tasks again using different strategies. However, students in the control group, after trying to solve a task, would get feedback from the teacher only in the form of a correct answer on the blackboard.

Before the experimental period, teachers had undergone training to ensure the same treatment toward students in all groups. Several meetings were held for teachers to present their suggestions for the modification of iLMT. Minor changes were made to the material to make it fit into their daily teaching practice and material preference. Teachers were given detailed instructions regarding the entire procedure of the study.

Data Collection and Measuring Instruments

The data for this study were collected over a period of three months. Prior to the experiment, in order to measure students’ prior knowledge, grades (learning marks) from the previous learning period were collected. Students received grades on a 1–5 interval scale (1 denotes insufficient, 2 sufficient, 3 good, 4 very good, and 5 excellent knowledge). During the first class after the experimental period, students from both groups took a knowledge test (KT). Two months after the experimental period, a retention test (RT) was conducted.

The KT and RT were created with the aim of covering all four types of tasks (HRA1, HRA2, LLR, GLR) and measuring students’ knowledge and retention of learned material. All of the tasks given on the KT and RT were categorized according to the presented theoretical framework. The formulation and relevance of the tasks in the KT and RT were developed and checked by professors from the Faculty of Mathematics at the University of Belgrade and experienced teachers of mathematics, as a result of joint discussions (Radović et al., 2020). Cronbach’s alpha for both tests is equal to 0.71. Afterward, a pilot study was conducted, which involved 60 students who had used iLMT for two weeks prior to the KT, and 15 of the 60 were also interviewed. Final adjustments and minimal changes were made to the tests after the pilot study.

There were 9 tasks on the KT: one (11%) HRA1 task, 3 (33%) HRA2 tasks, 3 (33%) LLR tasks, and 2 (22%) GLR tasks. On the RT, there were 7 tasks: one (14%) HRA1 task, 3 (43%) HRA2 tasks, 2 (29%) LLR tasks, and 1 (14%) GLR task (as shown in Table 2). On the KT, students were expected to show that they had mastered the complete studied area, including both basic and more demanding concepts. On the RT, by contrast, we expected students to show basic knowledge and understanding of the concepts. Consequently, the RT contained fewer tasks. These two tests do not measure the same level of students’ knowledge, so their scores are not comparable. For each problem, after an analysis of students’ open answers, we recorded the student’s solution as a dichotomous variable coded 0 (not solved) or 1 (solved). Student success was measured by the success rate on each task group, as well as on the tasks that require IR and CMR. The success rate was calculated as 100 × (number of solved tasks) / (total number of tasks). Examples and number of tasks, their description, and possible values of success rate, on each of the four types of tasks, on both the KT and RT are given in Table 2.
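The success rate defined above, and the set of values it can take for a task group of a given size, can be illustrated with a short Python sketch. The function names and the example student are hypothetical; only the formula and the 0/1 coding come from the text.

```python
def success_rate(outcomes):
    """Percentage of solved tasks in one task group.

    `outcomes` holds one entry per task, coded 0 (not solved) or 1 (solved).
    """
    if not outcomes:
        raise ValueError("at least one task outcome is required")
    if any(o not in (0, 1) for o in outcomes):
        raise ValueError("outcomes must be coded 0 or 1")
    return 100 * sum(outcomes) / len(outcomes)


def possible_rates(n_tasks):
    """All success rates a student can obtain on a group of n_tasks tasks."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    return [100 * k / n_tasks for k in range(n_tasks + 1)]


# Hypothetical student who solves 2 of the 3 LLR tasks on the KT.
llr_rate = success_rate([1, 1, 0])

# The RT contains a single GLR task, so only two rates are possible there.
glr_rt_rates = possible_rates(1)
```

Because each task group contains only one to three tasks, the success rate per task type takes very few distinct values (for example, 0, 50, or 100 for a group of two tasks), which is one reason rank-based methods are a natural choice for comparing groups.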

Table 2 The description of two tests (KT and RT) with the number of tasks, task examples, task description, and possible values of success rates, for four task types

Data Analysis

Pretest grades were summarized by mean and standard deviation (sd) for female and male students in the control and test groups. Effects of gender, group, and their interaction on grades were tested with Brunner nonparametric two-way ANOVA (Brunner, Konietschke, Pauly, & Puri, 2016). Success rates on four types of tasks and on tasks that require IR and CMR, on both KT and RT, were graphically represented with a pyramid bar graph, for students in the control and test groups. Brunner nonparametric two-way ANOVA was used for testing the effects of group, gender, and their interaction on success rate while solving HRA1, HRA2, LLR, and GLR tasks, on KT and RT. Testing the effects of group, gender, reasoning type, and their interaction on success rates on KT and RT was done using Brunner-Langer nonparametric mixed ANOVA (Brunner, Domhof, & Langer, 2002). ANOVA-type test statistic was used for testing the effects of within-subject factors (reasoning type) and ANOVA-type statistic with modified box approximation for testing the significant main effects and interactions involving only the between-subject factors (group and gender). Post hoc analysis was performed using the Brunner-Munzel test for independent samples (Brunner & Munzel, 2000) and the Munzel-Brunner exact rank test for paired samples (Munzel & Brunner, 2002).

P values < 0.05 were considered statistically significant. Data analysis was performed in the statistical software R, version 3.5.1, using the R packages stats, nortest (Gross & Ligges, 2015), nparLD (Noguchi, Gel, Brunner, & Konietschke, 2012), rankFD (Konietschke, Friedrich, Brunner, & Pauly, 2016), and lawstat (Gastwirth et al., 2017).

Results

Students’ pretest knowledge was measured by the grades they had received before the experimental period started. The mean (sd) grade was 3.5 (1.2) in the test group and 3.4 (1.2) in the control group. The mean (sd) grade was 3.6 (1.2) for girls and 3.3 (1.2) for boys. The effect of gender on students’ grades was statistically significant (F = 18.224, df1 = 1, df2 = 625.18, p < 0.001). The mean rank of grades was 346.3 for female students and 288.3 for male students, indicating higher grades for female than for male students. Success rates of students in the control and test groups on HRA1, HRA2, LLR, and GLR tasks on the KT are graphically represented in Fig. 4.

Fig. 4 Success rates of students in the control and test groups while solving HRA1, HRA2, LLR, and GLR tasks on KT

Statistically significant effects on success rates on these task types for both tests and mean ranks of groups are given in Table 3.

Table 3 Statistically significant effects on students’ success rates on HRA1, HRA2, LLR, and GLR tasks

On the KT, by looking at the significant differences between success rates and mean ranks of the groups, we can conclude that students in the test group were more successful in solving HRA1 and LLR tasks than students in the control group, as well as that girls were more successful in solving HRA2 tasks than boys.

Success rates of students in the control and test groups on HRA1, HRA2, LLR, and GLR tasks on the RT are graphically represented in Fig. 5. On the RT, students in the test group were more successful in solving HRA2 and LLR tasks than students in the control group, and girls were more successful than boys in solving HRA1, HRA2, and LLR tasks. On neither the KT nor the RT was there a significant interaction effect of group and gender on students’ success rates. There were no statistically significant effects on students’ success on GLR tasks on either test.

Fig. 5 Success rates of students in the control and test groups while solving HRA1, HRA2, LLR, and GLR tasks on RT

Reasoning Type

Statistically significant results and mean ranks on tasks that require IR and CMR on the KT and RT are given in Table 4. On both tests, students were more successful in solving tasks that require IR than CMR, and students in the test group were more successful than students in the control group. On RT, girls were more successful than boys in solving both IR and CMR tasks.

Table 4 Statistically significant effects on students’ success rates on tasks that require IR and CMR

Analysis of Gender Effect

As stated above, female students had higher pretest grades. To distinguish between the effect of mathematics knowledge (measured by grades) and real gender differences, the significant gender effects were analysed further. We repeated the Brunner-Langer nonparametric mixed ANOVA, including grade as an additional factor. The procedure and results are described in more detail in the Supplement.

Task Types.

On the KT, the effect of gender on students’ success rate on HRA2 tasks was not statistically significant (see Table 5 in the Supplement). On the RT, gender had a statistically significant effect on the success rate on HRA2 tasks, but not on the success rates on HRA1 and LLR tasks. Girls solved HRA2 tasks on the RT more successfully than boys.

Success Rates on Tasks That Require IR and CMR on RT.

On RT, both boys and girls were more successful in solving tasks that require IR than CMR, and girls were more successful than boys when solving both IR and CMR tasks.

Discussion

Groups of Tasks

Knowledge Test.

The results of this study show that students in the test group outperformed students in the control group in solving HRA1 and LLR tasks. The use of iLMT, with its interactive text fields, the option to repeat a task multiple times, the availability of correct answers, and various types of applets, influenced the development of the cognition needed for solving HRA1 tasks. This indicates that the use of GeoGebra as part of iLMT has potential for the development of IR.

The higher success of students in the test group in solving LLR tasks is another positive effect of iLMT. LLR is a part of CMR, which is an important factor for deeper understanding of concepts in mathematics and their application in real-life situations (Bergqvist & Lithner, 2012; Jonassen, 2000). Our result is in line with earlier studies that discuss positive effects of GeoGebra on CMR (Granberg & Olsson, 2015; Olsson, 2017, 2018). We consider the main properties of iLMT that contribute to the development of LLR to be the visualization of students’ thinking, student autonomy, the possibility of repeating experiments, and immediate creative feedback that does not give a direct answer but encourages students to explore and draw conclusions (Granberg & Olsson, 2015).

Despite the high level of interactivity, immediate feedback, and the advantages of GeoGebra applets, iLMT showed no positive effect on solving GLR tasks. The issue may lie in the small number of tasks that support GLR in the learning materials (both the textbook and iLMT). Regardless of interactivity and fast feedback, more time should be dedicated to GLR tasks. Our results can be relevant to educators and, more broadly, to educational policymakers because they indicate that the material in school textbooks does not sufficiently foster students’ ability to solve GLR tasks. Our findings are in line with the results of the PISA 2018 assessment, where only 5% of the students in Serbia (OECD average 11%) could model complex situations mathematically and apply higher-level reasoning (Organization for Economic Cooperation and Development [OECD], 2019).

Many countries support introducing digital and interactive materials in learning mathematics (Pepin, Choppin, Ruthven, & Sinclair, 2017). Furthermore, educational regulation in Serbia requires that each textbook contain an electronic supplement as additional learning material. However, to the best of our knowledge, the effectiveness of such materials has rarely been evaluated. Our results indicate that even when learning materials are digitalized with many technological advantages, GLR reasoning may still be lacking. Bearing in mind the importance of GLR in interactive learning material development and future research, special attention should be given to the improvement of iLMT regarding GLR tasks.

Retention Test.

The results showed that the students in the test group were more successful in solving HRA2 and LLR tasks than students in the control group. There was no statistically significant difference in solving HRA2 tasks on the KT, but there was on the RT. This indicates that iLMT has a positive influence on the durability of students’ knowledge of learned algorithms. The better success of the students in the test group on both the KT and RT in solving LLR tasks also indicates that iLMT is a suitable environment for training students to solve tasks that require LLR. Such results suggest that students who used iLMT had deeper insight into algorithms, which allowed them to combine algorithms in LLR tasks. This is in line with other studies that support the use of GeoGebra in tasks related to CMR (Granberg & Olsson, 2015; Olsson, 2017). Another advantage of iLMT is student-oriented teaching. While students’ active role in the process of learning mathematical content was incidental in the control group, it was a widespread practice in the test group. Despite the advantages provided by technology, as on the KT, there were no differences in students’ success on the RT between the test and control groups in GLR tasks. Such a result indicates that even carefully created digitalized materials may not have a significant effect on students’ success and knowledge durability in solving GLR tasks.

Findings Related to Reasoning Types

On both the KT and RT, reasoning scores of students in the test group were significantly higher than reasoning scores of students in the control group, and IR scores were higher than CMR scores, which is in line with similar research (Bergqvist, 2007; Boesen et al., 2010; Lithner, 2000). The higher success of students in the test group testifies to the positive effect of iLMT. On the other hand, better success on tasks that require IR than on tasks that require CMR indicates the need for improving iLMT. Considering that the iLMT content is similar to that in the official mathematics textbooks in Serbia, these results indicate the need for fostering CMR content in both textbooks and iLMT. Our findings are in line with PISA results, which show that about 40% of the students in Serbia can only use their mathematical knowledge and skills in a familiar context with information given explicitly when applying routine procedures (OECD, 2019). Development of CMR requires that students deal with unfamiliar problems (Hiebert & Grouws, 2007), with the help of supportive creative feedback (Granberg & Olsson, 2015). According to the literature (Boesen et al., 2014; Hiebert & Grouws, 2007; Jonassen, 2000), the main change in iLMT could be in the ratio of IR to CMR content, with priority given to CMR materials.

Gender Differences

The study found that female students solved HRA2 tasks on the RT more successfully than male students. This result is in line with Sumpter (2015), who stated that girls tend to use familiar strategies. Also, girls are considered diligent and hardworking and more prone to use strategies their teacher had shown them (Brandell & Staberg, 2008).

Furthermore, on the RT, the success rate of female students on IR and CMR tasks was higher than that of male students. On the other hand, a large body of research reports an advantage of males in solving mathematical tasks (Brandell, Leder, & Nyström, 2007; Brandell & Staberg, 2008). According to the literature, many factors can have an impact on gender differences, such as learners’ attitudes, motivation, stereotype threat in mathematics tests, differences in socialization, and socioeconomic variables. Given how sensitive gender differences are to all these conditions, a change of learning environment can be expected to play an important role, which could be the case when applying iLMT. More comprehensive investigation is required to make a reliable assertion about gender differences.

Conclusion and Recommendation for Further Studies

In this study, we explored the influence of the iLMT on students’ success in solving four groups of tasks: HRA1, HRA2, LLR, and GLR. Because iLMT is a digital, student-oriented, and interactive version of the textbook materials used in the standard environment (i.e. the control group), this study differs from similar previous studies, which tested digital materials specifically created to foster a certain type of reasoning.

Taken together, the findings of this study suggest one major contribution. Based on the discussed results, iLMT had a positive impact on students’ success on both the KT and RT to a certain extent. Moreover, it represents suitable instructional material for teaching students to solve tasks that require different types of reasoning, with potential for improvement. Other scholars could use our approach and design principles, as well as our results, to design even more effective, efficient, and enjoyable learning environments for teaching mathematical concepts.

The use of digital learning materials in teaching mathematics is very delicate, and it must be done carefully and with special attention to the kind of mathematical reasoning the materials are designed for. We emphasize the great potential of GeoGebra applets, especially for improving students’ success in solving tasks that require CMR. At the same time, the mere digitization of content will not bring significant effects on student success in solving all types of tasks. Our study showed that even carefully created digitized materials may not have significant effects on students’ success in solving GLR tasks if they are based on content that is insufficiently related to GLR. In addition to the many benefits of GeoGebra, we believe that more content in the textbook and interactive materials must be dedicated to GLR tasks. Keeping in mind the wide usage of GeoGebra applets and the rapid digitization of materials, we believe that these results can be useful to many educators. These results can be important primarily for scholars and teachers interested in improving critical thinking and abstract understanding and in facilitating connections between authentic learning and mathematical theory. However, we call for further studies to research connections between these aspects of mathematical understanding and interactive learning environments.

In terms of limitations, our research focused on teaching units about triangles, and other topics were not considered. It is therefore possible that the significant effect of interactivity on learning was due to the geometry topic. It would be interesting to examine the effect of iLMT (or the effects of the implemented characteristics of the instructional design) on different mathematics topics (for example, algebra) or on students of different ages. We also note that a short-term intervention could cause the halo effect of novelty (Radović et al., 2020). The question remains of what would happen if students used iLMT or similar materials over a longer time period.

Our research also indicates greater success of girls in some cases. Further research may explore why girls outperformed boys in solving HRA2 tasks and shed more light on the result that girls outperformed boys on the RT in solving tasks that require IR or CMR. We have not been able to provide more clarity on these interesting results.