## 1 Introduction

In contemporary mathematics textbooks for elementary education, students predominantly have to solve routine problems in which they have to reproduce and apply a fixed solution procedure in one or two steps (Kolovou et al., 2009; van Zanten & van den Heuvel-Panhuizen, 2018). However, it is increasingly considered important that students also learn to solve problems that are not straightforward and for which they do not have a learned solution immediately available (Schoenfeld, 1983), and which, therefore, may elicit other, more complex cognitive processes, such as creative thinking (Liljedahl et al., 2016). To promote creative thinking in mathematics, other types of problems should be offered to students, such as non-routine and open-ended problems (Carlson & Bloom, 2005; Levav-Waynberg & Leikin, 2012a; Silver, 1997). Different types of non-routine and open problems exist (Leikin, 2018), and it is an open question if all types of non-routine and open problems call upon creative thinking equally. To answer this question, we investigated the relations between students’ creativity and their performance on three types of mathematical problems in the upper grades of elementary school.

The current study was conducted in the mathematical domain of geometry. Geometry can be defined as grasping the concept of space and the mathematization of space. Geometry education aims to teach students to understand, explain and predict geometric phenomena, to reason spatially, and to order and organize spatial situations. For example, it is considered important that students be able to draw a map or to reason about the effect of the height of the sun on the length of the shadow (Gravemeijer et al., 2007). Typical topics in geometry in the upper grades of elementary school are (1) ‘spatial sense’, including localizing, taking a standpoint and navigation, (2) ‘plane and solid figures’, including spatial properties and relations between figures, operations, transformations and constructions, and (3) ‘visualization and representation’, including representations of two-dimensional and three-dimensional reality (Gravemeijer et al., 2007).

### 1.1 Types of mathematical problems

The opportunity for students to learn geometry in elementary school is largely determined by the geometrical content and geometrical problems presented to them in the textbooks teachers use for mathematics education (Meelissen et al., 2012; Stein et al., 2007). Different types of problems can be distinguished. Problems can be closed-ended with only one correct solution or, alternatively, open-ended with multiple solutions or interpretations (Bahar & Maker, 2015; Mihajlović & Dejić, 2015), and problems could be familiar and call upon well-established routines (routine problems) or, alternatively, require students to discover which facts, skills and procedures can be combined to solve the problem which is as such unfamiliar to them (non-routine problems; Carlson & Bloom, 2005; Schoenfeld, 1983). Most textbooks contain mainly closed-ended and routine problems and only a few open or non-routine problems (i.e., 0–8%; van Zanten & van den Heuvel-Panhuizen, 2018). By using both dimensions (open vs. closed, and routine vs. non-routine), four types of problems can be distinguished.

A typical example of a closed-ended routine geometry problem is a multiple choice test item in which students have to choose the correct fold-out of a cube or the correct picture of a block construction from a given perspective; these problems have one correct solution and call upon well-established routines (van Grootheest et al., 2011). Closed-ended non-routine problems require students to apply existing knowledge of facts and procedures to a novel problem not encountered before, for example when students are asked to draw a floor map of a still-life painting (Schoevers et al., 2020b). While this novel problem is conceptually similar to a routine block construction problem, it requires students to figure out how to apply their knowledge and skills to make the map (Gravemeijer et al., 2007). An example of an open-ended non-routine problem is a multiple solution task in which students have to compare different plane figures (e.g., isosceles triangle, right-angled triangle, square) and are invited to give multiple answers to the question of how the plane figures differ from each other. In this case, students need to recall, use and combine facts, skills, procedures and ideas in a new and meaningful way to solve the problem (Liljedahl et al., 2016; Schoevers et al., 2020b; Warner et al., 2003). Open-ended routine problems are rare, because textbooks generally contain closed-ended problems, which makes open-ended problems by definition non-routine. If a child encountered a specific open-ended problem repeatedly, this kind of problem could become routine for that child, but this is strongly dependent on the context and cannot be generalized.

### 1.2 Creativity and problem types

Open-ended and non-routine geometrical problems usually require more complex cognitive processing than closed-ended and routine problems and particularly require students to think creatively (Liljedahl et al., 2016). Creativity can be defined generally as a thinking process that results in a novel and meaningful product (Runco & Jaeger, 2012; Sriraman, 2005). In elementary mathematics education, this product, for example, could be an idea that is new to the student, a newly posed problem, or a solution to a problem that is novel and meaningful for a specific age group or student (e.g., Leikin, 2009). Mathematical creativity thus also depends on formerly acquired knowledge and skills in the particular domain. This means that both domain-general creativity and domain-specific knowledge are needed to apply creativity in a specific domain (Baer & Kaufman, 2005; Willemsen et al., 2020). Accordingly, mathematical creativity has been found to consist of both domain-general creativity and domain-specific (mathematical) knowledge (Schoevers et al., 2020a).

Several researchers have argued that non-routine mathematical problems in particular, both closed (i.e., with specific constraints on the solution or interpretation; Bokhove & Jones, 2018) and open (e.g., Kwon et al., 2006), have the potential to elicit creative thinking. Other researchers have argued, and indeed shown, that creative thinking is elicited by tasks in which students have to provide multiple solutions for a problem (Kwon et al., 2006; Leikin, 2009, 2018; Levav-Waynberg & Leikin, 2012b; Silver, 1995). Although the role of creativity has been investigated in both open-ended and non-routine multiple solution tasks (e.g., Schoevers et al., 2020a), to the best of our knowledge no study to date has systematically compared the role of creativity in solving different types of mathematical problems. Therefore, in the current study we examined whether the relations between students’ domain-general creativity and their mathematical performance differ among three problem types: closed-ended routine, closed-ended non-routine and open-ended non-routine problems. Note that open-ended routine problems are not included, because it seems impossible to construct problems that are both open-ended and routine for all children (whether an open-ended problem can be regarded as routine depends on the child and the context). In our study, we adopted the basic assumption that for learning mathematics a certain level of domain-general creative thinking skills could be helpful (cf. Kroesbergen & Schoevers, 2017) and thus that the association between general creativity and mathematical performance indicates the extent to which solving the mathematical problem calls upon general creative thinking processes.

### 1.3 Possible factors influencing the relation between creativity and mathematical performance

When studying the relation between creativity and mathematical performance, the possible effects of other factors that could at least partly explain the observed relation should be taken into account. This holds in particular for students’ visual-spatial working memory (VSWM), general mathematical ability, gender, and socio-economic status (SES). Especially in solving geometrical problems, students’ VSWM is likely to be involved (Giofrè et al., 2013). VSWM refers to the ability to temporarily hold visual-spatial information activated for processing (Baddeley, 2003). When geometrical problems are related to spatial visualization and spatial sense, students may need to store and manipulate visual-spatial information temporarily. For example, a target figure, such as a block construction, needs to be stored temporarily in memory in order to mentally rotate the construction to see how it would look from another perspective. In addition, general mathematical ability should be taken into account, since basic mathematical procedures and knowledge about, for example, numbers, proportions and measurement, can be used to solve geometrical problems (Gravemeijer et al., 2007). Gender might also play a role. Although boys often score more highly on routine mathematical problems than girls (Reilly et al., 2015), girls have been found to outperform boys on multiple solution tasks (Mann, 2006), while they often also score more highly on creativity tasks (Baer & Kaufman, 2006). Finally, SES should be taken into account since a low SES is related to lower educational performance in general, and in mathematics in particular, possibly due to disadvantages in financial, cultural and social resources, and lower parental involvement in students’ education (OECD, 2016; Sirin, 2005).

In addition, class level factors may influence the relation between creativity and performance, such as teachers’ experience and the type of mathematical textbook that is used. Research has shown a positive effect of teachers’ years of experience on students’ mathematical performance (Clotfelter et al., 2007). More experienced teachers might be more effective, because they have improved their teaching performance over the years (Darling-Hammond, 2000). Furthermore, the contents of mathematical textbooks, in particular the types of problem they provide, may influence students’ performance as well (van Zanten & van den Heuvel-Panhuizen, 2018).

In the present study we investigated the relations between students’ independently assessed domain-general creativity and their performance on closed-ended routine, closed-ended non-routine and open-ended non-routine geometry problems. The aim of the study was to determine whether the relation between creativity and performance differed among the three problem types, which we assumed would indicate differences between the problem types in the degree to which they call upon creative thinking. VSWM, gender, age, low SES and general mathematical ability, teachers’ experience and the type of mathematical textbooks were included as covariates, in order to control for spurious relations between creativity and mathematical performance.

We hypothesized that students’ creativity had stronger predictive relations with performance on non-routine than on routine problems, because the former were expected to be unfamiliar to students and, therefore, to require stronger involvement of higher-order creative thinking compared to routine problems. Moreover, as there is consensus that in addition to unfamiliarity, openness of a task puts additional demands on creative thinking, we also expected that students’ domain-general creativity would have the strongest association with their performance on the open-ended non-routine task.

## 2 Methods

For the present purpose, pre-intervention data of a large-scale evaluation study of the Mathematics, Arts and Creativity in Education (MACE) program were used (Schoevers et al., 2020b). The MACE program was developed to support elementary schools in the Netherlands in meeting the partly overlapping learning goals and objectives of the disciplines of visual arts and geometry, and to promote students’ creative skills in both disciplines. We focused on students in the upper grades of primary school, because they had sufficient reading and writing skills to perform the planned tasks. In the present study, only the students who took all the tests relevant to the present purpose were included.

### 2.1 Participants

Participants were 1665 students from grades 3 to 6, in 92 classes at 50 schools in The Netherlands. Schools were recruited by sending flyers to 428 regular elementary schools in all regions of The Netherlands; 11.68% of the schools were willing to participate. Schools differed regarding their educational vision and teaching methods, and were located in both rural and urban areas spread across The Netherlands. The sample consisted of students from low, medium and high socio-economic backgrounds; 48.3% were boys, and the mean age was 10.91 years (SD = 0.93).

### 2.2 Instruments

#### 2.2.1 Geometric Ability Test (GAT)

The GAT took between 20 and 30 min and was stopped after 30 min. The test consisted of 11 closed-ended geometry problems, of which four were routine and seven were non-routine (see Fig. 1). All routine problems called upon spatial sense and spatial visualization. In the non-routine problems, students mainly had to reason geometrically in relation to a painting. In one problem, students had to draw a floor plan of a painting. The test started with the four routine problems and ended with the non-routine art-geometry problems. Within each problem type, the order of the problems was randomized and was the same for all students.

The routine problems were relatively straightforward, each with a single correct answer. Therefore, one point was given for a correct answer and zero points for an incorrect answer. An average score for the four routine problems was calculated, yielding a score between 0 and 1. The non-routine problems related to visual arts were more complex and could also yield partially correct solutions. Therefore, two points were given for an answer with correct reasons about geometric phenomena, for example when students explained why persons or buildings in the front of the painting looked bigger than in the back by referring to perspective (e.g., ‘by painting smaller in the back and larger in the front’). One point was given for an answer with incomplete reasoning (e.g., ‘by painting smaller’, without a specific explanation). Zero points were given for answers without reasoning (e.g., ‘it is a photo’ or ‘when you paint everything is possible’). The last non-routine problem of the task was considered too difficult for grades 4 and 5 (only 9% of the students scored > 0 points) and was not included in further analyses. An average score for the non-routine closed-ended problems was calculated based on the number of problems the students had finished.
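The scoring rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scoring software used in the study: the function names are hypothetical, and encoding an unfinished problem as `None` is an assumption made here.

```python
from statistics import mean
from typing import Optional, Sequence


def routine_score(correct: Sequence[bool]) -> float:
    """Average of 0/1 item scores, so the result lies between 0 and 1."""
    return float(mean([1 if c else 0 for c in correct]))


def non_routine_score(points: Sequence[Optional[int]]) -> Optional[float]:
    """Average of 0/1/2 item scores over the problems a student finished.

    None marks an unfinished problem (an assumed encoding, not from the study).
    Returns None when the student finished no problem at all.
    """
    finished = [p for p in points if p is not None]
    if any(p not in (0, 1, 2) for p in finished):
        raise ValueError("item scores must be 0, 1 or 2")
    return float(mean(finished)) if finished else None


# Hypothetical student: three of four routine items correct,
# four of six analysed non-routine items finished.
print(routine_score([True, False, True, True]))
print(non_routine_score([2, 1, None, 0, None, 2]))
```

Averaging over finished problems only means that unfinished items do not count as zero, which is how the last sentence of the paragraph above reads.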

The test–retest reliability of the GAT was acceptable (r = 0.66). Test–retest reliability was determined over a two-week interval, and involved 184 4th, 5th and 6th grade students. The interrater reliability ranged from sufficient to excellent for all items (κ = 0.81–1.00, ICC = 0.67–1.00, based on 25 tests). In line with our expectations, the internal consistency of the GAT was moderate in this study (α = 0.62), because the GAT represents a heterogeneous set of knowledge and skills.

#### 2.2.2 Geometrical multiple solution task (GMST)

The GMST, based on the mathematical multiple solution task (Kattou et al., 2013; Schoevers et al., 2020a), took between 20 and 30 min. The task consisted of five geometry questions that were open-ended and non-routine. Students were instructed to provide multiple, but distinct and original solutions. A sample question of the test is depicted in Fig. 2.

For the scoring of the GMST, we used the scoring scheme of Leikin for creativity in the individual solution space (Leikin, 2009; Levav-Waynberg & Leikin, 2012a). Within this scheme a distinction between fluency, flexibility and originality is made. Fluency was calculated by adding the number of correct answers for each question. With the use of the scheme, each solution of a student was scored regarding flexibility (the use of a solution from a different (sub)group of strategies) and originality (the infrequency or unconventionality of a used solution). Next, a final score per solution was calculated as the product Flexibility × Originality. Afterwards, a score per question was computed as follows: Fluency × ∑(Flexibility × Originality). Last, the scores of all questions were added into a total creativity score. Fig. 3 illustrates how the answers of a 6th grade student to a question of the GMST were scored.
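The scoring formula described above can be written out as a minimal Python sketch. The class and function names are hypothetical, the numeric values assigned to flexibility and originality are placeholders rather than the values of the actual scoring scheme, and summing over correct solutions only is an assumption made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Solution:
    correct: bool
    flexibility: float  # rater-assigned value; the scale is a placeholder here
    originality: float  # rater-assigned value; the scale is a placeholder here


def question_score(solutions: List[Solution]) -> float:
    """Fluency x sum(Flexibility x Originality) for one question."""
    # Assumption: only correct solutions contribute to fluency and to the sum.
    correct = [s for s in solutions if s.correct]
    fluency = len(correct)
    return fluency * sum(s.flexibility * s.originality for s in correct)


def total_creativity_score(questions: List[List[Solution]]) -> float:
    """Sum of the question scores over all questions of the task."""
    return sum(question_score(q) for q in questions)


# Hypothetical answers to two questions.
q1 = [Solution(True, 10, 1), Solution(True, 1, 1), Solution(False, 10, 10)]
q2 = [Solution(True, 10, 10)]
print(question_score(q1), question_score(q2), total_creativity_score([q1, q2]))
```

Because fluency multiplies the sum, a question score grows faster than linearly with the number of correct solutions, which partly explains why such scores tend to be skewed.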

The test–retest reliability of the GMST was good (r = 0.84 over a two-week interval; Schoevers et al., 2020b). The interrater reliability ranged from sufficient to excellent for all scores per solution (ICC = 0.72–0.99, based on 25 tests). The internal consistency of the GMST was moderate in this study (α = 0.68).

#### 2.2.3 Test for creative thinking-drawing production (TCT-DP)

Because children’s domain-general creativity is also related to their mathematical creativity, we included a non-mathematical, nonverbal creativity test. The TCT-DP is based on a more holistic concept of creativity (Urban, 2004, 2005). Students have to complete a drawing using given figural fragments, such as a half circle and a half square, within 15 min (Urban & Jellen, 1996). The task requires students to apply (cycles of) divergent and convergent thinking processes to create a final drawing. Students first have to think divergently about what they can draw starting from a (combination of) figural fragment(s) and then they have to choose what they will draw on the paper (convergent thinking process). Subsequently, they may think how they can elaborate on their drawing (divergent thinking). The TCT-DP was scored according to the guidelines in the manual (Urban & Jellen, 1996). The task was scored on fourteen aspects considered to constitute creativity, such as making connections, boundary-breaking, and unconventionality. The TCT-DP was scored by four raters and had a good interrater reliability in this study (ICC = 0.80, based on 25 tests).

#### 2.2.4 Visual-spatial working memory

The online computerized task ‘the Lion game’ was administered to measure students’ VSWM. The Lion game is a visual-spatial complex span task (van de Weijer-Bergsma et al., 2015). Students are presented with a 4 × 4 matrix containing 16 cells. In each trial, eight lions of different colors are consecutively presented at different locations in the matrix. Students have to remember the last location where a lion of a certain color has appeared, and to click on that location after the sequence has ended. The task contains 20 items at five difficulty levels with increasing WM load. The proportion of correct items was calculated. The Lion game is reported to have excellent internal consistency (Cronbach’s α ranged between 0.86 and 0.90), satisfactory test–retest reliability (ρ = 0.71), and good concurrent and predictive validity (van de Weijer-Bergsma et al., 2015).

#### 2.2.5 Questionnaire for teachers

A questionnaire was used to obtain information on student, teacher and class characteristics. The educational attainment of both parents was used as an indicator of students’ SES. General mathematical ability was measured with a criterion-based test (Janssen et al., 2007), which covers a wide range of mathematical domains such as arithmetic operations, geometry, measurement, time, and proportions. The test has been shown to be highly reliable; the reliability coefficients range from 0.91 to 0.97 (Janssen et al., 2010). Furthermore, teachers were asked to provide information about their own gender, years of experience and the mathematics textbook(s) they used to teach mathematics.

### 2.3 Procedure

The data were collected in September 2017 by the first author and twelve research assistants with a bachelor’s or master’s degree in (special) education. Tests were administered individually in one session in a quiet classroom. Passive informed consent of parents was obtained before the start of the study for 99.2% of the students. The study was approved by the Ethical Committee of the Faculty of Social and Behavioural Sciences of Utrecht University (FETC15-083).

### 2.4 Analyses

Multivariate multilevel analyses were conducted in MLwiN 3.02 to take the nested structure of the data into account, and because the three outcome measures used in this study were expected to be correlated (Goldstein, 2011). A three-level model was used in which the three types of problems (level 1) were nested within students (level 2), who were nested within classes (level 3). We did not take the school level into account as a fourth level, as preliminary analyses revealed little variance (< 5%) located at this level, and no school variables were available that could be expected to be related to the outcome measures. All variables used in the multivariate multilevel analyses were z-standardized on the grand mean.
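As an illustration of the last step, z-standardizing on the grand mean means that each variable is centred on the mean of all students pooled across classes, not on class means. A minimal sketch follows; the function name is hypothetical, and using the sample standard deviation (n − 1) is an assumption, since the text does not state which was used.

```python
from statistics import fmean, stdev
from typing import List, Sequence


def z_standardize(values: Sequence[float]) -> List[float]:
    """Centre on the grand mean and scale by the standard deviation."""
    grand_mean = fmean(values)  # mean over all students, ignoring class membership
    sd = stdev(values)          # assumption: sample SD (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - grand_mean) / sd for v in values]


# Hypothetical raw scores of three students.
print(z_standardize([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

With all variables on this scale, regression coefficients can be read as standardized effects and compared across the three outcome measures.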

In the first model, second- and third-level predictors were added for each outcome variable, as follows: VSWM, students’ age, students’ gender, low SES, general mathematical ability, creativity (level 2), teachers’ years of work experience, and dummy variables representing the four most frequently used mathematical textbooks (i.e., Wereld in Getallen (WiG), Alles Telt, Pluspunt and Rekenrijk; level 3). Effects of all predictors could vary between the different outcome measures. Next, for each predictor a new model was estimated to test whether effects were similar for all outcome measures. The order in which coefficients for each predictor were constrained to be equal was random. Based on the deviance and a chi-square test, we tested whether the new model, with common coefficients for a predictor, fitted the data. If this was true, common coefficients of a predictor were kept in the next model. This procedure led to a total of 12 models that were compared (see “Appendix 1”). Subsequently, if the effects of creativity on the three different problems were not similar, we tested if creativity differed significantly between the different problems by using contrasts. If necessary, coefficients of creativity were constrained according to the results of the contrasts in a final, 13th model.
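The model comparison step described above can be sketched as a deviance difference (likelihood ratio) test. The sketch below is not the MLwiN procedure itself: the function names and the deviance values are hypothetical, and it assumes that a constrained model "fits" when it is not significantly worse than the unconstrained model at α = 0.05. The chi-square tail probability is computed with the standard library only.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    if x == 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df.
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Closed form for odd df.
    extra = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * extra


def deviance_test(dev_constrained: float, dev_free: float, df_diff: int,
                  alpha: float = 0.05) -> Tuple[float, float, bool]:
    """Return (deviance difference, p value, keep common coefficients?)."""
    diff = max(dev_constrained - dev_free, 0.0)
    p = chi2_sf(diff, df_diff)
    return diff, p, p >= alpha


# Hypothetical deviances whose difference equals a chi-square of 0.68 with 1 df.
diff, p, keep = deviance_test(1000.68, 1000.00, df_diff=1)
print(round(diff, 2), round(p, 2), keep)
```

Constraining one predictor to a common coefficient across three outcome measures would replace three coefficients by one, so `df_diff` would be 2 in that case; a contrast between two outcome measures has 1 degree of freedom.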

Before the multilevel analyses were conducted, the assumptions were checked (Hox et al., 2018). With the use of SPA-ML (Moerbeek, 2015), we calculated the required sample size for a univariate model with two levels (i.e., students in classes), because the program could not calculate the required sample size for a multivariate model. With a desired power of 0.80 and expected effect size of 0.25 with 15% variance located at class level, and 85% at student level, a sample of 61 classes and 1220 students is required. Furthermore, the assumptions of linearity and absence of outliers were met for all variables. The assumption of normally distributed residuals was violated for the residuals at student level for the GMST. Therefore, robust standard errors are reported (Hox et al., 2018).

## 3 Results

### 3.1 Descriptive statistics

Teachers in this sample had on average 17.11 years of experience (SD = 10.36). Regarding the mathematical textbooks, we found that four textbooks were predominantly used for mathematics education (‘Wereld in Getallen’: 32.4%; ‘Pluspunt’: 23.0%; ‘Rekenrijk’: 15.7%; ‘Alles Telt’: 11.9%). Four other mathematical textbooks were used much less frequently and pooled in one rest category which served as reference category. Descriptive statistics for the student measures can be found in Table 1. Table 2 presents the Spearman correlations between students’ creativity and their performance on the three types of problems. Creativity was related to all types of geometrical problems, but more strongly to the multiple solution problems than to the routine problems (z = 1.73, p < 0.01).

### 3.2 Multivariate multilevel results

First, an intercept-only model was estimated to calculate the percentage of variance located at student and class level. The variance components were nearly equal for each type of problem; 15% was located at class level for the closed-ended routine problems, 14% for the closed-ended non-routine problems and 12% for the open-ended non-routine problems. Subsequently, student- and class-level predictors were added for each outcome variable. Next, for each predictor we tested in a new model whether the effects were similar for all outcome measures. We found that the effects of age, low SES, teachers’ experience and the four mathematical textbooks were similar for all three types of problems. Effects of students’ VSWM, gender, mathematical achievement and creativity were different for the three types of problems (see “Appendix 1”). Subsequently, using contrasts, we found that the effect of creativity was similar for the closed-ended routine and closed-ended non-routine problems, χ2(1) = 0.68, p = 0.41, but significantly different for the open-ended non-routine problems. The final multivariate multilevel model can be found in Table 3.
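The variance partition from the intercept-only model can be illustrated with a short sketch. It assumes that, for each outcome, the total variance is the sum of a class-level and a student-level component; the function name and the variance values are hypothetical and chosen only to reproduce a 15% share.

```python
def class_level_share(var_class: float, var_student: float) -> float:
    """Proportion of the total variance located at class level."""
    total = var_class + var_student
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_class / total


# Hypothetical variance components for one standardized outcome.
share = class_level_share(var_class=0.15, var_student=0.85)
print(f"{share:.0%} of the variance is located at class level")
```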

The main question of this study was how creativity was related to the three problem types. The results show (1) that creativity is significantly related to all types of problems and (2) that creativity shows the strongest relation with the open-ended non-routine task. No difference in predictive value of creativity was found between the closed-ended routine and the closed-ended non-routine problems. These results show that students who score more highly on creativity, also perform better on all three types of mathematical problems, but also that this effect is the strongest for non-routine open-ended problems. In other words, the problem types differ in the degree to which they call upon creative thinking, with the non-routine open-ended type requiring the most creative thinking.

In addition, covariates were included to control for spuriousness. Some covariates were found to have a differential effect on the geometry problems. The results show that VSWM was a significant predictor of both types of closed-ended problems; however, no effect of VSWM was found for the open-ended problems. Children with higher working memory skills performed better on both closed-ended routine and closed-ended non-routine problems than children with lower working memory skills, but such an effect was not found for the open-ended non-routine problems. General mathematical ability was related to all types of geometrical problems, but most strongly to the routine problems. This means that students with high mathematical ability scored better on all problem types. This relation was even stronger for performance on both routine problems. In contrast, gender was not related to performance on the routine problems, but the results show a significant effect for both types of non-routine tasks, on which girls performed better than boys.

## 4 Discussion

The present study investigated the relations between students’ independently assessed domain-general creativity and their performance on three types of geometry problems: closed-ended routine, closed-ended non-routine and open-ended (multiple solution) non-routine problems. The results show that students who scored higher on the general creativity test were also better at solving geometrical problems regardless of the type of problem. This finding confirms our hypothesis and is in line with former research showing that domain-general creativity is also related to general mathematical ability (Kattou et al., 2013; Kroesbergen & Schoevers, 2017; Mann, 2006). The hypothesis that creativity would show the strongest relation with the open-ended non-routine task was also confirmed, but contrary to our expectations, the predictive value of creativity was the same for the closed-ended routine problems and the closed-ended non-routine problems.

The basic assumption of the current study was that the association between domain-general creativity and mathematical performance indicates the extent to which solving particular mathematical problems calls upon general creative thinking processes. Following this assumption, the current results suggest that open-ended problems trigger creative thinking of students more than closed-ended problems. This result raises the question of whether problems that trigger creative thinking could also be used to foster children’s creative thinking. The results from Levav-Waynberg and Leikin (2012a) indeed showed that systematically implementing multiple solution tasks in 10th grade geometry classes increased not only students’ geometrical knowledge but also their creativity.

Before drawing conclusions, alternative explanations for the current findings need to be considered. One alternative explanation is that not general creativity but general intelligence explains the pattern of findings, as creativity measures are often found to be moderately to strongly related to measures of general intelligence (e.g., Silvia, 2008). Note, however, that in the present study a measure of students’ visual-spatial working memory was included as a control covariate and that the associations of creativity with mathematical performance we found were net of the shared variance with working memory. Research has shown that working memory underlies the correlation between creativity and intelligence (Benedek et al., 2014). In addition, working memory has been found to be a stronger predictor of mathematical achievement in elementary school than intelligence (e.g., Alloway & Alloway, 2010). Therefore, explaining the pattern of findings as an effect of intelligence instead of indicating involvement of creativity seems less plausible. We cannot rule out that other generic abilities such as verbal skills, motivation or work attitude, might explain part of the shared variance (Carlson & Bloom, 2005; Elia et al., 2009). However, by including other covariates, in particular general mathematical ability, the involvement of such general processes is—at least partly—already controlled for. A specific interpretation, therefore, is more likely: especially open-ended non-routine problems call upon creative thinking.

Other findings confirm this pattern and suggest that different geometrical problem types involve different cognitive processes and problem solving strategies. VSWM and mathematical ability were relatively strong predictors of performance on both routine and non-routine closed-ended problems, but not of performance on open-ended non-routine problems. A possible explanation is that the format of the closed-ended problems is more similar to the typical mathematical problems with which the students are familiar. This aspect might trigger strategies such as the retrieval of knowledge or facts and procedures, and activating memory of previous experiences with similar problems to find the correct answer or procedure. In contrast, the multiple solution task was not only non-routine, but also open-ended. Openness of a problem seems to trigger creative thinking the most, as it stimulates students to find answers beyond what is already known and readily retrievable from memory (Kroesbergen & Schoevers, 2017; Leikin, 2018; Schoevers et al., 2020a).

Contrary to our hypotheses, the association of creativity with performance did not differ between routine geometry problems and non-routine geometry problems. We expected that the relative unfamiliarity of non-routine problems compared to routine problems would require more creative thinking and, thus, would show a stronger association between creativity and performance, which was not the case. A possible explanation is that openness of problems is more important than their familiarity. Closed-ended problems might trigger more convergent than divergent thinking processes in students, while open-ended tasks require especially divergent thinking skills. Although the closed-ended non-routine task was also intended to stimulate students to combine mathematical knowledge and procedures in a new and meaningful way, the predominantly closed-ended nature of the task, asking for a single correct answer, may have limited the effect of the unfamiliarity.

In addition to mathematical ability and working memory, gender was also found to be differentially related to the problem types. Girls outperformed boys on both non-routine types of problems, but no differences were found on the routine tasks. It seems that unfamiliarity does trigger creative thinking in girls. This is in line with previous studies showing that girls are not advantaged at regular mathematical tasks, but do perform better on tasks that require creative thinking (Baer & Kaufman, 2006; Mann, 2006).

In interpreting the results it should be kept in mind that in the study we examined the role of creativity in only three types of geometrical problems. There are other types of open and non-routine geometrical problems (Leikin, 2018), which were not included in this study. Moreover, geometry is only one of the several domains constituting the discipline of mathematics in elementary school. The present findings can therefore not be generalized to these other types of problems or mathematical domains. Future research could extend the present approach to other types of problems and other mathematical domains. Another limitation of this study relates to our measure of creativity. The TCT-DP, although regarded as a generic, holistic measure of creativity, does not fully capture the complex multidimensional nature of creativity as it is defined nowadays (Cropley, 2010). Note, however, that this test covers both divergent and convergent thinking processes and can, for that reason, be regarded as a good single indicator of students’ domain-general creativity. Nonetheless, we recommend that multiple measures of creativity be included in future research.

To conclude, students’ domain-general creativity was positively associated with their performance on three different types of geometry problems. However, students’ creativity was significantly more strongly associated with performance on multiple solution problems than with performance on routine and non-routine closed-ended problems. Students with higher levels of creativity performed better in solving geometry problems in general, but especially in geometry problems asking for multiple solutions. As several covariates were included to control for spuriousness, we may cautiously conclude that the remaining effects of creativity reflect true involvement of creative thinking processes in geometrical problem solving, especially in the more open multiple solution problems. Further research should provide more evidence regarding the causal direction of the relations and the effect of including more open tasks in the curriculum on students’ creative skills. Other research has suggested that providing students with appropriate open-ended problems in mathematics education is probably an important way to promote creative thinking (Levav-Waynberg & Leikin, 2012a). However, longitudinal and experimental research is needed for further evidence. Nonetheless, the present study provides important first ideas on the possible involvement of creativity in mathematics that may inspire new research and curriculum development.