1 Introduction

The research presented in this paper was motivated by several observations concerning research associated with mathematical creativity, expertise, problem solving and problem posing and the relationships between them.

The first observation concerns research on expertise in mathematics (as elaborated in the background section). While expertise is commonly addressed as superior performance in a particular domain (e.g., mathematics), in the research literature the notion of mathematical expertise acquires a broad range of meanings, as expressed in different groups of target populations varying from school students who excel, to professional mathematicians. Taking into account this variance, we examine differences in creativity and proving skills among participants with rich mathematical backgrounds of 2 types: (1) MO—problem-solving experts who were candidates or members of the Israel National IMO team and (2) MM—mathematics majors who excelled in university mathematics courses and also completed a High School Mathematics Teaching Certificate. The participants in these two groups are considered experts with different types of mathematical expertise.

Second, over the past two decades mathematics education researchers have—fortunately—increasingly paid attention to mathematical creativity and creativity-directed activities as major twenty-first century skills. At the same time, there are inconsistent arguments about the connections between mathematical expertise and creativity, and, moreover, empirical studies on such connections are scarce.

Third, in contrast to unconscious (to a large extent) mathematical creation by professional mathematicians, problem posing usually involves producing new problems in response to a requirement to do so. We found that empirical studies that examine connections between problem solving expertise and problem posing performance are rare, and we explore this connection here by employing problem posing through investigation (PPI) tasks.

Furthermore, we base our work on arguments about the domain-dependency of expertise and creativity (Baer 2015). We focus on participants from the two groups with two different types of mathematical expertise in order to gain a better understanding of the connection between mathematical creativity and these two types of mathematical expertise.

The PPI—problem posing through investigation—employed in this study is a mathematical activity that combines problem posing and problem solving. PPI provides multiple opportunities for investigations in a dynamic geometry environment (DGE), allowing participants to create auxiliary constructions, measure, search for geometric properties, and conjecture regarding the examination and formulation of new problems, which the participants are then required to solve (Leikin 2014; Leikin and Elgrably 2020). As such, PPI tasks are open from the start and from the end (Leikin 2019), since solvers are encouraged to choose what they examine and how, and the outcomes usually constitute an individual space of posed problems, which are based on the discovered properties. These collections differ among different individuals in terms of the number, types and complexity of the posed problems. A PPI task is completed only when all the posed problems are solved by the participants; they are free to choose how to prove any discovered property. The openness determines the complexity of PPI, since an investigation can lead in unpredicted directions, conjectures can appear to be incorrect, or solving some posed problems can require knowledge and skills at a level that surpasses the level of problem solving expertise of those who posed the problems. At the same time, the openness of the PPI tasks and their complexity determines the power of these tasks as tools for the investigation of creativity and problem-solving expertise.

2 Background

2.1 Expertise in mathematics and beyond

Ericsson and Lehmann (1996) defined expert performance as “consistently superior performance on a specified set of representative tasks for a domain” (p. 277) and stressed that “it is generally assumed that outstanding human achievements (i.e., expertise) reflect some varying balance between training and experience (nurture), on one hand, and innate differences in capacities and talents (nature) on the other” (p. 274). There is a consensus that expert knowledge differs from novice knowledge in its organization, as well as its extent (Glaser 1987; Lesgold 1984). Experts also rely on more 'abstract' or general structures (Voss et al. 1983). Hoffman (1998) argued that experts differ from non-experts in the reasoning operations or strategies they apply, and their ability to apply these operations and strategies in different orders and with different emphases.

Experts in mathematics have the ability to focus attention on appropriate features of problems, and have more cognizance of their own thought processes and of how others may think (Carlson and Bloom 2005; Lester 1994). Researchers characterized experts’ performance as processing flexibility linked to the ability to form multiple alternative interpretations or representations of problems (Hoffman 1998; Greer 2009; Star and Newton 2009). In contrast to an expert, a novice's system of representations of a mathematical concept may be deficient in number and in connections that form an adequate network of knowledge (Lester 1994). Mathematical knowledge and skills in experts are developed through deliberate practice and are characterized by robust concept images, procedural fluency and strategic competence in problem-solving, high levels of abstraction, and mathematical flexibility, expressed in the number of ways in which experts can tackle a problem (Schoenfeld 1985). Experts differ from novices in the problem-solving strategies they employ (Schoenfeld 1992) and in their ability to categorize problems according to solution principles and choose the most efficient ways of solving a particular type of problem (Sweller, Mawer and Ward 1983). Moreover, according to Duncker (1945), proposing a hypothesis is an intrinsic part of the problem-solving process for mathematical experts.

Beginning with Poincare’s (1908/1952) work, researchers’ studies of mathematical expertise have often been based on retrospective analyses of their own mathematical activities, or analysis of the mathematical performance of high-performing students or colleagues (Berman 2009; Schoenfeld 1985; Wilkerson-Jerde and Wilensky 2011). Studies on mathematical expertise are often linked to studies on mathematical giftedness, which analyze exceptional mathematical performance and connect mathematical giftedness to the work of mathematicians (Leikin 2019; Sriraman 2005; Usiskin 2000). As such, studies on mathematical expertise and mathematical giftedness are greatly intertwined (Leikin 2019), as reflected in the research populations of these studies, which include mathematics professors and graduate students (Wilkerson-Jerde and Wilensky 2011), participants in mathematical Olympiads (Koichu 2010; Koichu and Berman 2005; Reznik 1994), students who passed SAT-M tests with high scores, or participants in summer mathematics camps, or simply students with extremely high mathematical scores in school, or mathematics majors (Lubinski and Benbow 2006).

In contrast to studies that describe and analyze mathematical performance of mathematically advanced individuals alone, in this study we employed a differentiated view of mathematical expertise. We focused our study on two groups of participants with different types of mathematical expertise (MO and MM participants). To the best of our knowledge, no previous study has performed a comparison of mathematical creativity in groups of participants with different types and levels of mathematical expertise.

2.2 Creativity in mathematical problem solving and problem posing

In the rapidly changing world of the twenty-first century, the importance of creativity is difficult to overestimate. Development of creativity in general and of mathematical creativity in particular is extremely important nowadays, both from a personal point of view (to strengthen people’s ability to adapt to new and challenging situations, which is essential for the well-being of each individual) and as a basic mechanism of societal, technological, and scientific development (Amado, Carreira and Jones 2018; Leikin and Pitta-Pantazi 2013; Leikin and Sriraman 2016; Sriraman and Hwa 2010).

Torrance (1974) considered creativity to be an effective combination of divergent and convergent thinking. Operationally, this view led to the definition of creativity based on four related components, namely, fluency, flexibility, novelty, and elaboration (Torrance 1974). Divergent thinking includes finding different solutions and interpretations, applying different techniques, and thinking originally and unusually, and creativity is one of the learning outcomes. At the same time, for convergent thinking, knowledge is of particular importance as a source of ideas, pathways to solutions, and criteria of effectiveness and novelty.

Providing a precise and broadly accepted definition of mathematical creativity is extremely difficult, probably impossible (Mann 2006; Sriraman 2005). Sternberg and Lubart (2000) drew a connection between creative performance and the ability to produce original and useful products, and, moreover, there is consensus among researchers that originality is the major component of creativity.

Mathematical creativity in school mathematics is usually associated with problem solving or problem posing. Problem posing and problem solving can be employed for the development of mathematical creativity (Matsko and Thomas 2015; Levav-Waynberg and Leikin 2012). Creative problem solving in mathematics is associated with mental flexibility (Silver 1997; Star and Newton 2009) and with mathematical insight (Ervynck 1991; Krutetskii 1976; Leikin 2009). Following Torrance (1974), Silver (1997) suggested developing creativity through problem solving as follows: Fluency is developed by generating multiple mathematical ideas, generating multiple answers to a mathematical problem (when such exist), and exploring mathematical situations. Flexibility is advanced by generating new mathematical solutions when at least one has already been produced. Originality is advanced by exploring many solutions to a mathematical problem and generating a new one. Leikin (2009) suggested a model for the evaluation of creativity using multiple solution tasks (MSTs). This model suggests evaluation of creativity with the three abovementioned categories—fluency, flexibility and originality—through analysis of similarities and differences between the multiple problem-solving strategies used. The PPI tasks, as described in the introduction section, are an instance of MSTs, thus in the current study we utilized Leikin’s (2009) model with regard to the variability of problems posed by the study participants.

2.3 Relationship between creativity and expertise

The relationship between creativity and expertise is an intriguing research topic, and one can find inconsistencies between researchers’ arguments about this relationship. For example, the publications reviewed above in this paper do not connect expertise and creativity. This can be seen, for example, in the word cloud of the 60 most frequent words in Hoffman’s (1998) chapter “How can expertise be defined? Implications of research from cognitive psychology” (Fig. 1).

Fig. 1 Studies on experts do not mention creativity

Baer (2015) demonstrated that creativity and expertise are related, but are very different things. He argues that whereas expertise does not usually require creativity, creativity may require a certain level of expertise. In contrast, the bulk of the research literature on mathematical expertise at high level considers creativity to be an integral component of mathematical expertise in mathematically gifted individuals. Poincare (1908/1952) and Hadamard (1945) characterized the work of professional mathematicians as a creative activity, based on introspective analysis of their and their colleagues’ activity. Sriraman (2005) suggested a theoretical model of connections between creativity and expertise that included 8 levels of expertise according to the creativity component (introduced by Usiskin 2000), arguing that “in the professional realm, mathematical creativity implies mathematical giftedness, but the reverse is not necessarily true” (Sriraman 2005, p. 21). Findings about domain dependency of expertise and creativity (Baer 2015) are an additional factor that motivated our study. “People may be expert, and people may be creative, in many domains, or they may be expert, or creative, in few domains or none at all, and one cannot simply transfer expertise, or creativity, from one domain to another, unrelated domain” (Baer 2015, p. 165). In our study we considered whether and how MO and MM types of mathematical expertise are expressed in PPI.

2.4 Problem posing and problem solving

In the past two decades mathematical investigations have been acknowledged as powerful tasks for the teaching and learning of mathematics (Leikin 2016; Ponte 2007; Ponte and Henriques 2013; Silver 1994; Yerushalmy et al. 1990). Problem posing is a broad concept, usually related to the creation of a new problem in response to a requirement to create a problem or a set of problems. Mathematics educators categorize problem posing and investigation problems as 'open problems' (Pehkonen 1995; Silver 1994, 1997). Some problem posing related to problem transformation was explored by researchers focusing on systematic transformations of a given problem involving variations in goals and givens (Brown and Walter 1993). Silver et al. (1996) and Hoehn (1993) drew attention to the “symmetry” transformation of a problem, which leads to the creation of a problem in which the givens and the goals have been swapped. Silver et al. (1996) also described the “goal manipulation” strategy, in which the givens remain and only the goal is changed. Leikin and Grossman (2013) demonstrated that “What if yes?” problem posing strategies are more effective when performing investigations and problem posing in DGE if conditions are added to givens instead of removing them. PPI tasks employed in this study allow both manipulation of givens and goals and this activity is supported by the use of DGE, which is naturally associated with investigations in geometry (Yerushalmy et al. 1990).

Complex problem solving by experts, including Olympiad participants, includes problem posing; “problem formulation and problem solution go hand in hand, each eliciting the other as the investigation progresses” (Davis 1985, p. 23). Duncker (1945) observed that problem solving by mathematical experts consists of successive re-formulations of an initial problem (which is a type of problem posing). Koichu (2010) analyzed problem posing in the context of teaching for advanced problem solving. However, the way in which experts with different types of mathematical expertise perform problem-posing tasks has not been explored systematically.

Reznik (1994) described the Putnam contest as designed to test originality as well as technical competence in problem solving. He believed that success in Olympiads and in studying mathematics at the university level are related, but not necessarily equivalent, thus not all mathematics majors can solve Olympiad problems. Sriraman (2005) maintained that in the hierarchy of mathematical giftedness, majoring in mathematics stands at a lower level than does participation in mathematical Olympiads. Thus, in our study, the two groups MO and MM were chosen in order to shed light on the relationships between problem-solving expertise of different types and levels (MO and MM), and creativity linked to PPI.

3 The study

3.1 Problem posing through investigations

PPI is a complex mathematical activity that includes the following (Leikin and Elgrably 2020):

  • Investigating a geometrical figure (from a proof problem) in a DGE (experimenting, conjecturing and testing), in order to find several [at least 2] non-trivial properties of the given figure and related figures that are constructed using auxiliary constructions. A non-trivial property is defined as one for which the proof includes at least 3 stages (Fig. 2).

  • Formulating several [at least 2] new proof problems based on the investigations performed, and solving (proving) them.

In what follows we use the terms ‘posed problem’ and ‘discovered properties’ interchangeably since the posed problems require proving the discovered properties. Figure 2 depicts the PPI task used in the study presented in this paper.

Fig. 2 PPI task used in this research

Task 1 was formulated using a proof problem from a 10th grade geometry textbook. The problem required students to prove that \(BE/EA=2\) (Fig. 2). The proof problem is simple for both groups of participants, allowing a focus on their problem-posing performance. To control the level of participants’ expertise we examined participants’ success in proving the posed problems.

3.2 The study goals

The major goal of the study presented here was to examine mathematical creativity as a function of mathematical expertise. The examination was performed with regard to proof skills (auxiliary constructions performed in the course of PPI, correctness of proof of the posed problem and complexity of the posed problem) and creativity components (fluency, flexibility, originality and creativity). To achieve the goal, we asked the following research questions:

QA. What are the differences between PPI by MO and MM participants from the point of view of proof skills and creativity components?

QB. What are the mutual relationships between proof skills and creativity components of PPI by MO and MM participants and how do these relationships differ between the MO and MM students?

3.3 Participants and data collection

Two groups of participants took part in this study, namely the MO group and the MM group. The following characteristics led us to consider the groups as having different types of mathematics expertise.

The MO group included 8 participants who were candidates for, or members of, the Israel National Olympiad team; they are the problem-solving experts in this study. All these participants passed the problem-solving training for the IMO (International Mathematical Olympiad). The IMO is the most prestigious mathematics competition nowadays, and includes problems from classical content areas and those that are not usually studied in school or university (Koichu and Andžāns 2009). The training is directed at the development of the highest level of problem-solving skills and strategies. The 8 participants volunteered to participate in our study upon our request.

The MM group in this study included 11 excelling mathematics majors who had studied more than 1000 h of mathematics in university. These 11 participants were chosen from a group of 68 participants in a wider study, since, in contrast to the other 57 participants, they received scores above 90 in such courses as calculus, advanced calculus, linear algebra and analytical geometry. In addition to holding a BSc degree in mathematics, these participants completed a 52-h geometry course directed at the development of problem solving (proving) skills in geometry through the systematic employment of PPI. This course included PPI linked to Menelaus’ theorem, Ceva’s theorem, the nine-point circle and the Euler line, so that they discovered and proved the theorems and were also asked to use them when solving other problems during the course (Leikin and Elgrably 2020).

Participants from the MM group were asked to solve Task 1 during the written test conducted as the final examination of the course. MMs were given 90 min to solve this task. They performed PPI in dynamic geometry and submitted their investigation outcomes accompanied by GeoGebra files that demonstrated the entire sequence of constructions and discoveries performed in the course of their investigations. Additionally, MM participants submitted written documents that included problems posed by the participants and their proofs.

Since MO participants did not have training in solving PPI tasks, they first received a preliminary, very short introduction to PPI tasks and the ways of working with DGE, and then were asked to solve Task 1 during individual interviews. The interviews were recorded using Camtasia software that allowed analysis of each action during the investigation process and formulation of the posed problems. Participants from both study groups were engaged in solving the PPI task for about 90 min. This form of data collection allowed us to perform identical analyses of the PPI outcomes produced by the participants from the two groups, as explained in the next section.

3.4 Data analysis

We utilized the decimal-based scoring scheme introduced in Leikin (2009) for all of the criteria examined in this study. To examine the relationship between creativity and expertise, we evaluated each individual space of posed problems with respect to creativity components and proof skills. Proof skills included the following: (a) auxiliary constructions performed by the participants to discover a property, (b) correctness of proofs of the discovered properties, and (c) complexity of the posed problem. Creativity components included the following: (d) fluency, defined as the number of discovered properties, (e) flexibility, defined as the number of discovered properties of different kinds, (f) originality, defined as the newness and rareness of the discovered properties. An individual space of posed problems is made up of all of the problems the person posed based on the discovered properties. We evaluated each of the individual spaces of posed problems as explained in Table 1.

Table 1 The scoring scheme for the evaluation of an individual space of the problems posed through investigations

We open the findings section with a description of the interview with Dave—the most creative MO participant in our study—and explain the ways in which his performance on PPI tasks was scored. Then, in order to answer the research questions, we report our comparison of the individual spaces of problems posed by the participants from the MO group and those of the participants from the MM group with respect to the creativity components and proof skills. We also report the analysis we performed of the collective spaces of the problems posed by the participants of the two research groups.

4 Findings

4.1 Example: interview with MO expert

Dave (pseudonym) was 17 (at the time of the interview), and had been studying in the Technion (Israel Institute of Technology) since the 9th grade (Spring 2014). Dave took part in the International Mathematical Olympiad (IMO) during 2014–2017. He won 4 medals: 3 bronze medals and one silver medal. Dave exhibited the highest performance on a PPI task both for the MO and the MM groups. Figure 3 presents excerpts from the interview with Dave.

Fig. 3 Excerpts from the interview with Dave (the highest performer in the study)

Before analyzing the problems posed by Dave, let us remember that to allow for fluency of the interview, MO participants were not asked to present a complete formulation of proof problems (given X, prove Y), but only to find properties that can be proven (Y).

Dave not only discovered multiple properties and proved them, he also refuted a number of properties either by construction and dragging in DGE (e.g., the points \(N\), \(L\), \(H\) are not on one line) or by performing a formal proof (\(CI=DI\) was shown to be mistaken by calculating the power of the point \(I\)). In the course of examining conjectures by proving or in the course of proving the properties, Dave formulated additional properties that sometimes he did not find interesting enough to explicitly present as posed problems or did not recognize as discoveries at the time. However, he also used DGE to test the conjectures he raised when proving or trying to prove some properties that seemed to be correct based on the observation of the figure in DGE. After performing a number of auxiliary constructions he understood the power of DGE for discovering properties and asked whether he was allowed ‘to build whatever he wants to’. After this moment of the interview he performed a variety of constructions in the course of tackling the PPI task.

Auxiliary constructions. Problems D1, D2 and D3 got a score of 0 because discovering the properties required at most one auxiliary construction. For example, to find discovery D1, that ADBF is a rhombus, Dave had to perform one auxiliary construction, namely, drawing a line parallel to BD through point A, that is, the segment AF. Similarly, property D8 was discovered without need for any auxiliary construction, and so also received a score of 0. On the other hand, property D4 was discovered using the construction of two auxiliary lines in the shape, namely, DF and AF, and therefore received a score of 1. In order to get a score of 10, more than 3 auxiliary constructions are required within the shape; a good example is the discovery of property D5, which consists of creating the points G and H and drawing a circle inscribing the shape.

Complexity of the posed problem. Proving D1 was relatively simple. With AFBD as the auxiliary construction: (1) AD ∥ BC, since if alternate interior angles are equal (∢BAD = ∢ABC = 60°), the lines are parallel; (2) then AFBD is a parallelogram (according to definition); (3) AD = BD are adjacent sides in the parallelogram, thus AFBD is a rhombus. The proof included 3 stages, the use of 2 definitions, and the equality of alternate interior angles as a sufficient condition for parallel lines. Thus the complexity of D1 was scored with 1.

The proof for property D4 was based on the proof of D1 and the additional stages: (1) FH = HD since the diagonals bisect each other, and therefore AH is a median line; (2) the diagonals in a rhombus are perpendicular, therefore ∢BHF = 90°; (3) FD ∥ AC, since if corresponding angles are equal (∢BHF = ∢BAC = 90°) then the lines are parallel; (4) ACFD is a parallelogram according to definition; (5) the diagonals in a parallelogram bisect each other, and therefore AG = FG, DG is a median line, and E is the intersection of the medians in the triangle. The proof included 4 additional stages to the proof of D1.

Proof correctness. Dave proved all the discovered properties but the last, which he did not prove due to the end of the interview. Each of his proofs was scored with 10.

Fluency. Overall, Dave discovered and explicitly formulated 12 properties (D1–D12, see Fig. 3), thus his fluency score was 12.

Flexibility. The properties (see Fig. 3 and Table 2) were of 6 different types: special quadrilaterals (D1, D3 and D10), midpoint of a segment (D2), a point is a triangle’s center of mass (D4), four points on a circle (D5 and D7), two segments ratio (D8), three points on a straight line (D6, D9 and D12), and a line is tangent to a circle (D11). Properties D1, D2, D4, D5, D6 and D11 were scored with 10 points for flexibility as these properties were of different types. D8 was scored with 0.1 points for flexibility since it was the same property as D2. D3 was scored with 1 for flexibility since D1 also was a parallelogram (rhombus); however, D10 was scored with 10 since it was a different type of special quadrilateral than D1 and D3, and the discovery of D10 required many complex auxiliary constructions. D7, D9 and D12 were repeating properties discovered in different locations of the figure based on a series of auxiliary constructions, and thus each received a score of 1 for flexibility.

Table 2 Evaluation of Dave’s individual space of posed problems

Originality. Originality of the problems was evaluated based on the frequency of the property, as determined by the number of participants who discovered the property. The frequency was calculated based on the problems posed by the participants from the MO group and the big MM group. Each of the properties ‘the quadrilateral is a rhombus’, ‘the quadrilateral is an isosceles trapezoid’, and ‘4 points are on a circle’ appeared in the spaces of posed problems of 1 to 5 performers each. Thus D1, D4, D5, D7, D10 and D11 were scored with 10 points for originality. On the other hand, more than 97% of the participants from the big study group posed problems that included a ratio of segments and areas. Thus D2 and D8 each received 0.1 for originality.
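The frequency-based rule described above can be sketched in Python. This is a minimal illustration and not the authors' scoring procedure: the function name `originality_score`, the cut-offs of 15% and 40%, and the pool size of 76 (8 MO plus 68 MM participants) are assumptions made for the example. The actual cut-offs are the ones defined in Table 1.

```python
def originality_score(n_discoverers, n_participants, rare=0.15, common=0.40):
    """Map the share of participants who found a property to a decimal score.

    ASSUMPTION: the cut-offs `rare` and `common` are illustrative defaults;
    the study's own thresholds are given in its Table 1.
    """
    share = n_discoverers / n_participants
    if share < rare:
        return 10      # rare property: counted as original
    if share < common:
        return 1       # moderately frequent property
    return 0.1         # property found by most participants


# ASSUMPTION: pool of 8 MO + 68 MM participants, as suggested by the text.
POOL = 76

# A property found by at most 5 performers is scored as original.
rare_property = originality_score(5, POOL)

# A property found by more than 97% of the pool gets the lowest score.
common_property = originality_score(74, POOL)
```

Keeping the cut-offs as parameters makes it easy to substitute the thresholds actually used in a given study.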

Creativity. The creativity of each posed problem within an individual space of posed problems was evaluated as a product of flexibility and originality of the associated discovered property (Leikin 2009).
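As a worked check of the scoring described above, the following Python sketch recomputes Dave's totals from the per-property scores reported in the text. The flexibility values are taken directly from the flexibility paragraph. The originality values of 10 and 0.1 are as stated; the value 1 for D3, D6, D9 and D12 is inferred from the reported originality total of 64.2 and should be read as an assumption. Summing the per-problem products into a single total is also an assumption of this sketch.

```python
import math

# Flexibility score of each property, as reported for Dave in the text.
flexibility = {
    "D1": 10, "D2": 10, "D3": 1, "D4": 10, "D5": 10, "D6": 10,
    "D7": 1, "D8": 0.1, "D9": 1, "D10": 10, "D11": 10, "D12": 1,
}

# Originality score of each property.
# Stated in the text: 10 for D1, D4, D5, D7, D10, D11 and 0.1 for D2, D8.
# ASSUMPTION: 1 for D3, D6, D9, D12, inferred from the reported total of 64.2.
originality = {
    "D1": 10, "D2": 0.1, "D3": 1, "D4": 10, "D5": 10, "D6": 1,
    "D7": 10, "D8": 0.1, "D9": 1, "D10": 10, "D11": 10, "D12": 1,
}


def creativity_per_problem(flex, orig):
    """Creativity of each posed problem: flexibility times originality."""
    return {name: flex[name] * orig[name] for name in flex}


fluency = len(flexibility)                      # number of discovered properties
flexibility_total = sum(flexibility.values())
originality_total = sum(originality.values())
creativity = creativity_per_problem(flexibility, originality)

# ASSUMPTION: the total is taken as the sum of the per-problem products.
creativity_total = sum(creativity.values())

# Sanity check against the totals reported in Sect. 4.2.1.
assert math.isclose(flexibility_total, 74.1)
assert math.isclose(originality_total, 64.2)
```

Up to floating-point rounding, the flexibility and originality totals agree with the 74.1 and 64.2 reported for Dave, and a problem scored 10 on both components receives the maximal creativity score of 100.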

4.2 Comparing problems posed by MO participants and those by MM participants

4.2.1 Individual spaces of posed problems of Dave and Jerry

Table 2 depicts, in a condensed form, the individual space of problems posed by Dave (see Sect. 4.1), including a summary of the auxiliary constructions, all the discovered properties and their evaluation. Table 3 presents the space of problems posed by Jerry, the MM participant with the highest creativity score among the MM participants.

Table 3 Evaluation of Jerry’s individual space of posed problems

Jerry’s space of posed problems received the highest creativity score among the 11 excelling MM participants. He posed 7 problems, which is fewer than Dave did (12). In Dave’s space of posed problems, 7 problems included complex properties (scored with 10), whereas Jerry’s space included 4 problems with complex properties scored with 10. Dave proved 11 of the 12 problems that he posed, whereas Jerry proved 5 of 7 problems. Dave’s flexibility score was 74.1 whereas Jerry’s flexibility score was 42.1. Dave made 6 original discoveries and his originality score was 64.2 while Jerry’s originality score was 51.1. Jerry’s original discoveries included the following properties: a quadrilateral is a parallelogram, ratio of areas of two quadrilaterals equals 4.5, similarity of two triangles and tangency of a circle and a line. Note here that ratios of areas and ratios of segments were commonly examined by the participants in the MM group. As a result, all the characteristics of the spaces of posed problems were higher for Dave than for Jerry.

4.2.2 Overall differences between the spaces of the problems posed by MM and MO participants

Table 4 displays the number of problems in the collective spaces of posed problems by MO and MM groups, evaluated with the highest scores for different examined criteria. Figure 4 depicts boxplots representing range, mean and median for all the examined criteria.

Table 4 Numbers of posed problems with a score of 10 in each examined category

Fig. 4 Boxplots of scores assigned to the posed problems in the two groups

Table 4 and Fig. 4 demonstrate that the 8 MO participants produced more than twice as many problems through investigations as did the 11 MM participants. The mean number of problems posed by MO participants was 3 times larger than that of participants from the MM group. We compared the spaces of problems posed by the participants from the two groups according to the highest scores for all the examined criteria. We found that overall, problems posed by MO participants were based on a larger number of complex auxiliary constructions, included more complex discovered properties, and were proved in 97% of cases as compared to 59% for MM participants. The properties discovered by MO participants demonstrated more flexibility and were more original. On average MO participants posed 3 times more problems that received 100 for creativity than did MM participants.

Figure 4 illustrates most of the examined criteria. The highest score assigned for the problems posed by MM participants was lower than the lowest score attained in the MO group for all the participants except Jerry. This result held for auxiliary constructions (44 vs. 57), proof correctness (70 vs. 110), fluency (10 vs. 12), flexibility (51 vs. 58), originality (22.2 vs. 33.83) and creativity (201 vs. 275). Jerry’s scores on originality and creativity (Table 3) were within the range of scores of MO students. Comparing median scores for all the examined proof skills and creativity components showed significant differences in the quality of discoveries: median scores were more than 4.9 times higher in the MO group than in the MM group for auxiliary constructions, 5.4 times for proofs, and 2.9 times for complexity of discoveries. The ratio of median scores in creativity components was 2.8 for fluency, 4 for flexibility, 3.6 for originality and 3.2 for creativity. A Mann–Whitney test demonstrated that the differences among the posed problems were significant for all the proof skills and creativity components.
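The kind of between-group comparison reported above can be illustrated with a short Python sketch that computes the Mann–Whitney U statistic and a ratio of medians. The score lists below are hypothetical placeholders (8 and 11 values, matching the group sizes only) and are not data from the study; the function name `mann_whitney_u` is likewise introduced here for illustration. The sketch returns the U statistic only; a p-value would normally come from a statistics package such as `scipy.stats.mannwhitneyu`.

```python
from statistics import median


def mann_whitney_u(xs, ys):
    """U statistic for xs: number of pairs (x, y) with x > y; ties count 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# HYPOTHETICAL scores, one per participant (not taken from the study).
mo_scores = [12, 15, 9, 20, 15, 11, 18, 16]       # 8 MO participants
mm_scores = [4, 6, 3, 5, 7, 2, 8, 5, 4, 6, 3]     # 11 MM participants

u_mo = mann_whitney_u(mo_scores, mm_scores)
u_mm = mann_whitney_u(mm_scores, mo_scores)

# The two U values always sum to the number of cross-group pairs.
assert u_mo + u_mm == len(mo_scores) * len(mm_scores)

# Ratio of group medians, the descriptive comparison used in the text.
median_ratio = median(mo_scores) / median(mm_scores)
```

When every score in one group exceeds every score in the other, as in these placeholder lists, U reaches its maximum of 8 × 11 = 88, which corresponds to complete separation of the two groups.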

4.2.3 Relationships between different creativity components and proof skills linked to PPI within the groups of MO and MM participants

An additional comparison between the PPI performed by MO and MM participants focused on correlations between the associated proof skills and the creativity components, computed separately for the MO and MM groups. A Spearman correlation test was applied to all the proof-related and creativity scores within each study group. Consistent with the findings of our previous studies (Leikin and Elgrably 2020; Levav-Waynberg and Leikin 2012), in both groups of participants the correlation between creativity and originality was found to be significant (rs = 0.881, p < 0.01 in MO group; rs = 0.991, p < 0.01 in MM group). This correlation confirms the validity of the model suggested for the evaluation of creativity linked to PPI.
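The within-group correlations reported here are Spearman rank correlations. As a hedged illustration of how such a coefficient is obtained, the sketch below ranks two score lists and computes the Pearson correlation of the ranks. The helper names (`average_ranks`, `pearson`, `spearman_rs`) and the two score lists are assumptions made for this example; the values are invented and are not the originality or creativity scores of any participant.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder scores for 8 participants, NOT the study's data.
originality = [10, 14, 18, 22, 25, 29, 31, 34]
creativity = [90, 120, 110, 160, 150, 210, 240, 275]

rs = spearman_rs(originality, creativity)
print(round(rs, 3))
```

Because the coefficient depends only on rank order, it suits small samples and scores that are not normally distributed, which is the usual reason for preferring it to the Pearson coefficient in a design of this kind.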

Based on the initial analysis of the individual and collective spaces of the problems posed by the two groups of participants, and based on our previous study with a big-MM group (Leikin and Elgrably 2020), we hypothesised that auxiliary constructions performed by the participants in the course of PPI led to more complex properties and a more flexible discovery process. To our surprise, the complexity and number of auxiliary constructions performed (as reflected in the auxiliary constructions score) did not correlate significantly either with complexity of the posed problems, or with creativity-related criteria linked to PPI.

For the problems posed by the MO participants we found a significant correlation between fluency and flexibility (rs = 0.862, p < 0.01). This correlation indicates that, among MO participants, posing a larger number of problems was associated with posing a larger number of problems of different types. It supports our observations regarding the MO students’ inclination to find ‘interesting’ discoveries, as was obvious in the interview with Dave. This connection between fluency and flexibility of the PPI process was specific to the MO participants; the correlation was not significant in the MM group. Interestingly, both fluency and flexibility of PPI correlated significantly with proof correctness in MO participants only (rs = 0.970, p < 0.01 for fluency and proof in MO; rs = 0.905, p < 0.01 for flexibility and proof in MO). These correlations support our observation that, as in the case of Dave’s PPI, many of the properties discovered by MO participants were discovered in the course of searching for proofs of earlier discovered properties, and PPI by MO participants constituted chains of proofs and discoveries supported by DGE.

Flexibility of PPI in the MM group correlated significantly with both originality and creativity of the PPI (rs = 0.900, p < 0.01 for flexibility and originality in MM; rs = 0.945, p < 0.01 for flexibility and creativity in MM). MMs’ ability to pose a greater variety of problems was related to their success in posing original problems. Surprisingly, these correlations did not appear to be significant in the MO group. We suggest that the proof skills that characterised MO mathematical expertise led to their flexibility, while the posing of original problems was rooted in their geometrical curiosity, expressed in an inclination to find interesting properties.

5 Conclusions, discussion and some additional facts that explain our findings

The goal of the study presented in this paper was to examine relationships between creativity and expertise in mathematics in two groups of participants with different types of mathematical expertise. The first group (MO) included 8 candidates or members of the Israel National Olympiad team. MOs were experts in mathematical problem solving at a high level, including solving complex geometry problems. Seven of them did not study university mathematics before or during the study. The second group (MM) included mathematics majors who excelled in mathematics courses during their studies for a BSc degree in mathematics. They succeeded in solving different kinds of problems at an advanced level but were not experts in solving complex mathematical problems at the Olympiad level.

The study demonstrates significant differences between the two kinds of expertise in mathematics. We found that problem solving expertise at a high (MO) level significantly influences the quality of PPI as reflected in proof skills and creativity components. Unfortunately, as in our previous study (Leikin and Elgrably 2020), we found that university mathematics courses do not develop creative mathematical abilities and skills. The MO participants performed PPI significantly better than MM participants. They were more fluent, flexible and original and produced more complex problems with more complex auxiliary constructions. The lowest scores on almost all the examined criteria in MO were higher than the highest scores achieved by MMs on PPI tasks. This result was obtained in spite of the fact that MMs completed university degrees in mathematics, excelled in their mathematics courses, and took a geometry course with a specific focus on PPI.

One possible explanation, that expert knowledge is the reward of years (10 years) of concentrated effort, does not apply well to our findings, since both groups invested time and effort in studying mathematics. We assume that the difference is related to the type of training for Olympiads (Koichu and Andžāns 2009), and to the view of participation in international competitions as an established indicator of expertise and talent (Bloom 1985), rather than to majoring in mathematics (Sriraman 2005).

We found that the high level of mathematical expertise of MO participants was reflected in the significant correlation between proof skills and creativity skills. We demonstrated clearly, both through the analysis of the interview example and by means of the correlation analysis, that problem posing and proving by MOs were inseparable. These findings are in accord with Duncker’s (1945) position that raising a hypothesis is an intrinsic part of the problem-solving process in mathematical experts. According to Duncker, problem solving by experts involves deep understanding of available data, seeking information to test alternatives, and producing a judgment. The MOs in our study tended to approach PPI as a problem-solving task, seeking alternative properties that were more interesting to them. They used DGE mostly to test their hypotheses about additional properties, along with searching for properties using dynamic geometry. The auxiliary constructions that they performed were conscious and goal-oriented. We suggest that this behavior is reflected in the absence of correlations between the auxiliary constructions performed by MOs and other examined criteria. In addition, since they approached PPI similarly to proof problems, and based their hypotheses about new properties on their previous experiences in solving mainly proof problems, high correlations between proof correctness, fluency and flexibility were found. An additional explanation for our findings can be found in Hoffman’s (1998) argument that expert performance is characterized by flexible reasoning linked to the ability to form multiple alternative interpretations or representations of problems, and an increased ability to revise old strategies and create new ones as problem-solving proceeds (Shanteau and Phelps 1977). Most of the MO participants searched for more original properties based on their inner curiosity.

Note that a major limitation of the study is the different formats (i.e., a test and individual interviews) in which the task was employed with the two groups of participants. Nonetheless, both formats included solving a PPI task in the same dynamic geometry environment and tracking the auxiliary constructions performed and the problems posed by the participants. These data allowed us to conduct identical analyses of the PPI outcomes produced by the participants from the two groups. In contrast to the individual interviews performed with MO participants, the test conducted with MM participants did not record PPI strategies. Thus, comparative analysis of the PPI strategies used by the participants from these two groups is a subject for further investigation.