The analyses of the data collected in this study provided us with two types of results: firstly, they informed us about kindergartners’ performance in the domain of measurement of length (Questions 1 and 2); secondly, the analyses generated knowledge about whether reading picture books to kindergartners contributed to their performance (Question 3).
Kindergartners’ performance in length measurement
General performance in length measurement: results from pretest
The mean performance of the total sample (N = 308) on the 11 PICO measurement items in the pretest was .34 (SD = .15), which means that the average number of correct items was 3.74. The minimum total score was 0 correct items and the maximum total score was 9 correct items. The older children, who were in K2, demonstrated a higher performance (M = .39, SD = .14) than the younger children, who were in K1 (M = .25, SD = .12). The difference between these two age groups was significant [t(306) = −8.65, p < .01].
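The conversion from a mean proportion correct to a mean number of correct items can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the K1 and K2 item counts are derived here from the reported proportions rather than reported in the study.

```python
N_ITEMS = 11  # number of PICO measurement items in the pretest


def mean_items_correct(mean_proportion, n_items=N_ITEMS):
    """Convert a mean proportion correct into a mean number of correct items."""
    return mean_proportion * n_items


overall = mean_items_correct(0.34)  # total sample, reported as 3.74 items
k1 = mean_items_correct(0.25)       # derived for illustration, not reported
k2 = mean_items_correct(0.39)       # derived for illustration, not reported

print(round(overall, 2), round(k1, 2), round(k2, 2))
```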
Table 1 shows children’s success rates per item in the pretest for the whole sample and for the K1 and K2 children separately. Furthermore, this table contains the results of the chi-square tests that were carried out to examine the success differences between the K1 and K2 children.
Table 1 Success percentages per PICO item in the pretest
The easiest items were the Baby item (86%) and the Rope item (85%). The children were less successful in the Door item (49%), and 42% of the children responded correctly to the Plant item. The children’s success was relatively low in the Tree (27%), Flower (24%) and Snail (20%) items. The Plants, Snake and Shawl items were even more difficult for the children to answer correctly; the success rates for the latter two items were 16% and 10%, respectively. The most difficult was the Steps item, as only 4% of the children provided a correct answer.
The K2 children were significantly more successful than the K1 children in 7 of the 11 items: the Baby, Rope, Plant, Plants, Door, Snail and Shawl items. This result indicates that the general performance in length measurement increases with age.
Components of length measurement performance
To get more insight into the structure of children’s ability in length measurement, we carried out a hierarchical similarity analysis on the assessment items by using the computer software Classification Hiérarchique, Implicative et Cohésitive (C.H.I.C.) (Gras, Suzuki, Guillet, & Spagnolo, 2008). This analysis identifies hierarchical similarity between groups of variables (Lerman, 1981). In our study, these variables consist of the children’s responses to the different measurement items. For instance, the similarity of two distinct items can be determined by the probability that the number of subjects who simultaneously satisfy the two variables, that is, the number of children who answer consistently (i.e., correctly or incorrectly) to the corresponding items, is greater than the random number expected in this situation.
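The idea behind such a similarity index can be illustrated with a small sketch. It counts how many children answer two items consistently, compares that count with the count expected if the two items were independent, and turns the standardized difference into a probability. The function name, the toy response vectors and the normal approximation are assumptions made for illustration; this is not the C.H.I.C. implementation and the data are not from the study.

```python
import math


def similarity_index(a, b):
    """Probability-style similarity between two binary (0/1) response vectors.

    Assumption: agreement counts are standardized against the count expected
    under independence and mapped to [0, 1] with a normal approximation.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("need two non-empty vectors of equal length")
    # Children who answer both items correctly or both incorrectly.
    observed = sum(1 for x, y in zip(a, b) if x == y)
    p_a = sum(a) / n
    p_b = sum(b) / n
    # Probability of a consistent answer if the two items were independent.
    p_agree = p_a * p_b + (1 - p_a) * (1 - p_b)
    expected = n * p_agree
    variance = n * p_agree * (1 - p_agree)
    if variance == 0:
        return 0.5  # no variation, so no evidence either way
    z = (observed - expected) / math.sqrt(variance)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF


# Hypothetical responses of eight children to three items (1 = correct).
item_x = [1, 1, 1, 0, 0, 0, 1, 0]
item_y = [1, 1, 1, 0, 0, 0, 0, 1]
item_z = [0, 1, 0, 1, 0, 1, 0, 1]

sim_xy = similarity_index(item_x, item_y)
sim_xz = similarity_index(item_x, item_z)
print(sim_xy > sim_xz)
```

In this toy example, items x and y are answered consistently by six of the eight children and receive a high index, whereas items x and z agree for only two children and receive a low one.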
The similarity groups were established in an ascending manner as a function of their strength: the stronger the similarity connections were in the groups, the higher was the level at which they were established. Thus, the similarity groups are represented in a hierarchically constructed similarity diagram, which allows us to study and interpret groups of items in terms of a resemblance of performance characteristics. The similarity diagram in this study revealed a number of components in kindergartners’ performance in length measurement.
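The ascending construction of similarity groups can be sketched as a simple agglomerative procedure: at each step the two groups with the strongest similarity are merged, and the order of the merges gives the levels of the diagram. The linkage rule used here (two groups are as similar as their most similar pair of members), the item labels and the similarity values are assumptions chosen for illustration. C.H.I.C. applies its own aggregation criterion, so this is not a reproduction of the analysis in the study.

```python
from itertools import combinations


def agglomerate(items, sim):
    """Merge groups in order of decreasing similarity and return the merge history."""
    groups = [frozenset([i]) for i in items]
    history = []

    def group_sim(g, h):
        # Assumption: link two groups by their most similar pair of members.
        return max(sim[frozenset((x, y))] for x in g for y in h)

    while len(groups) > 1:
        g, h = max(combinations(groups, 2), key=lambda pair: group_sim(*pair))
        history.append((sorted(g), sorted(h), group_sim(g, h)))
        groups = [k for k in groups if k not in (g, h)] + [g | h]
    return history


# Hypothetical pairwise similarities between four hypothetical items.
pairs = {
    ("A", "B"): 0.95, ("C", "D"): 0.80,
    ("A", "C"): 0.30, ("A", "D"): 0.20,
    ("B", "C"): 0.40, ("B", "D"): 0.10,
}
sim = {frozenset(k): v for k, v in pairs.items()}

history = agglomerate(["A", "B", "C", "D"], sim)
for level, (left, right, strength) in enumerate(history, start=1):
    print(level, left, right, strength)
```

With these toy values, A and B are joined at the first level, C and D at the second, and the two pairs are joined at the third level.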
Figure 3 shows the similarity relations based on the correctness of the children’s responses to the items in the pretest in the total sample, including K1 and K2 children. The responses to the Plant and the Plants items are more similar to each other than the responses to any other pair of items. This similarity relation is situated at the first level of the hierarchical tree. Then, the similarity group consisting of the Plant, Plants and Shawl items, which is formed at the next level, presents a better aggregation than any other pair of items. Next, the similarity group is extended by the responses to the Flower and the Door items. The Snake item and the Steps item are linked at the next level. They are more similar than any other extension of the group of items consisting of Plant, Plants, Shawl, Flower and Door. The next level consists of the group including Plant, Plants, Shawl, Flower, Door, Snail and Tree. This group is higher than the level of the pair Baby and Rope, which in turn is higher than any extension of the pair Snake and Steps. Thus, in total three groups were identified by the similarity analysis. To enhance the interpretation of the structure found by the similarity analysis, we added the percentage of success to the item names.
The next step was to reason why the items belonging to each of these groups were solved correctly by the same children. A deliberation among the research team about the determining characteristics of the items resulted in the following interpretations.
The first similarity group involves the responses to the Baby and Rope items. The remarkable thing about this group is the children’s high success rate on the two items compared to the other test items. A characteristic that differentiates these two items from the other items is that they strongly trigger the use of holistic visual recognition. Thus, in the Baby item, the children could have asked themselves: which picture looks like a baby? In the Rope item they could have asked themselves: which picture has more “rope”? We think this solution approach fits well with children of this young age.
The second similarity group involves the responses to the Snake and Steps items. What these items have in common is, in one way or another, the partitioning of the length of an object (i.e., a snake or a pathway) into equal-sized units. Therefore, the second similarity group can be considered as reflecting children’s understanding of measurement related to unitizing.
The third similarity group is based on the children’s responses to the Plant, Plants, Shawl, Flower, Door, Snail and Tree items. All these items require ordering abilities based on the length of objects along a continuum. This is clearer in the Plant, Plants, Shawl and Flower items, but applies also to the other items. For example, in the Tree item, to understand the relationship between the height of the tree and the height of the girl and to find out which photograph showed the highest tree, the children probably used the order of the photographs. This is because the photographs were ordered according to the depicted height of the girl, starting with the photograph with the girl who looked the tallest. Because the tree is of the same height in all the photographs, the height of the depicted girl determines the height of the tree. So, the taller the depicted girl, the smaller is the tree in reality. Similarly, in the Snail item, the possible covered distances are ordered from the shortest to the longest one. This ordering probably helped the children to select the required distance. As for the Door item, children probably used ordering as well. Possibly, they imagined a girl’s height increasing over the years and projected this height on the measuring strip next to the door. On the whole, we interpreted the third similarity group to contain items that required ordering.
A closer look at the items revealed that a fourth group might have appeared, but it did not show up in the statistical analysis. We identified three other items that shared a common characteristic. The Shawl, Tree and Steps items all include an inverse relation. In the Shawl item, as the shawl grows longer the ball of wool grows smaller. In the Tree item, the smaller image of the girl relates to a taller tree in real life. In the Steps item, the number of steps is larger when the size of the steps is smaller. The fact that this similarity is not reflected in the children’s responses can be interpreted to indicate that an inverse relation is too advanced a concept for children of this age to use in their reasoning.
The success rates on the items of each similarity group suggest that the three components of length measurement performance, which correspond to the three similarity groups, did not have the same level of difficulty for the kindergartners. The holistic visual recognition items (similarity group 1) were the easiest ones, whereas the unitizing items (similarity group 2) were the most difficult ones. The items that required ordering abilities (similarity group 3) were of moderate difficulty level.
Carrying out the similarity analysis for the responses of the children in K1 and K2 separately revealed that the components in the children’s performance in length measurement differed only slightly from the components identified in the whole group. The results from the K1 sample are shown in Fig. 4.
In the K1 sample, the Baby and Rope items, which in the whole sample belonged to the similarity group of responses labeled as holistic visual recognition, were linked with a second similarity group including the Snake and Steps items, which in the whole sample were identified as requiring unitizing. The linking of these two similarity groups indicates that the youngest kindergartners are not yet able to solve these latter items just by unitizing and may have used holistic visual recognition as well.
Another difference that was found between the results from the similarity analysis in the K1 sample and the whole sample was that, in the K1 sample, the items that comprised the third similarity group in the total sample were distinguished into the similarity groups 3 and 4, which were linked to each other. Comparable to the third similarity group of the diagram of the total sample, all the items in these groups of the K1 diagram require ordering abilities. However, there seem to be two types of responses triggered in the children. The items in group 3, that is, the strongly related Plant and Plants items, and the Tree and Snail items, were potentially solved by starting from the four possible answers. Therefore, we labeled group 3 as recognizing answers. This group clearly differs from group 4, which could be based on responses that involve producing answers. In the Door, Flower and Shawl items, children may first have produced their answer and then looked for the matching answer. For example, in the Door item children could have reasoned: I reach just to the doorknob, so the arrow next to it is the answer. In the Shawl item, children might have known directly that in the first picture the ball of wool should be the largest, and consequently have looked for that ball. Finally, the approach of first producing the answer is probably most obvious in the Flower item, where children could have imagined the height of the next flower and then looked for the flower of that height in the answer boxes. This Flower item clearly contrasts with the Plant and Plants items, which belong to the third similarity group and which concern finding missing plants within a series of plants with increasing length.
To sum up, our hypothetical interpretation for this distinction is that in the items of the fourth similarity group, the strategy of starting with the given answers and checking whether they each fit is more difficult to use and less spontaneous than it was with the third group.
The relative difficulty levels of the items across the three components of length measurement performance by K1 children are similar to the ones referring to the whole group. Specifically, difficult items require unitizing (similarity group 2), items of moderate difficulty involve ordering (similarity groups 3 and 4) and easy items include holistic visual recognition (similarity group 1). Within the third and the fourth similarity groups, though, the Plants and the Shawl items appear to be more complex for K1 children than the other items that require ordering.
The results from the similarity analysis of the pretest responses in the K2 sample are shown in Fig. 5.
The component structure of the children’s length measurement performance in this sample is globally the same as in the K1 sample, but there are also some changes in how the items group together. The second group that may represent the unitizing response turned out to be quite stable over the kindergarten years. The differences between the K1 and K2 samples were only found with respect to the two other similarity groups.
In the K2 sample, the first similarity group referring to holistic visual recognition is extended with the Tree item. This change could indicate that for the older kindergartners it is no longer necessary to use the order of the photographs to find the answer, but that they see directly that the last photograph containing the smallest girl shows the tallest tree in reality. Another difference in the K2 sample is that the remaining six items, which in the K1 sample belonged to the third and the fourth similarity groups and which require ordering, are more strongly linked, thus forming one similarity group. However, they are distinguished into two subgroups. Different from the K1 sample, in the K2 sample the items in these subgroups are not differentiated as requiring recognizing answers versus producing answers. Instead, in this K2 sample with the older kindergartners, there is rather a division into items that include the ordering of just length versus items that deal with length in connection with other physical quantities. The first category includes the Plant, Plants and Flower items. The second category includes the Door and Shawl items. The Door item involves length and age, and the Shawl item involves length and an informal understanding of volume. The Snail item has characteristics of both. It deals with two physical quantities, length and time, and the way it is presented has a strong connection to ordering. This is why there is a similarity between this item and the items belonging to subgroups 3a and 3b.
As in the K1 sample, the difficulty level of the items in the K2 sample varies as a function of the length measurement components the children encounter, with the holistic visual recognition items (similarity group 1) as the easiest tasks, the ordering items as the tasks of moderate difficulty (similarity group 3) and the unitizing items as the most complex tasks (similarity group 2).
Effect of the intervention on length measurement performance
To answer the third research question, we compared the results of the experimental group with those of the control group. We did this for the general performance in length measurement and for its components. Before investigating whether the intervention had an effect on children’s measurement performance, we examined whether there were initial differences in measurement performance between the two groups. We found that the experimental group children (M = .33, SD = .14) and the control group children (M = .35, SD = .15) demonstrated similar initial performance [t(306) = 1.24, p = .22].
Effect on general performance
The effect of the intervention program on kindergartners’ general performance in length measurement was analyzed by means of a repeated measures univariate analysis of variance (ANOVA) with the factors Condition (control or experimental group), Test Moment (pretest or posttest), Mathematical Ability (levels A, B, C, D or E) and Gender (boy or girl) as independent variables and children’s achievement on the PICO test as the dependent variable.
The findings of the analysis showed a significant main effect of Test Moment [F(1, 280) = 25.71, p < .001, η² = .08] on the general performance of length measurement. A weaker but significant interaction effect was found between Condition and Test Moment [F(1, 280) = 4.04, p < .05]. An effect size (η²) of only .01 was found (Cohen, 1988). This finding indicated that the intervention had only a small positive impact. A further analysis revealed no significant triple interactions with Mathematical Ability [F(4, 280) = 1.71, p = .15, η² = .02] or Gender [F(1, 280) = .51, p = .50, η² = .002], indicating that the Condition and Test Moment interaction effect did not differ across mathematical ability levels or between boys and girls. The estimated marginal means of the children from the experimental group (pretest: M = .28; posttest: M = .37) and the control group (pretest: M = .33; posttest: M = .38) on the pretest and the posttest are illustrated in Fig. 6.
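The reported effect sizes can be related to the F statistics with a short calculation. The sketch below assumes that the reported η² values are partial eta squared, which the text does not state explicitly; under that assumption, the effect size follows from the F value and its degrees of freedom. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Assumption: the reported effect sizes are partial eta squared,
    i.e. SS_effect / (SS_effect + SS_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics as reported for the total sample.
test_moment = partial_eta_squared(25.71, 1, 280)
condition_by_moment = partial_eta_squared(4.04, 1, 280)
ability_interaction = partial_eta_squared(1.71, 4, 280)

print(round(test_moment, 2))          # 0.08
print(round(condition_by_moment, 2))  # 0.01
print(round(ability_interaction, 2))  # 0.02
```

Under this assumption the computed values agree with the effect sizes reported above after rounding.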
The same analysis was carried out for the K1 and K2 children separately to investigate whether the impact of the intervention on children’s results would vary with kindergarten year. The findings in both K1 and K2 revealed that the interaction effect between Condition and Test Moment was not significant [K1: F(1, 86) = 1.18, p = .28, η² = .01; K2: F(1, 177) = 1.51, p = .22, η² = .01]. However, in both grades the main effect of Test Moment remained significant [K1: F(1, 86) = 21.48, p < .001, η² = .20; K2: F(1, 177) = 10.53, p < .01, η² = .06].
Effect on the components of the performance
The impact of the intervention program was further examined on the components of the length measurement performance, holistic visual recognition, ordering and unitizing, which were identified previously in the similarity analyses. Repeated measures multivariate analyses of variance (MANOVA) were applied to the data of the total sample, and of the K1 and K2 children separately, with these three components as dependent variables and Condition, Test Moment, Mathematical Ability and Gender as independent variables.
The analysis of the total sample showed a significant main effect of Test Moment [F(3, 278) = 10.09, p < .001, η² = .1]. Nevertheless, we did not find a significant interaction effect between Condition and Test Moment [F(3, 278) = 2.52, p = .059, η² = .03]. Univariate analyses revealed, though, that this interaction was significant for the first component of the length measurement performance, namely, holistic visual recognition [F(1, 280) = 5.15, p < .05, η² = .02]. As illustrated in Fig. 7, children of the experimental group made more progress in holistic visual recognition than the children of the control group.
The application of the same analysis to the two kindergarten years separately showed similar results for K1 children and rather different results for K2 children. With respect to K1 children, a significant main effect of Test Moment [F(3, 84) = 13.26, p < .001, η² = .32] was found, but no significant interaction between Condition and Test Moment was revealed [F(3, 84) = 2.62, p = .056, η² = .09]. However, again the univariate analyses revealed that this interaction was significant for holistic visual recognition [F(1, 86) = 7.94, p < .01, η² = .09]. This finding suggests that the K1 children in the experimental group improved considerably more in holistic visual recognition than the K1 children in the control group (Fig. 8).
As regards K2 children, the findings showed a significant main effect of Test Moment [F(3, 175) = 3.96, p < .05, η² = .06], but no significant interaction between Condition and Test Moment [F(3, 175) = .50, p = .69, η² = .01]. Univariate analyses revealed that this interaction was not significant for any of the three components of length measurement performance [holistic visual recognition: F(1, 177) = .29, p = .59, η² = .002; unitizing: F(1, 177) = .08, p = .78, η² < .001; ordering: F(1, 177) = .98, p = .33, η² = .01], indicating that the intervention program did not result in a significant increase of the older children’s general performance in length measurement or of its components.