Most of the time, we acquire new knowledge under time pressure. This is the case not only in our professional lives but particularly for children and adolescents in educational settings. Such time pressure might affect our metacognitively guided decisions, which involve the monitoring and control of our own cognitive processes (Flavell, 1971; Walczyk & Griffith-Ross, 2006). In learning situations, people might monitor how they are progressing and, based on this self-assessment, decide whether to continue investing time and effort into studying or, equally important, whether to stop doing so. Yet little is known about how time pressure affects these metacognitive decisions. Moreover, from a developmental point of view, time pressure might have dissimilar effects on different age groups, given that the mechanisms underlying metacognitive processes show substantial changes across childhood and adolescence (Garner & Alexander, 1989; Paulus et al., 2014; Schneider, 2008). Knowledge about how time pressure affects metacognitive control in children and adolescents would thus be highly valuable both for evidence-based approaches in education and instruction and for theories on the development and nature of metacognition.

Nelson and Narens (1990) outlined a theoretical framework of procedural metacognition, which proposes that during task performance metacognitive monitoring processes inform subsequent control. This model highlights the importance of accurate monitoring as a prerequisite for adequate control and, not surprisingly, considerable research with adults, but also with children, has investigated the basis of metacognitive monitoring and the conditions that influence its accuracy (Hertzog et al., 2003; Koriat, 1997; Metcalfe & Finn, 2008; Schneider, 2010a; Townsend & Heit, 2011). Moreover, an increasing number of developmental studies have explored the second major component of procedural metacognition, namely metacognitive control, and how children come to use the outcome of their metacognitive monitoring processes to guide their study choices (e.g., Bernard et al., 2015; Destan et al., 2014; Koriat et al., 2014; Metcalfe & Finn, 2013; Roebers et al., 2014; Son, 2005; Tsalas et al., 2015).

Metacognitive control, time pressure, and difficulty

One line of research on metacognitive control, which has crucially shaped our understanding of the developmental pathway of these abilities, has focused on the amount of study time learners allocate to learning material of varying difficulty. In a typical study time allocation paradigm, participants are presented with easy and difficult learning material, for example picture or word pairs, and can study these in an untimed manner. On the one hand, developmental research on study time allocation has focused on the extent to which learners allocate their study time according to objective item difficulty (a priori defined easy and difficult learning material); on the other hand, it has explored the extent to which learners allocate their study time in line with subjective monitoring judgments about the difficulty of the material they have to learn, in other words, subjective judgments of learning.

One of the first developmental footprints of metacognitive control to be observed is that, at the age of 6 to 7 years, children learn to differentiate in their study time allocation between material of objectively varying difficulty (e.g., Destan et al., 2014; Dufresne & Kobasigawa, 1989). Additionally, and more relevant for the purpose of the current study, research that has specifically looked at the extent to which learners allocate their study time in line with their subjective monitoring judgments has shown that the relation between these subjective judgments and the allocated study time becomes stronger across the elementary school years (e.g., Lockl & Schneider, 2003). At around 10 years of age, children, like adults, engage in self-regulated learning, defined as the extent to which learners allocate their study time in line with their own subjective monitoring judgments (e.g., Lockl & Schneider, 2003; Metcalfe & Finn, 2013).

An important aspect that has so far been neglected in research on metacognition during studying is the effect of time pressure on metacognitive control, despite its ecological and theoretical relevance. First, as noted at the outset, both children and adults often find themselves under time pressure when studying new material. A better understanding of the effect of time pressure on metacognitive control during learning might therefore give us an important insight into children’s abilities to cope with such situations. Second, time pressure has been shown to create a hot cognitive situation that affects human behaviour in various domains, such as decision-making and risk taking (e.g., Edland & Svenson, 1993; Maule & Svenson, 1993; Ordonez & Benson, 1997; Suter & Hertwig, 2011), but also learning (e.g., Ackerman & Lauterman, 2012; Walczyk & Griffith-Ross, 2006). Moreover, empirical evidence has indicated that children who perform as well as adults in cold situations (cognitively less demanding and/or less stressful) nevertheless show developmental differences in hot cognitive situations (cognitively more demanding or stressful; e.g., Luna & Sweeney, 2004). Therefore, developmental differences in metacognitive control might become apparent in more demanding learning situations, such as learning under time pressure or learning difficult items, and examining such situations would lead to a more comprehensive understanding of the nature and development of metacognitive control. Comparing metacognitive control between untimed and timed conditions could thus complement our understanding of the development and refinement of the interplay between monitoring and control skills across childhood and adolescence.

In summary, considering the demanding cognitive situation that time pressure might create during learning, learners might allocate their study times more efficiently under time pressure. There have been no systematic studies that have specifically looked at the effect of time pressure on subjective self-regulation of study time (monitoring-based study time allocation), which forms the focus of the current study. Our study adds to and goes beyond the models in the literature by investigating metacognitive control and timing, an issue that has so far hardly been examined in a developmental context.

Deterioration and enhancement hypotheses

As indicated above, there is a limited number of studies in the literature on study time allocation, and it is hard to predict the possible developmental effects of time pressure on metacognitive control from the existing literature. Nevertheless, based on the broader metacognition literature, we can derive two hypotheses on the effect of time pressure on metacognitive control: the deterioration hypothesis and the enhancement hypothesis. The deterioration hypothesis is based on the consideration that the awareness of time constraints might distract learners from the task at hand, thereby reducing important cognitive capacities, such as working memory resources (Dunlosky & Thiede, 2004; Kellogg et al., 1999). There are indeed cognitive and neural processing similarities and relations between metacognitive monitoring and control on the one hand and executive functions on the other (for reviews see Demetriou et al., 2018; Lyons & Zelazo, 2011; Roebers, 2017). Considering that monitoring and control only seem to become linked with age, time pressure might have a more negative effect on those populations whose monitoring and control processes are still more loosely tied together. In other words, it is possible that time pressure has a negative effect on metacognitive self-regulation, particularly in children. Indeed, it has been shown that developmental differences become more apparent in hot cognitive situations on a variety of tasks (Best et al., 2009; Blakemore & Robbins, 2012; Figner et al., 2009; Luna & Sweeney, 2004), lending reason to assume that in a timed learning context a similar pattern might be observed with respect to monitoring-based control. That is, even though children might show metacognitive self-regulation similar to that of adults in untimed learning conditions, their skills might not be refined enough to withstand a more overwhelming learning situation under time pressure.
In contrast, the enhancement hypothesis suggests that slight time pressure improves people’s performance, because they do not engage in “failing courses of action” (Henderson et al., 2007, p. 81) and instead might even engage in a “more strategic and situation sensitive manner” (Son & Metcalfe, 2000, p. 218).

Examining metacognitive control in timed and untimed learning situations is, beyond that, informative for the question of when and how children and adults decide to stop learning. That is, when studying under untimed conditions, for example, it would not be efficient to invest an endless amount of time into studying something if no learning is taking place (cf. Metcalfe & Kornell, 2003; Murayama et al., 2016). Continuing to invest study time without any learning return would be, as Nelson and Leonesio (1988) argued, labouring in vain. From a developmental point of view, people are likely to develop a better sense of their own learning progress and learning rate and thereby become more efficient in metacognitive self-regulation with age (for a review, see Schneider, 2008, 2010b). This leads to the prediction that with age learners become increasingly skilled in assessing their rate of information uptake and in deciding whether or not to continue investing study time. However, considering that metacognitive abilities continue to improve, especially in the adolescent years (e.g., Koriat et al., 2014; Van der Stel & Veenman, 2010; Weil et al., 2013; see also Schneider, 1998), it would be crucial to include an adolescent sample for more conclusive results. Importantly, time pressure might mediate the efficiency of the study time allocated to learning, because learners might rely on different rules when deciding to stop investing study time into learning items in untimed and in timed conditions. It would, for example, not be efficient to invest the spare time you have under time pressure into memorizing something really difficult rather than several easier items if the ultimate goal of learning is to have memorized as many items as possible. Therefore, in untimed study conditions, learners might focus on the optimisation of their learning outcome.
Indeed, whilst learning, they could be mentally rehearsing whether they have really memorised the learning item and, in doing so, might end up investing more study time than needed and thereby might be labouring in vain. In the timed conditions, on the other hand, learners might rather focus on reaching a satisficing level of learning (cf. Simon, 1955) and might in fact study in a more efficient manner. Consequently, imposing time constraints would lead learners to study more efficiently. From a developmental point of view, it would be interesting to see what the effect of time pressure on children’s and adolescents’ efficiency would be. Would they benefit from the time constraint, as suggested by the enhancement hypothesis, or would it have a detrimental effect on their efficiency, as proposed by the deterioration hypothesis?

This study

In this developmental study we were interested in the effect of time pressure on two aspects of metacognitive control: (1) self-regulation of study time, that is, the extent to which learners allocate their study time in line with their subjective monitoring judgments, and (2) the efficiency of study time allocation, that is, the ratio of successfully recalled items and the mean time invested into learning these items. Note that the term self-regulation is operationalized as the allocation of study time guided by metacognitive monitoring. Thus, in the following paragraphs, we use the term “self-regulation” instead of “monitoring-based study time allocation”, as suggested in the literature cited above.

In line with the enhancement hypothesis, learners might act in a more strategic manner and therefore engage in a higher level of self-regulation in timed conditions, whilst the deterioration hypothesis would predict that the level of self-regulation would be lower in the timed than in the untimed condition. Similarly, we wanted to see whether the efficiency rate would be affected by time pressure. In line with the enhancement hypothesis, and if learners try to reach a satisfactory level of learning rather than aiming to optimise their learning outcome, we expect a higher efficiency rate in the timed condition. From a developmental point of view, we were interested to see whether adults, who are more experienced, benefit from time pressure and would act in a more self-regulated manner, whereas children aged 10, who perform well in untimed conditions (e.g., Dufresne & Kobasigawa, 1989; see also Waters & Schneider, 2010), might show problems in timed conditions and therefore have a lower level of self-regulation in the timed than in the untimed condition. Similarly, with respect to efficiency, we were interested whether participants of all age groups might benefit from time pressure and act in a more efficient manner, or whether there would be differential effects between the age groups. Because this is the first study to investigate study time allocation in adolescents, we were also interested to see whether participants generally become more self-regulated with age even in untimed conditions. Note that this is also the first systematic study investigating the relation between metacognitive control and time pressure in a developmental sample.

We adopted a standard allocation of study time paradigm (e.g., Lockl & Schneider, 2003; Metcalfe & Finn, 2013) in which learners had to study picture pairs of varying difficulty. Learners studied one set of picture pairs in an untimed condition and another set in a timed condition. We captured participants’ monitoring through judgments of learning (JoLs), that is, participants’ predictive judgments about the likelihood of remembering in the future what they had recently learned. As a measure of self-regulation, we calculated the relation between the study time and learners’ JoLs. Furthermore, as a measure of efficiency, we divided the sum of correctly recalled items by the study time invested into these items (e.g., Ackerman & Lauterman, 2012).
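
As an illustration only, the two measures described above could be computed per participant and condition roughly as follows. This is a sketch in Python under stated assumptions: the text does not specify here which coefficient expresses the relation between JoLs and study time, so the Goodman-Kruskal gamma correlation, which is widely used in metamemory research, is assumed; efficiency is taken as the number of recalled items divided by the total study time spent on those items; and all function names and data values are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Gamma = (C - D) / (C + D) over all item pairs; tied pairs are ignored.

    ASSUMPTION: gamma is used here only as one plausible measure of the
    JoL/study-time relation; the source text does not name the coefficient.
    """
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None  # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


def efficiency(recalled, study_times):
    """Number of correctly recalled items divided by the time spent on them.

    ASSUMPTION: the denominator is the total (not mean) study time invested
    in the recalled items.
    """
    time_on_recalled = sum(t for r, t in zip(recalled, study_times) if r)
    n_recalled = sum(1 for r in recalled if r)
    if time_on_recalled == 0:
        return None  # nothing recalled, or no time recorded
    return n_recalled / time_on_recalled


# Hypothetical data for one participant in one condition (five items).
jols = [5, 4, 2, 1, 3]                    # 5-point smiley scale
study_times = [1.0, 2.0, 4.0, 6.0, 3.0]   # seconds of self-paced study
recalled = [True, True, False, True, True]

gamma = goodman_kruskal_gamma(jols, study_times)
eff = efficiency(recalled, study_times)
print(gamma, eff)
```

In this made-up example, items with lower JoLs always received more study time, so every item pair is discordant and gamma takes its minimum value; a negative coefficient of this kind would indicate that study time was allocated preferentially to items judged as less well learned.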

Method

Participants

There were 183 participants in this study: 60 ten-year-olds (M = 10.32 years, SD = 0.20; 23 girls), 63 fourteen-year-olds (M = 14.41 years, SD = 0.14; 31 girls) and 60 adults (M = 25.52 years, SD = 6.77; 42 females). Eight participants did not provide their date of birth; however, the age group to which they belonged was recorded. Our sample size was based on the f effect sizes of two studies using similar paradigms (Dufresne & Kobasigawa, 1989; Paulus et al., 2014). A G*Power (Faul et al., 2007) sensitivity calculation based on these two studies, for a mixed ANOVA with an alpha of 0.05, a power of 0.80, and an assumed correlation of 0.50 among repeated measures, gave effect sizes of 0.19 and 0.21, respectively, for the between-subject factors. Calculations based on these effect sizes yielded sample sizes of 162 and 192.

All participants were native German speakers with heterogeneous socioeconomic backgrounds. Children’s families were contacted via letter. Parents received a compensation for travel expenses and children received a small present at the end of the study. The group of adults were university students who participated for course credit. Informed consent was obtained from participants or from the primary caregivers who accompanied the children to the study. The institutional ethics committee approved the study.

Materials and design

The experiment used a 3 × 2 × 2 × 2 mixed design with three age groups (10-year olds, 14-year olds and adults, between-subject), in two orders (untimed first or timed first, between subject), tested in two learning conditions (untimed and timed, within subject), and learning material at two levels of difficulty (easy and difficult, within subject). Some participants got the untimed learning block first (n = 89, 28 ten-year-olds, 31 fourteen-year-olds, and 30 adults) and some others got the timed learning block first (n = 94, 32 ten-year-olds, 32 fourteen-year-olds, and 30 adults). In the results section, the order effect was excluded, and the difficulty effect was excluded from the confirmatory analyses (see the data analyses section for further details).

The learning material consisted of 56 pictures, which were combined into 28 two-dimensional picture pairs for the study phase: 14 easy ones, in which the pictures had a strong association (e.g., king-throne, baker-cake), and 14 difficult ones, in which the pictures had a weak association (e.g., nut-horn, salad-candle). In each timing condition (see procedure section), participants were presented with 14 picture pairs (7 easy and 7 difficult ones). Pictures were selected from various internet resources. The classification as “easy” or “difficult” was based on the categorical association between picture or word pairs, following the relevant literature on paired-association tests (Dufresne & Kobasigawa, 1989; Tibon & Levy, 2014; Tversky, 1973). We used easy and difficult items to prompt some variance in the JoLs given by learners, which is necessary for the calculation of one of our main dependent variables (self-regulation). Similar to other studies (e.g., Lockl, 2013; Lockl & Schneider, 2003; Paulus et al., 2014), we used a 5-point smiley scale ranging from a frowning smiley on the left (indicating a low likelihood of remembering) to a smiling smiley on the right (indicating a high likelihood of remembering) to collect JoLs. Material was presented on a 15-inch MacBook Pro using the software PsychoPy (Peirce, 2009). A written transcript was used to note down participants’ answers in the recall phase.

Procedure

Participants sat at a table in front of a laptop and next to the experimenter. At the beginning of the experiment, participants were told that they would have to learn picture pairs so that in a subsequent test they would be able to name the picture on the right side (target) when only presented with the left picture (cue). In order to familiarise participants with the pictures, the 56 single pictures were first presented, item by item, in the middle of the screen. Participants were asked to name the pictures in one word and were corrected if they could not name the picture or gave an incorrect answer. The familiarisation was followed by two learning blocks, which were the two conditions participants were tested in: an untimed learning block in which they could study for an unlimited time, and a timed learning block in which they had ~ 29–30 s to study all the material for this block. Because of the program used to present the study items, there was a very slight difference between age groups in total presentation time in the timed condition (10-year-olds, M = 29.08, SE = 0.05; 14-year-olds, M = 29.17, SE = 0.05; adults, M = 28.95, SE = 0.14), even though all participants were tested with the same computer and in the same location. However, this difference was negligible, as the distribution and median of the total presentation time in the timed condition did not differ significantly among the three age groups (independent-samples median test, Mdn = 29.02, χ² = 0.10, p = 0.995). In each learning block participants were presented with 14 picture pairs (7 easy and 7 difficult ones), which were presented in random order in each of the phases of the learning blocks described below. The untimed and the timed learning block both followed the same procedure and were separated by a 10-min interval in which participants engaged in unrelated activities.
For different participants, these activities involved one of the following: participating in a short eye-tracking experiment (anticipating the actions of an agent in a very simple setting), pilot studies on grasping behaviour (end-state-comfort effect), or watching a movie if one of the first two tasks was not given. Both tasks are cognitively very undemanding for the age groups in this study; participants generally show a ceiling effect in these tasks and perform them with great ease. We defined the stimulus presentation time for the first learning block (1.5 s for each item), and also for the second learning block in the timed condition (overall ~ 29–30 s), based on pilot testing. Note that these ranges are compatible with other studies using paired-associate tasks (Tibon & Levy, 2014; Tversky, 1973).

Untimed learning condition

Participants were initially shown 14 picture pairs (7 easy, 7 difficult) one after the other in a randomised order, presented for 1.5 s each. Participants were instructed that they should try to memorise which pictures were presented together. After this initial fixed learning phase, a second phase followed in which participants were presented only with the left picture of the picture pairs they had just seen, with a 5-point smiley scale appearing underneath the picture. For each item, they were asked to make a JoL, i.e., to indicate how sure they were that they would remember and could name the second picture in approximately 5 min. After these JoLs, participants were told that they would now have time to re-study the picture pairs at their own pace for an unlimited time. They were asked to try to study the picture pairs so that they would remember all or as many picture pairs as possible. As is usual in this paradigm, the experimenter emphasised that a good performance would include having spent as little time as possible studying these pictures (e.g., Lockl & Schneider, 2004). In order to allow for a dynamic study experience, participants could move forward and backward in the list and were informed that if they arrived at the end of the list, it would automatically start from the beginning. During the presentation there was a progress bar with 14 blue dots, one for each picture pair. The dot that corresponded to the picture pair currently seen on the screen was highlighted in orange. The progress bar gave participants an overview of the number of items in the list. The picture pairs were presented in random order. After the presentation, participants had to complete a one-minute arithmetic exercise. They were then presented with the cue pictures (the left pictures) of all the picture pairs they had seen and were asked to name the target picture. The experimenter noted the answers down.

Timed learning condition

The timed learning block followed exactly the same procedure as the untimed learning block, with 14 randomised picture pairs (7 easy and 7 difficult) presented in an initial fixed presentation phase followed by a JoL phase, a study phase, a short distractor task and the final recall phase. The only difference from the untimed learning block was that in the study phase participants had a limited time of 30 s to study all the picture pairs. A countdown in the top right corner of the screen, starting at 30 s and counting down in full seconds, indicated how much time they still had left to study.

Data analysis

Our study was both hypothesis-driven and data-driven. It was hypothesis-driven in that we always kept in our statistical models the main effects for which we had specific hypotheses. It was data-driven in that we compared all possible models, including the interaction effects, and selected the best models based on model comparisons. Even though we did not have specific hypotheses on the effect of order (timed condition first or untimed condition first) or on the difficulty effect, we kept these effects in the models as control variables whenever possible. We report the results with the order effect in the Supplementary Material. The order effect and its interactions were removed from the results section of the main paper for the sake of parsimony, and the results are reported without the order effect and its interactions. Before testing our hypotheses on metacognitive control (self-regulation and efficiency), we checked the effect of our manipulations on judgments of learning, recall (percent), and study time allocation (mean study time). We calculated five different models for these dependent variables, following the process described below.

As the assumptions for General Linear Models (in particular mixed ANOVA) were violated in most cases (normal distribution of the dependent variable for each between-subject group, homogeneity of error variances, homogeneity of inter-correlations, skewed distribution, or normal distribution of errors), we used Generalized Linear Mixed Models (GLMMs; see Bolker et al., 2009; Stroup, 2013). Conducting a GLM while these assumptions are violated (especially with complex designs) might lead to Type II errors; we therefore adopted the GLMM approach. Note that GLMMs are very flexible and robust when the data are non-normally distributed. Beyond that, they allow the addition of random effects, such as random intercepts that control for individual differences (see Bolker et al., 2009; Stroup, 2013).

If the assumption of normality was violated (normal distribution of the data and the errors), for model selection we first compared three types of error probability distributions (Gaussian, Gamma, inverse Gaussian) with a log link function (see Lo & Andrews, 2015), and also with different covariance types, as the repeated-measures data mostly violated the assumption of homogeneity of inter-correlations. We used a robust estimator for the tests of fixed effects, as the homogeneity of variances assumption was violated in most of the cases. The models included all main factors and their interactions as fixed effects, and a random intercept based on subject number to control for variability in participants’ intercepts. In the error distribution selection phase, the best model was selected based on the Bayesian Information Criterion (BIC) score, given that the number of fixed and random effects was the same in each model and only the repeated covariance type and the error probability distribution differed (Raftery, 1995). However, having too many interaction terms may inflate models and increase both Type I and Type II errors in both General Linear Models and GLMMs (Murtaugh, 2014; Weakliem, 2016). Following recent discussions in the statistics literature, we used stepwise deletion of interaction terms, beginning with the most complex interaction term, based on both alpha and corrected Akaike Information Criterion (AICC) values (Harrison et al., 2018; Weakliem, 2016). We started deletion from the higher-order interaction with the highest alpha value, while keeping all the main effects across the different models. If the deletion decreased the AICC value, we kept the predictor independent of its alpha value (Lindsey & Jones, 1998; Weakliem, 2016). In the last step, we also deleted the effect of order and its interactions. Please see Supplementary Material 1 for the analysis including the order effect. All post hoc comparisons were sequential Bonferroni corrected.
In order to estimate effect sizes, we additionally calculated mixed ANOVAs (Gaussian distribution, identity link function, unstructured covariance matrix for repeated measures, model based covariances, and without random effects) as there is no agreed upon effect size measure for GLMMs, especially when the covariance structure of the repeated measures is changed from unstructured to other covariance types. Thus, there is no index of effect sizes for the respective tests in the results section (for GLMM models). We provided mixed ANOVA results with the partial eta-squared (ηp2) for the effect sizes (see supplementary material).
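
The information criteria used for model selection can be made concrete with a short sketch. The Python functions below implement the standard textbook definitions of BIC and of the small-sample corrected AIC from a model's log-likelihood. The log-likelihoods, parameter counts, and number of observations are made-up values chosen for illustration, and the exact quantities that SPSS reports for GLMMs may be computed differently (this is an assumption to check against the software documentation).

```python
import math


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return k * math.log(n) - 2.0 * log_lik


def aicc(log_lik, k, n):
    """Corrected Akaike Information Criterion.

    AIC = 2k - 2 ln(L); AICc adds the small-sample term 2k(k + 1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = 2.0 * k - 2.0 * log_lik
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


# HYPOTHETICAL numbers: a model with an interaction term versus the same
# model with that term deleted. None of these values come from the study.
n_obs = 732
full_model = {"log_lik": -1200.0, "k": 10}
reduced_model = {"log_lik": -1200.5, "k": 8}

full_aicc = aicc(full_model["log_lik"], full_model["k"], n_obs)
reduced_aicc = aicc(reduced_model["log_lik"], reduced_model["k"], n_obs)

# Lower values indicate a better trade-off between fit and complexity.
reduced_is_better = reduced_aicc < full_aicc
print(round(full_aicc, 3), round(reduced_aicc, 3), reduced_is_better)
```

Both criteria penalise additional parameters, but BIC's penalty grows with the number of observations, which is one reason it is often preferred when the candidate models differ only in distributional or covariance assumptions.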

Descriptive statistics and the figures in the text are based on actual (non-transformed) values. However, statistical calculations were based on transformed data. In GLMMs, the link-function transformation is different than log-transforming (or any other transformation of) the dependent variable. Instead, in the GLMMs described below, the model link function transformation can be described as follows: “Transformed and original scales are connected by a monotonic differentiable link function that allows back-transformation to the original metric by providing a one-to-one mapping between the range of fitted values produced by the linear predictor on the transformed metric and the range of observed values on the original metric” (Lo & Andrews, 2015). For further information about GLMM with a linear predictor, please see Lo and Andrews (2015).
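
The distinction between a log link and a log-transformed dependent variable can be illustrated numerically. With a log link, the model describes the logarithm of the expected value, so back-transforming recovers the arithmetic mean; log-transforming the data first and then back-transforming the average recovers the geometric mean instead. The following Python sketch uses hypothetical values chosen only to make the difference visible, and the intercept-only case is an assumed simplification.

```python
import math

# HYPOTHETICAL, strongly skewed observations (e.g., study times in seconds).
y = [1.0, 10.0, 100.0]

# Log link: the linear predictor equals log(E[y]). In an intercept-only
# Gaussian model the fitted mean is the arithmetic mean, so back-transforming
# the predictor returns that mean on the original scale.
arithmetic_mean = sum(y) / len(y)
linear_predictor = math.log(arithmetic_mean)
back_from_link = math.exp(linear_predictor)

# Log-transformed outcome: the model describes E[log(y)]. Back-transforming
# the mean of the logs gives the geometric mean, not the arithmetic mean.
mean_of_logs = sum(math.log(v) for v in y) / len(y)
back_from_transform = math.exp(mean_of_logs)

print(back_from_link, back_from_transform)
```

Because the two back-transformed values differ for skewed data, estimates from a log-link model can be read on the original metric, whereas estimates from a model fitted to log-transformed data cannot be read as arithmetic means.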

All the data was analysed, and the graphs were constructed with the statistical package SPSS-25 (IBM Corp., 2017). All the data, model comparisons, assumption checks, and analyses syntax can be found in the following Open Science Framework link: https://doi.org/10.17605/OSF.IO/UZBQK

Results

As described above, before testing our hypotheses related to metacognitive control, namely whether participants allocated their study time based on their judgments of learning (self-regulation) and whether they allocated their study time efficiently (efficiency), we checked the effect of our manipulations on the JoLs, recall, and study time allocation as exploratory/preliminary analyses. Note that the metacognitive control measures were metric values calculated from these variables. In this way, we were also able to compare our preliminary results with the literature (please see Table 1 for descriptive results).

Table 1 Means and standard deviations of the dependent variables for the three between-subject age groups, the two time conditions (timed and untimed), and the two levels of item difficulty (easy and difficult)

Exploratory/preliminary analyses: judgments of learning, recall, and study time allocation

Judgments of learning

In order to analyse whether the manipulation of item difficulty was successful, that is, whether easy and difficult items received different JoLs, we calculated a GLMM with JoLs as the dependent variable, age group (10-year-olds, 14-year-olds, adults) as the between-subject variable, and time condition (untimed and timed) and item difficulty (easy and difficult) as within-subject variables. Based on the BIC scores, the GLMM with a Gaussian distribution, a log link function, and a compound symmetry repeated covariance type was selected. Besides four main effects, the best-fit model did not include any interaction term. A random intercept was not included in the model, as the model did not converge with the random intercept.

There was a significant main effect of age group, F(1, 727) = 8.468, p < 0.001. Ten-year-olds’ JoLs were significantly lower than 14-year-olds’ JoLs, t(727) = -4.218, p < 0.001, and adults’ JoLs, t(727) = -2.998, p = 0.006. There was also a main effect of item difficulty, F(1, 727) = 534.136, p < 0.001, with easy items receiving higher JoLs than difficult items. See Table 1 for the descriptive results. These results demonstrate that judgments of learning increase with age. Nevertheless, all age groups gave lower judgments of learning to difficult items, which indicates that our objective item difficulty was in line with the subjective evaluations of difficulty.

Recall

We calculated a GLMM with the number of recalled items as the dependent variable, age group (10-year-olds, 14-year-olds, adults) as the between-subject variable, and time condition (untimed and timed) and item difficulty (easy and difficult) as within-subject variables. Based on the BIC scores, the GLMM with a Gaussian distribution, a log link function, and a diagonal repeated covariance type was selected.

There was a significant main effect of age group, F(1, 720) = 17.069, p < 0.001, with adults remembering significantly more items than 14-year-olds, t(720) = 2.881, p = 0.008, and 10-year-olds, t(720) = 5.293, p < 0.001. Moreover, 14-year-olds recalled significantly more items than 10-year-olds, t(720) = 2.121, p = 0.003. There was also a significant main effect of condition, F(1, 720) = 73.917, p < 0.001, with more items being recalled in the untimed condition than in the timed condition. The main effect of item difficulty was also significant, F(1, 720) = 152.745, p < 0.001, with easy items being recalled more often than difficult items.

The best-fit model also included four interaction terms: (1) Age Group X Condition, (2) Age Group X Difficulty, (3) Time Pressure X Difficulty, and (4) Age Group X Time Pressure X Difficulty. There was a significant interaction between age group and condition, F(2, 720) = 3.399, p = 0.034. In the timed condition, adults had a higher recall rate than 10-year-olds, t(720) = 4.454, p < 0.001, and 14-year-olds, t(720) = 2.783, p = 0.011, whereas the recall rate did not differ significantly between 10- and 14-year-olds, p = 0.112. In the untimed condition, by contrast, adults had a higher recall rate than 10-year-olds, t(720) = 3.860, p < 0.001, but the other pairwise comparisons were not significant (all ps > 0.086). There was also a significant interaction between age group and difficulty, F(2, 720) = 15.479, p < 0.001. For the easy items, adults recalled significantly more items than 10-year-olds, t(720) = 2.528, p = 0.035, but the other pairwise comparisons were not significant (all ps > 0.052). For the difficult items, adults recalled significantly more items than 10-year-olds, t(720) = 5.697, p < 0.001, and 14-year-olds, t(720) = 3.239, p = 0.003, but the difference between 10-year-olds and 14-year-olds was not significant, p = 0.053. There was a significant interaction between condition and difficulty, F(1, 720) = 43.386, p < 0.001, which mainly stemmed from the dramatic increase in the recall rate of difficult items in the untimed condition compared to the timed condition (26% increase), t(720) = 8.920, p < 0.001. For the easy items, recall performance also increased (7%) in the untimed condition compared to the timed condition, and this difference was also significant, t(720) = 5.625, p < 0.001.

Finally, there was also a significant three-way interaction between age group, condition and difficulty, F(2, 720) = 3.892, p = 0.021. To inspect this complex interaction, we split the data by condition and item difficulty, and compared the age groups in each combination: timed easy, timed difficult, untimed easy, and untimed difficult. In the timed condition with the easy items, none of the comparisons between age groups were significant (all ps > 0.132). In the timed condition with the difficult items, adults recalled significantly more items than 10-year-olds, t(720) = 4.994, p < 0.001, and 14-year-olds, t(720) = 3.242, p < 0.001. The difference between 10-year-olds and 14-year-olds was not significant, p = 0.133. In the untimed condition with the easy items, none of the comparisons between age groups were significant (all ps > 0.143). In the untimed condition with the difficult items, adults had higher recall rates than 10-year-olds, t(720) = 3.956, p < 0.001, and the other pairwise comparisons were not significant (ps > 0.112). See Fig. 1 and Table 1 for the descriptive results.

Fig. 1

Mean percentage of recalled items for the three age groups in the timed and untimed conditions and for easy and difficult items. Error bars represent the 95% CI

Not surprisingly, the overall recall rate increased from 10-year-olds to adults, and participants recalled more items in the untimed condition. The item difficulty results clearly showed that our difficulty manipulation, which was based on categorical similarity or difference between picture pairs, provides ‘objective’ difficulty in recall. Even in the timed condition, participants in all age groups recalled more easy items than difficult items. Careful examination of the interaction effects reveals that the results were mostly driven by the dramatic difference between adults and 10-year-olds in the most challenging situation: the timed condition with the difficult items.

Study time allocation: mean study time

In order to investigate the average study time that participants in the three age groups allocated to easy and difficult items, we calculated a GLMM. Mean study time was calculated by summing up the time spent on the easy or difficult items separately and then dividing these sums by the number of items that were studied (7 easy and 7 difficult). We conducted a GLMM with age group (10-year olds, 14-year olds, adults) as a between-subjects factor, condition (timed or untimed) and difficulty (easy and difficult) as within-subjects factors, and the mean study time as the dependent variable.
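As a minimal sketch of the averaging step described above, the following Python function computes the mean study time per difficulty level for one participant in one condition. The function name, the argument layout, and the example values are illustrative assumptions, not the authors' analysis code; study times are assumed to be recorded in seconds.

```python
from collections import defaultdict

def mean_study_time(study_times, difficulties):
    """Average study time per difficulty level for one participant and condition."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for seconds, level in zip(study_times, difficulties):
        totals[level] += seconds
        counts[level] += 1
    # Sum of study times divided by the number of items at each level.
    return {level: totals[level] / counts[level] for level in totals}

# Hypothetical data: 2 easy and 2 difficult items (the study used 7 of each).
example = mean_study_time(
    [2.0, 4.0, 10.0, 20.0],
    ["easy", "easy", "difficult", "difficult"],
)
```

With the hypothetical values above, the easy items average 3 s and the difficult items 15 s.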

Based on the BIC scores, the GLMM with a Gamma distribution and log-link function was selected. A diagonal repeated covariance type was used. The main effects of condition, F(1, 722) = 472.275, p < 0.001, and item difficulty, F(1, 722) = 321.829, p < 0.001, were significant, with more time invested into items in the untimed condition than in the timed condition, and less time allocated to easy items than to difficult items, respectively. The main effect of age group was not significant, p = 0.120. The best-fit model also included two-way interactions between age group and difficulty, F(2, 722) = 8.654, p < 0.001, and between condition and difficulty, F(2, 722) = 30.820, p < 0.001. Even though it was included in the best-fit model, the age group and condition interaction was not significant, F(2, 722) = 2.677, p = 0.069. To follow up the age group and difficulty interaction, we split up the age groups and compared item difficulties. Results demonstrated that with growing age, participants spent less time on easy items and more time on difficult items, all ps < 0.001. To follow up the condition and difficulty interaction, we split the conditions and compared the two item difficulties. The results indicated that the difference in time allocation between easy and difficult items was far larger in the untimed condition than in the timed condition, all ps < 0.001. Please see Table 1 and Fig. 2 for the descriptive results.

Fig. 2

Mean study time allocated to easy and difficult items by each age group and in the two timing conditions. Error bars represent the 95% CI

The mean study time allocation was higher in the untimed condition, and for difficult items. Moreover, age had a positive effect on strategic time allocation, that is, on allocating more time to difficult items than to easy items. As expected, the difference in time allocation between easy and difficult items was larger in the untimed condition, in favour of difficult items. In summary, beyond giving additional support for the objectivity of our difficulty manipulation, these results indicated that adults – compared to 10-year-olds – are better at allocating their study time between easy and difficult items in both timed and untimed conditions.

Confirmatory analyses: metacognitive control (self-regulation and efficiency)

As our hypotheses were based on metacognitive control (self-regulation and efficiency), this section includes our confirmatory analyses. Note that for the main analysis of self-regulation (based on gamma correlation values), we did not include the factor of item difficulty in the calculation. This was because, as mentioned above, easy and difficult items were introduced to generate variance in JoLs, which is a prerequisite for the calculation of a gamma correlation. Accordingly, the factor difficulty was also excluded from the final model on efficiency. However, please see the supplementary material for the efficiency analysis with difficulty.

Gamma correlations for self-regulation

In order to explore whether participants allocated their study time in line with their metacognitive monitoring, a gamma correlation between the JoLs and study time was calculated for each participant in both conditions (following Nelson, 1984) and used as the dependent variable. Gamma correlation coefficients between JoLs and study time are used in the literature as a self-regulation measure of metacognitive control (Koriat et al., 2006; see also Son & Metcalfe, 2000).

Gamma correlations are based on the relative difference between concordant and discordant pairs across items for each participant. In our case, the pairs consist of JoLs and the corresponding study times for each item (14 items in total). To calculate gamma, the JoL values are sorted in ascending order. For each item, the JoL and the corresponding study time value form a pair. For each particular pair, when the study time value is higher than the study time value of the previous pair, the pair is evaluated as concordant, and in the opposite case as discordant (for more information, see Koriat et al., 2006; Nelson, 1984; Son & Metcalfe, 2000). In this calculation, positive values indicate that high JoLs are related to longer study time, meaning low self-regulation, whereas negative values indicate that high JoLs are related to shorter study time, meaning high self-regulation. The values are depicted in Fig. 3. Note that there were 5 missing cases (correlation coefficients could not be computed when a participant systematically selected the same JoL value in a specific condition). One-sample t-tests showed that for all age groups and in both conditions, the gamma values were significantly different from zero (all ps ≤ 0.0005), and thus from chance, with large effect sizes (Cohen’s ds > -1.014). As the normality assumptions were not violated, a GLMM with a Gaussian distribution and identity-link function was used. We did not use a general linear model such as a mixed ANOVA, in order to stay consistent with the GLMM approach of the previous models. A diagonal repeated covariance type was selected. The model included age group (10-year olds, 14-year olds, adults) as the between-subjects variable, and time condition (untimed, timed) as the within-subjects variable.
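The concordant/discordant logic can be sketched in Python as follows. This is the standard Goodman-Kruskal form of gamma, (C − D) / (C + D), computed over all item pairs with ties ignored; it is offered as an illustrative assumption about the computation, not as the authors' code, and the function name and sample values are hypothetical. Returning None when every pair is tied corresponds to the missing cases noted above, in which a participant chose the same JoL for every item.

```python
from itertools import combinations

def goodman_kruskal_gamma(jols, study_times):
    """Return (C - D) / (C + D) over all item pairs, or None if undefined."""
    concordant = 0
    discordant = 0
    for (j1, t1), (j2, t2) in combinations(zip(jols, study_times), 2):
        product = (j1 - j2) * (t1 - t2)
        if product > 0:
            concordant += 1   # higher JoL goes with longer study time
        elif product < 0:
            discordant += 1   # higher JoL goes with shorter study time
        # Pairs tied on JoL or on study time are ignored.
    if concordant + discordant == 0:
        return None           # e.g. the same JoL was given to every item
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical participant: the higher the JoL, the shorter the study time.
well_regulated = goodman_kruskal_gamma([1, 2, 3, 4], [40.0, 30.0, 20.0, 10.0])
# Hypothetical participant who gave the same JoL to all items.
undefined = goodman_kruskal_gamma([3, 3, 3, 3], [40.0, 30.0, 20.0, 10.0])
```

In this sketch the first participant receives a gamma of -1, the strongest possible value in the direction the text labels high self-regulation, while the second yields no coefficient at all.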

Fig. 3

Gamma correlations between JoLs and study time for the three age groups. Error bars represent the 95% CI

Results yielded a main effect of age group, F(1, 357) = 16.253, p < 0.001, and of condition, F(1, 357) = 8.364, p = 0.004 (see Fig. 3). The best-fit model did not include the interaction effect between age group and condition. Gammas were stronger in the untimed condition than in the timed condition. Post-hoc tests showed that 10-year-olds had weaker self-regulation than 14-year-olds, t(357) = 3.167, p = 0.003, and adults, t(357) = 5.701, p < 0.001, and that 14-year olds had weaker self-regulation than adults, t(357) = 2.418, p = 0.013. Please see Table 1 and Fig. 3 for the descriptive results.

Fig. 4

Efficiency of learning in the untimed and timed conditions for the three age groups. Error bars represent the 95% CI

The self-regulation results, measured by the gamma correlations, demonstrated that even though the allocation of study time is already relatively well coupled to judgements of learning by the age of 10 (as indicated by the significantly negative gamma correlations in t-tests against a zero correlation in each age group), self-regulation performance nevertheless increased with age. Moreover, participants were better at regulating their study time in the untimed condition than in the timed condition.

Efficiency

As a measure of efficiency, we divided the sum of correctly recalled items by the mean study time invested into these items, following Ackerman and Lauterman (2012). In this way, we could see the rate of correctly recalled items per mean second. Thus, higher values of this rate indicate better efficiency. A GLMM with the rate of efficiency as the dependent variable, age group as a between-subjects factor, and condition (untimed, timed) as a within-subjects variable was calculated. In line with the confirmatory analysis of self-regulation, the effect of difficulty was also removed from the final models (see Supplementary Material for the full analyses with the effect of difficulty). Based on the BIC scores, the GLMM with a Gamma distribution, a log-link function, and a diagonal covariance type was selected. Four cases were not included in the model (0.5% of the data) as they were missing, which resulted from the zero-recall result of a 14-year-old child in a specific condition. Note that Gamma-log models do not accept zero or negative values.
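The efficiency rate can be sketched in a few lines of Python. The function names and the example numbers are hypothetical, and study times are assumed to be in seconds; the sketch only restates the ratio described above and flags the zero values that a Gamma model with a log link cannot handle.

```python
def efficiency(n_recalled, study_times):
    """Correctly recalled items divided by the mean study time per item."""
    mean_time = sum(study_times) / len(study_times)
    return n_recalled / mean_time

def usable_in_gamma_model(rate):
    """Gamma-distributed outcomes must be strictly positive."""
    return rate > 0

# Hypothetical case: 10 items recalled with a mean study time of 5 s.
rate = efficiency(10, [4.0, 6.0])
# Hypothetical zero-recall case, which yields a rate of 0.
zero_rate = efficiency(0, [4.0, 6.0])
```

Here the first case gives 2 recalled items per mean second of study, whereas the zero-recall case produces a rate of 0 and would have to be dropped before fitting a Gamma-log model.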

There was a significant main effect of age group, F(2, 722) = 10.147, p < 0.001. Adults had a significantly higher return rate than 10-year olds, t(722) = 4.055, p < 0.001, but not than 14-year olds, p = 0.120. However, 14-year olds had a significantly higher return rate than 10-year olds, t(722) = 2.847, p = 0.009. There was also a significant main effect of condition, F(1, 722) = 84.759, p < 0.001, with higher return rates in the timed condition than in the untimed condition. The best-fit model also included a nonsignificant interaction effect between age group and condition, F(2, 722) = 1.982, p = 0.139. Please see Table 1 and Fig. 4 for the descriptive results.

Beyond self-regulation, the efficiency results also indicated that metacognitive monitoring developed with age. More crucially, return rates were higher under slight time pressure than under no time pressure.

Discussion

In the current developmental study, we investigated the effect of time pressure on metacognitive self-regulation in study time allocation as well as on the efficiency with which learners allocated their study time. In order to reveal developmental differences, we assessed groups of 10-year olds, 14-year olds and adults using a standard study time allocation paradigm. Learners studied under time pressure or in an untimed condition. There are two major findings related to the aims of the study, and they will be discussed in the following paragraphs. The first major finding relates to metacognitively based self-regulation of study time. The results showed that a) the level of self-regulation of adults was overall higher than that of children and adolescents, and the self-regulation of 14-year olds was higher than that of 10-year olds, even though all age groups showed relatively well-coupled self-regulation, as indicated by self-regulation scores that differed significantly from a zero correlation in every group. More interestingly, b) the level of self-regulation was negatively affected by time pressure. The second major finding showed that, in line with the enhancement hypothesis, participants of all ages studied more efficiently under time pressure than in the untimed condition; moreover, in the timed condition adults studied even more efficiently than 10- and 14-year olds. Nevertheless, there was no interaction between age group and timing condition in efficiency, suggesting that time pressure affects efficiency in all age groups in a similar way.

We interpret our findings as evidence for the following conclusions. First, our findings indicate that even though self-regulation is indeed an established strategy in 10-year olds (considering that self-regulation scores in all age groups differed from a zero correlation), there still appears to be a developmental lag (based on the age comparisons), since overall self-regulation performance was higher in the older age groups, which is in line with previous studies (see Paulus et al., 2014). This result might be explained by the extent to which learners rely on experience-based cues for the control of study (for a review see Schneider, 2008). Second, our findings show higher efficiency of learning in the timed condition. This result might indicate that different stopping rules were at play in the untimed and in the timed learning condition (see Ackerman, 2014), and it can be claimed that the use of a heuristic-based stopping rule under time pressure increases with age (Koriat et al., 2009a, b). Overall, the results of our study suggest that there are still important developmental differences in metacognitively based control between age groups. We will expand on these major findings regarding self-regulation and efficiency in the following sections.

Metacognitive control: self-regulation

Time pressure and age

The first major finding of our study relates to the self-regulation component of metacognitive control (based on gamma correlations), which was negatively affected by time pressure. We found that the gamma correlation was stronger in the untimed condition. In line with the deterioration hypothesis, participants’ self-regulation was weaker under time pressure than in the no time pressure condition. Results also demonstrated that there was a significant effect of age group. Ten-year-old children, whose monitoring-control processes are still not as well coupled as those of older age groups (e.g., Koriat et al., 2014), have more difficulty in self-regulation than the other age groups, even though 10-year-olds are still able to regulate their responses to some degree. Comparable to the findings from the literature on objective item difficulty, it seems that allocating more study time to items perceived to be less likely to be remembered is an established approach by 14 years, and can be employed even in more challenging learning situations, i.e. under time pressure. Adults overall had a stronger gamma correlation in the current study, suggesting that they allocated their study time even more in line with their subjective monitoring judgments than children and adolescents. Thus, our findings extend prevailing views in the developmental literature by showing that, even though at 10 years of age children differentiate well in the study time allocated to items they perceive to be easy or more difficult to learn (Waters & Schneider, 2010), the interplay between monitoring and control processes continues to be refined across adolescence.

From a developmental perspective this finding leads to an interesting speculation on the factors that might influence the development of monitoring-based control. We can think of two mechanisms that might be at the root of this developmental effect, one of which relates mainly to monitoring processes and one which relates mainly to the ability to translate the outcome of one’s monitoring into adequate control. There are two crucial premises to the first explanation. First, the cue utilisation hypothesis suggests that there are different kinds of cues on which learners base their JoLs (Koriat, 1997). Intrinsic cues give rise to theory-based judgments about item-relatedness for example, whereas mnemonic cues, such as ease of learning or the ease of retrieval, are data-driven and give rise to an experiential feeling of the difficulty of a task (Koriat, 1997; Proust, 2013). Second, developmental empirical studies have shown that 6–7-year-old children already rely on internal cues (item relatedness) in their JoLs, but that the reliance on mnemonic cues (ease of learning or retrieval) is a later developmental achievement (e.g., Koriat et al., 2009b; Koriat & Shitzer-Reichert, 2002). Taken together, in the current study, it could be argued that two types of cues could have informed participants’ JoLs: internal cues (item-relatedness) and mnemonic cues (effort needed to study each item). Both types of cues would have led to a distinction between easy and difficult learning items in monitoring judgments (and also study time), albeit through slightly different processes: internal cues through theoretical inference that highly related items are easier to remember than difficult ones, and mnemonic cues signalling that more effort is required for difficult than for easy items.
However, compared to internal cues, mnemonic cues would have led to a more sensitive differentiation in JoLs and study time between items of normatively equal difficulty, for example between items within the set of difficult items. Therefore, one interpretation of the finding that adults overall engaged in higher self-regulation than children and adolescents is that they relied on different, more item-sensitive metacognitive cues for their metacognitive control.

We would also like to put forward a possible second explanation by referring to previous findings in the metacognitive literature that the ability to translate one’s monitoring into adequate control processes seems to be a later developmental achievement (e.g. Dufresne & Kobasigawa, 1989; Lockl & Schneider, 2003). Note that previous findings have shown that 9-year olds can already use experiential cues as a basis for their monitoring judgments (Koriat et al., 2009a). There is therefore no reason to assume that 10-year olds in the current study were not sensitive to such mnemonic cues. Instead, it could be that younger participants, like older ones, indeed relied on mnemonic cues in their monitoring but failed to translate this into appropriate control. Indeed, this would coincide with a recent study by Paulus et al. (2014), which showed that only at 14 years of age did learners apply the knowledge gained from first task experience in their predictions of others' learning success. Extending this finding to participants’ own regulation of learning, the current study may suggest that using experiential cues for subsequent application of metacognitive control is a later developmental achievement.

Metacognitive control: efficiency

Time pressure

The findings of the current study related to efficiency add to the existing literature by showing that time pressure can lead to higher efficiency of metacognitively controlled learning not only in adults but also in children and adolescents. In line with the enhancement hypothesis, it might be claimed that (mild) time pressure prevented participants from engaging in unnecessary learning effort, resulting in higher efficiency. Indeed, in the untimed condition, despite being instructed to study for as little time as possible, participants invested much more study time than in the timed condition, which ultimately led to a less efficient outcome. We might therefore say that in the untimed condition, participants ended up labouring in vain (cf. Nelson & Leonesio, 1988). In other words, the recall rate per mean second was lower in the untimed condition, which indicates that the additional time that participants invested in the untimed condition did not increase their efficiency rate.

As indicated above, participants from all age groups benefitted from the time constraints in terms of efficiency. We would like to put forward a possible explanation for this which relates to mechanisms investigated in the literature on heuristic-based decision-making (e.g., Epstein, 1994; Gigerenzer & Todd, 1999; Kahneman & Tversky, 1973). Here, a distinction has been made between decisions that rely on heuristics, i.e. simple and quick "rules of thumb", and controlled, more complex rational reasoning. Notably, it has been proposed that under time pressure people tend to rely on quick heuristics when making their decisions (Gigerenzer & Todd, 1999). The results of the current study raise the question of whether similar mechanisms might be at play with respect to cue-based metacognitive control in the timed condition. More specifically, in the untimed condition, participants might have been inclined to ensure the optimisation of their learning, thereby engaging in more elaborative monitoring and control of their progress, which resulted in the much higher amount of study time invested into learning overall (cf. Simon, 1955; Winter, 2000). In the timed condition, however, the stopping rule for learning each item might have been a different one, because there was simply not enough time to ensure that items had been optimally learned. Therefore, it might be that under time pressure, participants relied on a heuristic stopping rule in their allocation of study time, which signalled that they had achieved a satisficing level of learning and should move on to the next item (cf. Simon, 1955).

Age and efficiency

Notably, in the timed condition adults were more efficient than both 10- and 14-year olds, and 14-year olds in turn were more efficient than 10-year olds. This highlights that there are still developmental differences in metacognitive control between these age groups. If participants indeed applied a heuristic stopping rule for learning in the timed condition, then it is conceivable that with age, learners were more sensitive to such cues, and that through experience, their heuristic responses were better calibrated. Such an interpretation is in line with existing findings that point towards a developmental trend in the reliance on heuristics for decision-making (e.g., Davidson, 1995; Reyna & Ellis, 1994) and extends this theoretical approach to the study of metacognitive development.

Thus, it could be argued that our study is the first to point towards the use of heuristic-based decisions in the realm of metacognitive monitoring and control processes during learning, and moreover suggests that these are sensitive to developmental processes. Interestingly, even though there was no generally discernible developmental trend in the efficiency of learning between the age groups in the untimed condition, 14-year olds studied in a significantly more efficient manner than 10-year olds. In line with other recent studies (e.g., Koriat et al., 2014; Paulus et al., 2014; Weil et al., 2013), this pronounced effect in 14-year olds highlights that adolescence might be an important transitional stage for the development of the metacognitive coupling between monitoring and control abilities.

Conclusion, limitations, and future directions

To briefly sum up, the current study investigated the effect of time pressure on metacognitive self-regulation and efficiency of study time allocation during learning in children, adolescents and adults. It appears that during untimed study all participants engaged in more reflective metacognitive self-evaluations of their learning progress. Time pressure affected self-regulatory strategies in all three age groups, and the overall level of self-regulation was significantly higher in adults than in 10- and 14-year olds. Therefore, the current findings highlight that, albeit somewhat established in 10-year olds, the interplay between metacognitive monitoring and control processes in the self-regulation of study time becomes increasingly sophisticated and continues to be refined into adulthood. With respect to the efficiency of learning, developmental differences between the age groups became more apparent especially in the timed condition, in that adults were more efficient than both 10- and 14-year olds, and 14-year olds studied in a more efficient manner than 10-year olds.

Even though we found an overall positive effect of slight time pressure on efficiency in all age groups, the other component of metacognitive control (self-regulation) was negatively affected by slight time pressure. These results may have some implications for different learning settings at school or work. We may speculate that children older than 14 years and adults may benefit from slight time pressure when learning some materials, especially easy ones, in terms of efficiency. On the other hand, it should also be noted that 10-year-olds in particular have difficulty in regulating their study time in line with their metacognitive judgements, that is, in self-regulation.

There are also some theoretical implications. While the self-regulation results were in line with the deterioration hypothesis, the efficiency results supported the enhancement hypothesis. This theoretical implication may guide further studies on the subject. Even though these results are contradictory at the surface level, they might actually reflect different levels of metacognitive control in study time allocation. To ascertain the optimality of a given time allocation strategy, Son and Sethi (2006) differentiate in their metacognitive model for study time allocation between (1) the relation between time allocation and competence and (2) the learner’s objectives. While the efficiency measure might capture the relation between time investment (study time) and competence (e.g. recall), the self-regulation measure might reveal whether learners allocate their study time in line with their JoLs. Note that these two levels might have been affected by the learning conditions in divergent ways. In the timed condition, participants may not have had ‘enough time’ to regulate their study time in line with their JoLs. However, the time constraint in learning might have paid off by leading participants to a satisficing level of learning (cf. Simon, 1955), in other words, to studying in a more efficient manner.

While our results clearly demonstrate the positive role of slight time pressure on one component of metacognitive control during learning, namely efficiency, there are some limitations that should be addressed in future studies. First, our study focuses on metacognitive control while acquiring and recalling knowledge under time pressure or under no time pressure. However, results may differ for different types of tasks and for different degrees of time pressure. Thus, further research should investigate the effects of time pressure on different types of tasks and learning situations in greater detail. This is particularly relevant as time pressure is present in many current learning contexts, such as schools or workplaces. Second, metacognitive control under time pressure might have been affected by differences in time perception between age groups. Temporal sensitivity, especially towards short time ranges, develops over the years (Droit-Volet, 2013; Zélanti & Droit-Volet, 2011). However, one should also note that metacognitive control and time perception may be co-developing abilities. Moreover, considering the hot cognitive situation that time pressure may create, children’s metacognitive control performance could have been affected by their developing emotion regulation system. Future studies should investigate the possible developmental relations between metacognitive control, time perception, and emotion regulation. That being said, the 30-s time limit might have been far more challenging for the 10-year olds than for adolescents and adults. It is possible that 10-year-olds’ greater difficulty in metacognitive performance compared to the other groups was due to the limited time window in the timed condition. Nevertheless, the exploratory results showed no age group difference in study time allocation, whereas age differences emerged for JoLs, recall, and metacognitive control.
Thus, it could be claimed that not the speed of acting on the item but the further cognitive and metacognitive processing of information may have affected the performance. Future studies should also consider the effect of differing time windows on metacognitive control to further clarify the results and our claims.

Third, while interpreting our results for JoLs and self-regulation, we referred to the internal cues and mnemonic cues used during metacognitive processes. However, future studies should investigate how exactly cues are used in different age groups. This may be done by adopting an item-based study design and manipulating the type of cues used in the stimuli. Similarly, it should also be noted that the difficulty of an item depends on the difficulty of the other items. Participants may apply different metacognitive strategies based on the relative difficulty of different items, rather than merely making the “easy” versus “difficult” distinction that we used for classification. Again, an item-based approach considering the ordinal difficulty level of each item, or a more comprehensive study including comparisons based on various item difficulty levels (e.g., easy-mild, easy-difficult, mild-difficult), might help at this point. Finally, the relatively loosely coupled relation between metacognitive monitoring and control processes in 10-year-old children may be explained by the mediating role of working memory (Touron et al., 2010), and by the effect that the hot cognitive situation created by time pressure might have had on children’s working memory. Extensive future research is needed to understand the causal relations between metacognitive monitoring/control, executive function (and working memory), and emotion regulation in hot cognitive situations. Considering that our adult group consisted of university students (a selective group), who probably have better executive functions and emotion regulation than their peers, future research should also include wider groups of adults. Taken together, the current study explored metacognitive control in timed and untimed learning conditions and points to developmental changes in metacognitive control.