6.1 Introduction

Understanding changes in educational achievement over time is an essential goal for most stakeholders within the field of education. Countries taking part in recurring international large-scale assessments, such as IEA's Trends in International Mathematics and Science Study (TIMSS), are provided not only with reliable and comparable measures of achievement trends in mathematics and science but also with a vast amount of contextual data, which are valuable for further analysis. Educational circumstances and potential causes behind changes in educational outcomes are at the core of interest for policymakers, practitioners, researchers, students, parents, and the public. It is often difficult to provide evidence of the reasons behind changes in achievement, not only because measures of knowledge, such as the TIMSS achievement scores in mathematics, are complex, in the sense that many factors determine student performance on these tests, but also because the design of these studies is cross-sectional, which makes it hard to determine what is cause and what is effect, given that a cause must precede its effect. However, there are statistical techniques and methodologies with which one can investigate theories of plausible causes despite the cross-sectional design (see Gustafsson & Nilsen, 2022, for an overview and examples). Although each cycle of TIMSS is cross-sectional for the respondents, the study is longitudinal at the country level. This opens possibilities to investigate factors that may have contributed to the observed changes in achievement.

In this chapter, we focus on the classroom level and investigate changes in students' Opportunity to Learn (OTL) with respect to the core aspect of OTL at the implemented level of the curriculum: content coverage. We investigate whether changes in content coverage may account for any of the changes in grade four mathematics and science achievement over the three TIMSS cycles 2011, 2015, and 2019 in the Nordic countries. The concept of content coverage refers to whether the content assessed by the TIMSS achievement tests in mathematics and science had been taught to the class before the tests were administered, as reported by the teachers of the assessed students.

6.2 Theoretical Framework

In this chapter we address OTL at the implemented level of the curriculum, i.e., the classroom level, to find out whether students have been taught content similar to that represented in the TIMSS tests. The question relates to the concept of OTL, a long-standing multidimensional educational construct with roots in curricular theory (Dahllöf, 1970; Husén & Dahllöf, 1965) and in research on learning and instruction (Carroll, 1963), which was further developed within the framework of educational effectiveness research (e.g., Scheerens, 2016). The curricular strand of OTL originated with IEA (International Association for the Evaluation of Educational Achievement) in the 1960s, when the research design for the first international large-scale assessments of mathematics and science achievement was developed (Husén, 1967; Pelgrum, 1989). Students' OTL is one of those multidimensional factors that hold explanatory power at both the school and system levels (the OTL construct and its relation to different curricular levels are explained in Chap. 4).

Measuring students' OTL is challenging, and its measures have varied over the years (see, e.g., McDonnell, 1995; Schmidt et al., 2009, for a brief history of the construct's origin and development). Nevertheless, TIMSS includes OTL measures that aim to indicate whether the students have had an opportunity to learn the topics and cognitive tasks included in the achievement test.

The aim of measuring OTL is to capture what opportunities have been offered in the classroom to learn the tasks and content of the test; the measure is thus primarily aligned with the implemented curriculum. In the classroom, three teaching factors moderate students' opportunities to perform on tests of knowledge and skills: content coverage, time spent on the content or task, and quality of teaching. Content coverage refers to whether, and to what extent, the tasks and content of the test have been taught to the student. Time on content refers to how much time has been spent learning the content or tasks. Time spent on the task also includes timing, which refers to the gap between when the content was taught and when the test was administered. Quality of teaching refers to the teacher's competence to teach the content and to provide a good learning environment for the students. Of these three OTL factors, content coverage is the first and a necessary condition for enabling students to achieve on the test (Schmidt et al., 2010). Time on the task or content and quality of teaching come second and are relevant moderating factors in determining the level of learning.

These days, the OTL construct is often used in a much broader sense, including factors related to students’ backgrounds and the learning conditions outside of school. For IEA, the measure of OTL was, and still is, primarily meant to capture curricular differences when studying achievement differences across school systems. Indicators of OTL are included in all IEA studies and observed at different curricular levels. Chapter 4 offers a comprehensive presentation of the links between different curricular levels and the OTL construct, and a presentation of the indicators used in TIMSS.

In this chapter, the investigation of OTL factors' relationship with achievement and changes in achievement is limited to content coverage at the classroom level, that is, whether the students have been taught the topics included in the TIMSS assessment of student achievement. Differences in content coverage across classrooms and over time may account for some of the differences and changes in achievement.

6.2.1 Content Coverage in Grade Four Mathematics

Relative to other predictors of achievement, there are at present fewer studies on the relationships between achievement on large-scale international assessments and content coverage in the classroom. One study of relevance is Scheerens (2016), in which the relationship between OTL and achievement was investigated in both TIMSS 2011 (grade four and grade eight) and PISA (15-year-olds). Scheerens conducted a series of regression analyses to assess the effect of OTL on mathematics and science achievement while controlling for the "books at home" variable as a proxy for students' home background. The many content coverage items from the teacher questionnaire in TIMSS were combined first into domain indices (three in science and three in mathematics), all on a scale from 0 to 100, and then averaged into a single mathematics OTL index and a single science OTL index. The findings for grade four were surprising. The results showed that content coverage in mathematics was significantly related to mathematics achievement in only 12 of the 23 included countries, and the average effect was a modest 0.074. In science, there were virtually no effects of the science OTL index on achievement. Scheerens concluded that there are methodological challenges attached to the OTL indicators and called for a closer examination of the validity of the OTL measures in TIMSS.

6.3 Research Question

The overarching aim of the present study is to investigate whether changes in content coverage are related to the changes in mathematics and science achievement from 2011 to 2019 in Sweden, Norway, Finland, and Denmark. We address this aim more specifically through the following research questions:

  1. Is there a positive correlation between the OTL measures of content coverage and achievement?

  2. Has the amount of content coverage changed from 2011 to 2019?

  3. Are the possible changes in content coverage related to changes in grade four mathematics and science achievement from 2011 to 2019?

  4. Are there notable differences between Denmark, Finland, Norway, and Sweden?

6.4 Methodology

6.4.1 Data and Sample

In the present study, we use data from the 2011, 2015, and 2019 TIMSS cycles. We include data from the teacher questionnaires and the results from the students’ achievement tests in mathematics and science. Included are the fourth graders and their teachers from the Nordic countries that participated in these cycles: Sweden, Norway, Finland, and Denmark. In Norway, however, the target population changed in 2015 from fourth to fifth grade, so our sample includes Norwegian fourth graders in 2011 and fifth graders in 2015 and 2019. Findings from Norway hence need to be interpreted with caution. Part of the explanation for the large increase in achievement in Norway from 2011 to 2019 is that students are both older and have one more year of schooling (Olsen & Bjørnsson, 2018). For further descriptions of TIMSS data, including sampling, plausible values, and weights, please see Chap. 3.

6.4.2 Measures

In the present study, we use a set of derived variables available in the TIMSS database, which are based on teachers' responses regarding whether and when they covered the content within each of the three content domains for mathematics (number, geometry, and data) and the three domains for science (life science, physical science, and earth science). Within each domain there are a number of topics, and for each topic the teacher is asked to select one of the following three response options: "mostly taught before this year", "mostly taught this year", or "not yet taught or just introduced". For example, within the domain number, one topic (of many) is: "adding, subtracting, multiplying, and dividing with whole numbers". It should be noted that these content coverage questions do not include any information about the extent to which the topics have been taught. The topic items were then combined to form indices, one for each content domain in mathematics and science, respectively.

The description of how TIMSS constructed the indices is available in the TIMSS 2019 User Guide for the International Database (Fishbein et al., 2021). In short, the three response options were recoded into two categories indicating whether or not a topic had been taught. The questionnaire items were then combined into indices, one for each content domain. The indices thus reflect the percentage of topics within each content domain (three in mathematics and three in science) that had been taught to the students, as illustrated in the sketch below.
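For readers who wish to see how such an index can be reproduced outside the TIMSS database, the recoding can be sketched as follows. This is a minimal illustration only; the column names and the example data are hypothetical, and the official derived variables in the international database should be used in actual analyses.

    import pandas as pd

    # Hypothetical teacher-level data with one column per topic item in the
    # "number" content domain. Response codes follow the teacher questionnaire:
    # 1 = mostly taught before this year, 2 = mostly taught this year,
    # 3 = not yet taught or just introduced.
    df = pd.DataFrame({
        "num_topic_1": [1, 2, 3],
        "num_topic_2": [2, 2, 1],
        "num_topic_3": [3, 1, 2],
    })
    topic_items = ["num_topic_1", "num_topic_2", "num_topic_3"]

    # Recode the three options into two categories: taught (1, 2) vs. not yet taught (3)
    taught = df[topic_items].replace({1: 1, 2: 1, 3: 0})

    # The index is the percentage of topics in the domain reported as taught (0-100)
    df["otl_number"] = taught.mean(axis=1) * 100
    print(df["otl_number"])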

In mathematics, the three indices represent percentage of number topics taught, percentage of measurement and geometry topics taught, and percentage of data topics taught. For simplicity, in this study, we refer to these content coverage indices as OTL in number, OTL in geometry, and OTL in data, respectively.

Similarly, for science, the three content coverage indices in the TIMSS database represent the percentage of life science topics taught, percentage of physical science topics taught, and percentage of earth science topics taught, and we refer to these indices as OTL in life science, OTL in physical science, and OTL in earth science.

Descriptive information for the OTL variables in mathematics, for the three TIMSS cycles studied and for all four countries analyzed, is presented in the bar graphs below (Fig. 6.1). Additional descriptive statistics for these OTL mathematics indices are presented in Chap. 4, along with those for science. The descriptive information about the OTL variables is taken from the TIMSS 2019 International Results in Mathematics and Science (Mullis et al., 2020).

Fig. 6.1

Bar graphs showing the average proportion of topics in the TIMSS mathematics test for grade four that had been taught to the students at school by the time of testing. Note: One bar per country and cycle for each content domain. Data derived from the teacher questionnaire

The graphs show that the average proportion of topics covered in each mathematics content domain is relatively high, but also that it changes across time in all countries. The graphs also show that the countries differ in the pattern of change.

6.4.3 The Analytical Approach

The method of analysis in this study resembles that of longitudinal growth models (Murnane & Willett, 2010) but is adapted to trend analyses. Such causal methods enhance the robustness of inferences (Gustafsson, 2010).

Causality and causal language. To investigate the relationship between predictors and outcomes in ILSAs, most studies utilize data from one cycle only (Scherer, 2022). Some include two or more cycles, conduct the analyses separately for each cycle, and compare results across time. The present study merged the data from three cycles (2011, 2015, and 2019) and used a causal method that exploits the trend design of TIMSS (see Chap. 2 for details on this design). This approach is considerably more robust and enhances the plausibility of causal inferences, as well as the reliability and validity of inferences.

The analytical approach. The approach in the present chapter is the same as in Chap. 7: a structural equation model (SEM) with mediation. We used the software Mplus 8 for the analyses (Muthén & Muthén, 2017). Student and teacher data were merged using the IEA International Database Analyzer (IDB Analyzer), and a variable called Time was added to each of the three datasets, coded 0 for 2011, 1 for 2015, and 2 for 2019. The data for the three cycles were then merged. In Mplus, a null model estimates changes in achievement over time. In Fig. 6.2, c is the regression coefficient for the relation between time and achievement and describes the slope of the effect of time on achievement. The c-coefficient represents the average change in achievement for a one-unit change in time. Thus, to obtain the average change in achievement between 2011 and 2019, the c-coefficient should be multiplied by two.

Fig. 6.2

The null model
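To make the null model concrete, the sketch below shows how the merged file and the Time variable could be set up, and how the c-coefficient could be estimated with a simple weighted regression. This is a simplified illustration under assumed file and variable names (a single plausible value "math" and the total student weight "totwgt"); the actual analyses were run in Mplus using all plausible values.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical per-cycle files in which student and teacher data have
    # already been merged (in practice via the IEA IDB Analyzer)
    cycle_files = {2011: "timss2011.csv", 2015: "timss2015.csv", 2019: "timss2019.csv"}

    frames = []
    for year, path in cycle_files.items():
        d = pd.read_csv(path)
        d["time"] = {2011: 0, 2015: 1, 2019: 2}[year]  # Time coded 0, 1, 2
        frames.append(d)
    df = pd.concat(frames, ignore_index=True)

    # Null model: achievement regressed on time; the coefficient c is the average
    # change per cycle, so the total 2011-2019 change is 2 * c
    null_model = smf.wls("math ~ time", data=df, weights=df["totwgt"]).fit()
    c = null_model.params["time"]
    print(f"average change per cycle: {c:.1f}; total 2011-2019 change: {2 * c:.1f}")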

We hypothesize that other factors besides the passing of time may "explain" changes in achievement. More specifically, we hypothesize that changes in OTL may account for some of the changes in achievement. Another (more technical) way of phrasing this is that we hypothesize that OTL may, to some degree, mediate the effect of time on achievement, as illustrated in Fig. 6.3.

Fig. 6.3

Hypothesized mediation model in which OTL mediates the effect of time on achievement

If OTL mediates the effect of time on achievement, this may mean that changes in OTL are related to changes in achievement. It may, in turn, indicate that changes in OTL, which in our analysis refer to changes in content coverage in the classrooms, explain changes in achievement over time.

If OTL has improved over time, the regression coefficient a is positive. If OTL is positively related to achievement, the regression coefficient b is positive. However, we still needed to test whether the mediation is significant, which was done through the MODEL INDIRECT command in Mplus. This estimate of the indirect effect is the main focus of the present study, as it tells us whether, and to what extent, OTL may mediate (or "explain") changes in achievement.
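The logic of the indirect effect can be illustrated with a stripped-down regression version of the mediation model, continuing from the merged data frame in the sketch above. The indirect effect is the product of the a-path (time on OTL) and the b-path (OTL on achievement, controlling for time); this product is the quantity tested by MODEL INDIRECT in Mplus. The variable names ("otl_number", "math", "totwgt") are again hypothetical, and the sketch ignores plausible values and clustering.

    # a-path: has OTL changed over time?
    a_path = smf.wls("otl_number ~ time", data=df, weights=df["totwgt"]).fit()

    # b-path and direct effect: is OTL related to achievement, controlling for time?
    b_path = smf.wls("math ~ time + otl_number", data=df, weights=df["totwgt"]).fit()

    a = a_path.params["time"]          # change in OTL per cycle
    b = b_path.params["otl_number"]    # effect of OTL on achievement
    c_prime = b_path.params["time"]    # direct effect of time on achievement

    indirect = a * b                   # indirect effect of time via OTL, per cycle
    print(f"indirect effect per cycle: {indirect:.2f}; direct effect: {c_prime:.2f}")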

This analysis aims to explain changes in student achievement over time, not differences in achievement between classrooms. Therefore, the analyses were done at the student level. However, to take into account the hierarchical design of the data, in which students are nested within classes that are nested within schools, and to avoid under-estimation of standard errors, we used the Mplus option "TYPE = COMPLEX" with the data clustered at the class level. This way, the analyses take the hierarchical clustering of students and the between-classroom variation into account.
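An analogous adjustment in the regression sketch above would be cluster-robust standard errors at the class level. This is not the Mplus TYPE = COMPLEX implementation itself, only a comparable correction, and "idclass" is a hypothetical classroom identifier.

    # Re-fit the b-path with standard errors that are robust to the clustering of
    # students within classrooms (comparable in spirit to TYPE = COMPLEX in Mplus)
    b_path_clustered = smf.wls(
        "math ~ time + otl_number", data=df, weights=df["totwgt"]
    ).fit(cov_type="cluster", cov_kwds={"groups": df["idclass"]})

    print(b_path_clustered.bse)  # cluster-adjusted standard errors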

6.5 Results

6.5.1 The Base Model—Changes in Achievement Over Time

We started by examining the effect of time on achievement. The results reflect the slope of the relation between time and achievement. A positive regression coefficient means that achievement increased over time. A large positive coefficient would reflect a large increase. The regression coefficients reflect the mean change in achievement across the three cycles and are provided in Table 6.1.

Table 6.1 Regression coefficients for the effect of time on achievement in mathematics and science in grade four

The findings show an average increase in science and mathematics achievement from 2011 to 2019 for Norway and Sweden. The numbers represent changes in score points on the test from one time point to the next. The total change in achievement from 2011 to 2019 is thus two times the regression coefficient, which for Sweden equals 10.6 score points in science and 22.6 score points in mathematics. For Norway, a substantial part of the large increase (2 × 22.6 = 45.2 score points in mathematics and 2 × 22.1 = 44.2 score points in science) is explained by the shift of the target grade in Norway from grade four to grade five in 2015 (Olsen & Bjørnsson, 2018). For Finland and Denmark, achievement declined in both subject domains. The regression model assumes a linear relationship between time and achievement, that is, that the changes are evenly distributed across study cycles, which is not necessarily the case, as can be seen from the descriptive statistics for each time point. Instead, these model estimates represent the average across the three cycles of TIMSS analyzed in this study. For further details on changes in achievement, see Chap. 1.

6.5.2 Relations Between Changes in OTL and Changes in Mathematics Achievements

Interpretation of results. For the analyses of the percentage of topics covered within each content domain in mathematics, we first analyzed each domain separately; these results are shown in Table 6.2. The table presents the results as text and symbols (see Appendix 1 for all estimates). To interpret the results, one must remember that previous research and our hypothesis predict that OTL should be positively related to achievement, and that if OTL increases over time, it should cause increased achievement, while if OTL decreases over time, it should cause a decline in achievement.

Table 6.2 How changes in the proportion of students having received instruction in the topics (number, geometry, and data) are related to changes in mathematics achievement over time

However, the four Nordic countries analyzed in this study have different achievement profiles over time (see Chap. 1). In Norway and Sweden, achievement increased from 2011 to 2019, while in Denmark and Finland, achievement decreased. Here, we use Finland as an example to illustrate how to interpret the results. First, one must recall that Finland had declining student achievement from 2011 to 2019. If OTL is positively related to achievement, if OTL increases over time in Finland, and if the indirect effect is significant and positive, this means that the increase in OTL probably prevented a further decline in Finland's negative achievement trend. Hence, an increase in OTL over time could prevent further achievement declines for countries with a negative achievement trend. For countries with positive achievement trends, an increase in OTL may explain part of this increase. At the same time, a decrease in OTL may have prevented an additional increase in the already positive achievement trend.

Results. We first consider the relations between the different OTL measures and mathematics achievement, provided in the first column of Table 6.2 ("what is the effect of OTL on mathematics achievement?"). There were no significant relations between any of the OTL variables and mathematics achievement for Sweden, but small, positive, and significant effects for all three OTL measures in the other countries. Small effects here refer to effect sizes below 0.2. In other words, there is a clear pattern of positive associations between OTL in mathematics and achievement.

OTL in number increased from 2011 to 2019 in all four Nordic countries. This means that teachers reported higher percentages of students having covered the topics in the content domain number in 2019 compared to 2011. In Norway, this increase in OTL for the content domain number accounts for about 6.5 points of the 45-point increase in achievement from 2011 to 2019. The remaining roughly 38 points of the increase are explained by other factors (e.g., the change in target grade from grade four to five and the age of the students). Achievement in both Finland and Denmark decreased from 2011 to 2019; however, had it not been for the increase in OTL for the content domain number, achievement could have decreased by another two points. In Sweden, OTL for the content domain number was not significantly related to achievement and thus could not explain any part of the increase in achievement in Sweden.

OTL in geometry increased from 2011 to 2019 in Sweden and Finland but decreased in Norway. There was no significant change in Denmark. OTL in geometry was related to changes in achievement only in Norway and Finland. In Norway, the decrease in OTL in geometry hindered a further increase in achievement. The indirect effect was about minus 3.3 points, meaning that Norway's achievement could have increased by an additional 3 points or so, had it not been for this decrease in OTL. In Finland, the opposite was the case: mathematics achievement decreased by 12 points from 2011 to 2019, but had it not been for the increase in OTL in geometry, the decline could have been almost 14 points.

OTL in data did not change significantly in Denmark, but the level decreased in Sweden and Finland and increased in Norway. The increase in OTL in data in Norway explained two of the 45 points of increased achievement over time (and/or from grade four to grade five). In Finland, the indirect effect was about minus two points, meaning that had it not been for the decrease in OTL in data over time, achievement in Finland could have declined by only 10 points rather than 12 points.

The results from the analyses of the mediation model, in which all three OTL variables were included as mediators simultaneously, are provided in Table 6.3. The relations between OTL and achievement and the changes in OTL over time do not differ from the results provided in Table 6.2. Hence, only the indirect effects for each OTL variable and the total indirect effect are provided. The total indirect effect reflects the sum, or total contribution, of the three OTL variables when they are controlled for each other.
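In standard mediation notation, with time as the predictor and the three OTL variables as mediators, this total indirect effect can be written as a worked equation (the symbols follow the paths in Fig. 6.3 and are our notation, not labels used in the TIMSS database):

    total indirect effect = a_1*b_1 + a_2*b_2 + a_3*b_3
    total effect c = c' + a_1*b_1 + a_2*b_2 + a_3*b_3

where a_k is the effect of time on OTL variable k, b_k is the effect of OTL variable k on achievement (controlling for time and the other mediators), c' is the direct effect of time, and c is the total effect of time on achievement.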

Table 6.3 Indirect effects and the total indirect effect for the model where all three OTL variables in mathematics were included simultaneously in one model

Similar to the results provided in Table 6.2, there were no significant effects for Sweden. For Norway, the total indirect effect of all three OTL variables was about four points, meaning that four of the 45 points of increase in achievement may be explained by changes in OTL. Note that the increase in achievement in Norway may stem from the passing of time and/or the change in grade (from grade four to five). However, while OTL in number and data contributed positively, OTL in geometry contributed negatively. The total indirect effect for Finland was about one point, which reflects a positive contribution of OTL: had it not been for the increases in OTL in geometry and number, achievement could have declined by another two points. In Finland, there was no significant contribution from OTL in data. In Denmark, only OTL in number contributed significantly to the total indirect effect. As in Finland, Denmark's decline in achievement could have been worse had it not been for the increased OTL in number.

6.5.3 Relations Between Changes in OTL and Changes in Science Achievements

For science, none of the relations between the three OTL variables (OTL in life science, OTL in physical science, and OTL in earth science) and achievement were significant. This may be because the proportion of missing data for these OTL variables was relatively high in all countries but Finland, so the results may not be trusted. Appendix 2 presents the proportion of teachers/classrooms missing information on the OTL variables for all countries in each of the analyzed TIMSS cycles.

Due to the extent of missing data on the OTL variables, we do not include a table like Table 6.2 for science achievement. The mediation model indicated that neither the indirect effects of the individual OTL variables nor the total indirect effect of the three OTL variables combined were significant for any country (see Appendix 3). Regarding significant changes, there was a large decline in OTL in earth science over time and a slight decline in life science in Norway. The Swedish data also indicated a significant decline in OTL in earth science. There were no significant changes over time in Finland or Denmark. However, whether these results can be generalized is doubtful, as the level of missing responses to the content coverage questions in the teacher questionnaire was high in Denmark, Norway, and Sweden.

6.6 Summary and Discussion

This study aimed to investigate whether changes in students' opportunities to learn the content and tasks included in the TIMSS tests can explain any of the changes in achievement. The first question was whether the OTL measures of content coverage were positively related to achievement. In mathematics, the content coverage items are combined into three OTL scales for the three mathematical content domains in TIMSS: one for number, one for geometry and measures, and one for data. The items in each scale were intended to indicate whether the topics within the domain had been taught to the students. All three OTL scales in mathematics were positively related to mathematics achievement in all four countries. The effect was small but statistically significant in all countries except Sweden, where the effect was too small to be significant.

The OTL scales in science were constructed in a similar way to those in mathematics, one for each content domain in the TIMSS science test: one for life science, one for physical science, and one for earth science. However, the proportion of missing data on these scales was too high in Norway, Sweden, and Denmark for any reliable interpretation of the analyses of relationships. Finland had an acceptable level of missing data, but since no reliable comparisons could be made with the other Nordic countries, we chose not to discuss those results.

The second question was whether there had been any change in the OTL measures between 2011 and 2019. An increase was found in OTL in the number domain in all countries. An increase in OTL in geometry was found in Sweden and Finland, while Norway showed a decrease and Denmark showed no change. OTL in the data domain decreased in Sweden and Finland but increased in Norway; there was no change in Denmark. Finland and Denmark reported higher levels of content coverage in the number content domain than Norway and Sweden. Finland also reported higher levels of content coverage in data, but apart from this, the differences in content coverage between the Nordic countries were small. The overall picture indicates a medium-to-high level of OTL in all three mathematics content areas for all Nordic countries.

The last question was whether the changes in content coverage explain any of the changes in achievement. The analysis showed small but significant contributions of all three OTL variables in Norway and Finland. In Denmark, only the change in OTL in the number domain contributed to the change in achievement, whilst in Sweden, none of the changes in OTL (mathematics domains) were related to achievement.

We can conclude that our results align with previous findings in mathematics: more content coverage is positively associated with higher achievement in all countries (Scheerens, 2016). Furthermore, an increase in content coverage over time is associated with increased achievement (or a smaller decline), while a decrease in students' learning opportunities from 2011 to 2019 is associated with decreased achievement (or a smaller increase in achievement).

It must be pointed out that although most findings were significant, the effects are small. There may be several reasons for this. One is that the average level of content coverage is high in all countries. Moreover, the changes in content coverage appear small and mostly in the positive direction. Low levels of variation lead to small effects. For policymakers and teachers, small effects of content coverage are desirable, as they signal that the assessed topics have been addressed in most classrooms.

Another reason for the small effects could be the reliability of the OTL scales, which is low due to the limited number of items and response options. Furthermore, the questions in the teacher questionnaire regarding topics covered in the classroom can be interpreted in different ways, potentially adding inconsistency or noise to these measures. Yet another reason may be the small samples of participating teachers, which reduce statistical power. Had similar questions been asked of the representative samples of students, the effects would probably have been larger (see, e.g., Schmidt et al., 2011; Scheerens, 2016). However, it would be hard for fourth-grade students to answer such questions.

Finally, it should be noted that all the content coverage items in the teacher questionnaire suffer from high levels of non-response, in science even more than in mathematics. This is true for Norway, Denmark, and Sweden. This is unfortunate, as it also contributes to weakening the effect sizes. No such missing-data problems were found in the Finnish data.

6.6.1 Limitations, Reliability, and Validity

The method used in the present study is more robust than analyses of a single cycle of TIMSS, and more robust than comparing results from separate analyses of each cycle. The data are longitudinal at the country level, which is an advantage when investigating the effects of system-level factors, as many plausibly important factors (social, cultural, and economic) remain stable over time at this level. Nevertheless, in this analysis, we only investigate a limited number of potential country-level factors that may underlie the achievement changes between 2011 and 2019. Strictly speaking, therefore, no causal inferences can be made regarding content coverage as an explanatory factor for changes in grade four mathematics achievement. We can, however, conclude that our results accord with previous research and theory: content coverage matters, and, on average, when content coverage increases, so does achievement.

One limitation of this study is the assumption of linear changes in achievement over time. The model assumes such a linear relation, but the changes in achievement from 2011 to 2019 are not always linear. For instance, in Sweden, achievement increased much more between 2011 and 2015 (15 points) than between 2015 and 2019 (2 points). The regression coefficient between time and achievement in our models reflects the mean growth per cycle across the period from 2011 to 2019. This prevents us from providing detailed information about the differences in growth between cycles; our findings instead reflect mean changes across the whole period from 2011 to 2019.

The OTL measures have changed over time, as has the test. While this is reasonable, it makes it harder to connect the measures to changes in the countries' curricula. In this study, however, we have only used the content coverage items that have been repeated across all three cycles, which constitute the major part. Not including the newer content coverage items may have limited the coverage of each sub-topic. Further research is warranted to investigate this possibility.

The large number of missing responses from the teachers in Denmark, Norway, and Sweden to the questions about content coverage in science prevented a proper analysis of OTL in science. The countries may want to encourage teachers and inform them of the importance of the OTL measures for research and policy (even if there are many items for the teachers to respond to). In addition, these rather demanding questions about content coverage in science are currently located late in the teacher questionnaire; relocating them should be considered to improve the response rate.

6.6.2 Concluding Remarks, Contributions, and Implications

The main aim of the present study was to examine whether changes in OTL with respect to content coverage were related to changes in achievement. Our findings confirmed this, which may imply that changes in the implemented curriculum have consequences for students' competence. All countries have a national curriculum that all teachers should follow. Given the within-country variation in achievement, there is a need to investigate to what degree this variation may be due to a lack of coverage of the assessed curriculum in the schools. A lack of, or variation in, content coverage in schools implies unequal opportunities for students to learn and to be fairly assessed. The topic of unequal opportunities for students to learn is examined further in Chap. 8.

The findings from this study are relevant to teacher education, stakeholders in education and educational policy, and curriculum developers, as content coverage concerns the alignment between what is taught, what is assessed, and the curriculum. The study further contributes to research, more specifically to OTL research. The content coverage variables in TIMSS are far from optimal from a measurement point of view. We agree with Scheerens's (2016) conclusion that there is a need to address the methodological challenges attached to the OTL indicators. A closer examination of the validity of the OTL measures in TIMSS, together with actions and methods to improve the measurement properties of the indicators, is urgently needed. Such actions include reconsidering the questionnaire design, the location of the OTL items, and the phrasing and translation of the items, as well as validation studies to ensure that the questions and response scales work as intended and are interpreted in the same way by the responding teachers.

The study also contributes to research within assessment, measurement, and psychometrics. The methodological approach should be of interest to researchers who wish to examine relations between changes in predictors and outcome variables, such as achievement, in other ILSAs and/or other countries.