Results from the MGCFA approach
A single-factor measurement model was fitted to each of the teacher-related constructs with the pooled data of all 46 education systems. These single-factor measurement models, however, did not fit the data well. Modification indices suggested the inclusion of one or more correlated residuals to improve the model fit. These modified single-factor model structures were used to test the measurement invariance across the 46 groups in the conventional approach. Table 3 presents the model fit indices of the configural, metric, and scalar MI models for all teacher-related constructs.
The configural models of all the latent constructs in Table 3 show acceptable or close model fit, with the Root Mean Square Error of Approximation (RMSEA) and Standardized Root Mean Square Residual (SRMR) below .08, and the comparative fit index (CFI) and Tucker-Lewis index (TLI) greater than .95 (see, e.g., Hu and Bentler 1999). Three of the seven teacher-related factors (teacher perception of School emphasis on academic success, School Condition and Resources, and teacher’s Self-efficacy) reached metric invariance, which implies that the factor loadings of each of these three latent constructs were equal across all education systems, but the intercepts of their indicators were not. None of the scalar MI models fits the data, indicating that the assumption of equal intercepts and equal factor loadings across the 46 systems cannot be upheld.
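The cutoff logic described above can be sketched in a few lines of Python. This is only an illustration of the decision rule: the class name `FitIndices`, the function name `acceptable_fit`, and the example values are hypothetical and are not taken from Table 3.

```python
from dataclasses import dataclass


@dataclass
class FitIndices:
    """Fit indices of one fitted measurement model."""
    rmsea: float
    srmr: float
    cfi: float
    tli: float


def acceptable_fit(fit, max_error=0.08, min_incremental=0.95):
    """Apply the cutoffs named in the text: RMSEA and SRMR below .08,
    CFI and TLI above .95."""
    return (
        fit.rmsea < max_error
        and fit.srmr < max_error
        and fit.cfi > min_incremental
        and fit.tli > min_incremental
    )


# Hypothetical values for illustration only (not from Table 3).
good_model = FitIndices(rmsea=0.05, srmr=0.03, cfi=0.97, tli=0.96)
poor_model = FitIndices(rmsea=0.11, srmr=0.09, cfi=0.88, tli=0.85)

print(acceptable_fit(good_model), acceptable_fit(poor_model))
```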
With the traditional measurement invariance approach, the most restrictive MI assumption (scalar invariance) was not supported. Additionally, metric invariance was found for only three latent constructs. Consequently, cross-country comparisons of the latent variable means and of the relationships among the latent variables cannot be made on the basis of these models. Given these results, the next section aims for approximate partial measurement invariance (e.g., Millsap and Kwok 2004) by using the alignment approach (Muthén and Asparouhov 2014).
Results from alignment optimization
Alignment optimization explores partial (approximate) measurement invariance by starting out with a well-fitting configural model. It then adjusts the factor loadings and intercepts of the factor indicators in such a way that these parameter estimates should be as similar as possible across groups without compromising the model fit. Essentially, the fit for the aligned model stays the same as the configural invariance model. In this section, the aligned model results for each of the seven teacher-related factors will be presented.
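The idea of rescaling configural estimates so that they become as similar as possible across groups can be sketched as follows. The sketch assumes the standard formulation of alignment, in which each group's factor mean (alpha) and factor variance (psi) transform the configural loadings and intercepts, and pairwise differences are penalized with a component loss of the form sqrt(sqrt(x^2 + eps)). The function names, the value eps = 0.01, and the two-group example data are illustrative assumptions. The code only evaluates the loss for given alpha and psi and does not perform the optimization or impose identification constraints.

```python
import numpy as np


def component_loss(x, eps=0.01):
    """Component loss sqrt(sqrt(x^2 + eps)); eps is an assumed small constant."""
    return np.sqrt(np.sqrt(x ** 2 + eps))


def aligned_parameters(lam0, nu0, alpha, psi):
    """Transform configural loadings/intercepts (groups x items) using
    group factor means alpha and factor variances psi."""
    sd = np.sqrt(psi)[:, None]
    lam1 = lam0 / sd
    nu1 = nu0 - alpha[:, None] * lam0 / sd
    return lam1, nu1


def alignment_loss(lam0, nu0, alpha, psi, n):
    """Total loss: weighted sum of component losses over all group pairs."""
    lam1, nu1 = aligned_parameters(lam0, nu0, alpha, psi)
    total = 0.0
    n_groups = lam0.shape[0]
    for g1 in range(n_groups):
        for g2 in range(g1 + 1, n_groups):
            weight = np.sqrt(n[g1] * n[g2])  # pairs of larger groups count more
            total += weight * (
                component_loss(lam1[g1] - lam1[g2]).sum()
                + component_loss(nu1[g1] - nu1[g2]).sum()
            )
    return total


# Hypothetical two-group, two-item example: group 2 differs from group 1
# only through its factor mean (0.5) and factor variance (4.0).
lam0 = np.array([[0.8, 0.6], [1.6, 1.2]])
nu0 = np.array([[1.0, 1.5], [1.4, 1.8]])
n = np.array([500, 500])

loss_configural = alignment_loss(lam0, nu0, np.zeros(2), np.ones(2), n)
loss_aligned = alignment_loss(lam0, nu0, np.array([0.0, 0.5]), np.array([1.0, 4.0]), n)
print(loss_aligned < loss_configural)
```

Because the transformation only re-expresses the configural model in a different factor metric, the model fit is unchanged, which is why the aligned model fits as well as the configural model.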
Teacher job satisfaction
Table 4 presents the results from the aligned modeling approach for the latent construct job satisfaction (JS). The highest R-square of the intercept estimate is observed for the variable My work inspires me: about 87% of the variation in the intercept observed in the configural model can be explained by the variation in the latent variable mean and variance in the aligned model, indicating a high degree of invariance. Morocco is the only non-invariant country in the intercept estimate of the indicator I am proud of the work I do. This variable, together with the indicator I am enthusiastic about my job, also displayed a rather high R-square. I am content with my profession as a teacher and My work inspires me hold completely invariant factor loading estimates across all systems. For the variables I am enthusiastic about my job and I find my work full of meaning and purpose, a large number of groups with invariant intercept estimates are also observed, ranging from 44 to 46 education systems. The variable I am going to continue teaching as long as I can holds the least invariant intercept, with the lowest R-square (44%). For the factor loadings, the indicator I am proud of the work I do is the least invariant, with an R-square of 23%.
Countries with extreme parameter estimates can be found in columns 4 to 7. For example, South Korea holds the lowest intercept estimate for My work inspires me, while Canada-Ontario has the lowest factor loading estimate. In general, the overall degree of invariance of the construct JS is rather high, with few education systems showing measurement non-invariance in the factor loadings, consistent with the close fit of the metric invariance model in Table 3. The average invariance index is 58% for JS. The percentage of significantly non-invariant groups is 8.9%, much lower than the limit of 25% suggested by Muthén and Asparouhov (2014). More groups show invariance in the factor loadings of each indicator than in the intercepts.
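The R-square values reported for the aligned models can be read as the share of cross-group variation in a configural parameter that is accounted for by group differences in the factor mean and variance. A minimal sketch follows, assuming the definition R-square = 1 - Var(residual) / Var(configural estimate), where the residual is what remains after removing the part implied by the aligned factor mean (for intercepts) or factor variance (for loadings). The function names and the four-group example values are hypothetical.

```python
import numpy as np


def intercept_r2(nu0, alpha, nu_mean, lam_mean):
    """Share of variation in configural intercepts nu0 (one per group)
    explained by group factor means alpha, given the average aligned
    intercept nu_mean and average aligned loading lam_mean."""
    residual = nu0 - nu_mean - alpha * lam_mean
    return 1.0 - np.var(residual) / np.var(nu0)


def loading_r2(lam0, psi, lam_mean):
    """Share of variation in configural loadings lam0 (one per group)
    explained by group factor variances psi."""
    residual = lam0 - np.sqrt(psi) * lam_mean
    return 1.0 - np.var(residual) / np.var(lam0)


# Hypothetical four-group example for a single indicator.
alpha = np.array([-0.5, 0.0, 0.5, 1.0])
nu_mean, lam_mean = 1.0, 0.8

# Case 1: intercept differences are fully implied by the factor means.
nu0_invariant = nu_mean + alpha * lam_mean
r2_full = intercept_r2(nu0_invariant, alpha, nu_mean, lam_mean)

# Case 2: group-specific deviations remain after alignment.
nu0_noisy = nu0_invariant + np.array([0.1, -0.1, 0.1, -0.1])
r2_partial = intercept_r2(nu0_noisy, alpha, nu_mean, lam_mean)

print(round(r2_full, 3), round(r2_partial, 3))
```

Under this reading, an R-square close to 1 means that the configural differences across groups are almost entirely attributable to the factor mean and variance, which is what a high degree of invariance implies.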
Teacher perception of school emphasis on academic success
Five indicators are used to identify the latent construct of school emphasis on academic success, and the results from the aligned model of SEAS are presented in Table 5.
For the factor loading estimates, all five indicators of the construct School emphasis on academic success showed complete invariance over the 46 countries. This agrees with the model fit indices for the metric invariance model in Table 3. For the intercepts, only two countries are non-invariant for the indicator Teachers’ degree of success in implementing the school’s curriculum, consistent with its high R-square estimate of 73%. The intercept of Teachers’ expectations for student achievement shows the most variation, with only half of the countries being invariant. The minimum and maximum estimates of the intercepts and factor loadings can be found in columns 4 to 7. Only 7.8% of groups show significant non-invariance. In general, the high degree of confidence indicated by the average invariance index of .65 implies that the mean of the construct SEAS can be compared meaningfully across the different groups.
Teacher perception of school conditions and resources
Table 6 shows the results of approximate invariance from the aligned model of the school condition and resources.
As revealed in Table 6, four indicators (The school building needs significant repair, Teachers do not have adequate instructional materials and supplies, The school classroom needs maintenance work, and Teachers do not have adequate support for using technology) have invariant factor loadings across all education systems. Only Lithuania is non-invariant in the factor loadings for the variables Teachers do not have adequate workplace and Teachers do not have adequate technological resources. The R-square values for these indicators also point to a high degree of invariance, being above 60%. One exception is the variable The school building needs significant repair, for which the R-square is 29% despite complete invariance across all groups. For the intercept estimates, the number of non-invariant systems per indicator ranges from 4 for the variable The school classroom needs maintenance work (R-square = 82%) to 10 for the variable Teachers do not have adequate workplace (R-square = 57%). These results are consistent with the conventional measurement invariance results, where metric invariance, but not scalar invariance, was achieved for the SCR construct (see Table 3).
The average invariance index for the construct SCR was 62%, indicating 62% confidence in carrying out trustworthy cross-system comparisons. The total non-invariance measure is 8.39%, below the limit of 25%.
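The total non-invariance measure can be thought of as the share of parameter-by-group combinations flagged as significantly non-invariant, compared against the 25% limit. The sketch below assumes this simple counting definition. The function names and the small flag matrix are hypothetical and do not reproduce Table 6.

```python
import numpy as np


def noninvariance_share(flags):
    """Proportion of (parameter, group) cells flagged as non-invariant.
    flags: boolean array, rows = parameters, columns = groups."""
    return float(np.asarray(flags, dtype=bool).mean())


def within_limit(flags, limit=0.25):
    """True when the share of non-invariant cells is below the limit."""
    return noninvariance_share(flags) < limit


# Hypothetical example: 3 parameters, 4 groups, 2 non-invariant cells.
flags = np.array([
    [False, False, True, False],
    [False, False, False, False],
    [True, False, False, False],
])

print(noninvariance_share(flags), within_limit(flags))
```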
Teacher perception of safe and orderly school
Among the 8 indicators of the latent construct Safe and orderly school (Table 7), The students behave in an orderly manner, The students respect school property, and The students are respectful of the teachers are completely invariant in the factor loadings over the 46 countries. The R-square estimate for the factor loading of these three variables is around or above 70%, implying that at least about 70% of the variation in the factor loadings estimated in the configural model can be explained by the factor mean and variance across the groups. For these three variables, the standard deviation of the parameter mean is also smaller than for the other indicators. The lowest R-square for the factor loading is observed for the indicator The school is located in a safe neighborhood (29%), which corresponds to a larger variation (see column 3 under SD).
The students respect school property holds the highest R-square (83%) for its intercept estimate; only Lebanon is non-invariant. The lowest R-square is found for the indicator The school’s rules are enforced in a fair and consistent manner (35%). The number of countries with non-invariant intercepts ranges from 1 to 13. The model fit indices of the conventional measurement invariance model support metric invariance, which the aligned model confirms.
In sum, the parameter estimates of the latent variable model reached 58% confidence for making reliable cross-country comparisons, and the percentage of significant non-invariance across education systems is only 9.8% over all estimated parameters.
Teacher self-efficacy
Aligned model results for teacher self-efficacy (TSE) can be seen in Table 8. The intercept estimates show the indicator Developing students’ higher-order thinking skills as the most invariant, with an R-square of about 90%. Here, only four education systems show measurement non-invariance, and the variance in the estimated mean intercept is rather small. The intercept estimate for the indicator Making mathematics relevant to students also holds a high R-square (86%). Improving the understanding of struggling students and Assessing student comprehension of mathematics show the lowest R-square values, implying a high degree of non-invariance. This is also confirmed by the higher standard deviations in column 3. More than ten education systems show non-invariance for these two indicators. Columns 4 to 7 present the education systems with the minimum and maximum estimates of the intercepts.
The number of education systems with invariant factor loadings for the TSE construct is higher than the number with invariant intercepts. Developing students’ higher-order thinking skills, Improving the understanding of struggling students, Providing challenging tasks for the highest achieving students, and Adapting my teaching to engage students’ interest are completely invariant over all 46 education systems. The factor loading estimate for Inspiring students to learn mathematics has the highest number of non-invariant systems (5).
In general, the average invariance index was rather high for all estimated parameters in the aligned model, and the proportion of significantly non-invariant groups was low. We therefore have 57% confidence in making meaningful comparisons of the means and variances of teacher self-efficacy.
Monte Carlo simulation
As recommended by Asparouhov and Muthén (2014), Monte Carlo simulations were conducted to check the quality of the alignment results of the five teacher-related factors. These simulations used parameter estimates from the alignment models as data-generating population values. For each of the teacher-related factors, two sets of simulations were run with 100 replications, 46 groups, and two different group sample sizes (500 vs. 1000). Table 9 shows the correlations between the generated population values and the estimated parameters.
The correlations in Table 9 are the average, over the 100 replications, of the correlation between the population factor mean (or factor variance) and the model-estimated factor mean (or factor variance). These correlations are generally very high, most of them .98 or above, with the average correlation for the factor means higher than that for the factor variances. However, relatively low correlations are also observed for the simulations based on a group sample size of 500, for example, .95 for the average correlation of the factor variance in Job satisfaction and .96 in teacher perception of School emphasis on academic success. These correlations tend to increase when the group sample size is raised to 1000. Asparouhov and Muthén (2014) suggested a level of .98 for these correlations to confirm reliable alignment estimates, while a correlation below .95 may be cause for concern. The current simulations therefore suggest that the aligned results for the teacher-related constructs are, to a great extent, reliable for cross-country comparison, despite some non-invariance among education systems. The aligned models work better when the group sample size is larger, which is consistent with the asymptotic accuracy of alignment results under maximum likelihood estimation.
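The summary statistic reported in Table 9 can be illustrated with a short Python sketch. It computes, for each replication, the correlation between the population factor means and the estimated factor means across groups, and then averages over replications. The estimation step is not reproduced here: as a loudly labeled stand-in, "estimates" are generated by adding noise with standard deviation 1/sqrt(n) to hypothetical population values, so the numbers are illustrative and not comparable to Table 9.

```python
import numpy as np


def average_correlation(population, estimates):
    """Mean over replications of the across-group correlation between
    population values (groups,) and estimates (replications, groups)."""
    correlations = [np.corrcoef(population, est)[0, 1] for est in estimates]
    return float(np.mean(correlations))


rng = np.random.default_rng(0)
n_groups, n_replications = 46, 100

# Hypothetical population factor means for the 46 groups.
population_means = rng.normal(0.0, 0.5, size=n_groups)


def simulate_estimates(group_size):
    """Stand-in for the alignment estimation step (an assumption):
    population value plus sampling noise that shrinks with group size."""
    noise_sd = 1.0 / np.sqrt(group_size)
    noise = rng.normal(0.0, noise_sd, size=(n_replications, n_groups))
    return population_means + noise


r_500 = average_correlation(population_means, simulate_estimates(500))
r_1000 = average_correlation(population_means, simulate_estimates(1000))
print(r_500, r_1000)
```

In this toy setting the average correlation rises with the group sample size simply because the sampling noise shrinks, which mirrors the pattern described for the 500 versus 1000 conditions.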
Average estimates of intercepts and factor loadings across invariant groups
Table 10 presents the weighted average estimates of factor loadings and intercepts across all invariant groups in each teacher-related construct. These weighted mean values are common to the invariant education systems and apply only to those systems. The number of such systems can be found in the column next to the weighted mean of the intercepts and factor loadings.
As shown in Table 10, the highest average intercept for teacher’s Self-efficacy, for example, is observed for its indicator Providing challenging tasks for the highest achieving students (v = 1.616), and the lowest for Helping students appreciate the value of learning mathematics (v = 1.375). The average factor loading was highest for Developing students’ higher-order thinking skills (λ = .495), indicating that this indicator forms an important part of the construct of self-efficacy in teaching mathematics.
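A weighted average restricted to the invariant groups can be sketched as follows. The text does not state which weights were used, so weighting by group sample size is an assumption here, and the function name and example values are hypothetical.

```python
import numpy as np


def invariant_weighted_mean(estimates, invariant, weights):
    """Weighted mean of a parameter over invariant groups only.
    Returns the mean and the number of invariant groups.
    weights: assumed here to be group sample sizes."""
    estimates = np.asarray(estimates, dtype=float)
    invariant = np.asarray(invariant, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    if not invariant.any():
        raise ValueError("no invariant groups for this parameter")
    mean = np.average(estimates[invariant], weights=weights[invariant])
    return float(mean), int(invariant.sum())


# Hypothetical intercept estimates for four groups; group 3 is non-invariant
# and is therefore excluded from the average.
intercepts = [1.6, 1.7, 2.5, 1.5]
is_invariant = [True, True, False, True]
sample_sizes = [500, 1000, 800, 500]

mean_intercept, n_invariant = invariant_weighted_mean(
    intercepts, is_invariant, sample_sizes
)
print(mean_intercept, n_invariant)
```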
Comparing estimated latent variable means of the teacher-related constructs
Latent variable means of all teacher-related latent constructs were estimated for the 46 education systems by the aligned model (see Appendix Table 11). Groups can be compared based on these factor means.
Teacher job satisfaction
The latent variable mean of teacher job satisfaction is based on indicators concerning teachers’ feelings of contentment with the profession as a whole, their current school, their enthusiasm and pride in their work, and their intention to continue teaching. According to the estimated mean of JS in Fig. 1, students in Japan, Singapore, England, Hong Kong, and Hungary have mathematics teachers with the highest level of job satisfaction as compared to other education systems in TIMSS 2015. Students in Italy, Lithuania, Sweden, South Korea, and New Zealand also have mathematics teachers with relatively low levels of job satisfaction. By contrast, in Chile, Qatar, Thailand, Argentina (Buenos Aires), Kuwait, Oman, Israel, Lebanon, Malaysia, and the United Arab Emirates, students have mathematics teachers who are the least satisfied with their job.
Teacher perception of safe and orderly school
Broadly, SOS refers to whether teachers feel that their schools are located in a safe neighborhood and that the students are respectful. The latent variable mean of SOS is shown in Fig. 2. The results indicate that students in Botswana, South Africa, Morocco, Turkey, Japan, Italy, Slovenia, South Korea, Sweden, and Jordan had mathematics teachers with the highest levels of perceived school safety. In Argentina (Buenos Aires), Ireland, Kazakhstan, Norway, UAE, Lebanon, Qatar, Singapore, Hong Kong, and Lithuania, students had mathematics teachers with the lowest perceptions of the school as orderly and safe.
Teacher perception of school conditions and resources
SCR refers to school infrastructure, whether teachers have adequate workspace and instructional materials, and whether the school environment is well taken care of. Results for latent mean comparisons can be found in Fig. 3. Students’ mathematics teachers in Botswana, South Africa, Turkey, Morocco, Saudi Arabia, Egypt, Jordan, Armenia, Malaysia, and Iran reported the highest levels of satisfaction with school conditions and resources. In UAE, Singapore, and Bahrain, students’ mathematics teachers reported the lowest perceptions of SCR.
Teacher perception of school emphasis on academic success
SEAS is indicated by teachers’ perceptions of whether teachers understand the school’s curricular goals, their success in implementing the curriculum, their expectations for student achievement, and their ability to inspire students. Latent variable means are presented in Fig. 4. Recall that SEAS is reverse coded, so countries with the lowest values show the highest mathematics teacher perceptions of SEAS. Students in Italy, Japan, Russia, Hong Kong, Chile, Hungary, Sweden, Norway, Turkey, and Thailand have mathematics teachers who report the highest levels of SEAS. In Qatar, Malaysia, Oman, Ireland, Canada, South Korea, UAE, Bahrain, and Kazakhstan, students generally have mathematics teachers who report the lowest levels of school emphasis on academic success.
Teacher self-efficacy
Latent variable means for TSE are found in Fig. 5. Teacher self-efficacy is measured by teachers’ feelings of capacity to inspire students in mathematics, show students a variety of problem-solving strategies, adapt their teaching to engage students, make mathematics relevant, and develop higher-order thinking skills. In Japan, Hong Kong, Singapore, Chinese Taipei, Thailand, Iran, Morocco, New Zealand, Sweden, and England, students have mathematics teachers who report the highest levels of self-efficacy in teaching mathematics. In Qatar, UAE, Bahrain, Lebanon, Oman, Argentina (Buenos Aires), Slovenia, Kazakhstan, and Botswana, students have mathematics teachers with the lowest levels of self-efficacy to teach mathematics.