14.1 Introduction

Singapore’s participation in the Second International Science Study (SISS) in the 1980s marks the beginning of nearly four decades of involvement in studies undertaken under the auspices of the International Association for the Evaluation of Educational Achievement (IEA). Because of the methodological and process rigor that goes into each IEA study (as detailed in the other chapters in the book), the high-quality data generated from these studies are valuable resources that Singapore uses in ongoing efforts to improve the quality of education.

In this chapter, we first discuss the value of large-scale international studies to Singapore, including IEA’s Progress in International Reading Literacy Study (PIRLS) and Trends in International Mathematics and Science Study (TIMSS). We then describe how the Singapore Ministry of Education (MOE) has used PIRLS and TIMSS data for system-level monitoring and derived insights from secondary analyses to inform policymaking and program development. We illustrate such data use with specific examples. We conclude by distilling some guiding principles that underpin the ways in which MOE uses the data.

14.2 Why Singapore Participates in International Large-Scale Assessments

Education has always been highly valued in Singapore. For the people, education is perceived as a means to a better life. For policymakers, it is a key strategic lever to ensure the economic survival of a small nation with few natural resources other than its population. An urgent nation-building task in the post-independence years of the 1960s was thus to quickly build a public school system out of the mostly disparate, generally community- and faith-based schools that existed at that time, driven largely by the mission to rapidly raise the basic literacy and numeracy of the people. That early phase of rapid expansion, termed the survival-driven phase by education historians (Goh and Gopinathan 2008), was characterized by high levels of central control by MOE, for efficiency reasons.

Over time, that early governance structure has evolved to one characterized by a close nexus between policymakers and practitioners, deliberately designed to achieve a balance between the centralization and decentralization of different aspects of the education system. Today, MOE is still responsible for setting national policies that affect access to education for all children (e.g., curriculum, funding rates, and school fees), for reasons of both efficiency and equity. But it also devolves significant autonomy and responsibility to school principals and teachers in administration and professional matters pertaining directly to their own schools (e.g., budget allocations in the school, setting of school policies, customization of the national curriculum, and pedagogical approaches for students with different learning needs). Such autonomy for local customization is a key feature that now allows the Singapore education system to be nimble and responsive to student needs.

However, one feature has remained constant even as the governance structure has evolved: since the early post-independence years in the 1970s, MOE has adopted an evidence-based approach to designing, developing, and reviewing the curriculum, programs, and policies. For example, the development of the Singapore model method for learning mathematics was prompted by worrying results from a study conducted by MOE in 1975, which showed that at least a quarter of primary school graduates could not meet the minimum numeracy levels expected at the end of primary school (MOE 2009). This, set against the broader context of double-digit dropout rates even at primary-school level (29%; Goh et al. 1979), suggested strongly to policymakers then that something was clearly not right, either with the curricular design or the way in which it was implemented, or both. MOE introduced the model method to address the concerns raised by the study. Subsequently, after years of experimentation and refinements based on feedback from actual use in local classrooms, the model method has become a pedagogical approach integral to Singapore’s primary school mathematics curriculum.

In fact, in those early post-independence years when Singapore was rapidly building and expanding the school system, MOE was particularly keen to learn best practices from all over the world. Participating in international studies in those early years (e.g., SISS and IEA’s 1991 Reading Literacy Study) contributed valuably toward that purpose.

Today, the impetus to look outwards is even stronger, catalyzed by an increasingly interdependent and networked world. In this world, education systems no longer have the luxury to be insular or inward-looking, especially in terms of the skills and knowledge that they help students develop. This is particularly so for Singapore because of its open economy and society. Large-scale international studies therefore continue to play an important role in complementing other data sources to inform MOE about different aspects of the education system, serving at various times as reassurances that the system is progressing in the right direction in some areas and, on other occasions, providing insights into where improvements can be made. This is necessary to ensure that the education system is always forward-looking and responsive to dynamic changes in the external environment, in order to adequately prepare Singapore’s students to thrive in an equally, if not more, dynamic future.

Singapore currently participates in five large-scale international studies: IEA’s PIRLS and TIMSS, and three programs run by the Organisation for Economic Co-operation and Development (OECD), namely the Programme for International Student Assessment (PISA; OECD 2020a), Teaching and Learning International Survey (TALIS; OECD 2020b), and Programme for the International Assessment of Adult Competencies (PIAAC; OECD 2020c). MOE chose these carefully and purposefully, among the options available internationally, to form a suite of studies that would best address its knowledge needs. Together, PIRLS, TIMSS, and PISA allow policymakers to monitor and derive insights about the education system at different milestone grades (primary, lower-secondary, and upper-secondary) for various student developmental outcomes, and educator characteristics and practices. TALIS, targeting only teachers and principals, provides in-depth insights into teacher characteristics, policies, and practices. PIAAC, involving adults aged 16–65 years, provides information about longer-term continual skills development, employment, and other life outcomes beyond the formal education years. This last study is thus a rich source of information on both the progress of lifelong learning and how well the formal education system has prepared its students for life.

More broadly, these international studies share several features that make them useful to Singapore, as we now explain in more detail.

14.2.1 Participating in International Large-Scale Assessment Facilitates Benchmarking of Student Developmental Outcomes and Educator Practices

An important feature of large-scale international studies lies in the participation of many education systems, which is crucial to allow benchmarking and cross-national comparisons. On the student front, the international studies show how well students of different grade levels/ages are developing skills, competencies, and attitudes towards learning that are considered essential for thriving in the 21st century by the international education community, relative to their international peers. For example, PISA results show that Singapore’s 15-year-old students not only do well in the “traditional” domains (reading, mathematics, and science) but are also competent collaborative problem solvers by international standards. Similarly, ePIRLS provides findings on how well grade 4 students can navigate in an e-environment to select relevant information and integrate ideas across webpages. Such benchmarking is important because students do not live in isolation but will have to compete in the global marketplace of skills in the future.

Similarly, apart from student data, the studies provide internationally comparable information about various policies and practices pertaining to educators (e.g., teacher and principal preparation, and teaching strategies), allowing MOE to understand the strengths and weaknesses of local practices beyond what can be gleaned from local data. For example, MOE already knew from local data that Singapore teachers worked long hours and spent a lot of time on out-of-class activities, such as running co-curricular programs for students. But it was results from TALIS that showed that Singapore teachers put in some of the longest working hours among TALIS participants and that they spent proportionately more of those hours on “marking” and “administrative work” than their peers elsewhere.

The cyclical nature of the studies further means that benchmarking can be done not just at single time-points, but also over time, using scales deliberately designed for such trend analyses. This is particularly useful when there are changes to policies and curriculum over time. Trend data from these studies have enabled MOE to monitor changes in student outcomes and attitudes. They have also provided policymakers with some evidence of the impact of programs and policies, which cannot be easily measured using local administrative or national examination data. For example, results from PIRLS, TIMSS, and PISA consistently assured MOE that cuts in curriculum content and the corresponding shifts towards emphasizing higher-order thinking skills since 1997 had not impacted student performance negatively, but were instead associated with increased levels of application and reasoning skills. Similarly, analyses using TIMSS data from multiple cycles before and after the launch of the 2008 science syllabi centered on the inquiry approach showed that there were more inquiry-based practices in grades 4 and 8 science classrooms after the implementation.

14.2.2 Participating in International Large-Scale Assessment Provides Additional High-Quality Rich Data Sources for Secondary Analyses

Besides benchmarking of outcomes and practices, another important feature of large-scale international studies is the availability of rich contextual data about each education system. This is useful for conducting secondary analyses to derive further insights about education systems, with implications for policymaking and practice.

In particular, the rich information about students’ learning contexts in the classroom, school and home from these studies allows analysts to better understand the influence of these contexts on student outcomes. For example, using PIRLS and TIMSS data, it was found that grade 4 students whose parents engaged them more frequently in early literacy and numeracy activities during the preschool years did better in reading, mathematics, and science at grade 4, even after accounting for home socioeconomic circumstances. This means that, where parents are unable to provide the support, early intervention in preschool education and care centers is important, thereby supporting the government’s investments in the sector.

In this regard, IEA’s PIRLS and TIMSS share two distinguishing features not found in the OECD studies, which enhance their analytic value. First, the direct links of these two studies to the curriculum, across all three aspects of intended, implemented, and attained curricula underpinning the research frameworks of TIMSS and PIRLS, allow insights to be drawn for informing curricular review work. For example, information on the intended curriculum across different countries provides insights into both the broad common areas of curricular focus and the differences in emphasis between countries. These are useful as part of the external scans for regular syllabus reviews in Singapore. Similarly, information on the implemented curriculum (e.g., teaching practices) and the attained curriculum (e.g., student achievement scores and student attitudes towards learning) enables Singapore to monitor the enactment and impact of the curriculum, especially useful during curricular reviews.

Second, the direct links between students and their reading, mathematics, and science teachers in PIRLS and TIMSS open up additional analytic possibilities for discovering important relationships between teacher practice and student outcomes. Although such estimated relationships are non-causal in nature because of the study design, they nonetheless allow some inferences to be made about hypothesized relationships (e.g., teacher inquiry-based practice and student outcomes), serving as a first step to further, more targeted studies aimed at uncovering any causal relationships where appropriate.

14.2.3 Participating in International Large-Scale Assessment Builds International Networks of Educationists and Experts

Over time, each of the international studies has built an entire ecosystem of parties interested, and in many cases, actively involved, in the work of educational improvements. These international communities comprise individuals with a diverse range of experiences ranging from research to policymaking to practice, but all driven by the common goal of providing quality education for the students. Participating in each study thus opens up opportunities to be plugged into international conversations about education with thought leaders and educationists from different parts of the world, exchanging views and learning from one another while working to improve our respective systems.

Another very useful, albeit incidental, benefit that MOE derives from participating in the international studies is that the staff learn and grow professionally in the specialized areas (e.g., sampling, design of computer-based assessment items, and measurement and analytical methods) from being directly involved in the projects. Over time, they have also built networks with the various experts, who can be readily tapped for advice in areas beyond work directly related to the studies.

14.3 How MOE Has Used Large-Scale Assessment Data

In this section, we illustrate how MOE has capitalized on valuable PIRLS and TIMSS data to inform both policymaking and program improvement, using three practical examples. We have chosen these cases to illustrate the range of typical uses, from serving as an external signal of areas for improvement, to monitoring the implementation of changes via a specific policy or program, to system-level monitoring of broader changes to policy or practice. The examples we have selected are all related to the curriculum because, as mentioned earlier, one distinguishing feature of PIRLS and TIMSS, which makes the data particularly valuable to Singapore, lies in their links to the curriculum.

14.3.1 STELLAR: “We Must and Can Do Better!”

The STrategies for English Language Learning and Reading program, better known as STELLAR, is the primary-level English language instructional program, specially developed to cater to the learning needs of Singapore’s children, taking into consideration the nation’s multilingual environment. While English is the common language for government, business, and education in Singapore, Singaporean students do not speak only English at home. STELLAR was therefore created from a deliberate combination of first- and second-language teaching approaches. It was the first time that MOE articulated and operationalized a core set of pedagogies to guide the learning of the English language across six years of education (grades 1 to 6). In addition to applying principles gleaned from the research literature, the curriculum team behind STELLAR conducted a systematic review of English language teaching in Singapore, consulting teachers, observing classes of students, and speaking to stakeholders in the community, including employers. The team also conducted three study trips to learn from educators in Hong Kong, India, and New Zealand. During its implementation, the STELLAR team worked closely with English language teachers, influencing and changing classroom practices over time. Today, nearly 15 years since its launch, STELLAR remains the signature program that builds a strong foundation for students in the English language, not just as the common language for communication in multilingual Singapore, but also as a language for accessing further learning.

But what is perhaps less known about STELLAR is that Singapore’s results from IEA’s 10-year trend study of reading literacy, using data from the 1991 Reading Literacy Study and PIRLS 2001 (Martin et al. 2003), played an important role in galvanizing support for a fundamental redesign of the primary English language program that eventually became STELLAR. Specifically, on average, Singapore’s grade 4 students had not made much progress in English language reading proficiency over the intervening decade, unlike four of the nine countries involved in the 10-year trend study, which had shown improvements. Moreover, Singapore was the only country with a widened gap between the highest and lowest performers. Further analyses showed that, while there was some progress made at the upper end of the achievement spectrum, with the 75th and 95th percentile scores being higher in 2001 than in 1991, students at the lower end of the achievement spectrum had largely stayed at the 1991 proficiency levels. These findings, especially the “long lower tail” phenomenon, led to a wide-ranging review of the curriculum, efforts that subsequently culminated in the launch of STELLAR in 2006, with a phased-in implementation approach that reached all schools at grade 1 in 2010. At the same time, for students at grades 1 and 2 who were assessed to need more support in beginning reading skills, an enhanced learning support program was implemented from 2007.

The use of PIRLS did not end with the launch of STELLAR. Instead, analyses using an external, stable benchmark, such as PIRLS, served as a useful complement to other evaluations that were conducted using local data. For example, data from the four cycles of PIRLS (i.e., 2001, 2006, 2011, and 2016) showed that Singapore’s grade 4 students had made steady progress over the 15 years from 2001 to 2016, with a growing percentage having acquired higher-order reading skills. Importantly, there was a reduction in the proportion of students who could not meet the “low” benchmark in PIRLS, from 10% in 2001 to 3% in 2016. A six-year (from 2007 to 2012), quasi-experimental, longitudinal study in 20 schools conducted by MOE also found that students in the STELLAR program performed significantly better than a control group on a number of language and reading skills as they progressed through each grade level and at the end of grade 6 (Pang et al. 2015).
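To make the benchmark comparison concrete, the following is a minimal sketch of how the share of students below a cut score can be computed from weighted student records. It is an illustration only: the function name `share_below` and the toy records are assumptions introduced here, not MOE's procedure, and the sketch omits the plausible values and replicate weights that a real PIRLS analysis would need for point estimates and standard errors. The cut score of 400 is the PIRLS Low International Benchmark.

```python
LOW_BENCHMARK = 400  # PIRLS Low International Benchmark on the reading scale


def share_below(students, cut_score):
    """Return the weighted percentage of students scoring below cut_score.

    `students` is a list of (score, sampling_weight) pairs.
    """
    total_weight = sum(weight for _, weight in students)
    below_weight = sum(weight for score, weight in students if score < cut_score)
    return 100.0 * below_weight / total_weight


# Toy records invented for illustration; they are not PIRLS data.
earlier_cycle = [(380, 1.5), (450, 4.0), (520, 2.5), (610, 2.0)]
later_cycle = [(390, 0.5), (470, 3.5), (560, 4.0), (640, 2.0)]

# A negative change means fewer students fall below the benchmark.
change = share_below(later_cycle, LOW_BENCHMARK) - share_below(earlier_cycle, LOW_BENCHMARK)
print(f"Change in share below the low benchmark: {change:+.1f} percentage points")
```

Because the scale and benchmark are held fixed across cycles, the same function can be applied to each cycle and the results compared directly.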

Aside from reading achievement, PIRLS data also showed how reading habits changed over time. For example, data indicated that the proportion of Singaporean grade 4 students reading silently on their own in school every day or almost every day had increased from 56% in 2011 to 62% in 2016. This was set against a decline in reading habits outside of school, with the proportion of students reading outside of school for at least 30 minutes on a school day falling from 68% in 2011 to 56% in 2016. This decline in reading habits outside of school was also observed in more than two-thirds of the education systems with trend data that participated in PIRLS 2016. For the STELLAR curriculum team, the increase in silent reading in school, set against a declining reading culture outside of school, affirmed the importance of school reading programs in promoting extensive reading.

The design of PIRLS, with its stronger links to the curriculum and its emphasis on robust trend data, has enabled the Singapore curriculum development team to monitor the impact of its programs over a long-term trajectory, contributing to efforts to improve the teaching and learning of English language in Singapore.

14.3.2 A New Pedagogical Approach to Learning Science: “We Tried a Different Method, Did It Materialize?”

From the mid-2000s, MOE made concerted efforts to move towards an inquiry-based approach to the learning of science at both primary and secondary grade levels. Specifically, the 2004 Science Curriculum Framework adopts inquiry as the central focus, both to (1) enable students to appreciate the relevance of science to life, society, and the environment; and (2) equip them with the knowledge, skills, and dispositions to engage in science meaningfully in and out of school (Poon 2014). This framework was operationalized through the 2008 primary and lower-secondary science curricula.

However, having the curricular documents is just a first step. To bring about real change in the classrooms, there must be deliberate and sustained efforts to help teachers not just to understand, but more importantly, adopt the new pedagogical approach. Towards this end, MOE mounted a series of professional development activities, in association with both pre-service and in-service training providers, specifically targeted at helping science teachers develop the necessary skills to implement the inquiry-based approach in their classrooms in the initial years of implementation. This challenge of ensuring the alignment of every science teacher’s training to the intent of inquiry science was surmountable because of the close partnership between MOE and the National Institute of Education, Singapore’s sole teacher training institute.

Beyond professional development, regular school visits by MOE curriculum staff provided further support to the teachers during the actual implementation process, working hand-in-hand with teachers on designing lessons, observing their lessons, and providing feedback to further refine the lesson plans and practice. Such hands-on support, while labor-intensive, is critical; research has shown that understanding teachers’ actual experiences when trying to adopt a new practice in their classrooms (especially their in situ challenges) and then providing appropriate support help teachers to successfully adopt an inquiry-based practice in science (Poon and Lim 2014).

Given the amount of effort required to effect a system-wide pedagogical change, one of the policy and practice questions of interest to MOE after the initial years of implementation was how much progress had been made in this systemic shift towards an inquiry-based approach to learning science. In particular, to what extent (if at all) had the espoused shift been translated into actual practice in science classrooms? In addition, to what extent (if at all) was teacher inquiry-based practice associated with different teacher characteristics?

Besides serving as an important support structure for teachers in the implementation process, the regular school visits by MOE curriculum staff also provided the curriculum designers in MOE with first-hand information about the enactment of the new approach through classroom observations and focus group discussions with heads of science departments and their teachers. Some local research studies also reported pockets of teacher use of hands-on and more open-ended scientific investigations aimed at encouraging students to move away from merely following instructions to more self-directed learning and creative thinking (e.g., Chin 2013; Poon et al. 2012). All these provided some answers to the MOE’s questions, which were used for further fine-tuning of the curriculum and implementation process.

TIMSS supplemented these local sources of information, most of which were qualitative and small-scale rather than system-wide in nature, by serving as a large-scale, quantitative data source that enabled trend analyses because of its cyclical nature. Specifically, by analyzing data from TIMSS straddling the implementation of the 2008 science curricula (i.e., data from TIMSS 2007 and TIMSS 2011), MOE examined the extent to which the system-wide espoused shift towards an inquiry-based approach had translated into teachers’ actual practice in grade 4 and grade 8 science classrooms.

To do so, specialists from the TIMSS national research center at MOE first created an “inquiry approach” scale to measure the extent of inquiry-based practice in classrooms, as reported by teachers. For direct use in answering questions about the new science-inquiry approach implemented in local schools, this inquiry approach scale had to be aligned to MOE’s definition of science as inquiry “as the activities and processes which scientists and students engage in to study the natural and physical world around us…consisting of two critical aspects: the what (content) and the how (process) of understanding the world we live in” (MOE 2007, p. 11, emphasis added). The availability of item-level data in TIMSS made it possible for Singapore to create a fit-for-purpose scale, using selected items in the TIMSS teacher questionnaire assessed by an expert science curriculum team to be good proxies of inquiry-based practice based on MOE’s conception.

Using the inquiry approach scale, the research team compared the average inquiry approach scores in TIMSS 2007 and 2011 to detect any shift in practice. Using other variables and scales that were available in the TIMSS datasets, the team also examined the association between teachers’ inquiry-based practice and different teacher characteristics (e.g., years of experience, levels of preparation, and confidence in teaching science). The team found that, on average, proportionately more grade 4 and grade 8 students had science teachers who reported the use of inquiry-based pedagogies in TIMSS 2011 than in TIMSS 2007. In terms of relationships with teacher characteristics, the team found positive relationships between teacher inquiry-based practice and their use of instructional strategies that engaged students in learning. Similarly, it found a positive relationship between inquiry-based practice and teachers’ levels of confidence in teaching science, potentially mediated by their levels of preparedness. These findings are useful because they suggest some potential levers that can be used to influence teachers’ adoption of inquiry-based practice.
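The two steps described above (building a fit-for-purpose scale from selected questionnaire items, then comparing its average across cycles) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure MOE used: the item names, the 1 to 4 response coding, the simple averaging of items, and the toy records are all hypothetical. Because teachers enter the sample through their students, the sketch averages the teacher-reported score over students using student sampling weights. A real analysis would also need standard errors that reflect the sampling design.

```python
from statistics import fmean

# Hypothetical questionnaire items judged to be proxies of inquiry-based practice.
# Responses are assumed to be coded 1 (never) to 4 (every or almost every lesson).
INQUIRY_ITEMS = ["design_investigations", "conduct_experiments", "explain_findings"]


def inquiry_score(teacher_responses):
    """Average the selected items, skipping any that were left unanswered (None)."""
    answered = [
        teacher_responses[item]
        for item in INQUIRY_ITEMS
        if teacher_responses.get(item) is not None
    ]
    return fmean(answered) if answered else None


def weighted_mean_score(students):
    """Student-weighted mean of the teacher-reported inquiry score.

    Each record links one sampled student (with a sampling weight) to the
    questionnaire responses of that student's science teacher.
    """
    total = 0.0
    weight_sum = 0.0
    for record in students:
        score = inquiry_score(record["teacher"])
        if score is None:
            continue  # teacher answered none of the selected items
        total += record["weight"] * score
        weight_sum += record["weight"]
    return total / weight_sum


# Toy records invented for illustration; they are not TIMSS data.
cycle_2007 = [
    {"weight": 1.0, "teacher": {"design_investigations": 2, "conduct_experiments": 2, "explain_findings": 3}},
    {"weight": 2.0, "teacher": {"design_investigations": 1, "conduct_experiments": 2, "explain_findings": None}},
]
cycle_2011 = [
    {"weight": 1.0, "teacher": {"design_investigations": 3, "conduct_experiments": 3, "explain_findings": 4}},
    {"weight": 2.0, "teacher": {"design_investigations": 2, "conduct_experiments": 3, "explain_findings": 3}},
]

# A positive shift means more reported inquiry-based practice in the later cycle.
shift = weighted_mean_score(cycle_2011) - weighted_mean_score(cycle_2007)
print(f"Shift in mean inquiry approach score: {shift:+.2f}")
```

Weighting by students rather than counting teachers matches the way the finding is phrased in the text: the estimate describes the classroom experience of students, not the teacher population.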

The analysis was subsequently replicated using TIMSS 2015 data when they became available, as part of MOE’s continual efforts to monitor the situation in classrooms. Findings from such analyses using TIMSS also form part of the knowledge base that MOE has built over time about inquiry-based learning, not just in science but across different subjects, and this knowledge base is tapped by the various parties involved in regular curricular reviews.

14.3.3 Bold Curricular and Pedagogical Shifts: “We Made Some Trade-Offs, What Did We Sacrifice?”

Policymaking is frequently about weighing trade-offs and choosing the best policy option available to maximize the positive impact and minimize the negative, based on the information that is available at the point when a decision has to be made. Subsequently, it is necessary to closely monitor the actual implementation of the policy option chosen to ensure that the predicted positive impact comes to fruition while the expected negative ones are minimized and mitigated where possible. Trustworthy data from various sources, both quantitative and qualitative in nature, are important to this ongoing post-implementation monitoring process. In our last example, we illustrate how MOE has used PIRLS and TIMSS data for this purpose.

Since the late 1990s, MOE has been progressively and systemically reducing the content of individual subjects across the grade levels to create space for greater emphasis on other learning outcomes important to students’ holistic development, including higher-order skills such as creativity, application, and reasoning skills (Gopinathan 2002; Poon et al. 2017). MOE embarked on this system-wide shift, which continues today, in part because of the recognition that, while a strong understanding of concepts in each subject domain remains important, being able to apply such understanding and knowledge to real-life situations, including novel ones, is increasingly important in a world where humans cannot out-compete machines in storing and retrieving voluminous amounts of information. As such, there is a need to ensure that students have enough opportunities to practice and develop these skills during their schooling years.

Effecting the shift requires deliberate and fundamental reviews of the curriculum of each subject across the grade levels, largely through a two-pronged approach. First, judicious cuts to the content materials (up to 30% of the original curriculum in some instances) have to be made to carve out adequate time and space from the precious curriculum hours each week to devote to developing skills beyond content learning. Second, teachers need to make pedagogical shifts in tandem in order to capitalize on the time and space available and design learning experiences that will foster higher-order thinking skills among their students. The move towards adopting an inquiry-based approach in learning science (described in Sect. 14.3.2) is an example of how that broader shift is manifested pedagogically in the specific subject of science.

A key policy question of interest to MOE is whether all the deliberate curricular cuts and pedagogical shifts at the system level have made a positive impact on students’ learning and development. Most importantly, are students short-changed in any way by the bold move of trying to teach them less (by way of content) so that they can learn more (by way of other increasingly important skills)? PIRLS and TIMSS data are useful external, stable benchmarks that provide some answers for the reading, mathematics, and science curricula, in ways that cannot be answered using MOE’s local examination data because these examinations change in tandem with changes to the curricula.

Results from TIMSS 2015 and PIRLS 2016 provide some assurance that the system-wide shifts are progressing in the right direction. The students continue to show a strong mastery of mathematics and science at grade 4 and grade 8, and of reading literacy at grade 4, performing well by international standards. More encouragingly, they have made steady improvements over the years, especially in terms of the higher-order thinking skills. In particular, Singapore students at both grades have demonstrated progress in their ability to apply and reason in both mathematics and science, as measured by two of the three cognitive domain scores in TIMSS, namely, “Applying” and “Reasoning.” This is notwithstanding the decline in “Knowing” scores between 2007 and 2015 observed in science for grade 4 students, showing there are some trade-offs at play. Similarly, results from PIRLS and ePIRLS 2016 show that Singapore’s grade 4 students are able to interpret and integrate ideas and information well, and evaluate textual elements and content to recognize how they exemplify the writer’s point of view. ePIRLS 2016 also provided the opportunity to assess students’ online reading skills for the first time on an internationally comparable scale. The results suggest that Singapore’s grade 4 students do well not only because they are able to transfer their reading comprehension skills in print to online reading but also because they are able to navigate non-linearly between different websites. These are important skills in an increasingly digitalized world, where information and knowledge reside across multiple online platforms.

These findings provide MOE with not only the confidence to continue with the system-wide curricular and pedagogical reforms but also concrete evidence to convince the different stakeholders that these reforms are on the right track, thereby garnering continued support for further reforms in the same direction.

14.4 Some Principles Underpinning MOE’s Use of Large-Scale Assessment Data

Reflecting on Singapore’s purpose for participating in international large-scale assessments (ILSAs) and on how MOE has used the data from these studies, we identify three general principles that guide such use.

First, insights from ILSAs are only one source, rather than the sole source, of information that feeds into the deliberation of policies and program design and development. This is because, despite the richness and robustness of the data that such international assessments provide, each study has its inherent limitations and can only support inferences within those technical limits. For example, due to the sampling design of PIRLS and TIMSS, the teacher respondents to the teacher questionnaires are not necessarily a representative sample of the teacher population; they are instead involved in the studies only because they taught the sampled students. As such, the data alone cannot support inferences that are directly generalizable to the teacher population. This means any deliberations that require such direct inferences about the teacher population have to involve other data sources, at a minimum as a form of triangulation for findings using data from the PIRLS and TIMSS teacher questionnaires. More importantly, ILSAs cannot be the sole source of information, because education policies and practices operate within a complex system for which no single data source, even when collected through a mixed-methods design, will be adequate in painting a full picture (Jacobson et al. 2019). As such, insights from multiple sources are necessary to provide sufficient information at the point when a decision has to be made.

Second, and related, although the international assessments provide useful insights that can be used to inform curricular reforms (especially PIRLS and TIMSS, which have direct links to the curriculum), MOE does not intentionally align the Singapore curriculum to the objectives in the assessment frameworks of the different international studies. Instead, the curriculum designers and developers are guided by what they deem important for students to learn during their schooling years, and not all of these curricular objectives will necessarily fit into what is agreed upon for the assessments by the international community.

Finally, to keep data from ILSAs useful, MOE takes deliberate steps to protect the data from being corrupted, which would render such material useless as an objective source of information. An important aspect of this is to ensure that the scores are not used for any accountability purposes with any stakes for individuals in the system, which may evoke undesirable behavioral responses leading to potential distortion of the assessment scores (e.g., Figlio 2006; Hamilton et al. 2007; Koretz 2017; Koretz and Hamilton 2006). MOE deliberately does not use results from the international studies to reward or sanction individual schools, teachers, students, or owners of specific policies/programs. Instead, the results are used only for system-level monitoring. Even when secondary analyses are done, the insights gained are used only for system-level decisions, in line with the inferences that data from the studies can properly support. More fundamentally, a separate research arm, situated within MOE but separate from policy and program owners, oversees all aspects of each international study, including all secondary analyses needed. Its staff have the technical expertise to derive usable insights from the international studies, and this arrangement also helps to ensure independent and responsible use of the datasets.

Adhering to these guiding principles, now and in the future, enables MOE to derive the greatest value from the ILSA data. International studies remain an important and valuable information source, complementing other sources of information that MOE regularly collects about different aspects of the education system, and contribute to the deliberations of policies and programs in MOE’s ongoing efforts to improve students’ schooling experiences.