Institutionalizing school failure: From abandoning to reintroducing a failing grade—the rationales behind Swedish grading reforms

The article aims to depict the political framing of three grading reforms in Swedish compulsory school, in terms of the political problem they are supposed to solve and what kind of attention is given to the lowest performing pupils. Discourse analysis is employed, focusing on statement producers. The empirical material consists of policy documents from the late 1930s to 2010. The analyses show that three cases of the same type of policy change, a new grading system, rely on very different problem representations. The changes were launched as an equality tool, an accountability measure and a remedy for declining results, respectively. The discourse about the least successful pupils differs. Reasonable demands and a ranking scale without a failing grade characterize the introduction of a norm-referenced system; a first criterion-referenced system rests on a belief that virtually all pupils will meet the formulated levels for passing, an expectation not met, and a changed focus behind the second criterion-referenced system normalizes that some pupils will fail compulsory school. The article also illustrates the merits of studying educational policy change through the theoretical lens of problem representations and directs attention to how reforms can have discursive effects as well as unintended side effects that matter substantially for some people.


Introduction
Grading systems can be viewed as a part of the politically established institutional framework for education (Salomonsen & Andersen, 2014). Grading is a generally accepted aspect of schooling, but can nevertheless have both short- and long-term adverse consequences for students, not least for low-performing students (Schneider & Hutt, 2014). Further, validity and reliability problems (Brookhart, 2013, 2015) and experiences of unfair grading are often pointed out (Alm & Colnerud, 2015). Grading is therefore a policy area where changes might significantly affect both "what we do" and "what we are" (Ball, 2015, p. 306). Calls to pay attention to side effects (Zhao, 2017) of educational reforms and to study their "overt and hidden effects" (Apple, 2018, p. 686) apply equally well to grading policy. The context of this article is Swedish compulsory school and major changes that have been made to its grading system. 1 The significance of educational policy is evident in the development of the Nordic social democratic welfare state, a key component of which was the introduction of a comprehensive compulsory school (Oftedal Telhaug et al., 2006). This attracted international interest in the Swedish school system in particular. More recent reforms, marked by neo-radical decentralization and neo-liberal competition and freedom of choice, have, in the light of increased international focus on educational output and declining Swedish results, led to reduced interest in Sweden as a role model (Imsen et al., 2017; Lundahl, 2016; Pettersson et al., 2017). These overall changes are well researched. Less attention has been given to specific reforms. A topic of much policy debate in some Nordic countries is grading (Lysne, 2006). In Sweden, a main reason for this is that teacher-assigned grades are of high-stakes character, since they are decisive in the competition for admission to further education (Lundahl et al., 2017).
The research aim of this article is to depict the political framing of three grading reforms in Swedish compulsory school, in terms of what political problems they are supposed to solve and what kind of attention is given to the lowest performing pupils in the political framing at different times. An understanding of policy change as an interplay between the formulation and solution of a problem informs the analysis, and a further ambition is to illustrate the merits of studying educational policy change through the theoretical lens of problem representations (Bacchi, 2009). Discourse analysis is employed, focusing on statement producers, and the empirical material consists of policy documents from the late 1930s to 2010. The investigated time period covers the introduction and abandonment of a norm-referenced grading system, and the implementation of two criterion-referenced scales. The overall research questions are: What political problems have different grading reforms for Swedish compulsory school aimed to solve and in what ways have the lowest performing pupils been taken into account in the problem representations that justified the reforms?
Apart from providing insights into the Swedish case, 2 the findings can give input to research about grading policies and to policy discussions in other educational contexts (cf. Anderson, 2018). Not only does grading have an impact on teaching and what is done in school, it also influences who pupils become in terms of achievement (Ball, 2015). The language used to define performances, in particular weak ones, is relevant to consider (cf. Wikström, 2006), as are requirements that affect prospects for further education. The analysis will also give attention to side effects (Zhao, 2017) and how institutional choices can shape future development (Peters, 1996; Waldow, 2014).

The case of Sweden
The Swedish centrally governed comprehensive nine-year compulsory school was long portrayed as one of the world's most progressive (Imsen et al., 2017; Oftedal Telhaug et al., 2006). In the 1990s, it was transformed into one of the most market-oriented school systems in the world, allowing different actors (including for-profit enterprises) to run schools in a publicly financed, voucher-like system. One peculiarity is that the ideal of a school system that promotes "equity, integration and social justice" was never abandoned (Bunar, 2010). This is still emphasized in policy documents and curricula. The 1990s also saw a decentralization process that transferred the political responsibility for primary and secondary education to municipalities (Lundahl et al., 2013). Over the last decade, the state has retaken a measure of control through school legislation, curricula changes, school inspections (Rönnberg, 2014), national testing (Lundahl & Waldow, 2009), teacher education, certification of teachers (Lilja, 2014) and appointment of "advanced teachers" (Bergh et al., 2019). The development has been characterized as taking "the logic of management by objectives a step further" (Imsen et al., 2017). An important institutional feature that has consistently been controlled by the national government is the grading system. In terms of governance, the grading reforms have not led to any change. However, the system and scale that schools and teachers must apply have been substantially altered several times. This article scrutinizes the political framing of the same type of political solution, a new grading policy, that has been adopted during different periods of reform of the basic educational system. Table 1 shows the main characteristics of the grading policies studied. The comprehensive nine-year compulsory school was launched with a norm-referenced numerical grading system. The mid-1990s saw a shift to a criterion-referenced

Research method and theory
Focusing on political framing, in terms of what problems reforms should solve, makes ideational contents relevant and discursive textual analysis applicable (Bergström & Boréus, 2017). The empirical material consists of policy documents.

Discourse analysis
The analysis follows an approach suggested by sociologist Reiner Keller (2012) who defines discourse analysis as a research programme resting on both theory and methodological strategies. In this approach, discourses are seen as concrete, material and observable in for example texts. The formation of discourses is described as "the fixing of collective symbolic orders through a more or less accurate repetition and stabilization of the same statements in singular utterances" (Keller, 2012, p. 60). Actors can relate to discourses either as statement producers (occupying a speaker position) or as addressees of the statement practice. This article focuses on statement production by mapping the political arguments behind grading reforms. However, certain addressees of the statement practice, namely low-achieving pupils, are also given some attention. Discourse analysis is an interpretative endeavour, and Keller proposes a number of main approaches that are possible to use separately or together in empirical research. This article can be described, in Keller's terminology, as making use of interpretative schemes. Theoretical input is derived from political science, specifically from Carol Lee Bacchi's approach to policy analysis. Bacchi provides a particular understanding of policy framing and offers guidelines for the analysis of policy texts. The interpretation is also informed by some theoretical perspectives on educational institutions.
Discourse analysis is a qualitative interpretative method, where discursive claims must rest on transparency and close readings of texts (Keller, 2012). This article makes use of an extensive body of empirical material consisting of policy documents, a genre that generally contains both conscious and unconscious ideas presented in a more or less persuasive manner (Goodin et al., 2009). The interpretative schemes provide conceptualization and structure for analysis. The policy documents were read, partly reread and interpreted several times. A chronological approach was initially used, and a descriptive report (about 170 pages), including a substantial number of quotes from the documents and initial analytical comments, was produced parallel to the reading. Previous empirical research about the grading reforms was also surveyed during this process (Andersson, 1991; Hyltegren, 2014; Marklund, 1980, 1983; Tholin, 2006; Wedman, 1983). The more detailed analytical work then included a return to key documents.
Theoretical interpretative scheme: What's the problem represented to be?
Bacchi's (2009) theory "what's the problem represented to be" (WPR) focuses on how political issues are framed, thought about and presented. Policies are viewed as giving shape to political problems, rather than addressing or reacting to them. Solutions can therefore not be separated from problem representations. In line with Keller's view of discourses, problem representations are perceived as existing "in the real" (Bacchi, 2009, p. 33). WPR analysis directs attention to context and complexity and to the importance of recognizing different and contradictory voices. Although there are competing representations of problems, governments are often fundamental statement producers. Bacchi labels her approach a policy-as-discourse theory, with a focus on meaning-making and conceptual framing in policy debates (Bacchi, 2000, p. 46). Bacchi suggests a set of guiding questions for conducting an empirical WPR analysis; these are summarized in Table 2. Bacchi's guiding questions and directions for analysis are used to scrutinize the problem representations in three Swedish grading reforms. The main focus is on statement production in official policy documents when framing the suggested changes. Also analysed is the attention given to the weakest pupils (addressees of some statements). In addition to distinguishing the characteristics, origins and emergence of problem representations, key dimensions also include assumptions and unproblematized aspects, for example, explicit expectations, issues not given attention, and complex matters presented in a shallow and simplified way. The effects of problem representations and policy changes are also of interest, in particular with regard to limits to what can be said and how changes might impact subjectification and responsibility. Power relations in the political process are given more limited consideration.
Marton (2006) defines educational institutions as organizational structures and bearers of normative orientations. In line with this understanding, the grading system can be thought of as organizing school performances on the basis of normative ideas (cf. Salomonsen & Andersen, 2014). These ideas can affect pupils' prospects for further education and who they "become" (Ball, 2015). Furthermore, institutional choices are important for future institutional development, since a path dependency is often created (Peters, 1996; Waldow, 2014).

Analysed material
To get a nuanced picture of problem representations, a broad range of texts is needed (Bacchi, 2009, p. 20). The empirical material consists of all relevant policy documents from government and parliament, ranging from the late 1930s to 2010. In total, the 28 documents amount to more than 4000 pages. However, in many of them, particularly the ones paving the way for the norm-referenced system, grading is discussed among other educational issues, and thus, only limited parts of the texts are devoted to grading.

Journal of Educational Change (2022) 23:221-252

Table 2 The what's-the-problem approach. Source: Bacchi (2009, pp. 2, 48)
Power: 5. How/where has the PR been produced, disseminated, defended? Could it be questioned, disrupted, replaced? Consider past and current challenges, discursive resources for re-problematization
In Sweden, government bills often result from government commissions of inquiry (GCI). These are temporary working groups directed by terms of reference. They can consist solely of experts or of political (and possibly interest group) representatives, or they can have a combination of both. Reports are presented in the series Official Reports of the Swedish Government (Statens Offentliga Utredningar, SOU). A number of GCIs and their SOU reports prepared two of the reforms. The most recent reform deviated from this standard procedure in that the most important groundwork was done by working groups within the Ministry of Education (Reports of the Ministries, Departementsserien, Ds). Government bills are presented to the parliament, where committees submit their opinions before debate and voting take place. Parliamentary documents are thus particularly important for identifying conflicting views. The problem representations presented in the article therefore rest on a close reading of several SOU and Ds reports, government bills, parliamentary committee reports and minutes from parliamentary debates. The lines of argument are depicted and illustrated by representative and carefully selected key statements from policy documents, both in the body text and in summarizing tables. 4 Previous research is also used for supplementary information and to underpin interpretations.

Findings-problem representations in three Swedish grading reforms
The grading reforms are all embedded in periods of educational change. The same choice of political solution, a new grading policy, is framed very differently in each case. The reforms are depicted here, one at a time, using the "what's the problem" framework. A comparison of the discourse about the weakest pupils concludes the section.

Expanding basic education-fair ranking with norm-referenced grading
The journey towards norm-referenced grading starts in the 1930s-1940s with two GCI 5 reports based on analyses of grade statistics, entrance examination results, teacher surveys, pilots of standardized tests and thoughtful reasoning. A main problem to solve in these is unequal access to secondary education. Grades are viewed as part of the solution. The inquiries examine whether primary school grades can be used to determine admission to secondary education instead of entrance examinations, which admit some pupils who later fail secondary education, and exclude others whom primary school teachers consider capable. Admission on the basis of primary school grades could particularly benefit gifted children from less advantaged homes (SOU 1938:29). It is therefore deemed desirable that primary school grades be "accepted as measurement of pupils' level of knowledge and aptitude for further studies" (SOU 1942:11, p. 8). However, statistics reveal that there are inconsistencies in how teachers grade pupils (SOU 1938:29, pp. 115-120). 6 The idea for a new grading system rests on an expected normal distribution of knowledge within the national population of each school year and on an emphasis on average performances. Standardized tests in some subjects are suggested to support grading. The grade scale used at the time includes two failing grades (BC and C). If the current scale is to be kept, it is recommended that only a small proportion of pupils (6% + 1%) be assigned these grades:
Regarding the appropriate frequency of failing grades in primary school, it seems natural that the education requirements in a school system for all children must be adapted to the average performance in such a way that the threshold for failure is not too close to "normal performance", and that one grade level below "normal performance" is still considered "approved" (SOU 1938:29, p. 127).
The later report goes a step further and recommends changing the labels for the lowest grades from "not (fully) approved/failed" to "weak/very weak" and proposes a numerical scale starting at 1-not at 0, which implies no knowledge (SOU 1942:11, p. 61). The weakest pupils are thus given particular consideration. However, when norm-referenced grading is introduced in primary school in the 1940s, the existing letter scale-including failing labels-is kept (Andersson, 1991, p. 12ff.).
Education remains an important issue on the political agenda. An expert-led GCI on basic education, appointed by the Second World War national unity government with a Social Democratic Party (SAP) prime minister, follows the already entered path regarding grading. Equity, fairness and ranking are highlighted, and in terms of passing, it is emphasized that most children should succeed in compulsory education and that "anxiety for failing grades" should "hardly play any role in school life" (SOU 1945:45, p. 27f.).
After the war, the SAP dominates Swedish politics for decades. Primary school is prolonged to 7-9 years, but the school system still has parallel tracks, with a more advanced secondary level available to some. The progress towards realizing the late nineteenth-century left-wing/liberal ideal of a comprehensive compulsory school for all children is slow, but some major features of it are presented in 1948 (Marklund, 1980; Richardson, 2010). In terms of grading, equity arguments again favour a norm-referenced system. In 1957, a political GCI (reflecting the parliamentary composition) with expert back-up is tasked with proposing a final model for a comprehensive compulsory school. Consensus is reached about the desirability of a unified school system, but the level of uniformity requires negotiation and compromise (Marklund, 1983). Grading is not given much attention in the comprehensive reports, but previous arguments for preferring a norm-referenced system are repeated. It is also argued that increased equity in grading has been achieved in primary school. The weakest performances are again given special attention. Upholding a sharp minimum passing level, where failing grades can force pupils to repeat a year, is deemed inappropriate:
In a school system that accepts comprehensive holding back and flunking of pupils who do not meet the knowledge requirements, the system appears more understandable. In a compulsory school, where you cannot place such high and uniform demands on all pupils, such a rigorous system is unsuitable and difficult to maintain (SOU 1961:30, p. 582).
Pupils' overall performances, abilities and needs should instead be considered individually. A numerical nine-grade scale is suggested to avoid association with the previous seven-grade system and in particular with the pass-fail distinction. The "principle of a 'minimum passing level' is abandoned" in favour of grades that are merely relative measures (SOU 1961:30, p. 589). Standardized tests in some subjects are to support the norm-referencing. It is suggested that selection for admission to upper-secondary school should rely on grade average, and caution is prescribed in attaching "too much importance to grades in individual subjects" (SOU 1961:30, p. 274).
The question of grading also takes up limited space in the voluminous government bill (Prop. 1962:54). It follows the GCI on this point. The grading system is part of solving the problem of unequal access to education. Equality goals frame the comprehensive school reform as a whole. The new school system should provide all children an equally good general education and in the long run make the path to higher education open to all. The minister of education states that the existing school system sorts pupils in a supposedly rational way, which still seems to "have the effect, that it one-sidedly disadvantages, contextually or otherwise, worse-off pupils, without having achieved demonstrable advantages for other pupils" (Prop. 1962:54, p. 264).
The comprehensive school is implemented gradually, with some adjustments made in 1969. Grading is handled by the National Board of Education (Skolöverstyrelsen). It makes one substantial change and implements a five-grade scale (Andersson, 1991; Marklund, 1983). The proportion of pupils expected to receive each grade is initially specified. 7 New guidelines in 1980 stipulate 3 as the average and most frequent grade, and that grades 2 and 4 should be more common than 1 and 5. Grading is also restricted to the last two years of compulsory school (Andersson, 1991, p. 39).
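The logic of norm-referencing can be illustrated with a minimal sketch. It assumes, purely for illustration, target shares of 7, 24, 38, 24 and 7 per cent for grades 1 to 5; these figures are an assumption chosen to be consistent with the 1980 guideline that 3 is the most frequent grade and that 2 and 4 are more common than 1 and 5, and are not quoted from the policy documents. The function name and the cohort data are hypothetical.

```python
from itertools import accumulate

# Assumed target shares for grades 1..5 (illustrative, not quoted from the sources).
TARGET_SHARES = {1: 0.07, 2: 0.24, 3: 0.38, 4: 0.24, 5: 0.07}


def norm_referenced_grades(scores):
    """Assign grades 1-5 by rank within the cohort, not by absolute level."""
    n = len(scores)
    # Cumulative upper bound of each grade band, as a fraction of the cohort.
    bounds = list(accumulate(TARGET_SHARES.values()))
    # Order pupils from lowest to highest score (ties keep their input order).
    order = sorted(range(n), key=lambda i: scores[i])
    grades = [0] * n
    for rank, idx in enumerate(order):
        position = (rank + 0.5) / n  # mid-rank position strictly between 0 and 1
        for grade, upper in zip(TARGET_SHARES, bounds):
            if position <= upper:
                grades[idx] = grade
                break
        else:
            # Guard against floating-point rounding in the last cumulative bound.
            grades[idx] = 5
    return grades


# Hypothetical cohort of 100 pupils with distinct test scores.
cohort = list(range(100))
grades = norm_referenced_grades(cohort)
counts = {g: grades.count(g) for g in TARGET_SHARES}
```

Because only rank matters in this sketch, raising every score by the same amount leaves all grades unchanged, and a fixed share of pupils receives the lowest grade regardless of what they actually know.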
Throughout this long political process, grading is embedded in a problem representation that considers unequal access to education unacceptable. Table 3 summarizes different dimensions of this problem representation, including additional illustrative statements from the policy documents.

Table 3
The specified percentage of 7% should be understood as a "normal value", which should not be exceeded […] [in] the subjects in which the largest number of low grades are expected, while for most subjects a lower percentage can be assumed (SOU 1942:11, p. 57)
• Competition for the highest grades might occur
Effects
• Discursive: Equalizing ambitions not questioned
Norm-referenced grading, which means that grades are relative measures, not absolute, and the abolition of the pass/fail distinction are almost unanimously endorsed by the consultative bodies (Prop. 1962:54, p. 251)
• Subjectification: (Gifted) pupils from low-income homes are disadvantaged by the system, changing the system can help
• Lived: Some pupils always "worst"
Power
SAP dominance, long-term political agreement, little political controversy about grading

Emphasising accountability-measuring results with a criterion-referenced grading system
Norm-referenced grading is soon criticized for allegedly giving no information about knowledge, 8 preventing cooperation due to competition for the highest grades, and some teachers wrongfully using classes, not the national population of pupils, as points of reference (Andersson, 1991; SOU 1977:9). The critique leads in two directions: one in favour of abolishing grades and one advocating for another system. Grading thus becomes the object of renewed inquiry in the 1970s (Marklund, 1983; Wedman, 1983). Criterion-referenced grading had been dismissed in the 1940s with reference to the difficulty of specifying quality standards (SOU 1942:11). Yet, despite the objections that were previously put forward regarding criterion-referenced grading, the idea attracts renewed interest. In 1972, however, a programme aiming to establish learning objectives for all subjects is cancelled due to conflicts and criticism that it constrains teachers (Andersson, 1991; Marklund, 1983). Some years later, a kind of criterion-referenced system is nevertheless proposed by a GCI. However, challenges are highlighted, like the time-consuming task of determining assessment criteria, potential problems with equivalence, and restricted teacher autonomy (SOU 1977:9). For upper-secondary school, criterion-referenced grading with three passing levels (with the mid/average grade specified) is suggested. Only pupils who cannot be graded due to absence are to be considered failed. For compulsory school, abolition of grades in favour of personal written evaluations is proposed (SOU 1977:9). There is political disagreement within the commission, and political power shifts characterize the period. The left is drawn to limited/no grading, and some right-wing parties argue for more grading and a new system. Hence, the proposal of a criterion-referenced system fails to gain enough political support. The norm-referenced system is kept in both upper-secondary and compulsory school.
The main reasons cited for the scepticism are that criterion-referenced grading will not solve any general problems and might limit teacher autonomy (Prop. 1978/79:180).
In the late 1980s, grading reappears on the political agenda. The critique of norm-referenced grading joins with new ideals of public governance, shifting the problem representation and argumentation somewhat, in favour of a grading reform. At the time, the "strong state" is criticized from both left and right, and Swedish politics undergoes substantial change (Lindvall & Rothstein, 2006). This is very visible in the educational sector. With broad political unity, the school system is decentralized and placed within a framework of management by objectives (MBO). In 1989, an SAP minority government appoints an expert group to consider grading. Their report repeats previous concerns about criterion-referenced grading, but still recommends it. A minimum passing ("approved") 9 level, possible for all pupils to achieve, is intended to serve as certification of having successfully completed compulsory school. The expert group is not unanimous, however, and the report includes different proposals and dissenting opinions (Ds 1990:60).
The task of further developing these ideas is given to a political GCI later the same year. This makes the problem representation more consistent and brings results measures to the table. The terms of reference underline that norm-referenced grading shall be replaced by a criterion-referenced system with a minimum passing level (SOU 1992:86). After the 1991 elections, a centre-right minority government comes to power and lets the commission proceed with slight modifications of the terms of reference. The time pressure is tangible, and the deadline is extended twice. To provide an empirical basis, a study called The Teacher Assignment (Läraruppdraget) is initiated. 10 Five groups with 6-7 persons each, representing different subjects and educational levels, gather to specify the core of their subjects and outline some "clearly defined qualitative knowledge levels", including a minimum passing level (SOU 1992:59, p. 7). These are then tested in participating teachers' schools. The report blends empirical descriptions with argumentation based on personal experience and conviction. 11 It is concluded that the tests show that "knowledge-referenced grading" (the term used) works; it is possible to "specify and openly communicate what is required for a certain grade" and to "determine whether a pupil has mastered the knowledge and/or skills required" (SOU 1992:59, p. 152f.). Criterion-referenced grading is also argued to have pedagogical merits; when pupils know what is expected of them, they perform better. Moreover, knowledge-referenced grading is viewed as a suitable measure of result accountability. The system communicates a "knowledge content" that teachers and pupils can strive for, and that can be used for "evaluation of schools" (SOU 1992:86, p. 45).
Resting heavily on The Teacher Assignment, the GCI also concludes that it is possible to implement a criterion-referenced grading system. Six grades are suggested, one of which is a failing grade. Suggested grade labels are: "not yet approved, approved, highly approved, very highly approved, outstanding and excellent". 12 The norm-referenced system is dismissed with reference to pupils', teachers' and parents' antipathy, and because it gives no information about knowledge. It is also argued that the idea of fair ranking does not work (SOU 1992:86).
Regarding the central distinction between pass and fail, a general boundary is proposed, relating to what is required for everyday/professional life and further studies. Pass is "the level needed to understand, function and act in our society", and an explicit assumption is that "all pupils at the end of grade 9 should be able to achieve the grade 'approved' in all subjects" (SOU 1992:86, p. 67). Potential failure is foremost seen as a signal to schools and teachers to increase their efforts. The new system is thereby anticipated to be particularly advantageous for the weakest pupils:
In a school system with clear and transparent criteria for what is required and with a strong emphasis on accountability, it should be possible to gradually reduce the number of pupils who do not reach the minimum passing level. This should in particular benefit pupils with weak support at home for their schoolwork. In today's school system, these pupils have often never understood what was required of them. (SOU 1992:86, p. 93)
No empirical support is presented for this conclusion. Both The Teacher Assignment and the GCI report include this kind of shallow argumentation lacking nuance, reference to research, empirical evidence, or other foundations. 13 The GCI presents possible assessment criteria for a few subjects, but also emphasizes that teachers must be involved in elaborating final criteria during a long try-out period. 14 The ranking function of grades is toned down. Compulsory school will mainly guarantee a basic level of knowledge:
The requirement of passing, which in principle all pupils should be able to achieve, as well as the emphasis on the school's responsibility to bring all students to an acceptable level of knowledge, are the basis for admission to the national upper-secondary study programmes. (SOU 1992:86, p. 90)
Three subjects, English, mathematics and Swedish, are ascribed special importance, and to avoid "blind alleys", admission can be allowed without passing subjects less relevant to a study programme (SOU 1992:86, p. 91).

9 In this paper, the terms "pass"/"passing" and "approved" are used interchangeably as synonyms. The Swedish term godkänd literally means "approved", and in some contexts, the literal translation better conveys the reasoning behind proposals.
10 The work is led by Bo Sundblad, Stockholm College of Education, and Per Måhl, teacher and debater, who has published a book supporting criterion-referenced grading (Måhl, 1991).
11 A more extensive supplementary interview study gives a more nuanced picture. However, this seems to have been disregarded by the GCI.
12 Grade labels were suggested to be the initial letters of descriptive words; G, as in Godkänd-approved, F, Framstående-excellent, etc. These letters would not indicate which grade is better.
The GCI displays almost total political agreement, with only one clearly dissenting view expressed by the representative from the marginalized right-wing populist party. 15 The subsequent consultation process reveals that most stakeholders support abandoning norm-referenced grading in favour of a criterion-referenced system. At the same time, many are critical of the suggested model, which is perceived as unclear and far from ready (Andersson, 1991, p. 51ff.; Prop. 1992/93:220, Appendix 4).
The centre-right coalition government nevertheless prepares a bill for a grading reform but makes some changes to the proposed model. A six-grade letter scale is suggested (A-F, with F as a failing grade), where assessment criteria are to be determined for three levels. Furthermore, the minimum passing level (grade E) is no longer expressed in terms of ability to function in society, but is simply described as "the level that virtually all grade 9 pupils should reach", C is equivalent to a "good performance" usually reached by many pupils, and A is "excellent performance of highest quality", which only a few pupils are expected to achieve (Prop. 1992/93, Appendix B, p. 117). The grades D/B are to be given when pupils are clearly above E/C, respectively, but have not reached the next level. Sketchy examples of criteria for some subjects are included to illustrate how they can be formulated without risking teacher autonomy. Concern is indicated about putting the bar for grade E too low or too high, with most attention given to the former. The government bill underlines that compulsory school must ensure that every pupil leaves "with an acceptable level of knowledge and competence" (Prop. 1992/93, p. 85), and that pupils in need of extra help are supported. A step-wise implementation is suggested. Some details of the bill are criticized in parliament, but it is passed in December 1993. The Left Party votes against it, proposing a compulsory school without grades. SAP supports the shift to a criterion-referenced grading system, but prefers a three-grade scale with descriptive labels (G, as in Godkänd-"approved" etc.) and an "X" for pupils who do not achieve the minimum passing level (Committee of Education 1993/94 UbU:01; Parliamentary minutes 1993/94:43).
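The structure of the proposed A-F scale, with criteria set at three levels and intermediate grades in between, can be sketched as a simple decision rule. This is an illustrative reading of the bill as summarized above, not an official procedure: the function name is hypothetical, the levels are assumed to be cumulative (meeting A implies meeting C and E), and the judgement of being "clearly above" a level is reduced to a single boolean, which the source does not specify.

```python
def criterion_referenced_grade(meets_e, meets_c, meets_a, clearly_above=False):
    """Map criterion judgements to a letter grade on the proposed A-F scale.

    meets_e, meets_c, meets_a: whether the criteria for E, C and A are met
    (assumed cumulative). clearly_above: hypothetical flag for a performance
    clearly above the highest level met without reaching the next one.
    """
    if not meets_e:
        return "F"  # minimum passing level not reached
    if meets_a:
        return "A"
    if meets_c:
        return "B" if clearly_above else "C"
    return "D" if clearly_above else "E"
```

Unlike a norm-referenced scale, nothing in this rule limits how many pupils can receive a given grade; in principle every pupil in a cohort could reach E or higher.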
After the elections in September 1994, a new SAP minority government comes to power. Only a month later, a new bill on grading is presented, also suggesting a change to a criterion-referenced system but with a different scale than in the centre-right government's proposition a few months earlier. The basic arguments for the introduction of a criterion-referenced system are still the same: The most important responsibility of the school system is to ensure that all pupils can leave compulsory school with knowledge that reaches the minimum levels specified in the curricula for the different subjects (Prop. 1994/95:85, p. 5).
According to SAP's bill, no failing grades should be assigned in compulsory school. The bill thereby underlines accountability and that failure is ascribed to schools, not pupils. A firm belief in the potential benefits of support for weaker pupils is evident, and it is expected to be insufficient in only "a few cases". MPs from the former government prefer their previously accepted proposal, but SAP's bill is passed in parliament, thanks to support from the Left and Green parties (Committee of Education 1994/95:UbU06; Parliamentary minutes 1994/95:45).
Although political differences are obviously at hand, agreement prevails on fundamental aspects throughout the process. The problem representation, in other words, is basically the same among all dominating political parties; it is deemed desirable to abandon norm-referenced grading in favour of a criterion-referenced system with a sharp minimum passing level able to be used as a measure of school results. The GCI behind the main features of the bill has also been portrayed as more unanimous than many previous educational commissions (Tholin, 2006, p. 81), and a strong faith in criterion-referenced grading was generally evident among politicians (Hyltegren, 2014). The similarities between the two government bills are also greater than the differences, which basically concern the grading scale and whether an explicit failing grade is appropriate in compulsory school. It is worth noting that three different scales are proposed during the process (four, if we include the 1990 expert group). All the scales are also developed more or less independently of the new curricula/learning objectives that are simultaneously being worked on by a separate GCI and the Swedish National Agency of Education (SNAE). Table 4 provides an analytical overview of the path towards a criterion-referenced grading system, including additional illustrative statements from the policy documents. The problem representation dismisses the norm-referenced system as flawed and launches a criterion-referenced system viewed as pedagogically more advantageous and, most importantly, able to be used to measure school results and hold schools accountable for them.

Table 4
• Accountability measure needed to safeguard equivalence
• Abandoning norm-referenced grading more important than thoroughly preparing the alternative
The point of departure is that at the end of year 9, all pupils will be able to reach a passing grade in all subjects. The commission has therefore considered the level of competence needed for everyone in today's society
The most important responsibility of school is to ensure that by the end of year 9 every pupil can leave compulsory school with a level of knowledge that meets the requirements specified in curricula (Prop. 1994/95:85, p. 5, SAP government)
• Subjectification: The pass-fail distinction will make some pupils "unapproved"
• Lived: Fail grades close door to upper-secondary school + schools/teachers held responsible
Power: Political power shifts, agreement on key aspects, political struggle over details

Combating a school crisis: more grading with a new criterion-referenced scale
In 1998, the first pupils leave compulsory school with criterion-referenced grades. From the start, more than 20 per cent fail to pass one or more subjects. 16 This is instantly framed by the school authority as due to shortcomings in schools and teaching (Skolverket, 2001). As intended, grades become a results measure. The large non-passing group, and signs of equivalence problems between schools, contribute to undermining the previously strong confidence in teachers' professionalism, and divert attention from systemic aspects of the 1990s reforms, like decentralization, management by objectives and marketization (Mickwitz, 2015).
In 2006, a centre-right majority coalition government comes to power. Education is a profile issue for one of the governing parties in particular, the Liberal Party. Several reforms are prepared and launched at a rapid pace. The problem representation is informed by bad school results in terms of (failing) grades and declining results in large-scale international assessments like PISA (see e.g. Allians för Sverige, 2006). That the grading scale will be subject to reform has already been made clear in the pre-election debate. Grades play a dual role in the discourse: they indicate (bad) school results, but are also part of the solution to the problems. At this time, a GCI appointed by the previous SAP government is already reviewing learning objectives and assessment criteria, with a focus on increasing clarity and improving educational quality. The new government lets the inquiry complete its work. 17 Although the grading system as such is not the centre of attention, the report contributes to the problem representation by referring to how the implementation of a criterion-referenced system has made "knowledge deficits" visible (SOU 2007:28, p. 12f.). Otherwise, the main message is that curricular changes are needed and that the national grading criteria must and can be made clearer. School/teacher accountability is further underlined, though in a suggestion to replace the expression "goals to achieve/ strive for" with "requirements for acceptable knowledge", which more clearly places demands on pupils (SOU 2007:28, p. 220).
In March 2007, the government appoints a special "working group" within the Ministry of Education to propose a new criterion-referenced grading scale with more grade levels (Ds 2008:13). After managing to extend the deadline, the group presents a report after 11 months (Ds 2008:13). The working group underlines that it is merely suggesting a new scale, unrelated to any particular vision of knowledge or explicit grading criteria. However, the argumentation clearly supports a criterion-referenced system. The 1990s grading reform is described as having been welcomed by teachers, though inadequately prepared. The main advantage of the system is said to be the introduction of a minimum passing level, which contributes to "identifying pupils in need of help" (Ds 2008:13, p. 29). After a cursory international overview, a six-grade scale is presented: A-F. Five are passing grades, while F indicates "unapproved results". Grading criteria are to be formulated for A, C and E, and, as in the current system, all grades will be given numerical values (used to calculate "qualification scores", which decide pupils' rankings in competition for admission to secondary schools; see Table 1). 18 Results enhancement is central to the problem representation. The working group argues that more levels will encourage pupils to work harder, since the next grade will be within closer reach. However, this argument is not applicable to the lowest passing grade, which will remain unchanged. Both a figure and the text in the report clearly underline that the minimum passing level as such will not be within closer reach for pupils. When a grade cannot be assigned due to absence (expected to occur rarely), this will be marked by a dash (-). The proposed differentiation between F and dash is viewed as necessary because "for pupils striving to achieve the goals, it is important to receive confirmation of this" (Ds 2008:13, p. 62).
Receiving an F instead of (as was the case at the time) no grade at all, is represented as an advantage for pupils who are struggling to pass. This shift can be interpreted as framing the explicit failing grade (F) as an expected normal outcome for some pupils. In the previous reform, the anticipated "rare cases" were not pupils impossible to grade due to low attendance, but the whole group of pupils who did not reach the minimum passing level (for whatever reason).
In November 2008, the government presents a bill repeating the working group's main arguments and problem representation on how a new scale can contribute to enhancing educational results. With regard to the lower end of the scale, the bill confirms that the minimum passing level will remain unchanged, and that the exceptional case will be when grades cannot be assigned due to non-attendance. Again, the failing grade F emerges as a more or less normal grade. Furthermore, revised curricula and clearer assessment criteria are presented as a remedy for equivalence problems, and SNAE will be given the task of developing new criteria for the grades A, C and E in all subjects (Prop. 2008/09:66).
The parliament passes the bill in February 2009. The government's parliamentary majority makes it possible to disregard some objections. SAP also supports key elements of the bill (Committee of Education 2008/09:UbU5; Parliamentary minutes 2008/09:63). The minimum passing level is given some attention in the parliamentary debate. However, it is done in a way that never questions or problematizes the level as such. Instead the emphasis is on schools' responsibility to make sure that all pupils pass. A clear example is a speech from a Centre Party member, who represents the government position: The situation now is that 25 per cent of the country's pupils leave compulsory school without sufficient knowledge. How can you be so irresponsible as not to address the problem earlier? […] It is being done now, through both early interventions and good evaluation.
[…] Grades, together with pupil-parent-teacher meetings, written assessments and national testing are tools for evaluating, helping and supporting pupils in their ongoing schoolwork, to be able to give the right support in time, something that apparently has not been done before. (Parliamentary minutes 2008/09:63)

A Left Party member stresses the need for more funding, and asks how an increased number of grading levels can improve pupils' knowledge, but nothing is said regarding the minimum passing level. The absence of discussion about the minimum passing level as such is not very surprising, given the current PISA crisis (Ringarp, 2016). It is not politically viable to suggest that the minimum passing levels be (re)considered. The fact that the suggested A-F scale is identical to the centre-right government bill originally passed and then replaced in 1994, is not mentioned in the documents. 19 Brief reference is made to only two existing models: the ECTS scale and the Danish grading system. The recycled grading scale is, however, associated with more specific assessment criteria than were suggested in 1994. Another difference is that teacher autonomy is given little attention in the discourse.
In late 2009, a new working group is appointed at the Ministry of Education to prepare guidelines for earlier grading (year 6 20). Their report is presented after five months. The high rate of non-passing in compulsory school, declining results in large-scale international assessments, along with surveys showing "a relatively clear opinion in favour of earlier grading", make up the problem representation that justifies a (re-)introduction of grades from year 6. This is launched as part of more systematic assessment of pupils' knowledge development, directed towards early follow-up and support. It is argued that earlier grading makes struggling pupils more "visible". The report reveals that SNAE is already preparing "knowledge requirements" for grade 6. The grade-6 passing level as such is not given any attention, which is particularly interesting because, in contrast to grade 9, it needs to be formulated without a predecessor. Some rather superficial and partly unclear references to research are made in the report, e.g. concerning positive effects of assessment and feedback from teachers, but these are not linked to the proposal itself (Ds 2010:15).
A government bill is presented in September 2010. The problem representation explicitly refers to declining results in international assessments and insufficient goal achievement in Swedish schools. Early efforts are presented as a remedy, with grading from year 6 being one of the measures taken. Reference is also made to a newly published research report said to show that the abolishment of grades in primary and secondary school in the 1970s had negative long-term effects, especially for children from homes that do not encourage studying. The report gives a more nuanced picture of limited group differences (Sjögren, 2010). The government argues that with proper training and preparation of teachers, grading from year 6 can be introduced in 2012 (Prop. 2009/10:219).
Shortly after the bill is presented, Sweden holds elections. The centre-right government retains power, but now in a minority position. With support from the Sweden Democrats, 21 the parliament still passes the bill. An additional statement is made to please SAP: teachers must be granted support and guidance for the grades B and D, and an evaluation must be conducted after two years (Committee of Education 2010/11:UbU3). The parliamentary debate is characterized by political positioning over rather minor issues, such as whether grading from year 6 or 7 is preferable. Calls for broad political settlements are repeated by several MPs, and SAP's willingness to compromise on this particular issue is praised by the governing parties. Accountability is otherwise their main line of argument; grades serve as a results measure, useful both for identifying problems and ensuring support for children who need it. No attention is given to the minimum passing level (Parliamentary minutes 2010/11:38). The bill is accepted just before Christmas 2010, meaning that both the new scale and earlier grading will be implemented from autumn 2011. Table 5 presents the problem representation surrounding the switch to a new criterion-referenced scale, including additional illustrative statements from policy documents. Tackling the school crisis and improving results are central to the framing.
Ahead of the 2014 election, the government also proposes grades from year 4. 22 The left-green minority coalition government that comes to power is critical of the proposal, but faces a parliament majority in favour. It therefore proposes pilot testing, which is currently being conducted (Prop. 2016/17:46).

Different problem representations
The analyses show that the same type of policy change (a new grading system or scale) relies on very different problem representations at different times. The norm-referenced grading system is launched as an equality tool. If the grading system can be made reliable, grades can be used to rank applicants for further education, which will particularly benefit gifted children from less advantageous socio-economic conditions. Confidence in social engineering is high, and education is a central building block of a social democratic welfare state that is taking form over the course of decades.

Table 5
• Reformed criterion-referenced system can enhance results:
− Better identification of pupils in need of support
The reforms and efforts that the government has undertaken or intends to undertake aim at raising the quality of education and increasing students' knowledge (Prop. 2009 (Prop. 2008/09: 66, p. 14f.).
• Relationship between increased grading and improved results superficially considered
• Teacher autonomy of minor interest
Power: Liberal Party dominates school debate, intense reform tempo, limited political negotiation
In the transition to a criterion-referenced system with a sharp pass-fail distinction in the 1990s, the new grading system provides an accountability measure. This is called for at a time when criticism of the welfare state from both left and right has led to a decentralization and marketization of the Swedish school system, as well as to the introduction of management by objectives in the educational sector. A limited and superficial empirical study, claiming that it is possible to decide on achievement levels and assessment criteria that teachers will find reasonable and will interpret fairly equally-and which most pupils will reach-along with political consensus about key elements, paves the way for the reform.
During the most recent reform, around 2010, a new criterion-referenced scale is established as a remedy for declining school results. The main line of argument is that earlier grading and more grading steps, including an explicit failing grade, will make problems visible earlier. This will help schools/teachers (and students), as well as motivate them to work harder. Several educational reforms are implemented in a short period of time, initiated by a high-profile minister of education in a centre-right government. Declining results in international comparisons such as PISA set limits to what is politically viable to discuss.

Different considerations of the weakest pupils
The discourse with regard to the least successful pupils also differs between the reforms. Placing reasonable demands on children and fair ranking are emphasized when introducing norm-referenced grading in compulsory school. To distance the system from the sharp minimum pass level of earlier systems, both the scale and number of grade steps are changed. 23 The fact that some pupils will always score relatively lowest is not given attention. Focusing on average grades does, however, allow higher grades in some subjects to compensate for lower grades in others. The numerical scale and the ranking function cause weak achievement to be referred to as low or bad grades.
Reasonable demands are also emphasized in the introduction of the first criterion-referenced system. It is firmly believed, however, that virtually all pupils will achieve the minimum pass levels formulated. To emphasize the responsibility of schools and teachers, the few who are expected not to pass do not receive grades. Still, the new grading terminology, with "approved" as the key concept, makes the no-grade position equivalent to non-approved or failed. Contrary to expectations, this group becomes quite large (Arensmeier, 2019;2020), which has decisive consequences for admission to regular upper-secondary school. Even though schools/teachers are assigned the responsibility, the introduction of a criterion-referenced grading system immediately singles out a substantial proportion of compulsory school pupils as "failed", which was not the case in the previous norm-referenced system.

Given Sweden's declining results in international large-scale assessments, and the attention given to the large proportion of pupils who finish compulsory school without passing one or more subjects, it is not surprising that the existing minimum passing level is reaffirmed rather than reconsidered when the second criterion-referenced system model is proposed in 2008. It is not politically viable to reconsider the demands or to propose a different grading logic or terminology. 24 Instead, it is implied in the discourse that schools and teachers are not doing enough for the weakest pupils. Another shift can be noted, namely that "the few" now refers to those who cannot be graded due to absence, not to the group of pupils who will fail to pass. The explicit failing grade (F) thereby emerges as a normal grade. It is anticipated, however, that more pupils than before will pass, thanks to the changes in the grading scale, among other things. Statistics show that this is not the case; the proportion of pupils without a passing grade remains stable in all subjects (Arensmeier, 2019;2020).
25 Even though schools and teachers are held responsible, an F also effectively defines pupils as having failed. Not passing the right subjects can also still hinder admission to upper-secondary school. Unfavourable assessments of this kind are thereby much more than just bad grades; they become "a basis for exclusion from valued education" (O'Neill, 2013, p. 5).
Taken together, the changes to the grading system have had an immense impact on the lowest performing pupils, not least through the transformed grading logic and change of grading language, from "low/bad grades" to "non-approved or fail(ed)". Furthermore, strong performances in some subjects can no longer compensate for weaker performances in others. These features were deliberately abandoned when norm-referenced grading was introduced in the comprehensive compulsory school in the mid-twentieth century. The benefits of minimum quality grades, thought to be possible for virtually all pupils to achieve, dominated the problem representation of the 1990s. Potential failures on the part of professional teachers received little or no attention, because of high public confidence in the teacher corps (Mickwitz, 2015). The effect the new system would have on the weakest pupils was thus not anticipated. On the contrary, criterion-referenced grading was argued to be particularly favourable for pupils who had previously not understood what was expected of them in school. School and teacher accountability was thought to work in the same direction. The large proportion of "non-approved" pupils can therefore be described as a surprising unintended outcome. 26 Still, the problem representation of the results crisis some 15 years later did not allow for any questioning of criterion-referenced grading or of existing minimum passing levels. Instead, the system as such was praised for making shortcomings visible, and the existing minimum passing levels were reaffirmed. It was also argued to be important to distinguish between pupils who were absent and those who at least made an effort. This normalized the idea that some pupils will fail compulsory school. More pressure was, however, put on schools and teachers to improve their efforts, at least to reduce the number. Table 6 shows the main features of the problem representations, the attention paid to the weakest pupils, and the effects the reforms have had on them.

24 Attention to the rapid policy process in previous grading reform, power shifts, visionary beliefs in teacher professionalism, or the hasty implementation, could have opened up for re-considering the levels from the 1990s. That this was not done is a further sign of the strength of dominating problem representation.
25 In 2013, 23.0% of grade 9 pupils get an F or dash in one or more subjects. A slight increase can thereafter be noted.
26 It is also evident that the reform rested on insufficient empirical testing.

Table 6
Problem representations, consideration of, and effects for, weakest pupils

Reform: Norm-referenced (1-5)
What's the problem: Inequality; unequal access to education → New grading system as an equality tool
Attention to and effects for weakest pupils:
• No one should fail compulsory school, lowest grade a relative measure
• Grade average for admission (compensating performances possible)

Reform: Criterion-referenced (G-MVG)
What's the problem: Accountability; result measure required → New grading system as an accountability measure
Attention to and effects for weakest pupils:
• Virtually every pupil expected to pass the new "approved" level
• No grade if pass not reached; can close door to regular secondary education

Reform: Criterion-referenced (A-F)
What's the problem: Result crisis; result enhancing tools needed → New grading scale as a remedy for declining school results
Attention to and effects for weakest pupils:
• At least "making an effort" acknowledged by an explicit fail grade (F)
• Fail grades can close door to regular secondary education
The findings show that the same type of political solution, a new grading system, rested on different problem representations at different times. The kind of attention given to the weakest students also differed substantially between the framings, as did the effects. The changed grading logic and language introduced in the 1990s organized and described compulsory school performances in a new way, which especially affected weaker pupils who did not meet the minimum requirements for passing. The problem representation paving the way for the reform did not foresee the magnitude of this, and this consequence was indeed unintended. However, path dependency (Peters, 1996) and a modified problem representation during the most recent reform ruled out addressing this aspect. The grading logic with a specified minimum pass level was instead confirmed, and school failure in Swedish compulsory school was institutionalized in the form of a new explicit failing grade.

Implications
This close analysis of Swedish grading policy over time illustrates the importance of studying the policy level in order to understand educational change. The theory of problem representation (Bacchi, 2009) has proven to be a helpful tool, by bringing attention to the power of problem-framing, underlying assumptions and unproblematized aspects in the policy discourse.
Further, the studied case can serve as an example of the need to consider side effects in educational research (Zhao, 2017). This may be particularly important for issues like grading, which on the one hand is an accepted (Schneider & Hutt, 2014) and institutionalized (Salomonsen & Andersen, 2014) aspect of schooling, but on the other hand is an activity that is constantly criticized for shortcomings in practical execution (Anderson, 2018;Brookhart, 2013, 2015). As intended, criterion-referenced grading in Swedish compulsory school has turned into a results measure. The expectation that this would also make school a better place for low-performing children has, after more than 20 years, not been met. On the contrary, the pattern of a substantial proportion of failing pupils remains, and is strikingly stable (Arensmeier, 2019;2020). Both schools and pupils have, due to the changes in grading logic and language, become failures (cf. Ball, 2015;Tiana, 2008;Wikström, 2006).
The case also clearly illustrates the power of discourse. There is a big difference between leaving compulsory school with low or bad grades and having your school performances labelled "non-approved". In addition, in the most recent reform in particular, it is evident how a dominating problem representation can set the limits for policy discussions. No discussion of the passing levels as such, or about the appropriateness of a sharp pass-fail logic, was possible.
The Swedish policy debate has so far been almost incapable of approaching these side effects and discursive consequences. The proportion of pupils who fail compulsory school, equivalence problems (Skolverket, 2019) and signs of grade inflation, especially in private schools (Vlachos, 2018), have instead dominated and contributed to a questioning of teacher professionalism (Mickwitz, 2015). Requests for clearer assessment criteria are also heard (see e.g. Måhl, 2014). The limited political debate about the logic as such is in line with the notion of accountability as an idea "relatively immune to political ideology", which might at times shape "what is politically 'possible'" (Biesta, 2004, p. 234). Some recent attention is, however, given to the minimum passing levels. For example, a study of the grade 6 criteria finds that the minimum passing levels require cognitive abilities that some pupils lack, and for which extra support, often given for years, cannot compensate (Lindblad et al., 2018). Some voices supporting a re-introduction of the norm-referenced systems are also heard (see e.g. Marteus, 2017), but so far not within the political sphere.
The unforeseen negative consequences that the introduction of criterion-referenced grading has had in Sweden do not of course refute justified criticism of norm-referenced grading. All grading systems have advantages and problems, and this is certainly also true for norm-referenced practices. The analysis does, however, highlight how reforms of educational institutions can matter substantially for people, sometimes in ways that completely deviate from what was intended. In this case, the criterion-referenced systems introduced cut-off points and labels which turned the weakest children into failing pupils in a way very different from the previous grading logic in the Swedish compulsory school. To be able to discuss effects such as these, changes are needed in the framing and representation of the problem.