Introduction

Language Assessment Literacy (LAL) involves understanding and applying sound assessment practices for language learning evaluation in various contexts (Stiggins, 1991). Research has extensively examined the factors that improve teachers’ LAL, revealing the interplay between contextual and experiential factors in the evolution of LAL (Yan et al., 2018). China’s Standards of English Language Ability (CSE) serve as a case study of these influences within China’s unique language assessment landscape.

In 2018, China’s Ministry of Education and National Language Commission introduced the CSE, the nation’s inaugural national English proficiency framework. The release of the CSE aimed to standardize English proficiency measures, guide teaching and assessment, and ensure international compatibility of test scores (Liu, 2015). The initiative formed part of broader reforms to optimize examination and enrollment processes (Lin, 2016). This major policy shift motivated many EFL educators, especially those at Chinese universities, to adopt and research CSE-oriented assessment practices. A search of CNKI, China’s comprehensive academic journal full-text database, found that 554 articles dedicated to the CSE were published from 2014 to April 2024. However, research on the CSE peaked in 2019 and has waned gradually since then (see Fig. 1), indicating a potential decline in its appeal among Chinese scholars.

Fig. 1

Trend of articles counted by years (CNKI, 2024). A search on the subject of “China’s Standards of English Language Ability” resulted in the retrieval of 554 articles from the CNKI database, as of April 28, 2024

This period of declining interest has presented an important opportunity for educators to reflect on prior practices and adapt their assessment methods to the CSE standards. This has led to proficiency in understanding and applying these standards to language assessment, known as CSE assessment literacy (Pan, 2020). However, the impact of this literacy on overall LAL is yet to be investigated. It is crucial to explore whether CSE assessment literacy can enhance general LAL because such research, conducted at this critical juncture, could uncover the CSE’s ability to further language teachers’ professional growth, inform CSE-based professional training, and potentially reignite interest in CSE-related studies.

To address this research gap, this study first profiles Chinese university teachers’ LAL at various developmental stages and then quantitatively examines the association between CSE assessment literacy and these LAL profiles.

Literature Review

Language Teachers’ LAL Development

LAL has been conceptualized as a construct that encompasses various dimensions. Grounded in Davies’ (2008) foundational components of skills, knowledge, and principles, research on LAL has incorporated sociocultural and sociopolitical dimensions to address its inherently contextual nature (e.g., Fulcher, 2012; Inbar-Lourie, 2008). In Fulcher’s (2012) three-tier hierarchical LAL model, the language assessment contexts of history, society, politics, and philosophy are located in the top layer, with principles (processes, principles, and concepts) in the intermediate layer and practice (practical skills and knowledge) at its foundation. Taylor (2013) hypothesized the LAL profile of teachers, covering theoretical, technical, sociocultural, and decision-making domains. Complementary to this, Giraldo (2018) provided a detailed descriptor-based definition of language teachers’ LAL.

The American Federation of Teachers, the National Council on Measurement in Education, and the National Education Association (1990) formulated seven standards for teacher assessment development, focusing on skills and principles in creating, administering, scoring, and utilizing assessments. While these standards have been used globally in researching teachers’ LAL development, there is a growing trend to distinguish LAL from general assessment literacy (AL) and to associate it with teacher professional growth and practice (e.g., Lan & Fan, 2019). Hence, it is essential to study teachers’ LAL development anchored in the defined concept of LAL.

Contextual and Experiential Factors Mediate Teachers’ LAL Development

Recent research has underscored that teachers’ LAL development stems from a multifaceted interplay of factors. Contextual factors such as national and local assessment cultures, educational policies, institutional mandates, and infrastructures, along with experiential factors such as prior experiences, educational background, and teaching practices significantly influence LAL (Crusan et al., 2016). Yan et al. (2018) examined the integration of contextual and experiential factors in shaping teachers’ LAL development, highlighting that these factors, through continuous self-reflection, guide the evolution of intuitive and principled assessment competencies in teachers. However, considering teachers’ LAL as a developmental continuum (Pill & Harding, 2013), the interaction of these factors across various stages of LAL development warrants deeper exploration.

The Case of China: CSE Assessment Literacy Impacts Teachers’ LAL Development

The Definition of CSE Assessment Literacy

CSE assessment literacy encapsulates essential LAL elements tailored to the CSE standards, focusing on grasping and effectively applying these standards to language assessment (Jin, 2018; Pan, 2020). Originating from Jin’s (2018) framework, it involves understanding the CSE’s social context, theoretical foundations, and practical application. Pan (2020) extended this concept by introducing a model delineating CSE assessment literacy profiles for five key groups, incorporating dimensions such as social context, theoretical underpinnings, development methods, and application suggestions (see Fig. 2). Similar to Taylor’s (2013) LAL profiles, this model visually represents varying CSE assessment literacy levels among different stakeholders. It is a conceptual tool that requires further empirical validation.

Fig. 2

CSE assessment literacy dimensions for different stakeholder groups (Pan, 2020)

The Practices of CSE Assessment Literacy

Following its launch in 2018, the CSE has had a profound impact on research in English language teaching and assessment. This research has not only been prominent within China (see Fig. 1) but has also attracted attention in the international academic community (e.g., Peng et al., 2021) (see Fig. 3).

Fig. 3

Studies on CSE applications. See Appendix I for the references

In summary, previous literature has suggested that the interaction of contextual and experiential factors can mediate teachers’ LAL development, with continuous self-reflection during assessment practices acting as a catalyst for development. Within China’s CSE context, although practitioners tailor their approaches to align with the CSE standards, thereby enhancing their CSE assessment literacy to varying extents, there is a noticeable gap in empirical research exploring the potential of CSE assessment literacy for fostering LAL advancement. A thorough comprehension of this dynamic could empower professional training entities to devise CSE-oriented programs addressing practitioners’ specific needs in utilizing CSE for LAL enhancement.

In terms of methodology, Yan et al. (2018) utilized retrospective interviews with three teachers to explore the interaction between contextual and experiential factors in teachers’ LAL, advocating for broader, more diverse sampling to better understand this influence across various LAL profiles. Our study was conducted using a self-assessment survey questionnaire, recognized for its ability to efficiently engage large participant groups and mitigate anxiety through self-evaluation (e.g., Sun & Zhang, 2022). While prior studies often point to suboptimal LAL levels among Chinese EFL teachers (e.g., Fan & Jin, 2020; Sun & Zhang, 2022), there are concerns regarding existing scales. These include reliance on generic AL frameworks rather than LAL-specific models (e.g., Lan & Fan, 2019), and a focus on limited aspects of LAL, such as knowledge and skills, without incorporating fundamental principles (e.g., Sun & Zhang, 2022).

Therefore, to address these issues, we administered a large-scale survey to reassess the current profiles of LAL among Chinese university EFL teachers in the CSE context and investigate the association between CSE assessment literacy and LAL profiles. Specifically, the study aimed to address two key research questions (RQ):

  • RQ1: What are the current profiles of LAL among Chinese university EFL teachers?

  • RQ2: Is there a relationship between their CSE assessment literacy and LAL profiles? If so, how does CSE assessment literacy impact their LAL profiles?

Methodology

This research used a quantitative method in developing the LAL & CSE questionnaire to investigate the LAL profiles and CSE assessment literacy of university EFL teachers in China. The detailed steps of the method are provided below.

Instrument Development

We initiated the development of a LAL & CSE scale that incorporated elements based on both LAL and CSE assessment literacy models. This involved a multi-stage development process, as shown in Fig. 4.

Fig. 4

Overview of the instrument development process

In developing the LAL section of our questionnaire, we initially adopted Davies’ (2008) widely recognized LAL model and integrated Giraldo’s (2018) descriptor-based definition into our framework. Giraldo’s adaptation of Davies’ model offered a detailed conceptual structure, highlighting essential knowledge, skills, and principles for language teachers (Puspawati, 2019). This structure featured eight dimensions with 66 unique descriptors, forming the basis of the LAL part of our questionnaire. The CSE section was based on Pan’s (2020) model, encompassing four dimensions of CSE assessment literacy, and providing a comprehensive overview of the expected CSE assessment literacy. Thus, our initial questionnaire (Version 1.0) was created.

The initial questionnaire was adapted for China’s higher education context, particularly considering the EFL teaching context during three internal reviews (Versions 2.0–2.2). After expert reviews, modifications were made to enhance the content and linguistic suitability, including considering the influence of Mandarin and regional dialects on English language assessment. This process led to Version 2.3 of the questionnaire, which was further refined to Version 2.4 based on feedback from five EFL university teachers.

A pilot study with 139 Chinese university EFL teachers assessed Version 2.4 of the questionnaire, with an attention check used to validate responses. After data cleaning, 78 responses were analyzed, yielding high Cronbach’s alpha reliability values: 0.975 for the LAL section, 0.836 for the CSE Assessment Literacy section, and 0.971 overall. Based on qualitative feedback, Chinese translations were added to the Likert scale options in Version 2.5, the final questionnaire.

Instrument Format

The questionnaire consisted of 70 items in three sections: Section 1 (Demographic Features), Section 2 (LAL), and Section 3 (CSE Assessment Literacy). The LAL section contained 54 items measured on a 5-point Likert scale: 5 (extremely knowledgeable), 4 (knowledgeable), 3 (moderately knowledgeable), 2 (slightly knowledgeable), and 1 (not at all knowledgeable). The CSE Assessment Literacy section included eight items measured on a 3-point Likert scale: 3 (extremely knowledgeable), 2 (generally knowledgeable), and 1 (not knowledgeable) (see Fig. 5).

Fig. 5

LAL & CSE Assessment Literacy questionnaire

Main Trial Sample

We used convenience and snowball sampling methods, utilizing Wenjuanxing, a leading online survey tool in China, to collect data from Chinese university EFL teachers. The survey, whose introductory page described the study and its funding information, was broadly distributed via professional teacher groups to mitigate network biases and ensure varied participant representation. The survey was initially disseminated through QQ and WeChat to EFL teachers and educational competition participants. We then encouraged recipients to share it within their networks and promoted it via social media, influential individuals, and EFL teacher events at regional and national levels, thus significantly broadening our outreach.

Participants voluntarily completed the anonymous survey and were assured that their data would be used for research only. Over a month, we collected 440 responses. After applying strict cleaning criteria (removing responses that failed attention checks or were too incomplete for the later data mining–based analysis), we obtained 233 valid responses: 107 from East China, 106 from Central China, and 20 from West China, as detailed in Table 1.

Table 1 Participants’ demographic information (N = 233)

Reliability and Validity of the Questionnaire

Analysis of the data from 233 participants with SPSS 26.0 showed high reliability (Cronbach’s alpha: 0.979 for LAL, 0.857 for CSE Assessment Literacy, 0.973 overall). The Kaiser–Meyer–Olkin (KMO) measure and Bartlett’s test indicated sampling adequacy for Exploratory Factor Analysis (EFA), with KMO values of 0.958 (LAL) and 0.835 (CSE), and both sections showing significant Bartlett’s test results (p < 0.001).
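To make the reliability statistic concrete, the following is a minimal sketch of how Cronbach’s alpha can be computed from item-level responses. It is not the authors’ SPSS procedure; the function name `cronbach_alpha` and the toy response matrix are assumptions introduced purely for illustration and do not come from the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert answers: 6 respondents x 4 items (not study data)
toy = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [3, 3, 3, 3],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why values such as those reported for the LAL section are read as high internal consistency.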

The EFA for the LAL section initially suggested a 7-factor solution, later refined to 3 factors via scree plot analysis, accounting for 64.2% of variance. “Skills in Educational Measurement” and “Technological Skills” merged into “Knowledge,” forming Factor 1: “Knowledge and Skills in Educational Measurement and Technology” (KSEMT), with Factor 2 as “Instructional and Language Assessment Design Skills” (ILADS), and Factor 3 as “Principles” (Appendix II). In the CSE section, EFA identified two factors from eight items, explaining 71.1% of variance, named “CSE Background with Theoretical Underpinnings and International Test Alignment” and “Yardstick for Language Education and CSE Practices” (Appendix III).

Data Analysis

To tackle RQ1, we conducted K-means cluster analysis with factor scores (Factor 1, 2, 3, and composite F) to categorize LAL profiles in university EFL teachers, supplemented by descriptive statistics and univariate ANOVAs for a general overview of LAL levels and a Chi-square test on Section 1 data for demographic differences. For RQ2, a Chi-square test assessed associations between LAL profiles and CSE assessment literacy, followed by a multinomial regression to examine the influence of CSE assessment literacy on LAL profiles.

Clustering organizes items into distinct groups based on high intra-cluster similarity and inter-cluster dissimilarity. K-means clustering, noted for its efficiency and effectiveness, often outperforms other methods for document data (Liang et al., 2012). In our study, K-means analysis involved calculating scores for Factors 1, 2, and 3, plus a composite overall LAL level score (F), using SPSS 26.0. The three factors explained 28.389% (F1), 19.73% (F2), and 16.068% (F3) of the variance (64.2% in total), and these proportions served as the weights in the composite score formula (F = 0.28389*F1 + 0.1973*F2 + 0.16068*F3). The K-means clustering was executed in R 4.2.1, with detailed data in Appendix IV.
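The composite score formula above can be sketched in a few lines. This is an illustrative Python rendering only (the study computed the scores in SPSS); the function name `composite_score` and the three example teachers are hypothetical, while the weights are the three coefficients quoted in the formula.

```python
# Weights taken from the composite score formula in the text
WEIGHTS = (0.28389, 0.1973, 0.16068)


def composite_score(f1, f2, f3, weights=WEIGHTS):
    """Weighted sum of the three factor scores: F = w1*F1 + w2*F2 + w3*F3."""
    w1, w2, w3 = weights
    return w1 * f1 + w2 * f2 + w3 * f3


# Hypothetical standardized factor scores for three teachers (not study data)
teachers = {
    "T1": (1.2, 0.5, 0.3),
    "T2": (0.0, 0.0, 0.0),
    "T3": (-0.8, -1.1, -2.0),
}
for name, (f1, f2, f3) in teachers.items():
    print(name, round(composite_score(f1, f2, f3), 4))
```

Because the weights sum to about 0.642 rather than 1, F is on a compressed scale relative to the individual factor scores; this does not affect the ordering of teachers on F.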

Results

RQ1: What Are the Current Profiles of LAL Among Chinese University EFL Teachers?

Cluster Results

A three-cluster model best fit the data (Fig. 6). Cluster 1 comprised 116 university EFL teachers, Cluster 2 comprised 86, and Cluster 3 comprised 31, so the distribution across clusters was imbalanced. To investigate CSE assessment literacy in the different LAL clusters, we applied the hard K-means clustering algorithm, which assigns each teacher to exactly one cluster. Hard clustering is more likely to produce an imbalanced distribution than fuzzy K-means clustering, which allows partial membership across all clusters. This asymmetrical distribution may reflect the intrinsic nature of real-world datasets, as observed in fields such as risk management (e.g., Liang et al., 2012).
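The notion of hard assignment can be illustrated with a small sketch of Lloyd’s algorithm, the standard iterative procedure behind K-means. The study ran its clustering in R 4.2.1, so this Python version is not the authors’ code; the function name `kmeans`, the two-dimensional toy points, and the hand-picked starting centroids are all assumptions made for the example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Hard K-means (Lloyd's algorithm): every point gets exactly one label."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid (Euclidean distance)
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its current members
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical two-dimensional scores forming three uneven groups (not study data)
points = [
    (1.0, 0.9), (1.1, 1.0), (0.9, 1.1), (1.2, 0.8), (1.0, 1.0),
    (0.0, 0.1), (-0.1, 0.0), (0.1, -0.1),
    (-2.0, -2.1), (-2.2, -1.9),
]
init = [points[0], points[5], points[8]]  # assumed starting centroids
labels, centers = kmeans(points, init)
sizes = [labels.count(j) for j in range(3)]
print(labels, sizes)
```

Each point receives a single label, so cluster sizes follow the structure of the data and can be uneven, whereas a fuzzy variant would spread each point’s membership across all clusters.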

Fig. 6

Cluster categories

Descriptive Statistics and Univariate ANOVAs of LAL Profiles

As shown in Table 2, the participants’ overall LAL levels were not very satisfactory (M = 3.39, SD = 0.35). Additionally, Chinese university EFL teachers rated KSEMT the lowest (M = 3.17, SD = 0.22) and “Principles” the highest (M = 4.00, SD = 0.08).

Table 2 Descriptive statistics and univariate ANOVAs of LAL profiles

Based on the “Overall LAL Levels,” we categorized Cluster 1 as the high LAL profile (LAL(H)), indicating advanced LAL development. Cluster 2 was termed the moderate LAL profile (LAL(M)), and Cluster 3 the low LAL profile (LAL(L)). The mean scores of each LAL component within these profiles generally aligned with their designated labels. The only exception was “Principles,” as teachers in LAL(M) scored higher (M = 4.35; SD = 0.11) than those in LAL(H) (M = 4.28; SD = 0.06). Additionally, teachers in LAL(L) displayed exceptionally low levels in “Principles” (M = 1.98; SD = 0.05).

Comparison of Background Characteristics in the Three LAL Profiles

The Chi-square test results indicated statistical significance for “Teaching Years” and “Test Research Participation Experience,” while other variables showed no significant differences (see Table 3). These two variables served as covariates in the subsequent regression analysis.

Table 3 Comparison of background characteristics in the three LAL profiles

RQ2: Is There a Relationship Between Their CSE Assessment Literacy and LAL Profiles? If So, How Does CSE Assessment Literacy Impact Their LAL Profiles?

A Chi-square test was applied to find out whether LAL profiles were associated with CSE assessment literacy. Subsequently, multinomial logistic regression was employed to identify factors influencing LAL profiles.

Association of LAL Profiles with CSE Assessment Literacy

We conducted a Chi-square test with the data of the previous cluster results as independent variables and the data of Section 3 CSE Assessment Literacy (Items 63–70) as dependent variables. The results are shown in Table 4.

Table 4 Comparison of CSE assessment literacy in the three different LAL profiles

The results showed that CSE assessment literacy was associated with LAL profiles. Four CSE assessment literacy items were found to have statistically significant differences between the three different LAL profiles. They were Item 63 of the “CSE Background,” Item 66 of “Competence Classification,” Item 69 of “Table Scales,” and Item 70 of “Teaching & Research Values.” Other items were not statistically significant (P > 0.05).
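For readers less familiar with the test, the sketch below shows how a Chi-square test of independence works for a table of three LAL profiles by three response levels, which gives (3 − 1) × (3 − 1) = 4 degrees of freedom. It is an illustration rather than the authors’ SPSS output: the function names and every count in the table are invented for the example, and the p-value helper uses a closed form that is valid only for even degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if profile and response level were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail p-value of the chi-square distribution (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Invented counts: rows are high/moderate/low profiles, columns are
# responses 1, 2, 3 on one hypothetical CSE item (not study data)
table = [
    [10, 50, 40],
    [20, 40, 20],
    [20, 8, 2],
]
stat, df = chi_square_independence(table)
p = chi2_sf_even_df(stat, df)
print(round(stat, 2), df, p < 0.05)
```

A p-value below 0.05 in such a table indicates only that the response distribution differs somewhere across the profiles; it does not show which profiles differ or in which direction, which is why a regression model is fitted afterwards.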

Differences in CSE Assessment Literacy Influence LAL Profiles

A multinomial logistic regression analysis, with the LAL clusters as the dependent variable, the CSE assessment literacy of Item 63, Item 66, Item 69, and Item 70 as independent variables, and Item 5 of “Teaching Years” and Item 7 of “Test Research Participation Experience” as covariates, was applied to identify how CSE assessment literacy impacted LAL profiles (see Table 5).

Table 5 Logistic regression analysis of the impact of CSE assessment literacy on LAL

When Item 63 was scored as 1 (Not knowledgeable), indicating that teachers were unaware of the “CSE Background,” the odds of being in Cluster 3 (LAL(L)) rather than Cluster 1 (LAL(H)) were 6.535 times (coefficient = 1.877, Exp(B) = 6.535) those of teachers who were extremely knowledgeable about this background (Item 63 = 3). Conversely, when Item 63 was scored as 2 (Generally knowledgeable), reflecting teachers’ general awareness of the “CSE Background,” the odds of being in Cluster 2 (LAL(M)) rather than LAL(H) decreased significantly by 67.30% (coefficient = −1.117, Exp(B) = 0.327; 0.327 − 1.0 = −0.673). Equivalently, the odds of being in Cluster 1 (LAL(H)) rather than Cluster 2 were about 3.058 times as high (1/0.327 = 3.058). Namely, when teachers did not know the “CSE Background,” they were less likely to be in the LAL(H) profile. In contrast, when teachers had general knowledge of this aspect, they were more likely to be in the LAL(H) profile. Thus, knowing the “CSE Background” can promote a higher LAL profile.

In this sense, teachers with a deep understanding of the “Competence Classification” and “Table Scales” were found to be more likely to be in the LAL(H) profile, while those with only a general or no knowledge were less likely to be in LAL(H). However, “Teaching & Research Values” had a borderline P of 0.051 and was still considered nonsignificant due to the typical significance threshold of 0.05. To summarize, three factors of “CSE Background,” “Competence Classification,” and “Table Scales” were identified as factors to promote LAL profiles. Additionally, “Teaching Years” and “Test Research Participation Experience” were not significant in differentiating low and high LAL levels but were significant between moderate and high levels.
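The arithmetic linking the reported coefficients for Item 63 to odds ratios can be checked with a short sketch. The two coefficients (1.877 and −1.117) are those reported above; the function names are assumptions introduced for illustration, and small differences in the third decimal place arise because the published coefficients are rounded.

```python
import math


def odds_ratio(coefficient):
    """Exp(B): multiplicative change in the odds relative to the reference."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient):
    """Percentage change in the odds implied by a logit coefficient."""
    return (math.exp(coefficient) - 1.0) * 100.0


comparisons = [
    ("Item 63 = 1, LAL(L) vs LAL(H)", 1.877),
    ("Item 63 = 2, LAL(M) vs LAL(H)", -1.117),
]
for label, b in comparisons:
    print(label, round(odds_ratio(b), 3), round(percent_change_in_odds(b), 1))

# Reversing the comparison (LAL(H) vs LAL(M)) flips the sign of the
# coefficient, which is the same as taking the reciprocal of the odds ratio.
print(round(odds_ratio(1.117), 3))
```

An odds ratio is a multiplicative factor, so a reciprocal of roughly 3.06 means the odds are about three times as high, not that they rise by about 30%.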

Discussion

LAL Profiles Among Chinese University EFL Teachers

Overall LAL Profiles

In this study, the overall LAL levels were deemed unsatisfactory, with Chinese university EFL teachers averaging a 67.8% LAL score, despite 49.7% of participants being categorized in the high LAL profile. This finding corroborates recent studies (Fan & Jin, 2020; Sun & Zhang, 2022) that similarly underscored the insufficient LAL levels among Chinese EFL teachers.

Overall, participants exhibited the highest literacy in “Principles” and the lowest in “Knowledge and Skills in Educational Measurement and Technology,” aligning with Puspawati’s (2019) observation of university teachers’ superior understanding of assessment principles over their knowledge and skills. In contrast, Sultana’s (2019) findings indicated a lack of awareness among teachers about the fairness and impact of standardized tests. The variation in findings could stem from the degree of assessment autonomy, as university teachers in Puspawati’s study had greater freedom compared to their secondary-level peers in Sultana’s study, who were subject to stricter, government-regulated assessment guidelines.

The Varied “Principles” in Different LAL Profiles

Within different LAL profiles, the literacy regarding “Principles” was complicated. The teachers in the low LAL profile exhibited an exceptionally low level of “Principles,” despite the general high levels in the other two profiles. This might be explained by teachers’ LAL developmental stages. The development of deeper, more theoretical, and principle-based insights necessitates a robust foundation of practical knowledge as a basis for LAL development (Fulcher, 2012). Moreover, the “Principles” scored the highest in the moderate LAL profile rather than in the high LAL profile, suggesting that teachers with a high LAL profile might not uniformly excel in all dimensions. Alternatively, teachers with a high LAL profile might underestimate their confidence in adhering to strict ethical standards or fairness as they gain deeper insights into assessment knowledge, skills, and principles.

Additionally, the statistically significant association of “Teaching Years” and “Test Research Participation Experience” with the three LAL profiles in the Chi-square test (see Table 3) does not imply an influencing impact on LAL levels. The role of these variables as potential factors influencing LAL profiles requires further exploration in regression analysis, which is the focus of the next section.

Impact of CSE Assessment Literacy on LAL Profiles

Relationship between CSE Assessment Literacy and LAL Profiles

The Chi-square test revealed significant associations between the three LAL profiles and CSE assessment literacy, specifically for “CSE Background,” “Competence Classification,” “Table Scales,” and “Teaching & Research Values.” Subsequent multinomial logistic regression analysis identified the first three items, “CSE Background,” “Competence Classification,” and “Table Scales,” as factors positively influencing the LAL profiles, while “Teaching & Research Values” was found to be nonsignificant.

The significance of “Teaching & Research Values” varied between the Chi-square and multinomial regression analyses. A variable significant in a Chi-square test, showing an overall association, may not retain its significance in multinomial regression, where the emphasis is on distinguishing between specific groups defined by the dependent variable (e.g., Bayaga, 2010; Sun & Zhang, 2022). Its borderline P value of 0.051 may suggest that while “Teaching & Research Values” is important, the “CSE Background,” “Competence Classification,” and “Table Scales” may hold greater significance in affecting the development of the LAL profiles in the more detailed, group-focused contexts.

The Promoting Role of CSE Assessment Literacy on LAL Profiles

Teachers with extensive CSE background knowledge from both low and moderate LAL profiles could progress to the high LAL profile. However, only those in the moderate LAL profile with proficiency in CSE competence classification and table scales were able to advance to the high LAL profile. Thus, CSE background seems to more broadly facilitate LAL advancement than the other two factors.

CSE Background

Understanding CSE background could promote teachers’ LAL development. This can be explained by the significance of contextual factors in assessment practices (e.g., Crusan et al., 2016; Fulcher, 2012). Comprehending the CSE background enables EFL teachers to familiarize themselves with China’s reform in the foreign language landscape, aiming to bridge gaps in goals, standards, and assessments, and to synchronize with global benchmarks (Liu, 2015). Prompted by the broader educational context, EFL teachers could gain a better understanding of the significance and urgency of the CSE framework, thus motivating themselves to align their daily assessment practices with the CSE standards and improve their LAL through practicing and engaging in self-reflection.

This impact of contextual factors on teachers’ LAL development through practices has been confirmed in empirical studies (e.g., Yan et al., 2018). Within contexts that promoted an alignment between teaching and testing (e.g., in the CSE context), the simple task of writing test items could make teachers attentive to the validity of assessment content, ensuring that student assessment is aligned with the standardized curriculum. This assessment practice of writing standardized test items also generates opportunities to encourage teachers to improve their test item writing skills, thereby deepening their insight into the core assessment knowledge, skills, and principles.

Competence Classification and Table Scales

In contrast to the broad influence of CSE background, a profound mastery of CSE competence classification and table scales effectively aided teachers only in progressing from the moderate to the high LAL profile; this advancement requires a foundational moderate LAL level. This could be explained by the theoretical association of these two factors with LAL knowledge. The development of CSE competence classification and table scales drew upon three major theories: Bloom’s Revised Taxonomy, the Communicative Language Ability Model, and the functional linguistic model (Pan & Xiao, 2022). A deep understanding of the two CSE factors demands familiarity with these applied linguistic theories, which are integral to assessment knowledge in LAL models (Davies, 2008). Thus, in this process, teachers’ LAL could be advanced.

Furthermore, when EFL teachers apply the CSE competence classification and table scales in classroom assessment practices or standardized testing, their LAL can be developed inductively by hands-on experience and reflections on those experiences, reflecting a constructivist view of LAL development (e.g., Inbar-Lourie, 2008). Sang’s (2023) research on English teachers’ perceptions of the CSE in Chinese universities corroborates this impact. Participants noted that the CSE table scales could clearly define student behaviors, furnishing a direct, quantifiable method for appraising learning outcomes and teaching efficacy, thereby offering a viable framework to gauge both teaching success and student progression.

“Teaching Years” and “Test Research Participation Experience” as Covariates Affecting LAL Profiles

The covariates “Teaching Years” and “Test Research Participation Experience” were not significant in differentiating low from high LAL profiles but were significant between moderate and high profiles, suggesting that the impact of these covariates on LAL development is contingent on reaching a certain threshold of LAL proficiency. This discovery provides fresh perspectives on resolving debates about the effectiveness of certain LAL-influencing factors, such as years of teaching (e.g., Sun & Zhang, 2022).

Conclusion

The study found that Chinese university EFL teachers’ LAL was generally unsatisfactory and was distributed across high, moderate, and low profiles. “Principles” was the highest-rated LAL component overall, with awareness highest in the moderate LAL profile and lowest in the low profile. Mastery of “CSE Background” enhanced LAL across low and moderate profiles, while “Competence Classification” and “Table Scales” benefited only the moderate profile. Therefore, proficiency in CSE assessment literacy in terms of background, competence classification, and table scales promoted the LAL profiles of Chinese university EFL teachers.

This study has several limitations that suggest directions for future research. First, the reliance on self-report scales may limit the generalizability of the findings; future studies could use interpretative methods such as teacher interviews and classroom observations to deepen the data (Giraldo, 2020). Second, while “Teaching Years” and “Test Research Participation Experience” did not distinguish between low and high LAL profiles, they were significant in differentiating moderate from high profiles. This highlights their impact at higher LAL proficiency levels and underscores the need for longitudinal studies to examine LAL development across career stages.

This study carries theoretical and practical implications for researchers, policymakers, and EFL teachers. Theoretically, this study investigates the integration of contextual and experiential factors in LAL development using China’s CSE as a focal point. Employing K-means clustering analysis, it identifies diverse LAL profiles among Chinese EFL university teachers, responding to Yan et al.’s (2018) call to examine LAL developmental trajectories. This study is the first large-scale empirical research on the role of CSE assessment literacy in enhancing LAL profiles, thus enriching the existing literature on CSE and LAL. It also clarifies the impact of factors such as teaching experience on LAL, suggesting the existence of a threshold at which these factors become significant.

Practically, the findings can inform LAL enhancement strategies and international policymaking. Educational leaders should provide diverse LAL development programs tailored to teachers’ LAL profiles. Teachers are encouraged to develop a thorough understanding of assessment contexts, integrate theory with practice, and engage actively in self-reflection (Yan et al., 2018). Continuous training will help them stay abreast of educational trends and support their professional development. Additionally, the standardization of China’s CSE aligns local educational practices with global standards, helping policymakers balance global uniformity with local specificities. This is crucial for the diverse and multifaceted educational systems in the Asia–Pacific, offering a roadmap for integrating international standards with local educational cultures and languages.