Introduction

Particularly since the 1990s, researchers have promoted theoretical perspectives in education that account for children’s innate tendency toward holistic development (e.g., Boekaerts, 1993; McCombs, 2001; Ryan & Deci, 2000, 2020). Today, the notion of schools as complex systems in which children and adolescents confront a broad range of developmental tasks (including those pertaining to social and emotional development, identity formation, and academic learning) is widely accepted in educational psychology research (Eccles & Roeser, 2003, 2011; Osher et al., 2007; Roeser & Eccles, 2014; Tetzner et al., 2017). Researchers have therefore argued for stronger consideration of well-being as a critical outcome of schooling, instead of a unidimensional focus on achievement outcomes (e.g., Huebner et al., 2014; Ryan & Deci, 2020; Tian et al., 2014). Similarly, the heightened socio-political importance assigned to children’s well-being in schools is reflected in the recent consideration of well-being as a critical indicator of high-performing education systems beyond academic achievement (e.g., OECD, 2017).

Well-being is a multifaceted concept that incorporates the co-existence of heterogeneous and complementary theoretical perspectives (Ben-Arieh et al., 2014). In educational psychology research, one prevalent perspective on children’s well-being focuses on the notion of Subjective Well-being (hereafter: SWB; Huebner et al., 2014; Long et al., 2012; Tian et al., 2015, 2016). SWB has been described as a “broad category of phenomena that includes people’s emotional responses, domain satisfactions, and global judgments of life satisfaction” (Diener et al., 1999, p. 277). SWB is considered a specific form of well-being that reflects how someone believes or feels that his or her life, in general or in a specific domain, is going well from that person’s own perspective (Diener et al., 2018), and it follows a hedonic-oriented research tradition (for an overview see Keyes et al., 2002 or Jayawickreme et al., 2012). Prevalent in research on the SWB/achievement relation are two-component models of SWB that differentiate between a cognitive component and an affective one (Bücker et al., 2018; Dalbert, 2003; Steinmayr et al., 2016, 2018; Tian et al., 2015). SWB’s cognitive component is typically represented by a measure of global or domain-specific satisfaction, and the affective component is represented either by a unidimensional measure of mood (e.g., Dalbert, 2003; Steinmayr et al., 2016, 2018), or by positive versus negative affect (e.g., Tian et al., 2015). SWB has been considered “an indispensable perspective to study adolescents’ school well-being” (Tian et al., 2014, p. 356), as well as a valuable outcome in and of itself (Huebner & Gilman, 2006; Tian et al., 2014, 2016).

With academic achievement and students’ well-being both constituting desirable schooling outcomes, as well as critical indicators of adaptive functioning (e.g., Huebner & Gilman, 2006; Suldo et al., 2008), the past two decades have seen an upsurge in research on how these two constructs relate to one another (Bücker et al., 2018; Gilman & Huebner, 2006; Kaya & Erdem, 2021; Kleinkorres et al., 2020; Ng et al., 2015; Putwain et al., 2020; Steinmayr et al., 2016, 2018; Suldo et al., 2006). However, as evidence on the relation between students’ SWB and academic achievement (hereafter: SWB/achievement relation) has accumulated, the findings have become more divergent, and this heterogeneity is not yet understood (Bücker et al., 2018).

The present study addresses these uncertainties by examining how far three potential sources contribute to differential SWB/achievement relations: Firstly, we examine differences related to the domain-specificity of measures (i.e., global vs. school-specific vs. math-specific measures of SWB). Secondly, we attend to differential relations by the type of SWB component (i.e., cognitive vs. affective). Thirdly, we investigate whether SWB/achievement relations differ by type of achievement indicator (i.e., grade-based vs. test-based indicator). The present study thereby highlights the need for more scientific debate regarding the domain-specific conceptualization of SWB in school, and it expands upon the existing literature by providing a theory-based argument for the subject-specific conceptualization of students’ SWB.

Theoretical and empirical support for the school- and subject-specific conceptualization of SWB

Over the past decade, more and more researchers interested in children’s well-being in school have opted for school-specific conceptualizations of the construct (Long et al., 2012; Putwain et al., 2020; Tian et al., 2014, 2015, 2016). These researchers emphasized the domain-specific organization of adolescents’ experiences, which would in turn substantiate the need for domain-specific measures of SWB (Long et al., 2012; Tian et al., 2015). They noted that relations between constructs may be masked if SWB is measured at a global level (Tian et al., 2015). Others highlighted potential differences in the cognitive component’s domain-specificity (e.g., satisfaction) as opposed to the affective component’s. For example, Dalbert (2003) argued that changing from one domain to another does not necessarily induce a change in one’s mood, therein favoring a global conception of SWB’s affective component. However, the empirical evidence of SWB’s domain-specific organization in adolescents remains limited. Furthermore, as noted by Tian et al. (2015), the few existing studies have frequently focused on school-specific measures of the cognitive but not the affective component. For example, studies involving a global as well as a school-specific measure of satisfaction detected moderate, positive correlations between these variables (e.g., Long & Huebner, 2014, p. 682: 0.30 ≤ r ≤ 0.42), therein supporting their discriminant validity (also see Suldo et al., 2008). Furthermore, studies that investigated the internal structure of life satisfaction also found support for the construct’s domain-specific organization, with school being one relevant life-domain (Gilman et al., 2000; Huebner et al., 1998).
In contrast, there is a paucity of empirical evidence from between-network approaches involving external criteria variables (see Byrne, 1984), and existing data does not consistently support the discriminant validity of a school-specific vis-à-vis a global measure of satisfaction (e.g., Long & Huebner, 2014). Concerning SWB’s affective component, only a few studies have attempted to develop or validate school-specific measures (e.g., Long et al., 2012; Tian et al., 2015). However, these studies did not investigate differential between-constructs relations by their level of domain-specificity (i.e., for a global versus a school- or a subject-specific measure of SWB’s affective component). Last but not least, to the best of our knowledge, no study to date has addressed potential subject-specific conceptions of students’ SWB, nor are we aware of any measures in this regard. This is surprising considering the wealth of evidence of the subject-specific organization of other key constructs in educational psychology, including more cognitively-oriented constructs such as academic self-concept (e.g., Marsh, 1990; Marsh et al., 1988), as well as more affective-oriented constructs such as academic emotions (Goetz et al., 2006, 2007). For example, research findings on within- and between-subject relations among different academic emotions indicate stronger relations between various emotional constructs within the same school subject (e.g., enjoyment and boredom in math), than between the same emotional construct across different school subjects (e.g., enjoyment in math and enjoyment in English; Goetz et al., 2007). The ascribed within-subject covariances of different emotional experiences support the assumption of a higher-order factor that reflects subject-specific emotional experience. This in turn would be consistent with the notion of subject-specific evaluations of one’s emotional experiences as presumably captured by SWB’s affective component.
However, unlike academic emotions, SWB focuses on evaluating the quality of one’s experiences (e.g., good vs. bad, or pleasant vs. unpleasant), without limiting itself to specific (academic) emotions that might underlie this subjective assessment. We argue that it is precisely in this notion of the construct that subject-specific SWB corresponds to theoretical perspectives concerned with the breadth of students’ classroom experiences (Eccles & Roeser, 2011; Osher et al., 2007; Tetzner et al., 2017), as well as the intra-individual differences in such experiences by subject-domain (Jacobs et al., 2003).

The SWB/achievement relation and what may underlie differential relations

Empirical findings converge on a positive relation between students’ SWB and academic achievement (Bücker et al., 2018). Notably, in the first meta-analysis of 47 studies published between 1978 and 2017 with 151 relevant effect sizes, Bücker et al. (2018) identified a mean effect size of r = 0.164 with significant heterogeneity across different studies and samples. The authors examined a wide range of potential moderators of the SWB/achievement relation, including domain-specificity, type of component of SWB, as well as the type of achievement indicator. Although none of these variables was identified as a significant moderator of the SWB/achievement relation, we note several limitations of their analyses. Firstly, in their account of different levels of domain-specificity, Bücker et al. (2018) distinguished between domain-general measures of SWB, those that were specific to academics, and those that were specific to a domain other than the academic one. However, since no such studies yet exist, they considered no subject-specific measures of SWB. Secondly, their means of detecting systematic differences by type of component may have been compromised, since most of the individual studies that their meta-analysis incorporated considered either the cognitive or affective component of students’ SWB, but not both. Thirdly, the authors did not account for the potential effect of (mis-) matching levels of domain-specificity regarding SWB and achievement. In this context, a test-based indicator is typically subject-specific, whereas the grade-based indicator is typically operationalized as an average grade across subjects (e.g., self-reported or objectively-reported grade point average [GPA]). Consequently, in their findings, the effect of (mis-) matching levels of domain-specificity was likely confounded with the effect of the type of achievement indicator. Furthermore, in addressing potential differences by type of achievement indicator, Bücker et al. (2018) focused primarily on the conceptual distinction between objectively-reported (e.g., school-reported GPA), and subjectively-reported (e.g., self-reported GPA) achievement indicators (also see Long & Huebner, 2014). This differentiation of achievement indicators may, however, not be the most relevant one, given the evidence supporting the validity of self-reported grades (e.g., Sparfeldt et al., 2008).

Differential SWB/achievement relations associated with domain-specificity of measures

Few studies have so far examined the role of domain-specificity of SWB measures in driving differential SWB/achievement relations. Besides the aforementioned meta-analysis (Bücker et al., 2018), a notable exception is the study by Long and Huebner (2014), who addressed differential SWB/achievement relations in examining the discriminant validity of a school-specific and a global measure of satisfaction. These authors expected stronger SWB/achievement relations for the school-specific compared to a global measure of satisfaction. However, their findings failed to support their hypothesis. On the contrary, for some school-specific variables including GPA, the concurrent between-constructs associations were significant only when global satisfaction, but not school-specific satisfaction was considered. Beyond this initial evidence, there is little research on the role of subject-specificity in understanding differential SWB/achievement relations. Accordingly, we are unaware of any such analyses pertaining to the discriminant validity of measures of SWB’s affective component at different levels of domain-specificity.

Taking a validity perspective, researchers also highlight favorable criterion-validity when variables display matching levels of domain-specificity (hereafter interchangeably referred to as specificity-matching), especially when both variables are very specific (e.g., interest in math/math grade). For example, researchers have emphasized the relevance of frame-of-reference effects pertaining to the variance in self-reports associated with varying contexts that a respondent may (consciously or not) use as a reference-point in describing his or her own behavior (e.g., Schmit et al., 1995; Shaffer & Postlethwaite, 2012). Against this background, domain-specific item-formulations (e.g., ‘at school’) have been considered advantageous in that they prompt the respondent to self-report behaviors that are particularly relevant regarding a criterion of interest (e.g., motivated behaviors at school in predicting performance). In the broader context of educational psychology research, there is evidence of the relevance of specificity-matching in between-constructs relations (e.g., Wirthwein et al., 2013). This emphasis on specificity-matching also resonates with contemporary theoretical perspectives on validation, wherein validity of a measure is assessed in view of the intended interpretations and of how a test score is used (Kane, 2013; also see Wolters & Won, 2017). Lastly, more specific measures have been associated with lower (random) error variance, which may underlie higher between-constructs relations (e.g., Robie et al., 2000; for an overview see Baranik et al., 2010).

Differential SWB/achievement relations by type of component (affective vs. cognitive)

When it comes to potential differences in the relation between academic achievement and SWB by type of component (i.e., cognitive or affective component), a recent study by Steinmayr et al. (2016) provides relevant insights. These authors investigated the relationship between students’ global SWB and GPA in a sample of 290 German high school students with two measurement points. In their study, the concurrent correlations at both measurement points were descriptively stronger between global satisfaction and GPA (t1: r = 0.10; t2: r = 0.18), compared to those between global mood and GPA (t1: r = 0.02; t2: r = 0.02). However, since differential relations by type of component were not the focus of their research, Steinmayr et al. (2016) did not test whether differences were significant. Instead, they focused on predictive relations, and noted that GPA served as a significant predictor of changes in life satisfaction but not in global mood. In discussing the differences in the relation between academic achievement and SWB by type of component, Steinmayr et al. (2016) suggested that the affective component may be more strongly influenced by non-academic factors which could weaken its relationship with academic achievement. Furthermore, a study by Ng et al. (2015), who relied on a sample of 821 middle-school students in the United States, investigated the concurrent associations between school-based positive and negative affect, and academic achievement. The bivariate correlations between GPA and each of the two affect variables were small (school-specific positive affect/GPA: r = 0.08; school-specific negative affect/GPA: r = -0.15) and descriptively smaller than the concurrent association between GPA and life satisfaction (r = 0.21), although the authors did not report whether the differences were significant (Ng et al., 2015).
In sum, initial evidence points towards differential SWB/achievement associations revealing stronger relations with the cognitive than the affective component. However, the evidence remains thin. Furthermore, in light of the insufficient account of domain-specific measures of SWB in published studies, it also remains unclear whether differential SWB/achievement relations by type of component occur at different levels of domain-specificity.

Differential SWB/achievement relations by type of achievement indicator

In the realm of research on the SWB/achievement relation, researchers have not yet studied whether between-constructs relations are stronger for grade-based or test-based achievement indicators. Moreover, most studies considered either a grade-based indicator (usually GPA; e.g., Steinmayr et al., 2016), or a test-based indicator (e.g., Chang et al., 2003), but not both (for an overview see Huebner et al., 2014). A notable exception is the study by Long and Huebner (2014), who considered three types of achievement indicators: self-reported grades (assessed with a single item asking for grades the student usually earns), school-reported GPA, and test scores in different subject areas (reading and language usage, science, and math). However, systematic differences in between-constructs relations by type of achievement indicator were not the focus of their research, and thus not empirically addressed. In contrast, findings from educational psychology research on motivational constructs suggest that between-constructs relations tend to be stronger when grade-based rather than test-based indicators are considered. For example, in their sample of 1,067 8th-grade students in Germany, Lauermann et al. (2020) identified subject-specific self-concept of ability as the strongest unique predictor of students’ grades in math and the language arts, whereas intelligence was particularly relevant in predicting standardized test scores in math. The authors stated that a stronger relation between academic self-concept of ability and grades is plausible since grades also function as a salient feedback mechanism on students’ performance. As such, grades are likely to play a more central role in shaping students’ self-perceptions as reflected in their self-concepts of abilities.
We propose that a similar argument may hold in the case of students’ SWB: Grades, through their function as a salient feedback mechanism, are more likely to influence students’ well-being – a factor that could contribute to the positive relation between these variables. In contrast, standardized performance indicators do not serve as a regular performance feedback mechanism in the German school system, thus the relation between test scores and SWB may be relatively weaker.

The present study

In summary, the present study examines the sources that may underlie differences in the strengths of the SWB/achievement relation. Specifically, we identified three potential triggers of differential SWB/achievement relations, notably the level of domain-specificity, the component of SWB, and the type of achievement indicator. Based on our review of the existing literature we propose not only a school-specific, but also a subject-specific measure of adolescents’ SWB. Our analyses were set up to systematically alter only one of the three potential drivers of differential SWB/achievement relations, while keeping the others unchanged. Our study thus addresses not just persisting uncertainties associated with the unexplained heterogeneity in findings on the SWB/achievement relation (Bücker et al., 2018; also see Huebner et al., 2014)—it also contributes to the evolving scientific debate regarding the domain-specificity of SWB in school (Long & Huebner, 2014; Tian et al., 2015, 2016). We pose the following three specific research questions:

  • RQ 1: Is the SWB/achievement relation stronger in conjunction with more specific and specificity-matching SWB measures?

Accounting for potential differences in the domain-specificity of SWB’s cognitive and affective components (Dalbert, 1992, 2003), we formulated separate hypotheses by type of component. Based on our literature review, we expected stronger SWB/achievement relations in conjunction with a more specific rather than less specific measure of SWB (i.e., r [math-specific SWB/math achievement] > r [school-specific SWB/math achievement]) regarding the cognitive component (H1a), and regarding the affective component (H1b). Furthermore, we hypothesized stronger between-constructs relations when both constructs are more specific (i.e., r [math-specific SWB/math achievement] > r [school-specific SWB/school achievement]), concerning the cognitive component (H2a) and concerning the affective component (H2b). Lastly, we expected differential relations between school-specific SWB and GPA on the one hand, and math-specific SWB and GPA on the other hand (i.e., r [school-specific SWB/GPA] ≠ r [math-specific SWB / GPA]), with the cognitive component (H3a) and affective component (H3b). Since the two potential mechanisms, that is, specificity-degree (i.e., stronger relations with more specific constructs) and specificity-matching (i.e., weaker relations with specificity mis-matching constructs) are assumed to have opposing effects on the SWB/achievement relation, we did not formulate directed hypotheses in these latter cases.

  • RQ 2: Is the SWB/achievement relation stronger with SWB’s cognitive than its affective component?

In line with recent findings implying differential SWB/achievement relations by type of component (e.g., Steinmayr et al., 2016), we expected stronger between-constructs relations with SWB’s cognitive than its affective component across types of achievement indicators. To avoid potentially confounding effects, we accounted for matching-specificities between constructs, and tested separate hypotheses for math-specific constructs (H4a), and school-specific constructs (H4b).

  • RQ 3: Is the SWB/achievement relation stronger with a grade-based than a test-based achievement indicator?

In light of the wealth of findings on other constructs in educational psychology research (e.g., Lauermann et al., 2020; Marsh et al., 2005), we expected stronger SWB/achievement relations with a grade-based rather than a test-based indicator of academic achievement (H5). We investigated this question only with respect to math-specific constructs, since we did not have a test-based achievement indicator at the school level.

Method

Participants and procedures

Our final sample comprised N = 767 students (n = 361 female, n = 403 male) in public secondary schools located in Central Western Germany (for a detailed description of the data-set, also see Steinmayr et al., 2018). Participants were recruited from 8th and 9th grade classes (n = 33 classes) at four schools, two of which were comprehensive schools (‘Gesamtschule’, n = 390 students) and two were highest academic track schools (‘Gymnasium’, n = 377 students). The average full-year age of students was 14.07 (SD = 0.82, range = 12–17). Nearly all participants (96%) reported Germany as their country of birth. The vast majority reported speaking mainly German with their parents at home (87%), whereas a minority indicated speaking mainly Turkish (7%), or a language other than German or Turkish (6%) with their parents at home. More than a third of participants indicated that their mother (36%) and/or father (38%) had a school certificate required for access to tertiary-level education. As noted in a prior description of this sample by Steinmayr et al. (2018), the parental education level was on average higher, and the share of students with an immigration background lower than in a representative sample of German 8th and 9th grade students, a factor attributable to the high percentage of students attending the highest academic track schools in the present sample. We checked our data set for outliers and excluded a total of eight pupils from our initial sample (N = 775) due to systematic response patterns or large amounts of missing data.

Research assistants reached out to schools and informed them about the aims and content of our data collection. Participants were recruited from all 8th and 9th grade classes at each participating school. Participation was voluntary and based on a written informed-consent procedure. We obtained parental consent from 95% of students. Ten percent of students did not participate for reasons such as illness-induced absence, resulting in an overall participation rate of 85% of the student population. Data was collected in schools between the end of 2015 and early 2016. Trained research assistants conducted the 90-min testing sessions during regular class time.

Measures

Subjective well-being

Cognitive and affective SWB components were measured using a short-version (Steinmayr et al., 2018) of the German Habitual SWB Scale (HSWBS; Dalbert, 1992, 2003). Responses to items in both scales were provided on a six-point scale ranging from ‘I do not agree at all’ (= 1), to ‘I fully agree’ (= 6). Response scales for each item-stem were presented in a matrix format with three horizontally arranged response-blocks, with headline labels indicating the respective context (“in general”, “in school”, “in math class”). Students were thus asked to select one response option per block so that one generic item formulation corresponded to three parallel-worded items, each referring to a different level of domain-specificity.

The cognitive component (hereafter interchangeably referred to as Satisfaction) was measured with five items, whereby two items were past-oriented, two items were present-oriented, and one item was future-oriented (e.g., “When looking back at my past, I have achieved much of what I hoped to”; “I’m satisfied with my situation”; “I think that with time, I’ll experience interesting and uplifting things” [own translation from German]). As noted by Dalbert (1992), the measure of the cognitive component resembles the Satisfaction With Life Scale proposed by Diener et al. (1985). The Satisfaction scales’ internal consistencies at different levels of domain specificity (i.e., global [SatisfactionGlobal]/school-specific [SatisfactionSchool]/math-specific [SatisfactionMath]) ranged between 0.83 ≤ α ≤ 0.92 (see Table 1).

Table 1 Descriptive statistics and observed bivariate correlations

The affective component was assessed using a short version of the German Mood Level Scale of the HSWBS (Dalbert, 1992), which in turn builds upon the German version (Bohner et al., 1991) of the original Mood Level Scale (Underwood & Froming, 1980). The Mood Level Scale was originally conceived to reflect the presence of positive and absence of negative affect (see Dalbert, 1992). However, although Dalbert (1992) did not discuss potential conceptual implications (instead retaining the notion of a ‘unidimensional mood scale’; also see Dalbert, 2003), the shortened six-item scale eventually proposed by the author primarily reflects the presence of positive affect, but not the absence of negative affect. The short version employed in the present study consisted of five items (e.g., “I consider myself a happy person” [original item formulation by Underwood & Froming, 1980]), whereby two items were reverse-coded (e.g., “I’m not as cheerful as most people” [original item formulation]), and one item (“I generally look at the sunny side of life” [original item formulation]) was dropped (hereafter interchangeably referred to as Mood). The Mood scales’ internal consistencies at different specificity levels (i.e., global [MoodGlobal]/school-specific [MoodSchool]/math-specific [MoodMath]) ranged between 0.70 ≤ α ≤ 0.72 (see Table 1).
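To make the reported internal consistencies concrete, the following is a minimal sketch of how Cronbach's α can be computed for a five-item scale with two reverse-coded items on a six-point response scale. It is not the authors' analysis code (the preliminary analyses were run in SPSS); the response data, the item positions of the reverse-coded items, and the function names are hypothetical, and complete data are assumed.

```python
from statistics import variance


def reverse_code(score, low=1, high=6):
    """Reverse a rating on a low..high response scale (1 <-> 6, 2 <-> 5, ...)."""
    return low + high - score


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent (all items keyed
    in the same direction).
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to a five-item scale.
# Assumption: the last two items are the negatively worded ones.
raw = [
    [5, 6, 5, 2, 1],
    [3, 3, 4, 4, 3],
    [6, 5, 6, 1, 2],
    [2, 3, 2, 5, 5],
    [4, 4, 5, 3, 2],
]
REVERSED_ITEMS = {3, 4}  # zero-based positions, hypothetical

scored = [
    [reverse_code(x) if i in REVERSED_ITEMS else x for i, x in enumerate(row)]
    for row in raw
]
alpha = cronbach_alpha(scored)
print(round(alpha, 2))
```

Reverse-keying has to happen before α is computed; otherwise the negatively worded items correlate negatively with the remaining items and deflate the coefficient.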

Academic achievement

We considered a total of three academic achievement indicators in the present study, two of which were grade-based and one test-based. The grade-based indicators relied on students’ self-reported grade in math (hereafter: Math Grade), and their self-reported overall average grade according to their most recent end-of-term certificate (hereafter: GPA). Self-reported grades are considered a valid indicator of actual grades in light of strong correlations between self-reported and actual grades, as shown in prior research (r ≥ 0.90; see Sparfeldt et al., 2008). Both grade-based indicators were assessed with a single item placed at the end of the questionnaire. As Math Grade and GPA were modelled as manifest, single-item variables, no error terms were specified. According to the German grading system, school grades range between one and six, with one being the highest grade and six the lowest grade. To facilitate interpretation, we reverse-coded the grade-based indicators so that higher scores would reflect better grades.
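The reverse-coding of grades can be sketched as follows. The exact transformation is not reported; a common choice, assumed here, is to subtract the grade from the sum of the scale endpoints (7 on the German 1 to 6 scale), which also works for fractional average grades. The function name is hypothetical.

```python
def reverse_grade(grade, best=1.0, worst=6.0):
    """Map a German school grade (1 = best, 6 = worst) onto a scale
    where higher values reflect better achievement.

    Assumption: a linear reversal (best + worst - grade).
    """
    if not best <= grade <= worst:
        raise ValueError(f"grade {grade} outside the range {best}-{worst}")
    return best + worst - grade


# Hypothetical self-reported values: a math grade and an average grade.
print(reverse_grade(2))    # a "2" becomes 5.0
print(reverse_grade(3.5))  # the scale midpoint maps onto itself: 3.5
```

Because the reversal is linear, it only flips the sign of correlations with other variables and leaves their magnitude unchanged.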

A test-based indicator of academic achievement in math (hereafter: Math Test) was computed based on the performance in selected test items developed for the standardized assessment of secondary students’ math competences (Trends in International Mathematics and Science Study; Baumert et al., 1998). The performance test thereby comprised a total of 30 items from six content domains, referred to as Algebra, Data Representation and Analysis, Number Sense, Geometry, Measurement, and Proportionality. Each of these content domains was assessed by five items. Sum scores were computed for each content domain with higher scores corresponding to better performance. A latent factor was modelled with the sum scores per sub-domain serving as manifest parcel indicators (Little et al., 2013). The internal consistency was α = 0.91 for the overall test. For the latent modelling of Math Test, we allowed residual covariances between parcel indicators that represented related content domains (also see Lauermann et al., 2020), notably between ‘Proportionality’ and ‘Measurement’, between ‘Geometry’ and ‘Measurement’, as well as between ‘Data Representation and Analysis’ and ‘Algebra’. The fit of the final measurement model of Math Test was good (χ² = 6, df = 6, CFI = 1.00, RMSEA = 0.010, SRMR = 0.008).
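The construction of the six parcel indicators can be illustrated with a short sketch. It assumes, purely for illustration, that each item is scored 0 (incorrect) or 1 (correct) and that the 30 items are stored domain by domain in blocks of five; the actual item scoring and ordering are not reported here, and the function name is hypothetical.

```python
DOMAINS = [
    "Algebra",
    "Data Representation and Analysis",
    "Number Sense",
    "Geometry",
    "Measurement",
    "Proportionality",
]
ITEMS_PER_DOMAIN = 5


def parcel_scores(item_scores):
    """Sum item scores within each content domain.

    item_scores: 30 scores, assumed to be ordered domain by domain.
    Returns one sum score (parcel) per content domain.
    """
    expected = len(DOMAINS) * ITEMS_PER_DOMAIN
    if len(item_scores) != expected:
        raise ValueError(f"expected {expected} item scores, got {len(item_scores)}")
    return {
        domain: sum(item_scores[i * ITEMS_PER_DOMAIN:(i + 1) * ITEMS_PER_DOMAIN])
        for i, domain in enumerate(DOMAINS)
    }


# Hypothetical response vector of one student (1 = correct, 0 = incorrect).
student = [
    1, 1, 0, 1, 1,  # Algebra
    1, 0, 0, 1, 0,  # Data Representation and Analysis
    1, 1, 1, 1, 1,  # Number Sense
    0, 0, 1, 0, 0,  # Geometry
    1, 0, 1, 1, 0,  # Measurement
    0, 1, 1, 0, 1,  # Proportionality
]
parcels = parcel_scores(student)
print(parcels)
```

The six resulting sum scores would then serve as the manifest indicators of the latent Math Test factor.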

Statistical analyses and procedures

Preliminary analyses were conducted with IBM SPSS version 23. Mplus version 8.6 was used for latent modelling (Muthén & Muthén, 2017). Using the “type = complex” option in Mplus, standard errors were adjusted for the non-independence of observations, with the 33 school classes serving as clusters. Maximum likelihood parameter estimates were computed with standard errors and a chi-square test statistic that are considered robust to non-normality and non-independence of observations (“estimator = MLR”; Muthén & Muthén, 2017). To deal with missing data, Full Information Maximum Likelihood (FIML) was used (Brown, 2015). Global model fit was assessed using RMSEA, SRMR, and CFI, as recommended by Goodboy and Kline (2017), whereby CFI close to 0.95, RMSEA close to 0.06, and SRMR close to 0.08 indicated good model fit (Hu & Bentler, 1999; Marsh et al., 2004). For scaling purposes, factor variances were fixed at 1, and all factor loadings were estimated freely. Correlated error terms were allowed for parallel items at different levels of specificity to account for methodological artefacts induced by analogous wording (e.g., Marsh et al., 2010). Ex-post modifications of latent measurement models were carried out in consideration of global fit indices, local areas of strain as indicated by modification indices, and substantive arguments. Confirmatory factor analyses (hereafter: CFA) were employed to first examine each measurement model separately and to assess whether the presumed factorial structure could be replicated in the case of SWB’s adapted, domain-specific measures. After verifying the appropriateness of the presumed measurement models, each model was integrated in a comprehensive model which thus included all variables representing SWB and achievement. All bivariate relations were estimated freely. To test our hypotheses, a series of invariance constraints was specified and tested one by one.
Nested models were compared based on the Satorra-Bentler scaled chi-square difference test statistic (∆χ2SB), which is appropriate for use with the MLR estimator (Brown, 2015; Muthén & Muthén, 2017). A five percent level of significance was presumed for the difference tests (p < 0.05).
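The scaled difference test can be sketched as follows. This is a minimal Python illustration of the commonly documented formula for the Satorra-Bentler scaled chi-square difference, which combines the scaled chi-square values, degrees of freedom, and scaling correction factors of the nested and the comparison model. The function names are hypothetical, and the p-value helper covers only the single-degree-of-freedom case; it is not the authors' code.

```python
import math


def sb_scaled_difference(t0, df0, c0, t1, df1, c1):
    """Satorra-Bentler scaled chi-square difference.

    t0, df0, c0: scaled chi-square, degrees of freedom, and scaling
                 correction factor of the nested (more restricted) model.
    t1, df1, c1: the same quantities for the comparison (less restricted) model.
    Returns the scaled difference statistic and its degrees of freedom.
    """
    df_diff = df0 - df1
    if df_diff <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / df_diff
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction; test undefined")
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df_diff


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Difference statistic reported for H3a in the Results, assuming df = 1.
p_h3a = chi2_sf_df1(2.23)
```

Under the assumption of one degree of freedom, the last line gives a p-value close to the p = 0.135 reported for H3a. With both scaling correction factors equal to 1, the statistic reduces to the ordinary chi-square difference.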

Results

Descriptive statistics, inter-correlations between all manifest variables, and reliability indicators are displayed in Table 1. Information on all subsequently described latent models, including model fit indices, is summarized in Table 2.

Table 2 Measurement Models of SWB at different Levels of Domain-Specificity (Global, School, Math)

Preliminary analyses

Using CFA, we first assessed the appropriateness of SWB’s latent models, particularly of school-specific and math-specific SWB, since these models relied on adaptations of the original measure of global SWB. To this end, we specified separate measurement models for SWB at each level of domain-specificity (i.e., global [abbreviated: Gl]; school-specific [abbreviated: Sc], and math-specific [abbreviated: Ma]). Each baseline model (Models Gl_0, Sc_0, Ma_0) comprised one Satisfaction factor and one Mood factor, and the inter-factor covariance was estimated freely in each model. Each of the five items per factor was specified to have one target loading; no other factor loadings or residual covariances were allowed. In each of these three baseline models, we detected local strains between the two reverse-coded items of the Mood factor. To account for methodological artefacts associated with negative wording in these two items (see Marsh, 1996), we allowed residual covariances between these two reverse-coded items (Models Gl_1, Sc_1, Ma_1), which significantly improved the fit of all models (18.24 ≤ ∆χ2SB ≤ 107.55, df = 1, p < 0.01). The resulting three measurement models of SWB at each level of domain-specificity showed good fit to the data (Models Gl_1, Sc_1, Ma_1: 45 ≤ χ2 ≤ 129; df = 33; 0.963 ≤ CFI ≤ 0.993; 0.021 ≤ RMSEA ≤ 0.062; 0.022 ≤ SRMR ≤ 0.038).
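The degrees of freedom of these models can be checked by counting sample moments and free parameters. The Python sketch below assumes ten indicators (five per factor), factor variances fixed at 1 as described under the statistical procedures, and a saturated mean structure that leaves the count unaffected. The function name and its arguments are hypothetical.

```python
def cfa_df(n_indicators, n_loadings, n_factor_covariances, n_residual_covariances):
    """Degrees of freedom of a CFA whose factor variances are fixed at 1.

    Only the covariance structure is counted; item intercepts are assumed
    to be freely estimated and therefore cancel against the observed means.
    """
    # Unique variances and covariances among the indicators.
    moments = n_indicators * (n_indicators + 1) // 2
    # Loadings, one residual variance per indicator, factor covariances,
    # and any residual covariances that were freed.
    free_parameters = (
        n_loadings + n_indicators + n_factor_covariances + n_residual_covariances
    )
    return moments - free_parameters


# Baseline two-factor model: 10 target loadings, 1 inter-factor covariance.
baseline_df = cfa_df(10, 10, 1, 0)  # 55 - 21 = 34
# Modified model: one residual covariance between the two reverse-coded items.
modified_df = cfa_df(10, 10, 1, 1)  # 55 - 22 = 33
```

Under these assumptions, the modified model has 33 degrees of freedom, in line with the reported df = 33, and the difference of one degree of freedom corresponds to the single residual covariance that was freed.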

To gather further insights on the appropriateness of the presumed two-dimensional structure of domain-specific SWB, we computed alternative one-dimensional models of school- and math-specific SWB (Models Sc_2 and Ma_2). Global fit indices for these alternative, one-dimensional models did not meet the criteria for good model fit, and difference tests also indicated a significantly better fit for the two-dimensional models (77.92 ≤ ∆χ2SB ≤ 146.18, df = 1, p < 0.01). We therefore continued with SWB’s identified two-dimensional models (i.e., Gl_1, Sc_1, Ma_1) and integrated them in one comprehensive SWB measurement model across levels of domain-specificity (Model SWB_1). Consistent with conventional practice (e.g., Arens et al., 2011), residual covariances between items with parallel wording were allowed in this model. This comprehensive SWB model across domain-specificity levels showed a good fit to the data (χ2 = 657, df = 357, CFI = 0.969, RMSEA = 0.033, SRMR = 0.046). Completely standardized factor loadings were all positive and significant (p < 0.01). Specifically, completely standardized factor loadings on the Satisfaction factors ranged from 0.60 to 0.90, and they were similar across specificity levels (math-specific Satisfaction: 0.68 ≤ λ ≤ 0.90; school-specific Satisfaction: 0.61 ≤ λ ≤ 0.84; global Satisfaction: 0.60 ≤ λ ≤ 0.81). Regarding the Mood factors, factor loadings ranged from 0.26 to 0.91 (by level of domain-specificity: math-specific Mood: 0.26 ≤ λ ≤ 0.91; school-specific Mood: 0.33 ≤ λ ≤ 0.83; global Mood: 0.36 ≤ λ ≤ 0.84). Note that the loadings on the Mood factors were comparatively low for the two reverse-coded items at all levels of domain-specificity. 
Inter-factor correlations between Satisfaction and Mood at the same level of domain-specificity were high and similar across levels of domain-specificity (math-specific SWB: r = 0.73; school-specific SWB: r = 0.66; global SWB: r = 0.77; for inter-factor correlations also see Fig. 1). Inter-factor correlations between the analogous factors at different levels of domain-specificity ranged from 0.47 to 0.68 in the case of Satisfaction, and from 0.36 to 0.60 in the case of Mood. Correlations were descriptively stronger between factors at adjacent levels of domain-specificity (i.e., math and school, or school and global) than between factors at non-adjacent levels of domain-specificity (i.e., math and global).

Fig. 1

Model-implied estimates of bivariate correlations in the full model of SWB and academic achievement. *** p < 0.001. ** p < 0.01. * p < 0.05. † p < 0.10 (two-sided). All significant correlations (p < 0.10) are printed in bold. GPA = student-reported average grade on the last end-of-term certificate. Grade-based variables (GPA and Math Grade) were reverse-coded so that higher grades correspond to better performance ratings. Global fit indices for this model: χ2 = 1,057, df = 595, CFI = 0.967, RMSEA = 0.032, SRMR = 0.048.

Full model of SWB and academic achievement

All variables were integrated in a full model comprising the six factors representing SWB at three levels of domain-specificity, as well as three indicators of academic achievement, namely GPA, Math Grade, and Math Test (see Fig. 1). The fit of this comprehensive model was good (χ2 = 1,057, df = 595, CFI = 0.967, RMSEA = 0.032, SRMR = 0.048). All bivariate associations in this full model were freely estimated. Model-implied estimates of correlations and levels of significance are displayed in Fig. 1. To address our overarching research questions, we subsequently tested a series of parameter constraints (see summary of findings in Table 3).

Table 3 Hypothesis testing of differential SWB/achievement relations arranged by research questions

Differential SWB/achievement relations by SWB’s level of domain-specificity and (mis-)matching specificities between variables

For SWB’s cognitive component, we consistently detected stronger SWB/achievement relations when the SWB measure matched the achievement indicator’s specificity-level, and was also more specific (r1) than when it mismatched the achievement indicator’s specificity level, and was also less specific (r2; H1a: e.g., r1 [SatisfactionMath, Math Test] > r2 [SatisfactionSchool, Math Test]; 0.34 ≤ r1 ≤ 0.56; 0.08 ≤ r2 ≤ 0.36; 17.42 ≤ ∆χ2SB≤ 72.40, p (1) < 0.001). Our results were less consistent when focussing on SWB’s affective component (H1b). Specifically, we found stronger SWB/achievement relations with a specificity-matching and more specific (r1), versus a specificity-mismatched and less specific measure (r2) only for the case of math-specific Mood (e.g., r1 [MoodMath, Math Test] vs. r2 [MoodSchool, Math Test]; 0.26 ≤ r1 ≤ 0.39; -0.02 ≤ r2 ≤ 0.16; 8.02 ≤ ∆χ2SB≤ 213.22, p (1) < 0.001). In contrast, for school-specific Mood, the SWB/achievement relation did not differ significantly from that concerning a less specific and specificity-mismatched SWB measure (i.e., r1 [MoodSchool, GPA] = 0.22 vs. r2 [MoodGlobal, GPA] = 0.25, ∆χ2SB = 1.01, p = 0.314). Consistent with our hypotheses, for both SWB components, we identified stronger SWB/achievement relations between specificity-matching, and also math-specific variables than between specificity-matching, and also school-specific variables, given the same type of achievement indicator (i.e., grade-based; H2a: r1 [SatisfactionMath, Math Grade] = 0.56 > r2 [SatisfactionSchool, GPA] = 0.43, ∆χ2SB = 213.22, p < 0.001; H2b: r1 [MoodMath, Math Grade] = 0.39 > r2 [MoodSchool, GPA] = 0.22, ∆χ2SB = 61.25, p < 0.001). Lastly, we tested whether the SWB/achievement relation between specificity-matching variables at a mid-level of specificity differed significantly from that regarding specificity-mismatched variables that also involved a more specific measure of SWB. 
For these configurations, invariance testing indicated that the SWB/achievement relation was not significantly different for any of SWB’s two components (H3a: r1 [SatisfactionSchool, GPA] = 0.43 vs. r2 [SatisfactionMath, GPA] = 0.36, ∆χ2SB = 2.23, p = 0.135; H3b: r1 [MoodSchool, GPA] = 0.22 vs. r2 [MoodMath, GPA] = 0.26, ∆χ2SB = 1.11, p = 0.291).

Differential SWB/achievement relations by type of SWB component (cognitive vs. affective)

To address our second research question, we assessed whether the SWB/achievement relation is significantly stronger with the cognitive component (r1) than with the affective component (r2), given matching specificities between variables. Our invariance tests supported this hypothesis in the case of math-specific variables (H4a: e.g., r1 [SatisfactionMath, Math Grade] > r2 [MoodMath, Math Grade]; 0.34 ≤ r1 ≤ 0.56; 0.26 ≤ r2 ≤ 0.39; 4.53 ≤ ∆χ2SB ≤ 30.41, p (1) < 0.050). Analogously, invariance testing also supported the corresponding hypothesis for school-specific variables (i.e., H4b: r1 [SatisfactionSchool, GPA] = 0.43 > r2 [MoodSchool, GPA] = 0.22, ∆χ2SB = 25.09, p < 0.001).

Differential SWB/achievement relations by type of achievement indicator

Lastly, we tested whether the SWB/achievement relation is significantly stronger with a grade-based achievement indicator (r1) than with a test-based indicator (r2) across both SWB components, given matching specificities between variables. Since both types of achievement indicators were available only at the math-specific but not the school-specific level, we only tested differential relations with math-specific variables. Our invariance tests supported this hypothesis: SWB/achievement relations were significantly stronger in conjunction with a grade-based rather than a test-based math achievement indicator (H5: e.g., r1 [SatisfactionMath, Math Grade] > r2 [SatisfactionMath, Math Test]; 0.39 ≤ r1 ≤ 0.56; 0.26 ≤ r2 ≤ 0.34; 10.82 ≤ ∆χ2SB ≤ 68.45, p (1) < 0.001).

Discussion

In the present study we focused on heterogeneous associations between SWB and academic achievement. In contrast to prior studies aimed at understanding differential SWB/achievement relations (e.g., Bücker et al., 2018; Long & Huebner, 2014), our analytical approach was based on systematically varying only one of three identified potential sources of differential relations at a time, namely the level of domain-specificity of the measures involved (RQ1), the type of SWB component under consideration (RQ2), and the type of achievement indicator (RQ3). By individually varying only one of these potential sources of differential SWB/achievement relations, we accounted for potentially confounding factors that could have masked significant differences in preceding studies (e.g., Bücker et al., 2018; Long & Huebner, 2014). Our account of the role of domain specificity in understanding differential SWB/achievement relations went beyond a purely methodological perspective, as we embedded our investigation in a theory-grounded proposition of a subject-specific measure of SWB in schools.

With respect to our first research question (stronger SWB/achievement relation with more specific and specificity-matching variables), our findings differed for the two components of SWB. Regarding SWB’s cognitive component, our findings were consistent with our hypotheses (H1a, H2a). That is, we detected a pattern of stronger SWB/achievement relations with the cognitive component’s more specific and specificity-matching measures. The strength of the SWB/achievement relation thus varied substantially as the SWB measure’s specificity level was altered, with small to non-significant correlations with a global measure of SWB (0.08 ≤ r ≤ 0.20), and at least medium-sized correlations with a math-specific measure of SWB in combination with a specificity-matching indicator of academic achievement (0.34 ≤ r ≤ 0.56). Importantly, the SWB/achievement relation did not differ significantly whether a school-specific or math-specific SWB measure was employed together with a school-specific achievement indicator (here: GPA; H3a). This latter finding corresponds to the evidence on achievement goals (e.g., Baranik et al., 2010), and is consistent with the traditional emphasis on the correspondence of specificity levels between constructs (Ajzen, 2012; Ajzen & Fishbein, 1977). Overall, our findings also resonate with contemporary perspectives on validation that conceive of validity as an attribute to be assessed against the intended uses and interpretations of a measurement instrument, rather than as an inherent attribute (Kane, 2013; also see Wolters & Won, 2017). Under this perspective, our findings support the empirical relevance of the proposed measures of SWB’s domain-specific cognitive component (school-specific and math-specific), given that achievement outcomes at corresponding levels of domain-specificity (i.e., school-level and subject-level) constitute criteria of interest in educational psychology research (see Steinmayr et al., 2014).

Note that our findings stand in contrast with those of Long and Huebner (2014), who observed stronger SWB/achievement relations with a global rather than a school-specific measure of the cognitive component, whereas we observed the opposite pattern. In light of such diverging evidence, we draw attention to a central difference between these two studies regarding the operationalization of SWB’s domain-specific measures. In the present study, the domain-specificity of the employed SWB measures was achieved by using contextualized item-stems (i.e., using ‘in general/in school/in math class….’), while keeping the item content per se unchanged (see Shaffer & Postlethwaite, 2012). In contrast, Long and Huebner employed entirely different measures, whereby the items included in each measure differed not only in their degree of context-specificity, but also in their content. Correspondingly, these authors’ findings may have reflected both the effects of differential domain-specificities in the cognitive component and more pronounced, substantive differences in how they conceived the underlying constructs at different specificity levels.

Concerning SWB’s affective component, our findings supported our hypotheses of stronger SWB/achievement relations with more specific and specificity-matching measures (H1b, H2b), but only for the math-specific, not the school-specific measure. Specifically, the strength of the SWB/achievement relation varied substantially, exhibiting small- to medium-sized associations with a math-specific measure of the affective component in combination with a math-specific achievement indicator (0.26 ≤ r ≤ 0.39), and small to near-zero associations when the same achievement indicator was combined instead with a less specific SWB measure (-0.02 ≤ r ≤ 0.16). In contrast, we detected no significant difference in the SWB/achievement relation when combining the affective component with an achievement indicator at the school-specific level (i.e., GPA), regardless of whether the affective component’s measure was school-specific (r = 0.22) or global (r = 0.25). Lastly, similar to our findings regarding the cognitive component, the SWB/achievement relation did not differ significantly whether a school-specific or math-specific SWB measure was used together with a school-specific measure of academic achievement (i.e., GPA; H3b). In sum, our results support the empirical relevance of a math-specific measure of SWB’s cognitive and affective components, provided the achievement indicator of interest is also subject-specific. Our findings in this regard were consistent for both types of achievement indicators, that is, also in the case of a non-self-reported measure (i.e., performance in a standardized math test). Note that this underscores a specific strength of our study: it pre-empts potential concerns that stronger SWB/achievement relations with more specific measures reflect no more than methodological artefacts (i.e., higher SWB/achievement relations since measures for both variables involve ‘in math’). 
In contrast, our findings fail to support an analogous argument for a school-specific measure of the affective component. When interpreting these findings, we surmise that, with regard to students’ evaluations of their habitual affective state, ‘school’ versus ‘life in general’ may not constitute such distinct contexts. Instead, how students feel in school may reflect a broad spectrum of influences from other life domains such as families and friends (also see Steinmayr et al., 2016). Or, conversely, since ‘school’ is such an encompassing domain in the lives of adolescents (Eccles & Roeser, 2011), the overlap between ‘life in school’ and ‘life in general’ may be too large to be associated with substantively meaningful differences in the associated affective experiences. However, if that were indeed true, one might expect an even higher correlation between the global and school-specific measures of SWB’s affective component than the inter-factor correlation observed in this study (r = 0.60; see Fig. 1). Another potential explanation is that it is harder to discern differential relations with a school-specific than a global measure of the affective component, since the specificity-matching indicator of academic achievement (here: GPA) displays less variance relative to the corresponding subject-specific indicator (here: Math Grade; SDGPA = 0.65 vs. SDMath Grade = 1.14, see Table 1). However, rather than suggesting a methodological artefact, we argue that this interpretation underscores the substantive relevance of subject-specific measures, since substantively meaningful subject-level variance may be obscured at the more aggregated level. Whereas researchers have referred to this argument in adopting SWB’s school-specific measures (e.g., Putwain et al., 2020), our findings suggest that they may have overlooked the right degree of domain-specificity.

With respect to our second research question (stronger SWB/achievement relations with the cognitive compared to the affective component), our results were consistent with our hypothesis of stronger SWB/achievement relations with the cognitive rather than the affective component across levels of specificity and types of achievement indicators. Our findings are therefore consistent with those from a longitudinal study by Steinmayr et al. (2016), but they contradict the meta-analytic findings by Bücker et al. (2018), who did not find the component type to be a significant moderator of the SWB/achievement relation. However, to the best of our knowledge, the present study is the first to have tested differential SWB/achievement relations by component type while also accounting for matching levels of domain-specificity and holding the type of achievement indicator constant. Note that when ignoring matching specificity levels between variables, and instead employing global measures of SWB in combination with a school-specific achievement indicator (i.e., GPA), the relation did not differ significantly by type of SWB component in the present study either (r [SatisfactionGlobal, GPA] = 0.20; r [MoodGlobal, GPA] = 0.25; ∆χ2SB = 1.60, p = 0.205). This in turn resembles a configuration of variables that has typically been employed in published studies (e.g., Steinmayr et al., 2016, 2018), and it is consistent with the meta-analytic findings by Bücker and colleagues (2018), who did not detect component type as a significant moderator of the SWB/achievement relation. In sum, this underscores a specific strength of the present study in systematically altering only one of the factors underlying differential SWB/achievement relations, thereby accounting for potentially confounding factors.

With respect to our third research question (grade-based versus test-based indicator of academic achievement in math), our findings were consistent with our hypothesis of stronger SWB/achievement relations in the case of a (high-stakes) grade-based, as opposed to a (low-stakes) test-based indicator. Our findings resonate with existing evidence of differential relations between achievement indicators and other constructs, including students’ self-concept of ability (e.g., Lauermann et al., 2020; Marsh et al., 2005). Since grades also give students performance feedback, it seems plausible that stronger associations with SWB would emerge than with standardized achievement indicators, which in turn play a subordinate role in the German education system and are rarely employed for student-feedback purposes. Furthermore, our findings complement research on differential SWB/achievement relations that focused instead on differentiating between subjectively- and objectively-reported grades, which revealed no significant differences (Bücker et al., 2018).

Limitations and directions for future research

Some limitations should also be mentioned. Firstly, in light of the cross-sectional design of our study, no causal conclusions regarding the association between SWB and academic achievement should be inferred. Hence, future studies should focus more on longitudinal research, also examining the reciprocal associations between domain-specific SWB components and different indicators of academic achievement (e.g., Steinmayr et al., 2016). Furthermore, our study sample was recruited from four schools, and their populations were not representative of German 8th and 9th grade students. Specifically, our students mainly came from families with a more advanced educational background, and the proportion of students with a migration background was comparatively small. Although these sociodemographic variables might have a small impact on SWB (e.g., Crede et al., 2015), they could be more relevant with respect to academic achievement indicators. Since children’s self-perceptions become more differentiated as they age (see McCombs, 2001), future studies should attend to potential developmental differences in the pertinence of subject-specific SWB in younger students. Furthermore, our findings may be specific to schooling contexts in which factors that may contribute to students’ subjective experiences systematically differ by subject-domain (e.g., teaching personnel), and where much schooling time is scheduled by subject. This, in turn, would contribute to school subjects being a differential context in students’ perceptions (also see Jacobs et al., 2003). Moreover, in the present study, we focused on math as the school subject. Future studies should simultaneously address well-being constructs and achievement indicators in several different school subjects to enable within- and between-school subject variations to be disentangled (see for example Gogol et al., 2017). 
It might be relevant in this regard that published studies have demonstrated significant differences in the strength of correlations between different emotional constructs by school subject, with academic emotions in math showing the strongest within-subject correlations (Goetz et al., 2007). Furthermore, in the present study we relied on a uni-dimensional mood scale focusing on the presence of positive affect (Dalbert, 1992, 2003). Future studies should also address experiences of negative affect by representing positive and negative affect as separate dimensions (see Busseri & Sadava, 2011; Long et al., 2012), or by investigating the relative frequency of experiences of positive versus negative affect in school (e.g., Tian et al., 2015).

Conclusion

Over the past two decades, concerns about students’ well-being have become increasingly prominent in both education policy and research. As a result, more and more researchers have investigated students’ SWB and its relations with other critical outcomes of schooling, particularly academic achievement. However, with the increasing availability of empirical evidence, findings on the SWB/achievement relation have become more diverse, and this heterogeneity is yet to be understood. The present study contributes to a better understanding of heterogeneous findings by providing evidence of stronger SWB/achievement relations with domain-specific vs. global measures of SWB, with the cognitive vs. affective component of SWB, and with grade-based vs. test-based indicators of academic achievement. Our findings show that, in contrast to prior investigations, divergent findings regarding the strength of the SWB/achievement relation can be understood when analyses enable the systematic variation of only one of the potential sources of differential relations. Furthermore, our findings highlight the empirical relevance of a subject-specific measure of SWB, at least in math. Last but not least, and resonating with the current socio-political emphasis on both well-being and academic achievement as simultaneous, desirable outcomes of schooling, the present study aimed to integrate theoretical perspectives and empirical evidence from research focusing on children’s SWB and from educational psychology research.