Test anxiety (TA) is common amongst university students (Knappe et al. 2011). Meta-analyses consistently find that TA is associated with poorer academic performance (Hembree 1988; Seipp 1991; von der Embse et al. 2018). Moreover, high-test-anxious students are more likely to repeat a year of study or drop out of university than their low-test-anxious peers (Mourshed et al. 2015; Schaefer et al. 2007; cf. Neuderth et al. 2009).

TA is a situation-specific form of anxiety, measurable in both trait and state forms. Trait TA is the relatively enduring disposition to perceive test situations as threatening; trait TA interacts with situational variables (e.g., examination ‘stakes’) to produce the degree of state TA (Spielberger and Vagg 1995). TA consists of ‘Worry’ and ‘Emotionality’ (Liebert and Morris 1967; Spielberger 1980). Worry is repetitive negative thinking focused on failure, while Emotionality refers to the somatic symptoms experienced prior to and during tests. Worry has much stronger associations with decreased academic performance than Emotionality (Cassady and Johnson 2002; Seipp 1991; von der Embse et al. 2018).

Existing treatments for test-anxious university students achieve medium effect sizes compared to control conditions for reducing TA (g = −0.64) and only weak effects for improving academic performance (g = 0.28), when outliers are removed (Huntley et al. 2019). However, confidence in these findings was tempered by the lack of high-quality trials and evidence of publication bias. Moreover, most interventions for TA focused on modifying somatic symptoms rather than worry. Interventions that target worry, the cardinal feature of TA, may achieve greater efficacy. One candidate intervention that does target worry for therapeutic modification is metacognitive therapy (MCT; Wells 2000, 2009).

MCT is based upon the transdiagnostic Self-Regulatory Executive Function (S-REF) or metacognitive model of psychopathology (Wells and Matthews 1994; Wells and Matthews 1996). The S-REF model proposes that emotional distress is caused by how people respond to negative thoughts and feelings termed the Cognitive Attentional Syndrome (CAS). The CAS includes perseverative thinking (e.g., worry, overanalyzing), monitoring for threat (e.g., scanning the environment), and coping strategies (e.g., distraction, avoidance). Applied to TA, the CAS might manifest itself as worry about failing to meet one’s test goals and the personal and social consequences of such failure, monitoring the body for signs of somatic symptoms, and coping strategies such as trying to distract oneself from thoughts about failure.

The CAS is initiated and prolonged by metacognitive beliefs, which are cognitions about cognitions. Two types of metacognitive beliefs are considered particularly important to the S-REF model: positive metacognitive beliefs concerning the usefulness of engaging in the CAS (e.g., “Worrying helps me to cope”), and negative metacognitive beliefs concerning the perceived uncontrollability and danger of worry (e.g., “Once I start worrying I cannot stop”) (Wells 2009; Wells and Matthews 1994; Wells and Matthews 1996).

The main self-report measure of metacognitive beliefs is the Metacognitions Questionnaire-30 (MCQ-30; Wells and Cartwright-Hatton 2004). Initial examination of its psychometric properties found a stable five-factor structure, while concurrent validity was demonstrated between its subscales and measures of pathological worry and anxiety (Spada et al. 2008; Wells and Cartwright-Hatton 2004). The five-factor structure of MCQ-30 has subsequently been replicated in epilepsy (Fisher et al. 2016), Obsessive-Compulsive Disorder (Grøtte et al. 2016), and breast cancer (Cook et al. 2014) populations. To our knowledge, no studies have used the MCQ-30 in TA among university students. Prior to systematic investigation of the utility of the S-REF model in TA among university students, an important first step is to establish the psychometric properties of the MCQ-30 in this population.

This study has two primary aims: (i) to examine if the five-factor structure of the MCQ-30 is valid in TA, and (ii) to examine the concurrent validity between the MCQ-30 and TA. We examine factor structure and concurrent validity with both trait and state TA, to ensure the generalizability of findings and give confidence to researchers using the measure in different contexts. The MCQ-30 five-factor structure has been replicated across mental and physical health samples thus far (e.g., Cook et al. 2014; Fisher et al. 2016; Grøtte et al. 2016), so we hypothesize the published five-factor structure will hold here. With regard to concurrent associations between MCQ-30 subscales and TA, we hypothesize ‘Negative beliefs about uncontrollability and danger of worry’ will have the largest associations with Worry and Emotionality dimensions of TA, as these beliefs are consistently associated with emotional disorder (e.g., Roussis and Wells 2006; Wells 2005).

Method

Participants

Participants consisted of two cohorts of university students from a large UK university. The first cohort (‘trait TA’) consisted of participants drawn from the entire student population. An accurate indication of state TA can only be achieved if it is assessed immediately prior to an examination (Zeidner 1998). In the present study, it was not possible to assess state TA in the general student population (i.e., the population from which the ‘trait TA’ cohort was sampled) because examination schedules and formats differed. By contrast, Objective Structured Clinical Examinations (OSCEs) for medical students follow a highly structured format and involve a pre-examination waiting period of approximately 20 min, which allowed state TA to be assessed immediately before the examination started. The second cohort (‘state TA’) therefore consisted of medical students about to sit their OSCEs.

Measures

Metacognitions Questionnaire–30 (MCQ-30; Wells and Cartwright-Hatton 2004)

The MCQ-30 consists of 30 items assessing maladaptive metacognitive beliefs. It has five subscales: (i) ‘Positive beliefs about worry’, (ii) ‘Negative beliefs about uncontrollability and danger of worry’, (iii) ‘Cognitive confidence’, (iv) ‘Need to control thoughts’, and (v) ‘Cognitive self-consciousness’. Items are scored on a 4-point scale from 1 (“Do not agree”) to 4 (“Agree very much”). Subscale scores range from 6 to 24, with higher scores indicating greater conviction in metacognitive beliefs. The MCQ-30 has acceptable-to-excellent internal consistency (subscale αs from .72–.93), consistent factor structure, and convergent validity with other measures of maladaptive metacognition (Spada et al. 2008; Wells and Cartwright-Hatton 2004).

Test Anxiety Inventory (TAI; Spielberger 1980)

The TAI consists of 20 items assessing trait test anxiety. It has two subscales: (i) Worry, and (ii) Emotionality. Items are scored on a 4-point scale from 1 (“Almost never”) to 4 (“Almost always”). Subscale scores range from 8 to 32, with higher scores indicating greater test anxiety. The TAI has excellent internal consistency (αs from .90–.91), consistent factor structure, and convergent validity with other measures of test anxiety (e.g., Everson et al. 1991; Spielberger 1980).

State-Trait Inventory for Cognitive and Somatic Anxiety – State Subscale (STICSA-S; Ree et al. 2008)

The STICSA-S consists of 21 items assessing state anxiety. It has two subscales: (i) S-Cognitive Anxiety (α = .88), and (ii) S-Somatic Anxiety (α = .88). Items are scored on a 4-point scale from 1 (“Not at all”) to 4 (“Very much so”). Subscale scores range from 10 to 40 for Cognitive Anxiety and 11–44 for Somatic Anxiety, with higher scores indicating greater state anxiety. The STICSA-S has good internal consistency (αs from .87–.88), consistent factor structure, and convergent validity with other measures of state anxiety, and the scale has previously been used to measure state anxiety in examination contexts (Gros et al. 2007; Ree et al. 2008). For the purpose of this study, we refer to S-Cognitive as S-Worry and S-Somatic as S-Emotionality.
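The subscale score ranges reported for the three measures follow directly from the number of items in each subscale and the 1 to 4 response format. The sketch below illustrates that arithmetic. It assumes simple sum scoring with no reverse-keyed items, and the function names are hypothetical rather than part of any published scoring key.

```python
from typing import Sequence, Tuple


def subscale_range(n_items: int, low: int = 1, high: int = 4) -> Tuple[int, int]:
    """Return the (minimum, maximum) possible sum score for a subscale."""
    return n_items * low, n_items * high


def score_subscale(responses: Sequence[int], n_items: int,
                   low: int = 1, high: int = 4) -> int:
    """Sum the item responses after checking their count and range.

    Assumes plain sum scoring (an assumption of this sketch).
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r < low or r > high for r in responses):
        raise ValueError(f"responses must lie between {low} and {high}")
    return sum(responses)


# Item counts implied by the score ranges given in the text.
print(subscale_range(6))    # MCQ-30 subscale: 6 items
print(subscale_range(8))    # TAI Worry / Emotionality: 8 items each
print(subscale_range(10))   # STICSA-S Cognitive: 10 items
print(subscale_range(11))   # STICSA-S Somatic: 11 items
```

For example, a respondent answering 1, 2, 3, 4, 1, 2 on the six items of one MCQ-30 subscale would receive a subscale score of 13.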

Procedure

Data were collected using convenience sampling. Participants were recruited via advertisements highlighting the aims, methods, and voluntary nature of the study. The participant information sheet highlighted to students that they could withdraw at any time with no impact upon their studies or grades. The first cohort (‘trait TA’ dataset) completed the MCQ-30 and TAI online during term time. The second cohort (‘state TA’ dataset) completed paper copies of the MCQ-30 and STICSA-S approximately 30 min before their Objective Structured Clinical Examinations (OSCEs). OSCEs are used in medical training, where students perform a series of clinical tasks while being observed and evaluated by examiners (Harden 1988). A copy of the participant information sheet was provided to all potential participants at least two weeks before their OSCEs. Participants could be entered into a prize draw for Amazon vouchers (first prize of £100, four second prizes of £25) if they completed the study. University ethical approval was granted for this research. Informed consent was obtained from all participants.

Data Analysis Strategy

Confirmatory Factor Analysis (CFA) was used to examine whether the published five-factor structure of the MCQ-30 fit the data. Fit of alternative models was explored using Exploratory Factor Analysis (EFA), where models up to and including five factors were tested, and where items were free to load. CFA and EFA were conducted on the same ‘trait TA’ and ‘state TA’ datasets. Analyses used the weighted least squares estimator (WLSMV) recommended for ordinal data. As MCQ-30 factors are inter-correlated, an oblique rotation (i.e., Geomin) was used to determine optimal factor loadings. Adequacy of fit for both models (i.e., CFA and EFA) was assessed using four indices of fit, comprising two incremental and two absolute ‘misfit’ indices. The two incremental indices were the Comparative Fit Index (CFI) and the Tucker-Lewis Index (TLI), where values ≥ .95 indicate adequate fit (Hu and Bentler 1999). The two absolute ‘misfit’ indices were the Root Mean Square Error of Approximation (RMSEA), where values < .05 indicate good fit and values between .05–.08 indicate adequate fit (Browne and Cudeck 1992), and the Weighted Root Mean Square Residual (WRMR), where values < 1.00 indicate good fit (DiStefano et al. 2018). For the EFA, the Standardized Root Mean Square Residual (SRMR) was used instead of the WRMR, with values < .08 indicating adequate fit (Hu and Bentler 1999). Inter-correlations among factors were examined and the internal consistency of each factor was measured using Cronbach’s alpha.
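The fit criteria above can be expressed as a simple set of checks. The following is a minimal sketch, not the procedure used in Mplus: the function names are hypothetical, the ‘poor’ label for RMSEA values above .08 is an assumption of the sketch, and the input values in the usage line are invented for illustration only.

```python
def rmsea_label(value: float) -> str:
    """Classify RMSEA: < .05 good, .05 to .08 adequate (Browne and Cudeck 1992)."""
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "adequate"
    return "poor"  # label for values above .08 is an assumption of this sketch


def check_fit(cfi: float, tli: float, rmsea: float,
              residual: float, residual_cutoff: float) -> dict:
    """Compare fit indices with the cutoffs described in the text.

    residual is WRMR (cutoff 1.00) for CFA or SRMR (cutoff .08) for EFA.
    """
    return {
        "CFI": cfi >= 0.95,                      # Hu and Bentler (1999)
        "TLI": tli >= 0.95,                      # Hu and Bentler (1999)
        "RMSEA": rmsea_label(rmsea),
        "residual": residual < residual_cutoff,  # WRMR < 1.00 or SRMR < .08
    }


# Hypothetical CFA-style values, using the WRMR cutoff of 1.00.
print(check_fit(cfi=0.96, tli=0.95, rmsea=0.04, residual=0.90, residual_cutoff=1.00))
```

Cutoffs of this kind are conventions rather than strict tests, so a model that narrowly misses one criterion may still be judged acceptable when the indices are considered together.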

Exploratory data analyses were then conducted, using Pearson’s correlations to examine relationships between age and the study variables, and independent t-tests and ANOVAs to examine whether study variable scores differed significantly by (i) gender, and (ii) year of study.

Finally, concurrent validity was assessed in the trait and state TA datasets by fitting a structural model in which latent variables for TA Worry and Emotionality were regressed onto MCQ-30 factors. Model fit was assessed using the CFI, TLI, RMSEA, and SRMR fit indices.

All analyses were conducted in Mplus version 8 (Muthén and Muthén 2008-2017).

Results

The demographics of the ‘trait TA’ and ‘state TA’ samples are presented in Table 1. The ‘state TA’ sample was significantly older (t(735) = 4.87, p < .001, Mdifference = 1.07 years) and in later years of study (χ2 (3) = 230.01, p < .001) than the ‘trait TA’ sample. There were also significantly more females (χ2 (1) = 21.29, p < .001) in the ‘trait TA’ sample, and the ethnic composition of the two samples was significantly different (χ2 (6) = 47.77, p < .001), with the ‘trait TA’ sample having a larger proportion of Caucasian respondents than the ‘state TA’ sample.

Table 1 Demographic characteristics of trait and state test anxiety samples

Factor Structure of the MCQ-30 in TA

CFA of the published MCQ-30 five-factor solution found adequate fit of the model to trait TA data: χ2 (395) = 874.05, p < .0001, CFI = .92, TLI = .92, RMSEA = .066 (90% CIs .060–.072), WRMR = 1.27. EFA confirmed that a five-factor model provided the best fit to the data: χ2 (295) = 428.22, p < .0001, CFI = .98, TLI = .97, RMSEA = .040 (90% CIs .032–.048), SRMR = .033. Table 2 shows item loadings. Minor discrepancies were observed between the EFA-derived model and the published five-factor model. Item 3 loaded higher on ‘Negative beliefs about uncontrollability and danger of worry’ (factor 1) than its own factor of ‘Cognitive self-consciousness’ (factor 4), and item 27 did not load highly on any factor.
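For readers less familiar with the RMSEA, it can be approximated from the model chi-square, its degrees of freedom, and the sample size. The sketch below uses the common textbook formula; implementations differ in whether they divide by N or N − 1, so software output may differ slightly. The function name is hypothetical, and the sample size of 277 in the usage line is an assumption inferred from the degrees of freedom of the t-tests reported for the ‘trait TA’ dataset (t(275)), not a figure stated in the text.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Textbook RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Trait TA CFA values from the text; n = 277 is an inferred assumption.
print(round(rmsea(874.05, 395, 277), 3))
```

Under that assumed sample size the result is about .066, in line with the value reported for the CFA of the trait TA data.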

Table 2 MCQ-30 published scale structure and Geomin rotated factor loadings from exploratory factor analyses of trait and state datasets

For the state TA dataset, CFA of the MCQ-30 five-factor model found adequate model fit: χ2 (395) = 992.79, p < .0001, CFI = .95, TLI = .95, RMSEA = .057 (90% CIs .053–.062), WRMR = 1.29. EFA confirmed that a five-factor model again provided the best fit to the data: χ2 (295) = 625.27, p < .0001, CFI = .97, TLI = .95, RMSEA = .049 (90% CIs .044–.055), SRMR = .034. Table 2 shows item loadings. All items loaded on their expected factors except items 3, 6, 13, and 27. Item 3 loaded highly on its own factor of ‘Cognitive self-consciousness’ (factor 4) but also loaded on ‘Negative beliefs about uncontrollability and danger of worry’ (factor 1). Items 6, 13, and 27 did not load highly on any factor.

MCQ-30 Subscales: Descriptive Statistics and Intercorrelations

Means and standard deviations of the MCQ-30 subscales and the intercorrelations between them are presented in Table 3 (derived from CFA analyses). The majority of intercorrelations were significant, and mostly in the small-to-medium effect size range based upon Cohen’s (1992) taxonomy. Internal consistency was good across subscales, with Cronbach’s alphas ranging between .82–.88, except for the ‘Need to control thoughts’ subscale, which had acceptable internal consistency of .70–.74.
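Cronbach’s alpha for a subscale can be computed from the item variances and the variance of the summed score. The following is a minimal sketch using only the Python standard library; the function name and the toy data are hypothetical and are not drawn from the study datasets.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for item responses.

    rows holds one sequence per respondent, with one value per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents answering two items (hypothetical values).
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a subscale containing items with weak loadings tends to show lower internal consistency.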

Table 3 Descriptive data and correlations between Metacognitions Questionnaire-30 and test anxiety measures (Trait = TAI; State = STICSA) subscales

Exploratory data analyses found age was not significantly correlated with MCQ-30, TAI, or STICSA-S subscales. In the ‘trait TA’ dataset, females reported significantly higher TAI-Worry (t(275) = 3.33, p = .001), TAI-Emotionality (t(275) = 5.51, p < .001), and MCQ-NEG (t(275) = 2.50, p = .013) than males. In the ‘state TA’ dataset, females reported significantly higher STICSA-Worry (t(461) = 4.00, p < .001), STICSA-Emotionality (t(461) = 5.23, p < .001), MCQ-NEG (t(461) = 4.10, p < .001), and MCQ-CC (t(461) = 3.33, p = .001) than males, but lower MCQ-NC (t(461) = −2.50, p = .013). In the ‘state TA’ dataset, independent ANOVAs found a significant difference in STICSA-Worry (F(2, 460) = 3.71, p = .025) based upon year of study, with Year 4 students reporting greater worry than Year 2 students (but no significant differences with Year 3). No significant differences in study variable scores based upon year of study were found in the ‘trait TA’ dataset.

Concurrent Validity of MCQ-30 Subscales with TA

Relationships between metacognitive beliefs and concurrent TA Worry and Emotionality dimensions are shown in Fig. 1 for the ‘trait TA’ dataset and Fig. 2 for the ‘state TA’ dataset. We included gender as a covariate of TA Worry and Emotionality in both models and year of study as a covariate of TA Worry in the state model only (not shown in Figs. 1 and 2).

Fig. 1 Structural equation modeling of the relationships between latent factors of the MCQ and dimensions of (trait) TA. Notes. Ellipses indicate latent factors, rectangles indicate observed variables. POS = ‘Positive beliefs about worry’; NEG = ‘Negative beliefs about uncontrollability and danger of worry’; CC = ‘Cognitive confidence’; NC = ‘Need to control thoughts’; CSC = ‘Cognitive self-consciousness’; TA-W = Test Anxiety Inventory – Worry; TA-E = Test Anxiety Inventory – Emotionality. Figures show standardized path coefficients. Dotted lines indicate non-significant relationships. * p < .05, ** p < .01, *** p < .001

Fig. 2 Structural equation modeling of the relationships between latent factors of the MCQ and dimensions of (state) TA. Notes. Ellipses indicate latent factors, rectangles indicate observed variables. POS = ‘Positive beliefs about worry’; NEG = ‘Negative beliefs about uncontrollability and danger of worry’; CC = ‘Cognitive confidence’; NC = ‘Need to control thoughts’; CSC = ‘Cognitive self-consciousness’; S-Worry = State (Test Anxiety) – Worry (as measured by STICSA-S Cognitive subscale); S-Emotionality = State (Test Anxiety) – Emotionality (as measured by STICSA-S Somatic subscale); STICSA-S = State-Trait Inventory for Cognitive and Somatic Anxiety – State. Figures show standardized path coefficients. Dotted lines indicate non-significant relationships. * p < .05, ** p < .01, *** p < .001

Fit indices indicate acceptable model fit for both the trait TA dataset: χ2 (1012) = 1643.41, p < .0001, CFI = .94, TLI = .93, RMSEA = .047 (90% CIs .043–.052), SRMR = .07, and the state TA dataset: χ2 (1302) = 2337.31, p < .0001, CFI = .95, TLI = .95, RMSEA = .041 (90% CIs .039–.044), SRMR = .06. As hypothesized, ‘Negative beliefs about uncontrollability and danger of worry’ had the strongest association with the Worry and Emotionality dimensions of TA in both trait and state TA. However, a slightly different pattern of relationships was observed between models. For the trait model, only ‘Negative beliefs about uncontrollability and danger of worry’ and ‘Cognitive confidence’ were significantly associated with Worry, while ‘Negative beliefs about uncontrollability and danger of worry’, ‘Positive beliefs about worry’, and ‘Cognitive confidence’ were all significantly associated with Emotionality. Neither ‘Need to control thoughts’ nor ‘Cognitive self-consciousness’ was associated with Worry or Emotionality. In the state model, ‘Negative beliefs about uncontrollability and danger of worry’, ‘Positive beliefs about worry’, and ‘Cognitive confidence’ were all significantly associated with both Worry and Emotionality. ‘Need to control thoughts’ was also significantly associated with Emotionality, but the relationship was negative. No significant relationships between ‘Cognitive self-consciousness’ and either Worry or Emotionality were found. There was a stronger relationship between the Worry and Emotionality dimensions in the trait model than in the state model.

Discussion

This study presents the first findings to support the MCQ-30 as a valid and reliable measure of maladaptive metacognitive beliefs in TA among university students. The main aims of this study were to explore the factor structure of the MCQ-30 and the concurrent associations between metacognitive beliefs and trait and state TA.

Factor analyses supported the published five-factor solution. With the exception of three items related to the ‘Need to control thoughts’ subscale, all items loaded on their respective subscales, which was reflected in good internal consistencies of subscales (alphas between .82–.88). ‘Need to control thoughts’ only had acceptable internal consistency (alphas between .70–.74), a pattern consistent with findings reported in the original examination of the MCQ-30 psychometric properties (Wells and Cartwright-Hatton 2004) and later studies (Cook et al. 2014; Fisher et al. 2016; Grøtte et al. 2016; Spada et al. 2008). Only one item (Item 3), on the ‘Cognitive self-consciousness’ subscale, cross-loaded onto another subscale (‘Negative beliefs about uncontrollability and danger of worry’), which has been observed previously (Cook et al. 2014). The item wording (“I think a lot about my thoughts”) suggests it may tap ‘meta-worry’ or worry about worry, which is associated with ‘Negative beliefs about uncontrollability and danger of worry’ (Wells 2010). Overall, the five-factor structure of the MCQ-30 is supported in TA among university students.

In tests of concurrent validity, structural equation modeling was used to examine relationships between MCQ-30 subscales and TA Worry and Emotionality dimensions. ‘Negative beliefs about uncontrollability and danger of worry’ had the strongest association with the Worry and Emotionality dimensions of TA across both datasets. ‘Cognitive confidence’ was also significantly associated with Worry and Emotionality across both datasets, while ‘Positive beliefs about worry’ was associated with both Worry and Emotionality in the state dataset but only with Emotionality in the trait dataset. This latter result was surprising, as a stronger relationship between ‘Positive beliefs about worry’ and the Worry dimension of test anxiety, as opposed to Emotionality, would be expected, given that ‘Positive beliefs about worry’ lead to selection of worry as a coping strategy (Wells and Matthews 1994; Wells and Matthews 1996). One possible explanation is that students may be more aware of using worry as a coping strategy in an examination environment than when they reflect on their TA during term time. Finally, a negative relationship was found between ‘Need to control thoughts’ and Emotionality in the state dataset. This finding is curious, as it suggests that stronger beliefs in the need to control thoughts are associated with fewer somatic symptoms. However, this may be a result of the lower internal consistency of the ‘Need to control thoughts’ factor and the lack of significant loadings for some items. It may also be the case that beliefs about controlling thoughts are not viewed negatively by students and that their perception is that being able to distract themselves from unwanted thoughts has a useful short-term effect. Overall, ‘Negative beliefs about uncontrollability and danger of worry’ had the strongest associations with TA. ‘Cognitive confidence’ beliefs may also play an important role in TA, where lack of belief in one’s memory may result in increased anxiety about one’s ability to perform well in tests.

There were several limitations to this study. The sample size for the ‘trait TA’ cohort was relatively modest and survey response bias was evident, with females overrepresented in the sample, which may reduce the reliability of estimates. With regard to the ‘state TA’ cohort, students find OSCEs more anxiety-provoking than other forms of tests (Brand and Schoonheim-Klein 2009; Guraya et al. 2018; Marshall and Jones 2003; Nicholson and Forrest 2009), so different patterns of relationships may be found in other test contexts. Differences in the demographic compositions of the two samples may also have contributed to the differing patterns of relationships in the structural equation modeling. For example, the university test experiences of those in the ‘trait TA’ sample may have varied, reflecting the different degree programs undertaken, whereas the medical students in the ‘state TA’ sample will have had virtually identical test experiences. Moreover, data were collected in different ways, with online surveys used for the ‘trait TA’ cohort and traditional pen-and-paper methods used for the ‘state TA’ cohort, which may have affected the response rates and compositions of the samples. A cross-sectional design was used, whereas prospective designs can elucidate predictive associations between metacognitive beliefs and TA. Finally, reliability of the MCQ-30 in TA was only assessed via internal consistency of subscales, whereas test-retest reliability would provide evidence of score stability.

Conclusion

The MCQ-30 is a valid self-report measure of metacognitive beliefs in both trait and state TA. Therefore, researchers can be confident using the MCQ-30, and interpreting subscale scores, when testing predictions derived from the S-REF model among university students. ‘Negative beliefs about uncontrollability and danger of worry’ had the strongest associations with both Worry and Emotionality dimensions, which suggests these beliefs play a key role in the maintenance of TA. The adaptation of MCT for TA should follow the UK Medical Research Council (MRC) guidance on the development and appraisal of complex interventions (Craig et al. 2008). The next steps are to examine the cross-sectional and prospective relationships between maladaptive metacognitive beliefs and TA, and to compare the relative explanatory utility of metacognitive beliefs against constructs (e.g., intolerance of uncertainty) identified in competing models of worry and anxiety.