Introduction

The Response Style Theory of depression (Nolen-Hoeksema, 1991) proposes rumination as one of the main factors associated with the duration and exacerbation of depression. Rumination is considered a way of responding to depressive symptoms that involves repetitively and passively self-focusing on one’s depressed mood and on the possible causes and consequences of this negative mood (Butler & Nolen-Hoeksema, 1994). However, advances in research have yielded some relevant changes in the conceptualization of rumination. First, there is evidence that rumination is involved not only in the duration of depression but also in its onset (Nolen-Hoeksema et al., 2008). In addition, rumination can lead to several detrimental health outcomes, including major depression, social and generalized anxiety, substance abuse, and eating disorders, thereby acting as a transdiagnostic psychological factor (Aldao et al., 2010; Ehring & Watkins, 2008; Nolen-Hoeksema & Watkins, 2011). In parallel to this conceptual evolution, the assessment of rumination has also evolved from the use of specific instruments of depressive rumination to more general questionnaires of a broader ruminative thinking style.

One of the most widely used rumination scales is the Ruminative Response Scale (RRS), which includes 22 items that assess repetitive thinking about the causes, consequences, and symptoms of current negative affect (i.e., feeling down, sad, or depressed; Nolen-Hoeksema, 1991). An important criticism of this scale was the large number of items that overlap with depression symptomatology, which led to the refinement of the questionnaire into a shorter 10-item version without items of depressive content (Treynor et al., 2003). However, despite the improvements of this short scale, some authors expressed concerns over the RRS because its content still focused on negative mood. Its instructional set was also considered problematic (i.e., instructions asked participants to rate themselves in terms of “…when you feel down, sad, or depressed”) because it restricts the assessment of rumination to the current depressed mood and thus complicates research on rumination in situations where negative mood is not necessarily present, or in other psychopathological conditions, such as anxiety (Brinker & Dozois, 2009).

To overcome these issues, Brinker and Dozois (2009) created a new questionnaire to assess rumination that is less tied to negative affect (particularly depression), the Ruminative Thought Style Questionnaire (RTSQ). With 20 items, the authors assessed four central characteristics of rumination (repetitive, recurrent, uncontrollable, and intrusive thoughts) within a unidimensional measure. They also included past, present, and future temporal orientations and three types of thought valence (neutral, negative, and positive). In order to identify more specific subcomponents of rumination, Tanner et al. (2013) selected 15 items of the RTSQ that assessed ruminative thinking across four distinct facets: 1) problem-focused thoughts (thoughts focused on symptoms, causes, and consequences of problems), 2) counterfactual thinking (thoughts focused on imagining alternative outcomes or realities), 3) repetitive thoughts (intrusiveness, persistence, and automaticity of thoughts), and 4) anticipatory thoughts (future-oriented ruminative thoughts). Overall, these four factors appear to reflect some ideas of the traditional conceptualizations of rumination: the problem-focused thoughts and repetitive thoughts subfacets would be congruent with initial conceptualizations of rumination (e.g., Nolen-Hoeksema, 1991; Conway et al., 2000), whereas anticipatory thoughts would be more related to the protective effects of rumination (Tanner et al., 2013).

Despite the general agreement in identifying these four components at the core of the RTSQ, there are some discrepancies in describing the structure of the questionnaire, with some authors opting for a four-factor correlated model (Bravo et al., 2018; Dzhambov et al., 2019; Tanner et al., 2013), and others showing that a second-order factor structure, in which a higher-order general factor of rumination overarches the four factors, has better fit to the data (Helmig et al., 2016; Tanner et al., 2013). Thus, discrepancies across studies suggest that additional research is needed to better describe the structure of the 15-item version of the RTSQ.

The RTSQ has been employed to assess rumination in different populations [clinical vs non-clinical (Helmig et al., 2016); undergraduates (Bravo et al., 2018; Brinker & Dozois, 2009; Dzhambov et al., 2019; Mihić et al., 2019), general population (Karatepe et al., 2013), and adolescents (Tanner et al., 2013)]. Furthermore, the RTSQ has been adapted to different languages such as Spanish (Bravo et al., 2018), Serbian (Mihić et al., 2019), Bulgarian (Dzhambov et al., 2019), German (Helmig et al., 2016), and Turkish (Karatepe et al., 2013). Despite its use in different populations and languages, only a few studies have explored the measurement invariance of the RTSQ across countries and gender groups. In this regard, Bravo et al. (2018) found that the 4-factor correlated model, using the 15-item version of the RTSQ, was invariant across males and females, and also among undergraduates from the U.S., Argentina, and Spain. However, to our knowledge, no previous study has explored the measurement invariance of a hierarchical model of the 15-item RTSQ across countries and gender groups. This is especially relevant, considering that most studies use a global factor of rumination (e.g., McCarrick et al., 2021; Olatunji et al., 2013), and some studies that compare rumination across men and women and across countries use the total score of the 15-item RTSQ (e.g., Mezquita et al., 2019).

The Present Study

Overall, although the second-order model presents advantages compared with the four-factor correlated structure (i.e., a general factor of rumination is considered), there is no evidence regarding the invariance of the higher-order model of the 15-item RTSQ across different populations and gender groups. Thus, we tested the structure of the 15-item RTSQ (i.e., four-factor correlated model vs a second-order factor model) and the measurement invariance of the final model across four countries (U.S., Spain, Argentina, and the Netherlands) and gender (male and female). This has relevant implications. Namely, if the measurement invariance of the hierarchical structure across countries and gender groups is demonstrated, the total scale and subscale mean scores can be compared between groups. Providing evidence of measurement invariance across time is also a necessary step before comparing scores (total scale and subscales) of the 15-item RTSQ in follow-ups or longitudinal studies. Thus, we examined the longitudinal measurement invariance of the resulting model across three assessment waves in a subsample of Spanish youths. Based on previous studies, we expected to find evidence to support the use of a global factor of rumination using the RTSQ in addition to the four distinct factors (i.e., repetitive thoughts, problem-focused thoughts, counterfactual thoughts, and anticipatory thoughts) across countries and gender groups (i.e., multi-group invariance). We also expected that the RTSQ would show longitudinal measurement invariance in emerging adulthood in Spain.

Method

Participants and Procedure

Participants were college students (total n = 3,482) from the U.S., Spain, Argentina, and the Netherlands, who participated in an online cross-national survey study regarding personal mental health, personality traits, and substance use behaviors (see Bravo et al., 2019, for a detailed description of the samples and procedures). In addition, the participants of the Spanish sample also completed two follow-ups, six months (Wave 2) and 12 months (Wave 3) after the first assessment. Only data from students who completed the Ruminative Thought Style Questionnaire (RTSQ) were included in the analyses (see Table 1). Overall, females were overrepresented (U.S. sites, 67.1%; Spain, Time 1 = 63.9%, Time 2 = 71.6%, Time 3 = 60.6%; Argentina 65.6%; the Netherlands 74.8%), and the mean age was 20.87 years (SD = 4.47). Across countries, the mean age ranged from 20.05 years (U.S. sites) to 24.26 years (Argentina) (see Table 1).

Table 1 Descriptive statistics across study groups

Measures

Rumination

Rumination was assessed using the 15-item version of the Ruminative Thought Style Questionnaire (RTSQ; Tanner et al., 2013), measured on a 7-point scale from 1 (Not at all) to 7 (Very Well). The RTSQ has shown evidence of its validity across gender and among undergraduates from Spain (Bravo et al., 2018).

Data Analysis

Confirmatory Factor Analyses (CFAs) of the hierarchical model and the four-factor correlated model were performed in the whole sample, which comprised participants from the four countries (Time 1). We examined each model’s goodness-of-fit using the comparative fit index (CFI), the Tucker–Lewis Index (TLI), and the root mean square error of approximation (RMSEA). According to commonly employed cut-off values, CFI and TLI values > 0.90 and > 0.95 indicate an acceptable and optimal fit, respectively (Marsh et al., 2009). RMSEA values of ≤ 0.10 (Weston & Gore, 2006) and ≤ 0.06 (Hu & Bentler, 1999) indicate an acceptable and optimal fit, respectively. Once the final model for the whole sample was selected, Multigroup Measurement Invariance (MMI) analysis of the selected model was performed across countries and gender groups. Beforehand, separate CFAs were performed for each of the four countries and for men and women. The MMI of the hierarchical model across groups was tested following the steps suggested by Rudnev et al. (2018): (1) configural (test whether all items load on the proposed factor), (2) metric first-order factors (test whether item-factor loadings are similar across groups), (3) metric first- and second-order factors, (4) scalar first-order factors (test whether the unstandardized item intercepts are similar across groups), and (5) scalar first- and second-order factors. A similar procedure was followed to test the Longitudinal Measurement Invariance (LMI) of the measures across three waves in the Spanish sample (Times 1, 2, and 3). Before running the LMI analysis of the second-order factor structure, we examined the structure at each wave using CFAs. To test the LMI of the second-order model, we examined four distinct levels: (1) configural, (2) metric of the first-order factors, (3) metric of the second-order factor, and (4) scalar of the first-order factors. Note that scalar invariance was tested only for the first-order factors because the second-order latent means of the factors were set to 0 to identify the model (Chen et al., 2005; Dimitrov, 2010; Meredith, 1993). To indicate a significant decrement in fit when testing for measurement invariance (i.e., MMI and LMI), we used model comparison criteria of ΔCFI/ΔTLI ≥ 0.010 (i.e., a decrease indicates worse fit; Cheung & Rensvold, 2002) and ΔRMSEA ≥ 0.015 (i.e., an increase indicates worse fit; Chen, 2007). For each model, we used a Maximum Likelihood estimator.
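The cut-offs and model comparison criteria described above can be expressed as a small decision rule. The following Python sketch is illustrative only and is not part of the analyses, which were run in Mplus. The class and function names (FitIndices, classify_index, invariance_holds) are hypothetical, the fit values are invented for the example, and differences are rounded to three decimals on the assumption that fit indices are reported at that precision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitIndices:
    """Fit indices of one fitted model (hypothetical container, not Mplus output)."""
    cfi: float
    tli: float
    rmsea: float


def classify_index(name: str, value: float) -> str:
    """Label one index as 'optimal', 'acceptable', or 'poor' using the cut-offs in the text."""
    if name in ("cfi", "tli"):
        if value > 0.95:
            return "optimal"
        return "acceptable" if value > 0.90 else "poor"
    if name == "rmsea":
        if value <= 0.06:
            return "optimal"
        return "acceptable" if value <= 0.10 else "poor"
    raise ValueError(f"unknown index: {name}")


def invariance_holds(less_constrained: FitIndices,
                     more_constrained: FitIndices,
                     max_drop: float = 0.010,
                     max_rmsea_rise: float = 0.015) -> bool:
    """Return True when the added constraints do not worsen fit beyond the criteria.

    A decrease in CFI or TLI of >= 0.010, or an increase in RMSEA of >= 0.015,
    is treated as a significant decrement in fit.
    """
    cfi_drop = round(less_constrained.cfi - more_constrained.cfi, 3)
    tli_drop = round(less_constrained.tli - more_constrained.tli, 3)
    rmsea_rise = round(more_constrained.rmsea - less_constrained.rmsea, 3)
    return cfi_drop < max_drop and tli_drop < max_drop and rmsea_rise < max_rmsea_rise


# Invented values for two nested models (configural vs. metric).
configural = FitIndices(cfi=0.960, tli=0.951, rmsea=0.062)
metric = FitIndices(cfi=0.955, tli=0.949, rmsea=0.063)
print(classify_index("rmsea", configural.rmsea))  # acceptable
print(invariance_holds(configural, metric))       # True
```

In practice, each pair of adjacent models in the invariance sequence (e.g., configural vs. metric) would be compared in this way.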

Mean comparisons across groups (i.e., countries and gender) and across time were also examined. Specifically, one-way ANCOVA (for rumination global scores) and MANCOVA (for each subfactor score) analyses were performed for country groups (controlling for age and gender effects), and also for gender groups (controlling for the effect of age). To test mean differences across the three waves in the Spanish sample, a repeated measures ANCOVA (for rumination global scores), and MANCOVA (for each subfactor score) were performed, controlling for age and gender effects.

All the structural equation models were performed using Mplus 8.4, while descriptive analyses, Cronbach’s alpha (Cronbach, 1951) and mean comparisons were performed using SPSS v.25. Effect sizes were calculated employing Cohen’s d (Cohen, 1992) using the following online calculator: https://www.easycalculation.com/es/statistics/effect-size.php.
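For readers who prefer to compute effect sizes directly, Cohen’s d can be sketched in a few lines of Python. The formula used by the online calculator is not specified here, so the version below, based on the sample-size-weighted pooled standard deviation, is an assumption; the function name cohens_d and the input values are hypothetical.

```python
import math


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d for two independent groups using the pooled standard deviation.

    Assumption: the pooled variance weights each group's variance by its
    degrees of freedom (n - 1); other calculators may use a simple average.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Invented group statistics, for illustration only.
d = cohens_d(mean1=4.0, sd1=1.0, n1=50, mean2=3.5, sd2=1.0, n2=50)
print(round(d, 2))  # 0.5
```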

Results

Confirmatory Factor Analysis

Acceptable to optimal fit indices were observed for the baseline four-factor correlated model (CFI = 0.962; TLI = 0.952; RMSEA = 0.061) and the second-order factor model (CFI = 0.960; TLI = 0.951; RMSEA = 0.062). Factor loadings were all significant (p < 0.001) and salient (i.e., equal to or higher than 0.673; see Fig. 1). Considering the equivalence of both models in terms of fit indices, and also the practical and theoretical advantages of the second-order factor model over the four-factor correlated model, the subsequent invariance analyses were performed with the second-order factor model as the baseline model.

Fig. 1 Factor structure of the two competing models in the total sample. Note: Single-arrow lines indicate factor loadings, while double-arrow lines indicate correlations

Measurement Invariance Across Countries and Gender Groups

Results of the multi-group measurement invariance analysis across countries and gender groups are summarized in Table 2. Prior to carrying out the multi-group analysis, we confirmed the adequacy of the hierarchical structure in each country and gender group separately. For all countries, acceptable to optimal fit indices were observed, except for the Netherlands. In this subsample, although the CFI was acceptable, the TLI was lower than the standard cut-off of 0.90 and the RMSEA was higher than the standard cut-off of 0.10. For gender groups, optimal fit indices were observed in both groups (Table 2).

Table 2 Goodness-of-fit for the hierarchical structure of the Ruminative Thought Style Questionnaire across countries

When we tested the configural invariance (MG.1) of the hierarchical model across countries, we found acceptable to optimal fit indices (MG.1, Table 2). Metric invariance (i.e., of the first-order factors, MG.2, and the second-order factor, MG.3) and scalar invariance (i.e., of the first-order factors, MG.4, and the second-order factor, MG.5) across countries were also supported, as changes in CFI and TLI were lower than 0.010 and changes in RMSEA were lower than 0.015 (Table 2). Similar results were found when invariance was tested across gender groups (see Table 2, models MG.1b to MG.5b).

Measurement Invariance Across Time

Results of the longitudinal measurement invariance analysis of the hierarchical model in the Spanish sample are summarized in Table 3. The CFAs of the hierarchical model at each wave separately, and also when the waves were specified in the same model (i.e., configural invariance; ML.1), showed acceptable to optimal fit indices. When the item factor loadings (ML.2), the loadings of the first-order factors on the second-order factor (ML.3), and the intercepts of the first-order factors (ML.4) were constrained to be equal across waves, changes in the CFI and TLI (i.e., < 0.01) and RMSEA (i.e., < 0.06) suggested longitudinal metric and scalar invariance.

Table 3 Longitudinal Measurement Invariance of the Ruminative Thought Style Questionnaire in Spanish youths

Reliability Coefficients

The Cronbach’s alphas in the whole sample and by country and gender group were adequate (see Table 1). The alpha was lower for the Anticipatory Thoughts subscale in the Netherlands (α = 0.67), which could nevertheless be considered acceptable because the subscale is composed of only two items (Loewenthal, 1996). When the internal consistency of the scales was explored in the Spanish subsample across time, we found acceptable to adequate internal consistency indices, although they were lower for the Anticipatory Thoughts subscale in Waves 2 and 3.
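To make the reliability coefficient concrete, Cronbach’s alpha can be computed from item scores as k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The sketch below is a minimal illustration under that standard formula; the function name cronbach_alpha and the toy responses are hypothetical and are not data from this study, in which alphas were obtained with SPSS.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for k items.

    `items` holds one list of scores per item; position i in every list
    belongs to the same respondent. Sample variances (n - 1) are used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n_respondents = len(items[0])
    if any(len(item) != n_respondents for item in items):
        raise ValueError("all items must have the same number of respondents")
    # Total score of each respondent across the k items.
    totals = [sum(item[i] for item in items) for i in range(n_respondents)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy two-item subscale with four respondents (invented scores).
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(round(alpha, 2))  # 0.75
```

With only two items, as in the Anticipatory Thoughts subscale, alpha depends entirely on the correlation between the two items, which is one reason lower values are commonly tolerated for very short subscales.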

Mean Comparisons

MANCOVA analyses showed statistically significant differences between countries [F (12, 8416) = 16.268, p < 0.001, Wilks' Λ = 0.941, partial η² = 0.020] and gender groups [F (4, 3184) = 10.182, p < 0.001, Wilks' Λ = 0.987, partial η² = 0.013] on Repetitive Thoughts, Counterfactual Thoughts, Problem-focused Thoughts, and Anticipatory Thoughts. ANCOVA analyses also showed statistically significant differences between countries [F (3, 3184) = 22.289, p < 0.001, partial η² = 0.021] and gender groups [F (1, 3187) = 21.882, p < 0.001, partial η² = 0.007] on Global Rumination scores. However, the differences were small, as Cohen’s d values were all lower than 0.29 (see Supplemental Table 1). Moreover, repeated measures analyses showed non-significant differences across time on Global Rumination scores [F (2, 544) = 0.306, p = 0.737, partial η² = 0.001], Repetitive Thoughts [F (2, 544) = 0.279, p = 0.757, partial η² = 0.001], Counterfactual Thoughts [F (2, 544) = 0.484, p = 0.617, partial η² = 0.002], Problem-focused Thoughts [F (2, 544) = 0.124, p = 0.883, partial η² = 0.000], and Anticipatory Thoughts [F (2, 544) = 1.009, p = 0.365, partial η² = 0.004] in the Spanish sample.
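As an illustration of how the univariate effect sizes above relate to the test statistics, partial eta squared can be recovered from an F value and its degrees of freedom as (F × df1) / (F × df1 + df2), which is algebraically equivalent to SS_effect / (SS_effect + SS_error). The Python sketch below applies this to the ANCOVA and repeated measures results reported in the text; the function name partial_eta_squared is hypothetical, and the Wilks-based multivariate effect sizes follow a different formula that is not covered here.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared of a univariate F test from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) taken from the univariate tests reported above.
reported_tests = [
    ("Global Rumination, countries", 22.289, 3, 3184),
    ("Global Rumination, gender", 21.882, 1, 3187),
    ("Global Rumination, time", 0.306, 2, 544),
    ("Anticipatory Thoughts, time", 1.009, 2, 544),
]

for label, f_value, df_effect, df_error in reported_tests:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.3f}")
```

Rounded to three decimals, these values (0.021, 0.007, 0.001, and 0.004) agree with the effect sizes reported for the corresponding tests.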

Discussion

The present study aimed to examine and extend the evidence concerning the structural validity of the 15-item Ruminative Thought Style Questionnaire (RTSQ), and provide evidence of the measurement invariance of the resulting model across countries, gender groups, and time.

The results of the CFA in the whole sample showed acceptable to optimal fit indices for the 4-factor correlated model (Bravo et al., 2018; Dzhambov et al., 2019; Tanner et al., 2013) and the hierarchical model (Helmig et al., 2016; Tanner et al., 2013), as in previous studies. Because the fit indices of both models were similar, and considering the practical and theoretical implications of incorporating a general factor of rumination in addition to the four subfacets (i.e., Repetitive Thoughts, Counterfactual Thoughts, Problem-focused Thoughts, and Anticipatory Thoughts), the hierarchical model was selected as the baseline model for the subsequent measurement invariance testing. This is an important issue, as rumination is usually operationalized with a global score in the literature (e.g., McCarrick et al., 2021; Olatunji et al., 2013). However, recent studies have also pointed out the differential associations between subfactors and distinct psychological problems (for a review see Bravo et al., 2018), therefore highlighting an important target for interventions. Thus, using a second-order factor structure for the RTSQ can incorporate the advantages of both ways of conceptualizing rumination (i.e., global factor and four correlated factors), from a broader perspective to a more content-specific assessment of rumination.

Multi-group measurement invariance (MMI) analysis showed that the hierarchical structure was invariant across the four countries (i.e., the U.S., Spain, Argentina, and the Netherlands) and gender groups, thereby conferring validity to the comparison of the scores obtained through the RTSQ in different countries and between men and women. Likewise, we evaluated the temporal invariance of the RTSQ in a Spanish subsample. The results of the Longitudinal Measurement Invariance (LMI) analysis indicated configural, metric, and scalar invariance of the hierarchical structure of the RTSQ across the three assessment waves, suggesting that the RTSQ is a sound measure to assess and follow up rumination levels across time, at least among Spanish undergraduates.

The results also provide reliability evidence for the total score and the scores of each RTSQ subscale, as the alpha indices range from adequate to excellent in each country and gender group. The only low alpha coefficients (i.e., < 0.60) were found in the second and third assessments of the Anticipatory subscale in the Spanish subsample. Considering that alpha at Time 1 was 0.78, the decrement may be associated with sample attrition.

Moreover, the confirmation of the measurement invariance of the hierarchical structure of the RTSQ allowed us to compare the mean scores across groups and time. Although some significant differences were observed between countries (Bravo et al., 2018) and gender groups (women scoring higher than men; see Johnson & Whisman, 2013), as in previous studies, the differences were low in magnitude (as suggested by the η² and Cohen’s d indices). Moreover, when we tested the mean differences across time, non-significant differences were found, supporting the conceptualization of rumination as a stable individual trait (Nolen-Hoeksema, 1991).

Thus, the results of the present study suggest that the 15-item RTSQ may be a useful tool to assess rumination and its subfacets in youths from different populations and across time. This is especially important in prevention and clinical settings, as rumination has been related to depression (e.g., Olatunji et al., 2013) and other psychological problems (Nolen-Hoeksema & Watkins, 2011). Nonetheless, this research is not without limitations. First, there was an over-representation of women in all four countries. Second, the sample was composed exclusively of university students from the U.S., Argentina, Spain, and the Netherlands, so the findings cannot be extrapolated to other populations (e.g., clinical, elderly, children, or adolescents, among others) or countries. Therefore, future studies are necessary to replicate our findings in other types of populations. Third, the attrition across waves was notable in the Spanish subsample. Therefore, the results obtained by the LMI analyses must be replicated with a larger sample size.

Overall, the present study contributes to the growing literature examining the structural validity of the 15-item version of the Ruminative Thought Style Questionnaire (RTSQ). The results have relevant implications for the understanding of the concept of rumination, as they support the existence of four different subcomponents of rumination (i.e., Repetitive Thoughts, Counterfactual Thoughts, Problem-focused Thoughts, and Anticipatory Thoughts) in addition to a general tendency toward ruminative thinking. Finally, the measurement invariance results suggest that the RTSQ could be a useful tool to compare the global and specific scores in cross-national and gender-focused research and also in longitudinal and follow-up studies.