The following sections collectively present the methods employed and the results found, to best illustrate the sequential steps of scale development, refinement, and validation.
Procedure
Ethics approval was obtained from the Human Ethics Committees at James Cook University and the University of Southern Queensland (Numbers H7414 and H20REA042). The current studies were conducted online as anonymous surveys. The online link was placed on sites such as the primary and fellow researchers’ websites, Facebook and Twitter, and the universities’ research participation systems. Snowball recruitment (i.e., participants sharing the information sheet or web link with other potential participants) was also encouraged. Participants took an estimated 15–30 min to complete the survey. Data for the current studies were collected between June 2018 and December 2020 in three separate campaigns. Data were analyzed using SPSS and AMOS (IBM Statistics), version 25.
Item generation
The initial item pool was generated based on the 12 main themes extracted from the thematic analysis of interviews conducted with psychologists specializing in relationship therapy, reported in the 2019 study [14]. Although items were created based on these broad themes, it was not expected that all themes would be represented as separate constructs. Instead, it was expected that constructs would be an agglomeration of the specified themes, as per the 2021 study [15]. Additionally, as per Worthington and Whittaker’s [27] recommendation, the newly formulated items were submitted to expert reviewers (KMB; BB) in the field of relationships research. Both reviewers are practicing psychologists with experience in relationship counselling. Feedback from the reviewers resulted in additional items being added (three items were added to the initial pool of 57 items, resulting in a total of 60 items, with an approximately equal number of items per theme) and in the wording of some items being changed for better comprehension. Finally, reverse-worded items were included to combat response automatism, and a seven-point Likert scale, ranging from 1 (“strongly disagree”) to 7 (“strongly agree”), was used, where high scores indicated high levels of the measured dimensions. The items were presented in random order to prevent question order from affecting scores. In the survey, participants were instructed with the following message: “The following statements concern how you feel and behave in romantic relationships. We are interested in how you generally experience relationships, not just in what is happening in a current relationship. If you are not in a relationship, think back to your last relationship. Please respond to each statement by indicating how much you agree or disagree with it”. See Table 1 for a complete list of the items included in the survey.
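Reverse-worded items must be recoded before scoring so that higher values consistently indicate higher levels of the measured dimension. A minimal sketch of the transform for the 7-point scale described above (the function name is ours, not part of the published scoring materials):

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 7) -> int:
    """Recode a reverse-worded Likert response so it points in the same
    direction as the other items: 1 <-> 7, 2 <-> 6, 3 <-> 5, 4 <-> 4."""
    return scale_max + scale_min - response
```

For example, a participant who “strongly agrees” (7) with a reverse-worded item contributes a 1 to the construct score.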
Table 1 List of themes and proposed items for the Relationship Sabotage Scale

Study participation criteria
Participants for the three studies were English speaking individuals of diverse gender orientation, sexual orientation, and cultural background, with lived experience of relationship sabotage.
Study 1
Sample
A sample of 321 participants was recruited for this study. A sample size above 300 is considered acceptable for EFA [27,28,29,30], especially given that the sample item communality values were within the recommended range (0.40–0.90), with few exceptions. Participants’ ages ranged between 15 and 80 years (M = 29.60, SD = 13.42), where five participants did not disclose their age. The distribution included 98 male participants (30.5%), 222 female participants (69%) and one reported as ‘other’ (0.5%). Regarding sexual orientation, most participants reported being heterosexual (243, 76%), while 53 (17%) self-identified as bisexual, 11 (3%) self-identified as homosexual, 11 (3%) reported as ‘other’, and three (1%) elected not to answer. For those who reported as ‘other’, 11 provided descriptions for their sexuality, which included androphilic (one), asexual (three), asexual and homoromantic (one), asexual and romantic (one), bisexual (one), heteroflexible (one), pansexual (one), polysexual (one) and queer (one). Most participants (193, 60%) reported being in a relationship (i.e., committed, de facto, married), with a reported mean of 7.1 years (SD = 10.39, range 0–59) for their longest relationship duration, and a total of 99 (31%) participants reported having had an affair. In addition, a total of 78 (24%) participants reported previously seeing a psychologist or counsellor for issues regarding a romantic relationship. Participants were all English speakers, from the United States (96, 30%), Australia (53, 16.5%), and Other (172, 53.5%).
Item analysis
The data for this first study showed mild deviations from normality, with skewness values ranging from −1.69 to −1.09 and kurtosis values ranging from −1.37 to 2.62. This complies with the parameters recommended by Fabrigar et al. [28] for treating data as normally distributed (i.e., skewness < 2, kurtosis < 7). Lastly, the sample did not include missing data.
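The normality screen just described (skewness < 2, excess kurtosis < 7) can be sketched as follows; this is an illustrative pure-Python check, not the SPSS procedure the authors used:

```python
def normality_screen(xs, skew_cut=2.0, kurt_cut=7.0):
    """Return (skewness, excess kurtosis, passes_screen) for a sample,
    using the moment-based (population) estimators."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0  # excess kurtosis (a normal distribution gives 0)
    return skew, kurt, abs(skew) < skew_cut and kurt < kurt_cut
```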
Analysis of initial item pool and underlying factor structure
The aims of this initial analysis were to assess the original item pool, identify the underlying factor structure for the proposed inventory, reduce the number of items, and determine the highest loading items. As per Costello and Osborne’s [29] and Carpenter’s [30] recommendations, a maximum likelihood (ML) extraction method was applied when conducting the EFA. This extraction method is arguably the most robust choice for normally distributed data, as it provides more generalizable results and allows for the computation of goodness-of-fit measures and for testing the significance of loadings and of correlations between factors [28,29,30,31]. These are important considerations [32] for future analysis of the scale using structural equation modelling (SEM). Data factorability was examined with the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy [33] and Bartlett’s test of sphericity [34]. The KMO statistic measures whether the correlations between pairs of variables can be explained by the other variables [33]. Bartlett’s test measures whether the correlation matrix differs significantly from an identity matrix [34]. These are necessary conditions to support the existence of underlying factor structures.
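Bartlett's test statistic can be computed directly from the item correlation matrix R as χ² = −(n − 1 − (2p + 5)/6)·ln|R| with p(p − 1)/2 degrees of freedom. A self-contained sketch (the authors used SPSS; this is only illustrative):

```python
import math

def det(mat):
    """Determinant via Gaussian elimination with partial pivoting
    (adequate for small correlation matrices)."""
    a = [row[:] for row in mat]
    n = len(a)
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[p][i]) < 1e-12:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def bartlett_sphericity(corr, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test
    of sphericity on a p x p correlation matrix from n respondents."""
    p = len(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * math.log(det(corr))
    df = p * (p - 1) // 2
    return chi2, df
```

For an identity matrix (no correlations) the statistic is 0; any shared variance pushes it upward, and a significant result supports factorability.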
Factorability was established, with a KMO of 0.84, above the recommended threshold of 0.6, and a significant Bartlett’s test (χ2(1,770) = 8004.04, p < 0.001). Eigenvalues above 1, as per Kaiser’s [35] recommendation, indicated 15 factors, accounting for 62.49% of the total variance in the test. Factor 1, the strongest factor, accounted for 17.65% of the variance; all remaining factors each explained less than 10% of the total variance. Overall, the factor correlation matrix showed that factors were not highly correlated (i.e., < 0.3), which indicated the existence of unique factors. An inspection of the scree plot revealed a break after the sixth component. Next, a parallel analysis was conducted, and the results showed eight components with eigenvalues exceeding the corresponding criterion values for a randomly generated data matrix of the same size (60 variables × 321 respondents). To ensure a conservative approach at this stage, eight components were retained for further investigation.
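The retention rule in a parallel analysis is simple: keep each leading component whose observed eigenvalue exceeds the mean eigenvalue obtained from random data of the same dimensions. A sketch (the random-data means and the ninth observed value below are hypothetical; the first eight observed values are those reported for Study 1):

```python
def retain_by_parallel_analysis(observed, random_means):
    """Horn's criterion: count leading components whose observed
    eigenvalue exceeds the corresponding random-data mean eigenvalue."""
    k = 0
    for obs, rnd in zip(observed, random_means):
        if obs <= rnd:
            break
        k += 1
    return k

observed = [10.6, 4.5, 3.5, 2.9, 2.4, 2.1, 1.8, 1.7, 1.3]             # 9th value illustrative
random_means = [1.90, 1.80, 1.75, 1.70, 1.65, 1.60, 1.58, 1.55, 1.50]  # hypothetical
```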
The eight-component solution explained a total of 48.68% of the variance, with eigenvalues of 10.6, 4.5, 3.5, 2.9, 2.4, 2.1, 1.8, and 1.7, respectively. To aid the interpretation of results, a direct oblimin rotation with Kaiser normalization was performed, which allowed factors to correlate. It was assumed that factors within the construct of relationship sabotage should all correlate [30], as is often the case when measuring psychological constructs [28, 29]. The pattern and structure matrices were reviewed, and the rotated solution showed that all components included moderate to strong loadings (i.e., between 0.32 and 0.89), with the majority of items loading substantially on only one component. Further checks were applied to ensure the quality of items: items loading with coefficient values below 0.3, or loading on more than one factor with coefficient values above 0.3, were removed [27, 29, 30, 36]. This resulted in 19 items being dropped, leaving a total of 41 items.
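The item-retention rule just applied (drop items with no loading of at least 0.3, and items cross-loading at 0.3 or above on more than one factor) can be sketched as follows, with hypothetical loadings:

```python
def keep_item(loadings, cutoff=0.3):
    """Keep an item only if it loads at or above the cutoff on exactly
    one factor (no weak items, no cross-loading items)."""
    return sum(1 for l in loadings if abs(l) >= cutoff) == 1

def filter_items(pattern_matrix, cutoff=0.3):
    """Return indices of retained items from an items x factors matrix."""
    return [i for i, row in enumerate(pattern_matrix) if keep_item(row, cutoff)]
```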
Study 2
Sample
A sample of 608 participants was recruited for this study. This sample size was deemed appropriate based on specific recommendations: Bentler and Chou [37], Worthington and Whittaker [27], and Kline [32] recommended a minimum of 200 participants and a minimum ratio of 5:1 participants per estimated parameter. In the current study, the most complex model estimated 16 parameters (a ratio of 38:1). Therefore, the current sample was adequately powered to detect significant misspecifications in the models examined. Further, Browne [38] developed the asymptotic distribution free (ADF) estimator for sample sizes based on a weight matrix in the function for fitting covariance structures. This method is considered too stringent [39], and other methods, such as those mentioned above, are more often used. Nevertheless, the current study also met the sample size suggested by the ADF estimator, with 608 participants for 8 observable variables and 1 latent variable in the most complex model.
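The sample-size rules cited here (a minimum of 200 participants and at least 5 participants per free parameter) reduce to a simple check; a sketch:

```python
def sample_size_adequate(n, n_free_params, min_n=200, min_ratio=5):
    """Bentler & Chou style heuristic: an absolute minimum n plus a
    participants-per-parameter ratio."""
    ratio = n / n_free_params
    return n >= min_n and ratio >= min_ratio, ratio
```

For Study 2, 608 participants against 16 parameters gives the reported 38:1 ratio.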
Participants’ ages ranged between 17 and 80 years (M = 32.30, SD = 13.76) and five participants did not disclose their age. The distribution included 156 male participants (26%) and 452 female participants (74%). Regarding sexual orientation, the majority of participants reported being heterosexual (486, 80%), while 77 (12.5%) self-identified as bisexual, 28 (4.5%) self-identified as homosexual, 12 (2%) reported as ‘other’, and five (1%) elected not to answer. Most participants (394, 65%) reported being in a relationship (i.e., committed, de facto, married), with a reported mean of 8.6 years (SD = 10.36, range 0–61) for their longest relationship duration, and a total of 183 (30%) participants reported having had an affair. In addition, a total of 210 (34.5%) participants reported previously seeing a psychologist or counsellor for issues regarding a romantic relationship. Participants were all English speakers, from the United States (86, 14%), Australia (346, 57%), and Other (176, 29%).
Item analysis
The data set for this study showed mild deviations (skewness < 2, kurtosis < 7) and was treated as normal. The sample did not include missing data.
Final scale refinement
A two-part EFA was conducted. The first part was the scale refinement process (including factor and scale-length optimization). The second part, recommended by Henson and Roberts [40] and Worthington and Whittaker [27], was to ensure that factor and item elimination did not result in significant changes to the instrument.
The 41 items derived from the previous study were tested in the first part of Study 2. Factorability was established, with a KMO of 0.87 and a significant Bartlett’s test [34] (χ2(820) = 7,465.817, p < 0.001). Eleven factors showed eigenvalues above 1, together explaining 58.36% of the variance. An inspection of the scree plot revealed a break after the second component, and a parallel analysis showed seven components with eigenvalues exceeding the corresponding criterion values for a randomly generated data matrix of the same size (41 variables × 608 respondents). Using the results from the parallel analysis, seven components were retained for further investigation.
To ensure a stringent approach to retaining factors and items, the following five criteria were applied: (1) item coefficient values ≥ 0.32 (ensuring each item shares at least the recommended minimum of 10% of its variance with the factor), (2) inter-item correlation within factors ≥ 0.3, (3) factor reliability ≥ 0.6, (4) inter-factor correlation ≤ 0.3, and (5) number of items on each factor ≥ 4 [29, 30, 32, 36, 41, 42]. Overall, this approach ensures that constructs can be represented, supports good model identification [43], and avoids an inadmissible solution [32] prior to conducting the one-congeneric model analyses (the next step). This resulted in six items being dropped due to low coefficient values, three items dropped due to low inter-item correlation values, and four factors dropped due to an insufficient number of items and low factor reliability, leaving a total of three factors and 20 items.
As per Holmes-Smith and Rowe’s [42] recommendation, one-congeneric model analyses were fitted for each individual factor to clean each construct and ensure model fit prior to establishing the final list of items. All latent variables were scaled from 1 to 7 (from “strongly disagree” to “strongly agree”) by randomly fixing the factor loading from one of the observable variables (also called the reference variable) from each set of constructs to the value of 1. This process was used to identify and scale the model [44]. Also, alternative marker variables were examined as a means of checking for the robustness of the final models. No items were allowed to covary within constructs. The error terms (associated with observable and latent variables) were also set to the value of 1 and measurement error was assumed to be uncorrelated between items [44].
The t-rule method [43] was used to assess model identification. Model identification is assumed if the number of parameters to be estimated in a model does not exceed the number of unique variances and covariances in the sample variance–covariance matrix, calculated as k(k + 1)/2, where k is the number of observed variables. The most complex model analyzed in this study (Factor 1) had 16 free parameters and 8 observable variables; therefore, it met the t-rule requirement (i.e., 16 ≤ 36). Free parameters in the model were estimated using the ML procedure. In SEM, this practice is recommended by several researchers—e.g., Kline [32]—following the original seminal work of Jöreskog [45]. The ML approach is robust for normal, or near normal, data, as it provides close estimates of measurement error and a chi-square distribution closely related to the population of estimation.
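The t-rule comparison can be written out directly; a sketch:

```python
def t_rule(n_free_params, n_observed):
    """t-rule: a model can be identified only if its free parameters do
    not exceed the k(k+1)/2 unique variances and covariances among the
    k observed variables."""
    unique_moments = n_observed * (n_observed + 1) // 2
    return n_free_params <= unique_moments, unique_moments
```

For Factor 1, 8 observed variables give 36 unique moments, so 16 free parameters satisfy the rule.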
In this step, factor score regression weights, variance explained, and measurement error were used to assess the quality of items. Modifications were only applied to improve the model when existing literature, previous research findings, and the results from the current set of studies supported the proposed alterations. Six measures were used to assess model fit: (1) chi-square, (2) root mean square error of approximation (RMSEA), (3) goodness-of-fit index (GFI), (4) comparative fit index (CFI), (5) Tucker-Lewis index (TLI), and (6) standardized root mean square residual (SRMR). Overall, the one-congeneric model approach allows for factors of different weights within the same construct to contribute uniquely and does not assume that items are parallel (i.e., all variables carry the same weight).
Factor 1. The initial analysis for this factor, containing eight items (16, 18, 19, 22, 23, 24, 27, 28), showed a poor fit (χ2(20) = 98.824, p < 0.001; RMSEA = 0.081 [0.065, 0.097], p = 0.001; GFI = 0.959; CFI = 0.969; TLI = 0.957; SRMR = 0.031). Model specifications analysis showed high covariance associated with four items (16, 22, 24, 27). Therefore, these items were removed. The final one-congeneric model with four items (18, 19, 23, 28) showed an excellent fit (χ2(2) = 4.632, p = 0.099; RMSEA = 0.047 [0.000, 0.104], p = 0.445; GFI = 0.996; CFI = 0.998; TLI = 0.994; SRMR = 0.010). Altogether, this factor contains three items from the original defensiveness theme (items 18, 19, and 23) and one item from the original contempt theme (item 28).
Factor 2. The initial analysis for this factor, containing seven items (6, 8, 9, 37, 38, 44, 45), showed a poor fit (χ2(14) = 47.721, p < 0.001; RMSEA = 0.063 [0.044, 0.083], p = 0.124; GFI = 0.978; CFI = 0.955; TLI = 0.933; SRMR = 0.037). Model specifications analysis showed high covariance associated with three items (6, 9, 38). Therefore, these items were removed. The final one-congeneric model with four items (8, 37, 44, 45) showed an excellent fit (χ2(2) = 3.724, p = 0.155; RMSEA = 0.038 [0.000, 0.097], p = 0.540; GFI = 0.997; CFI = 0.996; TLI = 0.988; SRMR = 0.016). Altogether, this factor contains two items from the original trust difficulty theme (items 44 and 45), one item from the original partner pursue theme (item 8), and one item from the original controlling tendency theme (item 37).
Factor 3. The initial analysis for this factor, containing five items (26, 40, 41, 42, 60), showed an excellent fit (χ2(5) = 7.638, p = 0.177; RMSEA = 0.029 [0.000, 0.069], p = 0.767; GFI = 0.995; CFI = 0.993; TLI = 0.986; SRMR = 0.021). However, item 60 showed a weak regression weight (i.e., < 0.32) and therefore was dropped. The final one-congeneric model with four items (26, 40, 41, 42) also showed an excellent fit (χ2(2) = 3.873, p = 0.144; RMSEA = 0.039 [0.000, 0.098], p = 0.524; GFI = 0.997; CFI = 0.995; TLI = 0.984; SRMR = 0.017). Altogether, this factor contains three items from the original lack of relationship skills theme (items 40, 41, and 42) and one item from the original contempt theme (item 26).
These analyses resulted in a further eight items being dropped. The final EFA was performed on the remaining 12 items. Factorability was established with a KMO of 0.84, and Bartlett’s test [34] was significant (χ2(66) = 2,315.468, p < 0.001). The three-component solution explained a total of 60.3% of the variance, with eigenvalues of 4, 1.7, and 1.5, respectively. No other factor showed eigenvalues above 1. The rotated solution showed that all components included moderate to strong loadings (i.e., between 0.54 and 0.88), and the majority of items loaded substantially on only one component. Factor 1 (33.3%) was termed Defensiveness, Factor 2 (14.3%) was termed Trust Difficulty, and Factor 3 (12.7%) was termed Lack of Relationship Skills. Overall, this result demonstrated that the three-factor model is superior to the eight- and seven-factor solutions previously identified. The final inventory of 12 items and their respective loadings can be viewed in Table 2.
Table 2 Scale pattern and structure matrix with maximum likelihood extraction and oblimin rotation

Study 3
Sample
A sample of 436 participants was recruited for this study. The same specifications as in Study 2 were used to assess the appropriateness of the sample size. Participants’ ages ranged between 14 and 75 years (M = 27.41, SD = 12.37). The distribution included 128 male participants (29.5%), 302 female participants (69.5%), and six who reported as ‘other’ (1%). For those who reported as ‘other’, six provided descriptions for their gender, which included gender fluid (one), gender neutral (one), non-binary (one), queer (two), and transgender male (one). Regarding sexual orientation, most participants reported being heterosexual (336, 77%), while 74 (17%) self-identified as bisexual, 11 (2.5%) self-identified as homosexual, eight (2%) reported as ‘other’, and seven (1.5%) elected not to answer. For those who reported as ‘other’, eight provided descriptions for their sexuality, which included asexual (two), bi-curious (one), confused (one), panromantic and demisexual (one), pansexual (one), and questioning (two). Most participants (250, 57%) reported being in a relationship (i.e., committed, de facto, married), with a reported mean of 5.68 years (SD = 8.13, range 0–50) for their longest relationship duration, and a total of 93 (21%) participants reported having had an affair. In addition, a total of 101 (23%) participants reported previously seeing a psychologist or counsellor for issues regarding a romantic relationship. Participants were all English speakers from the United States (70, 16%), Australia (215, 49%), and Other (151, 35%).
Item analysis
The data set for this study showed mild deviations (skewness < 2, kurtosis < 7) and was treated as normal. Also, the sample did not include missing data.
Confirmatory factor analysis
A full multi-factor CFA was conducted with the final set of items and the same sample and specifications as the one-congeneric model analyses. The aim of this CFA was to evaluate the EFA-informed factor structure and psychometric properties and to test the fit of the global model. The three factors were represented in the full model by latent variables (fitted as a second-order g model), with each item loading on its respective latent factor, as predicted by the EFA. The factor loading of one observable variable from each construct was randomly fixed to the value of 1. Alternative marker variables were also examined as a means of checking the robustness of the final model. Items were not allowed to load on multiple factors. The three factors were allowed to covary, and measurement error was assumed to be uncorrelated between items.
All factors and items loaded significantly on their respective latent factors. Items loaded with t values between 6 and 17.2 and regression weights between 0.4 and 0.85. Items’ squared multiple correlations ranged between 0.16 and 0.72. Overall, this indicates the items were strong and reliable indicators of the latent variables [44]. The goodness-of-fit statistics demonstrated that the three-factor model had an RMSEA of 0.048 ([0.034, 0.062], p = 0.565), which is considered an excellent fit [44]. Although the chi-square value was significant (χ2(50) = 100.577, p < 0.001), this fit statistic is less important than the RMSEA when fitting a full and more complex model [44, 46]. The RMSEA takes into account the error of approximation in the population and relaxes the stringent requirement of the chi-square test that the model hold exactly in the population [44, 46]. An issue with the chi-square statistic is that the more complex the model, the larger the value and the more likely the model is to be rejected. Therefore, the normed chi-square (χ2/df) was calculated, yielding a value of approximately 2, which is acceptable. The normed chi-square takes model complexity into account and can also be viewed as an index of model parsimony [47].
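The normed chi-square reported above is simply the chi-square statistic divided by its degrees of freedom; a sketch using the reported values:

```python
def normed_chi_square(chi2, df):
    """Chi-square/df ratio; values near or below 2-3 are commonly
    treated as acceptable for complex models."""
    return chi2 / df

ratio = normed_chi_square(100.577, 50)  # values reported for the CFA
```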
Regarding incremental or comparative fit indices, the GFI and CFI values were 0.96, which is above the acceptable level. This indicates the hypothesised model accounts for variance in the data well in comparison with the null model. The TLI was 0.95, which is acceptable. This indicates the model is parsimonious. Finally, the SRMR, which is a residual statistic that assesses the residual variance unexplained by the model, showed a level of 0.052, which is also acceptable [48, 49]. Overall, the final 12-item inventory was supported by the CFA.
Final scale reliability analysis
Reliability was calculated with the measure of Cronbach’s alpha [50] and the SEM-recommended practice of coefficient H [51]. According to Hancock and Mueller [51], coefficient H provides a more robust way to assess latent measures created from observable construct indicators, such as regression coefficients, especially if items are not parallel. The Cronbach’s alpha calculation assumes that all items are parallel, which is often not the case, and is affected by the sign of the indicators’ loading. Alternatively, coefficient H is not limited by the strength and sign of items and draws information from all indicators (even from weaker variables) to reflect the construct. Further, Lord and Novick [52] proposed that if measures associated with a latent trait are congeneric, which is the case with the current measure, Cronbach’s alpha will be a lower-bound estimate of the true reliability.
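Both reliability estimates can be computed directly: Cronbach's alpha from raw item scores, and coefficient H from the standardized factor loadings. A sketch (the inputs in the usage notes are illustrative, not the study data):

```python
def cronbach_alpha(items):
    """Alpha from a list of per-item score lists (respondents aligned).
    Uses sample variances; assumes parallel (equally weighted) items."""
    k = len(items)
    n = len(items[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

def coefficient_h(loadings):
    """Hancock & Mueller's coefficient H from standardized loadings;
    unaffected by loading sign and never below the best indicator."""
    s = sum(l * l / (1 - l * l) for l in loadings)
    return 1 / (1 + 1 / s)
```

For example, four items each loading 0.7 yield H of roughly 0.79, whereas alpha for the same factor would depend on the raw score covariances.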
The standard cut-off indicators recommended by the most stringent researchers [50, 53, 54] were followed for both analyses (i.e., α ≥ 0.9 = excellent; 0.9 > α ≥ 0.8 = good; 0.8 > α ≥ 0.7 = acceptable; 0.7 > α ≥ 0.6 = questionable; 0.6 > α ≥ 0.5 = poor; 0.5 > α = not acceptable). The results showed acceptable/good reliability for the total scale (α = 0.77; H = 0.82), good reliability for Factor 1 (α = 0.85; H = 0.87), questionable reliability for Factor 2 (α = 0.60; H = 0.62), and acceptable reliability for Factor 3 (α = 0.75; H = 0.77). As all sub-scales contain less than ten items, which can affect the reliability value, the mean inter-item correlation value was also inspected. The mean inter-item correlation value for all sub-factors showed a strong relationship between items (i.e., ≥ 0.3).
Scale construct validity
Traditional approaches to assessing construct validity (i.e., the multi-trait–multi-method [MTMM] matrix approach) rely on the assumption that the construct’s variables are parallel. Assessing validity with a correlation matrix alone is therefore limited, as it does not account for the effect of variables with different regression weights and measurement errors. To remedy this limitation, SEM-based approaches to construct validity were also performed. SEM-based approaches recognize that constructs are affected differently by their indicators and allow constructs to correlate freely among themselves. Further, these approaches assess how well each construct fits within the model with regard to variance explained and measurement error [55].
Convergent and Discriminant Validity (MTMM Matrix Approach). Convergent and discriminant validity were assessed using the MTMM matrix, which assesses construct validity by comparing the correlation matrix between the proposed constructs and constructs measured by different scales, which are either conceptually similar or dissimilar [56]. The three factors were compared with three measures—the Experiences in Close Relationships Scale Short-Form (ECR-SF) [57], used to assess adult insecure attachment styles (i.e., anxious and avoidant attachment); the Perceived Relationship Quality Components Inventory Short-Form (PRQCI-SF) [58], used to assess perceived relationship quality with six components: (1) satisfaction, (2) commitment, (3) intimacy, (4) trust, (5) passion, and (6) love; and the Self-Handicapping Scale Short Form (SHS-SF) [59], used to assess self-handicapping in the educational and sport contexts with mainly physical barriers employed to explicitly hinder performance driven activities. The ECR-SF and PRQCI-SF were used to assess convergent validity and the SHS-SF was used to assess divergent validity.
The sub-factors for the ECR-SF, PRQCI-SF, and SHS-SF were created as per the scales’ manuals, by adding raw scores. The three factors for the scale in development were created as composite variables, using the factor score regression weights obtained from the one-factor congeneric measurement models fitted as part of the CFA, as recommended by Jöreskog and Sörbom [60]. This approach differs from adding raw scores to represent subscales, which assumes that the items are parallel; weighted composite variables best represent each variable’s unique contribution. Further, weighted composite variables are continuous, as opposed to Likert scale scores, which are ordinal. For the purpose of creating the weighted composite variables, the factor score regression weights were rescaled to add up to a total of 1.
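The construction of a weighted composite score, with regression weights rescaled to sum to 1, can be sketched as follows (the weights in the usage note are hypothetical):

```python
def weighted_composite(item_scores, factor_score_weights):
    """Composite variable from item scores and factor score regression
    weights rescaled to sum to 1 (contrast with summing raw scores)."""
    total = sum(factor_score_weights)
    return sum(s * (w / total) for s, w in zip(item_scores, factor_score_weights))
```

A respondent answering 4 on every item of a factor receives a composite of 4 regardless of the weights, while differing answers are pulled toward the strongest indicators.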
Regarding convergent validity, Factor 1 (Defensiveness) showed significant positive correlations (p < 0.01) with anxious attachment (r = 0.348) and avoidant attachment (r = 0.435), and a significant negative correlation with perceived relationship quality (r = −0.371). Factor 2 (Trust Difficulty) showed significant positive correlations (p < 0.01) with anxious attachment (r = 0.508) and avoidant attachment (r = 0.197). Factor 3 (Lack of Relationship Skills) showed a significant positive correlation (p < 0.01) with avoidant attachment (r = 0.473) and a significant negative correlation with perceived relationship quality (r = −0.406). Regarding divergent validity, all three factors showed near-zero positive relationships with self-handicapping (r ranging between 0.033 and 0.082). See Table 3 below.
Table 3 Correlation matrix to measure construct validity

Convergent Validity (SEM-based Approaches). According to Bagozzi et al. [55], if all item loadings are statistically significant, meaning that the relationship between an observed variable and its latent construct differs from zero, convergent validity can be assumed. Further, Holmes-Smith and Rowe [42] recommended a threshold value of 0.5 for the standardized loading (with a significant t-statistic) to achieve convergent validity. Standardized item loadings were between 0.4 and 0.87 (with significant t-statistics), with all items above 0.5 except item 37 (0.43) and item 45 (0.4). Additionally, Hair [61] proposed an all-encompassing and more stringent set of criteria for convergent validity, which requires, in addition to standardized factor loadings greater than 0.5 for all items, that the average variance extracted (AVE) for each construct is greater than 0.5 and that each construct’s composite reliability (CR) is greater than 0.7. This set of criteria is in agreement with Fornell and Larcker’s [62, 63] work. All factor AVE values were above 0.5, ranging from 0.72 to 1. Further, all factor CR values were above 0.7, except for Factor 2 (0.61), with a range of 0.61–0.84. These results fully supported convergent validity for Factors 1 and 3 and partially supported convergent validity for Factor 2. See Table 4 below for AVE and CR estimates.
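The AVE and CR criteria are simple functions of the standardized loadings; a sketch (the loadings in the test values are hypothetical):

```python
def average_variance_extracted(loadings):
    """AVE: mean squared standardized loading; > 0.5 means the construct
    explains more indicator variance than error does."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR (Fornell & Larcker): squared sum of loadings over itself plus
    the summed error variances; > 0.7 is the usual criterion."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return s * s / (s * s + error)
```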
Table 4 AVE and CR estimates for the relationship sabotage scale factors

Discriminant Validity (SEM-based Approaches). The criterion adopted by Kline [32] was considered for the discriminant validity analyses, which stipulates that validity can be assumed if the correlation between two factors is less than 0.85. This was further supported by Cheung and Wang [64], who recommended that the correlation not be significantly greater than 0.7. However, this approach is often criticized for its reliance on the correlation matrix, which does not consider variance explained and measurement error [55]. Therefore, two additional approaches were considered.
Discriminant validity was first assessed using Fornell and Larcker’s [62, 63] approach in a multi-trait–mono-method context, using the AVE and the inter-correlations between factors. This method showed that all pairs of constructs were distinct, thereby supporting discriminant validity (i.e., AVE > squared factor inter-correlation or, equivalently, square-rooted AVE > factor inter-correlation; refer back to Table 4). Further, discriminant validity was assessed using the Bagozzi et al. [55] nested model method. This procedure involves comparing, for each pair of constructs, an unconstrained model with a constrained model in which the correlation between the two constructs is set to 1. The conclusion is based on a chi-square difference test between the models: if constraining the correlation between the two constructs significantly worsens the model fit, the constructs are discriminant. The nested model approach was performed between the factors, and the results confirmed that the three factors are distinct constructs. Additionally, this approach has gained favor as a technique for comparing alternative models [27]. The results from this test fully supported discriminant validity (see Table 5).
Table 5 Nested model approach to discriminant validity in the relationship sabotage scale
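The two discriminant validity checks reduce to a pair of comparisons: Fornell–Larcker (√AVE versus the inter-factor correlation) and the nested-model chi-square difference test (df = 1, critical value 3.84 at p = .05). A sketch with hypothetical values:

```python
import math

def fornell_larcker_ok(ave_a, ave_b, corr_ab):
    """Discriminant validity holds if each construct's sqrt(AVE)
    exceeds the absolute inter-construct correlation."""
    return math.sqrt(ave_a) > abs(corr_ab) and math.sqrt(ave_b) > abs(corr_ab)

def nested_model_discriminant(chi2_free, chi2_constrained, critical=3.84):
    """Chi-square difference test (df = 1): fixing the factor
    correlation to 1 should significantly worsen fit if the two
    constructs are distinct."""
    return (chi2_constrained - chi2_free) > critical
```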