Factor analyses
Data were analyzed in two complementary ways. First, following Knutson et al., we conducted EFA using the 10-event feature scales assessed in Knutson et al. (Table 1), with ratings averaged across participants. We focused on these scales due to their direct measurement of features previously implicated in MJ (e.g., harm) and to assess the reproducibility of Knutson et al.’s factor structure in this larger sample. This event feature-centered approach identifies whether underlying factors influenced feature ratings, and EFA allowed us to examine the factor structure without being confined to a specific solution (i.e., Knutson et al.’s original structure). Given that our aim was to replicate a factor structure with a smaller subset of items as a step to improve assessment feasibility and not necessarily confirm a theoretical model, EFA was considered the appropriate choice over confirmatory factor analysis (CFA), which is often theory-driven (Church & Burke, 1994; McCrae et al., 1996; Ferrando & Lorenzo, 2000).
EFAs were conducted in the same manner as Knutson et al., using principal component extraction in SPSS (20.0.0) with a Varimax rotation and Kaiser normalization (Kaiser, 1958). Oblique rotations (i.e., Promax, Direct Oblimin) produced structures similar to Knutson et al.’s results and to the Varimax results reported here (i.e., similar factor structures and acceptable Tucker Index values), as well as weak correlations between components (mean r = .105, SD = .088, range = .001–.360; see Supplementary Material – Appendix C). Analyses were conducted separately for each of the three subsets as well as with the subsets combined (i.e., 117 vignettes). For all EFAs, Tucker’s Index of Factor Congruence (Lorenzo-Seva & ten Berge, 2006) was calculated to determine the similarity of factor loadings across samples (i.e., Knutson et al. results vs. our results), across subsets (i.e., comparing subsets 1, 3, and 4), and across grouping variables (i.e., sex and political affiliation).
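As a minimal sketch of the congruence computation, Tucker's index between two loading vectors is the sum of their element-wise products divided by the square root of the product of their sums of squares. The function names and the example loadings below are hypothetical and are not taken from the reported analyses. The interpretive cut-offs follow Lorenzo-Seva and ten Berge (2006), with the assumption that a value of exactly .95 is treated as near equality.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("loading vectors must not be all zeros")
    return numerator / denominator


def describe_congruence(phi):
    """Label a coefficient using the .85 and .95 cut-offs (boundary handling assumed)."""
    if phi >= 0.95:
        return "near equality"
    if phi >= 0.85:
        return "acceptable similarity"
    return "below the acceptable range"


# Hypothetical loadings of five scales on one component in two samples.
reference = [0.80, 0.75, 0.70, -0.65, 0.10]
replication = [0.78, 0.72, 0.74, -0.60, 0.15]

phi = tucker_congruence(reference, replication)
print(round(phi, 3), describe_congruence(phi))
```

Because the coefficient depends only on the pattern of loadings and not on their overall scale, it can be applied to loadings obtained from samples of different sizes.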
Factor analyses on the 10 event feature scales in the three subsets separately, as well as together, produced a three-factor solution similar to Knutson et al.’s analyses (Table 2) and evidenced high Tucker Index values (Table 3), indicating similarity in factor loadings of these subsets relative to the Knutson et al. sample as well as similarity across the subsets themselves. Specifically, we identified a norm violation component, a social affect component, and an intention component, all with eigenvalues greater than one. Norm violation consistently accounted for the most variance and involved positive loadings from social norms, legality, benefit to others, and moral appropriateness, as well as negative loadings from harm to others. Social affect accounted for the second most variance, with positive loadings from emotional intensity, socialness, and emotional aversion. Lastly, intention accounted for the smallest amount of variance, with positive loadings from premeditation and self-benefit.
Table 3 Tucker’s Index of Factor Congruence computed between the factors from Knutson et al. (2010), the three vignette subsets individually, and the subsets combined. Tucker’s Index values range from −1 to +1. Values in the range of .85 to .94 indicate acceptable similarity between factors, whereas values over .95 indicate near equality between factors (Lorenzo-Seva & ten Berge, 2006).
Based on previous research and theory suggesting that MJ may vary according to an individual’s sex (Jaffee & Hyde, 2000) or political affiliation (Graham et al., 2009), we examined whether factor structures differed as a function of sex (male, female) or political affiliation (liberal, moderate, conservative). Religious affiliation (Graham et al., 2009; Galen, 2012) was not directly investigated because our sample was predominantly Christian (i.e., N = 284 Christianity, N = 81 Agnostic, N = 56 Atheist, N = 21 Judaism, N = 20 Islam, all remaining affiliations N < 20). As indicated by high Tucker Index values, analogous results were found across sex (Table 4) and political affiliation (Table 5; see Supplementary Material – Appendices D and E, respectively, for full factor structures and additional Tucker Index comparisons). Based on these results, the norm violation, social affect, and intention components appear to generalize well across samples, subsets, sex, and political affiliation, providing increased confidence in their ability to assess core aspects of MJ.
Table 4 Tucker’s Index of Factor Congruence between males and females for subsets 1, 3, and 4 and for the subsets combined
Table 5 Tucker’s Index of Factor Congruence between political affiliations for subsets 1, 3, and 4 and for the subsets combined
Similar factor structures and Tucker Index values were seen when conducting EFA on the 13 original event feature scales and when including our three new scales (i.e., 16-scale analysis; see Table 1 for scale identification), with results again indicating norm violation, social affect, and intention components. In addition, a fourth component, event familiarity and likelihood, was identified in both the 13-scale and 16-scale analyses (see Supplementary Material – Appendices F and G, respectively). Mirroring findings for the 10 event features, these results held across sex and political affiliation, in each of the three vignette subsets, and in the full set of 117 vignettes used here (see Supplementary Material – Appendix D for sex results and Appendix E for political affiliation results).
To further inform the utility of these subsets for use across a range of samples (e.g., those where reading ability may be a limitation; Greenberg et al., 2007), we also calculated the readability and comprehensibility of the vignettes via the Flesch-Kincaid reading ease and grade level indices (Flesch, 1948). Across all vignettes, reading ease and grade level ranged from 46.4 to 98.9 (mean = 82.82, SD = 9.97) and from 2.6 to 11.1 (mean = 5.22, SD = 1.53), respectively (specific indices for each vignette are found in the Supplementary Materials – Vignette Information). Despite the wide range in readability indices across vignettes, the subsets did not differ significantly in reading ease (F(5, 111) = 0.391, p = .854) or grade level (F(5, 111) = 0.447, p = .815), with subset 1 showing a minimum reading ease of 62 and grade level of 2.7, subset 3 a minimum reading ease of 62 and grade level of 2.6, and subset 4 a minimum reading ease of 46 and grade level of 2.9.
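For readers who wish to recompute the readability indices, the sketch below implements the standard published formulas for Flesch reading ease and Flesch-Kincaid grade level. It assumes that word, sentence, and syllable counts are supplied by the caller, since syllable counting depends on the tool used and the software used for the reported indices is not specified here. The function names and the example counts are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not one of the study vignettes).
words, sentences, syllables = 100, 10, 150

ease = flesch_reading_ease(words, sentences, syllables)
grade = flesch_kincaid_grade(words, sentences, syllables)
print(round(ease, 2), round(grade, 2))
```

Both indices are driven by the same two ratios (words per sentence and syllables per word), which is why they tend to move in opposite directions across vignettes.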
Latent profile analysis
Our second analytic approach, LPA, used the vignette as the unit of analysis (averaged across participants) and provides an empirically derived characterization of the vignettes’ content. LPA identifies unique patterns (i.e., profiles) of responding across a set of items. These profiles can be used to identify empirically derived categories of vignette content. Here, nine of the original event feature scales (i.e., all those used in the 10-event feature factor analysis, save for moral appropriateness) were used to identify groupings of similar vignettes via profiles of event feature ratings. We excluded moral appropriateness in clustering the vignettes to examine which vignette groups were rated as more or less morally appropriate. LPA was conducted in Mplus 5.0 (Muthén & Muthén, 2008) and used all 117 vignettes tested here. In line with the literature (Clark et al., 2013; Nylund et al., 2007), we fit models starting with two profiles, adding profiles until the Bayesian Information Criterion (BIC; Schwarz, 1978) achieved its first minimum and then increased. Simulation studies have shown that the model with the lowest BIC is most likely to be the correct model, and the BIC outperforms other methods of model selection (Nylund et al., 2007). In addition to the BIC, we considered the number of vignettes per profile grouping and profile interpretability to determine best fit (Clark et al., 2013).
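The BIC-based stopping rule described above can be sketched as follows. This is an illustration only: the models themselves were fit in Mplus, and the helper names (`bic`, `first_bic_minimum`) are hypothetical. In the example, only the BIC values for six and seven profiles come from the text; all other values are invented for illustration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (Schwarz, 1978); lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def first_bic_minimum(bic_by_profiles):
    """Return the profile count at which BIC first reaches a minimum.

    Walks through solutions in order of increasing profile count and stops
    at the last solution before BIC increases. If BIC never increases, the
    largest fitted solution is returned, which signals that further profiles
    would need to be fit before a minimum can be claimed.
    """
    counts = sorted(bic_by_profiles)
    for current, following in zip(counts, counts[1:]):
        if bic_by_profiles[following] > bic_by_profiles[current]:
            return current
    return counts[-1]


# Values for 6 and 7 profiles are from the text; the rest are hypothetical.
bic_values = {2: 2810.0, 3: 2745.0, 4: 2702.0, 5: 2680.0,
              6: 2668.77, 7: 2665.3, 8: 2671.0}

print(first_bic_minimum(bic_values))
```

The rule captures only the BIC criterion. Profile size, interpretability, and convergence behavior are judged separately and can lead to a different final choice than the BIC minimum alone.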
After identifying the best fitting solution, we characterized vignette groups based on profiling features and content and assigned names to each profile. We then compared the groupings on judgments of moral appropriateness. We applied three different indices to describe mean differences in ratings of moral appropriateness across vignette groupings. First, we report the point estimate and 95% confidence interval (CI) for the mean. CIs that do not overlap are generally different at p < .01, while CIs that overlap by less than half (.5) of a margin of error are generally different at p < .05 (Cumming, 2013). Second, we report Cohen’s d as a measure of effect size. Based on Cohen’s recommendations (1992), we interpreted effect sizes of 0.2 and below as "small" effects, effects near 0.5 as "medium" effects, and effects larger than 0.8 as "large" effects. Third, we report p-values from pairwise comparisons. Lastly, we examined whether ratings of moral appropriateness for groupings differed based on sex (i.e., male, female; Jaffee & Hyde, 2000) or political affiliation (i.e., liberal, moderate, conservative; Graham et al., 2009).
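The first two indices can be illustrated with a short sketch that works from summary statistics. Two assumptions are made here: Cohen's d is computed with the pooled standard deviation (the variant used in the reported analyses is not stated), and the critical t value for the CI is supplied by the caller rather than looked up. The function names and input values are hypothetical.

```python
import math


def mean_ci(mean, sd, n, t_crit):
    """Confidence interval for a mean, given a caller-supplied critical t value."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled SD (assumed variant)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Hypothetical group summaries (not values from the reported results).
low, high = mean_ci(mean=4.0, sd=1.0, n=25, t_crit=2.064)  # t critical, df = 24
d = cohens_d(5.0, 1.0, 10, 4.0, 1.0, 10)

print(round(low, 3), round(high, 3), round(d, 2))
```

With equal standard deviations in both groups, the pooled SD equals that common value, so d reduces to the mean difference in SD units.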
For the LPA, a seven-profile solution evidenced the lowest BIC (BIC = 2,665.3); however, there were convergence errors (e.g., best likelihood score not replicated), suggesting issues in interpreting this model. A six-profile solution had the next lowest BIC (BIC = 2,668.77) and produced clear and meaningful vignette groupings. Therefore, based on BIC and interpretability, we determined the six-profile solution to be the best fitting model (Figs. 1 and 2). The resultant groupings did not differ on Flesch-Kincaid indices of reading ease (F(5, 114) = 1.933, p = .149) or grade level (F(5, 114) = 2.251, p = .110).
To help interpret the six profiles identified, we plotted the means and confidence intervals for the event feature ratings used to define the profiles in two complementary ways. Figure 1 uses box plots to visualize the values for each profile for each event feature, allowing for comparisons across profiles (e.g., the Prosocial profile is rated significantly lower on harm than the Controversial Act profile, t (44) = −25.02, p < .001, d = −7.19). Figure 2 shows the results of the six-profile solution by plotting each event feature on the x-axis and separate lines for each profile. This allows for the characterization of each profile in relation to itself (e.g., the Prosocial profile is rated higher on legality than harm).
The first profile, labeled the Deception profile (n = number of vignettes, % = percentage of total vignettes administered [n/117]; n = 22, 19%), involves vignettes with low ratings on all event feature scales in absolute terms and in relation to other profiles. For example, the Deception profile was significantly lower on emotional intensity than all other profiles except for the Prosocial profile (d’s ranged from −2.09 to −4.88, all p’s < .001). The exception was benefit to self, on which the profile showed elevations. Actions in this group typically break social norms, especially through cheating, lying, or stealing, while benefiting the self. Example vignettes are, “Back in high school I kind of had an agreement with the guy sitting next to me. We would show each other our papers whenever we were taking a test. Both of us were pretty good students, we just would make sure we shared test answers if we needed to” and “I applied for a position at this company. As it turned out the pay was not very good at all for the amount of work that I was doing. So I lied and told the manager that I had another job offer and I would take it unless I got a raise.”
The Controversial Act profile of vignettes (n = 21, 18%) was characterized by high ratings on emotional intensity (e.g., significantly greater than Prosocial, Peccadillo, Illegal & Antisocial, and Deception profiles, d’s range from 1.77 to 3.94, all p’s < .001), emotional aversion, harm to others, and legality and low ratings on benefit to others and social norms. These center on actions that are mostly legal, yet generate negative emotions, likely due to the violation of social norms causing harm to others and potentially controversial behaviors in the vignettes. Example vignettes are, “I left my second marriage and I left my step-kids there too. My youngest stepson has some disabilities, but I left him there. I could not cope with his druggy, drinking father and so I decided to leave everything behind” and “One night I was having sex with my boyfriend. He said that he had a condom on but at the end I found out that he didn’t. I became pregnant and since I just had had a baby recently, I decided to have an abortion.”
The Peccadillo profile of vignettes (n = 24, 21%) was characterized by elevation on legality, slightly lower than neutral ratings on social norms, and neutral ratings on emotional intensity and unpleasantness. Actions in these vignettes are typically legal yet break some, often more minor, social norms (e.g., lies or minor sins). They are similar to vignettes in the Controversial Act profile but are much less emotionally charged (i.e., significantly lower than Controversial Act on emotional intensity, t (43) = −5.96, p < .001, d = −1.77, and emotional aversion, t (43) = −6.88, p < .001, d = −2.08), partly because there is less overt other-harm involved (i.e., significantly lower than Controversial Act profile on harm, t (43) = −8.41, p < .001, d = −2.53). Example vignettes are, “While I was in college, I was in a long distance relationship with a girl. We talked every night on the phone and really tried to make it work. Meanwhile, I was having study sessions with an attractive girl in my class and very tempted to cheat on my girlfriend” and “One night I was going out late and I didn’t want my son to know about it. I was a single mom and at that point in time he always seemed to want to act like the parent. So I snuck out of the house to go out.”
The Illegal and Antisocial profile (n = 20, 18%) contained vignettes low on legality (i.e., significantly lower than all other profiles, d’s range from −1.74 to −9.56, all p’s < .001) and social norms and neutral on other scales. This profile was similar to the Deception profile, but differed in that the acts were rated as more illegal, emotional (i.e., emotional intensity and emotional aversion), and harmful (i.e., significantly higher on harm; d’s range from 1.74 to 2.59, all p’s < .001). Vignettes in this group are the most clearly illegal and involve antisocial behavior. Example vignettes are, “As I was backing out of a parking lot I bumped a parked car and left a minor dent. I didn’t even feel the impact when I hit the car but it left a little bit of damage. I drove away without leaving a message or trying to contact the person” and “I was thirteen years old and I went into the grocery store where I lived. There was a comb that I wanted in the store, so I just took it. I didn’t really need it but I just wanted the thrill of stealing it and nobody catching me.”
The Prosocial profile (n = 25, 21%) evidenced high scores on benefit to others, legality, and social norms and low scores on emotional intensity, emotional aversion, and harm to others. Events in this group represent prosocial actions, such as being charitable and honest. Example vignettes are, “I found a wallet with a fifty-dollar bill in it. I found a phone number to call and contacted the woman whose wallet it was. She was very appreciative and came to my house to pick it up” and “I was at the pharmacy buying something and I noticed a man who was sitting outside selling trinkets. He was homeless and it was freezing out. So I went next door to a store and bought him some food and new clothes.”
Lastly, the Compassion profile (n = 5, 4%) exhibited high scores on emotional intensity, benefit to others, planning, legality, and social norms and low scores on harm to others. Like the Prosocial set, these vignettes represent prosocial acts but involve more emotional events than the former group, as evidenced by significantly higher emotional intensity (t (28) = 6.97, p < .001, d = 3.79) and emotional aversion ratings (t (28) = 6.32, p < .001, d = 2.63). Examples are, “My wife was diagnosed with cancer. I was there with her every step of the way even though it was extremely emotionally and mentally demanding. I helped her through all her appointments and emotional distress” and “I had a bad relationship with my father and had not talked to him for years. He had left my mother. When my mother died I gathered the strength to call him and tell him that she died and that I loved him.”
To help further differentiate vignette groupings, we conducted an ANOVA with grouping profile as a between-subject factor and judgments of moral appropriateness as the dependent measure. There was a significant main effect of profile, F(5, 111) = 142.51, p < .001, which highlighted their distinctness. The Illegal and Antisocial profile was rated the least morally appropriate (M = 2.33, SD = .44, 95% CI [2.18–2.48]) and significantly lower than all other profiles (d’s range from −1.66 to −6.12, all p’s < .001) aside from the Controversial Act profile (d = −.35, p = .260). The Controversial Act profile (M = 2.52, SD = .53, 95% CI [2.27–2.76]) was the next lowest and was also significantly different from all other profiles (d’s range from −2.09 to −6.16, all p’s < .001), aside from the Illegal and Antisocial profile. The Deception (M = 3.22, SD = .60, 95% CI [2.95–3.48]) and Peccadillo profiles (M = 3.64, SD = .66, 95% CI [3.35–3.92]) were both rated as slightly more morally inappropriate and were significantly different from all others (|d’s| range from .71 to 4.90, all p’s < .001). Finally, the Prosocial and Compassion profiles were rated as the most morally appropriate (Prosocial: M = 5.80, SD = .39, 95% CI [5.64–5.96]; Compassion: M = 5.83, SD = .49, 95% CI [5.21–6.45]). They did not differ from each other (d = .04, p = .927) and were rated significantly higher on appropriateness relative to all other profiles (d’s range from 4.02 to 6.47, all p’s < .001).
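As a rough consistency check on the omnibus test, a one-way ANOVA F statistic can be recomputed from group sizes, means, and standard deviations alone. The sketch below uses the rounded profile summaries reported in this section, so the result should land close to, though not exactly on, the reported value. The function name is hypothetical, and this is not the procedure used to produce the reported statistic.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) tuples; returns (F, df_between, df_within)."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n vignettes, mean moral appropriateness, SD), rounded values from the text.
profiles = [
    (22, 3.22, 0.60),  # Deception
    (21, 2.52, 0.53),  # Controversial Act
    (24, 3.64, 0.66),  # Peccadillo
    (20, 2.33, 0.44),  # Illegal and Antisocial
    (25, 5.80, 0.39),  # Prosocial
    (5, 5.83, 0.49),   # Compassion
]

f_stat, df_between, df_within = one_way_anova_from_summary(profiles)
print(round(f_stat, 1), df_between, df_within)
```

The degrees of freedom follow directly from six profiles and 117 vignettes, and any remaining gap between the recomputed and reported F reflects rounding in the published means and standard deviations.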
To examine whether sex or political affiliation moderated the relationship between grouping profile and ratings of moral appropriateness, we used multi-level modeling, nesting vignette profile within subjects. Although not all participants saw the same vignettes, each subset of vignettes contained a fair spread of vignette profiles, allowing us to examine whether between-person differences affected morality ratings for different vignette profiles. Vignette profile (i.e., Deception, Controversial Act, Peccadillo, Illegal and Antisocial, Prosocial, and Compassion) was a within-subject factor, and sex (male, female) and political affiliation (liberal, moderate, conservative) were between-subject factors in separate analyses. Although there was no significant interaction with political affiliation, F(10, 21000) = 1.59, p = .101, there was a significant interaction with sex, F(5, 26000) = 33.39, p < .001. To follow up on this interaction, we compared males and females on their ratings of moral appropriateness within each vignette profile. Males and females significantly differed in their ratings for the Deception, Controversial Act, Illegal and Antisocial, and Prosocial profiles (all p’s < .001). Females rated the Deception, Controversial Act, and Illegal and Antisocial profiles as less morally appropriate than males did (Deception: d = .34; Controversial Act: d = .56; Illegal and Antisocial: d = .87), and the Prosocial profile as more morally appropriate than males did (d = −.71 [males − females]). The Peccadillo and Compassion profiles showed no significant differences between males and females (Peccadillo: d = .14, p = .051; Compassion: d = −.04, p = .792). Thus, females, relative to males, appeared harsher in their moral judgments of negative actions and more morally approving of prosocial actions.