Different judgment frameworks for moral compliance and moral violation

Shirai, Risako; Watanabe, Katsumi

doi:10.1038/s41598-024-66862-9

Article
Open access
Published: 16 July 2024

Volume 14, article number 16432, (2024)

Scientific Reports

Risako Shirai^1,2 &
Katsumi Watanabe¹

Abstract

In recent decades, the field of moral psychology has focused on moral judgments based on a set of moral foundations/categories (e.g., harm/care, fairness/reciprocity, ingroup/loyalty, authority/respect, and purity/sanctity). When discussing the moral categories, however, whether a person is judging moral compliance or moral violation has rarely been considered. We examined the extent to which moral judgments influence each other across moral categories and explored whether the frameworks of judgment for moral violation and moral compliance differ. For this purpose, we developed a set of episodes describing moral and affective behaviors. For each episode, participants evaluated valence, arousal, morality, and the degree of relevance to each of Haidt's five moral foundations. A cluster analysis showed that the moral compliance episodes were divided into three clusters, whereas the moral violation episodes were divided into two clusters. An additional experiment indicated that the clusters might not be stable over time. These findings suggest that people have different frameworks of judgment for moral compliance and moral violation.

Introduction

We routinely see conflicts of opinion between individuals or groups on social networking sites, on television, and in newspapers (e.g., over wearing a mask on a train, gender-based unfairness in dress codes, and movements concerning induced abortion). Such individual differences in moral judgments sometimes escalate into discrimination or more violent conflict.

To understand the mechanisms of moral judgments, researchers in moral psychology have studied the principles of human morality. Recently, models positing innate moral principles have gained attention^1,2. For example, the Universal Moral Grammar model holds that humans have innate moral faculties, analogous to Chomsky's idea³ about linguistics^1,2. Further, the Moral Foundations Theory (MFT;^4,5) explains where specific moral intuitions come from and includes the sociohistorical elements and social interactions that cannot be fully explained within a general cognitive framework². Specifically, Haidt⁶ proposed in the MFT^4,5 that moral judgments are primarily determined by an automatic intuitive process rather than a conscious reasoning process. MFT argues that this intuition is supported by five foundations or bases (i.e., harm/care, fairness/reciprocity, ingroup/loyalty, authority/respect, and purity/sanctity) and that differential sensitivities to each moral foundation lead to individual differences in moral judgments. Haidt and Joseph⁷ further argued that each foundation has its own psychological process and its own evolutionary history. Care/Harm is related to the moral goods of care and kindness toward the suffering of others. Fairness/reciprocity concerns cheating or cooperation in reciprocal interactions. Ingroup/loyalty is related to the moral goods of loyalty and patriotism toward the in-group. Authority/respect is related to respect and awe for authority. Purity/sanctity is related to chastity and purity in bodily and spiritual activities. However, how many foundations exist for morality is still not clear. For example, liberty/oppression is now also included as a sixth foundation⁸. Moreover, equality and proportionality (e.g.,⁹), honor¹⁰, and ownership¹¹ have been proposed as candidate domains or as replacements for the “Fairness/reciprocity” domain (e.g.,⁹). Further, Curry, Mullins, and Whitehouse¹² reported seven types of cooperative behaviors as universal moral rules.

Regarding the number of moral domains, researchers have suggested that morality can be divided into three (i.e., autonomy, community, and divine ethics;¹³), five, or six categories⁷, or can even be explained primarily by one component (e.g., the Harm component^14,15). Haidt and Joseph¹⁶ noted that it is important to advance the understanding of moral functioning by counting and observing distinctive moral content. However, in response to the methodology of Haidt and Joseph¹⁶, Carchidi² highlighted Chomsky's¹⁷ notion that it is difficult to understand diversity effectively without first understanding the mechanisms underlying moral judgments.

For exploring the structure of moral foundations, several useful episode sets have been developed (e.g., moral foundations vignettes: MFVs¹⁸; Moral Foundations Dictionary: MFD¹⁹; Japanese versions of MFD: J-MFD²⁰; Socio-Moral Image Database: SMID²¹; Moral and Affective Film Set: MAAFS²²). However, many previous studies used episodes depicting an act(s) of moral violation, as opposed to moral compliance (or adherence to the moral rules or virtues). Moreover, even when moral compliance behaviors were included, separate analyses for moral compliance and moral violation were not performed. For example, hurting and caring for others are both considered to be associated with a Care/Harm foundation (e.g.,²³). Recently, studies have focused on the compliance side of morality. Curry et al.¹² suggested that morality as cooperation can predict a broader moral phenomenon than the other accounts of morality and reported that seven cooperative behaviors were evaluated as morally good for 60 diverse societies. Further, emotional studies have argued that positive and negative emotions are relatively independent (e.g.,^24,25,26,27). For instance, previous studies suggested that the negative and positive affects facilitated problem-solving (e.g.,²⁶) in different ways and differently affected job performances²⁷. Moreover, Cunningham, Steinberg, and Grev²⁸ reported that positive moods and negative events (guilt-related) increased helping behaviors under different conditions. Given the differences between positive and negative functions, the studies of moral judgments will require the accumulation of data on both compliance and violation.

Here, we focused on how moral judgments are exclusive across moral categories. Examining the extent to which moral categories could be subjectively distinguished, we considered that it would be possible to explore whether the framework of moral judgment is multidimensional, at least in terms of explicit judgments. Moreover, we focused not only on the aspect of moral violation but also on the aspect of moral compliance to explore whether the framework of judgments for moral violation and compliance would be different.

For this purpose, we developed a collection of episodes of moral violation and moral compliance associated with various moral contents. To cover diverse and complex moral judgments, we referred to Haidt’s taxonomy of morality. First, a total of 390 episodes describing moral violation, moral compliance, and affective but moral-neutral behaviors were created. Then, participants rated the valence, arousal, and degree of morality of the main character in each episode. They also rated how relevant the episode was to the content of each of the five moral foundations using a 0–100% scale. Based on the degree of relevance of each episode to the five moral foundations, we performed cluster analyses separately for the moral violation episodes and the moral compliance episodes.

Experiment 1

Methods

The codes of the analysis and episodes set are available and can be accessed at [https://osf.io/jtxyp/?view_only=63607826eed24593a4a7c7f0337b0b62].

Sample

Participants were recruited through the Yahoo crowdsourcing system (https://crowdsourcing.yahoo.co.jp/; January 25th to 26th, 2021). Considering the validity and reliability of the ratings per episode and the limitations of the crowdsourcing system, participants were recruited so that at least 100 participants would rate each episode. In total, 1555 participants (599 women, 956 men, mean age = 45.57 years, age range = 18–78 years) took part in the study. All participants gave their informed consent over the Internet before participating in any experiment in our study. Ethical approval for our study was obtained from the Ethics Review Committee on Research with Human Subjects at Waseda University, and the study was conducted in accordance with relevant guidelines and standards, including the Ethical Guidelines for Medical and Biological Research Involving Human Subjects and the Declaration of Helsinki. The present study was not preregistered.

Instrument

One of the authors created 390 episodes with reference to the episodes of Anderson, Siegel, Bliss-Moreau, and Barrett²⁹ and Konishi, Oe, Shimizu, Tanaka, and Ohtsubo³⁰, which included some of the episodes from Shirai and Ogawa³¹. There were 13 types of episodes (moral compliance: care, fairness, authority, ingroup, purity; moral violation: harm, unfairness, despise, betrayal, impurity; affective but moral-neutral: positive, negative, neutral) and 30 sentences for each type. However, due to technical problems, the number of episodes changed in three categories: 29 Fairness episodes, 31 Unfairness episodes, and 29 Positive episodes were used. All the episodes described the actions of a person. The episodes of each type were created as follows. Episodes of Care were created to include keywords related to protecting and helping others. Episodes of Fairness were prepared to relate to cooperation and equality. Episodes of Authority were designed to relate to respecting and defending a superior and its symbols. Episodes of Ingroup were created to include acts of respect for in-group members. Episodes of Purity were designed to relate to acts such as valuing chastity and purity and avoiding contamination. Episodes of Harm were created to include acts of physically or mentally harming others. Episodes of Unfairness were prepared to include keywords related to cheating others and unequal behavior. Episodes of Despise were created to include acts of disobeying a superior and defiling its symbols. Episodes of Betrayal were created to include acts of betraying in-group members. Episodes of Impurity were designed to include acts related to impurity, such as staining sacred objects.
The affective but moral-neutral episodes were created without considering the five moral foundations. Episodes of the Positive category were designed to include good things that happen independently of social interactions with others. Episodes of the Negative category were designed to include bad events that occur independently of interactions with others. Episodes of the Neutral category were created to include events in daily life that do not seem to evoke strong emotions. Each episode was created by referring to events that occur in daily life or that were reported in the news.

Procedure

All participants completed the experiment online through the Yahoo crowdsourcing system. The experiment was created by Qualtrics (qualtrics.com). In the experiment, 26 episodes were shown on the display in sequence with the questionnaire. Participants were asked to read the episode and rate the valence (1 = extremely unpleasant, 9 = extremely pleasant) and arousal (1 = extremely low, 9 = extremely high) for the episode, the morality of the main character in the episode (1 = extremely moral violation acts, 9 = extremely moral compliance acts) and how much the content of the episode related to each moral category in percentages (Supplementary Material). The episodes rated by the participants were designed to include two episodes for each type of episode (i.e., 10 moral compliance episodes: 2 care, 2 fairness, 2 authority, 2 ingroup, 2 purity; 10 moral violation episodes: 2 harm, 2 unfairness, 2 despise, 2 betrayal, 2 impurity; 6 affective but moral-neutral episodes: 2 positive, 2 negative, 2 neutral), for a total of 26 episodes evaluated. The episodes were divided into 15 sets of 26 episodes each, and participants rated the episodes from one of the sets. The 26 episodes in each set were presented in random order. There was no time limit for each question.

Results

Moral and affective episodes set (MAES)

We developed the set of moral compliance, moral violation, and affective but moral-neutral episodes to understand the structures of moral categories. Table 1 shows the basic demographics in Experiment 1. Tables 2 and 3 show summary descriptive statistics for the episodes. The episodes and the ratings for each episode are available on the OSF (https://osf.io/jtxyp/?view_only=63607826eed24593a4a7c7f0337b0b62). The descriptive statistics showed that the moral compliance episodes were evaluated as more moral than the moral violation episodes. The episodes of the positive category were rated as more positive, the episodes of the negative category were rated as more negative, and the episodes of the neutral category were rated around the midpoint of the scale. These results confirmed the face validity of the episodes in each category (i.e., moral compliance, moral violation, affective but moral-neutral). Regarding the ratings for each moral category, the episodes of the Care category were rated as more positive, and the episodes of the Harm category were evaluated as more negative, than the episodes of other categories. Also, the episodes of the Care/Harm categories received the highest arousal ratings among all categories. Moreover, the episodes of the Care category were rated as more moral and the episodes of the Harm category were rated as more immoral than the episodes of other categories.

We conducted Kruskal–Wallis tests on mean valence for each factor (5 moral categories: Harm/Care, Fairness/Unfairness, Ingroup/Betrayal, Authority/Despise, Purity/Impurity; or 2 moral directions: moral violation and moral compliance) using JASP³². Results showed that the mean valence was significantly influenced by the moral direction, H (1) = 224.08, p < 0.001, but not by the moral categories, H (4) = 0.18, p = 0.996. Dunn’s post hoc comparisons showed that the mean valence for moral compliance episodes was significantly higher than that for moral violation episodes (p < 0.001). Also, the mean valence of the moral violation episodes was affected by the moral categories, H (4) = 34.41, p < 0.001. Pairwise comparisons showed that Harm-related episodes were rated as more negative than the other episodes (unfairness: p = 0.004, despise: p < 0.001, betrayal: p < 0.001, impurity: p = 0.002). There were no significant differences between other episode pairs (ps > 0.494). Moreover, the moral categories affected the mean valence of the moral compliance episodes, H (4) = 34.19, p < 0.001. The Care-related episodes were rated as more positive than the Authority-related, Ingroup-related, and Purity-related episodes (ps < 0.001). There were no significant differences between other episode pairs (ps > 0.074).

Next, we examined the effect of the moral categories or the moral direction on the mean arousal scores. Results showed that the moral categories affected the mean arousal, H (4) = 91.42, p < 0.001, but the moral direction did not, H (1) = 1.24, p = 0.266. Care/Harm-related episodes were rated as higher in arousal than the other types of episodes (ps < 0.001). Authority/Despise-related episodes were rated as higher in arousal than the Fairness/Unfairness-related and Purity/Impurity-related episodes (p < 0.001; p = 0.048). For moral violation episodes, the moral categories significantly affected the arousal scores, H (4) = 53.90, p < 0.001. Pairwise comparisons showed that Harm-related episodes were evaluated as higher in arousal than the other moral categories (ps < 0.001). For moral compliance episodes, the moral categories significantly affected the arousal scores, H (4) = 53.90, p < 0.001. Pairwise comparisons showed that Care-related episodes were evaluated as higher in arousal than Authority-related, Ingroup-related, and Purity-related episodes (ps < 0.001). Also, Fairness-related episodes were rated as higher in arousal than Authority-related episodes (p < 0.001) and Purity-related episodes (p = 0.007). These findings indicated that individuals were especially sensitive to content related to the Care/Harm category.
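The Kruskal–Wallis H statistic reported above can be illustrated with a small sketch. The authors ran their tests in JASP; the code below is only a pure-Python illustration of how H is computed from pooled ranks (with the usual correction for ties). The function names and the toy ratings are hypothetical and are not taken from the MAES data. The p value is omitted because it requires a chi-square distribution function.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of ratings."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Hypothetical mean valence ratings for three episode categories.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_value = kruskal_wallis_h(toy_groups)
```

H is compared against a chi-square distribution with (number of groups − 1) degrees of freedom, which is why the tests above are reported as H (4) for the five moral categories and H (1) for the two moral directions.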

Table 1 Basic demographics in Experiment 1.

Table 2 Mean valence, arousal and morality for each moral compliance, moral violation, affective but moral-neutral episode.

Table 3 Mean degree of relevance to moral categories for each moral compliance, moral violation, affective but moral-neutral episode.

Correlations between valence, arousal, and morality

Figure 1 shows the relationships among valence, arousal, and morality ratings. The relationships between the valence and the arousal ratings (Fig. 1A) and between the morality and the arousal ratings (Fig. 1B) appeared to be boomerang- or U-shaped. That is, the arousal ratings were higher when the valence and morality ratings of episodes were at their extreme values. Such boomerang shapes have generally been seen in the relationship between affective valence and arousal ratings for images^33,34,35. These findings demonstrate that the relationship between morality and arousal ratings is somewhat consistent with the relationship between affective valence and arousal ratings. Figure 1C shows the relationship between the morality and valence ratings. The mean morality ratings seemed to correlate strongly with the mean valence ratings in the part of the distribution that mainly contained the moral compliance and moral violation episodes. There was also an almost horizontal part of the distribution, which contained the affective but moral-neutral episodes. Since the moral compliance and moral violation episodes, but not the affective but moral-neutral episodes, were created to include acts related to morality and immorality, the range of morality ratings for the moral compliance and violation episodes may have been wider, so that the distribution of the moral compliance and violation episodes differed in shape from that of the affective but moral-neutral episodes.

To investigate the associations of ratings of valence, arousal, morality, and relevance of moral categories to the episodes, we performed Pearson correlation analyses (we used the R for analysis; R code is included in the file placed in the OSF). Table 4 shows the correlation coefficients among the ratings of valence, arousal, morality, and relevance of moral categories to the episode. The ratings of morality were highly positively correlated with the ratings of valence, r = 0.96, t(387) = 63.86, p < 0.001 (see also Table 4). There were the negative correlations between arousal and valence ratings, r = − 0.13, t(387) = − 2.68, p = 0.008, and between arousal and morality ratings, r = − 0.16, t(387) = − 3.27, p = 0.001. Moreover, the positive correlations were found between the relevance ratings of all the moral categories. In particular, it was shown that when the relevance of Fairness to the episodes was assessed to be higher, the relevance of Care, Ingroup, and Purity to the episodes also tended to be assessed to be higher (Care: r = 0.64, t(387) = 16.35, p < 0.001; Ingroup: r = 0.73, t(387) = 21.30, p < 0.001; Purity: r = 0.62, t(387) = 15.63, p < 0.001). It was also shown that Purity category was highly related to Care category, r = 0.64, t(387) = 16.23, p < 0.001. Furthermore, Ingroup category was highly correlated with Authority category, r = 0.82, t(387) = 28.55, p < 0.001. These results indicate that the moral categories of the episodes are not completely independent.
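The t values attached to the correlations above follow from r and the degrees of freedom through t = r·√df / √(1 − r²), with df = n − 2. The sketch below is a minimal pure-Python illustration; the authors' own analysis used R (code on the OSF), and the function names and toy ratings here are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, df):
    """t statistic for testing r against zero, with df = n - 2."""
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)


# Hypothetical per-episode mean ratings (not MAES data).
toy_valence = [1, 2, 3, 4, 5]
toy_morality = [1, 3, 2, 5, 4]
r_toy = pearson_r(toy_valence, toy_morality)
t_toy = t_from_r(r_toy, len(toy_valence) - 2)

# Plugging in the rounded Ingroup-Authority correlation reported above.
t_ingroup_authority = t_from_r(0.82, 387)
```

With the rounded value r = 0.82 and df = 387, this gives t of roughly 28.2, close to the reported t(387) = 28.55; the small gap is plausibly due to rounding of r in the text, although that is an assumption. The df of 387 is consistent with 389 episodes entering the correlations.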

Table 4 Correlations between valence, arousal, morality, and ratio of each moral category for each episode.

Cluster analyses of moral compliance and moral violation episodes by using MAES

Using MAES, we analyzed how many categories the moral compliance and violation episodes could be classified into, based on the relevance ratings of each moral category for the episodes.

To determine the best number of clusters, a cluster analysis using the NbClust package in R (method: ward.D2;^36,37) was conducted separately for the moral compliance and moral violation episodes. NbClust is a package that provides the best clustering scheme according to 30 indices for identifying the number of clusters³⁶. This method was selected to ensure that the choice of the number of clusters was not arbitrary. The moral violation episodes were classified into two clusters and the moral compliance episodes were classified into three clusters. We therefore performed hierarchical clustering (ward.D2 method) with hclust and partitioned the data with the cutree function according to the optimal number of clusters calculated by the NbClust package (Table 5).
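To make the clustering step more concrete, the sketch below shows a naive agglomerative procedure based on Ward's minimum-variance merge cost, which is the criterion the ward.D2 option is understood to implement. It is not the authors' NbClust/hclust/cutree pipeline: the number of clusters k is supplied by hand instead of being chosen by indices, and the function names, the ordering of the five foundations, and the toy relevance ratings are all hypothetical.

```python
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a list of equal-length rating vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, k):
    """Merge the cheapest pair of clusters until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [points[m] for m in clusters[ij[0]]],
                [points[m] for m in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # i < j, so index i is unaffected
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


# Hypothetical relevance ratings (%) per episode, in the assumed order
# care, fairness, ingroup, authority, purity.
toy_episodes = [
    [80, 20, 10, 10, 15], [75, 25, 15, 10, 10],  # care-dominant
    [20, 25, 70, 75, 15], [15, 20, 75, 70, 20],  # ingroup and authority
    [20, 70, 65, 20, 10], [25, 75, 70, 25, 15],  # ingroup and fairness
]
toy_labels = ward_clusters(toy_episodes, k=3)
```

The pairwise search makes this sketch cubic in the number of episodes, which is acceptable for a few hundred episodes but far slower than a library implementation.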

Table 5 Numbers of clusters for each gender or age.

Figure 2 shows the distributions of the degree of relevance to each moral category for each cluster of moral compliance or violation episodes. One of the two clusters (i.e., Moral violation/Cluster 1) contained moral violation episodes with low degrees of relevance to all moral categories, while the other cluster (i.e., Moral violation/Cluster 2) contained moral violation episodes with high degrees of relevance to all moral categories. Our participants may have tended to rate the relevance of all moral categories equally for the moral violation episodes. This might indicate that the moral categories for immoral acts are not exclusive.

For the clusters of moral compliance episodes, the first group (i.e., Moral compliance/Cluster 1) included the episodes with a high degree of Ingroup and Authority categories, the second group (i.e., Moral compliance/Cluster 2) included the episodes with a high percentage of Care category, the third group (i.e., Moral compliance/Cluster 3) included the episodes with a high percentage of Ingroup and Fairness categories. These suggested that the Ingroup, Authority, and Fairness categories have similar evaluation trends, whereas Care category is relatively evaluated independently.

To explore the gender differences in the number of clusters, we divided the data based on reported gender and conducted the cluster analysis for the relevance ratings of each moral category. The results showed that the number of clusters in the moral violation episodes seemed to be relatively stable across genders. On the other hand, the estimated number of clusters in the moral compliance episodes was greater for the female participants than the male participants.

Previous studies suggested that the moral foundations might not be stable across the age of cohorts (e.g.,^38,39). To examine the effect of ages on the number of clusters, the participants were divided into four ages of cohorts (i.e., Emerging adults, Young adults, Middle adults, and Late adults) in accordance with Sağel³⁹. The results of cluster analysis for each age of the cohort showed that the number of clusters was not stable across ages in both moral violation and compliance episodes.

Gender differences

Studies have shown that there may be gender differences in moral judgments^40,41. To test whether there were gender differences in the valence, arousal, and morality ratings for each category of our episodes, we conducted Welch’s two-sample t tests for each of these ratings. The p values were not adjusted in the analyses as the episodes were qualitatively different. Figure 3 shows the differences between participant genders in valence, arousal, and morality ratings for each category of the episodes. The results showed that our female participants gave more negative ratings than our male participants for the episodes associated with Harm (female: Mean = 1.73, SD = 0.37; male: Mean = 1.99, SD = 0.36; t(57.95) = 2.85, p = 0.006, d = − 0.71) and negative contents (female: Mean = 3.22, SD = 0.46; male: Mean = 3.57, SD = 0.42; t(57.43) = 3.00, p = 0.004, d = − 0.79). In addition, the female participants gave more positive ratings than the male participants for the positive episodes (female: Mean = 7.26, SD = 0.46; male: Mean = 6.64, SD = 0.47; t(55.92) = 5.08, p < 0.001, d = 1.33). There were no gender differences in other episode categories (t < 1.96, ps > 0.05). Regarding the arousal ratings, the results showed that the male participants tended to perceive higher arousal than the female participants for the episodes associated with Authority (female: Mean = 4.33, SD = 0.62; male: Mean = 4.74, SD = 0.46; t(53.49) = 2.93, p = 0.005, d = − 0.75) and with Purity (female: Mean = 4.41, SD = 0.71; male: Mean = 4.93, SD = 0.50; t(52.09) = 3.26, p = 0.002, d = − 0.85), and the female participants tended to perceive higher arousal for episodes associated with Harm (female: Mean = 6.40, SD = 0.68; male: Mean = 5.73, SD = 0.54; t(54.98) = − 4.21, p < 0.001, d = 1.09) and Impurity (female: Mean = 5.46, SD = 0.75; male: Mean = 4.99, SD = 0.68; t(57.47) = 2.53, p = 0.014, d = 0.66). The gender differences in other episode categories were not significant (t < 1.92, ps > 0.06).
For the morality ratings, the female participants rated the episodes as more moral than the male participants did for Fairness (women: Mean = 7.46, SD = 0.56; men: Mean = 7.15, SD = 0.50; t(55.24) = 2.24, p = 0.029, d = 0.58) and the positive episodes (women: Mean = 6.05, SD = 0.40; men: Mean = 5.69, SD = 0.31; t(52.58) = 3.81, p < 0.001, d = 1.01). We found no significant gender difference in the other episode categories (t < 1.85, ps > 0.07). Thus, while the morality judgments are generally consistent between men and women, there are slight differences in ratings between men and women on some moral foundations.
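The fractional degrees of freedom reported above come from the Welch–Satterthwaite approximation. As a rough check, the sketch below recomputes Welch's t and df from the rounded summary statistics for the Harm valence ratings. It assumes that each gender contributes 30 episode-level means (one per Harm episode); that assumption is not stated explicitly in the text but reproduces the reported df. The function name is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t and Welch-Satterthwaite df from group summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Harm valence: male minus female, assuming 30 episodes per group.
t_harm, df_harm = welch_t(1.99, 0.36, 30, 1.73, 0.37, 30)
```

With these rounded inputs the sketch gives df of about 57.96, in line with the reported t(57.95), and t of about 2.76 rather than the reported 2.85; the difference in t is plausibly due to rounding of the means and SDs.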

Ethics approval

Ethical approval for our study was obtained from the Ethics Review Committee on Research with Human Subjects at Waseda University (No. 2019-357(1)).

Consent to participate

Informed consent was obtained from all participants included in the study.

Consent for publication

All participants have consented to the submission of the data to the journal.

Discussion

We explored the categories of moral violation and moral compliance by developing the Moral and affective episodes set (MAES) and evaluating them. Experiment 1 showed that the moral compliance episodes were divided into three clusters, whereas the moral violation episodes were divided into two clusters, suggesting that the frameworks in judging moral compliance and moral violation might be different.

Moral violation episodes seemed to be grouped by the perceived level of immorality of the episodes, rather than by moral category. This suggests that moral violation behaviors are judged within a single framework (i.e., degree of immorality) rather than on the basis of multiple moral foundations. On the other hand, the moral compliance episodes seemed to be clustered into more than two categories. Thus, the plurality of moral judgments may fit the structure of moral compliance better than that of moral violation.

For the episodes of moral compliance, we observed three clusters (Cluster 1: highly related to Ingroup and Authority; Cluster 2: highly related to Care; Cluster 3: highly related to Ingroup and Fairness). The relevance ratings for Care in particular seemed to change independently of the other moral bases. Such independence of Care is also in line with previous studies and intuitions. For example, several studies arguing for the universality of moral rules have considered the harm component to be particularly crucial in moral judgments and have focused on harm as a basic concept (e.g.,^14,15,42). Our results also showed that the moral episodes with higher relevance ratings of Ingroup also had higher relevance ratings of Authority or Fairness, suggesting that judgments about moral events related to Ingroup could affect judgments about Authority or Fairness. Consistent with our findings, previous studies supposed that the categories of ingroup/loyalty and authority/respect could be grouped into one category as the ethic related to the community and groups^19,23. In addition, the relevance ratings of Fairness changed with the relevance ratings of Ingroup but not with those of Authority; thus, it is possible that the moral categories of Ingroup and Authority overlap partially rather than completely.

Fairness and Ingroup have been assigned to different types of ethics in previous taxonomies of morality (e.g., ethics of autonomy and community^13,23; rights of individuals and values of group unity¹⁹); however, our findings showed that the relevance ratings of Fairness changed along with the relevance ratings of Ingroup. One possible explanation is that the episodes about Fairness sometimes included words associated with groups (i.e., company or family); thus, the episodes about Fairness might have been evaluated as having a higher association with loyalty to groups. Also, all participants in the present study were recruited through a Japanese crowdsourcing website, and therefore most of them were Japanese. It has been reported that Asian cultures place relatively high importance on ethics associated with collectivism⁴³. When people with strong collectivist tendencies evaluate the episodes about Fairness, they may focus on the collectivist issues rather than the individualistic issues in the episodes.

Our results indicated that sensitivities to the moral categories differed depending on gender. Specifically, the female participants rated the events related to Harm as more negative than the male participants did. On the other hand, the male participants rated the events related to Authority as more arousing than the female participants did. These findings are in line with previous studies suggesting that women judged acts related to Care, Fairness, and Purity more severely, whereas men judged acts related to Authority and Ingroup more severely (e.g.,^40,41). To date, differences in gender identity have been explained by the intertwined effects of nature and nurture⁴⁴. Regarding the effects of nurture, Niazi, Inam, and Akhtar⁴⁵ examined the effects of gender stereotypes on the predictions of moral ratings and suggested the existence of a stereotype in which women estimate that men are less sensitive to the moral category of Care. Furthermore, since the participants in the present study were asked to enter their gender before the survey began, this procedure may have drawn attention to their own gender, which might have influenced their moral judgments of the events. Further research will be needed to determine how moral judgments change depending on which of their own identities people direct their attention to.

Experiment 2

In Experiment 2, the survey was conducted to confirm the reproducibility of the results in Experiment 1. We recruited 1600 participants through the Yahoo crowdsourcing system (https://crowdsourcing.yahoo.co.jp/; April 18th to May 6th, 2024). The data were collected to match the gender and age distribution as closely as possible. If the participants answered incorrectly on a basic quiz about the categories of morality, we judged that the participants did not fully understand the survey and the data were excluded from further analysis. Finally, 1265 participants’ data (648 women, 599 men, others 4, unknown 3, no answer 11, mean age = 43.78 years, age range = 18–80 years) were used in Experiment 2.

The procedure was identical to that of Experiment 1 except for the content of one episode. Because some of the technical problems in Experiment 1 were corrected, the number of Positive episodes changed from 29 to 30.

Results

Table 6 shows the demographics of Experiment 2. Tables 7 and 8 summarize the descriptive statistics for the episodes in Experiment 2. The relationships between the arousal ratings and affective valence or morality again showed boomerang shapes (see Fig. 4).

Table 6 Basic demographics in Experiment 2.

Table 7 Mean valence, arousal and morality for each moral compliance, moral violation, affective but moral-neutral episode.

Table 8 Mean degree of relevance to moral categories for each moral compliance, moral violation, affective but moral-neutral episode.

Correlations between valence, arousal, and morality

We performed Pearson correlation analyses to examine the relationships among the ratings of valence, arousal, morality, and the relevance of each moral category to the episode. The results are shown in Table 9. The overall trend of the results was identical to that of Experiment 1.

Table 9 Correlations between valence, arousal, morality, and ratio of each moral category for each episode.
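
As an illustration of the kind of analysis summarized in Table 9, the following is a minimal Python sketch of a Pearson correlation matrix over per-episode mean ratings. It is not the authors' analysis script: the function names `pearson_r` and `correlation_matrix` are hypothetical, and the rating values are made up for illustration and are not taken from the MAES.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of ratings."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical per-episode mean ratings (illustrative values only).
ratings = {
    "valence": [1.5, 2.0, 6.5, 7.0, 4.0],
    "arousal": [6.0, 5.5, 5.0, 6.0, 2.5],
    "morality": [1.0, 2.5, 6.0, 7.5, 4.0],
}
matrix = correlation_matrix(ratings)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Each row of `ratings` would correspond to one rating dimension averaged over participants per episode; the relevance ratios for the five moral categories could be added as further keys in the same way.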

Cluster analyses of moral compliance and moral violation episodes

Using the episode data from Experiment 2, we analyzed how many clusters the moral compliance and violation episodes could be grouped into. The analysis methods were identical to those of Experiment 1. The analysis to determine the optimal number of clusters indicated that the moral violation episodes were classified into three clusters and the moral compliance episodes into four clusters. Therefore, the moral violation and compliance episodes were partitioned with the cutree function according to their respective optimal numbers of clusters.
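
To make the partitioning step concrete, the following is a minimal Python sketch of agglomerative clustering that stops once a requested number of clusters remains, which yields the same partition as cutting the resulting dendrogram into that many groups. It stands in for, and is not, the R workflow used in the study. The distance measure (Euclidean) and linkage (average) are assumptions, since neither is stated in this section; the function name `cut_into_k` and the relevance profiles are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def cut_into_k(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Returns one label (1..k) per point, numbered by first appearance.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    labels = [0] * len(points)
    for label, members in enumerate(sorted(clusters, key=min), start=1):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical relevance profiles per episode, in the order
# (Harm, Fairness, Ingroup, Authority, Purity); illustrative values only.
episodes = [
    [0.80, 0.05, 0.05, 0.05, 0.05],  # Harm-dominant
    [0.75, 0.10, 0.05, 0.05, 0.05],
    [0.20, 0.20, 0.20, 0.20, 0.20],  # evenly spread
    [0.22, 0.18, 0.20, 0.20, 0.20],
    [0.05, 0.05, 0.05, 0.05, 0.80],  # Purity-dominant
    [0.05, 0.05, 0.10, 0.05, 0.75],
]
labels = cut_into_k(episodes, 3)
print(labels)
```

With these made-up profiles the six episodes fall into three pairs, giving the labels `[1, 1, 2, 2, 3, 3]`. The naive pairwise search is cubic in the number of episodes, which is acceptable for a few hundred episodes but is written for readability rather than speed.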

Figure 5 shows the distributions of the relevance to each of the five moral categories for each cluster of moral compliance or violation episodes. In Cluster 1 of the moral violation episodes, the distributions of all moral categories were located near the average ratios. Cluster 2 of the moral violation episodes contained episodes with low degrees of relevance to all moral categories compared to Cluster 1. Further, Cluster 3 of the moral violation episodes included episodes with a high degree of the Harm category. These results partially overlapped with the results of Experiment 1. It seems that there is a bias toward responding equally to the relevance of all moral categories for the moral violation episodes. However, as seen in Cluster 3, there were judgments in which Harm in particular was rated strongly.

The number of clusters in the moral compliance episodes was four. Cluster 1 contained episodes with a high degree of the Ingroup and Authority categories; Cluster 2 contained episodes with a high percentage of the Purity category; Cluster 3 included episodes with a high percentage of the Care category; and Cluster 4 included episodes with a high percentage of the Fairness and Ingroup categories. These results showed that the judgments about Ingroup, Authority, and Fairness are similar, whereas the Care or Purity category is judged independently for specific episodes.

Table 10 shows the results of cluster analysis by each gender (only responses from women or men were analyzed for the cluster analysis) and age. Results showed that there were no differences in the cluster numbers across the genders in both moral violation and compliance episodes. For the age-related differences, the number of clusters was two or three across age cohorts in moral violation episodes. For the moral compliance episodes, the number of clusters was not stable across age cohorts.

Table 10 Number of clusters for each gender or age.

Discussion

We conducted Experiment 2 to confirm the reproducibility of the results of Experiment 1. The results indicate that the moral compliance episodes form four clusters, whereas the moral violation episodes form three clusters. The number of clusters in Experiment 2 differed from that in Experiment 1. Specifically, moral violation episodes that are judged independently for Harm and moral compliance episodes that are rated independently for Purity were found in Experiment 2.

General discussion

We examined the extent to which moral judgments are exclusive across moral categories and explored whether the framework of judgments for moral violation and compliance would be different. The present study demonstrated that some moral judgments, such as care/harm, are relatively exclusive across moral categories. Also, regarding moral compliance behaviors, authority-related or fairness-related judgments appeared to be easily mixed with ingroup-related judgments. Furthermore, our findings showed that the number of clusters differed between moral violation and moral compliance episodes. Specifically, the number of clusters in moral compliance episodes was greater than that in moral violation episodes. It is possible that the moral judgments for the moral violation episodes were borderless across moral categories compared to those for the moral compliance episodes (Supplementary Material).

Both experiments suggested that, compared to moral compliance, moral violations are to some extent judged based on a single framework (i.e., degree of immorality). One possible explanation concerns the generalization of negative responses. In the context of fear learning, it is known that various objects that resemble fear-conditioned objects also elicit fear (i.e., fear generalization⁴⁶). Similarly, it is possible that negative responses to specific acts of moral violation spread to responses to other acts of moral violation. Further studies should examine how the impact of negative emotions toward acts of moral violation might confuse the classification of moral acts.

We found that the number of clusters was not fixed across experiments, suggesting that the structures of moral judgments might change over time. The data from Experiment 1 were collected during the period of impact of COVID-19, whereas the data of Experiment 2 were gathered after the declaration of a state of emergency about COVID-19. Aftereffects of the threat of infection might affect moral judgments. In accordance with these findings, several studies have pointed out the possibility that the moral bases are not stable across time and are susceptible to flexible cognitive and emotional states^38,47.

There are some limitations in this study that need to be considered (Table 11). First, the participants were people in Japan; therefore, our findings may be specific to the present sample. Future studies should examine whether the differences between moral violation and moral compliance generalize to other countries. If there are differences in the categorization of moral violation and/or moral compliance among countries, moral-related beliefs may be modulated by the rules and customs of the countries or environments in which people have interacted. Furthermore, previous studies pointed out that political ideology (i.e., liberals and conservatives) would affect moral judgments and the configurations of the five moral foundations (e.g.,¹⁹). Brown, Chua, and Lukaszewski⁴⁸ suggested that socioeconomic status was related to endorsing the “binding” foundations (loyalty, purity, respect for authority). Further studies should focus on how political ideology and socioeconomic status influence the clusters of moral judgments. The second concern is that we used cluster analysis on the relevance ratings for the moral categories. Other analyses should be used to test for generalizability; for example, further insights might be gained by applying natural language processing to the episodes set. Third, the episodes used contained rules and place names specific to Japan and/or Asia. Whether the moral categories observed in this study are found in other countries and environments may depend on the understanding of the episodes, and further studies should explore this dependence. Moreover, if one tries to verify these episodes in other countries, the meanings of the episodes may be difficult to understand. For further studies, we placed the English versions of the episodes set in OSF, in which the episodes that include Japanese and/or Asian-specific names and rules are marked.
Several previous studies suggested that moral judgments are affected by wordings (e.g., “always”) and framings^49,50,51,52. The present study did not confirm whether the episodes had clear thematic and linguistic distinctions between moral categories. Also, it should be noted that the possibility that the episodes were unintentionally biased by the author cannot be ruled out. A future study should examine whether the structures of moral violation and compliance would be different even when the wording is controlled. The last point concerns the type of stimuli. The richness of information may differ depending on the type of stimulus, such as words, episodes, images, and videos. Future studies should address whether the differences in judgments between moral violation and moral compliance vary among the types of stimuli.

Table 11 Limitation table in this study.

In summary, we developed the moral and affective episodes set (MAES) and found that the frameworks of judgments of morality and immorality may be different. Our findings point to the importance of considering not only moral violation but also moral compliance for further understanding the structure of the moral foundations. The MAES provides the ratings of affective valence, arousal, morality, and relevance of each moral foundation for 389 moral compliance and violation episodes. It would be interesting and important to test the validity of the MAES in different populations. The MAES with all the ratings is freely available for research purposes at this URL (OSF, https://osf.io/jtxyp/?view_only=63607826eed24593a4a7c7f0337b0b62).

Data availability

The datasets generated during and/or analyzed during the current study are available in the OSF repository, [https://osf.io/jtxyp/?view_only=63607826eed24593a4a7c7f0337b0b62].

References

Mikhail, J. Universal moral grammar: Theory, evidence and the future. Trends Cogn. Sci. 11(4), 143–152. https://doi.org/10.1016/j.tics.2006.12.007 (2007).
Carchidi, V. J. The nature of morals: How universal moral grammar provides the conceptual basis for the Universal Declaration of Human Rights. Hum. Rights Rev. 21(1), 65–92 (2020).
Chomsky, N. Syntactic Structures (Mouton, 1957).
Haidt, J. & Joseph, C. How moral foundations theory succeeded in building on sand: A response to Suhler and Churchland. J. Cogn. Neurosci. 23(9), 2117–2122 (2011).
Haidt, J. & Joseph, C. The moral mind: How 5 sets of innate moral intuitions guide the development of many culture-specific virtues, and perhaps even modules. In The Innate Mind Vol. 3 (eds Carruthers, P. et al.) 367–391 (Oxford, 2007).
Haidt, J. The emotional dog and its rational tail: A social intuitionist approach to moral judgment. Psychol. Rev. 108(4), 814–834 (2001).
Haidt, J. & Joseph, C. Intuitive ethics: How innately prepared intuitions generate culturally variable virtues. Daedalus 133(4), 55–66 (2004).
Iyer, R., Koleva, S., Graham, J., Ditto, P. & Haidt, J. Understanding libertarian morality: The psychological dispositions of self-identified libertarians. PLoS ONE 7(8), e42366 (2012).
Atari, M. & Haidt, J. Ownership is (likely to be) a moral foundation. Behav. Brain Sci. 46, e326 (2023).
Atari, M. et al. Morality beyond the WEIRD: How the nomological network of morality varies across cultures. J. Personal. Soc. Psychol. 125(5), 1157–1188 (2023).
Atari, M. Culture of honor. In Encyclopedia of Personality and Individual Differences (eds Zeigler-Hill, V. & Shackelford, T. K.) 977–980 (Springer, 2020).
Curry, O. S., Mullins, D. A. & Whitehouse, H. Is it good to cooperate? Testing the theory of morality-as-cooperation in 60 societies. Curr. Anthropol. 60(1), 47–69 (2019).
Shweder, R. A., Much, N. C., Mahapatra, M. & Park, L. The ‘“big three”’ of morality (autonomy, community, divinity), and the ‘“big three”’ explanations of suffering. In Morality and Health (eds Rozin, P. & Brandt, A.) (Routledge, 1997).
Royzman, E. B., Leeman, R. F. & Baron, J. Unsentimental ethics: Towards a content-specific account of the moral–conventional distinction. Cognition 112(1), 159–174 (2009).
Sousa, P., Holbrook, C. & Piazza, J. The morality of harm. Cognition 113(1), 80–92 (2009).
Haidt, J. & Joseph, C. The moral mind: How five sets of innate intuitions guide the development of many culture-specific virtues, and perhaps even modules. In The Innate Mind. Foundations and the future Vol. 3 (eds Carruthers, P. et al.) 367–391 (Oxford University Press, 2008).
Mitchell, P. R. & Schoeffel, J. Understanding Power: The Indispensable Chomsky (The New Press, 2002).
Clifford, S., Iyengar, V., Cabeza, R. & Sinnott-Armstrong, W. Moral foundations vignettes: A standardized stimulus database of scenarios based on moral foundations theory. Behav. Res. Methods 47(4), 1178–1198 (2015).
Graham, J., Haidt, J. & Nosek, B. A. Liberals and conservatives rely on different sets of moral foundations. J. Personal. Soc. Psychol. 96(5), 1029–1046 (2009).
Matsuo, A., Sasahara, K., Taguchi, Y. & Karasawa, M. Development and validation of the Japanese moral foundations dictionary. PLoS ONE 14(3), e0213343 (2019).
Crone, D. L., Bode, S., Murawski, C. & Laham, S. M. The Socio-Moral Image Database (SMID): A novel stimulus set for the study of social, moral and affective processes. PLoS ONE 13(1), e0190954 (2018).
McCurrie, C. H., Crone, D. L., Bigelow, F. & Laham, S. M. Moral and Affective Film Set (MAAFS): A normed moral video database. PLoS ONE 13(11), e0206604 (2018).
Haidt, J. & Graham, J. When morality opposes justice: Conservatives have moral intuitions that liberals may not recognize. Soc. Justice Res. 20(1), 98–116 (2007).
Watson, D., Clark, L. A. & Tellegen, A. Development and validation of brief measures of positive and negative affect: The PANAS scales. J. Personal. Soc. Psychol. 54(6), 1063 (1988).
Pettit, J. W., Kline, J. P., Gencoz, T., Gencoz, F. & Joiner, T. E. Jr. Are happy people healthier? The specific role of positive affect in predicting self-reported health symptoms. J. Res. Personal. 35(4), 521–536 (2001).
Orita, R. & Hattori, M. Positive and negative affects facilitate insight problem-solving in different ways: A study with implicit hints. Jpn. Psychol. Res. 61(2), 94–106 (2019).
Van Yperen, N. W. On the link between different combinations of Negative Affectivity (NA) and Positive Affectivity (PA) and job performance. Personal. Individ. Differ. 35(8), 1873–1881 (2003).
Cunningham, M. R., Steinberg, J. & Grev, R. Wanting to and having to help: Separate motivations for positive mood and guilt-induced helping. J. Personal. Soc. Psychol. 38(2), 181–192. https://doi.org/10.1037/0022-3514.38.2.181 (1980).
Anderson, E., Siegel, E. H., Bliss-Moreau, E. & Barrett, L. F. The visual impact of gossip. Science 332(6036), 1446–1448 (2011).
Konishi, N., Oe, T., Shimizu, H., Tanaka, K. & Ohtsubo, Y. Perceived shared condemnation intensifies punitive moral emotions. Sci. Rep. 7(1), 7289 (2017).
Shirai, R. & Ogawa, H. Morality extracted under crowding impairs face identification. i-Perception https://doi.org/10.1177/20416695221104843 (2022).
JASP Team. JASP (Version 0.18.3) [Computer software] (2024).
Kurdi, B., Lozano, S. & Banaji, M. R. Introducing the open affective standardized image set (OASIS). Behav. Res. Methods 49(2), 457–470 (2017).
Kuppens, P., Tuerlinckx, F., Russell, J. A. & Barrett, L. F. The relation between valence and arousal in subjective experience. Psychol. Bull. 139(4), 917 (2013).
Lang, P. J. The emotion probe: Studies of motivation and attention. Am. Psychol. 50(5), 372 (1995).
Charrad, M., Ghazzali, N., Boiteau, V. & Niknafs, A. NbClust: An R package for determining the relevant number of clusters in a data set. J. Stat. Softw. 61(6), 1–36. https://doi.org/10.18637/jss.v061.i06 (2014).
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing (2023). http://www.R-project.org/.
Friesen, A. Generational change? The effects of family, age, and time on moral foundations. The Forum 17(1), 121–140 (2019).
Sağel, E. Age Differences in Moral Foundations Across Adolescence and Adulthood. Master's thesis, Middle East Technical University (2015).
Graham, J. et al. Mapping the moral domain. J. Personal. Soc. Psychol. 101(2), 366–385 (2011).
Atari, M., Lai, M. H. C. & Dehghani, M. Sex differences in moral judgments across 67 countries. Proc. R. Soc. B 287, 1–10. https://doi.org/10.1098/rspb.2020.1201 (2020).
Turiel, E. The Development of Social Knowledge: Morality and Convention (Cambridge University Press, 1983).
Kawamura, K. Y. Body image among Asian Americans. In Encyclopedia of Body Image and Human Appearance (ed. Cash, T. F.) 95–102 (Academic Press, 2012).
Eagly, A. H. & Wood, W. Janet Taylor Spence: Innovator in the study of gender. Sex Roles 77(11), 725–733 (2017).
Niazi, F., Inam, A. & Akhtar, Z. Accuracy of consensual stereotypes in moral foundations: A gender analysis. PLoS ONE 15(3), e0229926 (2020).
Dymond, S., Dunsmoor, J. E., Vervliet, B., Roche, B. & Hermans, D. Fear generalization in humans: Systematic review and implications for anxiety disorder research. Behav. Ther. 46(5), 561–582 (2015).
Smith, K. B., Alford, J. R., Hibbing, J. R., Martin, N. G. & Hatemi, P. K. Intuitive ethics and political orientations: Testing moral foundations as a theory of political ideology. Am. J. Political Sci. 61(2), 424–437 (2017).
Brown, M., Chua, K. J. & Lukaszewski, A. W. Formidability and socioeconomic status uniquely predict militancy and political moral foundations. Personal. Individ. Differ. 168, 110284 (2021).
O’Hara, R. E., Sinnott-Armstrong, W. & Sinnott-Armstrong, N. A. Wording effects in moral judgments. Judgm. Decis. Mak. 5(7), 547–554 (2010).
Barbosa, S. & Jiménez-Leal, W. It’s not right but it’s permitted: Wording effects in moral judgement. Judgm. Decis. Mak. 12(3), 308–313 (2017).
Petrinovich, L. & O’Neill, P. Influence of wording and framing effects on moral intuitions. Ethol. Sociobiol. 17(3), 145–171 (1996).
Blankenship, K. L., Craig, T. Y. & Machacek, M. G. The interplay between absolute language and moral reasoning on endorsement of moral foundations. Front. Psychol. 12, 569380 (2021).

Funding

This work was supported by JSPS KAKENHI Grant Number JP20J00838; KAKENHI Grant Number 22H00090.

Author information

Authors and Affiliations

Faculty of Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan
Risako Shirai & Katsumi Watanabe
Japan Society for the Promotion of Science, Tokyo, Japan
Risako Shirai

Contributions

R.S. designed the study, conducted the online experiments and analyses; K.W. coordinated the study. All authors contributed to the writing of the final manuscript.

Corresponding author

Correspondence to Risako Shirai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Shirai, R., Watanabe, K. Different judgment frameworks for moral compliance and moral violation. Sci Rep 14, 16432 (2024). https://doi.org/10.1038/s41598-024-66862-9

Received: 22 January 2024
Accepted: 04 July 2024
Published: 16 July 2024
DOI: https://doi.org/10.1038/s41598-024-66862-9
Springer Nature Limited
