Introduction

Children experience themselves as moral actors from an early age. In moral conflict situations, the wishes and attitudes of children may conflict with the needs of other children (Vera-Estay et al., 2016). In such situations, children face different courses of action and are pulled in contrary directions by rival moral values, duties, and reasons, for example when personal interests (e.g., getting to a leisure park on time) conflict with moral duties (e.g., helping a person in need; Christensen & Gomila, 2012; Weller & Lagattuta, 2014). To act morally, it can be crucial that children regulate their feelings, thoughts, and behaviors in conflict situations (Martel et al., 2007). The importance of self-regulation skills for children’s social and emotional competencies has already been highlighted in previous literature (Eisenberg, 2000; Eisenberg et al., 1996; Rademacher & Koglin, 2019). Further studies highlighted associations between impaired self-regulatory skills and negative (clinical) outcomes (Baldessarini et al., 2017; Robson et al., 2020). However, in order to make clear statements about the impact of self-regulatory skills on children’s morality, further discussions and research are needed (Blasi, 2013).

Self-regulation comprises various skills for controlling thoughts, emotions, and behavior. These self-regulatory skills emerge in early childhood, increase rapidly from kindergarten and preschool age, and continue to develop in non-linear processes through to adulthood (Berger et al., 2007; Denham et al., 2014; Nigg, 2017; Zelazo & Carlson, 2012). The greatest growth in self-regulatory skills occurs in early to middle childhood (Raffaelli et al., 2005). At this age, children are increasingly able to inhibit behavior and initiate goal-directed behavior (McClelland et al., 2007). Since adaptive self-regulation strategies contribute to well-adjusted behavior in preschool or class settings, it seems particularly interesting to examine the association between self-regulation and morality for this age group (Skibbe et al., 2019). The literature highlights that moral action depends not only on moral variables, but also on impulse control, attention, emotional reactions, and the ability to delay gratification (Eisenberg, 2000; Kohlberg, 1981; Oser, 2013). Eisenberg et al. (2000) emphasize the importance of emotionality and the ability to regulate emotions for theorizing moral development and behavior. Further studies found an association between children’s effortful control (a temperamental aspect of self-regulation) and greater internalization of and compliance with rules (Kochanska et al., 1997; Kochanska & Knaack, 2003). It can be hypothesized that children who have high self-regulatory skills are able to put their own interests aside in morally conflicting situations and react morally.

Research has shown that clinical samples (e.g., samples with behavior disorders or callous-unemotional tendencies) differ from community samples in their morality and self-regulation and thus their experiences cannot be considered equivalent for the purposes of the review (Arsenio & Fleiss, 1996; Lotze et al., 2010).

In current studies, different terminologies and aspects of self-regulation are examined depending on the research perspective (Nigg, 2017). Temperament research focuses on effortful control (Kochanska et al., 1994, 1996, 1997); cognitive psychological perspectives focus on executive functions such as attention control and working memory (Cowell et al., 2015, 2017; Hinnant et al., 2013). Rademacher and Koglin (2019) highlighted that different forms and research perspectives of self-regulation should be analyzed separately so that the complex construct of self-regulation can be better understood. Similarly, research on morality also uses various constructs like moral emotions or cognition to answer the question of why a person behaves morally (Oser, 2013). In order to make clear statements about the relationship between self-regulation and morality, the different operationalizations and definitions of the constructs should be considered.

This systematic review and meta-analysis examines the state of research on the empirical associations between self-regulation and morality. In the narrative synthesis, different definitions and operationalizations of the two constructs are considered in order to identify differentiated relationships between the individual aspects of self-regulation and morality. In addition, to further analyze the relationship between self-regulation and morality, a meta-analysis is conducted. Accordingly, the following research questions are addressed: (1) How are self-regulation and morality defined and operationalized in this context? (2) Which empirical results are reported in current research regarding the question of associations between self-regulation and morality in preschool and elementary school age?

Due to the different research perspectives, a high degree of heterogeneity in the definitions and operationalizations of self-regulation and morality is expected (Nigg, 2017; Oser, 2013). To address these different research perspectives, a list of definitions for each study was synthesized to clarify which constructs and operationalizations were used (see Table 1). Despite the expected heterogeneity of the definitions, the association between self-regulation and morality should be examined in a meta-analysis. In the meta-analysis, the constructs are summarized in superordinate constructs depending on the research perspective and operationalization mentioned in the respective studies. For example, the constructs impulsivity and behavioral inhibition are summarized under the aspect of temperament-related self-regulation and constructs such as not cheating or sharing are summarized under the aspect of moral behavior. Concepts from the same study and research perspectives were amalgamated. Studies that focused on several research perspectives or components of self-regulation or morality were therefore included several times. A detailed assignment can also be found in Table 1 and Fig. 1.

Table 1 Definitions and operationalizations of self-regulation and morality and their associations
Fig. 1 Mind map for different terminologies synthesized regarding self-regulation and morality

Methods

The systematic literature search in the bibliographic databases PsycINFO, Scopus, and Web of Science was carried out in October 2020. Additionally, the search was updated at the end of the process to cover publications up to and including March 2022. Guidelines for preferred reporting items for systematic reviews and meta-analyses were followed (PRISMA; Moher et al., 2009; Page et al., 2021). Data organization and extraction were carried out with the software EPPI-Reviewer (Thomas et al., 2020). The statistical program R was used for meta-analytical calculations (R Core Team, 2020). In particular, the statistical packages “meta” (Balduzzi et al., 2019), “metafor” (Viechtbauer, 2010), “dmetar” (Harrer et al., 2021), and “tidyverse” (Wickham et al., 2019) were used. The following search terms were used to identify studies that examined the associations between self-regulation and morality in preschool and elementary school children:

Self-regulation [self-regulat* OR self-control OR "emotion* regulation" OR "executive function*" OR "effortful control" OR Inhibit* OR impulsiv*] AND Morality [moral* OR guilt OR shame OR empathy OR sympathy OR jealous* OR pride OR embarrass*] AND

Preschool and schoolchildren [preschool* OR kindergart* OR nursery OR child* OR "Primary school*" OR "elementary school*" OR "basic education"]
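The three term blocks above can be combined mechanically into one Boolean search string. The following is a minimal sketch in Python, assuming a generic syntax in which terms within a block are joined by OR and blocks are joined by AND. The function names are hypothetical, and each database (PsycINFO, Scopus, Web of Science) may require its own field tags and syntax, which are not reproduced here.

```python
# Term blocks copied from the search strategy; the joining logic is illustrative only.
SELF_REGULATION = [
    "self-regulat*", "self-control", '"emotion* regulation"',
    '"executive function*"', '"effortful control"', "Inhibit*", "impulsiv*",
]
MORALITY = [
    "moral*", "guilt", "shame", "empathy", "sympathy",
    "jealous*", "pride", "embarrass*",
]
POPULATION = [
    "preschool*", "kindergart*", "nursery", "child*",
    '"Primary school*"', '"elementary school*"', '"basic education"',
]


def or_block(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*blocks):
    """Join the parenthesized concept blocks with AND."""
    return " AND ".join(or_block(block) for block in blocks)


QUERY = build_query(SELF_REGULATION, MORALITY, POPULATION)
print(QUERY)
```

Keeping the terms in lists makes it easy to rerun an identical query for a search update.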

Inclusion and Exclusion Criteria

The following inclusion and exclusion criteria were defined: Included studies (1) examined the empirical association between self-regulation and morality. Studies that focused on these and other associations (e.g., moderation or mediation) were also included. (2) Included studies described relationships among children of preschool and elementary school age, i.e., around the age of three to eleven. If children’s ages were slightly out of range (± two years; applies to k = 18 studies), studies were also included. Included studies (3) reported primary empirical data, (4) had been published in English and were (5) published in peer-reviewed journals.

Studies with secondary or summarized data as well as reviews and theoretical papers were excluded. Additionally, experimental studies and studies with clinical samples or samples with children with a mental or physical impairment were excluded because they differ in their methodological and theoretical basis. Contributions such as dissertations, conference contributions, books and book chapters were excluded as well. When databases offered the possibility of setting search restrictions, the results were filtered by language and document type (articles only).

Additionally, for the meta-analytical calculations, the statistical parameters necessary for calculating effect sizes had to be reported. If the necessary information was missing, the authors of the respective studies were contacted. If the necessary parameters could not be obtained, the studies were excluded. Two authors were contacted to request missing statistical parameters; one of them did not respond, and the corresponding study was therefore excluded from the meta-analysis. Since the analysis by Gummerum and López-Pérez (2020) only reports frequencies (percentages) and does not analyze associations, this study was also excluded from the meta-analysis.

Study Identification

After the keyword search, all 2538 hits were exported to the reference management program EPPI-Reviewer (Thomas et al., 2020), and 754 duplicates were removed. The remaining 1784 articles were subjected to a title and abstract screening, in which the previously defined inclusion and exclusion criteria were checked. Subsequently, 124 articles were rechecked by full-text screening. As a result, 34 studies met all inclusion criteria and were considered in the further analyses. The references of the included studies were checked using a manual backward search, and three additional studies were identified. A total of 37 studies were included in the narrative synthesis of this review. Two of these studies were excluded from the meta-analyses because the necessary statistical parameters could not be determined: one author did not respond after being contacted, and another study did not analyze associations between the constructs. Figure 2 presents the process of study identification.
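The counts reported for study identification can be cross-checked with simple arithmetic. The sketch below, in Python, uses only the numbers stated in the text; the class and attribute names are hypothetical, and the numbers of records excluded at each screening stage are derived by subtraction rather than taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowCounts:
    """Counts reported in the text for the study identification process."""
    records_identified: int = 2538
    duplicates_removed: int = 754
    full_texts_assessed: int = 124
    included_from_databases: int = 34
    included_from_reference_check: int = 3
    excluded_from_meta_analysis: int = 2

    @property
    def records_screened(self) -> int:
        return self.records_identified - self.duplicates_removed

    @property
    def excluded_at_title_abstract(self) -> int:
        # Derived value, not stated in the text.
        return self.records_screened - self.full_texts_assessed

    @property
    def excluded_at_full_text(self) -> int:
        # Derived value, not stated in the text.
        return self.full_texts_assessed - self.included_from_databases

    @property
    def in_narrative_synthesis(self) -> int:
        return self.included_from_databases + self.included_from_reference_check

    @property
    def in_meta_analysis(self) -> int:
        return self.in_narrative_synthesis - self.excluded_from_meta_analysis


flow = FlowCounts()
print(flow.records_screened, flow.in_narrative_synthesis, flow.in_meta_analysis)
```

The screened and included totals reproduce the reported values of 1784 and 37.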

Fig. 2 Flow diagram. Note. PRISMA 2020 flowchart for study identification (based on Page et al., 2021)

Data Analysis and Evaluation

The following data were extracted from the studies: (1) study identification, (2) methods - participants (number, age, gender), (3) methods - sample (country, determination of sample size, selection process, remuneration), (4) methods - design (dependent and independent variables, instruments, cross-sectional or longitudinal design), (5) operationalization and definition of self-regulation and morality, (6) study objectives (research questions, hypotheses, theories), (7) results and discussion (analysis, control variables, results, statistical parameters for meta-analysis, discussion). The data extraction list was piloted with five articles. Most of the included studies had multiple and different research questions, which were not always related to self-regulation or morality. In such cases, only results that were relevant to the current review were extracted.

Definitions and operationalizations relating to self-regulation and morality were reported, classified, and quantified in the narrative synthesis and are summarized in Table 1 and Fig. 1. In the narrative synthesis of the associations of the constructs, results of regression analyses were reported. Correlation results were reported if no regression analyses were performed. Additionally, relationships between the constructs were examined quantitatively by using meta-analytical calculations.

For the meta-analysis, it was hypothesized that high levels of self-regulation are associated with high levels of morality. Mainly, correlation coefficients were extracted for better comparability. If necessary, effects were first transformed into r-values. Since different methods were used in the included studies, delta (Δ) was used as a uniform effect size. All effects were recoded in the same direction, so that a positive effect indicates that a high level of self-regulation is associated with a high level of morality. If only nonsignificant results were reported, the effect size was set to zero (Cohen, 1988; Ellis, 2010; Lipsey & Wilson, 2001; Peterson & Brown, 2005; Rosenthal, 1994). Additionally, separate meta-analyses at the level of the different constructs were carried out. For constructs that could be combined (e.g., moral emotions such as guilt and shame) and when several effect sizes were reported, a mean study effect size was calculated (Beelmann & Bliesener, 1994).
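The extraction steps described here (transforming effects into r-values, recoding direction, setting unreported nonsignificant effects to zero, and averaging within a study) can be sketched in Python. The conversion formulas are standard textbook formulas from a t statistic and from a standardized mean difference with equal group sizes; whether the review used exactly these conversions is an assumption, and all function names are hypothetical. The simple arithmetic mean is likewise an assumption about how the mean study effect size was formed.

```python
import math
from statistics import mean


def r_from_t(t, df):
    """Convert a t statistic with df degrees of freedom into a correlation."""
    return t / math.sqrt(t * t + df)


def r_from_d(d):
    """Convert a standardized mean difference into r (assumes equal group sizes)."""
    return d / math.sqrt(d * d + 4.0)


def recode_direction(r, sr_reverse_keyed=False, morality_reverse_keyed=False):
    """Flip the sign once per reverse-keyed measure (e.g., impulsivity, cheating),
    so that positive r always means: more self-regulation, more morality."""
    if sr_reverse_keyed:
        r = -r
    if morality_reverse_keyed:
        r = -r
    return r


def r_or_zero(r):
    """Use 0.0 when an effect was only reported as nonsignificant (no value given)."""
    return 0.0 if r is None else r


def mean_study_effect(rs):
    """Average several combinable effects from one study (simple mean, an assumption)."""
    return mean(r_or_zero(r) for r in rs)


# Hypothetical study: impulsivity correlates r = -.30 with guilt, and an effect
# for shame was reported as nonsignificant without a value.
guilt = recode_direction(-0.30, sr_reverse_keyed=True)
print(round(mean_study_effect([guilt, None]), 3))
```

Setting unreported effects to zero is conservative, since it pulls the study mean toward no association.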

Nine subgroups were examined to determine whether they had a moderating effect on the association. Regarding the sample, subgroup analyses were included for the continents from which the samples were recruited (North America, Europe, Asia, and South America) and for age (preschool age, school age, and both age groups). As methodologically relevant moderators, the design of the studies (cross-sectional or longitudinal), the source of the morality report (assessment completed by children versus others), and the source of the self-regulation report (assessment completed by children versus others) were included. As a statistically relevant moderator, statistical power was assessed by a post hoc calculation with G*Power (Faul et al., 2007, 2009), and studies were categorized as underpowered (< .80) or as having enough power (> .80). Furthermore, the quality of the studies (fair versus good) was used as a subgroup (NHLBI, 2021).
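To illustrate the power-based categorization, the following Python sketch approximates the post hoc power of a two-sided test of a correlation using the Fisher z normal approximation. This is not the G*Power routine itself, which may use a different (exact) computation, and the text does not state which effect size and alpha level were entered, so both inputs are assumptions. The function names are hypothetical. A power of exactly .80 is not assigned to either category in the text and is treated here as underpowered.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0 (Fisher z approximation)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected test statistic under the alternative hypothesis.
    shift = math.atanh(abs(r)) * math.sqrt(n - 3)
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


def power_category(power, threshold=0.80):
    """Categorize a study by power; exactly .80 counts as underpowered (assumption)."""
    return "enough power" if power > threshold else "underpowered"


# Hypothetical studies, each with an assumed effect of r = .30.
for n in (36, 84, 200):
    p = correlation_power(0.30, n)
    print(n, round(p, 2), power_category(p))
```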

Due to the different methods used in the included studies, differences in population effects had to be considered. Consequently, random-effects models were selected for the analyses (Ellis, 2010; Lipsey & Wilson, 2001; Rosenthal, 1994). To determine heterogeneity, Cochran’s Q, Higgins and Thompson’s I², τ², and p were calculated (Higgins & Thompson, 2002). I² values of around 25%, 50%, and 75% indicate low, moderate, and high heterogeneity, respectively (Higgins et al., 2003). To identify possible sources of heterogeneity, influence analyses (Baujat et al., 2002; Harrer et al., 2021) were conducted and the previously selected subgroups were analyzed. To assess publication bias, funnel plots were created and fail-safe N results were reported (Egger et al., 1997; Orwin, 1983; Rosenthal, 1979).
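The pooling and heterogeneity statistics named here can be illustrated with a short Python sketch. It assumes that correlations are pooled on the Fisher z scale with inverse-variance weights and that the between-study variance is estimated with the DerSimonian-Laird method; the text does not state which estimator was used in the R packages, so this is one common choice and not necessarily the authors' configuration. All names and the example data are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PooledResult:
    pooled_r: float    # pooled correlation, back-transformed from Fisher z
    q: float           # Cochran's Q
    df: int            # degrees of freedom of Q (k - 1)
    tau2: float        # between-study variance on the Fisher z scale
    i2_percent: float  # share of total variability attributed to heterogeneity


def random_effects_pool(rs, ns):
    """Pool correlations with a DerSimonian-Laird random-effects model."""
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching r and n")
    zs = [math.atanh(r) for r in rs]
    variances = [1.0 / (n - 3) for n in ns]   # sampling variance of Fisher z
    weights = [1.0 / v for v in variances]    # fixed-effect weights
    sum_w = sum(weights)
    fixed = sum(w * z for w, z in zip(weights, zs)) / sum_w
    q = sum(w * (z - fixed) ** 2 for w, z in zip(weights, zs))
    df = len(rs) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled_z = sum(w * z for w, z in zip(re_weights, zs)) / sum(re_weights)
    return PooledResult(math.tanh(pooled_z), q, df, tau2, i2)


# Hypothetical data: three studies with n = 103 each.
result = random_effects_pool([0.10, 0.30, 0.50], [103, 103, 103])
print(round(result.pooled_r, 3), round(result.q, 2), round(result.i2_percent, 1))
```

For these hypothetical data, I² is about 80%, which would count as high heterogeneity under the benchmarks given above.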

Additionally, the Study Quality Assessment Tool of the National Heart, Lung, and Blood Institute (NHLBI, 2021) was applied to all included studies (see Appendix A). A guide with 14 criteria was used to assess the quality of cross-sectional and longitudinal studies. Studies that (1) measured self-regulation prior to morality, (2) showed sufficient time between the measurement times, (3) recorded important control variables, and (4) adequately defined self-regulation and morality were rated as “good”. Studies that did not meet the first two points (e.g., cross-sectional studies) were rated as “fair”. If a study neither included control variables nor described the variables accordingly, it was classified as “poor” (NHLBI, 2021).
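The rating rule can be written as a small decision function. The Python sketch below covers only the four summary points named in the text, not the full 14-criteria NHLBI guide. The text does not specify how to rate studies that meet the first two points but only one of the last two; treating every remaining case as “fair” is an assumption. The function and parameter names are hypothetical.

```python
def rate_quality(sr_measured_first, sufficient_interval,
                 controls_recorded, constructs_defined):
    """Return 'good', 'fair', or 'poor' from the four summary points in the text."""
    # Neither control variables nor adequately described variables: poor.
    if not controls_recorded and not constructs_defined:
        return "poor"
    # All four points met: good.
    if (sr_measured_first and sufficient_interval
            and controls_recorded and constructs_defined):
        return "good"
    # Everything else, including cross-sectional designs: fair (assumption
    # for the mixed cases that the text does not address).
    return "fair"


# A cross-sectional study with controls and clear definitions.
print(rate_quality(False, False, True, True))
```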

Results

Study Characteristics

A total of 37 studies were included in the narrative synthesis. The studies were published between 1974 and 2021. In total, 6062 children between the ages of 22 months and 13 years took part in the studies. Sample sizes varied depending on the study (minimum n = 36, maximum n = 999). The studies include samples from North America (n = 23), Europe (n = 6), Asia (n = 5), and South America (n = 1). Two studies included samples from different continents (Cowell et al., 2017; Narvaez et al., 2021). Twelve studies had a longitudinal and 25 a cross-sectional design. All cross-sectional studies were rated as “fair” in terms of their quality. All longitudinal studies were rated as “good”, with the exception of Feldman (2007) and Garner (2012), which were rated as “fair” (NHLBI, 2021). The included studies focused on different aspects of self-regulation and utilized different conceptualizations. Temperament-related aspects of self-regulation were examined 21 times, executive functions eleven times, and emotion regulation four times; in five cases, there was no specific conceptualization in this direction (e.g., general cognitive aspects of self-regulation).

Different conceptualizations were also used with regard to morality: moral behavior was examined 15 times, emotions 13 times, cognition 12 times, conscience four times, moral abilities three times, and moral motivation and the moral self once each. In some cases, however, individual aspects, for example moral emotions such as empathy or guilt, were also conceptualized as aspects of conscience. All results of the narrative synthesis are summarized in Table 1. Additionally, a mind map for the different terminologies that were synthesized regarding self-regulation and morality was created (see Fig. 1).

Narrative Synthesis of Definitions and Operationalizations of Self-Regulation

Temperamental Aspects of Self-Regulation

In the studies, temperament-related aspects of self-regulation were examined most frequently (see Table 1). Rothbart et al. (1994a, b) define temperament as biologically based individual differences in reactivity and self-regulation. Reactivity includes excitability and the reactions associated with it. Additionally, it is part of the bottom-up regulatory processes, which mainly run automatically. Self-regulation is purposeful and deliberate and includes top-down processes of attentiveness. It also includes processes of approach, withdrawal, and self-calming. Self-regulation can help modulate reactivity (Nigg, 2017; Rothbart, 1989; Rothbart & Derryberry, 1981). Temperament-related aspects of self-regulation can therefore also cover different aspects of self-regulation. For example, Rothbart and Bates (2007) emphasize that there are two temperament-related control systems: “One is part of an emotional reaction (fear and behavioral inhibition), the other is more completely self-regulatory (attentional control), with the first system developing earlier than the second” (Rothbart & Bates, 2007, p. 131). Children characterized as fearful, with little approach behavior or greater avoidance of novelty, are classified as behaviorally inhibited (Asendorpf & Nunner-Winkler, 1992; Cornell & Frick, 2007; Kagan, 1989; Stifter et al., 2009). Low levels of behavioral inhibition therefore reflect low levels of self-regulation (Rothbart & Bates, 2007). Behavioral inhibition is defined as the “bottom-up interruption of a behavior sequence in response to novel, ambiguous, or threatening stimulus; mediated by internal state of anxiety. A component of bottom-up and reactive aspects of SR” (Nigg, 2017, p. 38). Most of the included studies measured behavioral inhibition using questionnaires (e.g., Behavioral Inhibition Questionnaire; Bishop et al., 2003) with reports from parents or teachers as well as self-reports from children. Augustine and Stifter (2015) and Stifter et al. (2009) used observations.

Inhibitory control is defined as the ability to suppress dominant stimuli or poorly adapted reactions (e.g., Kochanska et al., 1996; Smith et al., 2013). In contrast, impulsivity is defined as less intentional, conscious, or flexible and includes regulatory cognitive components (dos Santos et al., 2020; Eisenberg et al., 2009; Nigg, 2017). Inhibitory control is also defined as the main component of effortful control (Kochanska & Knaack, 2003; Rothbart & Bates, 2007). Effortful control comprises the ability to suppress dominant and prepotent reactions consciously to carry out subdominant reactions and includes top-down regulatory processes (Dong et al., 2021; dos Santos et al., 2020; Nigg, 2017; Stifter et al., 2009). Besides effortful control, the temperamental aspects of surgency (dimensions such as impulsiveness and shyness) and negative affect (malaise, fear, anger, frustration, sadness, and decreased reactivity and reassurance) were also measured in the included studies (Rothbart et al., 1994a, b; Smetana et al., 2012; Yoo & Smetana, 2021). Depending on age groups, the respective scales of the Children’s Behavior Questionnaire (CBQ; Rothbart et al., 2001), the Temperament in Middle Childhood Questionnaire (TMCQ; Simonds & Rothbart, 2004) or the Early Adolescent Temperament Questionnaire (EATQ; Capaldi & Rothbart, 1992) were used for measuring temperament-related aspects of self-regulation. Some included studies also used behavioral batteries with various tasks to measure effortful control and inhibitory control (e.g., Dong et al., 2021; Kochanska et al., 1997; Kochanska & Knaack, 2003; Stifter et al., 2009; Yoo & Smetana, 2021).

Executive Functions

Executive functions comprise cross-domain, social and cognitive abilities and processes that encompass behavioral, emotional and cognitive functions and control, including future-oriented, planned, and regulated behavior (Baker et al., 2021; Cowell et al., 2015, 2017; Wang et al., 2021; Zelazo & Müller, 2011). Effortful control and inhibitory control are also considered to be components of executive functions (Diamond, 2013; Reis & Sampaio, 2019; Smetana et al., 2012; Wang et al., 2021). Furthermore, attention shifting or flexibility, planning skills, and working memory are defined as aspects of executive functions (Cowell et al., 2015; Hinnant et al., 2013). Tan et al. (2020) additionally conceptualize the ability to delay gratification as an aspect of executive functions (Mischel et al., 1972; Rodriguez et al., 1989). With regard to the respective definitions and aspects, overlaps with the temperament-related aspects of self-regulation were found. Both effortful control or inhibitory control and executive functions include inhibition of prepotent or dominant reactions (dos Santos et al., 2020; Hinnant et al., 2013; Kochanska et al., 2009; Kochanska & Knaack, 2003; Stifter et al., 2009). Temperament-related aspects of self-regulation were often measured by parent reports in the included studies. Executive functions were measured by various tasks completed by children, for example, Stroop-like tasks (Gerstadt et al., 1994) such as the day-night task (Reis & Sampaio, 2019; Stifter et al., 2009). Measurement methods in the included studies also overlapped across constructs. For example, the day-night task was also used to measure temperament-related aspects of self-regulation (e.g., dos Santos et al., 2020; Kochanska et al., 2009).

Emotion Regulation

Emotion regulation describes the ability to understand and respond to emotions (Garner, 2012). It is defined as a goal-oriented process that modulates, initiates, inhibits or maintains a sum of emotion-related, motivational, attention and behavioral processes (Eisenberg & Spinrad, 2004; Hinnant et al., 2013; Panfile & Laible, 2012). Negative emotionality is defined as the frequency, intensity, and duration of experiences with negative affective states (e.g., sadness or anger; Denham, 1998; Panfile & Laible, 2012; Rothbart & Putnam, 2002). According to Gummerum and López-Pérez (2020), interpersonal or extrinsic emotion regulation includes the regulation of the emotions of others to improve or worsen another’s active emotional state. Garner (2012) and Hinnant et al. (2013) used the parent report of the Emotion Regulation Checklist (ERC; Shields & Cicchetti, 1997) to measure emotion regulation. Panfile and Laible (2012) aggregated the scales of the ERC (Shields & Cicchetti, 1997) and the CBQ (Rothbart et al., 2001) and formed a scale for emotion regulation and a scale for negativity. Gummerum and López-Pérez (2020) used three hypothetical moral scenarios to measure interpersonal emotion regulation.

Other Aspects of Self-Regulation

Some studies used conceptualizations such as ego control (Asendorpf & Nunner-Winkler, 1992), general cognitive aspects of self-regulation (Tabibi et al., 2016), resistance to deviation (LaVoie, 1974), misbehavior (Narvaez et al., 2021), or self-regulated compliance (Feldman, 2007) that cannot be assigned to the aforementioned categories.

Narrative Synthesis of Definitions and Operationalizations of Morality

Moral Behavior

Moral behavior is defined as the ability to inhibit behaviors such as cheating, lying or rule violation (Asendorpf & Nunner-Winkler, 1992; Augustine & Stifter, 2015; Dong et al., 2021; Stifter et al., 2009; Wang et al., 2021). Further included studies suggest that prosocial behavior can mirror moral behavior and antisocial behavior can mirror immoral behavior (Asendorpf & Nunner-Winkler, 1992; Augustine & Stifter, 2015; Colasante et al., 2014; Stifter et al., 2009). For example, sharing or donating has been conceptualized as prosocial as well as moral behavior (Cowell et al., 2015, 2017; Reis & Sampaio, 2019; Smith et al., 2013; Tan et al., 2020; Wildeboer et al., 2017). Reis and Sampaio (2019) conceptualize sharing as a behavioral component associated with moral reasoning, arguing that sharing shows ways in which children apply and judge norms of justice. Colasante et al. (2014) and Dong et al. (2021) used the parental report of the My Child Conscience Instrument to measure moral behavior (Kochanska et al., 1994). All other included studies used behavioral observations in the context of tasks or play situations to assess the children’s moral behavior.

Moral Emotions

Research identified self-conscious emotions, such as guilt, shame, empathy, or pride, as relevant variables in moral development (Eisenberg, 2000; Lewis, 2000; Muris et al., 2015; Tangney et al., 2007). Guilt is a feeling that is triggered by violations of internalized moral standards and is associated with worry, restlessness, tension, as well as the desire to make amends (Colasante et al., 2014, 2015; dos Santos et al., 2020; Hoffman, 2000). Shame, in turn, is defined as an emotion including despondency, helplessness, and the desire to escape (dos Santos et al., 2020; Ferguson et al., 1999; Muris et al., 2015). Panfile and Laible (2012) conceptualize empathy as a precursor to prosocial and moral behavior (Eisenberg & Miller, 1987; Hoffman, 1990). Empathy is also associated with affective concern for a person in need (Young et al., 1999). Sympathy is conceptualized as the concern that arises from the perception of the emotional state of another (Colasante et al., 2014, 2015). Jambon et al. (2021) conceptualized happy victimizer tendencies as positive emotion expectations while harming others to achieve a goal. Parent reports were primarily used to measure children’s moral emotions in the included studies. Narvaez et al. (2021), Panfile and Laible (2012), and Colasante et al. (2014) used the My Child Conscience Instrument by Kochanska et al. (1994) to measure empathy or guilt. However, they did not discuss moral emotions with regard to the theory of conscience development. In addition, interviews with children including hypothetical moral conflicts were used to measure guilt or happy victimizer tendencies (Colasante et al., 2014, 2015; Jambon et al., 2021). According to Gummerum and López-Pérez (2020), moral emotion attribution can be distinguished from moral emotions. Moral emotion attribution encompasses the emotions expected in moral scenarios. To measure these, hypothetical moral scenarios of social exclusion were used.

Conscience

Five of the selected studies conceptualize moral emotions as part of the conscience (Cornell & Frick, 2007; Kochanska et al., 2009; Nicolais et al., 2017; Rothbart et al., 1994a, b; Stifter et al., 2009). Kochanska et al. (1994) described conscience development as a part of socialization. According to the conceptual model of early conscience development, there are two components: (1) affective discomfort, including arousal, fear of deviations, guilt, and remorse related to actual or hypothetical misconduct, and (2) behavioral control, including the ability to refrain from misconduct, to restrain prohibited impulses, and to implement behavioral standards (Kochanska, 1993; Kochanska et al., 1994). Furthermore, other aspects such as moral behavior (Kochanska et al., 1996, 1997; Nicolais et al., 2017; Stifter et al., 2009), moral cognition (Kochanska et al., 1997; Nicolais et al., 2017), concern after wrongdoing, internalized conduct (Narvaez et al., 2021), and the moral self (Kochanska et al., 1997) were conceptualized as part of conscience development. To measure conscience, parental reports (e.g., My Child Conscience Instrument; Kochanska et al., 1994), behavioral observations, or interviews with hypothetical moral situations were used.

Moral Cognition

Moral cognition comprises mental processes, such as judgments or reasoning about moral issues (Bandura, 2002; Guerra et al., 1994). Gummerum and López-Pérez (2020), Smetana et al. (2012), and Yoo and Smetana (2021) consider moral judgment from the perspective of social domain theory. Accordingly, they distinguish moral judgments from conventional or personal judgments (Smetana et al., 2014; Turiel, 1983, 2006; Yoo & Smetana, 2021). Five of the studies examine moral reasoning and conceptualize it as part of moral cognition (Baker et al., 2021; Feldman, 2007; Harari & Weinstock, 2021; Hinnant et al., 2013; Vera-Estay et al., 2016). Moral reasoning involves responses to situations in which the needs or desires of others conflict with one’s own. The justification given for whether a person in need should be helped is recorded (Eisenberg, 1982). Moral cognition was measured using hypothetical moral scenarios presented in interview situations (e.g., social rules interview; Nucci & Turiel, 1978).

Other Aspects of Morality

Further studies used other conceptualizations such as moral motivation (Asendorpf & Nunner-Winkler, 1992) and moral abilities or competencies in a broader sense (Feldman, 2007; Garner, 2012; Tan et al., 2020).

Narrative Synthesis of the Associations between Self-Regulation and Morality

This configurative synthesis describes the association between different aspects of self-regulation and different aspects of morality. Only results that align with the review question are reported. Results of regression analyses are reported first and foremost; correlation results are reported if no regression analyses were performed. Various confounding variables were included in the analyses of the selected studies. Detailed information on individual associations with the respective confounding variables is presented in Table 1.

Temperamental Aspects of Self-Regulation and Morality

Behavioral Inhibition

Asendorpf and Nunner-Winkler (1992) identified behavioral inhibition as a negative predictor of cheating and egoistic behavior mirroring immoral behavior. Augustine and Stifter (2015) also associated it with moral behavior. In contrast, Stifter et al. (2009) reported that behavioral inhibition did not correlate with moral behavior or emotionality. Similarly, Nicolais et al. (2017) reported no correlations with any of the moral variables in the study (moral choice, emotions, cognition, or behavior). Cornell and Frick (2007), on the other hand, stated that behavioral inhibition was correlated with the moral emotion guilt, but not with empathy. Muris et al. (2015) identified positive correlations with the self-conscious moral emotions shame and guilt. When the shared variance of guilt and shame was controlled, correlations were weakened so that the correlation between behavioral inhibition and guilt (shame-free guilt) was no longer significant. Moreover, Asendorpf and Nunner-Winkler (1992) reported no significant correlations between behavioral inhibition and moral motivation. The association between behavioral inhibition and morality has been examined in six studies. Behavioral inhibition was linked neither to moral cognition nor to moral motivation. Results concerning moral behavior and moral emotions varied.

Inhibitory Control

Colasante et al. (2014) conducted a mediation analysis revealing the moral emotions guilt and sympathy as mediators of the association between inhibitory control and reparation as a part of moral behavior. Jambon et al. (2021) reported that greater inhibitory control predicted a faster decrease in happy victimizing tendencies over time. In contrast, Colasante et al. (2015) reported that inhibitory control did not correlate with these moral emotions. Smith et al. (2013) identified different results depending on the measurement methods for self-regulation. Inhibitory control measured with the day-night task correlated positively with sharing as a part of moral behavior. There was no significant correlation with the bear-dragon task. Stifter et al. (2009) reported that inhibitory control did not correlate with moral behavior and emotionality. Further studies stated that inhibitory control was a positive predictor of conscience (Kochanska et al., 1997; Narvaez et al., 2021) and empathy (Narvaez et al., 2021). Similarly, Kochanska et al. (1996) reported that inhibitory control and impulsivity were significant predictors of conscience. The association between inhibitory control and morality has been examined in eight studies. Inhibitory control was linked to conscience. Results concerning moral behavior and moral emotions differed.

Effortful Control, Negative Affect and Surgency

Dos Santos et al. (2020) reported that effortful control was a positive predictor for guilt, but not for shame. No significant associations between impulsivity (reflecting unregulated behavior) and guilt or shame were identified. Similarly, Rothbart et al. (1994a, b) identified that effortful control and negative affect (reflecting emotional components of regulation) positively predicted empathy and guilt/shame. Surgency (reflecting impulsive behavior) was not a predictor. Further studies stated that effortful control alone was not a predictor for moral behavior (Dong et al., 2021) or cognition (Yoo & Smetana, 2021). Kochanska et al. (2009), on the other hand, reported a positive correlation between effortful control and guilt, and Smetana et al. (2012) reported that surgency and effortful control were predictors for children’s understanding of moral generalizability as a part of moral judgment. Kochanska et al. (1994) found that high effortful control predicted high affective moral discomfort for girls and higher active moral regulation or vigilance for both girls and boys. Kochanska and Knaack (2003) reported that effortful control was a positive predictor of conscience. The association between effortful control and morality has been examined in eight studies. Effortful control was linked to moral cognition and conscience. Results concerning moral emotions seemed to differ, depending on the specific emotion. There was no association between effortful control and moral behavior.

Executive Functions and Morality

Cowell et al. (2015), Reis and Sampaio (2019), Tan et al. (2020), Wang et al. (2021) and Wildeboer et al. (2017) reported that executive functions were not a significant predictor for moral behavior such as sharing, helping, donating or telling the truth. However, Cowell et al. (2017) reported that executive functions were a significant predictor for sharing. Stifter et al. (2009) identified a significant interaction effect between inhibition and moral emotionality with the executive functions acting as a moderator. Inhibited children who demonstrated higher levels of executive functions in preschool age showed less intense emotional responses in moral contexts. Hinnant et al. (2013) and Baker et al. (2021) stated that executive functions alone were not a significant predictor for moral reasoning. However, there was an interaction with emotion regulation or false belief understanding. Children who had low scores in both emotion regulation and executive functions also had lower scores in moral reasoning (Hinnant et al., 2013). Vera-Estay et al. (2016) reported that executive functions and moral reasoning were positively correlated. In contrast, Harari and Weinstock (2021) found no associations. Tan et al. (2020) reported that the ability to delay gratification, as part of executive functions, predicted moral functioning. The association between executive functions and morality has been examined in eleven studies. Executive functions were not linked to moral behavior. Results concerning moral emotions and cognition varied.

Emotion Regulation and Morality

Garner (2012) stated that emotion regulation in school age was not correlated with empathic responses or moral transgressions in preschool age. However, emotion regulation in school age was positively correlated with empathy and the willingness of preschoolers to intervene in moral situations. Gummerum and López-Pérez (2020) used frequency analyses to examine the regulation of interpersonal emotions in moral scenarios. Overall, children strove to improve the emotions of the victims and to worsen the emotions of the perpetrators in situations of social exclusion. Hinnant et al. (2013) reported that emotion regulation was not a significant predictor for moral reasoning. Panfile and Laible (2012) identified emotion regulation, but not negative emotionality, as a positive predictor for empathy. The association between emotion regulation and morality has been examined in four studies. Emotion regulation was not linked to moral cognition, and results concerning moral emotions varied.

Other Aspects of Self-Regulation and Morality

Five studies analyzed aspects of self-regulation that cannot be classified in the scheme used above. Asendorpf and Nunner-Winkler (1992) reported no significant association between ego control and moral motivation and behavior. Tabibi et al. (2016) also reported no significant associations between cognitive self-regulation and moral judgment. LaVoie (1974) conducted an ANOVA and found that children who were more mature in moral judgment tended to show more resistance to deviation. This study was the only one to consider components of morality as independent variables and components of self-regulation as dependent variables. According to Feldman (2007), self-regulated compliance predicted the moral emotion empathy at the age of 13, but not moral cognition. Narvaez et al. (2021) reported that misbehavior only correlated negatively with internalized conduct in a Chinese sample. Additionally, it correlated negatively with empathy in a US sample.

Meta-Analyses of the Associations between Self-Regulation and Morality

While the narrative synthesis focuses primarily on regression analyses, the meta-analyses use correlation coefficients to ensure a uniform interpretation. A total of k = 53 correlation results (N = 9443) which analyzed the association between self-regulation and the morality of preschool and elementary school children were included in the meta-analysis. Concepts from the same study and research perspectives were amalgamated. Nine studies focused on several research perspectives and components of self-regulation or morality and were therefore included several times (Asendorpf & Nunner-Winkler, 1992; Colasante et al., 2014; Feldman, 2007; Hinnant et al., 2013; Kochanska et al., 1997; Narvaez et al., 2021; Nicolais et al., 2017; Stifter et al., 2009; Tan et al., 2020). Two studies used in the narrative synthesis were excluded from the meta-analytical calculations because they reported either no relevant statistical information (Gummerum & López-Pérez, 2020) or not the parameters required to calculate an effect size (LaVoie, 1974).
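To illustrate how correlation coefficients can be combined into one effect, the following minimal sketch pools correlations with a random-effects model after Fisher's z transformation. It is not a reproduction of the calculations in this review: the DerSimonian-Laird estimator, the function name `pool_correlations` and the example values are assumptions chosen for illustration only.

```python
import math

def pool_correlations(rs, ns):
    """Random-effects pooling of correlations (DerSimonian-Laird, Fisher's z).

    rs: correlation coefficients, ns: corresponding sample sizes (each > 3).
    Returns the pooled correlation, its 95% CI, Cochran's Q and tau^2.
    """
    zs = [math.atanh(r) for r in rs]          # Fisher's z transformation
    vs = [1.0 / (n - 3) for n in ns]          # sampling variance of each z
    w = [1.0 / v for v in vs]                 # fixed-effect weights

    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1

    # Between-study variance (truncated at zero)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the z scale
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    ci = (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se))
    return math.tanh(z_re), ci, q, tau2

# Hypothetical correlations and sample sizes, not taken from the included studies
pooled, ci, q, tau2 = pool_correlations([0.10, 0.25, 0.18], [80, 120, 60])
print(round(pooled, 3), tuple(round(b, 3) for b in ci))
```

Pooling on the z scale and transforming back keeps the confidence interval within the admissible range of a correlation.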

Baujat plots and influence diagnostics were used to identify studies that strongly contribute to the heterogeneity and to identify studies that do not fit well into the meta-analytical model (Baujat et al., 2002; Harrer et al., 2021). Based on these analyses, the studies by Kochanska et al. (2009), Kochanska and Knaack (2003), Cowell et al. (2017) and Narvaez et al. (2021) were excluded from the meta-analyses.

Fig. 3 shows the forest plot of the associations between self-regulation and morality (k = 46, N = 4990), revealing a small significant effect (Δ = .15, p < .001, 95% CI [.11, .19]). The heterogeneity (Q = 88.44, p < .001, I2 = 49.10%) was assessed as moderate (Higgins & Thompson, 2002; Higgins et al., 2003).
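The reported I2 value can be recovered from the Q statistic and its degrees of freedom with the formula of Higgins and Thompson (2002), I2 = max(0, (Q - df) / Q) × 100, where df = k - 1. The short sketch below applies this formula to the values reported above; the function name `i_squared` is hypothetical.

```python
def i_squared(q: float, k: int) -> float:
    """Percentage of total variability attributed to heterogeneity.

    q: Cochran's Q statistic, k: number of effect sizes (df = k - 1).
    """
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Overall model reported in the text: Q = 88.44 with k = 46
print(round(i_squared(88.44, 46), 1))  # about 49.1, in line with the reported 49.10%
```

The same calculation can be applied to the Q values and k of the individual subgroups.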

Fig. 3 Forest plot for self-regulation and morality

Subgroup Analysis

The results of the subgroup analyses are shown in Table 2. The analyses regarding the study characteristics: age (QContrast = 3.87, pContrast = .144), quality of the study (QContrast = 0.10, pContrast = .749), measurement of self-regulation (QContrast = 0.11, pContrast = .737) and morality (QContrast = 1.18, pContrast = .278) did not reveal any significant differences (see Table 2). There were also no differences (QContrast = 0.15, pContrast = .696) with regard to longitudinal (Δ = .16, p < .001, 95% CI [.10, .23], k = 20, n = 1706) and cross-sectional studies (Δ = .15, p < .001, 95% CI [.09, .20], k = 26, n = 3284). However, there were significant differences in the subgroup analyses focusing on the continents (QContrast = 9.74, pContrast = .021), the sample power (QContrast = 60.66, pContrast < .001) as well as the different constructs of self-regulation (QContrast = 8.05, pContrast = .045) and morality (QContrast = 9.50, pContrast = .050; see Table 2).
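The p values of the contrast tests follow from comparing QContrast with a chi-square distribution whose degrees of freedom equal the number of subgroups minus one. The sketch below, which uses only the Python standard library, recomputes two of the p values reported above. The degrees of freedom are assumptions derived from the text (three for the four continents, one for the two levels of sample power), and the function name `chi2_sf` is hypothetical.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-y) * sum_{j < df/2} y^j / j!
        return math.exp(-y) * sum(y ** j / math.factorial(j) for j in range(df // 2))
    # Odd df: start from the df = 1 case and add the recurrence terms
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (j + 0.5) / math.gamma(j + 1.5) for j in range((df - 1) // 2)
    )
    return tail

# Continents: QContrast = 9.74 with an assumed df of 3
print(round(chi2_sf(9.74, 3), 3))
# Sample power: QContrast = 60.66 with an assumed df of 1
print(chi2_sf(60.66, 1) < 0.001)
```

Small deviations from the reported p values are to be expected, because the Q statistics in the text are rounded to two decimals.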

Table 2 Subgroup analyses with study characteristics

With regard to the continents, a small effect in the subgroup North America (Δ = .17, p < .001, 95% CI [.12, .22], k = 29, n = 2917) with small to moderate heterogeneity (Q = 47.02, p = .014, I2 = 40.40%) was found. There was also a small effect (Δ = .11, p = .039, 95% CI [.01, .20], k = 10, n = 1347) with moderate heterogeneity (Q = 21.70, p = .01, I2 = 58.50%) in the subgroup Europe. For Asia, no significant effect (Δ = .08, p = .156) was detected. Since there was only one study in the subgroup South America, the analysis could not be carried out.

Analyzing the subgroups in terms of sample power, results showed that analyses with an underpowered sample had a small effect (Δ = .10, p < .001, 95% CI [.06, .14], k = 36, n = 3883) and analyses with sufficient power had a medium effect (Δ = .31, p < .001, 95% CI [.26, .35], k = 10, n = 1107). Both subgroups showed homogeneity (Underpowered: Q = 43.54, p = .152, I2 = 19.60%; Enough Power: Q = 4.31, p = .890, I2 = 0.00%).

The results according to the various aspects of self-regulation suggest that the subgroup that associated emotion regulation with morality showed the greatest effect (Δ = .30, p = .03, 95% CI [.07, .49], k = 3, n = 197). The subgroup is assessed as homogeneous (Q = 4.31, p = .562, I2 = 0.0%). The second largest effect was shown by the subgroup that linked temperament-related self-regulation with morality (Δ = .15, p < .001, 95% CI [.09, .21], k = 26, n = 2787), with moderate heterogeneity (Q = 52.20, p = .001, I2 = 52.10%). Self-regulation with regard to executive functions showed the lowest significant effect (Δ = .14, p = .005, 95% CI [.05, .23], k = 12, n = 1558) and medium heterogeneity (Q = 25.21, p = .009, I2 = 56.40%). The aspects of self-regulation that could not be assigned to any of these conceptions showed no significant effect (Δ = .11, p = .092).

The results for the subgroup analyses of variables on morality suggest that moral behavior (Δ = .14, p = .001, 95% CI [.07, .21], k = 16, n = 1787), moral emotions (Δ = .16, p = .005, 95% CI [.06, .26], k = 11, n = 1229), moral cognition (Δ = .13, p = .021, 95% CI [.02, .23], k = 11, n = 1151) and aspects of morality that could not be assigned to any of these conceptions (Δ = .13, p = .020, 95% CI [.03, .23], k = 6, n = 643) were significantly associated with self-regulation. Conscience (Δ = .33, p = .114) showed no significant effect. In the subgroups of moral behavior (Q = 28.71, p = .018, I2 = 47.70%), moral cognition (Q = 23.16, p = .010, I2 = 56.80%) and moral emotions (Q = 23.12, p = .010, I2 = 56.70%), moderate heterogeneity was found. The subgroup of the aspects of morality that could not be assigned to any conception is assessed as homogeneous (Q = 4.92, p = .426, I2 = 0.00%).

Publication Bias

Egger’s regression test (z = 1.32, p = .187) and the rank correlation test (τ = 0.11, p = .280) for funnel plot symmetry were carried out (see Fig. 4). Since neither of the two tests was significant and visual analysis revealed no asymmetry, no evidence of publication bias was found (Begg & Mazumdar, 1994; Egger et al., 1997). The fail-safe N (observed level of significance: p < .0001, target level of significance: p = .05, fail-safe N = 1811) does not indicate any publication bias either (Rosenthal, 1979).
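Rosenthal's (1979) fail-safe N estimates how many unpublished null results would be needed to raise the combined p value above the target level. A minimal sketch of the underlying formula, N = (Σz)² / z_α² - k, is given below. The z values are hypothetical and do not stem from the included studies, so the output is unrelated to the fail-safe N reported above; the function name `fail_safe_n` is likewise an assumption.

```python
from statistics import NormalDist

def fail_safe_n(z_values, alpha=0.05):
    """Rosenthal's fail-safe N for a list of study-level z values.

    alpha is the one-tailed target level of significance.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical z, one-tailed
    k = len(z_values)
    return sum(z_values) ** 2 / z_alpha ** 2 - k

# Hypothetical z values for four studies
hypothetical_z = [2.0, 2.5, 1.0, 3.0]
print(round(fail_safe_n(hypothetical_z), 1))
```

A fail-safe N that is large relative to the number of included results suggests that the combined effect is unlikely to be an artifact of unpublished null findings.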

Fig. 4 Funnel plot

Differentiated Meta-Analyses

Table 3 shows several smaller meta-analyses with independent samples. These differentiated meta-analyses were carried out if at least two studies (k ≥ 2) were available that could be meaningfully combined with respect to the research perspective and the forms of the constructs (Ryan, 2016; Valentine et al., 2010). In line with the narrative synthesis, there was only a small significant combined effect regarding the relationship between temperament-related aspects of self-regulation and moral behavior (Δ = .17, p = .007, 95% CI [.06, .28], k = 9, n = 874), as well as moral emotions (Δ = .15, p = .017, 95% CI [.03, .25], k = 9, n = 1099). Moderate heterogeneity can be reported for both meta-analyses. The correlations between executive functions and moral behavior (Δ = .11, p = .119) or moral cognition (Δ = .21, p = .095) were not significant. Additionally, there were no significant correlations between temperament-related aspects of self-regulation and moral cognition (Δ = .05, p = .520).

Table 3 Separate meta-analyses with the various constructs

Discussion

The narrative synthesis has shown that both self-regulation and morality can be viewed from different scientific perspectives. Despite various research orientations, definitions and operationalizations, overlaps were also identified. Most included studies focused on temperament-related aspects of self-regulation and associated them with moral behavior and moral emotions. The executive functions were also analyzed with moral behavior. There were fewer research results regarding the connections between emotion regulation and morality, indicating a research gap. Some overlaps have been identified concerning the temperament-related aspects of self-regulation and the executive functions (see Table 1). Both consider inhibitory control, which involves the inhibition of prepotent or dominant reactions (dos Santos et al., 2020; Hinnant et al., 2013; Jambon et al., 2021; Kim-Spoon et al., 2019; Kochanska et al., 2009; Kochanska & Knaack, 2003; Stifter et al., 2009). From the perspective of temperament research, inhibitory control is also an aspect of effortful control. Effortful control, in turn, also represents cognitive components of executive functions such as executive attention (Kim-Spoon et al., 2019; Nigg, 2017). Nigg (2017) highlights that effortful control can be understood as a representation at the trait level of the use of cognitive control for self-regulation. Rademacher and Koglin (2019) emphasize that further research could focus on the interrelation of executive function and effortful control to contribute to a unified labelling and the resolution of conceptual overlaps. This could also help to better understand the construct of self-regulation in the context of morality and build up a theoretical framework.

The meta-analyses took a closer look at the quantitative associations. In line with our hypothesis, there is a small and positive significant effect for the association between self-regulation and morality. Results focusing on the aspects of self-regulation revealed that the subgroup that associated emotion regulation with morality showed the greatest effects. The regulation of emotions is involved in the upregulation of sympathy and empathy. These two emotions are positively related to moral reasoning, because they make it easier to take the perspective of others and to understand their needs. Furthermore, emotion regulation is involved in the downregulation of personal distress or jealousy. These emotions, on the other hand, can disrupt moral cognition if they are not properly regulated (Hinnant et al., 2013). Therefore, well-developed emotion regulation skills in the context of moral conflicts can contribute to acting morally. Otherwise, high empathic arousal could lead to personal distress, so that all cognitive resources are used to reduce the distress rather than to act morally (Eisenberg & Fabes, 2006; Garner, 2012; Panfile & Laible, 2012). This leads to the practical implication that emotion regulation strategies should be promoted, as they could have a positive effect on morality, especially on the perception of moral emotions.

Furthermore, analyses revealed small and positive significant combined effects regarding the association between temperament-related aspects of self-regulation and moral behavior as well as moral emotions in the differentiated meta-analysis. According to dos Santos et al. (2020), the temperamental aspect of effortful control can contribute to inhibiting actions which are inappropriate from a moral point of view, e.g., deliberately harming someone. Conversely, prosocial and moral behavior can be carried out more easily (Evans & Rothbart, 2007; Rothbart et al., 2001). In contrast, Stifter et al. (2009) stated that temperament-related aspects of self-regulation were negatively associated with moral emotions and moral behavior. They assumed that inhibited children are motivated by arousal and discomfort. This, in turn, can impair the ability to feel and understand other people’s emotions. In terms of temperamental inhibition, there seems to be a fine line between too little and too much inhibition. On the one hand, inhibition can help to suppress undesirable behavior; on the other hand, too much inhibition can impede the perception of emotions like empathy. The predictive value of affective temperament is also discussed as a possible contributor to negative clinical outcomes (Baldessarini et al., 2017). Furthermore, Stifter et al. (2009) stated that executive functions can help to reduce low empathy of inhibited children. Studies should look more closely at the role of temperamental inhibition and the interaction with executive functions in the context of moral behavior and emotions.

Subgroup analyses detecting sources of heterogeneity revealed differences with regard to the continents. These results are in line with the findings of other studies that have already found cultural differences regarding morality (Cowell et al., 2017; Myyry et al., 2021; Sachdeva et al., 2011). These differences should be considered in future analyses.

This systematic review and meta-analysis intends to contribute to a uniform labelling of the terminology relating to constructs of self-regulation and to resolve conceptual overlaps. Furthermore, the review illustrated that different constructs of self-regulation have different impacts on the constructs of morality, highlighting the importance of a differentiated view.

Limitations

The results of these investigations are to be interpreted considering some limitations. Effect sizes that were reported as non-significant without further statistical information were set to zero. Although this approach is conservative, it can underestimate the mean effect size of the population and overestimate the effect size variance (Peterson & Brown, 2005; Pigott, 1994). The combination of effect sizes within the included studies could have the disadvantage of a possible loss of information due to a smaller number of effect sizes. Due to the sometimes small sample sizes and possible sample data overlaps between the subgroups, the results should be interpreted with caution (Beelmann & Bliesener, 1994).

The central search term used to identify the most important studies was “moral*”. Moreover, the self-conscious moral emotions (guilt, shame, empathy, sympathy, jealousy, pride and embarrassment) were additionally used. Further analyses should also consider (precursor) skills related to morality such as the ability to be empathic, perspective taking, understanding emotions, theory of mind or false belief understanding (Baker et al., 2021). For this purpose, broader search terms are necessary.

Appendix A shows the evaluation of the quality of the included studies. This reveals some methodological restrictions that should also be considered when interpreting the results. Many of the included studies did not specify or define the study population. Only a few studies justified their sample size. Samples that were too small and with insufficient power may fail to detect effects and lead to a reporting bias. In line with this, subgroup analysis of sample power revealed that the studies with enough power (> .80) reported stronger effects between self-regulation and morality than studies with underpowered samples (< .80). Hence, it is highly recommended to carry out and report a priori sample size justifications (Faul et al., 2007, 2009; Nayak, 2010). Additionally, it should be noted that only 12 of the included studies used a longitudinal study design and only ten of them were classified as good in the quality assessment (NHLBI, 2021). Accordingly, from a methodological point of view, little is known about the longitudinal relationship between self-regulation and moral development. In order to clarify the importance of self-regulation for the moral development of preschool and elementary school children, more longitudinal research and research with larger samples are required. The strengths of many studies were the consideration of different levels of exposure and confounder variables (NHLBI, 2021).
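To illustrate what an a priori sample size justification implies for effects of the size found here, the sketch below uses the common Fisher z approximation for the test of a correlation, n ≈ ((z_{1-α/2} + z_{1-β}) / atanh(r))² + 3. The two-tailed α of .05, the treatment of the pooled effects as correlations and the function name `required_n` are assumptions; exact routines such as those in G*Power (Faul et al., 2007, 2009) can yield slightly different numbers.

```python
import math
from statistics import NormalDist

def required_n(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size needed to detect a correlation r (two-tailed test)."""
    if not 0 < abs(r) < 1:
        raise ValueError("r must lie strictly between -1 and 1 and differ from 0")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = ((z_alpha + z_beta) / math.atanh(abs(r))) ** 2 + 3
    return math.ceil(n)

# Effect sizes of the magnitude reported for the overall model and
# for the subgroup with sufficient power
print(required_n(0.15))
print(required_n(0.31))
```

Under these assumptions, a correlation of .15 requires roughly 350 participants for a power of .80, whereas a correlation of .31 requires about 80.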

Conclusion and Further Research

A small positive association between self-regulation and morality could be identified, especially between temperament-related self-regulation and moral behavior and emotions. Considering the heterogeneous initial situation described, further studies that examine the relationship between self-regulation and morality are essential. For this purpose, coherent definitions of self-regulation and morality are requisite. On the one hand, there are research gaps regarding the associations between emotion regulation and morality and, on the other hand, the reported results are not homogeneous. Further research should also focus on behavioral inhibition and emotion regulation. Several studies identified behavioral inhibition as a predictor of social anxiety (Clauss & Blackford, 2012; Sandstrom et al., 2020). Behavioral inhibition is characterized by fear and shyness (Kagan, 1989). Inhibited and anxious children may experience moral situations with peers as extremely distressing, which in turn may lead to excessive demand and functional impairment (Clauss & Blackford, 2012). In this context, emotion regulation could also become relevant. When children are unable to regulate emotions or the capacity to regulate emotions is reduced, active emotions may inhibit cognitive processes. This could, for example, affect moral cognition and emotions (Garner, 2012; Hinnant et al., 2013; Panfile & Laible, 2012). Poor emotional regulation skills could be associated with personal distress, as adequate coping strategies to alleviate one’s own emotions are missing (Eisenberg et al., 1996; Panfile & Laible, 2012).

It is also noteworthy that some authors and working groups are focusing on the topic repeatedly. In this systematic review, for example, studies by Kochanska and colleagues were included five times (Kochanska et al., 1994, 1996, 1997, 2009; Kochanska & Knaack, 2003), studies by Colasante and colleagues (Colasante et al., 2014, 2015; Jambon et al., 2021) three times, and studies by Stifter and colleagues (Augustine & Stifter, 2015; Stifter et al., 2009), Smetana and colleagues (Smetana et al., 2012; Yoo & Smetana, 2021) and Cowell et al. (2015, 2017) twice each. The results can therefore be dependent on the research perspective and the theoretical conceptions of the respective authors and working groups. In order to expand the research landscape of self-regulation and morality in preschool and elementary school age children, it is necessary that more researchers with different perspectives focus on the topic.

It becomes clear that different research perspectives should be considered. A combination of several constructs of self-regulation as discussed by Stifter et al. (2009) and Hinnant et al. (2013) could be a meaningful approach to detect which regulatory processes are involved in the context of morality. Rademacher and Koglin (2019) also propose an integrative model of self-regulation, in which executive functions and effortful control could be analyzed at the same time and in which overlaps and unique aspects of self-regulation should be considered. Global and integrative models are also proposed with regard to moral development (Malti & Keller, 2010; Oser, 2013). Oser’s (2013) global moral motivation model attempts to explain why individuals act morally by considering, for example, moral visions, beliefs, motives, and the sense of duty. Even if moral values and norms have been internalized by individuals, it is still possible that they act contrary to them. Self-regulation can play a decisive role in this context and can be used as an explanatory approach. Jambon et al. (2021), for example, discuss the importance of inhibitory control for preschoolers with regard to the “control” hypothesis. Children with happy victimizer tendencies may show difficulties suppressing immediate satisfaction of needs. They focus on the desired reward rather than the moral concern. Facilitating self-control may be key to unlocking the moral responses children internalized and are capable of (Jambon et al., 2021). In an integrative model, the various aspects of self-regulation could be considered in order to analyze the relevance for the various constructs of morality. This review and meta-analysis offers a first reference point for future research to consider the importance of emotion regulation and temperamental aspects of self-regulation for moral emotions and action in an integrative approach.