1 Introduction

Externalizing problems (EPs) in young children, usually understood as impulsive, disruptive, aggressive, anti-social, and overactive behavior (Hinshaw 1992), can have a long-standing detrimental effect on success in life, in particular on children’s later academic achievement and school career (e.g., Palmu et al. 2018). There is consistent evidence that genetic as well as environmental influences, such as the family environment, contribute to EPs in young children (e.g., Tucker-Drob and Harden 2013). Genetic influences in particular have been shown to contribute to individual initial and continuing differences in EPs, whereas changes in initial differences in EPs with age have been attributed mainly to age-specific environmental influences (e.g., Lewis and Plomin 2015; Hatoum et al. 2018). However, genes and environment do not work completely independently of one another. Environmental factors can compensate for or trigger genetic influences. Whereas compensation refers to an environmental setting that prevents expression of a genetic vulnerability, triggering refers to a setting with the opposite effect (Shanahan and Hofer 2005). Thus, we can speak of a “genetic risk” for EPs, which can potentially be compensated for or buffered by environments that work against this genetic risk (Leve et al. 2010).

With the expansion of early childhood education and care (ECEC) services in almost all industrialized countries, these services have become increasingly relevant for child development. Therefore, the question arises how far ECEC services are able to moderate the contributions of genetic influences, as well as the contributions of other environments experienced by children, primarily household and family conditions, to EPs. Based on the observation that social experiences in ECEC centers can evoke differentiation in children’s externalizing behavior (McCartney et al. 2010), previous research (e.g., in the USA and the Netherlands) has investigated whether being enrolled in ECEC services moderates the contribution of genetic influences to problem behaviors. These studies indicated a greater contribution of genetic influences among children who attended ECEC centers than among those who did not. For children who did not attend ECEC services, EPs were predominantly due to other environmental influences, such as the family environment (Middeldorp et al. 2014; Tucker-Drob and Harden 2013).

How ECEC centers influence problem behaviors, however, depends not only on whether a child is enrolled but also on ECEC quality (Broekhuizen et al. 2018). ECEC quality can differ significantly across ECEC centers and generate unequal social experiences and unequal chances in child development (Tietze et al. 2013; Stahl et al. 2018). These social experiences may indeed compensate for vulnerabilities to developing EPs, but it is also possible that they trigger EPs, for example through processes of peer rejection (McCartney et al. 2010; Sturaro et al. 2011). To the best of the authors’ knowledge, this paper is the first to study the role of specific characteristics describing ECEC quality for explaining EPs in young children and preschoolers based on a genetically sensitive model. Studying the extent to which differences in ECEC quality moderate genetic and other environmental influences helps us to understand better whether improving particular characteristics of ECEC quality could also help to avoid behavioral problems in the longer run, and why some children continue to show high levels of EPs when they grow older, whereas others do not (Tucker-Drob and Harden 2013). Therefore, we do not restrict our focus to the short-term consequences of ECEC quality on EPs but look at the effects 2 years later, when children attend primary school. Our approach thereby provides unique and valuable information for the ongoing debate about improving ECEC quality to facilitate child development (e.g., Vandell et al. 2010).

We apply behavioral genetic methods based on a twin design. Comparing EPs across pairs of monozygotic (MZ) and dizygotic (DZ) twins, we are able to decompose the observed variance in EPs into three variance components utilizing latent random variables. Thus, we do not directly measure genetic and environmental influences but derive them from the variance–covariance matrix in EPs for MZ and DZ twin pairs. Based on the fact that MZ twins share 100% and DZ twins share on average 50% of their genetic makeup, we are able to estimate the extent of variation in EPs that relates to genetic variation (the genetic component). By looking at the variation in EPs between twin families and within twin pairs, we are additionally able to distinguish between variance in EPs that relates to environments that are shared between twins (the so-called shared environmental component), and the share of variance in EPs that is due to unique experiences of the twins (the so-called nonshared environmental component). By comparing the three components and the underlying variances across different ECEC environments, we are able to assess the extent to which ECEC quality moderates genetic and environmental influences on EPs.

Compared with previous nongenetically informed studies, genetically sensitive approaches, such as the twin design, provide more comprehensive control for bias owing to omitted variables and unobserved heterogeneity (e.g., Diewald et al. 2016). The genetic component (A) captures all unobserved child characteristics that mediate any genetic effects, as any “genetic causes must work through the body” (Freese 2008, p. 6). The shared environmental component (C) captures all unobserved influences of environments that increase the twins’ trait similarity, whereas the nonshared environmental component (E) captures all unobserved influences of environments unique to each twin that contribute to the twins becoming less similar in their EPs. Given that the sample is highly homogeneous with respect to the age of the twins, country, historical time, and ethnicity, the shared environment component can be seen as a proxy for the role of homogeneous effects of the family environment (Freese and Jao 2017). This allows us to address the question of whether ECEC quality is able to moderate the formative influence of the family environment on EPs. We suggest that twin-based studies, such as ours, have the potential to provide relevant and frequently generalizable evidence, as previous research found no meaningful differences in the personalities of twins and nontwins (Johnson et al. 2002), the parenting they receive (Mönkediek et al. 2020), or their anti-social behavior (Barnes and Boutwell 2013).

2 Theoretical Framework

2.1 The German ECEC System

In Germany, ECEC attendance rates of children are high and almost universal from the age of 2. In 2020, 93% of children aged 3–6 in Germany attended ECEC centers (Destatis 2020). At the same time, there is substantive variation in the quality of ECEC centers owing to their organizational and legal framework. The legal framework of ECEC is organized on three different levels: national, state, and municipal. This leads to substantive variation in the characteristics of daycare centers across German municipalities (Spiess 2008; Tietze et al. 2013). Minimum child–staff ratios are regulated across all German states, but not with the same standards (Stahl et al. 2018; Stahl 2017). According to legal regulations, ratios vary from 10 children per educator in regions with strict regulations to 20 children per educator in regions with loose regulations (Stahl 2017). Minimum requirements for most other indicators of structural quality, such as group size, teacher qualifications, and further training, range from precise to very general or none at all. Minimum quality standards and actual conditions often fall short of evidence-based recommendations (Stahl et al. 2018). In terms of educator qualifications, only about 5% of staff in ECEC hold an academic degree, whereas the great majority have completed vocational training. Owing to decentralization, German states and municipalities vary greatly with respect to governance and funding issues. Parents’ fees are mostly income-dependent and relatively low compared with those in most other OECD (Organisation for Economic Co-operation and Development) countries (Huebener et al. 2020). This substantive variation in ECEC quality provides good opportunities to study the effects of ECEC quality on EPs in young children.

2.2 ECEC Quality and (Facets of) Children’s Externalizing Problem Behavior

There is consistent evidence that environmental factors can compensate for or exacerbate genetic and environmental contributions to EPs (e.g., Leve et al. 2010), and that experiences in ECEC centers constitute such relevant social environments (Middeldorp et al. 2014; Tucker-Drob and Harden 2013). However, there are no research results on the extent to which specific ECEC quality characteristics influence genetic and environmental contributions to EPs. As a large number of different quality characteristics may be relevant, it is practically impossible to consider them all within the scope of a single study. Consequently, this study concentrates on identifying and investigating, on the basis of previous studies, those quality characteristics that might be particularly influential for EPs in general or sub-facets of EPs.

ECEC quality research often differentiates among structural quality, process quality, and orientation quality (e.g., Kluczniok and Roßbach 2014; Tietze et al. 2013). Structural quality predominantly comprises easily observable, quantifiable, and regulable features of the ECEC context, such as educators’ qualification, group size, and child–staff ratio. Process quality in ECEC institutions includes the entirety of pedagogical interactions and children’s experiences with the social and material environment (Anders et al. 2012). Orientation quality comprises the education- and care-related expectations, attitudes, norms, and values of all educators in ECEC settings; educational goals play an important role in this context. How ECEC centers organize their work and assure quality (e.g., pedagogical concept) also falls into this category (Tietze et al. 2013).

Overall, there is no consistent evidence of structural quality directly influencing children’s externalizing behavior or socio-emotional development (e.g., Bowne et al. 2017; Gialamas et al. 2014; for Germany: Viernickel and Fuchs-Rechlin 2016). Group size, child–staff ratio, or characteristics related to educators, such as qualification levels, are nevertheless often assumed to affect children’s behavior by influencing process quality (Kluczniok and Roßbach 2014); and, thus, to facilitate the pedagogical work of the educators (Viernickel and Fuchs-Rechlin 2016). For example, group size and the child–staff ratio are expected to impact educators’ educational strategies and their interactions with young children (e.g., Finn et al. 2003). Lower child–staff ratios provide better opportunities for monitoring and promoting children’s skills and learning processes in a more individualized and targeted way (e.g., Bowne et al. 2017). Higher levels of educators’ qualification are supposed to help educators to identify children’s needs (Viernickel and Fuchs-Rechlin 2016), also resulting in higher-quality pedagogical interactions. As a consequence, it can be assumed that in smaller groups, with a lower child–staff ratio or with higher qualifications, the educators have more scope to react to and prevent externalizing behavior of children. Previous research showed a positive association between educators’ training and, for example, children’s social play (Kontos et al. 1994).

Concerning process quality, there is consistent evidence that the quality of the educator–child interactions is associated with social competence and problem behavior in children (e.g., Broekhuizen et al. 2016). In particular, a relationship was observed between better educator–child interactions, such as relationships characterized by higher levels of affection and emotional support, and lower levels of problem behaviors, together with higher levels of a child’s social competence (Gialamas et al. 2014). In addition, educator–child interactions at the group level have been observed to facilitate a certain group climate (Mashburn et al. 2008). Group climate has been described as an important contextual factor for child development (e.g., Broekhuizen et al. 2016), where a positive, supportive climate has been shown to cause lower levels of EPs, particularly in children showing greater vulnerability (Roubinov et al. 2020).

Orientation quality, and particularly perceived responsibility, teacher enthusiasm, and joy and interest in teaching specific activities, has been found to correlate with better teaching (Anders and Rossbach 2015; Kluczniok et al. 2011) and may thus influence child development directly. Despite increasing attention to the pedagogical conceptualization of ECEC quality and child development (e.g., Kluczniok et al. 2011), studies that address this quality dimension explicitly in the context of child outcomes are still rare [Footnote 1]. Therefore, it is worthwhile seeing if the assumption of direct effects of orientation quality on EPs, in addition to indirect ones via process quality, holds.

2.3 Genetic and Environmental Contributions to Externalizing Problem Behaviors in Different ECEC Environments

To theorize how various ECEC characteristics, experienced at the age of 4–6 years, may modify contributions of genes and of environments other than ECEC centers to EP measured 2 years later, we draw on different mechanisms that have been proposed to describe how social contexts can moderate genetic expression. In part, these mechanisms have been applied to ECEC environments before (Middeldorp et al. 2014), and it can be assumed that they affect the relative contributions of genes and environments on EPs. The proposed mechanisms are: contextual triggering, social context as compensation, social context as control, and social context as enhancement (Shanahan and Hofer 2005). Since in the case of EPs we are talking about genetic vulnerabilities and risks, and not about positive reinforcement of genetic predispositions (enhancement), in the present case we focus only on the first three mechanisms.

Contextual triggering is based on the diathesis–stress model, which suggests that the environmental context can act as a stressor that activates a genetic predisposition (Shanahan and Hofer 2005). For example, ECEC attendance could trigger a genetic predisposition to EPs in the case of children experiencing social rejection (e.g., Sturaro et al. 2011). This might be more easily the case in ECEC environments, for example, with an unpleasant group climate. Compensation refers to favorable or enriched environmental contexts that prevent or neutralize genetic expression for certain problem behaviors. In the ECEC context, compensation might occur when conditions allow for better educator–child interactions. Here, educators may be better able to provide children with emotional support and convey strategies to children to counteract negative behaviors. Thereby, the ECEC context may also mitigate the negative contributions of detrimental environments shared between twins, such as household and family conditions. Control refers to a context where social norms or structural constraints hinder genetic expression. For example, in ECEC centers with a low child–staff ratio the better opportunities to monitor children’s behavior may reduce the genetic contribution to EPs, as negative behaviors could be countered earlier and more effectively.

Taken together, we expect better ECEC quality to hinder genetic expression of EPs and to compensate for negative influences of shared environments of the twins, first of all, household and family conditions. This translates into the following hypothesis:

H 1

Characteristics related to better structural, process and orientation quality decrease genetic and shared environmental influences on externalizing problems,

whereas lower structural, process, and orientation quality is expected to have the opposite effects. As structural quality and orientation quality can be expected to affect a child’s EPs mainly through changes in process quality, the effects of structural quality and orientation on EPs can be expected to be less proximal, and may therefore be less influential.

H 2

We expect stronger moderation effects for ECEC characteristics related to process quality than for ECEC characteristics related to structural or orientation quality.

3 Data, Measures, and Analytic Strategy

3.1 Data

The analysis is based on the first two waves of the German Twin Family Panel (TwinLife) (data-set version 5.0.0) (Diewald et al. 2021). The sample is restricted to the twins of the youngest cohort aged 4–6 years at the time of the first interview (2010 twins born in 2009 or 2010), who, in addition, took part in an additional ECEC study, the K2ID-Twins Study (see www.k2id.de/data/samples-k2id-twins). TwinLife is based on a sample of twin families, with same-sex twins, randomly derived from administrative data from communal registration offices. The sample covers the full range of regions and social strata in Germany (Lang and Kottwitz 2020). The first wave was conducted between 2014 and 2016 and the second between 2016 and 2018. Informed consent was obtained from all parents of participating children in the youngest cohort during the first-wave household interviews. Informed consent was also obtained from all educators in the ECEC centers of the K2ID-Twins Study. The surveys of the K2ID-Twins Study were undertaken according to the data privacy protection rules applicable to institutional surveys in Germany.

In 2015 or 2016, detailed information on the ECEC centers the twins attended was collected as part of a satellite project conducted in cooperation with the K2ID project team (www.k2id.de; see also Spiess et al. 2020 for the Socio-Economic Panel (SOEP)-related K2ID study). Unfortunately, it was not possible in all cases to establish contact with the ECEC centers through the twins’ parents; i.e., about 24% of the parents refused to provide the address of the daycare facility. Questionnaires were sent to 769 ECEC centers to collect information on measures on the level of the centers, provided by the directors. In addition, measures on the level of groups the twins were enrolled in were provided by their group educators. The questionnaires were designed to capture various quality indicators of various dimensions of ECEC quality (Schober et al. 2017). They are based on other surveys, which have been confirmed as valid instruments to measure ECEC quality (e.g., McCabe and Ackerman 2007). The response rate of the ECEC centers of about 62% was higher than for comparable German surveys of ECEC centers, such as for the K2ID-SOEP study at 56% (Spiess et al. 2020), for the National Educational Panel Study (NEPS) at 33% (Hellrung et al. 2011), and for the Nationale Untersuchung zur Bildung, Betreuung und Erziehung in der frühen Kindheit study (NUBBEK) at 13% (Döge et al. 2013). This resulted in valid information on 946 twins in 480 ECEC centers.

Restricting the sample to only those twins who took part in the second wave of TwinLife further reduces the sample to a maximum of 713 twins [Footnote 2] (40% monozygotic, 55% female, 35% with one parent born outside Germany; see Table 1) in 364 ECEC centers. Ninety-nine percent of these twins attended the same ECEC center, and more than two-thirds of the twins (70%) attended the same groups. However, owing to item non-response, the number of twins included in the analytical samples is lower and varies between 414 and 630 twins. Twins for whom the information on the dependent variable was missing (47 cases) were kept in the analysis, because they could be used for certain parts of the analysis using full information maximum likelihood estimation (Enders 2010).

Table 1 Sample statistics

The structural quality indicators of the ECEC centers in our analytical sample do not differ systematically from the same measures collected as part of the K2ID-SOEP survey of ECEC centers of children, a representative panel study (Stahl et al. 2018). A comparison of characteristics for the twins in the analytical sample with the original full sample (Table 1) shows that there are only negligible differences for the outcome variable “externalizing problems” and sample characteristics, such as the twin’s zygosity and sex [Footnote 3]. However, we observe slightly fewer children with one parent born outside Germany in the analytical sample and a slightly higher proportion of parents with tertiary education, as indicated by the highest parental ISCED (International Standard Classification of Education; UNESCO 2003) in the household. Closer inspection shows that this is due to increased participation of these families in the second wave of TwinLife rather than to selective participation in the K2ID-Twins study. In addition, there is no correlation between parental education and levels of child’s EPs (compare Table 2). Thus, we do not expect these minor differences to affect our results.

Table 2 Means, Standard Deviations, and Correlations among Study Variables

3.2 Measures

The dependent variable in our analysis is child’s EP behavior in wave 2, when children were 6–8 years old. As educators may react to the children’s EPs with special measures, we use lagged predictors to reduce the risk of reverse causality, with EPs affecting ECEC quality rather than vice versa. Children’s EPs were measured based on a self-report version of the Strengths and Difficulties Questionnaire (SDQ). The SDQ has been widely used, both in terms of self-reports and reports by parents and teachers [Footnote 4] (Stanger and Lewis 1993), and has been shown to provide meaningful data for children younger than 10 years old (Muris et al. 2004; Di Riso et al. 2010) and for the age range considered here (Curvis et al. 2014). This version consisted of four items that measured on a scale from 1 (never) to 3 (very often) how often a child got angry, listened to her/his parents, argued with other children, and lied or cheated (see Table S1 in the Online Appendix). Children under the age of 10 were interviewed personally by the interviewer during a household visit. We reversed the values of the second item and estimated the degree to which children showed EPs utilizing a confirmatory factor analysis (CFA) for all twins included in the second wave. For that purpose, we used the lavaan package (version 0.6-9) in R (Rosseel 2014). The results are reported in Table S2 (in the Online Appendix). By assuming a latent variable underlying the observed items, this approach not only allows us to account for possible ambiguities in the measurement of EPs; estimating the dependent variable in a generalized structural equation model (SEM) also allows us to combine the ordinally scaled items into one metrically scaled variable. Although the CFA can be regarded as over-identified based on four items (Brown and Moore 2012), the resulting model fit statistics showed that the estimated model fits the observed data well (CFI: 0.985; TLI: 0.955; RMSEA: 0.045; SRMR: 0.026) (Hu and Bentler 1999). Saving the predicted values, the resulting new variable “externalizing problems” (mean: −0.02, SD: 0.43) is nearly normally distributed (Fig. S1 in the Online Appendix).
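The reverse-coding step mentioned above can be sketched in a few lines. This is only an illustration of mirroring a rating on the 1 to 3 scale; the item labels and the example answers are hypothetical and not taken from the TwinLife data, and the CFA itself (estimated with lavaan in R) is not reproduced here.

```python
def reverse_code(value: int, low: int = 1, high: int = 3) -> int:
    """Mirror a rating on a closed integer scale [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"rating {value} is outside the scale {low}..{high}")
    return low + high - value


# Hypothetical answers of one child; the keys are illustrative labels only.
answers = {"gets_angry": 2, "listens_to_parents": 3, "argues": 1, "lies_or_cheats": 1}

# Only the second item is positively worded, so only this item is mirrored.
scored = dict(answers)
scored["listens_to_parents"] = reverse_code(answers["listens_to_parents"])
print(scored)
```

After this step, higher values on every item point in the same direction (more externalizing behavior), which is what a single-factor model requires.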

The data we use do not allow us to operationalize all three dimensions of ECEC quality with the same accuracy. Structural quality is operationalized by four measures. The first measure is the size of the ECEC center, which serves as a proxy for differences in the capabilities of organizational structures, such as the availability of relevant equipment, and organizational resources for cross-group activities. The size of the ECEC centers in our sample varies from 15 to 301 children (Table 2). The second measure is group size, which serves as a proxy for the size of a child’s direct ECEC environment and the possibility of educator–child interactions and peer interactions. Twenty-eight ECEC centers in the analytical sample have no group concepts, which means that the ECEC center size equals the group size. Like the size of the center, the group size also varies significantly (Table 2). The two characteristics are barely correlated, as is the case for most of the characteristics. This, however, is not surprising given the marked differences in the degree of official regulation of centers on the state, community, and even provider level, which not only lead to high autonomy in implementing childcare services but also to substantial variations in the focus of ECEC centers (e.g., Stahl 2017). The third measure, the child–staff ratio, is derived by dividing the group size by the number of educators normally co-present in the group. Based on the data, the child–staff ratio varies between 2.8 and 31.5 children per educator. The values at the top of the distribution thus deviate from the legal regulations of the minimum standards (20 children per educator; Stahl 2017). These deviations most likely represent measurement errors and are therefore set to the value of 20 (affecting 14 cases). The formal education of the educators is the same for almost all educators. They are trained as “ErzieherIn,” which is the standard qualification in German ECEC centers. Although the vocational training as an “ErzieherIn” is not university based as in many other countries, it is a relatively solid education covering a broad variety of pedagogical aspects. However, as it trains day care teachers for younger and older children, even school children, some ECEC educators take additional courses with a specific focus on early childhood issues. In the current case, 40% of the educators have completed additional training with a focus on early childhood pedagogy, which we take as the fourth measure of structural quality.
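The derivation of the child–staff ratio, including the top-coding at the legal maximum, can be illustrated as follows. This is a minimal sketch under the assumptions stated in the text (group size divided by co-present educators, values above 20 set to 20); the function name and the example inputs are hypothetical.

```python
LEGAL_MAX_RATIO = 20.0  # loosest legal minimum standard cited in the text (Stahl 2017)


def child_staff_ratio(group_size: int, educators_present: int,
                      cap: float = LEGAL_MAX_RATIO) -> float:
    """Children per educator, top-coded at `cap` to limit measurement error."""
    if educators_present <= 0:
        raise ValueError("at least one educator must be present")
    ratio = group_size / educators_present
    return min(ratio, cap)


# Hypothetical groups: one well staffed, one with an implausibly high raw ratio.
print(child_staff_ratio(14, 5))   # raw ratio below the cap is kept
print(child_staff_ratio(63, 2))   # raw ratio of 31.5 is set to the cap
```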

Apart from this, we use additional indicators of ECEC quality that might affect EPs. The first of these, the stress experience of the educator, serves as a proxy for the social “climate” in the group context and may also reflect the quality of the educator–child interaction (as part of process quality), which may decrease the more stressed the educator is. Stress experience was measured based on responses to a question asking how often in the last 4 weeks group educators felt rushed or under time pressure. Group educators answered this question on a scale ranging from 1 (always) to 5 (never). We recoded the values for this item so that higher numbers reflect greater stress experience in educators and treat the variable as approximately metric scaled. Half of the educators (50%) reported having experienced stress often or even always within the last 4 weeks, which is reflected in a relatively high mean (mean: 3.4, SD: 0.8). As another quality indicator we used the information on talking circles in the ECEC centers, which are usually aimed at encouraging children to reflect and share experiences or discuss group dynamics. Information on the frequency of talking circles was derived based on a question asking group educators how often these circles take place within the group. Educators were able to rate the frequency of talking circles on a scale ranging from 1 (never) to 7 (daily). In general, educators reported that talking circles take place several times a week (mean: 1.9, SD: 1.2). The existence of talking circles also gives some quantitative indication of the staff–child interaction, although we have no measure of how well they are implemented.

An indicator for the orientation quality of the ECEC center is a measure of educators’ education goals. Given that we are interested in EPs, we focus on the education goal “self-regulation,” which was measured by five items. These items measured the extent to which group educators considered it important, each on a scale ranging from 1 (not important at all) to 5 (very important), that children possess self-control, act and behave in a responsible way, show respect for others, can fit in well in groups, and are liked by others. The variable “goals” is the mean score of these items (alpha: 0.66) and is nearly normally distributed and shifted to the upper end of the scale. On average, group educators considered it quite important (mean: 4.1, SD: 0.5) to increase children’s self-regulation abilities.
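To make the construction of such a scale concrete, the following sketch computes a mean score per educator and Cronbach's alpha across items, using the standard formula alpha = k/(k−1) · (1 − Σ item variances / variance of the sum score). The ratings are invented for illustration and do not reproduce the reported alpha of 0.66.

```python
from statistics import mean, variance


def scale_score(item_ratings):
    """Mean of one respondent's item ratings."""
    return mean(item_ratings)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings of five educators on five items (scale 1 to 5).
ratings = [
    [4, 5, 5, 4, 3],
    [3, 4, 4, 4, 3],
    [5, 5, 5, 4, 4],
    [4, 4, 5, 3, 2],
    [2, 3, 4, 3, 3],
]
print([scale_score(row) for row in ratings])
print(round(cronbach_alpha(ratings), 2))
```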

3.3 Analytic Strategy

We examined the contributions of genes and environments to EPs in wave 2, and the extent to which these contributions are moderated by ECEC quality indicators, based on a genetically sensitive linear probability model with ACE variance decomposition (ACE Model). Compared with purely phenotypic analyses, such a “black box approach” avoids problems of omitted variable bias. It is virtually impossible to capture all relevant environmental characteristics, even with regard to the parental home. Standard socioeconomic variables contribute only modestly to the overall variance assigned to the shared environment (Mönkediek and Diewald 2022). Compared with molecular genetic methods, advanced ACE modeling, like the bivariate Purcell model applied in this paper, provides similar flexibility to address gene-by-environment interaction and gene–environment covariation. However, ACE models have been criticized for overestimating the whole-genome contribution and underestimating shared environment effects (Burt and Simons 2014). Nevertheless, polygenic scores often comprise only a small part of the whole-genome effect, with unclear confounding with environmental effects (Burt 2022). Moreover, it is unknown to what degree the part captured by the polygenic score is selective with respect to the overall relevance for EPs. Finally, no molecular genetic data exist for studies that also have detailed information on ECEC quality.

The model is based on assumptions described in Neale and Cardon (1992). In the ACE-Model, the observed EPs of twin 1 (T1) and twin 2 (T2) are postulated to depend on six latent random variables (A1, C1, A2, C2, E1 and E2) and the means μ1 and μ2 (Jöreskog 2021; Mönkediek 2022):

$$\begin{aligned} T_{1}&=\mu _{1}+a_{1}A_{1}+c_{1}C_{1}+E_{1}\\ T_{2}&=\mu _{2}+a_{2}A_{2}+c_{2}C_{2}+E_{2} \end{aligned}\tag{1}$$

“A” reflects “narrow sense heritability” ($h^{2}$), which is indicative of the average genetic effect on EPs (Neale and Cardon 1992). “C” represents the homogeneous effects on EPs of the environments that the twins share, such as the family, the shared neighborhood, or the ECEC center attended together. These effects contribute to twins becoming more similar in their EPs. “E” is indicative of accidental experiences and individually different perceptions of the same environments, both of which make twins more dissimilar in their (externalizing problem) behavior (Freese and Jao 2017). In addition, in the analytical model “E” contains the error term. Whereas E1 and E2 are assumed not to correlate with each other or with any of the other latent variables, the latent random variables A1, C1, C2, and A2 are assumed to have means of zero and the correlation matrix (Jöreskog 2021; Mönkediek 2022):

$$\Phi =\left(\begin{array}{cccc} 1 & & & \\ 0 & 1 & & \\ 0 & 1 & 1 & \\ x & 0 & 0 & 1 \end{array}\right).$$

For MZ and DZ twins the correlation between C1 and C2 is expected to be 1, as shared environmental influences are postulated to be shared by MZ and DZ twins to the same extent (called the equal environments assumption) [Footnote 5]. The genetic relatedness of the twins (“x”) is 1 for MZ and 0.5 for DZ twins. Assuming that all latent random variables exert the same effects on both twins of a pair, the predicted variance–covariance matrix for T1 and T2 for MZ and DZ twin pairs is (Jöreskog 2021; Mönkediek 2022):

$$\Sigma \left(A{,}C{,}E\right)_{MZ}=\left(\begin{array}{cc} a^{2}+c^{2}+e^{2} & a^{2}+c^{2}\\ a^{2}+c^{2} & a^{2}+c^{2}+e^{2} \end{array}\right)$$
(2)
$$\Sigma \left(A{,}C{,}E\right)_{DZ}=\left(\begin{array}{cc} a^{2}+c^{2}+e^{2} & 0.5a^{2}+c^{2}\\ 0.5a^{2}+c^{2} & a^{2}+c^{2}+e^{2} \end{array}\right).$$

Based on the variance–covariance matrices, the three standardized variance components (A, C, E) can be calculated (e.g., Jöreskog 2021):

$$\mathrm{A}=2*\left(\left(\frac{a^{2}+c^{2}}{a^{2}+c^{2}+e^{2}}\right)-\left(\frac{0.5a^{2}+c^{2}}{a^{2}+c^{2}+e^{2}}\right)\right){,}$$
(3)
$$\mathrm{C}=2*\left(\frac{0.5a^{2}+c^{2}}{a^{2}+c^{2}+e^{2}}\right)-\left(\frac{a^{2}+c^{2}}{a^{2}+c^{2}+e^{2}}\right){,}$$
(4)
$$\mathrm{E}=1-\left(\frac{a^{2}+c^{2}}{a^{2}+c^{2}+e^{2}}\right).$$
(5)
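Equations 2 to 5 can be checked numerically. The following minimal sketch builds the expected MZ and DZ covariance matrices from path coefficients and recovers the standardized components from the implied twin correlations. The function names and the values of a, c, and e are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def expected_cov(a, c, e, relatedness):
    """Expected 2x2 covariance matrix of T1 and T2 (Eq. 2).

    relatedness is the genetic correlation x: 1.0 for MZ, 0.5 for DZ pairs.
    """
    total = a**2 + c**2 + e**2
    cross = relatedness * a**2 + c**2
    return [[total, cross], [cross, total]]

def standardized_ace(sigma_mz, sigma_dz):
    """Standardized A, C, E from the twin correlations (Eqs. 3 to 5)."""
    r_mz = sigma_mz[0][1] / sigma_mz[0][0]
    r_dz = sigma_dz[0][1] / sigma_dz[0][0]
    A = 2 * (r_mz - r_dz)
    C = 2 * r_dz - r_mz
    E = 1 - r_mz
    return A, C, E

# Hypothetical path coefficients, for illustration only.
a, c, e = 0.6, 0.3, 0.7
A, C, E = standardized_ace(expected_cov(a, c, e, 1.0),
                           expected_cov(a, c, e, 0.5))
total = a**2 + c**2 + e**2
print(round(A, 3), round(C, 3), round(E, 3))
```

Under the model assumptions, the three expressions reduce to a²/(a² + c² + e²), c²/(a² + c² + e²), and e²/(a² + c² + e²), so the standardized components sum to 1.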

We studied the extent to which characteristics of ECEC quality moderate genetic and environmental influences on EPs based on the full bivariate moderation model proposed by Purcell (2002). The bivariate Purcell Model, typically estimated in a path-based parametrization, extends the baseline model by incorporating a linear regression on the path coefficients (Fig. 1). Through this linear regression the model partitions the variance of EPs into a part that is unrelated to the moderator (M) (here: ECEC quality) and a part that is associated with ECEC quality (Purcell 2002; van der Sluis et al. 2012). The part that is unrelated to the moderator (a21, c21, e21, a22, c22, e22) can be interpreted as the regression constant. The part that is associated with the moderator (βa1 * M, βc1 * M, βe1 * M) corresponds to the regression slope.

Fig. 1 Bivariate moderation model based on Purcell (2002)

There are two ways in which ECEC quality can moderate genetic and environmental influences on EPs. First, ECEC quality characteristics can act as a contextual factor and compensate for or exacerbate genetic and environmental influences, for example, family conditions, on EPs (for an example concerning cognitive performance, see Zavala et al. 2018). In this case, we would observe that ECEC characteristics moderate one or more of the three unique variance paths on EPs (βa2 * M, βc2 * M, βe2 * M; Fig. 1).

Second, the level of ECEC quality might affect the source of covariance between the ECEC characteristics and EPs, which is reflected in the path estimates βa1, βc1, and βe1 (van der Sluis et al. 2012). The bivariate Purcell Model allows decomposing the covariance between the ECEC quality indicators and EPs into common genetic influences (a21) and common environmental influences (c21, e21). Common environmental influences most likely relate to environmental confounding. Environmental confounding could arise when children from certain social groups are more likely to attend certain daycare centers, and at the same time are more likely to develop certain problem behaviors. The common genetic influences (a21) most likely relate to processes of gene–environment correlation (rGE) (Zavala et al. 2018). rGE relates to patterns where individuals at a greater genetic risk for EPs are more often found in certain ECEC environments (for a short description of rGE processes see Diewald et al. 2016). As described, for example, by Rutter and Silberg (2002), it is difficult to assess the relevance of moderation effects (GxE) in the presence of rGE, because E is not completely exogenous to G, which can confound the analysis of GxE. Therefore, we test for possible patterns relating to rGE in the current analysis by looking at the source of covariance between ECEC characteristics and EPs.
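To make the two moderation routes concrete, the following minimal sketch computes the unstandardized variance components of EPs at a given moderator value M. It assumes the usual parametrization of the bivariate moderation model, in which each moderated path is a constant plus a slope times M and each variance component is the sum of the squared common and unique paths. The function name, dictionary layout, and all numerical values are hypothetical and are not estimates from Table 3.

```python
import math

def moderated_components(m, common, beta_common, unique, beta_unique):
    """Unstandardized a², c², e² of the outcome at moderator value m.

    common      : constants of the paths shared with the moderator (a21, c21, e21)
    beta_common : their slopes on the moderator (βa1, βc1, βe1)
    unique      : constants of the paths unique to the outcome (a22, c22, e22)
    beta_unique : their slopes on the moderator (βa2, βc2, βe2)
    """
    result = {}
    for key in ("a", "c", "e"):
        common_path = common[key] + beta_common[key] * m
        unique_path = unique[key] + beta_unique[key] * m
        result[key] = common_path**2 + unique_path**2
    return result

# Hypothetical values: only the unique genetic and nonshared paths are moderated.
common = {"a": 0.2, "c": 0.0, "e": 0.1}
beta_common = {"a": 0.0, "c": 0.0, "e": 0.0}
unique = {"a": 0.5, "c": 0.0, "e": 0.8}
beta_unique = {"a": -0.2, "c": 0.0, "e": 0.1}

for m in (-1.0, 0.0, 1.0):
    comps = moderated_components(m, common, beta_common, unique, beta_unique)
    print(m, {k: round(v, 3) for k, v in comps.items()})
```

In this illustration a negative slope on the unique genetic path lowers the genetic variance as M increases, whereas a positive slope on the unique nonshared path raises the nonshared environmental variance.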

We tested the relevance of the moderation effects by comparing model fit statistics (Akaike information criterion, AIC) and by performing a likelihood ratio test (lrtest) for nested models with and without moderation. We gradually excluded the nonsignificant paths from our models to further improve the model fit. Based on the lrtest we additionally tested whether a model assuming no moderation had a significantly worse fit to the data than the reduced model (the results are reported in Table S4 in the Online Appendix). Even though we tested moderation effects for multiple indicators, we still tested against a significance level of 5%, because in multivariate twin studies using the standard path specification the numerical type I error rates are lower than expected (Verhulst et al. 2019). In such a context, a nominal level of 5% has been argued to correspond more closely to a significance level of 1% and thus to be conservative (Verhulst et al. 2019). All models were estimated in R using the umx package (version 4.9.0) developed by Bates et al. (2019). The lrtest was performed using the mxCompare function, which is part of OpenMx (version 2.19.6) (Boker et al. 2011). We selected the most parsimonious model based on the lrtest in combination with the smallest number of estimated parameters and the smallest values for AIC. Table S2 (Online Appendix) provides an overview of the model fit statistics. In the moderation analysis we z‑standardized all variables except for the variable indicating whether or not educators have training with a focus on early childhood pedagogy.
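The logic of the nested-model comparison can be sketched as follows. This is not the OpenMx implementation; it is a minimal standard-library illustration of a likelihood ratio test and of AIC, assuming that each model is summarized by its −2 log-likelihood and its number of estimated parameters. All function names and fit values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed forms for df = 1 (odd) or df = 2 (even).
    if df % 2 == 1:
        q, s = math.erfc(math.sqrt(y)), 0.5
    else:
        q, s = math.exp(-y), 1.0
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(s + 1, y) = Q(s, y) + y**s * exp(-y) / Gamma(s + 1)
    while s < df / 2.0:
        q += y**s * math.exp(-y) / math.gamma(s + 1.0)
        s += 1.0
    return q

def lr_test(minus2ll_full, k_full, minus2ll_reduced, k_reduced):
    """Likelihood ratio test of a reduced model nested in a full model."""
    stat = minus2ll_reduced - minus2ll_full
    df = k_full - k_reduced
    return stat, df, chi2_sf(stat, df)

def aic(minus2ll, k):
    """Akaike information criterion from -2 log-likelihood and parameter count."""
    return minus2ll + 2 * k

# Hypothetical fit values: a moderation model and a model without two slopes.
stat, df, p = lr_test(minus2ll_full=2500.0, k_full=12,
                      minus2ll_reduced=2509.2, k_reduced=10)
print(round(stat, 1), df, round(p, 4))
print(aic(2500.0, 12), aic(2509.2, 10))
```

A small p value indicates that dropping the moderation paths worsens the fit significantly; when the test is nonsignificant, the reduced model is preferred as the more parsimonious one.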

4 Results

In line with previous research (e.g., Krapohl et al. 2014) we observe only genetic and nonshared environmental contributions to child-reported EPs (see Fig. 2 baseline model; Table S3 in the Online Appendix). Our results suggest that about 35% of the variation in EPs relates to genetic variation, whereas there is no evidence for shared environmental influences. This does not imply that environments objectively shared by twins do not affect them, but only that they do not affect the twins uniformly.

Fig. 2 Standardized variance components for the baseline models and models showing moderation effects

Table 3 presents the results of the moderation models that best fit the data according to the AIC and the lrtest criteria after all nonsignificant paths have been excluded by fixing them to zero (see Table S4 in the Online Appendix). To facilitate the interpretation of the results, Fig. 3 summarizes the moderation effects for the models that showed moderation by plotting the unstandardized variance components (a², c², e²) and their confidence intervals (95% CIs). Given that the reported standardized variance components (A, C, E; Fig. 2) can vary as a function of each other, it is generally recommended to report the unstandardized variance components (a², c², e²) when studying moderation effects (Purcell 2002).
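The reason for preferring unstandardized components can be illustrated with a small numerical sketch. In the hypothetical example below, the genetic variance is held constant while the nonshared environmental variance increases; the standardized genetic component nevertheless declines. The function name and all values are assumptions made for illustration only.

```python
import math

def standardize(a2, c2, e2):
    """Convert unstandardized variance components into proportions of the total."""
    total = a2 + c2 + e2
    return a2 / total, c2 / total, e2 / total

# Hypothetical components at a low and a high moderator value:
# a² is identical in both settings, only e² differs.
low = standardize(a2=0.3, c2=0.0, e2=0.5)
high = standardize(a2=0.3, c2=0.0, e2=0.9)
print([round(v, 3) for v in low])
print([round(v, 3) for v in high])
```

Here the standardized A falls although a² is unchanged, which would wrongly suggest moderation of the genetic component if only proportions were inspected.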

Fig. 3 Unstandardized variance components and confidence intervals (CI) for the baseline models and models showing moderation effects. Baseline AE refers to the baseline model with the two variance components A and E

Table 3 Estimates of the path loadings and standard errors (SE) for the bivariate Purcell (2002) Model for early childhood education and care (ECEC) characteristics and externalizing problems (EPs)

According to the results, only genetic and nonshared environmental influences contribute to EPs in children. Interestingly, there is also a common nonshared environmental component for ECEC center size and the child–staff ratio with EPs. Thus, unsystematic influences that affected twins’ selection into specific ECEC settings also contributed to differences in their externalizing behavior at ages 6–8 (for ECEC center size: e21 = 0.11, SE = 0.03; for child–staff ratio: e21 = 0.06, SE = 0.03). Furthermore, we observe common genetic influences (a21) with EPs in wave 2 for the indicators “child–staff ratio” and “training with focus on early childhood pedagogy.” For “child–staff ratio” the common genetic path is moderated (βa1 = 0.10, SE = 0.04), suggesting that children with a higher genetic predisposition to EPs at ages 6–8 more often attended ECEC centers with a higher child–staff ratio when they were 4–6 years old than children with a lower genetic predisposition. For educators’ training we observe the opposite pattern: Children with a higher genetic predisposition for EPs at ages 6–8 more often attended ECEC centers where educators had training with a focus on early childhood pedagogy (a21 = 0.18, SE = 0.06). Nevertheless, educators’ training reduces the contribution of the genetic component (βa2 = −0.24, SE = 0.04), which is mainly pertinent for stability in EPs (e.g., Lewis and Plomin 2015), and increases the relevance of unsystematic and individual experiences (βe2 = 0.07, SE = 0.02). As a result, the variance in EPs in ECEC facilities with trained educators is almost entirely due to nonshared environmental effects (Figs. 2 and 3).

Taken together, the results make it evident that the quality of ECEC centers can indeed be relevant for moderating a genetic risk of EPs. However, this evidence is restricted to only two of the quality characteristics. There could be more relevant quality characteristics, but the comparably small sample for complex modeling like the bivariate Purcell Models we applied often rendered coefficients nonsignificant. Larger samples might allow the hypothesis to be confirmed for more quality indicators. For example, we did not find any evidence that stress experience moderates the contributions of the variance components, although one could have expected that greater experiences of stress should indicate a more stressful social “climate” in the group and a lower quality of educator–child interactions.

5 Discussion and Conclusion

In children, EPs have been shown to negatively affect different outcomes in later life (e.g., Palmu et al. 2018). With the expansion of ECEC centers, their quality has been increasingly discussed as a promising way of compensating for risks for developing such problem behaviors. Our research links to important ongoing debates about the necessity to improve ECEC quality to facilitate child development (e.g., Stahl et al. 2018). To the best of our knowledge, this paper is the first to analyze the moderating role of specific indicators of ECEC quality on the heritability of EPs. Aside from adverse social environments, genetic propensities for developing EPs are a second risk that is especially relevant for the perpetuation of EPs beyond preschool age. Moreover, the genetically informative design and methods we applied enabled us to tackle issues related to unobserved heterogeneity and omitted variable bias in capturing relevant environments inside and outside the family that many previous studies suffered from.

Previous research has focused on being enrolled in ECEC centers but without paying attention to possible differences in ECEC quality (Middeldorp et al. 2014; Tucker-Drob and Harden 2013). The variations in several quality characteristics we found in our sample allowed us to address this research gap. More precisely, this paper studied the extent to which specific ECEC quality characteristics experienced at the age of 4–6 years moderate the effect of genes as well as conditions outside ECEC centers on EPs 2 years later at the age of 6–8 years. In other words, the genetically informative design enabled us to analyze to which degree a genetic risk and environmental conditions that promote EPs are buffered by specific ECEC quality indicators. This distinction is relevant, particularly as it has been claimed that genes contribute to stability in EPs, whereas environmental influences have been said to lead to changes in EPs (e.g., Lewis and Plomin 2015).

Our expectation that ECEC quality moderates a possible contribution of the shared environment as proxy for a uniform influence of the family environment on EPs was not confirmed. This is mainly because we did not find any such shared environment contribution to EPs at the age of 6–8 years at all. This does not necessarily mean that the family environment does not exert any influence on EPs. Instead, the family environment may not have a uniform but rather an individual effect on children’s behavior, i.e., that the family environment contributes to EPs in one twin and not in the other. Another possibility is that the family environment changed during the 2 years until the second measurement of EPs.

Our results show, however, that a higher ECEC quality with respect to educators’ training is able to moderate genetic influences that contribute to EPs, while it also moderates the relevance of unsystematic, individual experiences and thus provokes differentiation in children’s EPs. Accordingly, ECEC quality is effective in buffering a genetic risk of EPs rather than in buffering against a detrimental home environment. In the light of existing research showing that genetic variation is most relevant for enduring EPs (e.g., Lewis and Plomin 2015), this result is not surprising, as we addressed this aspect by measuring EPs 2 years later.

That for such a specific outcome like EPs not all ECEC quality characteristics play a durable role for child development should not be surprising. That we could at least identify one, namely better training of educators, should therefore not give reason for disappointment. Rather, it gives valuable information for clearly targeted policy interventions. That we did not find more significant moderation effects is not only due to the complex modeling approach we applied. Most bivariate correlations between ECEC quality and EPs were small and nonsignificant (see Table 2). We also tested the association between ECEC quality and EPs in a multilevel regression analysis without finding any effects (not presented, results are available upon request).

Moreover, and also relevant for educational policy considerations, our moderation analyses point to processes of gene–environment correlation (rGE), i.e., associations between the genetic risk of EPs and the child–staff ratio. It is difficult to determine the exact mechanisms underlying this correlation, and further research is needed to understand why children at a higher genetic risk of EPs are more often found in daycare centers with a higher child–staff ratio. Differences in the selection process of children with higher and lower genetic risks into different ECEC environments may relate to differences in the availability of and accessibility to high-quality ECEC for children from different social backgrounds. For example, children in disadvantaged families are overall more likely to develop problem behaviors (e.g., Lansford et al. 2019). If ECEC centers in residential areas where disadvantaged families are over-represented are more often characterized by lower ECEC quality (compare Burchinal et al. 2014 for similar results for the USA), this would result in more children at a genetic risk for EPs being found in lower-quality ECEC centers. Stahl et al. (2018) did indeed find more children from disadvantaged backgrounds, such as children of parents with lower levels of education or from families with a migration background, attending lower-quality ECEC centers in Germany than children from other groups. Given that children from disadvantaged families are also said to react more sensitively to the quality of ECEC (e.g., Phillips and Lowenstein 2011), such a pattern suggests a double disadvantage for children from disadvantaged families with genetic predispositions for EPs. Where ECEC quality characteristics that are helpful to buffer risk for EPs, such as better opportunities for educator–child interactions, are most needed, these characteristics are often less readily available.

Taken together, improving educators’ training and ensuring the presence of a sufficient number of educators who can carry out beneficial activities with children appear to be a promising way of counteracting EPs in young children. Larger centers may have an advantage in this regard, as they increase the possibility, for instance, of funding a larger number of educators and cross-group educational support programs. This could explain why we did not observe any dedicated influences of ECEC center size or group size.

This study is not without its limitations. First, the identification of a child’s EPs through personal interviews (self-reports) appears rather suboptimal compared with observational data or data derived from multiple informants (e.g., Stanger and Lewis 1993). Although previous research showed that self-reports from children aged 6 to 10 can provide meaningful SDQ data (e.g., Curvis et al. 2014), in some studies the reliability of the SDQ self-report subscales was found to be modest in younger children (e.g., Di Riso et al. 2010). In the current study, the variable “externalizing problems” has approximately a normal distribution and model fit statistics showed that the estimated model (CFA) to predict a child’s EPs fitted the observed data well. Nevertheless, future research should corroborate our findings using an alternative scale, such as the Child Behavior Checklist (e.g., Achenbach 2011), and multiple informants when possible. The second limitation is that, despite a higher response rate than in other comparable German surveys of ECEC centers, our analyses are based on a small sample and therefore suffer from low power to detect moderation effects, which require more power than is needed to identify main effects (Rutter and Silberg 2002). Therefore, these analyses are only a first step and indicate the need for further research in this area. Third, owing to a lack of sufficient power, it was not possible to look at the interaction of several ECEC characteristics and their joint effect on EPs. This is a significant limitation because the environmental conditions associated with the measured ECEC characteristics typically occur together and in various combinations, and their effects might thus depend on each other. Future research needs to go beyond considering ECEC quality characteristics separately in their analysis to see the importance of ECEC quality for children’s development. Again, much larger samples are needed in this case. 
Fourth, with respect to quality measures, the orientation quality of staff at the ECEC centers and more process quality-related measures should be considered in more detail. This should include gathering information from all educators, if possible, and conducting observational studies as the most appropriate approach to study process quality. Fifth, future research should focus on the possible differences in the outcomes presented for boys and girls and possible differences in the relevance of the variance components between socioeconomic status groups. Such desirable differentiations would require larger case numbers, which were not available here. Finally, future research may look more closely at the influence of combinations of ECEC quality and home environments on EPs to further study the mechanisms underlying the estimated variance components.