1 Introduction

Based on evolutionary reasoning, Trivers and Willard (1973) predicted that maternal condition should affect both offspring sex composition at birth and sex-selective parental investment post partum in a number of species, including humans. Specifically, son-favoritism in parental resource allocation should result in higher reproductive fitness for parents in good condition and daughter-favoritism for parents in bad condition. Whereas in animal studies on the Trivers–Willard (TW) hypothesis, “condition” is usually operationalized on a physiological level, Trivers and Willard suggested that socioeconomic status was a good proxy for maternal, or parental, condition in humans (Schnettler 2017). For humans, the TW hypothesis therefore predicts an interaction between parental status and offspring sex composition as well as parental investment (for a schematic visualization, see Fig. 1). Even though they did not suggest any specific mechanisms, Trivers and Willard (1973) argued that over evolutionary time, mechanisms may have evolved that could drive the predicted association if three conditions had been met in our species’ ancestral past: (a) a correlation of offspring condition with maternal condition during the phase of parental investment, (b) persistence of differences in offspring condition until adulthood, and (c) a stronger beneficial effect of improvements in maternal condition on the reproductive success of male rather than female offspring (Trivers and Willard 1973). Human males, compared to males of other species, display a relatively high degree of paternal investment. That is, the difference in the cost of reproduction between the sexes and thus the expected difference in variance of reproductive success between male and female offspring are lower than in other species (Trivers and Willard 1973; Trivers 2002). Nevertheless, sociobiological research provides support for all three conditions in human societies (Low and Clarke 1992; Røskaft et al. 1992; Klindworth and Voland 1995; Voland 1995; Scott and Duncan 1999).

Fig. 1

Schematic representation of the Trivers–Willard hypothesis

Even for contemporary societies, sociological research provides plenty of empirical support for conditions (a) and (b). In fact, the persistent social inequalities over generations, even if fluid to a certain degree, have rendered social inequality a foundational topic of sociology today (Erikson and Goldthorpe 1992; Grusky 1994; Breen 2004). And in the course of the past few decades, life-course sociology has provided a plethora of empirical studies evidencing linkages between early conditions and adult-life outcomes (Mayer 2000, 2009). More critical is the third condition: Empirical evidence supports that across cultures, females more so than males value high socioeconomic status as an important mate characteristic (Shackelford et al. 2005). Yet in many contemporary, developed societies, fertility has reached historical lows (Morgan 2003; Billari and Kohler 2004), thus limiting the potential variance to be explained by social status. In countries with structural constraints against combining work and family, many high-status women postpone childbearing, resulting in particularly low levels of fertility (see Kaufmann et al. 1998, 2002; Sobotka 2008). This appears to be at odds with condition (c). But evolutionary psychologists argue that not all conditions need to be met in contemporary societies. Instead, for major physiological and psychological mechanisms orchestrating the TW effect to have evolved and thus to still be active in contemporary societies, it would be sufficient for the conditions to have held long enough during our species’ evolutionary, ancestral environment (see Hopcroft 2005, pp. 1114 f.).

Empirical studies on both aspects of the TW hypothesis, sex composition and parental investment, have yielded mixed results (Smith et al. 1987; Gaulin and Robbins 1991; Betzig and Weber 1995; Chacon-Puignau and Jaffe 1996; Freese and Powell 1999, 2001; Keller et al. 2001; Kolk and Schnettler 2016; Koziel and Ulijaszek 2001; Gibson and Mace 2003; Hopcroft 2005; Almond and Edlund 2007; Mackey and Immerman 2008; Cameron and Dalerum 2009; Schnettler 2010, 2013). James (2012) therefore concludes, “The Trivers–Willard hypothesis has had only limited success. […] At present there seems too much evidence in its favour for this hypothesis to be rejected, and too much against it, for it to be accepted” (James 2012, p. 183). Not only is the empirical status of the hypothesis yet undetermined, but the potential relevance of the TW effect to both evolutionary biology and the social sciences is too high for the inquiry to be given up prematurely: If boys and girls were born with different probabilities into high- and low-resource contexts, and if they experienced different levels of parental support, this could have far-reaching effects on gender and social inequality at later ages (cf. Edlund 1999). Therefore, more research on the potential mechanisms mediating the relationship between maternal resource conditions and outcome patterns in sex composition and parental investment is needed. Trivers and Willard (1973) left the question of specific mechanisms open. But for an explanation to be complete, the proximate mechanisms responsible for driving the TW effect need to be understood, and to this purpose a more analytical approach seems necessary (cf. Mathews 2011; Salmon and Hehman 2020).

With regard to the first part of the TW hypothesis, physiological mechanisms driving sex composition could be conceived to operate at various stages of development: Spermatozoa could be equipped with different survival chances depending on whether they carry an X or Y chromosome, or they could have different chances of fertilizing the ovum; male and female fetuses could have different degrees of vulnerability during pregnancy; and/or sex-biased parental investment could lead to differential infant or child survival (Stinson 1985; Wells 2000; see Lazarus 2002). Recent research has provided promising cues for detecting potential mechanisms, and Cameron (2004) emphasized the importance of correctly specifying the timing of status measurement when measuring the underlying associations: Studies in which maternal condition is measured closer to conception are more likely to be consistent with the TW hypothesis, whereas studies in which condition is measured after conception or even after birth are less likely to be consistent with the TW hypothesis. For human males, tentative evidence for a similar timing effect has been reported for a sample of billionaires (Schnettler 2013). Furthermore, researchers have narrowed down the pool of potential hormonal mechanisms playing a role in sex determination (Grant and Irwin 2005; Cameron and Linklater 2007; Grant 2007; Grant et al. 2008; James 2008; Grant and Chamley 2010). Noteworthy is the theoretical model by Grant and Irwin (2009), in part supported by experimental research by the same author team, that combines two mechanisms: Prior to conception, the maternal level of follicular testosterone affects the shape of the outer rim of the ovum, thus making it more or less likely that a spermatozoon carrying either an X or Y chromosome fertilizes the egg. During the gestational period, adverse conditions in the physical and social environment can alter the sex ratio at birth, given that male fetuses are more fragile with regard to these outside influences than female ones.

With regard to the second part of the TW hypothesis that pertains to parental investment, the issue is more difficult (Schnettler 2017). If, as Trivers and Willard suggested, parental investment is responsive to maternal condition in the way suggested above, then we would expect some psychological mechanism that mediates the influence of (perceived) parental status and offspring sex on parental investment (Mathews 2011; Kolk and Schnettler 2013). Most empirical studies on the psychological TW effect look at single investment indicators. Overall, Keller et al. (2001) identified only eight studies that specifically tested the TW hypothesis on data on parental investment. More recent studies take into account a wider variety of parental investment indicators, status measures, and/or more appropriate research designs to facilitate within-family comparisons (Freese and Powell 1999, 2001; Schnettler 2010). Although these studies find no support for the TW hypothesis, their evidence is not conclusive. The focus on single investment indicators, even sets of investment indicators, is problematic because for evaluating the TW hypothesis it would be necessary to look at the total amount of parental investment across all investment domains. This is because parents may give preferential treatment to one child in one investment domain but to another child in another domain, yielding equal treatment overall. Or, vice versa, it could be that actual favoritism remains hidden across a multitude, but selective set, of single investment indicators (cf. Schnettler 2010). An additional problem is that correlational studies may be fraught with the problem of unobserved heterogeneity, limiting the potential for causal arguments regarding the TW effect (Mathews 2011). Therefore, research should direct attention to the (unconscious) parental sex preferences for their offspring that underlie and potentially steer parental investment (cf., Cronk 2007).

To my knowledge, very few studies have measured a TW effect in parental preferences. In one such study, the author examined offspring sex preferences of college students in the United States. He used mortality priming to trigger thoughts related to a limited-resource situation but found no TW effect in students’ sex preferences (Mathews 2011). Given that the study involves a sample that is rather homogeneous with regard to respondents’ socioeconomic status (Mathews 2011, p. 15), it remains unclear whether the reported finding is a true null result or due to a failed manipulation. Although the author cites research that shows mortality priming to affect indicators of reproductive strategy (Mathews 2011, p. 16), it could be that mortality-related thoughts are not sufficient to change a respondent’s own status perceptions and thus do not affect sex preferences in a direction predicted by the TW hypothesis. Kolk and Schnettler (2013) instead drew on Swedish population register data with an approach common in demography to study population-level sex preferences for offspring (e.g., Hank and Kohler 2000; Andersson et al. 2006; Hank 2007) and differences in these preferences across status groups (Yamaguchi and Ferguson 1995; Andersson et al. 2007). This approach involves looking at fertility stopping to study whether parents of higher socioeconomic status are more likely than parents of lower socioeconomic status to have an additional child if they have no or few sons, as well as whether the reverse pattern can be observed if parents have no or few daughters. Using a broad set of different status indicators, including wealth, income, education, and socioeconomic class, they did not find differences across status groups (Kolk and Schnettler 2013).
Although this study is incisive in showing that no TW effect can be found in fertility preferences, the indicator may lack the necessary sensitivity to capture the TW effect, given that existing sex preferences may be hidden behind strong preferences for a fixed family size. For instance, although a rich family with two daughters may have a preference for sons, a stronger preference for a small family may have led the couple to decide not to have more children despite the fact that they do not yet have a son. Completed fertility therefore is only a very rough indicator of actual parental preferences regarding offspring sex. Another reason why this study may have led to a null result is that Sweden is a country with relatively low social inequality. Although status inheritance in Sweden is still considerable, differences are lower than in many other countries (Breen and Jonsson 2005). If differences in socioeconomic status are relatively small, so may be differences in sex preferences.

In addition to the limitations discussed in the previous paragraph, strictly speaking, fertility preferences need to be distinguished from parental investment preferences. Here, I add to this existing line of research on a possible psychological TW mechanism and provide an alternative way to capture parents’ attitudes about differential treatment of sons and daughters. This approach is thus more directly focused on parental investment rather than fertility preferences. In a survey experiment, administered to just over 2000 respondents and conducted as part of Time-sharing Experiments for the Social Sciences (TESS; see footnote 1), respondents were presented with two different fictitious scenarios of parental investment: In these scenarios, two siblings, varied randomly with regard to their sex, birth order, and birth weight, either did or did not receive a particular form of parental investment. After the respective vignette had been presented to respondents, they were asked to judge how fair they considered the investment decision of the fictitious parents described in the vignette to be. The study serves to test whether parents judge differential parental treatment as more or less fair, depending on a combination of their own socioeconomic status and the sex of the children described in the vignettes. If the results were consistent with the TW hypothesis, we would expect that high-status parents would judge parental favoritism to be relatively fairer as compared to low-status parents if the favored child is a son. And the reverse should be true if the favored child is a daughter. This study combines the strength of a representative social survey with that of an experimental study (cf. Mutz 2011): The survey experiment is administered to the general U.S. adult population, thereby providing coverage across a broad and representative set of social status groups; also, it allows the researcher to randomly vary the treatment variable, in this case the characteristics of the fictitious children in the parental investment vignettes. Survey experiments have the added advantage that they reduce, if not eliminate, social desirability biases (Auspurg et al. 2009a; Nisic and Auspurg 2009). In this case, reducing this bias is necessary because parents, when asked directly whether they treat their children differently, would likely bias their answers according to a perceived social desirability for treating children equally.

2 Material and Methods

As part of a survey experiment, two parental investment scenarios were presented to a random subsample of a larger online panel consisting of about 50,000 U.S. households. This and other studies were conducted by Knowledge Networks for TESS (see footnote 1), a National Science Foundation–funded research initiative. The sample is representative of the U.S. resident population and avoids typical sample biases due to limited computer and/or Internet coverage by providing laptops and free Internet access to those respondents lacking the respective technology in their homes (Knowledge Networks, n.d.). The maximum number of respondents researchers can get access to is limited by TESS. The exact limit depends on the length of the vignettes and the number of survey questions in the applied research design. For the current study, the available maximum number of respondents yielded a net sample of 2016 adult respondents who completed the survey experiment. This number of respondents corresponds to a participation rate of about 66.4%. Data collection took place between October 15 and November 4, 2010.

The experimental design consists of two vignettes with a combination of eight conditions of child characteristics each. Child characteristics that are randomly assigned are birth order of siblings in a fictitious sibling dyad, sex of the focal child, and sex of the second child. The latter two variables that indicate the sex of both children were included to allow for a test of the TW hypothesis. Because generating the vignettes required mentioning the birth order of the two children, this variable was included in the experimental manipulation to control for potential birth order effects (Brody et al. 1992; Sulloway 2001; Rohde et al. 2003; Sakata et al. 2022). In the first vignette, the parental investment indicator is breastfeeding. That is, a fictitious sibling pair is described in which one sibling is breastfed and the other one is formula fed:

“When Sarah had her [first | second] child—her [son Andy | daughter Jennifer]—she nursed [him | her] for the first 12 months. It wasn’t always easy to manage the demands of nursing given the strains of her busy work schedule—but she was more than happy to go through these difficulties given the assumed benefits of nursing for her [son’s | daughter’s] development. But at the time when Sarah had her [second | first] child two years [later | earlier]—her [daughter Caroline | son Bob]—she felt that she didn’t have the energy to go through the efforts and exhaustion that go along with breastfeeding. Thinking that it couldn’t be so bad if she didn’t breastfeed [Caroline | Bob], [this time | back then] she decided that it would be equally fine to formula feed [her | him] instead.” (Text of Vignette 1, 2×2×2 design, three experimental conditions highlighted using different font styles)
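The 2×2×2 structure of Vignette 1 can be made concrete with a short enumeration. The following is a minimal sketch in Python, not the survey provider's actual implementation: the function and variable names are hypothetical, and drawing each cell with equal probability is an assumption, since the text states only that the characteristics were assigned randomly. The factor levels and the children's names are taken from the vignette text.

```python
from itertools import product
import random

# Factor levels taken from the text of Vignette 1.
BIRTH_ORDER = ("first", "second")   # birth order of the focal (breastfed) child
FOCAL_SEX = ("son", "daughter")     # sex of the focal child
SECOND_SEX = ("son", "daughter")    # sex of the other (formula-fed) child

# Names used in the vignette for each role and sex.
FOCAL_NAME = {"son": "Andy", "daughter": "Jennifer"}
SECOND_NAME = {"son": "Bob", "daughter": "Caroline"}


def all_conditions():
    """Return the eight cells of the 2x2x2 design as tuples."""
    return list(product(BIRTH_ORDER, FOCAL_SEX, SECOND_SEX))


def assign_condition(rng):
    """Draw one cell uniformly at random (assumed assignment rule)."""
    return rng.choice(all_conditions())


def describe(condition):
    """Summarize a cell in words, for example for a codebook entry."""
    order, focal_sex, second_sex = condition
    return (f"{order}-born {focal_sex} {FOCAL_NAME[focal_sex]} breastfed; "
            f"{second_sex} {SECOND_NAME[second_sex]} formula fed")
```

Calling `describe(assign_condition(random.Random()))` yields a one-line description of the cell a hypothetical respondent would see.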

In the second vignette, parental investment is operationalized as the amount of time parents spend with their children. Here, a fictitious situation is described in which one child was born with low birth weight (LBW) and one with normal birth weight (NBW). An effect comparison between Vignette 1 and Vignette 2 would therefore allow for detecting endowment effects (Schnettler 2010; cf., Hsin 2012). Both vignettes were presented to all respondents. But the sequence in which they were presented was varied randomly to account for order effects (see Auspurg et al. 2009b). To further reduce a potential social desirability bias with regard to the question of differential parental treatment of a disadvantaged child, respondents were introduced to three time-investment strategies that were described as equally legitimate: Compensation by spending more time with the LBW child, reinforcement by spending more time with the NBW child, and equal time investment. The fictitious parents in the vignette were then presented as having decided for the latter strategy of equal time investment.

“Susan and John have two children. Their [first-born | second-born] [daughter Allison | son David] was born with low birth weight, whereas their [second-born | first-born] [daughter Lisa | son Paul] was born with normal birth weight. Because Susan and John both work full-time, time is a scarce resource for them. Therefore, they want to make sure to use the little time they have with their children in an optimal way. But when they asked friends, relatives, and acquaintances for advice, everyone seemed to have a different opinion of what is best: Some told the two parents to spend the same amount of time with each of their children so that everything could turn out in a natural way. Others told them to spend more time with their [first-born | second-born] [daughter Allison | son David], the child born with lower birth weight, in order to reduce the different developmental potential between the two children. And yet others told them to spend more time with their [second-born | first-born] [daughter Lisa | son Paul], the child born with normal birth weight, because this is the child with the highest developmental potential and therefore Susan and John’s time investment would have a larger effect. To the two parents, all arguments sounded reasonable, but in the end they decided for the first recommendation, that is, to spend the same amount of time with [Allison | David] and [Lisa | Paul].” (Text of Vignette 2, 2×2×2 design, three experimental conditions highlighted using different font styles)

After the presentation of each vignette, respondents were asked to rate the fairness of the parental investment decision. That is, after the presentation of Vignette 1 they were asked to evaluate Sarah’s decision to breastfeed only one child, and after the presentation of Vignette 2 they were asked to evaluate Susan’s and John’s decision for equal time investment. Each of the two fairness ratings was based on two measures to evaluate the propriety of fairness norms and one measure to evaluate the validity of the fairness norm (see footnote 2). All of these separate ratings were assessed on an 11-point scale, ranging from −5 to +5. Cronbach’s alpha for combining the three ratings into one index is high for both vignettes (0.83 for Vignette 1; 0.79 for Vignette 2). Therefore, a combined index was built for each vignette from the arithmetic mean of the three respective fairness ratings. The resulting two fairness indices represent the key dependent variables for the subsequent regression analysis (see Fig. 2 for their distribution). Following the fairness ratings after each vignette, a manipulation check was conducted to estimate whether respondents correctly remembered the sex and birth order of the focal child in the respective vignette. In Vignette 1, this is the child who received breastfeeding. In Vignette 2, it is the one with LBW. Respondents were also asked how important they think breastfeeding and parental time investment are for the development of children. These efficacy beliefs about parental investment are included because they should matter with regard to the strength of the TW effect: If the TW effect actually existed, it should be stronger for more consequential forms of parental investment, or forms of investment that are believed to be consequential for child development (see Scott and Duncan 1999 for a historical example that illustrates the importance of distinguishing between parental investment beliefs and behaviors with regard to the TW effect).
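The index construction described above (a reliability check followed by averaging the three ratings) can be sketched as follows. This is an illustrative Python sketch using the standard formula for Cronbach's alpha; the function names and the toy ratings are hypothetical and are not the study's data or code.

```python
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of per-respondent rating tuples.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of sum score)
    """
    k = len(ratings[0])
    item_vars = [variance([r[i] for r in ratings]) for i in range(k)]
    total_var = variance([sum(r) for r in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def fairness_index(rating_triplet):
    """Arithmetic mean of the three fairness ratings (each from -5 to +5)."""
    return mean(rating_triplet)


# Toy ratings for four hypothetical respondents (not real data).
toy = [(-4, -3, -5), (0, 1, -1), (3, 2, 4), (5, 4, 5)]
alpha = cronbach_alpha(toy)
indices = [fairness_index(r) for r in toy]
```

With perfectly consistent items, for example `[(1, 1, 1), (2, 2, 2), (3, 3, 3)]`, the function returns an alpha of 1, which is the upper bound of the coefficient.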

Fig. 2

Distribution of dependent variables for both vignettes. Note: The figure shows that the distribution of fairness ratings is quite distinct across the two vignettes. Whereas fairness ratings for Vignette 1 are roughly balanced across lower and higher ratings, the respective ratings for Vignette 2 are highly skewed toward higher fairness ratings. One possible reason is that the wording of Vignette 2 may more strongly make respondents aware of the widespread equal-treatment norm when it comes to parental investment. It may thus be more strongly affected by social desirability bias than expected, an issue that could also be examined by studying a larger number of different parental investment scenarios (see discussion for other arguments speaking for including a larger number of investment scenarios). Another reason could be that efficacy beliefs regarding breastfeeding are more mixed than those for time investment (see Results and Discussion for further details)

The TW hypothesis was tested in a series of linear regression models. As some of the model assumptions were violated, a quantile (median) regression was conducted as a robustness check (see footnote 3). Dependent variables were the fairness ratings obtained for the two vignettes, respectively. The main explanatory variables in the model were sex of the fictitious child receiving parental investment, respondents’ socioeconomic status, and the interaction between the two. Because a theory about the most appropriate status measure for evaluating the TW effect is lacking (Kolk and Schnettler 2013; Schnettler 2017), here I included different status indicators that were available in the dataset: education and income. The education variable has four categories: less than a high school degree (“<high school,” reference category in the regression models), high school degree, more than a high school degree (e.g., high school plus vocational training or some college), or a bachelor’s degree or higher (“bachelor’s degree +”). Income is divided into 19 classes. The intervals start with incomes below US$5000, followed by US$2500 increments up to just below US$15,000. Subsequent intervals increase in US$5000 increments up to just below US$40,000, then in US$10,000 increments up to just below US$60,000. After that, the increments vary between US$10,000 and US$15,000 up to just below US$100,000. The final intervals are in US$25,000 steps up to an income just below US$175,000, with the last interval covering incomes of US$175,000 and higher. This income categorization was designed by the survey providers to approximate a normal distribution and is thus treated as a metric variable in the regression models.
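The income classification described above can be written down as a lookup over class boundaries. The following Python sketch is illustrative only: the function name is hypothetical, and the two cut points between US$60,000 and US$100,000 (here US$75,000 and US$85,000) are an assumption, because the text states only that the increments in that range vary between US$10,000 and US$15,000. All other boundaries follow directly from the description.

```python
from bisect import bisect_right

# Lower bounds (US$) of income classes 2 to 19; class 1 is "below 5,000".
LOWER_BOUNDS = [
    5_000, 7_500, 10_000, 12_500,            # 2,500 steps up to just below 15,000
    15_000, 20_000, 25_000, 30_000, 35_000,  # 5,000 steps up to just below 40,000
    40_000, 50_000,                          # 10,000 steps up to just below 60,000
    60_000, 75_000, 85_000,                  # assumed cut points up to just below 100,000
    100_000, 125_000, 150_000,               # 25,000 steps up to just below 175,000
    175_000,                                 # open-ended top class
]


def income_class(income):
    """Map an annual household income in US$ to its class number (1 to 19)."""
    return bisect_right(LOWER_BOUNDS, income) + 1
```

Treating the resulting class number as a metric variable, as in the regression models, means that a one-unit change corresponds to moving up one class rather than to a fixed dollar amount.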

Further controls were for the additional experimental conditions (birth order, second child in vignette of same or different sex). Inclusion of the sex of the second child also made it possible to test whether respondents rate unequal treatment as more or less fair in same- vs. other-sex sibling dyads, as one could argue that the respective TW mechanism would only be activated in the latter (Schnettler 2010). In addition, a dummy variable captured the order in which the two randomly ordered vignettes were presented to a particular respondent. I also controlled for a number of respondent characteristics. First, this includes age, which helps capture possible life-course effects (Mayer 2000; cf. Mayer 2009): Depending on career and family status, for instance, respondents may have more or less time and financial resources available, thus making them more or less understanding of certain forms of differential treatment. Second, this includes the status of being a parent, as one’s own parenting experience may give parents a sense of how realistic equal treatment is on a daily basis. In addition, it gives them a sense of how sensitively children react to differences in parental investment. Third, this includes the sex of respondents, as previous research has provided evidence for a benefit of same-sex parent-child dyads in terms of parental investment (see Lye 1996). Fourth, this includes respondents’ efficacy beliefs with regard to the respective type of investment. These were assessed on an 11-point scale. In the case of Vignette 1, respondents were asked, “Which infant feeding method do you think is better for a child?” Responses were rated from −5 (“breastfeeding is better”) to +5 (“formula feeding is better”). In the case of Vignette 2, respondents were asked, “How strongly can differences in the amount of time parents spend with their children affect a child’s development, do you think?” The scale ranged from 0, “not at all,” to 10, “very strongly.”
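The logic of the key test, the interaction between respondent status and child sex, can be illustrated in a deliberately simplified form. The Python sketch below assumes a binary status indicator and no control variables, which is not the specification used in this study (four education categories, 19 income classes, and several controls). In that saturated two-by-two case, the interaction coefficient of a linear regression equals the difference between the child-sex gaps in mean fairness ratings of the two status groups. The function name and the toy values are hypothetical.

```python
from statistics import mean


def interaction_contrast(rows):
    """Difference-in-differences of cell means.

    rows: iterable of (high_status, child_is_male, fairness) with 0/1 flags.
    Returns (male - female gap among high status) minus
            (male - female gap among low status).
    """
    cells = {}
    for high, male, fairness in rows:
        cells.setdefault((high, male), []).append(fairness)
    m = {key: mean(values) for key, values in cells.items()}
    gap_high = m[(1, 1)] - m[(1, 0)]
    gap_low = m[(0, 1)] - m[(0, 0)]
    return gap_high - gap_low


# Toy ratings (not real data): the child-sex gap changes sign across status groups.
toy_rows = [
    (0, 0, -2), (0, 0, -1),   # low status, disadvantaged child female
    (0, 1, 0), (0, 1, 1),     # low status, disadvantaged child male
    (1, 0, -1), (1, 0, 0),    # high status, disadvantaged child female
    (1, 1, -3), (1, 1, -2),   # high status, disadvantaged child male
]
contrast = interaction_contrast(toy_rows)  # -4.0 for the toy values above
```

A contrast of zero would indicate that the child-sex gap in fairness ratings is the same in both status groups; a nonzero contrast indicates that the gap depends on status, which is the kind of pattern the interaction term is meant to detect.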

3 Results

Presentation of the vignettes and the respective fairness ratings was followed by manipulation checks to see whether respondents correctly attributed the sex and birth order of the focal child, that is, in Vignette 1, the breastfed child, and in Vignette 2, the child with LBW. Roughly one-third of the respondents made incorrect attributions of either sex or birth order (Table 1). The percentage of misattributions is a bit lower for birth order in Vignette 2 (25%). For the TW effect, it should matter whether respondents actually register the occurrence of sex-selective parental investment in the fictitious scenarios. This underscores the importance of including respondents’ performance in the manipulation check as control, as I do in the multivariate versions of the regression models presented below.

Table 1 Descriptive results of manipulation check (number and percentage of correct, incorrect, and missing attributions of child characteristics)

Tables 2 and 3 present the results of bivariate regression models on the respective fairness ratings for Vignettes 1 and 2. In addition to coefficient estimates, these tables display the amount of explained variance for each bivariate regression model.

Table 2 Results of bivariate regression models for Vignette 1 (intercepts omitted)
Table 3 Results of bivariate linear regression models for Vignette 2 (intercepts omitted)

The bivariate regression coefficients indicate that in Vignette 1, the perceived fairness of the discriminatory breastfeeding scenario decreases with increasing education or income levels, and these effects are statistically significant. However, the comparison between the second-lowest and the lowest educational levels is not significant. In Vignette 2, the effects are reversed, which is consistent since this scenario involves equal treatment. Overall, the main effects show that individuals of higher status, on average, perceive unequal treatment as more unfair than individuals of lower status do.

In both vignettes, the main effects of child sex, birth order, and whether the two children presented in the respective vignettes are the same sex are not statistically significant. In Vignette 1, this is also true for the manipulation check indicator, and in Vignette 2, for respondent age. Higher values in the variable measuring beliefs about breastfeeding indicate that respondents believe formula is better than breastmilk. Therefore, the main effect in Table 2 indicates that as respondents’ preference for formula increases by one unit, the perceived fairness of one child in the vignette not being breastfed increases by a value of 0.306.

For beliefs about time investment, higher values indicate respondents’ belief that differences in time spent with children more strongly affect child development. Therefore, the main effect of the respective variable in Table 3 indicates that as respondents’ belief in the efficacy of time investment increases by one unit, their perceived fairness of equal treatment increases by a value of 0.057. In both tables, the main effects for the variable distinguishing parents from nonparents show that parents perceive unequal treatment as more unfair (Vignette 1) and equal treatment as fairer (Vignette 2).

The effect of respondent sex is inconsistent, as female respondents seem to be more accepting of unequal breastfeeding but also prefer equal time investment. In both vignettes, the order of the vignettes seems to play a role, leading to higher fairness ratings when Vignette 2 is presented first.

Notably, as indicated in the last column of Tables 2 and 3, parental investment–related beliefs account for a significant portion of the explained variance, with 13.8% for breastfeeding and approximately 9.7% for time investment. This substantial contribution suggests that the increase in shared variance observed in the more comprehensive linear regression models (Tables S1–S4), which include multiple variables, is largely driven by these parental investment beliefs.

Figure 3 shows predicted fairness ratings from linear regression models representing the relationship between fairness ratings and the interaction between respondent status and sex of the fictitious child in the respective vignettes (including the main effects of respondent status and child sex). The graph panel shows four diagrams. The two top diagrams refer to Vignette 1, and “sex of the child” refers to the child in the vignette not receiving breastfeeding, that is, the formula-fed child. The two bottom diagrams refer to Vignette 2, and “sex of the child” refers to the child born with low birth weight. Results in the left column of Fig. 3 relate to education, and the results in the right column to household income as respondents’ status measure.

Fig. 3

Predicted fairness ratings for selected regression models (by vignette and respondent sex). Fairness ratings by offspring sex, parental status, and vignette. a Parental education, Vignette 1; b parental income, Vignette 1; c parental education, Vignette 2; d parental income, Vignette 2. Note: Education is distinguished into four outcome categories: less than a high school degree (“1<HS”), “high school degree” (“2HS”), more than a high school degree (e.g., with additional vocational training or some college, “3HS+”), and bachelor’s degree or higher (“4BA+”). The variable “household income” is based on 19 income classes. For a more detailed description of this variable, see the main text. lbw stands for low birth weight, m for male, and w for female

Temporarily setting aside variations related to sex of the child, a distinct main effect of respondent status becomes evident. Across all graphs, it is observable that respondents of higher status, regardless of the specific measure of status, consistently perceive unequal treatment as less fair. Vignette 1 illustrates a case of unequal treatment. Consequently, the observed decrease in fairness ratings with rising status implies that individuals of higher status are more inclined to oppose unequal breastfeeding decisions among siblings. Conversely, Vignette 2 presents a situation of equal treatment. In this context, the trend of increasing approval lines suggests that individuals with higher status are more likely to favor equal time investment in caregiving. Thus, the multivariate results replicate the general pattern already found in the bivariate associations.

Now, with a focus on testing the TW hypothesis, we examine the variations between lines within each graph. Figure 3a illustrates that those individuals with less than a high school education perceive unequal treatment as less fair when the “disadvantaged” child (who did not receive breastfeeding) is female, compared to when the child is male. This trend reverses among individuals with higher educational qualifications. Those with a high school diploma tend to view the unequal treatment of boys and girls as similarly unfair, while individuals with education beyond high school perceive unequal treatment as less fair when the “disadvantaged” child is male. This aligns with the TW hypothesis. A similar pattern emerges when considering household income as a measure of status (Fig. 3b). Respondents from lower income brackets find the unequal treatment of girls to be less fair than that of boys, whereas respondents from higher income brackets exhibit the opposite trend.

In Fig. 3d, a similar pattern emerges, though the lines show a positive slope. Respondents from lower income brackets perceive equal treatment as less fair when the child born with LBW is female rather than male. As household income increases, the disparity in fairness ratings based on the child’s sex diminishes. At even higher income levels, the trend reverses: Respondents rate equal treatment as less fair when the child with LBW is male rather than female. This observation also aligns with the TW hypothesis. In terms of education, the pattern is less distinct, as the lines representing male and female children intersect twice (Fig. 3c).

The main plots in Fig. 3 offer a detailed view of a narrow segment of the dependent variables, while each plot also includes a smaller inset depicting the full range of these variables on the y‑axis, from −5 to 5. This expanded perspective reveals that the observed differences, though generally consistent with the TW hypothesis, are minor. The models in Fig. 3 use basic linear regression, incorporating only respondent status, child sex, and their interaction as independent variables (referred to as “Linear (1)” in Tables S1–S4 in the Online Appendix). The regression tables in the Online Appendix indicate that the interaction effects, which could substantiate the TW hypothesis, lack statistical significance at conventional levels across all combinations of vignette and respondent status measure. However, the main effect of respondent status does achieve conventional statistical significance in most scenarios. Looking at the interaction effects in more detail, Table S1 (see Online Appendix) shows that the interaction effect for the two highest educational categories, compared to the lowest educational category, is marginally significant (p < 0.1). This observation persists even after adjusting for a range of relevant covariates (“Linear (2)” in Table S1 in the Online Appendix) but does not hold up under a robustness test: In the quantile regression, these interaction effects do not even approach significance (p > 0.1). Regarding the combination of income as respondent status measure and Vignette 1, the situation is reversed: The interaction effect between income and child sex is not statistically significant in either linear regression model but approaches significance in the quantile regression (p < 0.1). For Vignette 2, however, the status–sex interaction coefficients are neither statistically significant nor do they approach significance.
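As a sketch, the “Linear (1)” specification described above can be written as follows. The symbols are notation assumed here for illustration and do not appear in the regression tables: F_i is the fairness rating of respondent i (ranging from −5 to 5), S_i is respondent status, and D_i is an indicator for the sex of the child in the vignette.

```latex
\begin{equation*}
F_i = \beta_0 + \beta_1 S_i + \beta_2 D_i + \beta_3 \,(S_i \times D_i) + \varepsilon_i
\end{equation*}
```

In this notation, the main effect of status corresponds to \beta_1, and the TW-relevant status–sex interaction corresponds to \beta_3. For education, which is measured in four categories, S_i would be replaced by a set of indicator variables with the lowest category as reference, so that each higher category has its own interaction coefficient.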

In addition to status, we also see main effects of respondent sex and parental status (i.e., whether the respondent is a parent), with positive associations in most combinations of vignette and status operationalization (see Tables S1–S4). Compared to the bivariate associations, the direction of the association is thus reversed for Vignette 1, although the remaining association is very small. Also in contrast to the bivariate associations, the variable indicating whether the two children presented in the vignette are of the same sex plays a statistically significant role, but only for Vignette 1.

4 Discussion

This research was grounded in the TW hypothesis, which postulates that parental condition influences both offspring sex composition at birth and sex-selective parental investment post partum in various species, including humans. Specifically, it suggests a pattern of son-favoritism in resource allocation by parents in good condition and daughter-favoritism by those in less favorable conditions. Recognizing the mixed empirical support for the TW hypothesis in prior studies, this research responds to calls for a more focused exploration of the psychological mechanisms that might drive a TW effect, as suggested by Mathews (2011) and Schnettler (2013). Rather than quantifying actual parental investment, which can be balanced across various measures and influenced by social and cultural norms, this study aims to unearth the latent parental preferences. These preferences are inferred from fairness assessments provided by respondents in response to hypothetical parental investment scenarios presented in vignettes within a survey experiment. This approach shifts the emphasis from observable behaviors to underlying attitudes, offering a novel perspective on the TW hypothesis.

The results reveal nuanced patterns that partially align with the TW hypothesis. In vignettes representing (un)equal parental treatment scenarios, individuals with lower educational levels or income perceived unequal treatment as less fair when the disadvantaged child was female, while those with higher status levels showed the reverse pattern. This trend was evident in scenarios involving both breastfeeding and time investment. However, these observed differences, albeit in line with the TW hypothesis, were relatively minor. Our regression analyses further indicated that while the main effect of respondent status often reached conventional levels of statistical significance, the critical interaction effects between respondent status and child sex, which would directly support the TW hypothesis, generally lacked statistical significance. This was true across various combinations of respondent status measures and vignette scenarios, even after accounting for a range of covariates and conducting robustness checks through quantile regression.

Given this overall picture, James’s 10-year-old assessment of the TW hypothesis, cited in the introduction, still seems right: “At present there seems too much evidence in its favour for this hypothesis to be rejected, and too much against it, for it to be accepted” (James 2012, p. 183). It seems that a satisfying answer to the empirical puzzle posed by Trivers and Willard (1973) requires us to continue the search for an underlying mechanism. Therefore, in addition to empirical studies showing where parental investment is patterned in alignment with the TW hypothesis and where it is not, we need theoretical work that organizes these findings and integrates ideas about potential mechanisms. I propose that there are three pivotal research directions that could significantly advance our understanding of whether the TW hypothesis should be refuted or upheld in contemporary advanced welfare states, and under which specific conditions.

First, many of the existing empirical studies on the TW hypothesis have previously not addressed the question of mechanisms directly, with some implying or explicitly suggesting that any evolved behavioral tendency must be mediated through a psychological mechanism. In response, my work, including that of Kolk and Schnettler (2013), has focused on examining parental preferences, which could reflect such psychological mechanisms. With the approach chosen here, I replicated well-established main effects of status on parental investment, independent of the sex of the children receiving parental investment, a pattern that, in sociology, has been attributed to different parenting styles (“natural growth” vs. “concerted cultivation”) between social classes (cf., e.g., Lareau 2003; Lareau and Weininger 2008). This finding seems to validate the approach taken in the present study. Consequently, future research should focus more on underlying preferences as proxies for a potential psychological mechanism. Factorial survey experiments, like the one conducted here, are particularly effective for this purpose (cf. Auspurg and Hinz 2015). They help minimize social desirability bias, which is a common issue in research exploring sensitive topics like discriminatory parental solicitude.

Yet it should be noted that status-dependent beliefs about parental investment and cost–benefit calculations are two contenders for an explanation of a possible status and child-sex interaction in parental investment. In fact, beliefs about parental investment contributed by far the largest share of variance explained in this study, as the coefficients of determination in the bivariate regression tables showed. And not only that: Beliefs about the efficacy of breastfeeding and time investment were also associated with respondent status. This illustrates how easily, even if a psychological TW effect existed, its effects could be superseded by culturally or socially shaped beliefs. Closely related to this point, besides differences in the association between social status and beliefs about parental investment, differences in cost–benefit calculations across cultures and historical times could also lead to an alignment with or deviation from TW-predicted investment patterns.

For scholars in behavioral ecology (Brown et al. 2011), a limitation of the present article may thus be its focus on a narrow version of the TW hypothesis. In this I followed the lead of other sociological treatments of TW-related parental investment patterns, assuming, if often only implicitly and in accordance with a variant of evolutionary behavioral science, that some form of “physiological processes before conception or during gestation and psychological processes after birth” drive the TW effect in sex composition or parental investment (Cronk 2007, p. 25). With this tradition in mind, the present study aimed to empirically explore a possible psychological mechanism underlying the TW effect. In contrast to this narrow interpretation, a broader view considers that parents invest more heavily in offspring who most significantly influence their reproductive success. Behavioral ecology posits that behavior is highly flexible, enabling individuals to adapt to varying environmental conditions and cost–benefit ratios regarding evolutionary fitness for different parental investment behaviors (Schnettler 2016). Parental investment patterns may therefore sometimes resemble those implied by the TW hypothesis and sometimes not, dependent on contextual conditions. For example, local-factor models predict that sex-biased parental investment depends on the degree of resource or mating competition, implying different costs of rearing children (Cronk 2007). This approach is not dissimilar to sociological models such as the value-of-children approach (e.g., Friedman et al. 1994; Nauck 2005), which also require a general ability to calculate costs and benefits (cf. Cronk 2007).

Theoretically, an ancestral, psychological mechanism driving the TW effect and a general, rational, thought-based mechanism taking into account culturally formed and socially influenced beliefs and preferences as well as cost–benefit considerations, both driving parental investment patterns, could even coexist, making it difficult to discern the two types of mechanisms in observational data on parental investment behavior. The methodological approach presented here is promising with regard to disentangling these different influences, as beliefs and cost–benefit scenarios can be made explicit and be statistically controlled for. The factorial survey design, although reliant on indirect cues rather than direct observation of parental investment behaviors, allows for isolating relevant factors in the assumed TW association while also reducing social desirability bias in the normatively charged domain of parental investment. Particularly promising would be to extend this type of research to cross-cultural comparisons, as this would allow varying of contextual factors that may affect respondents’ assessments of parental investment scenarios presented to them in the form of vignettes.

The second of the three pivotal research directions mentioned above is related to the type of parental investment. So far, there has been limited consideration of which indicators of parental investment might be most relevant. Schnettler (2010) proposed that if a genetically evolved mechanism for discriminatory parental investment existed, it would likely be most evident in evolutionarily ancient forms of investment, such as breastfeeding (cf. Cronk 2007). However, his study examining breastfeeding and parental time investment did not find robust support for the TW hypothesis. Conversely, U.S. evolutionary sociologist Rosemary Hopcroft (2005) suggests that in the modern United States and many other developed countries, investment in children’s education is the most critical form of parental investment. Hopcroft and Martin (2014, p. 237) state, “[S]o by investing in their children’s education parents can help ensure their children’s economic, social, and reproductive success.” Hopcroft and others have found evidence supporting the alignment of parental investment in children’s education with the TW hypothesis (Cox 2003; Hopcroft 2005; Hopcroft and Martin 2014, 2016; Song 2018). Since this form of investment, unlike breastfeeding, is evolutionarily novel, it again raises questions about the underlying mechanisms. For parents to make such informed decisions, reflexive decision-making is required (cf. Cronk 2007). Other evolutionary approaches to explaining human behavior, such as behavioral ecology and cultural evolution (for an overview see Schnettler 2016 and the introduction to this special issue), explicitly accommodate the role of cultural influences and the interplay between biological and cultural mechanisms in shaping fitness-enhancing behavior. Cultural change occurs much more rapidly than genetic evolution, enabling humans to adapt more swiftly to changes in the physical and social environments. Therefore, future research on the TW effect should more explicitly incorporate insights from cultural evolution studies.

A limitation of the present study, tied to the question of which indicators of parental investment are most relevant with regard to the TW hypothesis, is its reliance on just two parental investment scenarios to deduce respondents’ preferences. The methodological approach chosen in this study, if extended to a larger set of parental investment scenarios (e.g., financial support and education) and subjected to cross-country and cross-cultural comparisons, could illuminate how TW-like patterns depend on the type of parental investment and context conditions. Compared to studies examining actual parental investment behaviors, this indirect approach might better isolate specific factors, potentially contributing to a clearer distinction between the hypothesized mechanisms. For future research I recommend using a larger variety of investment scenarios, both to study the dependence of parental investment patterns on the type of investment (e.g., evolutionarily ancient vs. novel forms of investment; cf. Cronk 2007) and to increase statistical power. The latter seems crucial, considering the potentially subtle nature of the assumed TW effect. By including multiple types of parental investment scenarios, future research could overcome this study’s limitations. This extension would also make it relevant to the evolutionary and classical family sociology of parental investment in general, as it could help frame the nexus between social status, child sex, and parental investment more broadly than in the narrow version of the TW hypothesis often considered by sociologists.
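To illustrate why statistical power matters for a subtle interaction effect, the following minimal sketch estimates power by simulation for a simplified 2×2 design (low vs. high respondent status, male vs. female child). It is not the analysis used in this study: the function name, the effect size, the standard deviation, and the normal-approximation test are all hypothetical choices made for illustration.

```python
import random
from math import sqrt


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def simulate_power(n_per_cell, interaction, sd=2.0, n_sims=500, seed=1):
    """Estimate power to detect a status-by-child-sex interaction.

    Hypothetical setup: ratings in each of the four cells are drawn from
    normal distributions with a common standard deviation `sd`; only the
    high-status / female-child cell is shifted by `interaction`.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided test at alpha = 0.05, normal approximation
    rejections = 0
    for _ in range(n_sims):
        means = {}
        squared_se = 0.0
        for status in (0, 1):
            for female in (0, 1):
                cell_mean = interaction * status * female
                ratings = [rng.gauss(cell_mean, sd) for _ in range(n_per_cell)]
                m, v = mean_and_variance(ratings)
                means[(status, female)] = m
                squared_se += v / n_per_cell
        # The interaction estimate is a difference in differences of cell means.
        estimate = (means[(1, 1)] - means[(1, 0)]) - (means[(0, 1)] - means[(0, 0)])
        if abs(estimate / sqrt(squared_se)) > z_crit:
            rejections += 1
    return rejections / n_sims
```

Calling `simulate_power` with the same assumed interaction but different cell sizes shows how the share of simulated studies that detect the effect grows with the number of rated scenarios per cell, which is the rationale for presenting respondents with a larger set of vignettes.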

Extending the variety of parental investment types could be useful not only for subjecting the TW hypothesis to a more rigorous test but also for exploring how stable other main effects on parental investment are across different types of parental investment. For instance, the gender of respondents and their “status of being a parent” were, in this study, largely positively associated with perceived fairness of parental investment, independent of whether the respective vignette described an unequal or equal investment scenario. The current study does not answer why this is the case. One possibility could be that parents, especially women, due to their oftentimes stronger involvement in childcare, might be more accepting of different parental investment patterns because they are better aware of the day-to-day challenges of parenting and the difficulty of trying to conform to norms of parental investment.

As a third pivotal research direction, I recommend a comprehensive meta-scientific initiative to compile and evaluate all existing evidence, assessing the quality of each study to ascertain the validity of the TW hypothesis in light of established quality criteria for empirical social research. This should include meta-analyses that identify the conditions under which the TW hypothesis is more likely to be supported. Notably, with the exception of some studies cited above, evidence suggests that studies using larger, representative samples are more likely to yield null results as compared to smaller representative or convenience samples (see, e.g., Kolk and Schnettler 2013, 2016; cf. also Brown and Silk 2002). Additionally, the investigation should address the possibility of publication bias within TW research, which might skew the balance between null findings and significant results (cf. Smith 1983; Festa-Bianchet 1996). Such an analysis is crucial for understanding the robustness of the TW hypothesis and for guiding future research directions.

In each of the three proposed research directions, sociology has a significant role to play. It has long been advocated that sociology should more actively engage with evolutionary and biosocial research in understanding human behavior (see, e.g., Ellis 1996; Barkow 2006; Schnettler 2016). When it comes to examining the TW hypothesis in contemporary societies, sociologists are equipped with comprehensive, well-structured datasets, often representative of entire populations. Furthermore, the field boasts a rich tradition in researching social inequality, providing sophisticated concepts and methodologies for measuring socioeconomic status. Overall, sociologists can play an important role in solving the puzzle that the TW effect poses.