1 Introduction

The tenet underlying most of microeconomics until not very long ago was that rational individuals with given preferences and endowments optimize subject only to the law of the land. More recently, economists have started to talk of norms or rules, and to examine their implications in different contexts. Cigno (1993) shows that individuals may be constrained by self-enforcing family rules which are themselves a collectively rational response to the economic and legal environment. This line of thought is developed in a series of papers including, among others, Rosati (1996), Anderberg and Balestrino (2003), and Barnett et al. (2018).Footnote 1 Young (1998) and Caillaud and Cohen (2000) follow essentially the same approach to explain the emergence of self-enforcing rules at the societal, rather than family, level (“social norms”).Footnote 2 Another strand of economic literature, stemming from Bisin and Verdier (2001), and Tabellini (2008), assumes that optimizing parents, motivated either by a paternalistic form of altruism, or by a social conscience, undertake costly actions to pass their values on to their offspring. These values are then modified by the interaction with other individuals who received different inputs from their parents.

The implicit assumption underlying all the contributions mentioned is that reproduction is asexual or, equivalently, that the parental couple think and act as if they were one person. Doing away with this assumption raises a host of new questions. First, what is the gain from forming a couple (or, why do people “marry”)? The usual answer is that a couple can jointly produce local public goods like children, companionship, etc., they could not produce separately; see, for example, Folbre (1994). Therefore, marriage produces a surplus. Second, how is the gain shared between the parties (“spouses”)? Since Manser and Brown (1980), and McElroy and Horney (1981), the answer is usually that the couple Nash-bargain over the distribution of the marital surplus; see Lundberg and Pollak (1996), Konrad and Lommerud (2000), and Cigno (2012). Third, how are couples matched? An earlier literature referring back to Gary Becker’s seminal contributions addresses the issue by envisaging a “marriage market” where potential husbands and wives are matched much in the same way as employers and employees are matched in the labor market; see Grossbard (2018) for a thorough survey. This literature emphasizes relative scarcity and consequently the sex ratio. A more recent one starts from the consideration that the benefit each partner derives from a particular match depends on the preferences and endowments of both partners. Peters and Siow (2002), and Iyigun and Walsh (2007), focus on endowments. Cigno et al. (2017) focus on preferences, in particular on taste for a good without perfect market substitutes, filial attention. 
Assuming that married couples Nash-bargain over the allocation of their joint time and money endowments, and over the distribution of the marital surplus, this last article demonstrates that a young person whose preferences are compatible with the existence of a self-enforcing family rule requiring the young to give attention to their elderly parents will marry a young person of the opposite sex who holds the same preferences. The couple thus formed will transmit their common preferences to their children, who will in turn marry persons with the same preferences and thus abide by the same rule, and so on. The fourth question concerns information. How much does a person know about the partner before they are actually married? We can safely assume that money endowments are common knowledge. Earning capacities may not be known with certainty, but they can be gaged with sufficient accuracy from educational achievements. Preferences, by contrast, may well be private information. In particular, the taste for filial attention may not be revealed until well after the date of marriage. Cigno et al. (2017) assume that preferences are common knowledge at all times. We examine the opposite case where a person’s taste for filial attention is private information until a couple is formed. In both cases, the focus is on cooperation between generations, rather than among members of the same generation as in other articles that will be mentioned later.Footnote 3 This focus gives us insights into the joint dynamics of marriage and preference transmission that would otherwise elude us.

If the taste for filial attention is private information, it cannot be a criterion for marrying a person rather than another. Of course, couples may well be formed on the basis of some observable characteristic (possibly including a person’s taste for other goods), but the matching will in any case be random where the taste for filial attention is concerned.Footnote 4 Using a stripped-down version of the model in Cigno et al. (2017), we demonstrate that a couple will still obey a rule requiring them to provide attention for their elderly parents if they expect their children to do the same. The latter is uncertain, however, because the couple know their children’s preferences (or rather, they can deduce them from their own), but not the preferences of their children’s future partners. Assuming rational expectations, we show that the share of the young who comply with a family rule in any generation is determined simultaneously with the next generation’s preference distribution. If all persons of the same sex look the same, and a young person will thus marry any member of the opposite sex with equal probability, everybody will eventually have the same preferences, and either everybody will then obey the same rule, or nobody will obey any. In the short run, however, outwardly identical persons could well have different preferences. Given that the same may apply also to different generations within the same line of descent, a rule could fall in abeyance for a number of generations and then spring back to life again. By contrast, if people are differentiated by some visible characteristic (physical appearance, language, religious practice, etc.) and the matching is assortative in that characteristic, all individuals displaying the same characteristic (but not the rest of the population) will hold the same preferences in the long run. Consequently, either all people holding the same characteristic will obey the same rule, or none of them will obey any. 
Here too, outwardly identical persons could then have different preferences in the short run, and it is thus possible that only some of them obey a family rule. Whether and how many people do that is important for policy purposes, because filial attention does not have perfect substitutes, and only a family rule will deliver it to the old. In the closing section, we review evidence that social policy crowds out family rules, and discuss the desirability of such a policy.

2 The model

Consider an economy initially populated by n men and n women. Time lasts forever, but each member of this population lives two periods. An individual i is young in period 1, old in period 2. The young can work and marry, the old can do neither. The young can also give attention to their elderly parents. The total amount of time that a young person has for working and giving attention to parents is normalized to unity.

If a young man and a young woman marry, they have a son and a daughter.Footnote 5 Siblings are not allowed to marry. Let \({c}_{pi}\) and \({a}_{k}^{i}\) denote, respectively, i’s consumption of market goods in period p = 1, 2, and the amount of attention that this person receives in period p = 2 from k = D, S, where D is i’s daughter and S is i’s son.Footnote 6 Individual i’s utility function is

$${U}_{i}={c}_{1i}+{\rm{ln}}\,{c}_{2i}+{\sum _{k\in \{D,S\}}}{u}_{i}\left({a}_{k}^{i}\right),$$
(1)

where \({u}_{i}\left({a}_{k}^{i}\right)=0\) for \({a}_{k}^{i}\le \frac{1}{\beta }\), and \({u}_{i}\left({a}_{k}^{i}\right)={\delta }_{i}\mathrm{ln}\,\beta {a}_{k}^{i}\), for \({a}_{k}^{i}> \frac{1}{\beta }\), with β > 1 and δi ≥ 0. We can think of δi as a measure of i’s taste for filial attention, and of β as a sensitivity parameter determining the threshold below which the attention received does not yield utility (a phone call once in a while is no good). Notice that market goods, including the services of professional helpers, are not perfect substitutes for filial attention. Notice also that neither i nor k is altruistic.Footnote 7 Therefore, i will not receive \({a}_{k}^{i}\) as a present. He or she could buy it off k. Given that filial attention does not have a perfect market substitute, however, k would set the price so high, that the entire surplus would go to k, and i would be indifferent between buying and not buying. We assume that i does not buy attention from k, but we will demonstrate that it may be in k’s interest to obey a rule dictating the amount of attention a young person must give her or his elderly parents in specified circumstances.

When a couple marry, they observe each other’s taste for filial attention. Until then, however, i knows only δi. Given that this parameter is private information when couples are formed, its value cannot then be a criterion for partner choice. At this stage of the exposition, we assume that couples are formed by randomly matching young men with young women. Every young person has the same probability of being matched with any other person of the opposite sex who is not the sibling. Later in the paper, we shall allow for the possibility that sampling is restricted to a particular subpopulation. In general, raising children will have a monetary cost and take parental time. Given that our focus is on filial (not parental) attention, however, we take these costs to be constant and normalize them to zero.

Consider the couple formed by a particular man f (for father), and a particular woman m (for mother). When the couple is drawn, they may either marry or split (there is no re-sampling). If they split, i = f, m maximizes (1) with \({a}_{D}^{i}={a}_{S}^{i}=0\), subject to the period budget constraints

$$\left\{ {\begin{array}{*{20}{l}}{c}_{1i}+{s}_{i}=w, \\ {c}_{2i}=r{s}_{i}, \end{array}} \right.$$

where si is the amount saved by i in period 1, r the interest factor and w the wage rate.Footnote 8

The pay-off of singlehood is then

$${\widehat{R}}_{i}=\widehat{R}:= {\mathop{\max }\limits_{{s}_{i}}}\left(w-{s}_{i}+{\mathrm{ln}}\,r{s}_{i}\right),\,\,i=f,m.$$

The solution to this maximization problem is

$${\widehat{s}}_{i}=1,$$

so that

$$\widehat{R}=w-1+{\mathrm{ln}}\,r.$$
(2)

In general, if f and m marry, they Nash-bargain the allocation of their time and earnings. In the absence of a rule obliging either or both of them to give attention to their respective parents, however, \({a}_{D}^{i}={a}_{S}^{i}=0.\) Having set the monetary and time costs of children equal to zero, there is then nothing for the spouses to bargain about. The budget constraints facing i = f, m are the same as in the case of singlehood, and the individual pay-off of marriage is again \({\widehat{R}}_{i}=\widehat{R}\). Strictly speaking, therefore, f and m are indifferent between marrying or splitting. We assume that they marry. In Section 4 below, we examine the possibility that it is in either or both spouses’ interest to obey the rule mentioned earlier, and look for conditions such that this rule is self-enforcing and renegotiation-proof. Given, however, that these conditions depend not only on i’s taste for filial attention, but also on that of i’s parents, and of i’s entire descendance, we must first study how the distribution of the preference parameter representing this taste evolves across generations. We do so in a context where the number of men and women stays constant (and equal to n for each sex) across generations, because all individuals marry and all married couples have a son and a daughter.

3 Preference evolution

There is evidence that preferences are passed on from parents to children,Footnote 9 but the transmission is not genetic.Footnote 10 That being the case, it would be reasonable to assume that δi is a random variable with expected value equal to the mean of the parents’ δs. For simplicity, however, we assume that δi is equal to that mean with certainty,

$${\delta }_{i}=\frac{{\delta }_{F}+{\delta }_{M}}{2},$$
(3)

where F and M denote, respectively, i’s father and mother (F and M are then, respectively, D and S’s grandfather and grandmother). This simplification makes no difference in the long run. We are interested in studying how δi evolves across generations.

Let t = 0, 1, 2, … identify a specific generation. Due to the transmission mechanism (3), generation t may be characterized by a number S(t) of distinct δ values. Assume that, in generation t = 0, \({n}^{H} < n\) women (and men) are characterized by \(\delta ={\delta }^{H}\), and \({n}^{L}=n-{n}^{H}\) by \(\delta ={\delta }^{L} < {\delta }^{H}\). Letting \(\pi :=\frac{{n}^{H}}{n}\), in generation t = 0, there are πn women (and men) with \(\delta ={\delta }^{H}\) and (1 − π)n women (and men) with \(\delta ={\delta }^{L}\). In generation t = 0, the number of values taken by δ is then \(S(0)={2}^{0}+1=2\). In the subsequent generations, the number of possible δs increases as a result of marriages between individuals with different δs. In generation t = 1, the possible values of δ are \({\delta }^{L}\), \(\frac{{\delta }^{L}\,+\,{\delta }^{H}}{2}\) and \({\delta }^{H}\). Consequently, \(S(1)={2}^{1}+1=3\). In generation t = 2, they are \({\delta }^{L}\), \(\frac{3{\delta }^{L}\,+\,{\delta }^{H}}{4}\), \(\frac{2{\delta }^{L}\,+\,2{\delta }^{H}}{4}\), \(\frac{{\delta }^{L}\,+\,3{\delta }^{H}}{4}\) and \({\delta }^{H}\). Hence, \(S(2)={2}^{2}+1=5\). In generation t ≥ 0, the possible values are

$${\delta }_{t}(j):=\frac{({2}^{t}-j){\delta }^{L}+j{\delta }^{H}}{{2}^{t}}={\delta }^{L}+\frac{{\delta }^{H}-{\delta }^{L}}{{2}^{t}}j,\quad j=0,1,\ldots ,{2}^{t},$$
(4)

and their number is \(S(t)={2}^{t}+1\).
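The count of attainable values can be checked by brute force. The sketch below is illustrative only: it assumes that any two attainable values may be paired (it ignores the no-sibling restriction and population frequencies), and the function name `possible_deltas` is hypothetical.

```python
from fractions import Fraction

def possible_deltas(delta_low, delta_high, generations):
    """Return, for t = 0..generations, the sorted list of attainable delta values."""
    current = {Fraction(delta_low), Fraction(delta_high)}
    history = [sorted(current)]
    for _ in range(generations):
        # A child's delta is the mean of the parents' deltas, as in (3).
        current = {(a + b) / 2 for a in current for b in current}
        history.append(sorted(current))
    return history

# Illustrative endpoints delta^L = 0 and delta^H = 1.
history = possible_deltas(0, 1, 4)
for t, values in enumerate(history):
    # Second and third columns should agree: count of values and 2^t + 1.
    print(t, len(values), 2**t + 1)
```

Exact rational arithmetic is used so that equal averages reached through different ancestries are recognized as the same value.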

How is the random variable δt(j) figuring in (4) distributed in a given generation t? How does the distribution of δt(j) evolve across generations starting from the initial distribution (1 − π, π) of δL and δH? Appendix section 7.1 demonstrates the following.

Proposition 1. In each generation t ≥ 0, for n sufficiently large, the distribution of \({\delta }_{t}(j)\) converges to a binomial, with mean \((1-\pi ){\delta }^{L}+\pi {\delta }^{H}\) and variance \(\pi (1-\pi )\frac{{({\delta }^{H}\,-\,{\delta }^{L})}^{2}}{{2}^{t}}.\)

Corollary 1. As t → ∞, the expected δ held by all agents is

$${\delta }^{* }:=(1-\pi ){\delta }^{L}+\pi {\delta }^{H}.$$
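The moments stated in Proposition 1 can be verified numerically. The sketch below rests on an assumption made here for illustration, namely that in the large-n limit the index j in (4) behaves as a Binomial(2^t, π) draw; the function names and the value π = 0.3 are hypothetical.

```python
from math import comb

def delta_distribution(pi, delta_low, delta_high, t):
    """Assumed large-n limit: j ~ Binomial(2**t, pi), with delta_t(j) as in (4)."""
    m = 2**t
    probs = [comb(m, j) * pi**j * (1 - pi) ** (m - j) for j in range(m + 1)]
    values = [delta_low + (delta_high - delta_low) * j / m for j in range(m + 1)]
    return values, probs

def mean_and_variance(values, probs):
    """Mean and variance of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, var

pi, d_low, d_high = 0.3, 0.0, 1.0  # illustrative values
for t in range(6):
    mean, var = mean_and_variance(*delta_distribution(pi, d_low, d_high, t))
    stated_var = pi * (1 - pi) * (d_high - d_low) ** 2 / 2**t
    print(t, round(mean, 6), round(var, 6), round(stated_var, 6))
```

Under this assumption the mean stays at (1 − π)δ^L + πδ^H in every generation, while the variance halves each generation, which is what drives the convergence to δ*.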

These two results say that, generation after generation, the population is subject to a “melting pot” process that eventually transitions the economy from an initial state with two heterogeneous groups, those with \({\delta }^{L}\) and those with \({\delta }^{H}\), to a final state with a homogeneous population sharing the same value of δ, δ*. The transition goes through an infinite number of intermediate states, each of which corresponds to a situation where the δ-values in (4) may coexist according to a binomial distribution, if n is large (see Appendix section 7.1). If, for example, we set \({\delta }^{H}=1\) and \({\delta }^{L}=0\), the long-run value of the preference parameter is \({\delta }^{* }=\pi =\frac{{n}^{H}}{n}.\) But how long is the long run? A sensible way to address this question is to calculate in how many generations t the standard deviation of the binomial distribution of δ will become σ ∈ {0.01, 0.05} for π ∈ {0.1, 0.2, …, 0.5}. The answer is found solving the equation

$$\frac{{({\delta }^{H}-{\delta }^{L})}^{2}}{{2}^{t}}\pi (1-\pi )={\sigma }^{2}\quad \,\text{for}\,\pi \in \{0.1,0.2,\ldots ,0.5\}.$$

The value of t associated with each \(\left(\pi ,\sigma \right)\) is shown in Table 1. Of course, the long-run value of δ (equal to the mean of the distribution) will vary with \(\left(\pi ,\sigma \right)\) too.

Table 1 Number of generations needed to reach a distribution of the population with standard deviation σ, given the initial distribution (1 − π, π)
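The entries of such a table follow directly from the equation above, solved for t. The sketch below is illustrative, with δ^H = 1 and δ^L = 0 as in the text; the function name `generations_needed` is hypothetical.

```python
from math import log2

def generations_needed(pi, sigma, delta_low=0.0, delta_high=1.0):
    """Solve pi*(1-pi)*(delta_high-delta_low)**2 / 2**t = sigma**2 for t."""
    return log2(pi * (1 - pi) * (delta_high - delta_low) ** 2 / sigma**2)

shares = [0.1, 0.2, 0.3, 0.4, 0.5]
for sigma in (0.05, 0.01):
    row = [round(generations_needed(pi, sigma), 2) for pi in shares]
    print(f"sigma={sigma}:", row)
```

For π = 0.1 this gives 5.17 generations at σ = 0.05 and 9.81 at σ = 0.01, a difference of 4.64 generations.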

The first column of this table says that, if 10% of the population is initially characterized by δ = 1, and the remaining 90% by δ = 0, so that the limit value of δ is 0.1, it will take 5.17 generations for the standard deviation to become equal to 0.05, and another 4.64 generations for it to fall to 0.01. If a new generation is born every 25 years, this means that it will take about 130 years for approximately 68% of the population to have a δ between 0.05 and 0.15 (one standard deviation either side of the mean), and more than 245 years for that same share of the population to have a δ between 0.09 and 0.11. The remaining columns show how the convergence slows down, and the limit value of δ rises from 0.1 to 0.5, as the initial share of individuals with δ = 1 rises from one tenth to a half of the total population. It is worth noting that the evolution of the δ parameter over time is irreversible. Once someone inherits a “mixed value” of δ, i.e., \(\delta \notin \{{\delta }^{L},{\delta }^{H}\}\), the possibility that one of her or his children inherits one of the two initial values \({\delta }^{L}\) and \({\delta }^{H}\) is lost forever. Consequently, starting from some generation, everybody will have a mixed value of δ. Formally, the following result holds.

Proposition 2. The state of the economy in which no individual displays a δ ∈ {δL, δH} is absorbing, and this state is reached in a finite number of generations, almost surely.

Proof. In Appendix section 7.2. □

Proposition 2 states that the initial values of the taste-for-filial-attention parameter δ are expected to disappear in a finite number of generations. Due to random matching, there is a positive probability that the initial values of δ will not be passed on to the subsequent generation. In other words, generation after generation, the number of those who display \({\delta }^{L}\) or \({\delta }^{H}\) will fall. Since this process is irreversible, once the state where no one displays \({\delta }^{L}\) or \({\delta }^{H}\) is reached, the probability of remaining in that state is one for all future generations.

4 Family rules

We are now ready to analyze family decisions in the presence of a family rule ensuring that the young give attention to their elderly parents. The reason for focussing on filial attention rather than income support is that the former cannot be obtained in any other way, while the latter can. Where developed countries are concerned, there is indeed evidence that what the elderly actually get is mostly attention; see, for example, Lowenstein and Daatland (2006) and Klimaviciute et al. (2017).

According to a strand of economic literature stemming from Bisin and Verdier (2001), and Tabellini (2008), cooperative behavior arises because well-meaning parents expend resources to instill pro-social values into their children. According to Alger and Weibull (2013), by contrast, innate individual preferences have a selfish and a moral component. Analyzing preference evolution, they show that stable preferences attribute the moral component a weight equal to the exogenous degree of assortativity in the matching process. As a result, cooperative behavior may prevail. We take a different tack. Our argument is that cooperative behavior may emerge as a self-enforcing rule even if individuals are entirely selfish as assumed in Section 2, and matching is completely random, simply because cooperative behavior is incentive-compatible.Footnote 11 In our specific model, the cooperation is between members of different generations of the same family, and it consists in the provision of filial attention by the young to the old. The rule in question is thus a family rule.

The following two definitions formalize what we mean by cooperative behavior and family rule.

Definition 1 (Cooperative behavior). The young give attention to their elderly parents.

This definition partitions the young into two groups: those who give attention to their parents, “cooperators”, and those who do not, “non-cooperators”. The latter may include “deviators”, who do not give attention to their parents when the latter were cooperators, and “defectors”, who do not give attention to their parents when the latter were non-cooperators. We are interested in studying whether the following rule can support the cooperative behavior in Definition 1 as a Nash-bargaining equilibrium.

Definition 2 (Family rule). A young person i must provide attention \({a}_{i}^{h}\) to elderly parent h = F, M if the latter is not a deviator.

Note that a young person is not obliged to give attention to a deviating parent. Having assumed that the young do not get direct utility from giving attention (or anything else) for free, however, children will punish deviating parents by giving them no attention. Therefore, the family rule in Definition 2 identifies two behaviors—being a cooperator and being a defector—that do not justify punishment, and one—being a deviator—that calls for punishment. As will be discussed later, two conditions ensure that the family rule defined above sustains the cooperative behavior in Definition 1. First, an individual must be better-off obeying the rule rather than disobeying it. Second, the individual must expect that the rule is incentive-compatible also for all her or his descendants (otherwise, by backward induction, he or she would anticipate that, starting from his or her children, the rule would be disobeyed). In general, cooperators, defectors, and deviators may coexist. In the long run, however, all individuals behave the same: either they all cooperate or they all defect.

In the following subsection, we start by examining the properties of the Nash-bargaining equilibrium under the assumption that married young people give their elders certain specified amounts of attention, and then establish conditions such that a rule requiring married young people to do so is self-enforcing and renegotiation proof.

4.1 Bargaining

Suppose that the \(\left(f,m\right)\) couple comply with the rule set out in Definition 2, and that they Nash-bargain over the allocation of their time and money. Given that the best alternative to marrying and obeying the rule in question is to marry and disobey it, i’s reservation utility is \(\widehat{R}\), for i = f, m. The Nash-bargaining (NB) equilibrium then maximizes

$$N=({U}_{f}-\widehat{R})({U}_{m}-\widehat{R})$$
(5)

subject to

$$\left\{\begin{array}{l}{c}_{1m}+{s}_{m}=w(1-{a}_{m}^{F}-{a}_{m}^{M})+T,\\ {c}_{2m}=r{s}_{m},\end{array}\right.\quad \left\{\begin{array}{l}{c}_{1f}+{s}_{f}+T=w(1-{a}_{f}^{F}-{a}_{f}^{M}),\\ {c}_{2f}=r{s}_{f},\end{array}\right.$$
(6)

where \({a}_{i}^{F}\) is the amount of attention given to i’s father F, \({a}_{i}^{M}\) that given to i’s mother M, with i = f, m, and T is defined as a transfer from f to m in period 1. Assuming an interior solution (or the rule would be inoperative), we show in Appendix section 7.3 that the equilibrium is now

$$\begin{array}{*{20}{l}}{\widehat{s}}_{i}=1,\\ \widehat{T}=\frac{w}{2}\left[{a}_{m}^{F}+{a}_{m}^{M}-{a}_{f}^{F}-{a}_{f}^{M}\right]+\frac{1}{2}\left[{\delta }_{f}{\mathrm{ln}}\,\beta {a}_{S}^{f}+{\delta }_{f}{\mathrm{ln}}\,\beta {a}_{D}^{f}-{\delta }_{m}{\mathrm{ln}}\,\beta {a}_{S}^{m}-{\delta }_{m}{\mathrm{ln}}\,\beta {a}_{D}^{m}\right].\end{array}$$

The equilibrium expected utilities are then

$$\begin{array}{*{20}{l}}{\widehat{U}}_{i}=\widehat{U}=w+{\mathrm{ln}}\,r-1-\frac{w}{2}\left[{a}_{f}^{F}+{a}_{f}^{M}+{a}_{m}^{F}+{a}_{m}^{M}\right]\\ +\,\frac{1}{2}\left[{\delta }_{f}{\mathrm{ln}}\,\beta {a}_{S}^{f}+{\delta }_{f}{\mathrm{ln}}\,\beta {a}_{D}^{f}+{\delta }_{m}{\mathrm{ln}}\,\beta {a}_{S}^{m}+{\delta }_{m}{\mathrm{ln}}\,\beta {a}_{D}^{m}\right],\,\quad i=f,\,m.\end{array}$$
(7)
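A numerical check of the bargaining outcome may help. The sketch below is illustrative only: it assumes that each parent receives the same attention from son and daughter, all parameter values are made up, and the function names are hypothetical. It evaluates the transfer above and confirms that both spouses end up with the utility in (7).

```python
from math import log

def attention_utility(delta, beta, a):
    """u_i(a) from (1): zero up to the threshold 1/beta, delta*ln(beta*a) above it."""
    return delta * log(beta * a) if a > 1 / beta else 0.0

def nash_outcome(w, r, beta, delta_f, delta_m, given_f, given_m, recv_f, recv_m):
    """given_i: total attention i gives to own parents; recv_i: attention i gets from each child."""
    gain_f = 2 * attention_utility(delta_f, beta, recv_f)  # son and daughter
    gain_m = 2 * attention_utility(delta_m, beta, recv_m)
    # Equilibrium transfer from f to m.
    transfer = 0.5 * w * (given_m - given_f) + 0.5 * (gain_f - gain_m)
    base = w - 1 + log(r)  # pay-off with saving equal to 1 and no attention, as in (2)
    u_f = base - w * given_f - transfer + gain_f
    u_m = base - w * given_m + transfer + gain_m
    return transfer, u_f, u_m

# Illustrative values (assumptions, not taken from the paper).
w, r, beta = 1.5, 1.2, 20.0
transfer, u_f, u_m = nash_outcome(w, r, beta, 0.6, 0.3, 0.5, 0.3, 0.4, 0.2)
u_hat = (w + log(r) - 1 - 0.5 * w * (0.5 + 0.3)
         + 0.5 * (2 * 0.6 * log(beta * 0.4) + 2 * 0.3 * log(beta * 0.2)))
print(transfer, u_f, u_m, u_hat)
```

Because utility is linear in first-period consumption, the transfer only redistributes a fixed total, so the Nash product is maximized where the two spouses' utilities are equal.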

The NB equilibrium was derived under the assumption that a family rule specifying \({a}_{i}^{h}\) and \({a}_{k}^{i}\), with h = F, M and k = D, S, is in force. There is such an equilibrium for any possible specification of this rule. Among all possible equilibria there may be some that are not Pareto-dominated by any of the others. Only these are renegotiation proof.Footnote 12 This gives us a general criterion for rule selection. A further criterion is to postulate that the prescribed \({a}_{k}^{i}\) will depend on observable variables such as \({a}_{i}^{F}\) and \({a}_{i}^{M}\). Let \({a}_{k}^{i}={g}^{i}({a}_{i}^{F},{a}_{i}^{M})\), with k = D, S, where \({g}^{i}\in {{\mathcal{C}}}^{1}\) is an increasing function of its arguments. This means that the amount of attention that k ∈ {D, S} provides to the parent i ∈ {f, m} increases with the amount of attention that the parent i has provided to her or his own parents, F and M. The function gi is the same for both siblings because daughter and son enter the analysis symmetrically, and there is thus no justification for requiring them to give different amounts of attention.

A natural candidate for the function gi, one that is easy to understand, is

$${a}_{k}^{i}=\gamma {a}_{i}^{F}+(1-\gamma ){a}_{i}^{M},\quad \,{\mathrm{with}}\,\gamma \in [0,\,1],\quad \,{\mathrm{and}}\,k=D,\,S,$$
(8)

but we show in Appendix section 7.6 that our results are robust to more general functional forms. Each value of γ in (8) identifies a different specification of the family rule, and thus a different NB equilibrium. To find the value of γ that is renegotiation proof, we maximize (7) with respect to \({a}_{i}^{h},\) h = F, M, subject to (8). The following result holds.

Proposition 3. If an interior solution to the maximization of \({\widehat{U}}_{i}\) with respect to \({a}_{i}^{h}\) exists, then the renegotiation-proof family rule is such that \(\gamma =\frac{1}{2}\). Consequently,

$${a}_{k}^{i}=\frac{{\delta }_{i}}{w}\quad \,{\mathrm{and}}\,\quad {a}_{i}^{h}=\frac{{\delta }_{h}}{w},\quad \,{\mathrm{for}}\,\,i=f,\,m,\quad h=F,\,M,\quad \,\,\,{\mathrm{and}}\,k=D,\,S.$$
(9)

Proof. See Appendix section 7.4. □

This proposition says that, for it to be renegotiation-proof, a family rule must require a young person to give each of her or his elderly parents an amount of attention that depends positively on the receiver’s taste for this good, and negatively on the giver’s wage rate (the opportunity cost of attention). This amount is equal to the mean of those given by the receiver to her or his own elderly parents a period earlier,

$${a}_{k}^{i}=\frac{{a}_{i}^{F}+{a}_{i}^{M}}{2},\quad \,{\mathrm{with}}\,k=D,\,S.$$
(10)

Substituting (9) into (7) allows us to write the equilibrium expected utilities as

$${\widehat{U}}_{i}=w+{\delta }_{f}\left({\mathrm{ln}}\,\beta {\delta }_{f}-{\mathrm{ln}}\,w-1\right)+{\delta }_{m}\left({\mathrm{ln}}\,\beta {\delta }_{m}-{\mathrm{ln}}\,w-1\right)-1+{\mathrm{ln}}\,r,\quad i=f,\,m.$$
(11)

If the wage rate is a constant, as assumed for simplicity in the present paper, each elderly person will then receive the same amount of attention from each child. In reality, however, wage rates may differ across genders, generations and countries. Cigno et al. (2019) prove that the result in (9) is robust to possible randomness of the wage rate. If that is the case, the specification in (9) becomes \({a}_{k}^{i}=\frac{{\delta }_{i}}{{w}_{k}}\) and \({a}_{i}^{h}=\frac{{\delta }_{h}}{{w}_{i}}\), where wk and wi are the realized wage rates of individual k and individual i, respectively. In particular, the child with the lower opportunity cost will then give more attention than the one with the higher opportunity cost.

4.2 Self-enforcing rules

We now establish conditions such that the family rule in Definition 2, prescribing the amount of filial attention (9), is self-enforcing. Given (9), a necessary condition for (f, m) to obey the rule is that they would not be better-off disobeying it. Hence,

$$\widehat{U}-\widehat{R}\ge 0,$$

which, substituting from (11) and (2), becomes

$${\delta }_{f}\left({\mathrm{ln}}\,\beta {\delta }_{f}-{\mathrm{ln}}\,w-1\right)+{\delta }_{m}\left({\mathrm{ln}}\,\beta {\delta }_{m}-{\mathrm{ln}}\,w-1\right)\ge 0.$$
(12)

This condition is obviously satisfied if a NB equilibrium exists, because the equilibrium expected utilities cannot be lower than the reservation utilities. As will be discussed later, for an equilibrium to exist and (12) to be satisfied for at least some couples, it must be the case that \({\mathrm{ln}}\,\beta {\delta }^{H}\,> \, {\mathrm{ln}}\,w+1\). We assume that it is.
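Condition (12) is easy to evaluate for a given couple. The sketch below is illustrative: it takes the interior solution (9) as given, treats a zero δ as contributing nothing, and uses made-up parameter values; the function names are hypothetical.

```python
from math import e, log

def couple_surplus(delta_f, delta_m, w, beta):
    """Left-hand side of (12), taking the interior solution (9) as given."""
    def term(delta):
        # A zero delta contributes nothing (the limit of delta*ln(delta) is 0).
        return delta * (log(beta * delta) - log(w) - 1) if delta > 0 else 0.0
    return term(delta_f) + term(delta_m)

def satisfies_12(delta_f, delta_m, w, beta):
    return couple_surplus(delta_f, delta_m, w, beta) >= 0

w, beta = 2.0, 20.0          # illustrative values
cutoff = e * w / beta        # a spouse's term is non-negative iff delta >= e*w/beta
print(round(cutoff, 4))
print(satisfies_12(0.5, 0.5, w, beta))  # both spouses above the cutoff
print(satisfies_12(0.1, 0.1, w, beta))  # both below
print(satisfies_12(0.5, 0.1, w, beta))  # mixed couple
```

In this example the mixed couple satisfies (12) even though one spouse's own term is negative, which illustrates why the condition is stated for the couple rather than for each spouse separately.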

If \({\mathrm{ln}}\,\beta {\delta }^{L} \, > \, {\mathrm{ln}}\,w+1\) is true too, then (12) will be satisfied for every couple in the economy, and everybody will follow the family rule. If, instead, \({\mathrm{ln}} \,\beta {\delta }^{L}\, < \, {\mathrm{ln}}\,w+1,\) there may exist couples who do not satisfy (12) and do not follow the family rule. This may have an important consequence also for couples whose δs are high enough to satisfy (12). Even if this condition is satisfied, the \(\left(f,m\right)\) couple will in fact obey the rule only if they expect their children to do the same. But this will depend on their children’s δs, and on the δs of the children’s respective spouses. Since the same applies also to the children’s children, and so on to infinity, in order for the rule to be self-enforcing, the \(\left(f,m\right)\) couple must then expect that a condition analogous to (12) will hold for all their descendants and respective spouses. Again taking (9) as given, we need to require

$${{\mathbb{E}}}_{t}({\widehat{U}}_{{d}_{t+\ell }}-\widehat{R}| {\delta }_{f},{\delta }_{m})\ge 0,\,\,\,{\mathrm{with}}\,{d}_{t+\ell }\in \{\,{\mathrm{descendants}}\,{\mathrm{of}}\,\,(f,\,m)\},\,\,\forall \ell \ge 1,$$
(13)

where t refers to the generation of the \(\left(f,m\right)\) couple, and \(\ell \) denotes the number of generations separating the (f, m) couple from their descendants \({d}_{t+\ell }\), whose equilibrium utility is denoted by \({\widehat{U}}_{{d}_{t+\ell }}\).

Consider generation t. Condition (13) may hold for a couple and not for another, both formed in period t, depending on the values of their δs. Therefore, some couples may comply with the rule, but some may not. The latter will neither give nor receive attention. If f and m expect that in some generation \(t+\bar{\ell }\) their descendants will not satisfy the condition in (13) and consequently not give attention to their parents (belonging to generation \(t+\bar{\ell }-1\)) then, by a backward induction argument, f and m will anticipate that all the generations before \(t+\bar{\ell }\) will not give attention. This means that f and m’s children will be expected not to give attention and, consequently, that f and m will not either.Footnote 13 Note, however, that (13) is an expectation. This implies that, even if (13) is satisfied, there may exist a couple belonging to generation \(t+\bar{\ell }\) for which the realizations of the δs are such that, ex post, condition (12) is not satisfied. Accordingly, even if all the \(\left(f,m\right)\) couple’s descendants up to \(t+\bar{\ell }\), and all other couples belonging to generation \(t+\bar{\ell }\), abide by the family rule, the couple in question will deviate. Recall that, if (f, m) deviate, their children do not have to give them attention (the children are then defectors, not deviators) but do not lose the right to receive attention from their own children. Even if the latter were to deviate, however, this would not necessarily imply that none of the deviating couple’s descendants will give attention to their parents. As we show next, starting from some generation, the realizations of the δs may be such that all couples will give attention. There is no going back.

Recalling Definition 1, the following result holds.

Theorem 1. For any initial distribution (1 − π, π), if β is sufficiently large, there exists an interval W such that, for any w ∈ W, a generation τ exists such that the family rule in Definition 2 and expression (9) sustain cooperative behavior for all t ≥ τ.

Proof. See Appendix section 7.5. □

In addition to the formal proof, provided in Appendix section 7.5, we lay out below the possible scenarios in order to help the intuition of this result. Start by noticing that the family rule defined in Definition 2 and expression (9) is self-enforcing if both (12) and (13) hold. Given the convergence of δ to the value δ* in the long run, it can be proved (see Appendix section 7.5) that a sufficient condition for (12) and (13) to be true, starting from some generation τ, is

$${\mathrm{ln}}\,\beta {\delta }^{* }\, > \, {\mathrm{ln}}\,w+1.$$
(14)

Conversely, if (14) does not hold, individuals can at best be indifferent between obeying and disobeying the rule.

Now, fix β for a moment. There are three possible scenarios. If \({\mathrm{ln}}\,\beta {\delta }^{L}\, \ge \, {\mathrm{ln}}\,w+1\), then (14) trivially holds and, as already pointed out, the family rule is adopted by every member of every generation because (12) and (13) are both fulfilled for all generations. If \({\mathrm{ln}}\,\beta {\delta }^{H}\le {\mathrm{ln}}\,w+1\), then (14) does not hold and the family rule is never obeyed by any individual because (12) and (13) are not fulfilled in the long run. If \({\mathrm{ln}}\,\beta {\delta }^{L}\, < \, {\mathrm{ln}}\,w+1 \, < \, {\mathrm{ln}}\,\beta {\delta }^{H}\), by contrast, condition (14) may hold, depending on the values of β, w, and δ* (which in turn depends on the initial distribution (1 − π, π) of \({\delta }^{L}\) and \({\delta }^{H}\)), and the family rule may emerge in the long run. However, for any given initial distribution (1 − π, π), if β is sufficiently large, namely \(\beta \, > \, \frac{e}{{\delta }^{* }}\) (see Appendix section 7.5), there always exists a wage rate interval W that supports the cooperative behavior described in Definition 1 starting from a certain generation τ.
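The three scenarios can be sketched as a small classification routine. This is only an illustration under stated assumptions: the function names and the numerical values in the usage note are hypothetical, and the comparisons are written in levels (βδ against e·w), which is equivalent to the logarithmic form in the text for positive values and avoids taking the logarithm of zero.

```python
import math

def classify_scenario(beta, w, delta_low, delta_high):
    """Classify a (beta, w) pair into one of the three scenarios in the text.

    delta_low and delta_high stand for delta^L and delta^H.
    ln(beta*delta) >= ln(w) + 1 is rewritten as beta*delta >= e*w.
    """
    threshold = math.e * w
    if beta * delta_low >= threshold:
        return "always"    # (14) holds whatever the limit value delta*
    if beta * delta_high <= threshold:
        return "never"     # (14) fails whatever the limit value delta*
    return "depends"       # outcome hinges on where delta* lies

def condition_14_holds(beta, w, delta_star):
    """Check the sufficient condition (14): ln(beta*delta*) > ln(w) + 1."""
    return beta * delta_star > math.e * w
```

With the hypothetical values β = 10, δ^L = 0.5 and δ^H = 1, a wage of 1.5 falls in the first scenario, a wage of 4 in the second, and a wage of 3 in the third, where `condition_14_holds` then decides the outcome for a given δ*.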

In Appendix section 7.5, we show that \(W:=(1,\frac{\beta {\delta }^{* }}{e})\). The lower bound of W ensures that consumption is non-negative in the singlehood maximization problem (Section 2). The upper bound of W is implied by (14). The result in Theorem 1 shows that, for given β and δ*, high wage rates (\(w \, > \, \frac{\beta {\delta }^{* }}{e}\)) cannot support cooperative behavior. This is so because the giver’s wage rate is the opportunity cost of attention. High wages make it more profitable to allocate time to labor rather than to care for parents, the reduction in which is compensated by an increase in the consumption of the market good. Low wages, on the contrary, can make it more profitable to allocate time to care for parents rather than to labor, depending on how sensitive individuals are to the amount of filial attention received. This sensitivity factor is captured by the parameter β.Footnote 14 If β is too small (for example, if it is close to 1), the amount of filial attention required to yield positive utility to parents is too costly for the children, who, as a result, will not follow the cooperative behavior. If β is sufficiently large, instead, the amount of filial attention required to yield positive utility to parents is smaller and does not generate a large opportunity cost. Consequently, cooperative behavior becomes incentive-compatible.
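The link between (14), the interval W and the threshold on β can be spelled out in two steps. The following is a sketch that uses only the symbols already defined in the text and assumes δ* > 0. Exponentiating both sides of (14) gives

$$\beta {\delta }^{* }\, > \, e\,w\quad \Longleftrightarrow \quad w\, < \, \frac{\beta {\delta }^{* }}{e},$$

which is the upper bound of W. Combined with the lower bound w > 1, the interval is non-empty only if

$$\frac{\beta {\delta }^{* }}{e}\, > \, 1\quad \Longleftrightarrow \quad \beta \, > \, \frac{e}{{\delta }^{* }}.$$

As a purely illustrative case (the value is an assumption, not taken from the model's calibration), δ* = 1 would require β > e ≈ 2.72, which is consistent with the remark that a β close to 1 cannot sustain the rule at any admissible wage.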

Summing up, in the short run it is possible that some members of a generation do not give attention to their parents, but some of their descendants will. In the long run, assuming as we have done so far that all individuals of the same sex are outwardly identical, either everybody gives attention or nobody does (the policy implications are discussed at the end of the concluding section). In the next section, we look at what happens if this outward homogeneity assumption is relaxed.

5 Persistence of family rules

The implications of the evolutionary process described in the last two sections can be better appreciated if we look at the consequences of immigration. Consider a population that is originally characterized by a common value of δ, say δ = 0. Without an exogenous shock, this population would remain homogeneous forever. Let us then suppose that there is a once-and-for-all influx of immigrants, equal in size to one ninth of the native population, and that all the newcomers are characterized by a value of δ that is different from zero, say δ = 1. According to Table 1, after between five and ten generations, the population will be homogeneous again, and its common characteristics will be very similar to those that were once common to the original inhabitants. In other words, the immigrants will be absorbed by the native population. If the number of immigrants were larger than one ninth, but no larger than half of the native population (i.e., not so large that the immigrants outnumber the natives), it would take longer for the population to become homogeneous again, and the future inhabitants would not look much like the original ones. In that case, there would be convergence, but not absorption. Whichever is the case, random matching implies that it takes a relatively short time in evolutionary terms (between 130 and 245 years) for a population to become homogeneous again after a wave of immigration. In our model, this implies that either everybody will ultimately obey a family rule, or nobody will.
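The arithmetic behind the distinction between absorption and convergence can be illustrated with a short sketch. It rests on a loudly stated ASSUMPTION that is not taken from the text: each child's δ is the midpoint of the parents' δs and couples are random draws from a large population, so that the population mean of δ is preserved and its variance halves every generation. The function names are hypothetical, and the process behind Table 1 may well differ.

```python
import math

def immigrant_share(ratio_to_natives):
    """Share of immigrants in the population right after the influx."""
    return ratio_to_natives / (1.0 + ratio_to_natives)

def mean_delta(ratio_to_natives, delta_native=0.0, delta_immigrant=1.0):
    """Population mean of delta after the influx.

    ASSUMPTION: midpoint inheritance preserves this mean, so it is also
    the value the population would converge to.
    """
    pi = immigrant_share(ratio_to_natives)
    return (1.0 - pi) * delta_native + pi * delta_immigrant

def dispersion_after(generations, ratio_to_natives):
    """Standard deviation of delta after a number of generations.

    ASSUMPTION: natives have delta = 0, immigrants delta = 1, and the
    variance halves each generation under random matching.
    """
    pi = immigrant_share(ratio_to_natives)
    initial_variance = pi * (1.0 - pi)
    return math.sqrt(initial_variance / 2 ** generations)
```

Under these assumptions, an influx equal to one ninth of the natives makes immigrants one tenth of the population and gives a mean δ of 0.1, close to the natives' original value of 0. An influx equal to half of the natives gives a mean of one third, visibly different from either original group.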

Is that what we observe in reality? Lowenstein and Daatland (2006) find that a majority of adults in Norway, England, Germany, Spain and Israel acknowledge some degree of filial obligation, but that both the incidence and the intensity of this sentiment are higher in the two Mediterranean countries than in the northern ones. Klimaviciute et al. (2017) report that working-age Greek, Italian and Spanish people spend, on average, more than 33 hours a month caring for their elderly parents, while the Danish and the Dutch spend less than 11. This suggests that, despite thousands of years of cross-migrations and, more recently, complete freedom of movement within the EU, matching is not completely random. But why? Cigno et al. (2017) demonstrate that, if preferences were observable, it would be in the interest of a young person whose preferences are compatible with the existence of a self-enforcing, renegotiation-proof family rule to seek out and marry a like-minded member of the opposite sex. As marriage partners would then be assorted according to their preferences, the latter would not evolve, and the share of the population governed by a family rule would remain constant. In the last two sections, we examined the opposite case where individual preferences are private information before couples are formed, and there is thus an equal probability of being matched with any member of the opposite sex. But suppose that the distribution of the taste-for-filial-attention parameter δ varies systematically with an observable trait θ denoting, for example, physical type, language or religious practice. If the density function of δ associated with each θ is common knowledge, and the expected value of δ is increasing in θ, a rational young man whose δ is high enough to satisfy (12), provided his wife’s is too, will restrict his search to young women with the same θ as himself (the same applies to rational young women looking for a suitable husband).
Alternatively, if the young are too impulsive to be concerned about what will happen to them in the next period of life, it will be their parents who, in their children’s but also their own interest, try to restrict the range of persons with whom their children come into contact. Choice of school and area of residence are powerful instruments for restricting that range.Footnote 15

In the long run, there may then be a different limit value of δ for each θ, and the population may thus tend to break down into a number of sharply characterized subpopulations recognizable by their θ. As the unobservable limit value of δ varies with the observable value of θ, we may then find that, not only in the short but also in the long run, some θ-types look after their elderly parents, and some other θ-types do not. Of those who do, some may give their parents more attention than others. Even if there is no causal relationship between religion and taste for filial attention, that could then go some way towards explaining the already mentioned findings of Lowenstein and Daatland (2006), and Klimaviciute et al. (2017), that adults give, on average, more attention to their elderly parents in Mediterranean countries, where the predominant religious groups are Christian Orthodox, Roman Catholic or Jewish, than in North-European countries, where the reformed Christian churches prevail. Other possible explanations are discussed in the next section.
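How θ-specific limit values of δ translate into long-run differences in behavior can be sketched by applying the sufficient condition (14) group by group. This is a minimal illustration, assuming (hypothetically) that all groups share the same β and w and that each group's limit value of δ is already known. The function names, the group labels and the numbers in the usage note are invented for the example.

```python
import math

def rule_sustained(beta, w, delta_limit):
    """Sufficient condition (14) in levels: beta * delta* > e * w."""
    return beta * delta_limit > math.e * w

def compliance_by_trait(beta, w, delta_limit_by_theta):
    """Map each observable trait theta to whether (14) holds for its delta*.

    delta_limit_by_theta: dict from a trait label to that group's
    (hypothetical) limit value of delta.
    """
    return {
        theta: rule_sustained(beta, w, delta_limit)
        for theta, delta_limit in delta_limit_by_theta.items()
    }
```

For instance, with β = 10 and w = 2, a group with limit value 0.8 satisfies (14) while a group with limit value 0.3 does not, so the two θ-types would differ in behavior even in the long run.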

6 Discussion

We have shown that, if a couple’s tastes for filial attention satisfy a certain condition, and the same condition is expected to hold for their children and respective spouses, it is in the couple’s common interest to obey a rule that requires them to give specified amounts of attention to their respective parents. The amount due is increasing in the receivers’ tastes for filial attention, and decreasing in the giver’s wage rate. Having assumed that the taste-for-filial-attention parameter is private information until the couple is actually formed, the size of this parameter cannot be a criterion for forming a couple. Therefore, if individuals were not differentiated by any visible trait other than sex, a couple would be a random draw from the entire population (excluding siblings), and the variance of the preference parameter in question would gradually diminish. In the long run, everybody would then have the same taste for filial attention, and either everybody would give the same amount of attention or nobody would give any. Before getting there, however, some couples might obey the rule, and some might not. As this applies also to members of different generations within the same line of descent, the rule could fall into abeyance for a number of generations, and then come back into force again.

Alternatively, if the population consisted of a number of subpopulations differentiated by a visible trait such as physical appearance, language or religious practice, and this visible trait were thought to be correlated with the invisible trait which determines whether a person has any interest in complying with the rule in question, sampling would be restricted to members of the same subpopulation. In that case, each subpopulation would converge to its own limit value of the unobservable preference parameter. Even in the long run, we could then find that some subpopulations obey the family rule and some do not, and that the amount of filial attention given by those who obey it varies from one subpopulation to another. Given that different countries are home to different mixes of ethnic, linguistic and religious groups, this prediction is consistent with the observation that the average amount of filial attention given differs widely across countries despite thousands of years of cross-migrations, and even within the EU, where movement has been totally free for many decades.

Our approach differs from that of others who also aim to explain how preferences, rules or values evolve across generations in that we focus on the marriage channel, while they emphasize social interaction. It differs from that of Bisin and Verdier (2001), and Tabellini (2008), also in that parents do not need to bear a cost to inculcate what we call a rule, and those authors call values, into their children. A contribution that bears some similarity to ours, even though it is not directly concerned with reproduction, is Alger and Weibull (2013). Those authors assume that preferences have a selfish component, which by itself would lead a person to behave like “homo oeconomicus,” and a “Kantian” one, which by itself would drive a person to “do the right thing” if everyone else did the same. Using the evolutionary stability notion developed in Weibull (1995), those authors show that stable preferences assign the Kantian component a weight equal to the exogenously given degree of assortativity present in the matching process. In other words, the preferences that tend to prevail are those which display the “right” degree of morality given that the matching is assortative in that trait. In our model, by contrast, doing the right thing (taking care of the elderly) can be the equilibrium behavior even if people are entirely selfish, and the game is not repeated.Footnote 16 We regard our approach as complementary to those of others who examine the same or similar issues from different standpoints. In all those contributions, however, the focus is on cooperation among unrelated members of the same generation. Ours, by contrast, is on cooperation between related members of different generations. This different focus allowed us to link together the marriage pattern, the evolution of individual preferences, and the dynamics of family rules.

Whether and how many young people are governed by a family rule is a matter of some policy relevance. Using a slightly more general version of the present model, Cigno et al. (2017) show that wage redistribution reduces the share of the population that is governed by a self-enforcing, renegotiation-proof family rule because it makes the condition for the existence of such a rule more stringent. Taxing the young in proportion to their earnings, and subsidizing the old at a flat rate, as in a Beveridge-style public pension system, would thus reduce the share of the young who give attention to their elderly parents. Barnett et al. (2018) demonstrate that, given the existence of a self-enforcing family rule ensuring that the young give money to their elderly parents as in Cigno (1993), a public pension system can be justified only on redistributive grounds in the presence of heterogeneous agents. The same argument applies with even greater force to public policy towards the old in general if, as in the present paper, family rules ensure the delivery of a good like filial attention that, unlike money, cannot be procured in any other way. In light of evidence in Cigno and Rosati (1996), Cigno et al. (2002), Galasso et al. (2009), Billari and Galasso (2014), and many other empirical articles, that public policy towards the old crowds out family rules,Footnote 17 however, redistribution has an efficiency cost. Governments should carefully evaluate this trade-off before taxing the young to buy personal services for the old.