Evolution of individual preferences and persistence of family rules

How does the distribution of individual preferences evolve as a result of marriage between individuals with different preferences? Could a family rule be self-enforcing given individual preferences, and remain such for several generations despite preference evolution? We show that it is in a couple’s common interest to obey a rule requiring them to give specified amounts of attention to their elderly parents if the couple’s preferences satisfy a certain condition, and the same condition is rationally expected to hold also where their children and respective spouses are concerned. Given uncertainty about who their children will marry, a couple’s expectations will reflect the probability distribution of preferences in the next generation. We show that, in any given generation, some couples may obey the rule in question and some may not. It is also possible that a couple will obey the rule, but their descendants will not for a number of generations, and then obey it again. In the long run, if matching is entirely random, either everybody obeys the same rule, or nobody obeys any. If matching is restricted to particular subpopulations identifiable by some visible trait, such as religion or color of the skin, different subpopulations may obey different rules. The policy implications are briefly discussed.


Introduction
The tenet underlying most of microeconomics until fairly recently was that rational individuals with given preferences and endowments optimize subject only to the law of the land. More recently, economists have started to talk of norms or rules, educational achievements. Preferences, by contrast, may well be private information. In particular, the taste for filial attention may not be revealed until well after the date of marriage. Cigno et al. (2017) assume that preferences are common knowledge at all times. We examine the opposite case where a person's taste for filial attention is private information until a couple is formed. In both cases, the focus is on cooperation between generations, rather than among members of the same generation as in other articles that will be mentioned later.³ This focus gives us insights into the joint dynamics of marriage and preference transmission that would otherwise elude us.
If the taste for filial attention is private information, it cannot be a criterion for marrying one person rather than another. Of course, couples may well be formed on the basis of some observable characteristic (possibly including a person's taste for other goods), but the matching will in any case be random where the taste for filial attention is concerned.⁴ Using a stripped-down version of the model in Cigno et al. (2017), we demonstrate that a couple will still obey a rule requiring them to provide attention to their elderly parents if they expect their children to do the same. The latter is uncertain, however, because the couple know their children's preferences (or rather, they can deduce them from their own), but not the preferences of their children's future partners. Assuming rational expectations, we show that the share of the young who comply with a family rule in any generation is determined simultaneously with the next generation's preference distribution. If all persons of the same sex look the same, and a young person will thus marry any member of the opposite sex with equal probability, everybody will eventually have the same preferences, and either everybody will then obey the same rule, or nobody will obey any. In the short run, however, outwardly identical persons could well have different preferences. Given that the same may apply also to different generations within the same line of descent, a rule could fall into abeyance for a number of generations and then spring back to life again. By contrast, if people are differentiated by some visible characteristic (physical appearance, language, religious practice, etc.) and the matching is assortative in that characteristic, all individuals displaying the same characteristic (but not the rest of the population) will hold the same preferences in the long run. Consequently, either all people displaying the same characteristic will obey the same rule, or none of them will obey any.
Here too, outwardly identical persons could have different preferences in the short run, and it is thus possible that only some of them obey a family rule. Whether and how many people do so is important for policy purposes, because filial attention does not have perfect substitutes, and only a family rule will deliver it to the old. In the closing section, we review evidence that social policy crowds out family rules, and discuss the desirability of such a policy.

Individuals are young in period 1, old in period 2. The young can work and marry, the old can do neither. The young can also give attention to their elderly parents. The total amount of time that a young person has for working and giving attention to parents is normalized to unity.
If a young man and a young woman marry, they have a son and a daughter.⁵ Siblings are not allowed to marry. Let $c_{pi}$ and $a_i^k$ denote, respectively, i's consumption of market goods in period p = 1, 2, and the amount of attention that this person receives in period p = 2 from k = D, S, where D is i's daughter and S is i's son.⁶ Individual i's utility function is given by (1), where $u_i(a_i^k) = 0$ for $a_i^k \le 1/\beta$, and $u_i(a_i^k) = \delta_i \ln \beta a_i^k$ for $a_i^k > 1/\beta$, with $\beta > 1$ and $\delta_i \ge 0$. We can think of $\delta_i$ as a measure of i's taste for filial attention, and of $\beta$ as a sensitivity parameter determining the threshold below which the attention received does not yield utility (a phone call once in a while is no good). Notice that market goods, including the services of professional helpers, are not perfect substitutes for filial attention. Notice also that neither i nor k is altruistic.⁷ Therefore, i will not receive $a_i^k$ as a present. He or she could buy it from k. Given that filial attention does not have a perfect market substitute, however, k would set the price so high that the entire surplus would go to k, and i would be indifferent between buying and not buying. We assume that i does not buy attention from k, but we will demonstrate that it may be in k's interest to obey a rule dictating the amount of attention a young person must give her or his elderly parents in specified circumstances.
When a couple marry, they observe each other's taste for filial attention. Until then, however, i knows only $\delta_i$. Given that this parameter is private information when couples are formed, its value cannot be a criterion for partner choice. At this stage of the exposition, we assume that couples are formed by randomly matching young men with young women. Every young person has the same probability of being matched with any person of the opposite sex who is not her or his sibling. Later in the paper, we shall allow for the possibility that sampling is restricted to a particular subpopulation. In general, raising children will have a monetary cost and take parental time. Given that our focus is on filial (not parental) attention, however, we take these costs to be constant and normalize them to zero.
Consider the couple formed by a particular man f (for father), and a particular woman m (for mother). When the couple is drawn, they may either marry or split (there is no re-sampling). If they split, i = f, m maximizes (1) with $a_i^D = a_i^S = 0$, subject to the period budget constraints, where $s_i$ is the amount saved by i in period 1, r the interest factor and w the wage rate.⁸ The solution to this maximization problem yields the pay-off of singlehood, $\hat{R}$, given in (2).

In general, if f and m marry, they Nash-bargain the allocation of their time and earnings. In the absence of a rule obliging either or both of them to give attention to their respective parents, however, $a_i^D = a_i^S = 0$. Having set the monetary and time costs of children equal to zero, there is then nothing for the spouses to bargain about. The budget constraints facing i = f, m are the same as in the case of singlehood, and the individual pay-off of marriage is again $\hat{R}$. Strictly speaking, therefore, f and m are indifferent between marrying and splitting. We assume that they marry. In Section 4 below, we examine the possibility that it is in either or both spouses' interest to obey the rule mentioned earlier, and look for conditions such that this rule is self-enforcing and renegotiation-proof. Given, however, that these conditions depend not only on i's taste for filial attention, but also on that of i's parents, and of all of i's descendants, we must first study how the distribution of the preference parameter representing this taste evolves across generations. We do so in a context where the number of men and women stays constant (and equal to n for each sex) across generations, because all individuals marry and all married couples have a son and a daughter.

⁵ We assume this simply because our focus is on family rules rather than demography.
⁶ In the Italian Statistical Institute's Indagine Multiscopo sulle Famiglie, "attention" is defined to include personal services such as help with domestic chores, bureaucratic matters and health matters. In the European Commission sponsored five-country (Norway, England, Germany, Spain and Israel) Old Age and Autonomy: The Role of Service Systems and Inter-generational Family Solidarity (OASIS) survey, "attention" similarly includes help with transport/shopping, house repair/gardening, household chores and personal care, but also emotional support.
⁷ Allowing for a modicum of altruism on either side would make the analysis less sharp without altering the results in any substantive way.

Preference evolution
There is evidence that preferences are passed on from parents to children,⁹ but the transmission is not genetic.¹⁰ That being the case, it would be reasonable to assume that $\delta_i$ is a random variable with expected value equal to the mean of the parents' δs. For simplicity, however, we assume that $\delta_i$ is equal to that mean with certainty,

$$\delta_i = \frac{\delta_F + \delta_M}{2}, \tag{3}$$

where F and M denote, respectively, i's father and mother (F and M are then, respectively, D and S's grandfather and grandmother). This simplification makes no difference in the long run. We are interested in studying how $\delta_i$ evolves across generations. Let t = 0, 1, 2, … identify a specific generation. Due to the transmission mechanism (3), generation t may be characterized by a number S(t) of distinct δs. Assume that, in generation t = 0, $n_H < n$ women (and men) are characterized by $\delta = \delta_H$, and $n_L = n - n_H$ by $\delta = \delta_L < \delta_H$. Letting $\pi := n_H/n$, in generation t = 0 there are $\pi n$ women (and men) with $\delta = \delta_H$ and $(1 - \pi)n$ women (and men) with $\delta = \delta_L$. In generation t = 0, the number of values taken by δ is then $S(0) = 2^0 + 1 = 2$.

In the subsequent generations, the number of possible δs increases as a result of marriages between individuals with different δs. In generation t = 1, the possible values of δ are $\delta_L$, $(\delta_L + \delta_H)/2$ and $\delta_H$. Consequently, $S(1) = 2^1 + 1 = 3$. In generation t = 2, they are $\delta_L$, $(3\delta_L + \delta_H)/4$, $(\delta_L + \delta_H)/2$, $(\delta_L + 3\delta_H)/4$ and $\delta_H$. Hence, $S(2) = 2^2 + 1 = 5$. In generation t ≥ 0, the possible values are

$$\delta_t(j) = \frac{(2^t - j)\,\delta_L + j\,\delta_H}{2^t}, \qquad j = 0, 1, \dots, 2^t, \tag{4}$$

and their number is $S(t) = 2^t + 1$. How is the random variable $\delta_t(j)$ figuring in (4) distributed in a given generation t? How does the distribution of $\delta_t(j)$ evolve across generations starting from the initial distribution $(1 - \pi, \pi)$ of $\delta_L$ and $\delta_H$? Appendix section 7.1 demonstrates the following.

⁸ The assumption that w is the same for every i and constant over time is a simplification. In the working paper version of the present article, Cigno et al. (2019), we take i's wage rate $w_i$ to be a random variable. That makes the model more realistic but complicates the formulae without making any difference of substance to the results. In Cigno et al. (2017), the probability distribution of $w_i$ is conditional on the educational investment carried out by i's parents. That makes the model even more realistic, but it has no bearing on the issues addressed in the present article.
⁹ Albanese et al. (2016) find that parental influence does not vanish when children come into contact with the outside world. Ottoni-Wilhelm et al. (2017) report that verbal socialization is more effective than setting a good example.
¹⁰ Bjorklund et al. (2006) find that it works also with adopted children.
Proposition 1. In each generation t ≥ 0, for n sufficiently large, the distribution of $\delta_t(j)$ converges to a binomial, with mean $(1 - \pi)\delta_L + \pi\delta_H$ and variance $\pi(1 - \pi)(\delta_H - \delta_L)^2 / 2^t$.

Corollary 1. As t → ∞, the expected δ held by all agents is $\delta^* := (1 - \pi)\delta_L + \pi\delta_H$.

These two results say that, generation after generation, the population is subject to a "melting pot" process that eventually takes the economy from an initial state with two heterogeneous groups, those with $\delta_L$ and those with $\delta_H$, to a final state with a homogeneous population sharing the same value of δ, $\delta^*$. The transition goes through an infinite number of intermediate states, each of which corresponds to a situation where the δ-values in (4) may coexist according to a binomial distribution, if n is large (see Appendix section 7.1). If, for example, we set $\delta_H = 1$ and $\delta_L = 0$, the long-run value of the preference parameter is $\delta^* = \pi = n_H/n$. But how long is the long run? A sensible way to address this question is to calculate in how many generations t the standard deviation of the binomial distribution of δ will become σ ∈ {0.01, 0.05} for π ∈ {0.1, 0.2, …, 0.5}. The answer is found by solving the equation

$$\frac{(\delta_H - \delta_L)^2}{2^t}\,\pi(1 - \pi) = \sigma^2$$

for t. The value of t associated with each (π, σ) is shown in Table 1. Of course, the long-run value of δ (equal to the mean of the distribution) will vary with π too. The first column of this table says that, if 10% of the population is initially characterized by δ = 1, and the remaining 90% by δ = 0, so that the limit value of δ is 0.1, it will take 5.17 generations for the standard deviation to become equal to 0.05, and another 4.64 generations for it to fall to 0.01.
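As a quick check on the figures quoted from Table 1, the variance formula in Proposition 1 can be inverted for t. The sketch below does that under the same illustrative values used in the text (δ_L = 0, δ_H = 1); the function name `generations_needed` is ours and not part of the paper.

```python
import math

def generations_needed(pi, sigma, delta_low=0.0, delta_high=1.0):
    """Solve (delta_high - delta_low)^2 * pi * (1 - pi) / 2**t = sigma**2 for t."""
    initial_variance = pi * (1.0 - pi) * (delta_high - delta_low) ** 2
    return math.log2(initial_variance / sigma ** 2)

# Initial shares of delta_high types and target standard deviations from the text.
shares = [0.1, 0.2, 0.3, 0.4, 0.5]
for sigma in (0.01, 0.05):
    row = [round(generations_needed(pi, sigma), 2) for pi in shares]
    print(f"sigma = {sigma}: {row}")
```

Since the variance halves every generation, tightening σ from 0.05 to 0.01 always costs the same log2(25) ≈ 4.64 extra generations, whatever the initial share π.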
If generations succeed each other every 25 years, this means that it will take about 130 years for approximately 68% of the population to have a δ between 0.05 and 0.15 (one standard deviation either side of the limit value), and more than 245 years for that same share of the population to have a δ between 0.09 and 0.11 (close to 0.1). The remaining columns show how the convergence slows down, and the limit value of δ rises from 0.1 to 0.5, as the initial share of individuals with δ = 1 rises from one tenth to a half of the total population. It is worth noting that the evolution of the δ parameter over time is irreversible. Once someone inherits a "mixed value" of δ, i.e., $\delta \notin \{\delta_L, \delta_H\}$, the possibility that one of her or his children inherits one of the two initial values $\delta_L$ and $\delta_H$ is lost forever. Consequently, starting from some generation, everybody will have a mixed value of δ. Formally, the following result holds.
Proposition 2. The state of the economy in which no individual displays a δ ∈ {δ L , δ H } is absorbing, and this state is reached in a finite number of generations, almost surely.
Proof. See Appendix section 7.2. □

Proposition 2 states that the initial values of the taste-for-filial-attention parameter δ are expected to disappear in a finite number of generations. Due to random matching, there is a positive probability that the initial values of δ will not be passed on to the subsequent generation. In other words, generation after generation, the number of those who display $\delta_L$ or $\delta_H$ will fall. Since this process is irreversible, once the state where no one displays $\delta_L$ or $\delta_H$ is reached, the economy remains in that state with probability one in all future generations.
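The melting-pot process behind Propositions 1 and 2 can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: δ_L = 0 and δ_H = 1, a finite population, exact arithmetic via `Fraction`, and siblings identified by sharing a list index (in generation 0 that pairing is arbitrary). All function names are hypothetical.

```python
import random
from fractions import Fraction

def random_matching_without_siblings(n, rng):
    """Return a permutation p with p[i] != i, so man i never marries his sister i."""
    while True:
        p = list(range(n))
        rng.shuffle(p)
        if all(p[i] != i for i in range(n)):
            return p

def next_generation(men, women, rng):
    """Each couple has a son and a daughter who both inherit the parents' mean delta."""
    p = random_matching_without_siblings(len(men), rng)
    children = [(men[i] + women[p[i]]) / 2 for i in range(len(men))]
    return children, list(children)  # sons, daughters (siblings share an index)

def simulate(n=2000, share_high=Fraction(1, 10), generations=6, seed=0):
    """Track support size, mean, variance and number of 'pure' types among men."""
    rng = random.Random(seed)
    n_high = int(share_high * n)
    start = [Fraction(1)] * n_high + [Fraction(0)] * (n - n_high)
    men, women = start[:], start[:]
    rng.shuffle(men)
    rng.shuffle(women)
    history = []
    for t in range(generations + 1):
        mean = sum(men) / n
        variance = sum((d - mean) ** 2 for d in men) / n
        pure = sum(1 for d in men if d in (0, 1))  # still holding delta_L or delta_H
        history.append((t, len(set(men)), mean, float(variance), pure))
        men, women = next_generation(men, women, rng)
    return history

for t, support, mean, variance, pure in simulate():
    print(t, support, float(mean), round(variance, 5), pure)
```

In a run of this kind the number of distinct δ values never exceeds 2^t + 1, the population mean stays at its initial value, and the count of individuals still holding one of the two initial values never rises, in line with the irreversibility argument above. The variance only approximately halves each generation because n is finite.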

Family rules
We are now ready to analyze family decisions in the presence of a family rule ensuring that the young give attention to their elderly parents. The reason for focussing on filial attention rather than income support is that the former cannot be obtained in any other way, while the latter can. Where developed countries are concerned, there is indeed evidence that what the elderly actually get is mostly attention; see, for example, Lowenstein and Daatland (2006) and Klimaviciute et al. (2017).
According to a strand of economic literature stemming from Bisin and Verdier (2001), and Tabellini (2008), cooperative behavior arises because well-meaning parents expend resources to instill pro-social values into their children. According to Alger and Weibull (2013), by contrast, innate individual preferences have a selfish and a moral component. Analyzing preference evolution, they show that stable preferences attribute to the moral component a weight equal to the exogenous degree of assortativity in the matching process. As a result, cooperative behavior may prevail. We take a different tack. Our argument is that cooperative behavior may emerge as a self-enforcing rule even if individuals are entirely selfish as assumed in Section 2, and matching is completely random, simply because cooperative behavior is incentive-compatible.¹¹ In our specific model, the cooperation is between members of different generations of the same family, and it consists in the provision of filial attention by the young to the old. The rule in question is thus a family rule.

Table 1 Number of generations needed to reach a distribution of the population with standard deviation σ, given the initial distribution (1 − π, π)

            π = 0.1   π = 0.2   π = 0.3   π = 0.4   π = 0.5
σ = 0.01      9.81     10.64     11.04     11.23     11.29
σ = 0.05      5.17      6.00      6.39      6.58      6.64
The following two definitions formalize what we mean by cooperative behavior and family rule.
Definition 1 (Cooperative behavior). The young give attention to their elderly parents.
This definition partitions the young into two groups: those who give attention to their parents, "cooperators", and those who do not, "non-cooperators". The latter may include "deviators", who do not give attention to their parents when the latter were cooperators, and "defectors", who do not give attention to their parents when the latter were non-cooperators. We are interested in studying whether the following rule can support the cooperative behavior in Definition 1 as a Nash-bargaining equilibrium.
Definition 2 (Family rule). A young person i must provide attention $a_h^i$ to elderly parent h = F, M if the latter is not a deviator.
Note that a young person is not obliged to give attention to a deviating parent. Having assumed that the young do not get direct utility from giving attention (or anything else) for free, however, children will punish deviating parents by giving them no attention. Therefore, the family rule in Definition 2 identifies two behaviors-being a cooperator and being a defector-that do not justify punishment, and one-being a deviator-that calls for punishment. As will be discussed later, two conditions ensure that the family rule defined above sustains the cooperative behavior in Definition 1. First, an individual must be better-off obeying the rule rather than disobeying it. Second, the individual must expect that the rule is incentive-compatible also for all her or his descendants (otherwise, by backward induction, he or she would anticipate that, starting from his or her children, the rule would be disobeyed). In general, cooperators, defectors, and deviators may coexist. In the long run, however, all individuals behave the same: either they all cooperate or they all defect.
In the following subsection, we start by examining the properties of the Nash-bargaining equilibrium under the assumption that married young people give their elders certain specified amounts of attention, and then establish conditions such that a rule requiring married young people to do so is self-enforcing and renegotiation-proof.

Bargaining
Suppose that the (f, m) couple comply with the rule set out in Definition 2, and that they Nash-bargain over the allocation of their time and money. Given that the best alternative to marrying and obeying the rule in question is to marry and disobey it, i's reservation utility is $\hat{R}$, for i = f, m. The Nash-bargaining (NB) equilibrium then solves a constrained maximization problem in which $a_F^i$ is the amount of attention given to i's father F, $a_M^i$ that given to i's mother M, with i = f, m, and T is defined as a transfer from f to m in period 1. Assuming an interior solution (or the rule would be inoperative), the equilibrium is derived in Appendix section 7.3, and the equilibrium expected utilities $\hat{U}_i$ are given in (7).

The NB equilibrium was derived under the assumption that a family rule specifying $a_h^i$ and $a_i^k$, with h = F, M and k = D, S, is in force. There is such an equilibrium for any possible specification of this rule. Among all possible equilibria there may be some that are not Pareto-dominated by any of the others. Only these are renegotiation-proof.¹² This gives us a general criterion for rule selection. A further criterion is to postulate that the prescribed $a_i^k$ will depend on observable variables such as $a_F^i$ and $a_M^i$. Let $a_i^k = g_i(a_F^i, a_M^i)$, with k = D, S, where $g_i \in C^1$ is an increasing function of its arguments. This means that the amount of attention that k ∈ {D, S} provides to the parent i ∈ {f, m} increases with the amount of attention that the parent i has provided to her or his own parents, F and M. The function $g_i$ is the same for both siblings because daughter and son enter the analysis symmetrically, and there is thus no justification for requiring them to give different amounts of attention.
A natural candidate for the function $g_i$, one that is easy to understand, is given in (8), but we show in Appendix section 7.6 that our results are robust to more general functional forms. Each value of γ in (8) identifies a different specification of the family rule, and thus a different NB equilibrium. To find the value of γ that is renegotiation-proof, we maximize (7) with respect to $a_h^i$, h = F, M, subject to (8). The following result holds.
Proposition 3. If an interior solution to the maximization of $\hat{U}_i$ with respect to $a_h^i$ exists, then the renegotiation-proof family rule is such that $\gamma = 1/2$. Consequently,

$$a_i^k = \frac{\delta_i}{w}, \quad k = D, S, \qquad a_h^i = \frac{\delta_h}{w}, \quad h = F, M. \tag{9}$$

Proof. See Appendix section 7.4. □

This proposition says that, for it to be renegotiation-proof, a family rule must require a young person to give each of her or his elderly parents an amount of attention that depends positively on the receiver's taste for this good, and negatively on the giver's wage rate (the opportunity cost of attention). This amount is equal to the mean of those given by the receiver to her or his own elderly parents a period earlier, $a_i^k = (a_F^i + a_M^i)/2$. Substituting (9) into (7) allows us to write the equilibrium expected utilities as in (11). If the wage rate is a constant, as assumed for simplicity in the present paper, each elderly person will then receive the same amount of attention from each child. In reality, however, wage rates may differ across genders, generations and countries. Cigno et al. (2019) prove that the result in (9) is robust to possible randomness of the wage rate. If that is the case, the specification in (9) becomes $a_i^k = \delta_i / w_k$ and $a_h^i = \delta_h / w_i$, where $w_k$ and $w_i$ are the realized wage rates of individual k and individual i, respectively. In particular, the child with the lower opportunity cost will then give more attention than the one with the higher opportunity cost.
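To make the prescription concrete, the following sketch evaluates the rule as described in the text: attention equals the receiver's δ divided by the giver's wage, and it yields utility only above the threshold 1/β. The numerical values and function names are hypothetical, chosen purely for illustration.

```python
def prescribed_attention(delta_receiver, wage_giver):
    """Attention owed under the rule: receiver's taste divided by giver's wage."""
    return delta_receiver / wage_giver

def yields_utility(attention, beta):
    """Attention counts for the receiver only above the threshold 1/beta."""
    return attention > 1.0 / beta

# Hypothetical numbers (not taken from the paper).
delta_F, delta_M, wage, beta = 0.6, 1.0, 2.0, 10.0
delta_i = (delta_F + delta_M) / 2  # i inherits the mean of the parents' deltas

given_to_father = prescribed_attention(delta_F, wage)
given_to_mother = prescribed_attention(delta_M, wage)
received_per_child = prescribed_attention(delta_i, wage)
mean_given = (given_to_father + given_to_mother) / 2

print("i gives:", given_to_father, given_to_mother)
print("i receives from each child:", received_per_child, "mean given:", mean_given)
print("above threshold:", yields_utility(received_per_child, beta))
print("time left for work:", 1.0 - given_to_father - given_to_mother)

# With child-specific wages, the lower-wage child gives more attention.
print(prescribed_attention(delta_i, 2.0), prescribed_attention(delta_i, 4.0))
```

With a common wage, what i receives from each child coincides with the mean of what i gave to her or his own parents, which is the "mean" property stated after Proposition 3.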

Self-enforcing rules
We now establish conditions such that the family rule in Definition 2, prescribing the amount of filial attention (9), is self-enforcing. Given (9), a necessary condition for (f, m) to obey the rule is that they would not be better-off disobeying it. Hence, $\hat{U}_i \ge \hat{R}$ for i = f, m, which, substituting from (11) and (2), becomes condition (12). This condition is obviously satisfied if an NB equilibrium exists, because the equilibrium expected utilities cannot be lower than the reservation utilities. As will be discussed later, for an equilibrium to exist and (12) to be satisfied for at least some couples, it must be the case that $\ln \beta\delta_H > \ln w + 1$. We assume that it is. If $\ln \beta\delta_L > \ln w + 1$ is true too, then (12) will be satisfied for every couple in the economy, and everybody will follow the family rule. If, instead, $\ln \beta\delta_L < \ln w + 1$, there may exist couples who do not satisfy (12) and do not follow the family rule. This may have an important consequence also for couples whose δs are high enough to satisfy (12). Even if this condition is satisfied, the (f, m) couple will in fact obey the rule only if they expect their children to do the same. But this will depend on their children's δs, and on the δs of the children's respective spouses. Since the same applies also to the children's children, and so on to infinity, in order for the rule to be self-enforcing, the (f, m) couple must then expect that a condition analogous to (12) will hold for all their descendants and respective spouses. Again taking (9) as given, we need to require condition (13), where t refers to the generation of the (f, m) couple, and ℓ denotes the number of generations separating the (f, m) couple from their $d_{t+\ell}$ descendants, whose equilibrium utility is denoted by $\hat{U}_{d_{t+\ell}}$. Consider generation t. Condition (13) may hold for one couple and not for another, both formed in period t, depending on the values of their δs. Therefore, some couples may comply with the rule, but some may not. The latter will neither give nor receive attention.
If f and m expect that in some generation t + ℓ their descendants will not satisfy the condition in (13) and consequently not give attention to their parents (belonging to generation t + ℓ − 1) then, by a backward induction argument, f and m will anticipate that all the generations before t + ℓ will not give attention. This means that f and m's children will be expected not to give attention and, consequently, that f and m will not either.¹³ Note, however, that (13) is an expectation. This implies that, even if (13) is satisfied, there may exist a couple belonging to generation t + ℓ for which the realizations of the δs are such that, ex post, condition (12) is not satisfied. Accordingly, even if all the (f, m) couple's descendants up to t + ℓ, and all other couples belonging to generation t + ℓ, abide by the family rule, the couple in question will deviate. Recall that, if (f, m) deviate, their children do not have to give them attention (f and m are deviators) but do not lose the right to receive attention from their own children. Even if the latter were to deviate, however, this would not necessarily imply that none of the deviating couple's descendants will give attention to their parents. As we show next, starting from some generation, the realizations of the δs may be such that all couples will give attention. There is no going back.
Recalling Definition 1, the following result holds.

Theorem 1. For any initial distribution (1 − π, π), if β is sufficiently large, there exists an interval W such that, for any w ∈ W, a generation τ exists such that the family rule in Definition 2 and expression (9) sustain cooperative behavior for all t ≥ τ.
Proof. See Appendix section 7.5. □

In addition to the formal proof, provided in Appendix section 7.5, we lay out below the possible scenarios to convey the intuition behind this result. Start by noticing that the family rule defined in Definition 2 and expression (9) is self-enforcing if both (12) and (13) hold. Given the convergence of δ to the value $\delta^*$ in the long run, it can be proved (see Appendix section 7.5) that a sufficient condition for (12) and (13) to be true, starting from some generation τ, is

$$\ln \beta\delta^* > \ln w + 1. \tag{14}$$

Conversely, if (14) does not hold, individuals can at best be indifferent between obeying and disobeying the rule. Now, fix β for a moment. There are three possible scenarios. If $\ln \beta\delta_L \ge \ln w + 1$, then (14) trivially holds and, as already pointed out, the family rule is adopted by every member of every generation because (12) and (13) are both fulfilled for all generations. If $\ln \beta\delta_H \le \ln w + 1$, then (14) does not hold and the family rule is never obeyed by any individual because (12) and (13) are not fulfilled in the long run. If $\ln \beta\delta_L < \ln w + 1 < \ln \beta\delta_H$, by contrast, condition (14) may hold, depending on the values of β, w, and $\delta^*$ (which in turn depends on the initial distribution (1 − π, π) of $\delta_L$ and $\delta_H$), and the family rule may emerge in the long run. However, for any given initial distribution (1 − π, π), if β is sufficiently large, $\beta > e/\delta^*$ (see Appendix section 7.5), there always exists a wage rate interval W that supports the cooperative behavior described in Definition 1 starting from a certain generation τ.
In Appendix section 7.5, we show that $W := (1, \beta\delta^*/e)$. The lower bound of W ensures that consumption is non-negative in the singlehood maximization problem (Section 2). The upper bound of W is implied by (14). The result in Theorem 1 shows that, for given β and $\delta^*$, high wage rates ($w > \beta\delta^*/e$) cannot support cooperative behavior. This is so because the giver's wage rate is the opportunity cost of attention. High wages make it more profitable to allocate time to labor rather than to the care of parents, the reduction in which is compensated by an increase in the consumption of the market good. Low wages, on the contrary, can make it more profitable to allocate time to the care of parents rather than to labor, depending on how sensitive individuals are to the amount of filial attention received. This sensitivity factor is captured by the parameter β.¹⁴ If β is too small (for example, if it is close to 1), the amount of filial attention required in order to yield positive utility to parents is too costly for the children, who, as a result, will not behave cooperatively. If, instead, β is sufficiently large, then the amount of filial attention required to yield positive utility to parents is smaller and does not generate a large opportunity cost. Consequently, cooperative behavior becomes incentive-compatible.
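The wage interval and the three scenarios discussed above lend themselves to a short numerical sketch. It assumes the comparisons read as follows: everybody obeys when ln βδ_L is at least ln w + 1, nobody does when ln βδ_H is at most ln w + 1, and in between the outcome turns on whether w lies below βδ*/e. Each log condition is rewritten as βδ versus e·w so that δ_L = 0 poses no problem. Function names and labels are ours.

```python
import math

def long_run_delta(pi, delta_low, delta_high):
    """Limit value delta* = (1 - pi) * delta_L + pi * delta_H."""
    return (1.0 - pi) * delta_low + pi * delta_high

def wage_interval(beta, delta_star):
    """Open interval W = (1, beta * delta* / e); None when it is empty."""
    upper = beta * delta_star / math.e
    return (1.0, upper) if upper > 1.0 else None

def scenario(beta, wage, pi, delta_low, delta_high):
    """Classify the cases; ln(beta*d) > ln(w) + 1 is equivalent to beta*d > e*w."""
    bar = math.e * wage
    if beta * delta_low >= bar:
        return "rule obeyed by everyone in every generation"
    if beta * delta_high <= bar:
        return "rule never obeyed"
    if beta * long_run_delta(pi, delta_low, delta_high) > bar:
        return "rule obeyed from some generation on"
    return "rule not sustained in the long run"

# Hypothetical parameter values, for illustration only.
beta, pi, d_low, d_high = 10.0, 0.5, 0.0, 1.0
print(wage_interval(beta, long_run_delta(pi, d_low, d_high)))
for w in (1.5, 2.0, 4.0):
    print(w, scenario(beta, w, pi, d_low, d_high))
```

Note that W is non-empty exactly when β exceeds e/δ*, which is the sense in which β must be "sufficiently large".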
Summing up, in the short run it is possible that some members of a generation do not give attention to their parents, but some of their descendants will. In the long run, assuming as we have done so far that all individuals of the same sex are outwardly identical, either everybody gives attention or nobody does (the policy implications are discussed at the end of the concluding section). In the next section, we look at what happens if this outward homogeneity assumption is relaxed.

Persistence of family rules
The implications of the evolutionary process described in the last two sections can be better appreciated if we look at the consequences of immigration. Consider a population that is originally characterized by a common value of δ, say δ = 0. Without an exogenous shock, this population would remain homogeneous for ever. Let us then suppose that there is a once-for-all influx of immigrants, equal in size to one ninth of the native population, and that all the newcomers are characterized by a value of δ that is different from zero, say δ = 1. According to Table 1, after between five and ten 14 Recall that β determines the threshold below which the attention received does not yield utility to parents. Therefore, in order for the children to give filial attention, it must be a i k > 1 β , with β > 1.
generations, the population will be homogeneous again, and its common characteristics will be very similar to those that were once common to the original inhabitants. In other words, the immigrants will be absorbed by the native population. If the number of immigrants were larger than one ninth, but no larger than a half of the native population (i.e., not so large that the immigrants outnumber the natives), it would take longer for the population to become homogeneous again, and the future inhabitants would not look much like the original ones. In that case, there would be convergence, but not absorption. Whichever is the case, random matching implies that it takes a relatively short time in evolutionary terms (between 130 and 245 years) for a population to become homogeneous again after a wave of immigration. In our model, this implies that either everybody will ultimately obey a family rule, or nobody will. Is that what we observe in reality? Lowenstein and Daatland (2006) find that a majority of adults in Norway, England, Germany, Spain and Israel acknowledge some degree of filial obligation, but both the incidence and the intensity of this sentiment are higher in the two Mediterranean countries, than in the two northern ones. Klimaviciute et al. (2017) report that working-age Greek, Italian and Spanish people spend, on average, more than 33 h a month caring for their elderly parents, while the Danish and the Dutch spend less than 11. This suggests that, despite thousands of years of cross-migrations and, more recently, complete freedom of movement within the EU, matching is not completely random. But why? Cigno et al. (2017) demonstrate that, if preferences were observable, it would be in the interest of a young person whose preferences are compatible with the existence of a selfenforcing, renegotiation-proof family rule, to seek out and marry a like-minded member of the opposite sex. 
As marriage partners would then be assorted according to their preferences, the latter would not evolve, and the share of the population who are governed by a family rule would remain constant. In the last two sections, we examined the opposite case where individual preferences are private information before couples are formed, and there is thus an equal probability of being matched with any member of the opposite sex. But suppose that the distribution of the taste-for-filial-attention parameter δ varies systematically with an observable trait θ denoting, for example, physical type, language or religious practice. If the density function of δ associated with each θ is common knowledge, and the expected value of δ is increasing in θ, a rational young man whose δ is high enough to satisfy (12), provided that his wife's is too, will then restrict his search to young women with the same θ as himself (the same applies to rational young women looking for a suitable husband). Alternatively, if the young are too impulsive to be concerned about what will happen to them in the next period of life, it will be their parents who, in their children's but also their own interest, try to restrict the range of persons with whom their children come into contact. Choice of school and area of residence are powerful instruments for restricting that range. 15 In the long run, there may then be a different limit value of δ for each θ, and the population may thus tend to break down into a number of sharply characterized subpopulations recognizable by their θ. As the unobservable limit value of δ varies with the observable value of θ, we may then find that, not only in the short but also in the long run, some θ-types look after their elderly parents, and some other θ-types do not. Of those who do, some may give their parents more attention than others.
Even if there is no causal relationship between religion and taste for filial attention, that could then go some way towards explaining the already mentioned finding by Lowenstein and Daatland (2006), and Klimaviciute et al. (2017), that adults give, on average, more attention to their elderly parents in Mediterranean countries, where the predominant religious groups are Christian Orthodox, Roman Catholic or Jewish, than in North-European countries, where the reformed Christian churches prevail. Other possible explanations are discussed in the next section.

Discussion
We have shown that, if a couple's tastes for filial attention satisfy a certain condition, and the same condition is expected to hold for their children and respective spouses, it is in the couple's common interest to obey a rule that requires them to give specified amounts of attention to their respective parents. The amount due is increasing in the receivers' tastes for filial attention, and decreasing in the giver's wage rate. Since we assume that the taste-for-filial-attention parameter is private information until the couple is actually formed, the size of this parameter cannot be a criterion for forming a couple. Therefore, if individuals were not differentiated by any visible trait other than sex, a couple would be a random draw from the entire population (excluding siblings), and the variance of the preference parameter in question would gradually diminish. In the long run, everybody would then have the same taste for filial attention, and either everybody would give the same amount of attention or nobody would give any. Before getting there, however, some couples might obey the rule, and some might not. As this applies also to members of different generations within the same line of descent, the rule could fall into abeyance for a number of generations, and then come back into force again.
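The shrinking variance can be illustrated with a short calculation. This is only a sketch under assumptions that go beyond the text: each child is taken to inherit the average of the parents' parameters (as stated for a match between an L-type and an H-type), the population is large, and spouses' parameters are treated as independent draws from a common distribution. The symbols $\mu_t$ and $\sigma_t^2$ (mean and variance of δ in generation t) and $\delta_{m,t}$, $\delta_{f,t}$ (the spouses' parameters) are introduced here for illustration only.

$$\delta_{t+1} = \frac{\delta_{m,t} + \delta_{f,t}}{2} \quad\Longrightarrow\quad \mu_{t+1} = \mu_t, \qquad \sigma^2_{t+1} = \frac{\sigma^2_t}{2}, \qquad \text{so that}\quad \sigma^2_t = \frac{\sigma^2_0}{2^t}.$$

In the immigration example, where one tenth of the post-immigration population has δ = 1 and the rest has δ = 0, these assumptions give $\mu_0 = 0.1$ and $\sigma^2_0 = 0.1 \times 0.9 = 0.09$, so that after ten generations $\sigma^2_{10} = 0.09/1024 \approx 8.8 \times 10^{-5}$ while the mean stays at 0.1, close to the natives' original value of zero.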
Alternatively, if the population consisted of a number of subpopulations differentiated by a visible trait such as physical appearance, language or religious practice, and this visible trait were thought to be correlated with the invisible trait which determines whether a person has any interest in complying with the rule in question, sampling would be restricted to members of the same subpopulation. In that case, each subpopulation would converge to its own limit value of the unobservable preference parameter. Even in the long run, we could then find that some subpopulations obey the family rule and some do not, and that the amount of filial attention given by those who obey the family rule varies from one subpopulation to another. Given that different countries are home to different mixes of ethnic, linguistic and religious groups, this prediction is consistent with the observation that the average amount of filial attention given differs widely across countries despite thousands of years of cross-migrations, and even within the EU where movements have been totally free for many decades.
Our approach differs from that of others who also aim to explain how preferences, rules or values evolve across generations in that we focus on the marriage channel, while they emphasize social interaction. It differs from that of Bisin and Verdier (2001), and Tabellini (2008), also in that parents do not need to bear a cost to inculcate what we call a rule, and those authors call values, into their children. A contribution that bears some similarity to ours, even though it is not directly concerned with reproduction, is Alger and Weibull (2013). Those authors assume that preferences have a selfish component, which by itself would lead a person to behave like "homo oeconomicus," and a "Kantian" one, which by itself would drive a person to "do the right thing" if everyone else did the same. Using the evolutionary stability notion developed in Weibull (1995), those authors show that stable preferences attribute to the Kantian component a weight equal to the exogenously given degree of assortativity present in the matching process. In other words, the preferences that tend to prevail are those which display the "right" degree of morality given that the matching is assortative in that trait. In our model, by contrast, doing the right thing (taking care of the elderly) can be the equilibrium behavior even if people are entirely selfish, and the game is not repeated. 16 We regard our approach as complementary to those of others who examine the same or similar issues from different standpoints. In all those contributions, however, the focus is on cooperation among unrelated members of the same generation. Ours, by contrast, is on cooperation between related members of different generations. This different focus allowed us to link together the marriage pattern, the evolution of individual preferences, and the dynamics of family rules.
Whether, and how many, young people are governed by a family rule is a matter of some policy relevance. Using a slightly more general version of the present model, Cigno et al. (2017) show that wage redistribution reduces the share of the population that is governed by a self-enforcing, renegotiation-proof family rule because it makes the condition for the existence of such a rule more stringent. Taxing the young in proportion to their earnings, and subsidizing the old at a flat rate, as in a Beveridge-style public pension system, would thus reduce the share of the young who give attention to their elderly parents. Barnett et al. (2018) demonstrate that, given the existence of a self-enforcing family rule ensuring that the young give money to their elderly parents as in Cigno (1993), a public pension system can be justified only on redistributive grounds in the presence of heterogeneous agents. The same argument applies with even greater force to public policy towards the old in general if, as in the present paper, family rules ensure the delivery of a good like filial attention that, unlike money, cannot be procured in any other way. In light of evidence in Cigno and Rosati (1996), Cigno et al. (2002), Galasso et al. (2009), Billari and Galasso (2014), and many other empirical articles, that public policy towards the old crowds out family rules, 17 however, redistribution has an efficiency cost. Governments should carefully evaluate this trade-off before taxing the young to buy personal services for the old.
In generation t = 1, there are S(1) = 3 possible values of δ, that is, $\delta_L$, $(\delta_L + \delta_H)/2$ and $\delta_H$. The probability of a match between two L-types, which gives birth to a male and a female with $\delta = \delta_L$, is $\pi_1(0)$. Similarly, the probability of a match between two H-types, which gives birth to a male and a female with $\delta = \delta_H$, is $\pi_1(2)$. Finally, the probability of a match between an L-type and an H-type, which generates two individuals with $\delta = (\delta_L + \delta_H)/2$, is $\pi_1(1)$. Hence, in generation t = 1 there are still n males and n females (grandchildren replace grandparents). For each of these groups, however, $\pi_1(0)n$ individuals will now have $\delta = \delta_1(0) = \delta_L$, $\pi_1(1)n$ individuals will have $\delta = \delta_1(1) = (\delta_L + \delta_H)/2$, and $\pi_1(2)n$ individuals will inherit a value $\delta = \delta_1(2) = \delta_H$.
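The step from one generation to the next can be sketched numerically. The code below is an illustration, not the paper's own derivation. It assumes a large population in which the two spouses' values of δ are independent draws from the current distribution (so the exclusion of siblings and finite-sample effects are ignored), and it assumes that every child inherits the average of the parents' values. The initial shares (nine tenths L-types with δ = 0, one tenth H-types with δ = 1) and the function names are hypothetical, chosen to mirror the immigration example.

```python
from collections import defaultdict
from fractions import Fraction


def next_generation(dist):
    """Map {delta: share} for the parents into {delta: share} for the children.

    Assumption (not from the source): spouses are independent draws from
    `dist`, and each child inherits the average of the parents' deltas.
    """
    children = defaultdict(Fraction)  # Fraction() is zero
    for delta_f, share_f in dist.items():
        for delta_m, share_m in dist.items():
            children[(delta_f + delta_m) / 2] += share_f * share_m
    return dict(children)


def mean(dist):
    return sum(delta * share for delta, share in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum(share * (delta - mu) ** 2 for delta, share in dist.items())


# Hypothetical initial distribution: delta_L = 0, delta_H = 1.
delta_L, delta_H = Fraction(0), Fraction(1)
gen0 = {delta_L: Fraction(9, 10), delta_H: Fraction(1, 10)}
gen1 = next_generation(gen0)  # three possible values, as S(1) = 3
gen2 = next_generation(gen1)  # five possible values

for t, dist in enumerate([gen0, gen1, gen2]):
    print(t, len(dist), float(mean(dist)), float(variance(dist)))
```

Under these assumptions, generation 1 has shares 0.81, 0.18 and 0.01 on the values 0, 1/2 and 1. Exact fractions are used so that values of δ can be compared for equality without rounding error.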

Proof of Proposition 2
To show that the values $\delta_L$ and $\delta_H$ are lost after a finite number of generations, we exploit the properties of the Markov matrix that describes the probabilities for each individual to inherit a "mixed value" of δ across two periods, where a "mixed value" is any value of δ that differs from the initial ones, $\delta_L$ and $\delta_H$.
At each t ≥ 0, let 2h denote the number of individuals (h men and h women) who have inherited a $\delta \notin \{\delta_L, \delta_H\}$, where h = 0, 1, …, n. Given that each individual with a certain δ has a sibling with the same value of δ, we can limit ourselves to tracking the spread of mixed values of δ in the female half of the population, because the same applies also to the male half. At t = 0, there are h = 0 women with $\delta \notin \{\delta_L, \delta_H\}$. At t = 1, there is a nonzero probability that some newborn women have a $\delta \notin \{\delta_L, \delta_H\}$. This probability depends on the initial distribution of δ-values.
Due to (3), the mixing process is irreversible, and the number of persons with $\delta \notin \{\delta_L, \delta_H\}$ cannot decrease. Therefore, the contagion of mixed values of δ continues over time. Again considering only the population of women (that of men evolves symmetrically), its evolution is described by the $(n+1) \times (n+1)$ upper-triangular matrix M, where

$$M = \begin{pmatrix} M_{00} & M_{01} & \cdots & M_{0n} \\ 0 & M_{11} & \cdots & M_{1n} \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & M_{nn} \end{pmatrix}.$$

The generic element $M_{hh'}$ of this matrix represents the probability of moving the population of women from a state where h women have $\delta \notin \{\delta_L, \delta_H\}$ to a state where $h'$ women do. Put another way, $M_{hh'}$ is the probability that $h' - h$ women have lost the initial value of δ, $\delta_L$ or $\delta_H$. Matrix M is upper-triangular because, once someone inherits a mixed value of δ, the possibility that one of her children has one of the two initial values, $\delta_L$ or $\delta_H$, is lost forever. To better understand the meaning of matrix M, consider the first row. $M_{00}$ is the probability that, starting from a population with only $\delta \in \{\delta_L, \delta_H\}$ (i.e., with h = 0), no offspring inherits a mixed value of δ after the first matching (i.e., $h' = 0$). Similarly, $M_{01}$ is the probability that, starting from h = 0, only one woman exhibits a mixed value of δ, $h' = 1$; and so on. Therefore, M is a Markov matrix with only one absorbing state, $M_{nn} = 1$, and this state is reached in a finite time, almost surely.
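The irreversibility of the mixing process and the eventual absorption can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reproduction of the proof: every couple is assumed to have one son and one daughter who both inherit the average of the parents' δ, matching is a uniformly random pairing of women and men, and marriages between siblings are not excluded. The function name, the population sizes and the seed are hypothetical.

```python
import random
from fractions import Fraction


def simulate_mixing(n_low, n_high, seed=0, max_generations=1000):
    """Track h, the number of women with a mixed delta, generation by generation.

    Assumptions (not from the source): delta_L = 0, delta_H = 1, children
    inherit the parents' average, and sibling marriages are not excluded.
    """
    rng = random.Random(seed)
    low, high = Fraction(0), Fraction(1)
    women = [low] * n_low + [high] * n_high
    men = list(women)  # each woman has a brother with the same delta
    history = [0]      # at t = 0 nobody has a mixed value
    for _ in range(max_generations):
        rng.shuffle(men)  # random matching of men to women
        children = [(w + m) / 2 for w, m in zip(women, men)]
        women, men = list(children), list(children)  # one daughter, one son
        h = sum(1 for delta in women if delta not in (low, high))
        history.append(h)
        if h == len(women):  # absorbing state: every woman has a mixed value
            break
    return history


history = simulate_mixing(n_low=45, n_high=5, seed=1)
print(history)
```

In every run the sequence of h is non-decreasing, because a daughter can have a pure value only if both her parents share the same pure value. The number of generations needed to reach h = n depends on the seed and on the initial shares.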

Nash-bargaining with family rules
The utilities of f and m can be written as

$$U_m = w\,[1 - a^F_m - a^M_m - s_m] + \ln r s_m + \delta_m\,[\ln \beta a^m_S + \ln \beta a^m_D] + T$$

$$U_f = w\,[1 - a^F_f - a^M_f - s_f] + \ln r s_f + \delta_f\,[\ln \beta a^f_S + \ln \beta a^f_D] - T$$