Introduction

The quantity and quality of children are of paramount importance for the operation and performance of economies. Determining the volume and quality of labor supplied on factor markets, population growth and human capital formation are key drivers of economic growth and of pivotal relevance for the financing of public pension systems. The quantity and quality of offspring also affect the inter-generational transmission of wealth, income, and education, and mould the functioning of marriage markets. In light of this importance, it is hardly surprising that social scientists have shown great interest in the determinants of child quantity and quality and the relationship between the two.

Testing empirically for the existence and size of a causal link between family size and children’s quality is difficult. The reason is that both child quantity (number of children) and child quality (e.g., educational choices) are subject to parental discretion and thus endogenous and possibly also chosen simultaneously. Empirical studies on the link between family size and children’s quality have addressed this identification problem by making use of exogenous variation in child quantity provided by twin births (Angrist et al. 2010; Black et al. 2005; Li et al. 2008), the gender of newborns (Lee 2008), the height of children (Lee 2012), and birth control policies (Li and Zhang 2017; Liu 2014; Qian 2009). The evidence produced, however, is mixed. Some studies find a negative effect of child quantity on quality, while others find no effect or even a positive effect. In part, this inconclusive evidence may be explained by a focus on developed rather than developing countries. In developed countries, which exhibit more generous welfare systems, any quantity–quality trade-off should be less strong, if not entirely absent (Li et al. 2008). But even for developing countries such as China, which has received growing attention in recent years and arguably provides a more adequate testing ground, the evidence remains mixed. This recent strand of studies for China, however, suffers from a number of methodological shortcomings that cast doubt on the validity and robustness of this literature’s findings. Exploiting for identification China’s One-Child Policy (OCP), the most influential population policy in world history, as an exogenous source of variation in household size, studies in this recent branch of literature suffer from severe measurement error in their key policy variable (individual OCP coverage) and in part also in their child quality outcome considered (educational attainment).

In this paper, we address these shortcomings in the literature and re-examine the relationship between family size and child quality for China. Using household data from the 2000 Chinese census and exploiting for identification variation across time and regions in individual OCP coverage, we produce new evidence based on instrumental variable (IV) regressions on the effect of child quantity on the educational attainment of children of post-compulsory schooling age. We restrict the analysis to households with mothers who have an agricultural background (agricultural Hukou; Footnote 1), as these provide a more adequate testing ground for the same reasons that also justify a focus on developing rather than developed countries. At least for the main body of the analysis, we further restrict the sample to mothers who belong to the ethnic group of Han Chinese, i.e., the largest ethnic group in China, so as to have a more homogeneous sample of mothers under study. Our results show that exogenous reductions in child quantity induced by the fertility restrictions of the OCP substantially increased the educational attainment of children. Various robustness checks we conduct corroborate this finding.

Our paper contributes to the empirical literature on the link between family size and children’s quality in several ways. First, by providing new evidence on the link between family size and children’s education for China, we add to and complement the growing body of empirical literature that focuses on this country. Second, and of importance from a methodological perspective, we introduce a new and continuous instrumental variable for individual OCP coverage, which measures more accurately than the measures used so far in the literature the actual degree to which women were subjected to OCP fertility restrictions during their years of prime fertility. Consistent with economic theories of fertility, this policy variable takes into account, and reflects, the fact that households tend to make life-time decisions on reproduction and investment in child quality (Becker 1960). We construct this instrumental variable from detailed information that we compiled and processed from regional family planning regulations in China’s thirty-one provinces and changes in these regulations over time, as well as from information on women’s prime fertility age, their ethnicity, and their economic background. This new measure of OCP coverage can in the future be fruitfully employed also in other applications, such as the study of tilted sex ratios at birth and their effects on marriage market outcomes or criminal activity. A similar policy construction strategy can be applied to analyze the effect of the termination of the OCP. Finally, we use enrollment (current or past) in post-compulsory education as a measure of child quality, an outcome that is more clearly subject to parental discretion than general school enrollment, which has been used in parts of the literature.

Background

One-Child Policy (OCP)

The One-Child Policy (OCP) was introduced by the Chinese central government as a means to curb rapid population growth, a step deemed necessary to avoid shortages in the supply of food and housing and to aid the country in its transition to a modern economy. The OCP did not mark the beginning of centralized family planning and birth control efforts in China. In fact, first steps in this direction date back to the 1950s (Footnote 2). The year 1979, however, marks a historic watershed in Chinese family planning policy. In that year, several provinces (but not all) introduced in their territory what came to be known as the One-Child Policy, among them Beijing, Tianjin, Shanghai, and Jiangsu. Prior to that date, public policies had merely advocated the virtues of low fertility and encouraged birth control. Now, governments in these provinces explicitly prescribed low fertility targets for couples and enforced these targets with the help of severe financial fines in case they were breached (Footnote 3).

Although formally announced in 1979 and meant to apply to the whole country, the OCP was therefore de facto implemented only piecemeal and at first only in selected Chinese provinces. The OCP was also not uniform in the fertility restrictions it imposed across couples of different ethnic and economic background. Exemptions for minorities, those with an agricultural background (agricultural Hukou), and parents with both an agricultural background and a first-born girl were introduced in many provinces in the 1980s and 1990s, albeit at different times. Furthermore, in the 1990s, the first children of families that had already been covered by the OCP reached marriageable and fertile age. Most provinces permitted couples to have a second child if both spouses had been born as a single child to their parents. Further exemptions were introduced in late 2013, again only in some provinces, which permitted a second child if at least one spouse had been a single child. In 2016, the OCP was officially terminated by allowing all couples, irrespective of their ethnic, economic, and regional background, to henceforth have two children.

The afore-sketched history of the OCP, its implementation and evolution, makes clear that, during the course of its term, the OCP was not homogeneous across provinces, couples, and time, but rather a changing complex conglomerate of time-variant and province-specific regulations and exemptions that in practice entailed great diversity both in the degree of the policy’s coverage and in its bite. The empirical literature on the quantity–quality trade-off in China (Li and Zhang 2017; Liu 2014; Qian 2009), and studies investigating other outcomes, such as sex ratio imbalances (Bulte et al. 2011; Li et al. 2011), which exploit the OCP for identification, generally fail to take into account this heterogeneity of the OCP. In the next section, we will discuss this shortcoming in detail and review the existing literature on the link between family size and child quality.

Previous Literature

Existing studies on the causal effect of child quantity on quality employ a variety of identification strategies and consider different countries. One source of exogenous variation in child quantity exploited in the literature is the gender of first-borns (Lee 2008) or the sex composition of siblings (Angrist et al. 2010; Conley and Glauber 2006). In societies that exhibit a preference for sons, families with a first-born girl tend more towards having a second child, and among parents with a preference for gender heterogeneity among their offspring, those with two children that are of the same sex are more likely to seek a third child. Using the gender of the first child as an instrument for child quantity in 2SLS regressions, Lee (2008) studies parental investment in child education in South Korea. Lee finds evidence for a trade-off between the quantity and quality of children, a trade-off that becomes more pronounced as a family’s sibling size increases. Exploiting variation in the sex composition of the first two children, and using 1990 U.S. Census data, Conley and Glauber (2006) find that sibling size has a negative effect on the likelihood of attending a private school and a positive effect on the grade retention for second-born boys. Using the same instrument, but data from the 20% microdata samples of the 1995 and 1983 Israeli censuses, Angrist et al. (2010) in contrast find no evidence for a quantity–quality trade-off in Israel. The use of information on the gender of children for identification in these studies, however, is not unproblematic, as the spread of ultrasound technology in the 1980s has made prenatal identification and selection of the sex of fetuses viable. This potential endogeneity of a child’s gender casts doubt on the validity of this IV (Li et al. 2008), at least for more recent birth cohorts.

A second, also prominent, and early source of exogenous variation in child quantity used in the literature are twin births. Exploiting twin births for identification, Rosenzweig and Wolpin (1980) study the effect of family size (number of offspring) on the educational attainment of children in India. Consistent with a quantity–quality trade-off, the authors find a larger family size to adversely affect the average educational attainment of children. Li et al. (2008) also find a negative effect of family size, identified by a twin birth, on the educational attainment of children in China. The same holds true for Glick et al. (2007), who, using data from the Romania Integrated Household Survey, show that unplanned fertility (through a twin birth) has a negative impact on children’s nutrition and schooling. Similarly, Rosenzweig and Zhang (2009), using data from the Chinese Child Twins Survey, find a negative effect of an increase in family size through a twin birth at child parity one or two on the school performance and self-assessed health of children. Using data for Norway and employing standard OLS regression analysis, Black et al. (2005) also find an additional child to reduce the average educational attainment of children in a household. However, they produce evidence which shows that this effect becomes significantly smaller, once family background characteristics are controlled for, and that it disappears altogether when birth order is accounted for in the analysis. Furthermore, using twin births as an IV in 2SLS regressions, Black et al. (2005) find family size to have only negligible effects on the quality of children. For several reasons, however, the use of twin births as an instrument (like the afore-discussed gender of a child) is not unproblematic. First, as noted in Black et al. (2005), their use tends to bias 2SLS towards producing evidence in support of a trade-off between quantity and quality of offspring. 
Since the spacing between twin births is zero, parents may shift more resources towards non-twin children which causes bias in estimates of the quantity–quality trade-off (Rosenzweig and Zhang 2009). Second, the birth weight of twins is lower than that of non-twins, which can also directly affect the outcome of children. Finally, with the onset and spread of assisted reproductive technology that carries the risk of elevated twinning rates, a twin birth no longer needs to constitute an exogenous event beyond the control of parents, but becomes potentially subject to endogenous parental choices and hence self-selection of parents.

A third, and more recent source of exogenous variation in child quantity exploited in the literature is public policy, in particular the One-Child Policy (OCP) in China which limited (albeit at different times in different regions and for different groups) the maximum number of children that households could have to one. Focusing only on rural China and using a 1% sample of the Chinese 1990 census and county-level data from the 1989 China Health and Nutrition Survey (CHNS), Qian (2009) exploits as an instrument for family size the regional variation in the exemption of parents from the OCP when they have a first-born girl to study the effect of sibling numbers on the school enrollment of first-born children. Qian finds no evidence for a negative effect of child quantity on quality. The study, however, uses county-level OCP exemption information only from 1989 and ignores possible other exemptions, both concurrent and prior to 1989, that could impact the fertility behavior of women over the course of their fertile age. Moreover, a number of children considered in the analysis of Qian (2009) are still in compulsory education and of compulsory schooling age, where parental discretion in schooling choices is limited, if not completely lacking. Liu (2014), in turn, mainly uses data from the 1993 CHNS and exploits exemptions from the OCP as well as regional variations in the level of fines imposed for unsanctioned births as an IV. The findings of this study suggest a significant negative effect of number of siblings on child quality, as measured by a height-for-age z-score. However, OCP exemption status and fines are sampled only for three years, 1989, 1991, and 1993, again ignoring earlier potential exemptions (or restrictions) affecting female fertility over the course of women’s fertile age. 
Finally, Li and Zhang (2017), exploiting regional differences in OCP enforcement intensity as an instrument for family size and using data from the Chinese censuses of 1982 and 1990, find a negative effect of family size on the educational attainment of first-born Han children. Their variable of policy enforcement intensity, measured by an excess fertility rate, is defined as the percentage of all Han mothers aged 25–44 with at least one surviving child who gave a higher-order birth (2nd or higher) in 1981. This definition of Li and Zhang (2017), and hence their underlying identification strategy, is therefore not based on actual policy regulations, their measurement and quantification, but on the factual realization of births, which is highly problematic, as realized births are subject also to parental discretion and hence the influence of parental preferences.

Apart from the afore-mentioned studies by Qian (2009), Liu (2014), and Li and Zhang (2017), the OCP has been used also as an exogenous source of variation in studies investigating outcomes other than the quantity–quality trade-off. Bulte et al. (2011), for instance, examine the role of the OCP for the extremely male-biased gender ratio in China, a country with strong son preferences. In their analysis, they use only the birth year of a child to identify children born to parents covered by OCP regulations. The exclusive distinction between children born before/in or after 1979 is a very rough measure of parental exposure to OCP regulations that ignores entirely the variation across provinces in the introduction of the OCP. Since they only assume that ethnic minorities are exempt from the OCP throughout all provinces in China after 1979, the various exemptions granted to specific ethnic and economic groups across provinces and across time are also ignored (we discuss provincial family planning regulations in detail in “OCP Regulations and Exemptions” section). A quite similar dichotomous measure, but one that is also far from perfect, is used by Li et al. (2011) in their difference-in-differences based analysis of the effect that the OCP had on the sex ratio at birth in China. They define a child to be born under OCP regulations if the child is of Han ethnicity and born after 1979. As discussed above, ethnic minorities, however, were not always exempted from the OCP, nor were Han always restricted by the OCP. Furthermore, only few provinces actually implemented the OCP already in 1979. Overly simplistic classifications, as the ones employed in these studies, hence entail sizable measurement error in the actual OCP coverage of individuals which may significantly bias estimates.

Data and Empirical Strategy

OCP Regulations and Exemptions

We compiled a detailed summary of the OCP with its province-specific introduction times, regulations, and exemptions, drawing on numerous publications and directives that describe the different family planning regulations enforced over time in China’s provinces (Footnote 4). A summary of this comprehensive policy review is tabulated in Table 1. Covering the period from the inception of the OCP in 1979 through to the year 2000, Table 1 provides information for each province (column (1)) on the year the OCP was first implemented (column (2)) and any periods of years in which certain types of households were exempted from the obligation to bear at most one child (columns (3)–(6)). These households fall into four main types (Footnote 5). First, households in which both spouses have an ethnic minority background (column (3)). Second, households in which at least one spouse has an ethnic minority background (column (4)). Third, households in which both spouses have an agricultural Hukou (column (5)). And finally, households in which both spouses have an agricultural Hukou and also a first-born girl (column (6)). The last exemption is sometimes referred to as the 1.5-child policy (Ebenstein 2010; Yang 2012). Altogether, we consider thirty-one provinces (Footnote 6).

Table 1 OCP regulations and exemptions across provinces and time

As can be seen from Table 1, there is great variation across provinces and across time within provinces in OCP exemptions granted to specific types of households. There is also great heterogeneity across provinces in the year they first implemented OCP fertility restrictions. The OCP did not start in 1979 in all of China, as assumed in parts of the literature and used therein as a cut-off date to define OCP treatment in the empirical analysis (see discussion below). In fact, only a minority of provinces implemented the OCP already in 1979 (Footnote 7). Moreover, after 1979, some provinces were newly formed, or dissolved and integrated into other provinces, so residents of these provinces were covered by different OCP regulations before and after such administrative territorial restructuring (Footnote 8).

The complexity, time-varying nature, and great regional diversity of OCP regulations documented in Table 1 have been largely ignored in existing empirical work, or taken into account only partially. For instance, Bulte et al. (2011) and Li et al. (2011) assume that the OCP took force in 1979 throughout all of China and that it applied undifferentiated and with universal coverage to the entire Han population. Clearly, neither was the case. They also wrongly assume that exemptions from the OCP existed in all provinces, i.e., throughout China, for all minorities in all years after 1979. Moreover, these studies disregard other exemptions from OCP fertility restrictions that have been granted in different provinces to different groups at different times. Qian (2009), who focuses only on first-born children from rural areas in four out of the 30 provinces in China in 1990 (Liaoning, Jiangsu, Shandong, and Henan), also considers but a single type of exemption from OCP regulations, the exemption for agricultural households with a first-born girl. Focusing on but one exemption again fails to do justice to the restrictions households actually faced in their fertility behavior during their fertile years. Liaoning province, for instance, had a very large minority population at the time, parts of which were exempted from OCP regulations even when both spouses were not agricultural or households did not have a first-born girl. Furthermore, in the study by Qian (2009), exemptions for agricultural households with a first-born girl are recorded only in a single year immediately prior to 1990, i.e., in the year 1989. Liu (2014), in turn, who also studies only a subset of Chinese provinces, considers different types of exemptions, as well as fines for violations of OCP regulations, to construct instrumental variables for the number of siblings in a household in 1993. 
In the study, household fertility is assumed to be fully unrestricted by OCP regulations if the household could enjoy an exemption in at least one year in 1989, 1991, or 1993. However, this narrow definition ignores that households may have been subject to quite different OCP regulations before 1989, governing part or most of their fertile years and hence reproductive behavior. Finally, the study by Li and Zhang (2017) considers an excess fertility rate, which is defined as the share of Han mothers of primary childbearing age who gave a higher-order birth in 1981. The assumption that all higher-order births to Han mothers in 1981 were not permitted under the OCP, however, is wrong. In Hunan province, for example, OCP regulations were implemented only in 1982, a year after the stock-taking year chosen to define the excess fertility rate. As a consequence, all births in Hunan in 1981 that are defined as “excess births” in the analysis are effectively mis-classified. Furthermore, taking reference to but a single calendar year (1981) ignores the time-varying nature and great regional diversity of OCP regulations and exemptions that in practice governed household fertility (over its fertile life-time) in China.

Heterogeneity in the introduction and modification of OCP regulations at provincial level and variation across households (in calendar time) in the female fertile life span imply that women covered at some point by OCP regulations may exhibit great differences in the degree to which their life-time fertility was de facto subjected to OCP fertility restrictions. OCP treatment, in short, is far from dichotomous in nature (complete vs. no coverage) but may assume different intensities. Modeling such different intensities of treatment requires a continuous measure of individualized OCP coverage or treatment. The share of female fertile life-time subjected to OCP regulations provides such a measure. Ranging from zero (not restricted at all) to one (complete life-time fertility span restricted), such a life-time-based treatment definition is also more in line with Becker’s original formulation of the quantity–quality model, where children are considered a durable consumption and production good, and households are to make life-time decisions (or life-time plans) on reproduction, child quality investments, and own consumption (Becker 1960). In the literature, however, dichotomous measures of OCP coverage have been generally used, based, for example, on whether or not OCP regulations were in force at the particular point in time a woman gave birth (see discussion above). Such dichotomous measures are clearly inadequate to capture the actual degree to which female reproductive capacity was constrained by OCP family planning policies.

Household Census Data

The second type of data we use is a 0.095% random sample of households surveyed in the 5th Chinese Census in the year 2000 (National Bureau of Statistics 2000). Several features of the 2000 census are advantageous, if not vital, for an analysis of the quantity–quality trade-off in China. First, the census contains information on an individual’s schooling level from which we can reconstruct the post-compulsory educational choices of children. Second, the census contains information on households from 31 provinces in China, rather than only a subset of (possibly selective) regions, as considered in parts of the literature (Liu 2014; Qian 2009). This allows us to consider the whole of China in the analysis and to exploit more fully the great heterogeneity and variation in OCP regulations across time, provinces, ethnicities, and household types. Third, the census provides information on the total number of children a woman has borne and raised irrespective of whether these children still reside at the parental home on the census day (i.e., a measure of total child quantity, rather than an undercount that is possibly selective). Finally, the census records the ethnicity of each person (not only whether a person is Han or not), which permits us to consider specific exemptions from the OCP that apply only to particular minorities in the construction of our key policy variable, the intensity of exposure of a woman during her fertile years to the fertility restrictions of the OCP.

Quality of Children. We measure child quality by a dichotomous variable that takes value one if a child has completed (or is currently enrolled in) post-compulsory education on the census day, and zero otherwise. Compulsory schooling in China includes primary school and junior secondary school education, which together amount to nine years of schooling. As children attend primary school from age six, they complete compulsory schooling at age 15 (Footnote 9). After compulsory education, children may continue with senior secondary school education or other forms of post-compulsory schooling. Post-compulsory schooling choices are subject only to parental discretion, that is, they constitute a parental choice variable unfettered by public schooling laws. As such, they are better suited to proxy parental child quality investment than coarser measures that have been used in parts of the literature on China (Qian 2009), such as total years of schooling or school enrollment, which consist mostly of compulsory schooling. Post-compulsory education is also of increasing importance in China for job search, pay levels, and rural-to-urban migration. Returns to education, in fact, are higher in less-developed and low-income regions (Johnson and Chow 1997; Li 2003; Zhao 1997).

Quantity of Children. We measure child quantity by the number of siblings a child has. The number of siblings equals the total number of children a child’s mother has borne and raised less one. There are hence zero siblings in a single-child household, and a single sibling in a two-children household.
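The two variable definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the schooling-level labels and their ordering are hypothetical stand-ins (the census coding is not reproduced in the text), and the function names are our own.

```python
# Hypothetical, ordered schooling levels; the actual census codes are not
# given in the text and would have to be mapped onto such an ordering.
SCHOOLING_LEVELS = [
    "none",
    "primary",
    "junior_secondary",   # end of the nine years of compulsory schooling
    "senior_secondary",   # first post-compulsory level
    "college_or_above",
]
COMPULSORY_CEILING = SCHOOLING_LEVELS.index("junior_secondary")


def child_quality(schooling_level: str) -> int:
    """Return 1 if the level completed or currently attended is post-compulsory."""
    return int(SCHOOLING_LEVELS.index(schooling_level) > COMPULSORY_CEILING)


def number_of_siblings(children_borne_and_raised: int) -> int:
    """Total number of children of the mother, less the child itself."""
    if children_borne_and_raised < 1:
        raise ValueError("the mother of a child has at least one child")
    return children_borne_and_raised - 1


if __name__ == "__main__":
    print(child_quality("senior_secondary"), number_of_siblings(2))
```

Working from the mother's total number of children, rather than from the children observed in the household, mirrors the point made in the text that children who have left the parental home should still be counted.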

OCP Coverage. The policy variable we use to quantify the intensity with which female reproductive capacity is restricted by OCP regulations is defined as the share of prime fertility years of a woman that are subject to OCP regulations:

$$\begin{aligned} OCP = \frac{\#\text { of years covered by OCP during woman's prime fertility}}{\#\text { of years of woman's prime fertility}}. \end{aligned}$$

Ranging from zero (no coverage) to one (complete coverage), this measure of OCP coverage is a function of several factors: female age in different calendar years, the ethnicity and household Hukou type of a woman, and the province a female resides in. Province information is vital, because province of residence determines when a female was in fact first subjected to OCP regulations, and which kind of exemptions she could potentially enjoy at certain times throughout the course of the OCP and her fertile life span. In our baseline specification, we consider the prime fertility age of women to lie between 21 and 35 years of age. This choice is motivated by several factors. First, women in China must be at least 20 years old to marry. Second, descriptive explorations for women aged 49 to 50 in our random sample of the 5th Chinese Census in 2000 (i.e., women born in 1950 or 1951 who have just completed their fertility by the time of the census) reveal that \(86.4\%\) of their children were born when these mothers were aged 21 to 35, and \(95.05\%\) were born when they were aged between 21 and 40. The overwhelming majority of births hence occurred when mothers were aged 21 to 35 or 21 to 40, respectively (we will consider the latter and broader age span of mothers in one of our robustness checks). Third, plots of the age distribution of mothers who gave birth in 1986, 1990, 1995, or 2000 show that the overwhelming majority of women who gave birth in any of these years were aged between 21 and 35 (see Figure A-1 in the Appendix). Note that for defining our OCP coverage variable, we make exclusive use of information on females but not males (their husbands). This restriction is motivated by the possibility that marriages may be selectively formed to enjoy certain exemptions from OCP regulations by marrying a man who is eligible for exemptions, e.g., because of his minority status. 
We also disregard information on the gender of a first-born child when we model the 1.5-child policy (i.e., a household is exempt from the OCP if both spouses have an agricultural Hukou and their first child is a girl). The reason for doing so is again potential endogeneity, now in fertility choices, and possibly also in the determination of the gender of first-borns (Banister 2004; Dyson and Moore 1983; Feldman et al. 2007; Li and Zheng 2009).
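The coverage share defined above can be computed with a short routine. The sketch below rests on several assumptions that are ours, not the paper's: a woman's age in a calendar year is taken as that year minus her birth year, the OCP is treated as in force in every year from the provincial start year onward, and exemption periods are passed in as inclusive year ranges. The start year and exemption period in the usage lines are hypothetical and are not taken from Table 1.

```python
from typing import Iterable, Optional, Tuple

PRIME_AGE_MIN = 21  # baseline prime fertility window stated in the text
PRIME_AGE_MAX = 35


def ocp_coverage(
    birth_year: int,
    ocp_start_year: Optional[int],
    exempt_periods: Iterable[Tuple[int, int]] = (),
    age_min: int = PRIME_AGE_MIN,
    age_max: int = PRIME_AGE_MAX,
) -> float:
    """Share of prime fertility years in which the woman is restricted by the OCP.

    ocp_start_year: first year the OCP applies in her province (None = never).
    exempt_periods: inclusive (first_year, last_year) ranges in which her
                    household type is exempt from the one-child limit.
    """
    periods = list(exempt_periods)
    # Assumption: age in a calendar year equals year minus birth year.
    prime_years = range(birth_year + age_min, birth_year + age_max + 1)
    covered = 0
    for year in prime_years:
        if ocp_start_year is None or year < ocp_start_year:
            continue  # policy not yet in force in the province
        if any(first <= year <= last for first, last in periods):
            continue  # household exempt in this year
        covered += 1
    return covered / len(prime_years)


if __name__ == "__main__":
    # Hypothetical province with an OCP start in 1980.
    print(ocp_coverage(1955, 1980))                   # no exemption
    print(ocp_coverage(1955, 1980, [(1986, 1990)]))   # exempt from 1986 onward
```

For a woman born in 1955, the prime fertility window under these assumptions spans the calendar years 1976 to 1990. With a start year of 1980 she is restricted in 11 of 15 years; an exemption from 1986 to 1990 lowers this to 6 of 15 years. A dichotomous measure would assign both cases the same treatment status.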

Our OCP coverage variable is a mother-based continuous variable that uses information on province-specific introductory times of OCP regulations, province-specific OCP exemptions for minorities and/or individuals with an agricultural Hukou, and household background information on the ethnicity, agricultural Hukou status, and fertile age of mothers. In the literature, however, dichotomous child-based measures for OCP treatment are used that consider mostly but one type of exemption, either for ethnic minorities or for individuals with an agricultural Hukou, and that ignore wholesale the complex conglomerate of province-specific and time-variant OCP regulations and exemptions. Such overly simplistic measures of OCP coverage are highly problematic. We show in the Appendix that failure to account for either household agricultural Hukou (but not ethnicity) or the plethora of OCP regulations in all their diversity (as summarized in Table 1) does entail severe mis-measurement in the degree to which households were actually restricted in their fertility by the OCP.

Using the information on OCP regulations and exemptions shown in Table 1 and the construction formula of the policy variable discussed above, other researchers can generate this new and continuous OCP measure if they have access to data on females’ birth year, place of residence (province), ethnicity, and Hukou type (agricultural or non-agricultural), which are usually available from census or survey data for China. The Appendix provides some examples of how to generate this policy variable. This continuous measure of individual OCP coverage can also be fruitfully employed in other applications for China in future research, for instance to study imbalanced sex ratios, marriage market patterns, or crime (Footnote 10).

Other Covariates

In addition to our key explanatory variable, the quantity of siblings of a child, we will in part of our analysis control also for other potential determinants of child quantity and quality. These are (1) province fixed effects to account for time-invariant differences across regions in average fertility levels and educational attainment; (2) sets of indicators for mother and child age to control for aggregate cohort effects across provinces on parental schooling investment, parental reproductive behavior, and child school attainment; and (3) mothers’ and fathers’ educational attainment (measured again, respectively, by indicator variables for post-compulsory school attendance) to control for potential endowment and preference effects of parental background on parental child quantity and quality choices.

Estimation Sample and Summary Statistics

For the main body of our analysis, we will, as argued, restrict our estimation sample to the more homogeneous group of Han Chinese (mothers) who have an agricultural Hukou (we will, however, consider all ethnicities in a robustness check). For identification, we exploit variation in individual OCP coverage intensity that comes from within-province age group variation across time in the exposure of women of fertile age to OCP regulations and exemptions (granted to those with an agricultural Hukou). Individuals in this restricted sample still account for the majority of the Chinese population and mainly live in less-developed areas, where the quantity–quality trade-off (if any) should be more pronounced. Our final estimation sample consists of 67,953 children from 46,814 households. Table 2 provides summary statistics for this estimation sample.

Table 2 Summary statistics for estimation sample

Empirical Strategy

To identify the effects that exogenous variations in child quantity induced by OCP fertility restrictions had on child quality (as measured by post-compulsory schooling attendance), we estimate 2SLS regressions of the following type:

$$\begin{aligned} Q_i= \,& {} \delta _0+\delta _1\widehat{N_i}+\varvec{\delta _2X_i}+\varepsilon _{i2} \quad \quad \qquad \,\, \text {(2nd stage)}\\ N_i=\, & {} \theta _0+\theta _1\text {OCP}_i+\varvec{\theta _2X_i}+\varepsilon _{i1} \qquad \quad \text {(1st stage)} \end{aligned},$$

where \(Q_i\) is the quality of child i, a dichotomous dependent variable for post-compulsory schooling attendance, \(N_i\) is the number of siblings of child i, and \(\varvec{X_i}\) is a vector of characteristics of child i, its parents, and its household. \(\varvec{X_i}\) includes a set of dummies for child i’s age, its mother’s age, and the household’s province, as well as two indicators for parental education, one for post-compulsory school attendance of the mother, and one for post-compulsory school attendance of the father. OCP\(_i\), our first-stage instrumental variable, measures the degree of OCP coverage of child i’s household and ranges from zero (no coverage) to one (complete coverage of mother’s prime fertility years). Finally, \(\varepsilon _{i1}\) and \(\varepsilon _{i2}\) are error terms.
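The mechanics of the two stages can be sketched in a few lines of Python with NumPy. This is a minimal illustration on synthetic data, not the estimation code used for this paper: the function `two_stage_least_squares`, the data-generating coefficients, the single continuous control, and the continuous outcome (in place of the dichotomous \(Q_i\)) are all assumptions made for the example.

```python
import numpy as np


def two_stage_least_squares(q, n, z, x):
    """Manual 2SLS with one endogenous regressor and one instrument.

    q: outcome, n: endogenous regressor, z: instrument, x: exogenous controls (2-D).
    Returns (theta, delta); in both vectors index 0 is the constant and index 1
    is the instrument (first stage) or the fitted regressor (second stage).
    """
    const = np.ones((len(q), 1))
    first_X = np.column_stack([const, z, x])
    theta, *_ = np.linalg.lstsq(first_X, n, rcond=None)    # first stage
    n_hat = first_X @ theta                                 # fitted values of n
    second_X = np.column_stack([const, n_hat, x])
    delta, *_ = np.linalg.lstsq(second_X, q, rcond=None)   # second stage
    return theta, delta


# Synthetic data: u is an unobserved confounder that makes n endogenous.
rng = np.random.default_rng(0)
size = 50_000
x = rng.normal(size=(size, 1))
z = rng.uniform(0.0, 1.0, size=size)    # stand-in for a coverage measure in [0, 1]
u = rng.normal(size=size)
n = 2.0 - 1.0 * z + 0.2 * x[:, 0] + u + rng.normal(size=size)
q = 1.0 - 0.2 * n + 0.1 * x[:, 0] + 0.5 * u + rng.normal(size=size)

theta, delta = two_stage_least_squares(q, n, z, x)
print("first-stage coefficient on z:", theta[1])
print("second-stage coefficient on n:", delta[1])
```

Running the second stage by hand reproduces the 2SLS point estimates, but the naive second-stage standard errors are not valid because they ignore that \(\widehat{N_i}\) is itself estimated; a dedicated IV routine would be needed for inference.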

Identification in our 2SLS setting requires that our instrumental variable OCP\(_i\) is correlated with the potentially endogenous child quantity measure \(N_i\) (instrument relevance) but uncorrelated with the error term \(\varepsilon _{i2}\) in the second-stage outcome equation (instrument exogeneity). The first requirement is testable and can be shown to hold. As we will see when discussing our regression results in the “Results” section, OCP\(_i\) and \(N_i\) are highly correlated in our data. The second requirement, while not testable, is likely to be satisfied. Our instrumental variable is arguably exogenous, since we control for potential confounders, such as province of residence, mother and child age, as well as parental education.Footnote 11 Province fixed effects control for level differences across provinces in, e.g., local preferences for sons and children, average child quantity and quality levels, income levels, provision of public education, size of agricultural and minority populations, average OCP intensities, and provincial preferences regarding OCP regulations and exemptions. As we also control for fixed effects in parental education, as well as mother and child age, we effectively exploit for identification only within-province variation in our instrumental variable that is related to mothers’ age span of fertility.Footnote 12 Note that, for individuals, it is virtually impossible to change either Hukou type or ethnicity so as to enjoy certain exemptions from the OCP. Furthermore, systematic household migration across provinces to avoid unfavorable provincial restrictions on household fertility is also unlikely to pose a threat to identification in our setting. The scale of cross-province migration in our estimation sample is very low.
Using province information on an individual’s place of current residence (in 2000) and birth shows that \(96.49\%\) of mothers in our estimation sample (accounting for \(96.41\%\) of all children under study in our analysis) still resided in their province of birth in the year 2000.

Results

Main Results

As we focus in our main analysis on households of mothers who are Han and have an agricultural Hukou, variation in the extent of individual OCP coverage in our estimation sample comes from three sources only: the age of a mother (determining her fertile life span in calendar time), the calendar year in which OCP regulations were first introduced at province level, and the timing and degree of exemptions from OCP regulations granted at province level to individuals with an agricultural Hukou. Our main OLS and 2SLS results for different variants of the regression specification described in the “Empirical Strategy” section are shown in Table 3. Throughout, standard errors are clustered at the household level.

Table 3 OLS and 2SLS estimates of the effect of sibling size on the post-compulsory education of a child

Columns (1) and (2) in Table 3 report results from OLS and 2SLS regressions of child quality on child quantity, where we consider as additional regressors only mother age (in three groups, 35–40, 41–45, and 46–50) and a set of dummy variables for the different Chinese provinces. Mother age (in groups) controls for cohort effects, such as differences in preferences or average economic conditions and the differential exposure of different female cohorts to OCP regulations. Province dummies, in turn, control for time-invariant differences in child quantity and quality between provinces. The results of the 2SLS regression show that the longer a mother’s prime fertility years are subject to OCP fertility restrictions, the fewer children she tends to have (first stage) and that this exogenous reduction in child quantity, in turn, is associated with a statistically significant increase in child quality (second stage), i.e., in the likelihood that a child of post-compulsory schooling age has post-compulsory education (see column (2) of Table 3). The instrument is strong (large F-statistic) and its negative coefficient is large: mothers covered by the OCP for their entire fertile years tend to have on average 0.29 fewer children than they would have had if they had not been subject to any fertility restrictions, a sizable exogenous reduction in child quantity. Furthermore, the impact of this policy-induced reduction in family size on child quality is also large. An additional sibling is predicted to reduce the likelihood that a child has post-compulsory education by 0.26. Children of mothers covered by the OCP for their entire fertile years therefore have an average \(-0.29\times (-0.26)=0.075\) higher likelihood to have post-compulsory schooling than children of mothers who were never constrained in their fertility by OCP regulations.
This is a sizable increase in child quality given that the (unconditional) average likelihood of children in our estimation sample to have post-compulsory education is only 0.15. The OLS results reported in column (1) also show a negative and statistically significant coefficient estimate of sibling size, albeit one that is much smaller in absolute magnitude. Based on this estimate, the same decrease in the number of siblings (by 0.29) is predicted to increase the probability of being enrolled in post-compulsory education by only \(-0.29\times (-0.04)=0.012\), which suggests that OLS tends to severely underestimate the true effect of child quantity on quality.
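The product of the first-stage and second-stage coefficients used in this calculation can be read as a reduced-form effect of full OCP coverage on child quality. As a sketch, substituting the first-stage equation for \(N_i\) into the outcome equation from the “Empirical Strategy” section gives the following, where the composite error \(u_i\) is a symbol introduced here for illustration only and the numerical values are the column (2) estimates quoted in the text:

$$\begin{aligned} Q_i=\,& {} (\delta _0+\delta _1\theta _0)+\delta _1\theta _1\text {OCP}_i+(\delta _1\varvec{\theta _2}+\varvec{\delta _2})\varvec{X_i}+u_i, \qquad u_i=\delta _1\varepsilon _{i1}+\varepsilon _{i2}\\ \delta _1\theta _1=\,& {} (-0.26)\times (-0.29)\approx 0.075 \end{aligned}$$

Moving OCP\(_i\) from zero to one thus shifts the predicted probability of post-compulsory schooling by \(\delta _1\theta _1\), holding \(\varvec{X_i}\) fixed.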

We next add two indicator variables to our set of regressors that take value one if the mother, respectively father, has post-compulsory schooling (columns (3) and (4) in Table 3). These binaries control for parental education and account also for potential differences among parents in preferences and capabilities that are related to own education and of potential importance for parental quantity and quality choices, such as the importance parents attach to child education and fertility and their ability to provide personal support to their children in school. As shown in column (4) of Table 3, our 2SLS second-stage coefficient estimate for the number of siblings remains negative, statistically significant, and sizable, although its absolute magnitude (the scale of the trade-off between quantity and quality) is now marginally smaller. Furthermore, our estimated first-stage effect of OCP coverage on child quantity is virtually unchanged. Consistent with expectations, more educated parents tend to have fewer (only mothers) but more educated children (both mothers and fathers).

Finally, we further augment our specification by adding controls for the age of children. Adding a set of dummies for different age cohorts controls for potential birth cohort effects in family size and educational attainment. However, as shown in column (6) of Table 3, controlling for the age of children does not materially affect our 2SLS estimates. The second-stage coefficient estimate for the number of siblings remains negative and significant (albeit now somewhat further reduced in magnitude), and our instrument stays strong and of sizable influence for family size. Based on this estimate, children of mothers covered by the OCP for their entire fertile years have an average \(-0.33\times (-0.17)=0.056\) higher likelihood to have post-compulsory schooling than children of mothers who were never constrained in their fertility by OCP regulations.Footnote 13,Footnote 14

Summarizing the above, our results suggest that a sizable quantity–quality trade-off existed in China during the period under investigation, a finding that proves robust to various changes in model specification. Our findings prove robust also to the use of alternative ways of clustering standard errors.Footnote 15 First, we clustered standard errors at the level of provinces, the level at which family planning regulations were made. With only 31 provinces, the number of clusters is small, which could bias standard errors and lead to over-rejection (Cameron et al. 2008; Cameron and Miller 2015). We therefore use a wild bootstrap test after 2SLS estimation when clustering at province level. The effect of child quantity in the second stage remains significant, albeit at a lower level (10%), while the significance of our instrument (OCP) in the first stage remains unchanged. Second, we clustered standard errors at the level of 91 groups with differential exposure to OCP restrictions, defined by combinations of mother age (3 age groups) and province of residence (31 provinces).Footnote 16 The rationale is that children whose mothers have an agricultural Hukou, reside in the same province, and are of the same age all effectively live in households that are subject to the same OCP regulations. Reassuringly, clustering standard errors at this group level also proves immaterial for the statistical significance of our (first-stage) instrument and (second-stage) measure of child quantity.
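To make concrete what changing the clustering level does, the following Python sketch computes cluster-robust (sandwich) standard errors for a plain OLS regression on synthetic data. It is an illustration only and rests on several assumptions: it uses OLS rather than 2SLS, applies no small-sample correction, does not implement the wild bootstrap mentioned above, and the function name `cluster_robust_se` and the simulated households are hypothetical.

```python
import numpy as np


def cluster_robust_se(X, y, clusters):
    """OLS coefficients with cluster-robust standard errors (no small-sample correction)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        mask = clusters == g
        score = X[mask].T @ resid[mask]    # sum of x_i * residual_i within cluster g
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Synthetic data: 200 households with 3 children each and a shared household shock.
rng = np.random.default_rng(1)
households = np.repeat(np.arange(200), 3)
shock = rng.normal(size=200)[households]
x = rng.normal(size=600)
X = np.column_stack([np.ones(600), x])
y = 0.5 - 0.2 * x + shock + rng.normal(size=600)

beta, se_household = cluster_robust_se(X, y, households)
_, se_individual = cluster_robust_se(X, y, np.arange(600))   # one cluster per observation
print("household-clustered SEs:", se_household)
print("unclustered robust SEs: ", se_individual)
```

Treating every observation as its own cluster reduces the formula to the usual heteroskedasticity-robust estimator, so the two calls differ only in how residuals are allowed to correlate within groups.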

In the “Previous Literature” and “OCP Regulations and Exemptions” sections, we have documented in detail actual OCP regulations and discussed various measures of OCP coverage used in the literature that, because of their overly simplistic nature, fail to do justice to the complex regulatory fabric of the OCP. In the following, we make use of several such simplistic measures as IVs to see to what extent such miscoding of OCP restrictions may bias results. First, like Bulte et al. (2011) and Li et al. (2011), we disregard information on the mother and on province-specific OCP regulations altogether and use only information on a child’s birth year and ethnic minority status to construct our OCP instrumental variable. Specifically, we generate a dummy variable born1979 that equals 1 if a child was born after 1979 and 0 otherwise, and a dummy variable Born1979 Han that equals 1 if a child was born after 1979 and of Han ethnicity and 0 otherwise. The 2SLS regression results for these two alternative IVs are shown in columns (2) and (3) of Table 4. As is evident, in both first stages, estimated coefficients on these alternative IVs (counterintuitively) turn out positive, not negative, and so do the estimated treatment effects in the respective second stages.Footnote 17 Next, we consider only information on the time of introduction of the OCP at province level and on mothers’ fertile age span to construct a dummy IV for OCP coverage when fertile (aged 21–35), OCP in fertile age. This variable takes value 1 if the OCP was introduced in a mother’s province of residence during her prime fertile age, and 0 otherwise. This third classification hence disregards any exemptions of a household from the OCP because of its Hukou type or its ethnic minority status. Results for this alternative OCP IV are shown in column (4) of Table 4. The first-stage coefficient of this OCP policy variable is negative and significant, but the F-statistic is small, suggesting that the instrument is weak.
The estimated second-stage coefficient on the siblings variable is again positive, but insignificant. Finally, we construct a dummy policy variable that captures whether a mother (household) has never been exempt from the OCP, no exemption. The variable takes value 1 if the mother was constrained by OCP restrictions throughout her entire fertile age and 0 otherwise (i.e., the mother was never restricted by the OCP, or she was not always covered by OCP restrictions during her fertile age because of OCP exemptions or a late introduction of the OCP at province level). As shown in column (5), however, this instrument also turns out weak and the number of siblings in the second stage fails to exert a statistically significant effect on the probability of a child to have some post-compulsory education. Compared to our baseline result, reproduced in column (1) of Table 4, therefore, the exclusive use of child information, as in columns (2) and (3), which disregard province-specific OCP regulations altogether, or of mother information, as in columns (4) and (5), which disregard the extent (on the intensive margin) to which mothers were restricted by the OCP during their fertile years, produces quantitatively, and in the majority also qualitatively, different treatment effects of sibling size on educational child outcomes. The use of such crude measures of OCP coverage is hence far from innocuous, but gives rise to significant bias.
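For concreteness, the four simplified dummy instruments described above could be coded roughly as follows. This Python sketch is an assumption-laden reading of the verbal definitions, not the code behind Table 4: the function name, argument names, and example values are hypothetical, the fertile window of ages 21–35 is taken from the text, and the rule for OCP in fertile age follows one literal reading (the introduction year falls inside the fertile window), which may differ from the coding actually used.

```python
def crude_ocp_dummies(child_birth_year, child_is_han, mother_birth_year,
                      province_ocp_start, mother_coverage):
    """Return the four simplified 0/1 instruments as a dict.

    mother_coverage is a continuous coverage measure in [0, 1] (hypothetical input).
    """
    born1979 = int(child_birth_year > 1979)
    born1979_han = int(born1979 == 1 and child_is_han)

    # Literal reading: the policy was introduced while the mother was aged 21-35.
    fertile_start = mother_birth_year + 21
    fertile_end = mother_birth_year + 35
    ocp_in_fertile_age = int(fertile_start <= province_ocp_start <= fertile_end)

    # 1 only if the mother was constrained throughout her entire fertile age.
    no_exemption = int(mother_coverage == 1.0)

    return {
        "born1979": born1979,
        "born1979_han": born1979_han,
        "ocp_in_fertile_age": ocp_in_fertile_age,
        "no_exemption": no_exemption,
    }


# Made-up example: Han child born 1982, mother born 1955, policy start 1980.
example = crude_ocp_dummies(1982, True, 1955, 1980, mother_coverage=0.4)
```

Each dummy collapses information that the continuous measure retains: the first two ignore the mother and the province, and the last two ignore how many of her fertile years were actually restricted.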

Table 4 2SLS estimates of the effect of sibling size on the post-compulsory education of a child by different OCP definitions

Robustness Checks

We also carried out checks on the robustness of our findings to various changes in the estimation sample.Footnote 18 In this section, we will consider three different estimation samples, two subgroups of children from our baseline estimation sample (first-borns, respectively younger children), and an expanded sample that includes also children of minority mothers with an agricultural Hukou.

Our baseline estimation sample considers all children aged 15 or older, irrespective of whether these children are first-born children or children of higher birth parity. First-born children, however, are conceptually different from children born at higher parities for two reasons. One reason is that parents were never constrained by the OCP in their decision to have a first child, i.e., in their decision to have children at all. Conditional on having children, OCP regulations only restricted how many children parents could have at most. The other reason is that parents, more generally, may treat children of different parity systematically differently. To see whether the undifferentiated use of children of different birth parities matters for our results, we restricted the estimation sample to the oldest child who is still residing in a household. Note that this child need not be the first-born child if some child has already moved out. The reason for this is that the census data we use does not provide information on the age of children who at the time of the census no longer reside with their parents. We can therefore identify the oldest child still living at home, but if a child has already moved out from that home, we cannot be sure whether the oldest child on record is also the first-born child.Footnote 19 However, two-thirds of the oldest children still residing in the parental household are from households where all children born to a mother still reside with their parents, i.e., two-thirds of these oldest children are in fact first-born children.Footnote 20 Focusing on oldest children only, we estimated the effect that siblings have on the likelihood of the oldest child to be enrolled in or have completed post-compulsory education using our 2SLS baseline specification of column (6) in Table 3. Column (2) of Table 5 reports the main regression output.
The estimated first-stage coefficient of the OCP instrument is still negative and significant, albeit somewhat smaller in absolute magnitude than for our baseline estimation sample. The estimated second-stage coefficient on child quantity remains negatively signed and significant as well. It even increases somewhat in absolute value.
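The selection of oldest co-resident children, and the check of whether such a child can be confirmed as first-born, might be sketched as follows. The record layout is hypothetical: the keys `children_ever_born` and `coresident_ages` are names invented for this illustration and do not refer to actual census variables, and the rule simply compares the number of children born to a mother with the number still living in the household.

```python
def flag_oldest_children(households):
    """For each household, report the oldest co-resident child's age and
    whether that child can be confirmed as first-born.

    Each household is a dict with hypothetical keys:
      'children_ever_born' -- number of children born to the mother
      'coresident_ages'    -- ages of the children still living in the household
    """
    flagged = []
    for hh in households:
        ages = hh["coresident_ages"]
        if not ages:
            continue  # no co-resident child, so nothing to record
        flagged.append({
            "oldest_age": max(ages),
            # Confirmed only when no child has left the household.
            "confirmed_first_born": hh["children_ever_born"] == len(ages),
        })
    return flagged


# Made-up households for illustration.
sample = [
    {"children_ever_born": 2, "coresident_ages": [19, 16]},   # all children at home
    {"children_ever_born": 3, "coresident_ages": [17, 15]},   # one child has moved out
    {"children_ever_born": 1, "coresident_ages": []},         # no child at home
]
result = flag_oldest_children(sample)
share_confirmed = sum(r["confirmed_first_born"] for r in result) / len(result)
```

In this toy sample half of the oldest co-resident children are confirmed first-borns; the corresponding share reported in the text is two-thirds.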

Table 5 2SLS estimates of the effect of sibling size on the post-compulsory education of a child for different age groups of children and for all ethnicities

The census data records complete information on children only if these are still living with their parents on the census day. If such co-residence is non-random and related to educational choices (either directly, because children moved out to attend higher education, or indirectly, because of early marriages that made the acquisition of post-compulsory education impossible), selectivity in our sample will entail bias in our afore-discussed results for the oldest children still residing with their parents. To address this concern, we consider two samples of younger children that arguably are less subject to such selectivity because of their age. First, we consider only children aged 15–18. These age cohorts have finished compulsory education but are too young to have started with college education. Second, we restrict the estimation sample to children aged 15–21, as children aged less than 22 are far less likely to have moved out for marriage.Footnote 21 For each restricted sample, we estimated the effect of sibling size on the probability of a child to be enrolled in (or have completed) post-compulsory schooling, using our 2SLS baseline specification of column (6) in Table 3. The results are shown in columns (3) and (4) of Table 5. As is evident, for both samples of younger children, sibling size continues to exert a negative effect on a child’s schooling.

Finally, we expanded our estimation sample to include also children of mothers with an agricultural Hukou who are not Han but from one of China’s numerous ethnic minorities. Adding mothers from ethnic minorities increases sample size and allows us to exploit more variation in our OCP instrument (policy variable), originating from OCP regulations and exemptions governing the fertility behavior of minorities at various times in various provinces. At the same time, adding other ethnicities renders our sample of children (and mothers) likely less homogeneous. Furthermore, ethnicity per se could relate to educational and fertility choices. So, there is a trade-off.Footnote 22 Nevertheless, as a robustness check, we now expand our estimation sample to include all children, irrespective of the ethnicity of their mothers. We maintain, however, that mothers must have an agricultural Hukou, given our focus on less-developed areas. Again, we use our 2SLS baseline specification of column (6) in Table 3, first in unadjusted form (see column (5) in Table 5), then in modified form (see column (6)). The latter specification adds to the set of regressors in our baseline model an indicator that takes value one if a mother belongs to an ethnic minority (and zero otherwise) to control for potential level differences between Han and minorities in their preference for sons, their economic conditions, or their average exposure to OCP regulations. As the tabulated regression output shows, sibling size also exerts a negative effect on a child’s schooling in this enlarged estimation sample for both model specifications considered.

Conclusion

In this paper, we investigated empirically for China the effects of family size on the education of children. For identification, we exploited China’s One-Child Policy (OCP) as an exogenous source of variation in the number of offspring to a woman. Our results show strong evidence for a sizable child quantity–quality trade-off among children of Han mothers with an agricultural background, a population which accounts for about three quarters of all children born in China. This finding proves robust to various changes in the estimation sample (including an expansion to all ethnicities).

In our analysis, we have used a novel measure of individual OCP coverage that is more accurate than the measures used in the literature so far. This measure draws on and combines for the first time detailed regional information on actual OCP implementation, regulations, and exemptions in 31 Chinese provinces, which we collected from provincial family planning regulations, with information on the actual childbearing age of women at particular points in time and their ethnic and agricultural background. Not dichotomous as in existing studies, this continuous measure captures more accurately the intensity of treatment individual women have been exposed to by OCP regulations that restricted their childbearing decisions over the course of their life-time span of fertility. This continuous measure of OCP coverage is also more in line with existing theories of fertility that stress life-time aspects of family planning, including child quality investments, for reproductive decisions. Unlike parts of the literature for China, we also restricted the analysis to educational outcomes of children of post-compulsory schooling age only, i.e., to educational outcomes which are indeed subject to parental discretion.

The new measure of individual OCP coverage developed in this paper and used in our analyses can be fruitfully employed in other applications in future research, e.g., for studying tilted sex ratios at birth, marriage market dynamics and patterns, or criminal activity in China. A similar strategy for constructing the policy variable can be applied to analyze the effect of the termination of the OCP.