The Causal Effects of the Number of Children on Female Employment - Do European Institutional and Gender Conditions Matter?

This paper contributes to the discussion on the effects of the number of children on female employment in Europe. Most previous research has either (1) compared these effects across countries, assuming an exogeneity of family size; or (2) used methods that dealt with endogeneity of family size, but that focused on single countries. We combine these two approaches by taking a cross-country comparative perspective and applying quasi-experimental methods. We use instrumental variable models, with multiple births as instruments, and the harmonized data from the European Survey on Income and Living Conditions (EU-SILC). We examine the cross-country variation in the effects of family size on maternal employment across groups of European countries with different welfare state regimes. This step gives us an opportunity to investigate whether the revealed cross-country differences in the magnitude of the effect of the family size on maternal employment can be attributed to the diversity of European institutional arrangements, as well as the cultural and the structural conditions for combining work and family duties.

interest in the demographic, sociological, and economic literature in the 1980s, and has since been addressed in numerous empirical studies. Previous studies have some methodological shortcomings, however. Many of them employed methods that assumed that childbearing decisions are exogenous with respect to labor market decisions (Matysiak and Vignoli 2008). Thus, these studies failed to account for unobserved characteristics that jointly affect fertility and employment outcomes, such as an unmeasured orientation toward work or family. A failure to account for unobservables leads to a bias in the estimated effect of family size on women's employment due to the selection of individuals with a high family orientation into the group of the non-employed. Hence, many previous studies have shown associations between family size and female employment instead of causal effects. Some recent studies tried to account for this problem by implementing statistical methods that make it possible to control for unobserved time-constant characteristics, while assuming that women's orientation toward family or paid work does not change over time (Aassve et al. 2006; Matysiak 2011a). Women's fertility and employment preferences may, however, change in response to the birth of another child or their work experiences. There are only a few studies that have succeeded in accounting for both the time-constant and the time-varying unobserved characteristics of women, taking the endogeneity of family size into account. These studies provide evidence for single countries only, which makes it difficult to understand the mediating role of the institutional context for the incompatibility of work and family. Moreover, the variation in the institutional arrangements of these countries is rather limited, as these studies are typically conducted in the U.S. (Rosenzweig and Wolpin 1980; Angrist and Evans 1998; Jacobsen et al. 1999) or in developing countries (Cruces and Galiani 2007; Vere 2011; Cáceres-Delpiano 2012). There is almost no evidence for European countries on the causal effects of family size on women's employment. Hence, the questions of whether, and, if so, how strongly family size affects female labor market outcomes across countries with differential institutional arrangements and cultural and structural conditions have yet to be adequately explored.
In this paper we combine different methodological solutions that provide more in-depth insights into how the number of children affects women's employment, and how this effect depends on a country's institutional and cultural conditions. First, we take an instrumental variable approach using information on multiple births to compute the causal effects of family size on women's employment. This approach was proposed in the seminal paper by Rosenzweig and Wolpin (1980), and has been applied in a number of recent empirical studies (Cruces and Galiani 2007; Vere 2011; Cáceres-Delpiano 2012). In addition, we use the approach recently proposed by Lewbel (2012) to assess the robustness of our results with respect to the type of instrumental variable. This method allows us to identify the causal effects of family size on women's employment by using regressors that are uncorrelated with the product of heteroskedastic errors. While this approach is generally applied to identify the structural parameters of interest when no instruments are available, in our study it is used to provide overidentifying conditions under which the validity of our instrumental variable based on multiple births can be tested. Second, we examine the variation in the effects of family size on women's employment. We compare the magnitude of the possible effects across European countries, which have a very wide range of institutional, cultural, and structural conditions that may constrain or facilitate combining work and family (Brewster and Rindfuss 2000; Ahn and Mira 2002; Engelhardt et al. 2004). Although most European governments claim they are seeking to raise employment levels among women with children, the degree of progress made in implementing these policies differs strongly across countries. Thus, Europe is an interesting laboratory for research on how family policies mediate the impact of childbearing on female employment.
Recently, comprehensive micro-data samples from almost all European countries have been collected in the Survey of Income and Living Conditions. The availability of these data makes it possible to take advantage of European diversity for research purposes.
Our paper is structured as follows. In Section 2 we provide an overview of theories concerning conflict between work and parenthood. In Section 3 we elaborate on the European institutional and cultural contexts that may moderate the scale of this conflict. In Sections 4 and 5 we describe the data and the methods used in this study. In Section 6 we present the results. In Section 7 we summarize our findings and discuss opportunities for further research.

Literature Review
The relationship between family size and female employment is very well grounded in existing sociological and economic theories. Sociological theories stress that for a number of cultural and economic reasons, the primary responsibility for childcare continues to lie with the mother (Lehrer and Nerlove 1986). While both paid employment and childcare may be important sources of rewards and satisfaction for women, because of time constraints, women need to decide how to divide their time between working and taking care of their children. This conflict has been described in the role incompatibility hypothesis (Brewster and Rindfuss 2000).
Similar concepts have been developed in the neo-classical economic models of women's labor supply (Mincer and Ofek 1982;Joesch 1994;Rønsen and Sundström 2002). In these models the time that a parent, usually a mother, supplies in the labor market is a choice variable that is jointly determined with the time devoted to childrearing. A parent will take a job only if her or his market wage exceeds the value of the time spent at home (a reservation wage). According to this model, the impact of family size on parental involvement in the labor market can be positive, as having children increases the financial needs of the family (income effect); but it can be also negative if the income effect is surpassed by an increase in the value of a parent's time spent at home following the birth of a child (price effect).
According to economic theory, the effects of family size on employment are more likely to be negative for women with a low earning potential, a strong desire to spend time with children, and a low orientation toward paid work; as well as for women living in an affluent household. The effect of family size on women's employment should also depend on the country context. The value of women's time is expected to be higher in countries where working mothers are less institutionally supported (e.g., countries with poor childcare provision or inflexible working hours) or less socially accepted (Gornick et al. 1997;Esping-Andersen 1999;Stier et al. 2001). In such countries, working is more costly for a mother, as she needs to purchase childcare on the market and violate the prevalent gender norms.
The abundant empirical research on the topic has confirmed that having children exerts a negative influence on women's employment (Felmlee 1993; Giannelli 1996; Taniguchi and Rosenfeld 2002; Budig 2003). This evidence comes mainly from single-country studies or studies that compared two or three countries. There are fewer multi-national studies that would allow us to draw conclusions about the magnitude of the effect across country contexts (Steiber and Haas 2012). One of the few multi-national studies that have been carried out is by Pettit and Hook (2005), who used cross-sectional data on 19 European countries and compared the employment rates of childless women with those of women in households with small children. Their findings suggest that having young children affects women's employment significantly less in countries that provide public childcare and parental leave, and that national gender cultures do not explain the cross-national differences in women's employment. Similar conclusions have been reached by Steiber and Haas (2012) for 26 countries and by Uunk et al. (2005) for 13 European countries. Finally, using data for 18 OECD countries, Nieuwenhuis et al. (2012) demonstrated that the cross-country differences in the effects of parenthood on women's employment are attributable not just to family policies, but also to labor market structures (unemployment rates and the size of the service sector).
The comparative studies mentioned above all provide information about associations between family size and women's employment across Europe. However, they do not account for the selection of family-oriented women into motherhood and non-employment due to unobserved time-varying (and, in the case of some of these studies, also time-constant) characteristics of women. As the literature has shown that women who are more career-oriented generally prefer to have smaller families (Lehrer and Nerlove 1986; Francesconi 2002; Hakim 2003), the findings of these studies cannot be interpreted in terms of the effects of family size on women's labor market outcomes.
There are only a small number of studies that control for endogeneity of family size, and most of those that exist are limited to one country only (Rosenzweig and Wolpin 1980;Angrist and Evans 1998;Jacobsen et al. 1999;Vere 2011), or they compare developing countries only (Cruces and Galiani 2007;Cáceres-Delpiano 2012). Notable exceptions are the studies by Del  and Del Boca and Sauer (2009), who provided cross-country comparative evidence on the role of institutional arrangements for employment and fertility decisions in western Europe. However, they did not show the effects of family size on employment, and instead focused on the effects of policies on employment and fertility. Hence, there is very little evidence on the moderating effects of the country context on women's employment that is not potentially biased by selection.

European Context
Across European countries, there is considerable diversity in the conditions for work and family reconciliation. These conditions are shaped by family policies, labor market regulations, and gender norms. Not surprisingly, there have been many attempts to classify countries according to the extent to which they provide the conditions needed to combine work and family (Anttonen and Sipilä 1996;Gornick et al. 1997;Esping-Andersen 1999;Korpi 2000;Bettio and Plantenga 2004). Although these classifications differ in how they assign various countries to specific family policy and attitudinal models, there is general agreement that the most favorable conditions for combining paid work with childrearing are in Nordic Europe. These countries stand out for their exceptionally well-developed childcare services, which are characterized by their very high coverage rates for the youngest children (under age 3), their relatively high coverage rates for pre-school children, and their relatively long opening hours (see Table 1). These countries were also forerunners in introducing the individualized right to parental leave to encourage fathers to become more involved in care (Leira 2002). In the Nordic countries, mothers' employment is also highly socially accepted (Treas and Widmer 2000). The Gender Norms Index developed by Matysiak and Weziak-Bialowolska (2016) on a battery of attitudinal statements on gender norms from the European Value Survey (2016) suggests that these countries score far higher than the other European countries in terms of the social acceptance of women's participation in the public sphere and fathers' participation in the private sphere (see Table 1). The southern European countries lie on the other side of the spectrum. 
In these countries, there is very limited institutional support for working parents in terms of public childcare provision, particularly for the youngest children; parental leave entitlements are short and very poorly paid; and very conservative attitudes toward women's involvement in any public sphere of life, including labor market attachment, are prevalent (see Table 1, but also Matysiak 2011a; Matysiak and Weziak-Bialowolska 2016;Lück and Hofäcker 2003). Surprisingly, individualized rights to parental leave benefits were introduced quite early in this group of countries as well. Currently, however, only around 10 % of fathers in southern Europe use these benefits, compared to 50 % in Sweden (European Commission 2015).
The conditions for work and family reconciliation in other parts of Europe are more nuanced. In Belgium and France, the public provision of childcare services is nearly as good as in the Nordic countries, but the costs of childcare for families are higher (see Table 1). The policies in those countries are strongly targeted at increasing women's employment and easing women's care responsibilities, but they are less geared toward encouraging egalitarian division of labor at home. In Belgium and France, paternity leave entitlements have been introduced only recently, and the prevailing attitudes toward working mothers are more traditional (Table 1). Austria and Germany score even lower in terms of their support for working mothers. Indeed, in these two countries mothers have long been encouraged to stay at home to care for their children by the family benefit and parental leave policies and by the joint tax system (Steiner and Wrohlich 2004). In the Netherlands, in contrast, parents have been given very limited access to leave. Yet despite some recent changes in work-family reconciliation policies, childcare provision in the German-speaking countries and in the Netherlands has remained poor, particularly for children under age three. Moreover, because the opening hours of childcare facilities in these countries are among the shortest in Europe (Table 1), mothers are often forced into part-time employment. Finally, levels of social acceptance of mothers' involvement in the labor market and fathers' involvement in the family are even lower in these countries than they are in Belgium and France (Table 1).
The Anglo-Saxon countries constitute another specific group of countries. In this country group the cultural barriers to female work are not as strong as in the German-speaking countries and the Netherlands (Table 1), but levels of public support for working parents are far lower. The parental leave systems in these countries are very modest, with very low payments during leave. Public childcare provision also tends to be rather poor. Although childcare services can be easily purchased on the market, the cost of these services for the parents is usually high. As a result, the coverage rates in the childcare system are relatively low, and the childcare fees are the highest in Europe. However, in these countries the labor market is relatively flexible; i.e., while it is easy to lose a job, it is also relatively easy to find a new one (Adsera 2005).
Finally, the specificity of central and eastern Europe (CEE) is related to the legacy of state socialism. During the socialist era, women were expected to be both income and care providers (Pascall and Manning 2000), and the state provided extensive childcare services in the form of either free childcare facilities or crèches and kindergartens attached to state-owned enterprises. After the economic transition, public expenditures on reconciliation policies were greatly reduced and most of the state-owned enterprises went bankrupt or were privatized. Only some of the CEE countries attempted to rebuild the welfare support for working parents in the 2000s. As a result, family policy models in this part of Europe have become increasingly diverse, with Slovenia and Estonia offering the most generous support to working mothers, while the Czech Republic, Slovakia, and Poland pursue familialism (Szelewa and Polakowski 2008; Matysiak 2011a). On average, however, the enrollment rates in crèches and kindergartens in the CEE region are among the lowest in Europe. Instead of investing in childcare provision, these governments continue to offer quite generous parental leave schemes (in terms of length and payment), which allow women to withdraw from employment during the first three years of a child's life. Interestingly, women tend to return to employment after this leave period (Matysiak 2011b), and usually take full-time jobs (Drobnič 1997). This pattern, which is very specific for the region, is partly the result of conflicting social expectations of women. On the one hand, women in the region are perceived as the main care providers, and childcare and housework are seen as female, not male jobs. On the other hand, women are also expected to work in the market and to contribute to the usually tight household budget (Lück and Hofäcker 2003; Philipov 2008).

Analytical Strategy
As noted in Section 2, a mother's decision about whether to be employed depends on her preferences regarding paid work, her desire to spend time with her children, the financial situation of her household, the earning potential of her partner, and her own earning capacities. This implies that we would need to control for all these variables in order to properly estimate the effect of family size on a mother's employment. In practice, controlling for all of these variables is usually impossible, as researchers generally lack the necessary data to do so. In particular, we are not able to observe women's orientations toward paid work and family. A number of theoretical and empirical studies have suggested that women with a comparative advantage in market work display a stronger preference to have small families (Lehrer and Nerlove 1986; Francesconi 2002; Hakim 2003). If this is the case, then research that ignores the role of female preferences and treats family size as exogenous may overestimate the negative effects of childbearing on the labor market outcomes of mothers. Furthermore, various unobserved characteristics such as earning potential and tastes for paid work, childcare, and leisure may vary across various life phases. Specifically, the presence of children in the family significantly affects female preferences for these three types of activities (Joshi 1998; Blau and Kahn 2007). Hence, after each birth, and especially after the first birth that marks the transition to parenthood, individual preferences may change. This means that even sophisticated methods of analysis that control for the unobserved time-constant characteristics of women might still generate misleading conclusions.
An experimental setting in which women could be randomly sorted into various "treatment groups" with an exogenously defined number of children would be ideal for addressing this research problem. For obvious reasons, organizing such an experiment is not possible. However, Rosenzweig and Wolpin (1980) have proposed a method for exploiting an experiment that occurs naturally due to the occurrence of multiple births. The basic idea is to use the data on multiple births in order to construct a proper "control group" for women with a given number of children. Women who experienced multiple births may be regarded as a random "sample" that may be used for comparisons with women who experienced births of singletons. Thus, information on twin births can be applied to construct an instrumental variable and to get unbiased estimates of the impact of the number of children on women's employment. For example, women who have had just one child can be compared to women who have had two children as the result of a multiple birth.
The instrumental variable approach exploiting information on twin births is regarded as comparable to a natural experiment; it gives us the opportunity to control for the simultaneity of family size and employment decisions among mothers without making any specific assumptions about the distribution or temporal stability of the unobserved factors that jointly affect women's family-related and employment-related decisions (Moffitt 2005). The estimates from instrumental variable models refer only to the subsample of the population who react to the instrument; i.e., the compliers (Angrist et al. 1996). In the presence of heterogeneous treatment effects, these estimates may differ from those of the average treatment effect and the average treatment effect for the treated. However, the specific feature of the instrumental variable that we use in this paper is that it identifies the effect of treatment on the non-treated, since compliance is perfect when a multiple birth occurs (Angrist et al. 2010). Still, this variable does have some drawbacks. First, it does not allow us to measure the effect of the first child on female labor supply; it allows us to measure family size effects only at parity two or higher. Second, the occurrence of multiple births correlates with some demographic variables, such as a mother's age at birth or her race (Martin and Park 1999; see also the results of our analysis presented in Appendix Table 9). This demographic information is, however, often available in the data, and can be controlled for in regression models. 1 Another potential problem is that raising children born in multiple births may affect labor market outcomes differently than raising children from single births, and this difference may depend on the age of the children. Taking care of newborn twins can be more time-intensive than taking care of one newborn and his or her older brother or sister.
At older ages, however, economies of scale may mean that parents have to invest less time in taking care of twins than in taking care of two children who are of different ages. For example, since twins often attend the same classes, parents may need to spend less time on helping twins with their homework than on helping children of different ages (Rosenzweig and Zhang 2009).
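The logic of the twin-birth instrument can be illustrated with a small simulation. This is a minimal sketch, not the estimation carried out in this paper: the data-generating process, the variable names, and all parameter values (including a twin probability that is deliberately set higher than observed twinning rates so that the estimate is stable) are assumptions made for illustration. An unobserved family orientation raises the chance of a second child and lowers labor market involvement, which biases the naive regression slope, whereas the twin-based Wald estimator recovers the assumed effect.

```python
import numpy as np


def simulate_twin_iv(n=200_000, twin_prob=0.05, true_effect=-5.0, seed=0):
    """Compare a naive OLS slope with a twin-based IV (Wald) estimate.

    All quantities are simulated; nothing here is taken from EU-SILC.
    """
    rng = np.random.default_rng(seed)

    # Unobserved orientation toward family (never seen by the analyst).
    family_orientation = rng.normal(size=n)

    # A twin at first birth is assigned at random, independent of preferences.
    twin = rng.random(n) < twin_prob

    # Family-oriented women are more likely to choose a second child.
    wants_second = family_orientation + rng.normal(size=n) > 0
    nchild = 1 + (twin | wants_second).astype(int)

    # Latent work index: lowered by the second child AND by family orientation.
    work_index = (
        30.0
        + true_effect * (nchild - 1)
        - 4.0 * family_orientation
        + rng.normal(scale=5.0, size=n)
    )

    # Naive OLS slope of the work index on the number of children.
    ols = np.cov(nchild, work_index)[0, 1] / np.var(nchild, ddof=1)

    # Wald estimator: with one binary instrument and no covariates this
    # coincides with the 2SLS estimate.
    reduced_form = work_index[twin].mean() - work_index[~twin].mean()
    first_stage = nchild[twin].mean() - nchild[~twin].mean()
    iv = reduced_form / first_stage
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_twin_iv()
    print(f"naive OLS slope: {ols:.2f}, twin IV estimate: {iv:.2f}")
```

In this stylized setting the naive slope is more negative than the assumed effect, which mirrors the argument that treating family size as exogenous may overstate the negative effect of childbearing.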
Obviously, the instrumental variable approach used in our paper is not the only possible solution. For example, previous research has exploited idiosyncratic changes in policies that increased the availability of family planning programs in the community as sources of exogenous variation in childbearing (Arpino and Aassve 2013). However, this approach allows us to measure the family size effects in selected countries only; namely, in those countries where cross-regional variation in access to contraceptives can be observed empirically. Other studies have exploited data on miscarriages and the presence of infertility problems. Some miscarriages occur at random due to the formation of abnormal fetal chromosomes at the time of conception, which causes fetal expulsion early in a pregnancy (Hotz et al. 1997;Hotz et al. 2005). However, epidemiological studies have found that the incidence of miscarriages is also higher due to the consumption of cigarettes and alcohol during pregnancy (Kline et al. 1980). At the same time, smoking cigarettes and drinking alcohol are correlated with labor market outcomes (Bray 2005;Johansson et al. 2007;Levine et al. 1997). This undermines the internal validity of the instrument based on miscarriage data. Infertility, another possible instrument, can be defined as the failure to conceive after a year of regular intercourse without contraception (Habbema et al. 2004). Studies in which instruments based on infertility have been applied include Marks (2008, 2011) and Cristia (2008). Unfortunately, a wide range of factors-such as poor health, smoking, drinking, and extreme body mass index-are associated with infertility, and may depress labor market chances.
Another type of instrumental variable that gives us the opportunity to study the family size effects on labor market outcomes is the siblings' gender composition. There is evidence of a preference for "balanced" families with equal numbers of boys and girls in some countries. Therefore, some studies have used the gender composition of children as an instrument for family size (Angrist and Evans 1998; Cruces and Galiani 2007; Daouli et al. 2009; Nam 2010). The internal validity of the instrumental variable constructed based on information about siblings' gender composition is also under debate (Rosenzweig and Wolpin 2000). Moreover, in many countries the impact of sex composition on the total number of children is not always strong enough to serve as a relevant instrument for family size. Finally, this approach is not practical for use in a study that focuses on Europe, because it provides estimates of the effect of having an additional child on female labor force participation that are conditional on reaching parity two, whereas the numbers of women with at least two children are very small in many European countries (Del Boca and Sauer 2009). 2 In this study, we cannot use the instrumental variables based on miscarriages or infertility because the relevant information is not available in our data. We examined the opportunity of using an instrumental variable based on gender composition, but our results showed that having two children of the same gender has a very small impact on the total number of children, and the tests of the relevance of gender indicated that this instrument is very weak. 3 In lieu of using additional instruments, we assessed the robustness of our results by means of the approach recently proposed by Lewbel (2012). This method relies on the presence of heteroskedasticity in the error term of the first-stage equation (which is examined in this paper by means of the Breusch-Pagan test).
The procedure suggested by Lewbel (2012) uses as instruments the deviations from the mean of a vector of independent exogenous variables, interacted with the residual from the first-stage regression. Previous research has applied this approach to identify the key parameters of interest in cases in which the instrumental variable was not available (Kelly et al. 2014), or to provide over-identifying conditions under which the validity of the main instrumental variable could be tested (Sabia 2007). In this paper we use this approach for the latter purpose.
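The construction of these generated instruments can be sketched in a few lines. The function name `lewbel_instruments` and its arguments are hypothetical, and the sketch covers only the instrument-building step described above (mean-centered exogenous variables multiplied by the first-stage residual); it assumes a plain least-squares first stage and does not reproduce the full estimation or the tests used in this paper.

```python
import numpy as np


def lewbel_instruments(endog, exog, z):
    """Build heteroskedasticity-based instruments in the spirit of Lewbel (2012).

    endog : (n,) endogenous regressor, e.g. the number of children
    exog  : (n,) or (n, k) exogenous regressors, without a constant
    z     : (n,) or (n, m) exogenous variables used to generate instruments
    """
    endog = np.asarray(endog, dtype=float)
    design = np.column_stack([np.ones(len(endog)), exog])

    # First stage: least-squares fit of the endogenous regressor.
    coef, *_ = np.linalg.lstsq(design, endog, rcond=None)
    resid = endog - design @ coef

    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]

    # Generated instruments: deviations of Z from its mean times the residual.
    return (z - z.mean(axis=0)) * resid[:, None]
```

The generated columns would then be passed as additional instruments to a 2SLS routine. They are informative only if the first-stage error variance actually depends on Z, which is why a heteroskedasticity test on the first stage is a natural preliminary check.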

Data
So far, there have been relatively few studies that have employed the instrumental variable models using data on multiple births because of the lack of a dataset with sufficiently large samples and detailed demographic information. In this study we are fortunate to have access to the European Survey of Income and Living Conditions (EU-SILC), which includes large samples, and thus allows us to identify a suitable number of mothers who experienced multiple births (Eurostat 2011). 4 Additionally, the survey provides data on the labor market situations of the respondents and the structure of their families. It was started in 2004 and has been carried out every year under the auspices of Eurostat. It provides harmonized comparable data for most countries in Europe. Based on these data, cumulated from the period 2004-2011, we can analyze and compare the effect of childbearing on employment among mothers in 30 European countries (all of the members of the European Union and Norway, Iceland, and Switzerland).
We restricted our sample to mothers aged 18-35 whose oldest child was under age 12. We excluded from our analysis women for whom the relevant information on the labor market outcomes was missing. We identified women who gave birth to two children in the same year and in the same quarter as being mothers of twins. For countries in which the information on the quarter of birth was missing, we used only the data on the year of childbirth in identifying mothers of twins; and we controlled for this fact in our analyses. As there were very few women who gave birth to triplets or experienced other types of multiple births, we excluded such cases from the analysis. We used all of the national EU-SILC samples apart from samples from surveys carried out in Malta, Cyprus, and Switzerland. These countries lack descriptions of the institutional and the cultural settings relevant to our analysis, and Switzerland has only recently been included in the survey. The total number of mothers who experienced a twin birth as their first birth is 1719. The twinning probability is 1.27 %, which is in line with the existing literature on multiple births (Martin and Park 1999). The sample used in the analysis includes 135,340 mothers.
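The twin identification rule can be sketched as follows. The function name, the return labels, and the input layout (one `(year, quarter)` pair per child, with `None` for a missing quarter) are assumptions made for illustration and do not correspond to actual EU-SILC variable names.

```python
from collections import Counter


def classify_births(births):
    """Classify a mother by the largest set of children sharing a birth period.

    births: list of (year, quarter) tuples; quarter is None when not recorded.
    Returns "singletons", "twins", or "higher_order" (triplets or more,
    which the analysis excludes).
    """
    # Fall back to the year alone when any quarter of birth is missing.
    use_quarter = all(quarter is not None for _, quarter in births)
    keys = [(year, quarter if use_quarter else None) for year, quarter in births]

    largest = max(Counter(keys).values(), default=0)
    if largest >= 3:
        return "higher_order"
    if largest == 2:
        return "twins"
    return "singletons"


# Share of twin mothers implied by the counts reported above.
twin_share = 1719 / 135_340  # roughly 0.0127, i.e. about 1.27 %
```

For example, `classify_births([(2006, 2), (2006, 2)])` yields `"twins"`, while two children born in different quarters of the same year count as singletons whenever the quarter is recorded.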
We focus on two measures of women's labor market involvement: the probability of doing work, which captures the extensive margin of female labor market involvement; and the number of hours worked, which captures the intensive margin. The probability of doing paid work is defined based on the information on the current economic activity status, which distinguishes between (1) working full-time, (2) working part-time, (3) unemployment, (4) studying, (5) retirement, (6) disability, (7) compulsory military service, (8) fulfilling domestic and care responsibilities, and (9) other forms of inactivity. We classified the first two categories as involvement in work, whereas the other labor market statuses were classified as being out of work. The EU-SILC also provides information on the number of hours usually worked per week in the main job among working women. For women who were not working we assumed zero hours of work; thus, this outcome variable is not conditional on women's labor market status.
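The coding of the two outcome variables can be sketched as below. The status labels are hypothetical stand-ins for the nine activity categories listed above, not the codes used in the EU-SILC files.

```python
# Hypothetical labels for the two categories counted as being in work.
WORKING_STATUSES = {"full_time", "part_time"}


def employment_outcomes(status, usual_hours=None):
    """Return (employed, hours) for one respondent.

    employed: 1 for full-time or part-time work, 0 for any other status.
    hours: usual weekly hours in the main job; 0.0 for women not in work,
           so the measure is not conditional on labor market status.
    """
    employed = int(status in WORKING_STATUSES)
    if not employed:
        return employed, 0.0
    # For working women the reported hours are passed through unchanged
    # (None if the respondent did not report them).
    return employed, usual_hours
```

For example, `employment_outcomes("unemployed")` yields `(0, 0.0)`, so non-working mothers enter the hours equation with zero hours rather than being dropped.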
We pooled the data for all countries in order to investigate the variation in the effects of having children on mothers' employment between groups of countries that have similar institutional, cultural, and structural conditions for work and family reconciliation. The country groups were specified according to the commonly applied classification of welfare state regimes described in Section 3. The first group consists of the Nordic countries (Denmark, Finland, Iceland, Norway, and Sweden); the second category includes Belgium and France; the third group consists of Austria, Germany, Luxembourg, and the Netherlands; and the fourth group is made up of Anglo-Saxon countries (UK and Ireland). Finally, the last two groups cover the southern European countries (Spain, Italy, Portugal, and Greece) and the central and eastern European countries (Czech Republic, Hungary, Poland, Slovakia, Romania, Bulgaria, Slovenia, Estonia, Latvia, and Lithuania).
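The grouping can be written down as a simple lookup table. The group labels used as dictionary keys are shorthand chosen here for illustration; the country lists follow the text above.

```python
COUNTRY_GROUPS = {
    "nordic": ["Denmark", "Finland", "Iceland", "Norway", "Sweden"],
    "belgium_france": ["Belgium", "France"],
    "german_speaking_netherlands": [
        "Austria", "Germany", "Luxembourg", "Netherlands",
    ],
    "anglo_saxon": ["United Kingdom", "Ireland"],
    "southern": ["Spain", "Italy", "Portugal", "Greece"],
    "cee": [
        "Czech Republic", "Hungary", "Poland", "Slovakia", "Romania",
        "Bulgaria", "Slovenia", "Estonia", "Latvia", "Lithuania",
    ],
}

# Inverted lookup: country name -> group label, e.g. for tagging pooled records.
GROUP_OF = {
    country: group
    for group, countries in COUNTRY_GROUPS.items()
    for country in countries
}
```

The six groups cover 27 countries, which matches the 30 surveyed countries minus Malta, Cyprus, and Switzerland.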

Model Specification
In principle, if the randomization of women with children was perfect, we could simply compare the employment rates of women with singletons and women with twins. However, to address the problems of the relationship between the risk of multiple births and age and to improve the precision of our estimates, we use two stage least squares (2SLS) instrumental variable models. In the regression framework, we can control for the individual-level characteristics of women as well as cross-country variation in the institutional setup and cultural conditions. We can also see whether the country-specific institutional or cultural factors moderate the impact of family size on female employment by introducing interaction terms implemented in line with Wooldridge's (2010) suggestion.
We chose the following specification of the 2SLS instrumental variable models, which allows for: where nchild is the total number of children, multi is an indicator that a given woman has experienced a multiple birth, X is a vector of control variables that includes age and age at first birth, as well as country-wave fixed effects. To see if there is variation in the causal effects of family size on maternal employment across the specific groups of European countries, we divided the European countries into groups that, as described in Section 3, have similar institutional settings, and we ran the regression models across different country groups.
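A standard 2SLS specification consistent with the variable definitions in this paragraph can be written as follows. This is a sketch only: the coefficient symbols, the outcome symbol y (the probability of working or the number of hours worked), the subscript i for women, and the error symbols are assumed for illustration and may differ from the authors' notation.

```latex
% First stage: family size instrumented by multiple births
\begin{align}
  nchild_i &= \alpha_0 + \alpha_1\, multi_i + X_i'\alpha_2 + \varepsilon_{1i} \\
% Second stage: outcome regressed on predicted family size
  y_i &= \beta_0 + \beta_1\, \widehat{nchild}_i + X_i'\beta_2 + \varepsilon_{2i}
\end{align}
```

Under this form, the coefficient of interest is the second-stage coefficient on predicted family size, estimated separately for each country group.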
In order to test the validity of the instruments based on multiple births, we generated additional instrumental variables by means of the Lewbel (2012) method. In the presence of heteroskedastic disturbances in the first stage equation of the instrumental variable model, the parameters of the second stage equation can be consistently estimated using an exclusion restriction in the form: where Z is a vector of exogenous variables (in our paper these are age as well as a set of fixed effects for country groups) and ε̂1 are the estimated residuals from the first stage equation. Because of these additional instrumental variables, the second stage regression becomes over-identified, and we can thus perform a test of the validity of the multiple birth instrumental variable. Specifically, we examined the difference between the Sargan statistic of the equation that uses only the set of instruments generated by means of Lewbel's (2012) approach and the Sargan statistic of the equation that additionally includes the instrument based on multiple births.
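For reference, the identifying conditions of the Lewbel (2012) approach are commonly stated as below. This is a sketch in assumed notation: the first-stage error is written as ε1 to match the text, and ε2 is an assumed symbol for the second-stage error.

```latex
% Heteroskedasticity-based identification (Lewbel 2012), assumed notation
\begin{align}
  \operatorname{Cov}(Z,\ \varepsilon_1 \varepsilon_2) &= 0, \\
  \operatorname{Cov}(Z,\ \varepsilon_1^{2}) &\neq 0, \\
% Generated instruments: mean-centered Z times first-stage residuals
  \tilde{Z} &= (Z - \bar{Z})\,\hat{\varepsilon}_1 .
\end{align}
```

The second condition is the reason heteroskedasticity in the first stage is required: without it, the generated instruments would be uncorrelated with family size.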

Descriptive Statistics
In order to provide some preliminary insights into the impact of the number of children on female labor market attachment, we present the maternal employment rates (Fig. 1) and the number of working hours among mothers (Fig. 2) by the number of children under age 12, as calculated based on EU-SILC data. In general, the number of children is clearly negatively associated with employment opportunities among European mothers. Having two children instead of one is associated with a difference in employment rates of 17 percentage points. Having a family with three children decreases employment rates by 32 percentage points. Among mothers with four children or more, employment rates are close to zero. These effects vary very strongly depending on the country group, however. As we can see in Fig. 1, in the Nordic countries the employment rates of mothers vary little depending on whether they have one child or two children; only having three or more children is related to a strong decrease in employment rates in these countries. In the French-speaking western European countries, the difference in the employment rates of mothers depending on whether they have one or two children is also smaller than it is in the rest of Europe, but it is larger than in the Nordic countries. In other western European countries, the gap in the employment rates of mothers depending on their number of children is larger. However, it is evident that the strongest penalty for having more than one child can be observed in the Anglo-Saxon countries, the southern European countries, and the CEE countries. In these countries, having a second child is associated with a decrease in employment rates of about 20-30 percentage points, and having more than two children lowers the probability of having a job to close to zero.
While the aggregate employment rates capture how the likelihood of having a job is affected by family size, an indicator of the number of hours worked shows the intensity of labor market involvement. Some women may respond to the increase in the work-family conflict after the birth of the second child by reducing the amount of time they work rather than by simply withdrawing from the labor market. Again, the gaps in the numbers of hours worked by mothers based on their number of children vary strongly across countries. In the Nordic countries women with two children work one hour longer on average than women with one child, and a decrease in the number of working hours can only be seen among women with at least three children, but even then it is modest relative to that of the other European countries. The negative effects of the number of children tend to be strong in Western Europe, especially in the French-speaking and the Anglo-Saxon countries. In the countries of southern Europe and central and eastern Europe the gap in the number of hours worked by mothers with different numbers of children is rather modest. This finding is consistent with previous research showing that women's part-time employment in those regions, particularly in CEE countries, is not very common (Matysiak and Steinmetz 2008).

Country Group Analysis
The above descriptive analysis shows associations rather than genuine relationships between family size and female labor supply. Obviously, women select into groups of mothers with different numbers of children based on a number of factors, which may simultaneously affect employment opportunities. In the next step, we carry out regression analysis in order to see how the effect of the number of children varies across countries after we eliminate the selection effect of the observed and the unobserved characteristics of mothers by means of a 2SLS procedure.
We first present the means of the explanatory variables and the results from the first stage of our regression in Tables 2 and 3, respectively. The results in Table 3 indicate that the number of children is larger among older women, and that it negatively correlates with the age at first birth, which confirms the well documented effect of the postponement of childbearing on the level of completed fertility (Sobotka 2004). The effect of twins at first birth on the number of children is close to 0.7 and the F statistic exceeds by far the level of 10, meaning that the twinning variable is not a weak instrument. In Tables 4 and 5 we report the results from an IV regression in which we "randomized" women according to their number of children by using data on twin births. In order to see how the effect differs depending on whether we control for the unobserved characteristics of women, as a "baseline" we also report results from an OLS regression, which has an identical specification but does not imply a quasi-experimental design.
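The logic of comparing OLS with IV estimates, and of checking the first-stage F statistic, can be illustrated on simulated data. The sketch below is not the authors' estimation code: the data-generating process, the variable names, and the assumed true effect of -0.10 are invented for illustration, and a continuous outcome is used in place of the binary employment indicator.

```python
import numpy as np

def fit(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def first_stage_f(endog, controls, instr):
    """F statistic (squared t-ratio) for a single excluded instrument."""
    Z = np.column_stack([controls, instr])
    coef = fit(endog, Z)
    resid = endog - Z @ coef
    sigma2 = resid @ resid / (len(endog) - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    return coef[-1] ** 2 / cov[-1, -1]

def two_sls(y, endog, controls, instr):
    """2SLS coefficient on the endogenous regressor (point estimate only)."""
    Z = np.column_stack([controls, instr])
    fitted = Z @ fit(endog, Z)                     # first stage
    return fit(y, np.column_stack([controls, fitted]))[-1]  # second stage

# Simulated data (all values hypothetical).
rng = np.random.default_rng(0)
n = 200_000
age = rng.normal(size=n)                           # standardized age
orient = rng.normal(size=n)                        # unobserved family orientation
multi = (rng.random(n) < 0.02).astype(float)       # multiple birth indicator
extra = (orient + rng.normal(size=n) > 0.5).astype(float)  # chosen extra child
nchild = 1.0 + multi + extra

TRUE_EFFECT = -0.10
y = (0.8 + TRUE_EFFECT * nchild - 0.15 * orient + 0.02 * age
     + rng.normal(scale=0.3, size=n))

controls = np.column_stack([np.ones(n), age])
beta_ols = fit(y, np.column_stack([controls, nchild]))[-1]
beta_iv = two_sls(y, nchild, controls, multi)
f_stat = first_stage_f(nchild, controls, multi)
```

In this simulation the unobserved family orientation both raises family size and lowers the outcome, so the OLS coefficient is more negative than the assumed true effect, while the IV estimate based on multiple births stays close to it. This mirrors the selection argument made in the text, under the stated assumptions.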
The OLS results show a decline in the probability of working with an increase in the number of children in all groups of countries. According to the results of the IV regression, this effect is significant neither in the Nordic countries nor in the CEE countries. However, the results of the OLS regression and the IV models are similar in the other country groups, and there are no substantial differences in the effects of the number of children on mothers' employment across these remaining country groups. According to the results of the IV models, an increase in the number of children has a somewhat stronger negative effect on mothers' employment in southern Europe and in the Anglo-Saxon countries (where one additional child decreases the probability of working by about 20 %) than in western Europe (where the corresponding effect amounts to slightly over 10 %).
The results illustrating the effects of family size on the number of working hours are similar. An increase in family size leads to a decline in working time of about 5-8 hours per week. This effect is weaker in the IV regression than in the OLS regression in the Nordic and the CEE countries, where it also ceases to be significant. The differences between the results of these two models in the Nordic and the CEE countries suggest that a strong preference for a larger number of children leads to a reduction in women's working hours in these two groups of countries. Hence, we can observe a selection of strongly family-oriented women into the non-employed group. At the same time, we find no evidence that family size has a causal effect on women's labor supply in these two groups of countries.

Sensitivity Analyses
In order to assess the robustness of our results, we used a strategy proposed by Lewbel (2012) that exploits heteroscedasticity in the first stage equation in order to generate additional instruments, which augment our model so that it becomes overidentified. As a result, we can test the validity of the instrumental variable based on twin births. We carried out this sensitivity analysis separately for every country group, and present the results in Tables 6 and 7. In the first step, we carried out the Breusch-Pagan test, which revealed the presence of heteroscedasticity in the regression for every country group. The presence of heteroskedastic residuals was a precondition for the next step; i.e., for generating additional instrumental variables that are calculated as the product of the first stage equations' residuals and the exogenous variables in mean-centered form. We then compared the models estimated using only the instruments generated by Lewbel's (2012) approach with the models that combine the heteroskedasticity-based instrumental variables and the twin-based instrumental variable, and carried out Sargan tests of the validity of the instrumental variable based on twin births. Under the null hypothesis of validity, the change in the Sargan statistic follows a chi-square distribution with one degree of freedom. Rejection is interpreted as indicating that at least one of the instruments is not valid. The results indicate that with the exception of the southern European countries, the instrumental variable based on twin births can be considered a valid instrument for all regions of Europe considered in our analysis (Appendix Table 8).
Table notes: Source: EU-SILC data. * p < 0.10, ** p < 0.05, *** p < 0.01; heteroscedasticity-robust standard errors in parentheses. Fixed effects for survey years included in the regression. Durbin-Wu-Hausman tests have been carried out to examine the exogeneity of the number of children.
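The generated-instrument check described in this section can be sketched on simulated data as follows. This is an illustration under stated assumptions rather than the authors' procedure: the data-generating process and all variable names are hypothetical, the first-stage residuals are taken from a regression on the exogenous controls only, and the change in the Sargan statistic is computed as a simple difference truncated at zero.

```python
import math
import numpy as np

def fit(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lewbel_instruments(endog, exog, Z):
    """Mean-centered Z multiplied by first-stage residuals, column by column."""
    resid = endog - exog @ fit(endog, exog)
    return (Z - Z.mean(axis=0)) * resid[:, None]

def sargan(y, endog, exog, instr):
    """Sargan statistic: n times R^2 of 2SLS residuals on all instruments."""
    W = np.column_stack([exog, instr])               # full instrument set
    fitted = W @ fit(endog, W)                       # first stage
    beta = fit(y, np.column_stack([exog, fitted]))   # second stage
    u = y - np.column_stack([exog, endog]) @ beta    # residuals use actual endog
    u_hat = W @ fit(u, W)
    return len(y) * (u_hat @ u_hat) / (u @ u)

# Simulated data (all values hypothetical).
rng = np.random.default_rng(1)
n = 20_000
age = rng.normal(size=n)
south = (rng.random(n) < 0.5).astype(float)          # stand-in group dummy
multi = (rng.random(n) < 0.02).astype(float)         # twin-based instrument
orient = rng.normal(size=n)                          # unobserved orientation
v = np.exp(0.5 * age + 0.5 * south) * rng.normal(size=n)  # heteroskedastic error
nchild = 1.0 + multi + 0.2 * age + 0.5 * orient + v
y = (0.8 - 0.10 * nchild - 0.15 * orient + 0.02 * age
     + rng.normal(scale=0.3, size=n))

exog = np.column_stack([np.ones(n), age, south])
Z = np.column_stack([age, south])
generated = lewbel_instruments(nchild, exog, Z)

j_lewbel = sargan(y, nchild, exog, generated)
j_full = sargan(y, nchild, exog, np.column_stack([generated, multi]))

# Change in the Sargan statistic, compared with chi-square(1).
# For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
diff = max(j_full - j_lewbel, 0.0)
p_value = math.erfc(math.sqrt(diff / 2.0))
```

A small p-value would be read as evidence against the validity of at least one instrument in the extended set. Dedicated econometrics software typically computes this difference test with a common variance estimate, which this sketch does not attempt.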

We also carried out a number of further analyses in order to check the sensitivity of our results with respect to modifications of the sample according to ethnicity, age, partnership status, and age at first birth. Specifically, we examined the effects of family size on labor market outcomes of women in the sample (1) restricted to women of EU origin, (2) extended to women up to age 45, (3) restricted to partnered (married or cohabiting) women, and (4) restricted to women who had their first birth before age 30.
The findings from these analyses are available on request. They demonstrate that our conclusion that the number of children does not have a negative effect in the Nordic and the CEE countries is robust across all of the samples except for two cases: the sample of women of EU origin in the Nordic countries, among whom the estimate of the negative effect of the number of children on the probability of working (but not the number of working hours) becomes significantly (but marginally) positive; and the sample of partnered women in the CEE countries, among whom the estimate of the negative effect of the number of children on the probability of working (but not the number of working hours) becomes significantly (although marginally) negative. However, the magnitude of the effect in the latter sample remains smaller than in the other western and southern European countries, where it was found to be negative as well.

Discussion of Key Findings
The aim of this paper was to investigate how the conditions for work and family reconciliation influence the effects of family size on women's employment. Contrary to most of the previous research on the topic, we were able to account for the selection of family-oriented women into the pool of mothers. This was achieved by estimating instrumental variable models using data on multiple births based on the cross-country comparative data for Europe. Our findings clearly show that family size has negative effects on the probability of working and on the number of working hours among women in all of the country groups, except in the Nordic and the CEE countries, where the effects were found to be insignificant. The negative effects emerged regardless of whether we controlled for selection, but they were much weaker after controls were applied. This suggests that previous research that did not take selection into account likely overestimated the negative effect of family size on women's employment. The negative effects of family size on women's employment are fairly strong in two country groups, the Anglo-Saxon countries (Ireland and the United Kingdom) and southern Europe (Italy, Greece, Portugal, and Spain), where public support for working parents is indeed weak and where families have to largely rely on either the market (Anglo-Saxon countries) or the family (southern Europe) to combine paid work with childrearing. These effects seem to be somewhat weaker, but still significantly negative, in the western French-speaking countries (France and Belgium) and in the other western countries (Austria, Germany, the Netherlands, and Luxembourg), even though the public support for working mothers in France and Belgium is better than in the remaining western European countries.
No significantly negative effects of family size on women's employment were found in the Nordic and the post-socialist countries. The finding for the Nordic countries is most likely due to a consistent set of policies in the Nordic states designed to support gender equality in the labor market as well as at home. This policy framework, which has been in place for at least three decades (Leira 2002), includes the broad provision of high quality childcare services to all social strata, and a system of individualized parental leaves that encourages an equal division of care of very young children between the partners. The Nordic countries are currently considered to have the most egalitarian division of paid and unpaid labor (Goldscheider et al. 2015) and to have the best conditions for work and family reconciliation (Matysiak and Weziak-Bialowolska 2016).
In the post-socialist countries, the conditions for work and family reconciliation are far worse than in the Nordic countries. In many of those countries childcare provision for the youngest children (under age 3) is even worse than in many western European countries and men's levels of involvement in unpaid labor at home are relatively low (Fisher and Robinson 2011). Instead, women are offered extensive parental leave opportunities to provide care at home (Saxonberg and Sirovátka 2006; Robila 2012). Nonetheless, in our study we found that raising children does not affect women's employment in this part of Europe in general and has only a slightly negative effect on the probability of working among partnered women. These findings are consistent with those of previous research on CEE countries. Using different methods and different data, this research showed that in this region, women's employment is affected by family size to a lesser extent than in other European countries (except in the Nordic countries) (Matysiak and Steinmetz 2008; Matysiak and Vignoli 2008) and that working women are less likely to postpone the transition to motherhood than women in other European countries (Kreyenfeld 2004; Róbert and Bukodi 2005). Financial necessities and cultural factors (the intergenerational transmission of the image of the working mother) are most often referenced in explanations for these empirical findings. On the one hand, women in the post-socialist countries are expected by society to care for their children at home (Treas and Widmer 2000; Muszyńska 2007). Given the poor childcare provision and generous parental leave programs, women in these countries have few options other than to combine paid work and care. On the other hand, however, women are supposed to accept a double burden once their children are older, and to work for pay just as their mothers did under socialism in order to contribute to the tight household budget (Lück and Hofäcker 2003; Philipov 2008).
A desire to contribute to their family's finances and to improve their family's living standards is often mentioned in empirical studies for the region when young parents are asked why they are willing to accept long working hours and strong work pressures, and why they reject the idea of advocating for shorter or more flexible working hours to achieve greater work-family balance (Hobson et al. 2011; Mrcela and Sadar 2011). Thus, the common pattern found in the data is that women in the CEE countries tend to take long career breaks after the birth of a child (even up to three years), thereby taking advantage of the relatively generous parental leave schemes, and then return to employment, often full-time (Matysiak 2011b).
In this study we focused on the effects of the number of children on female employment and we did not elaborate on other kinds of consequences of motherhood. An issue that definitely requires more attention in future research is the impact of the number of children on women's wages. In some countries, notably in the Nordic countries, gender equality in terms of employment chances coincides with a high degree of gender segregation across public and private sectors, and is accompanied by clear differences in wages across these sectors (Hansen 1997). Hence, while this study documents the differences in the effect of family size on the female labor supply in countries with different welfare state regimes, there may still be trade-offs between employability and financial rewards for work among mothers in different countries.
While this study focuses on the impact of the number of children on female employment across groups of countries, the differences in the institutional and the cultural settings of these countries might also affect national fertility levels (Thévenon and Gauthier 2011). These differences may be important when considering the aging of European societies, and the challenges associated with this process for socioeconomic development. Hence, family-friendly institutional arrangements may have positive effects not only on current employment rates, but also on generational relations in Europe over the long term. Documenting such long-term influences goes beyond the scope of this article, and is left for future research.