1 Introduction

Until the 1970s, the Netherlands had a very low labor force participation rate for women (32% in 1977) compared to other Western countries, such as Sweden (70%) and Denmark (65%) [see Euwals et al. (2011)]. By 2007, this picture had completely changed. Today, the labor force participation rate of Dutch women is among the highest in the Western world with 75.8% according to the OECD (2018). This rapid growth of the female labor force resulted in substantial economic prosperity that could not have been reached solely by population growth. Several researchers, such as Van Ewijk et al. (2006), even believe that with the aging population, a high labor force participation rate of women contributes to the fiscal sustainability of the welfare state.

In this paper, we will estimate reduced form models to investigate trends in labor force participation of Dutch women from 1985 until 2014. The key issue in doing such research is to take into account age, period and cohort effects.Footnote 1 Euwals et al. (2011) correctly point out that it is important to disentangle those three effects in order to predict future trends in labor force participation. Age effects may reflect decisions such as timing of education, fertility and retirement. Period effects might be relevant because of business cycle effects or policy changes which might affect the labor market behavior of all women in the same way. Finally, cohort effects may include educational attainment, lower fertility rates of younger cohorts, changed social norms or the availability of oral contraceptives.

Unfortunately, it is well-known that the linear Age–Period–Cohort (henceforth APC) model suffers from a fundamental point identification problem as the three regressors are perfectly multicollinear. De Ree and Alessie (2011) mention that this problem has been a point of discussion since the 1970s. They investigated which information can be extracted from the data without making any assumptions. They show that age, period and cohort profiles can be fully non-parametrically identified in deviation from a linear trend in age, period and cohort. However, due to the identity Calendar year = Year of birth + Age, the linear trends in age, cohort and time are not separately identified. The finding of De Ree and Alessie (2011) implies that the second derivative of the age profile is fully identified, a result established earlier by e.g. Attanasio (1998) and McKenzie (2006). We will review the paper of De Ree and Alessie (2011) in Sect. 3.

Numerous methods have been developed to address this problem in a wide range of applications. Most of these methods boil down to imposing at least one parameter restriction on either the age-, period- or cohort-dummy coefficients (one assumption is enough).Footnote 2 Preferably, such a restriction should be based on prior information, which is clearly context dependent. A popular restriction, proposed by Deaton and Paxson (1994), says that the period effects are orthogonal to a linear time trend and add up to zero. In terms of the model of De Ree and Alessie (2011) described above, Deaton and Paxson (1994) basically assume that the linear time trend coefficient is equal to zero. Often this assumption is justified by calling upon the importance of unanticipated business cycle effects.

The age–period–cohort problem can also be resolved by imposing at least one restriction on adjacent age-, period- or cohort-dummy coefficients. For instance, Hall (1971), who studied second-hand prices of pickup trucks, came up with a credible restriction on the cohort dummy coefficients: he equates the two most recent cohort dummy coefficients because the quality specifications of the pickup trucks did not change between those 2 years. This means that the quality, which is basically measured by the cohort effect in his model, should be identical.Footnote 3 However, one often makes arbitrary equality assumptions on adjacent age dummy coefficients. De Ree and Alessie (2011) give an example showing clearly that two different arbitrary assumptions lead to very different age profiles of life satisfaction.

Another strategy, proposed by Heckman and Robb (1985), is to come up with proxy variables for either period or cohort effects that describe the underlying processes causing these effects. This method was later used by researchers for different applications, such as saving behavior (Kapteyn et al. 2005), home ownership (Van der Schors et al. 2007) and durable good purchases (Browning et al. 2016). Euwals et al. (2011) study female labor force participation in the Netherlands and proxy the time effects by means of the aggregate (regional) unemployment rate.

Finally, Yang and her coauthors have proposed the so-called intrinsic estimator of the APC model, which we will briefly review in Sect. 3 [see e.g. Yang et al. (2004)]. Browning et al. (2012) have used the intrinsic estimator to estimate the age-, period- and cohort profiles of female labor force participation in the United Kingdom. According to them, those estimates are plausible: the cohort effects rise for successive dates of birth and the age profile is hump shaped.

This paper contributes to the literature on female labor force participation by proposing a new method that assumes a prime working age period for women around their late forties (45–50) when their children are no longer young and they are not yet closely approaching retirement age. We basically assume a constant age effect in that stage of the life cycle. By resolving the age–period–cohort identification problem in this way, we hope to provide a tool for future research projects in female labor market participation. We compare the estimates of the prime working age model with those obtained from: (1) the Deaton Paxson model; (2) the model of Euwals et al. (2011) in which the period effects are modeled through the aggregate unemployment rate; and (3) the “Intrinsic Estimator” (IE) model. The ‘prime working age model’ estimates of the age, period and cohort profiles are plausible and very similar to the ones obtained by the intrinsic estimator. Both estimators produce a time profile which exhibits a slight positive trend. This positive trend might be due to changes in the income tax schedule during the sample period which favored two-earner couples. (By construction) the estimates of the other two models do not exhibit this positive trend. Finally, it should be mentioned that the fit of the prime working age model (measured by the AIC) is slightly better than that of the IE model.

This paper is organized as follows: in Sect. 2, we will introduce the dataset used in this study. Subsequently, we will review in Sect. 3 several approaches to estimate the age-, period- and cohort effects. The empirical results will be discussed in Sect. 4. Section 5 concludes.

2 Data

For this research, we use data from the OSA Labor Supply Panel (OSA Arbeidsaanbodpanel) 1985–2014, which is a Dutch panel survey conducted at the household level. The survey started in 1985 and is conducted every 2 years from 1986 onwards. Each wave has at least 1942 female respondents, with a maximum of 2713 observations in 2006. Next to labor related information, the survey also collects information on household and individual specific characteristics, such as the number of children and education level [see Gesthuizen and Dagevos (2005) for an extensive description of the survey].

The following observations are dropped from the sample: males, students, individuals older than 64 years and individuals that were doing their military service. We also remove observations with missing values for education. We construct our dependent variable, labor force participation, from a question concerning the current labor market position of the respondent (Table 4 presents the survey question). This dummy variable takes on the value 1 if the respondent has a job or is searching for a job. We end up with a sample consisting of 35,741 observations.

Female labor force participation rates by 5-year-of-birth cohorts and age (in %) are presented in Fig. 1. One can read the graph as follows: following one cohort line provides information about the age- and period effects on a given cohort, while studying the vertical difference between lines tells us something about period- and cohort effects for a given age. For example, it is interesting to see that for the older cohorts, labeled as 1926, 1931, 1936 and 1941, the participation rates steeply decrease from age 55 onwards. The large positive vertical difference between the younger cohorts and these older cohorts can be partly attributed to cohort effects and partly attributed to period effects, i.e. changing eligibility conditions regarding early retirement in the Netherlands. A series of policy changes, starting in 1997 for civil servant jobs and ending with a law completely abolishing favorable fiscal conditions for early retirement in 2005, is clearly visible in Fig. 1. Note that in 1997 when these changes commenced, people in the 1936 cohort group were between 59 and 63 years old, while respondents in the 1946 group were between 49 and 53 years old.

Fig. 1 Female participation rates by age and cohort. Note: Cohorts in 5-year groups, from cohorts born in 1989–1993 labeled as 1991 to cohorts born in 1924–1928 labeled as 1926

We can also see that for all cohorts the labor force participation is rather constant between age 45 and 50. This finding provides prima-facie evidence for our prime working age hypothesis which we presented in the introduction. Furthermore, we can also observe a dip in the labor force participation rates around age 30 (especially for the 1961 cohort). A possible explanation is that in the past a considerable fraction of women temporarily stopped working once they got their first child. However, this ‘child valley’ in the labor force participation rate has almost completely disappeared for the younger generations. So, there could be differences in the age effects between different cohorts. To model this, we should allow for interaction effects between age and cohort effects. We choose to drop all observations born after 1963 in order to evade this issue. We also drop all observations younger than 25, because these observations may not have completed their education yet. Consequently, we are left with an estimation sample consisting of 23,684 observations from 7240 respondents. Table 1 provides some summary statistics of the dependent variable and all explanatory variables. Next to age, cohort and period variables, our regression models will also include the following control variables: dummy variables indicating the nationality of the respondent (Dutch), marital status (Partner), the presence of children in the household (ChildHome), and education level.

Table 1 Summary statistics (23,684 observations)

3 Strategies to Address the APC Problem

In this section, we will elaborate on the APC identification problem and discuss methods to address this issue. First, we will explain the essence of the APC identification problem. Second, we will extensively review the paper of De Ree and Alessie (2011) to better understand which conclusions can be drawn from the data without making any identifying assumptions. Then, we will elaborate on three model assumptions that identify the age, period and cohort effects. Finally, we will explain the empirical strategy that we will apply in this paper.

3.1 The Age–Period–Cohort Problem

One of the issues that one can encounter while doing micro-econometric research using panel data is the ease with which one can violate the Gauss–Markov assumption of no perfect multicollinearity. The classical Gauss–Markov assumption dictates that none of the independent variables can be linearly dependent on one another, nor can an independent variable be constant over all observations. The issue when investigating age-, period- and cohort effects using longitudinal data is the following: for an individual i in year t it holds that

$$\begin{aligned} c_i + a_{it} = t, \end{aligned}$$
(1)

where t denotes calendar year \(\left( t\in \{t_{min},\ldots ,t_{max}\}\right) \), \(a_{it}\) is age (expressed in years) of individual i at year t \(\left( a_{it}\in \{a_{min},\ldots ,a_{max}\}\right) \), and \(c_i\) is year of birth of individual i \(\left( c_i\in \{c_{min},\ldots ,c_{max}\}\right) \). Given (1) and considering the case where age-, period- and cohort effects are additive, we clearly have perfect multicollinearity. To clarify, consider the following regression model:

$$\begin{aligned} y_{it} = \phi + \beta a_{it} + \gamma t + \delta c_i + \varepsilon _{it} \end{aligned}$$
(2a)

Substitution of Eq. (1) into (2a) gives

$$\begin{aligned} y_{it}&= \phi + \beta a_{it} + \gamma \left( a_{it} + c_i \right) + \delta c_i + \varepsilon _{it} \nonumber \\&= \phi + \left( \beta + \gamma \right) a_{it} + \left( \delta + \gamma \right) c_i + \varepsilon _{it} \end{aligned}$$
(2b)

This clearly shows that the age-, period- and cohort effects, \(\beta \), \(\gamma \) and \(\delta \), are not separately identified, since models (2a) and (2b) are observationally equivalent.
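The rank deficiency behind (2a) and (2b) can be checked numerically. The following is a minimal sketch on a hypothetical toy panel (the ages, years and variable names are illustrative assumptions, not the OSA data): the four regressors of (2a) span only three dimensions, and the direction \((0, 1, -1, 1)\) in \((\phi , \beta , \gamma , \delta )\) leaves all fitted values unchanged.

```python
import numpy as np

# Hypothetical toy panel: every age 25..29 is observed in every year 2000..2003.
ages, years = np.meshgrid(np.arange(25, 30), np.arange(2000, 2004), indexing="ij")
a = ages.ravel().astype(float)
t = years.ravel().astype(float)
c = t - a  # identity (1): year of birth = calendar year - age

X_apc = np.column_stack([np.ones_like(a), a, t, c])  # regressors of model (2a)
X_ac = np.column_stack([np.ones_like(a), a, c])      # regressors of model (2b)

rank_apc = np.linalg.matrix_rank(X_apc)  # one less than the number of columns
rank_ac = np.linalg.matrix_rank(X_ac)    # full column rank

# Adding any multiple of this direction to (phi, beta, gamma, delta)
# does not change X_apc @ coefficients, because a - t + c = 0.
null_direction = np.array([0.0, 1.0, -1.0, 1.0])
residual = X_apc @ null_direction
```

Dropping the period regressor, as in (2b), restores full column rank, but the estimated slopes are then the sums \(\beta + \gamma \) and \(\delta + \gamma \) rather than the separate effects.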

We will not work with the model proposed in (2a) and (2b), because the linear structure of the profiles of the APC variables is rather restrictive. To also consider nonlinearities, we will work with the APC accounting/multiple classification model proposed by Mason et al. (1973):

$$\begin{aligned} y_{it} = \phi + \sum ^{a_{max}}_{\alpha = a_{min}+1} \beta _\alpha D^A_\alpha (a_{it}) + \sum ^{t_{max}}_{\tau = t_{min}+1} \gamma _\tau D^T_\tau (t) + \sum ^{c_{max}}_{\kappa = c_{min}+1} \delta _\kappa D^C_\kappa (c_{i}) + \varepsilon _{it} \end{aligned}$$
(3)

where the age-, period- and cohort dummies are defined as follows: \(D^A_\alpha (a_{it}) = 1\) if \(a_{it}=\alpha \), 0 otherwise; \(D^T_\tau (t) = 1\) if \(t=\tau \), 0 otherwise; and \(D^C_\kappa (c_i) = 1\) if \(c_i=\kappa \), 0 otherwise. Clearly, the APC problem still applies to this model as (2a) is nested in Eq. (3). Rewriting model (3) in matrix format, we have

$$\begin{aligned} {\varvec{y}}={\varvec{X}}{\varvec{\beta }}+{\varvec{\varepsilon }}\end{aligned}$$
(4)

where we stack all observations of the response variable into the vector \({\varvec{y}}\): \({\varvec{y}}=(y_{1t_{min}},\ldots ,y_{1t_{max}},\ldots ,y_{nt_{max}})'\). The vector \({\varvec{\varepsilon }}\) is defined in an analogous way. Likewise, we collect all observations of the regressors in the regression design matrix \({\varvec{X}}\). This matrix contains a vector of ones and age, period and cohort dummy variable column vectors for the model parameter vector \({\varvec{\beta }}\). This parameter vector has dimension \(m = 1 + (a_{max}-a_{min}) + (t_{max}-t_{min}) + (c_{max}-c_{min})\). Obviously, \({\varvec{X}}\) is not of full column rank because of identity (1). Consequently, the matrix \({\varvec{X}}'{\varvec{X}}\) is singular so that the OLS estimate of \({\varvec{\beta }}\) cannot be computed.
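The same point can be illustrated for the dummy specification (3). The sketch below builds the design matrix on a hypothetical balanced grid of ages and years (the helper name `apc_dummy_design` and the grid are assumptions made for illustration) and confirms that its rank is \(m-1\), so that \({\varvec{X}}'{\varvec{X}}\) has a zero eigenvalue.

```python
import numpy as np

def apc_dummy_design(age, year):
    """Design matrix of model (3): intercept plus age, period and cohort
    dummies, each set omitting its smallest category as the reference."""
    age, year = np.asarray(age), np.asarray(year)
    cohort = year - age
    blocks = [np.ones((age.size, 1))]
    for values in (age, year, cohort):
        levels = np.unique(values)[1:]  # drop the reference category
        blocks.append((values[:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

# Hypothetical balanced grid: ages 25..30 observed in each year 2000..2004.
grid = np.meshgrid(np.arange(25, 31), np.arange(2000, 2005), indexing="ij")
age, year = grid[0].ravel(), grid[1].ravel()

X = apc_dummy_design(age, year)
m = X.shape[1]                                  # 1 + 5 + 4 + 9 columns
rank = np.linalg.matrix_rank(X)                 # equals m - 1
smallest_eig = np.linalg.eigvalsh(X.T @ X)[0]   # numerically zero
```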

3.2 Identification Without Assumptions

McKenzie (2006) has shown that second differences of the age-, period- and cohort effects can be estimated without any normalization restrictions on (3). De Ree and Alessie (2011) have pointed out that complete knowledge of the second differences implies complete knowledge of the shape of the age-, period- and cohort profiles orthogonal to their linear trends. This means that, without having to make any assumption or modification on (3), one can retrieve information about the shapes of these profiles. To show this, we will define the complete age (which we will use as an example, but the same applies for period and cohort) profile, f(a), nested in (3). The age profile orthogonal to the linear trend will be captured by a nonlinear term \(f_\perp (a)\), while the linear trend will be captured by a standard linear operator \(\pi _0 + \pi _1a\), i.e.

$$\begin{aligned} f(a) = \pi _0 + \pi _1a + f_\perp (a) \end{aligned}$$
(5)

Since \(f_\perp (a)\) is said to be orthogonal to \(\pi _0 + \pi _1a\), we have to pose this as a restriction. From McKenzie (2006) and from (5) we know that we can identify \(f''(a)=f_\perp ''(a)\). However, to integrate \(f_\perp ''(a)\) back to \(f_\perp (a)\) without having to acquire additional information, we also need to impose the following restrictions:

$$\begin{aligned} \int _{a_{min}}^{a_{max}} f_\perp (a) da = 0 {\text { and }} \int _{a_{min}}^{a_{max}} f_\perp (a)a da = 0 \end{aligned}$$

The normalization method to create the dummies orthogonal to the linear trend employed by Deaton and Paxson (1994) satisfies these restrictions, so we will make the same normalization. The Deaton and Paxson dummies for age \(\tilde{D}_\alpha ^A(a_{it})\), period \(\tilde{D}^T_\tau (t)\) and cohort \(\tilde{D}^C_\kappa (c_{i})\) will be defined as follows:

$$\begin{aligned} \tilde{D}_\alpha ^A(a_{it})&= D_\alpha ^A(a_{it})+ (\alpha - a_{min} -1)D_{a_{min}}^A(a_{it})-(\alpha - a_{min})D_{a_{min}+1}^A(a_{it}) \\ \tilde{D}^T_\tau (t)&= D_\tau ^T (t) + (\tau - t_{min} - 1)D_{t_{min}}^T(t) - (\tau - t_{min})D_{t_{min}+1}^T(t) \\ \tilde{D}^C_\kappa (c_{i})&= D^C_\kappa (c_{i}) + (\kappa - c_{min} - 1)D^C_{c_{min}} (c_{i}) - (\kappa - c_{min})D^C_{c_{min}+1} (c_{i}) \end{aligned}$$

where \(\alpha =a_{min}+2,\ldots ,a_{max}\), \(\tau =t_{min}+2,\ldots ,t_{max}\) and \(\kappa =c_{min}+2,\ldots ,c_{max}\). Adding these dummies to (2a) gives us:

$$\begin{aligned} y_{it}&= \phi + \beta a_{it} + \gamma t + \delta c_i + \sum ^{a_{max}}_{\alpha = a_{min}+2} \tilde{\beta _\alpha } \tilde{D}^A_\alpha (a_{it}) + \sum ^{t_{max}}_{\tau = t_{min}+2} \tilde{\gamma _\tau } \tilde{D}^T_\tau (t) \nonumber \\&\quad + \sum ^{c_{max}}_{\kappa = c_{min}+2} \tilde{\delta _\kappa } \tilde{D}^C_\kappa (c_{i}) + \varepsilon _{it} \end{aligned}$$
(6)

Note that this model is observationally exactly the same as model (3). By dropping one of the linear terms [as we have done in (2b)] we can estimate the model, while we accept the unidentifiability of the linear terms. The model we then estimate is:

$$\begin{aligned} y_{it}&= \phi + (\beta + \gamma ) a_{it} + (\delta + \gamma ) c_i + \sum ^{a_{max}}_{\alpha = a_{min}+2} \tilde{\beta _\alpha } \tilde{D}^A_\alpha (a_{it}) + \sum ^{t_{max}}_{\tau = t_{min}+2} \tilde{\gamma _\tau } \tilde{D}^T_\tau (t) \nonumber \\&\quad + \sum ^{c_{max}}_{\kappa = c_{min}+2} \tilde{\delta _\kappa } \tilde{D}^C_\kappa (c_{i}) + \varepsilon _{it} \end{aligned}$$
(7)

Although we cannot separately identify the parameters \(\beta \), \(\gamma \) and \(\delta \), we can estimate the detrended age-, period- and cohort profiles.
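As an illustration of the normalization used in this subsection, the sketch below constructs the transformed dummies for consecutive integer categories and checks two properties on a hypothetical grid: any profile built from them sums to zero and is orthogonal to a linear trend (the discrete analogue of the two integral restrictions), and the regressors of model (7) have full column rank. The function name `deaton_paxson_dummies` and the toy ages and years are assumptions for illustration only.

```python
import numpy as np

def deaton_paxson_dummies(values, levels):
    """Transformed dummies D~_k(values) for the categories levels[2:].

    `levels` must hold consecutive integers; levels[0] plays the role of a_min.
    """
    values, levels = np.asarray(values), np.asarray(levels)
    d = (values[:, None] == levels[None, :]).astype(float)  # plain dummies
    k = np.arange(2, len(levels))                           # k = alpha - a_min
    return d[:, 2:] + (k - 1) * d[:, [0]] - k * d[:, [1]]

ages = np.arange(25, 31)
T_age = deaton_paxson_dummies(ages, ages)      # one row per age
rng = np.random.default_rng(0)
coef = rng.normal(size=T_age.shape[1])         # arbitrary coefficients
profile = T_age @ coef                         # implied detrended age profile
sum_profile = profile.sum()                    # should vanish
trend_profile = (ages * profile).sum()         # should vanish

# Regressors of model (7) on a hypothetical balanced grid of ages and years.
grid = np.meshgrid(ages, np.arange(2000, 2005), indexing="ij")
age, year = grid[0].ravel(), grid[1].ravel()
cohort = year - age
X7 = np.column_stack([
    np.ones(age.size), age, cohort,
    deaton_paxson_dummies(age, ages),
    deaton_paxson_dummies(year, np.unique(year)),
    deaton_paxson_dummies(cohort, np.unique(cohort)),
])
full_rank = np.linalg.matrix_rank(X7) == X7.shape[1]
```

In this toy design model (7) has exactly as many columns as the rank of the dummy design of model (3), which is why it can be estimated by OLS even though the three linear trends remain unidentified.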

3.3 Assumptions to Address the Point Identification Problem

Essentially, as we have seen in (7), at least one assumption is necessary to identify a model which controls for all three APC variables. Preferably, such identifying assumptions should be based on some prior knowledge so as to avoid arbitrary results. One such assumption, put forward by Deaton and Paxson (1994), is that the coefficients corresponding to the time dummies add up to zero and are orthogonal to a time trend. In terms of Eq. (6), Deaton and Paxson (1994) assume that the time trend parameter \(\gamma \) is equal to zero, which implies that the Deaton–Paxson model is just identified. If one analyses e.g. consumption data, one could justify this assumption by stating that period effects are only due to macroshocks which average out over time. This is a plausible assumption if the life cycle-permanent income model provides a good characterization of consumption behavior and one has panel data or a time series of cross-sections available with many waves.

In this subsection we will briefly review other types of identifying assumptions which have been proposed in the literature.

3.3.1 Intrinsic Estimator

The Intrinsic Estimator (IE) of the APC model has been introduced by Yang and her coauthors [see e.g. Yang et al. (2004)]. This brief review of the IE is heavily based on Yang (2008). She takes the matrix version of the APC accounting model (4) as starting point of analysis:

$$\begin{aligned} {\varvec{y}}={\varvec{X}}{\varvec{\beta }}+{\varvec{\varepsilon }}\end{aligned}$$

As we have said above, the design matrix \({\varvec{X}}\) has one-less-than-full column rank because of the identity Calendar year = Year of birth + Age. Consequently, there exists an m-dimensional unit length vector \({\varvec{c}}_0\) for which

$$\begin{aligned} {\varvec{X}}{\varvec{c}}_0=\varvec{0}. \end{aligned}$$

\({\varvec{c}}_0\) turns out to be the eigenvector corresponding to the unique zero eigenvalue of the matrix \({\varvec{X}}'{\varvec{X}}\).

Note that the parameter vector space of the APC regression (4) can be decomposed into a direct sum of two linear subspaces that are orthogonal (independent) to each other. The first subspace, N, is the null space defined by the vector \({\varvec{c}}_0\). The other non-null subspace, \({\varvec{\varTheta }}\), is the orthogonal complement of N. Consider now two so-called constrained general linear model (CGLIM) estimators, \(\hat{{\varvec{\beta }}}_1\) and \(\hat{{\varvec{\beta }}}_2\), each obtained by imposing one arbitrary but different equality constraint on the regression coefficients. For instance, one could assume that two adjacent age or cohort coefficients are the same. It can be shown that the difference between those two estimators must be in the null space of \({\varvec{X}}\), i.e.

$$\begin{aligned} {\varvec{X}}(\hat{{\varvec{\beta }}}_1-\hat{{\varvec{\beta }}}_2)={\varvec{X}}(t {\varvec{c}}_0)=\varvec{0}\end{aligned}$$
(8)

where t is an arbitrary real number, depending on the choice of the two estimators. Note that the vector \({\varvec{c}}_0\) only depends on the design matrix \({\varvec{X}}\) and not on the response variable \({\varvec{y}}\). Therefore one could argue that the influence of the eigenvector \({\varvec{c}}_0\) should be removed from the estimation. Due to the orthogonal decomposition of the parameter space one can decompose the parameter vector \({\varvec{\beta }}\) into

$$\begin{aligned} {\varvec{\beta }}={\varvec{\beta }}_{0}+t {\varvec{c}}_0 \end{aligned}$$
(9)

where \({\varvec{\beta }}_{0} = ({\varvec{I}}-{\varvec{c}}_0{\varvec{c}}_0') {\varvec{\beta }}\) is a special parameter vector that is obtained by projecting \({\varvec{\beta }}\) on the non-null space \({\varvec{\varTheta }}\). Notably, the parameter vector \({\varvec{\beta }}_0\) is orthogonal to the arbitrary term \(t{\varvec{c}}_0\). Therefore this parameter is estimable as follows:

$$\begin{aligned} \hat{{\varvec{\beta }}}_0=({\varvec{I}}-{\varvec{c}}_0{\varvec{c}}_0') \hat{{\varvec{\beta }}} \end{aligned}$$
(10)

where \(\hat{{\varvec{\beta }}}_0\) is called the intrinsic estimator and \(\hat{{\varvec{\beta }}}\) denotes any CGLIM estimator. Yang et al. (2004) point out that the IE can also be obtained by using a principal component regression algorithm. Like the Deaton–Paxson APC model, the IE model is just identified because it only imposes one identifying restriction, which involves the geometric orientation of the parameter vector: \(t=0\) (cf. Eq. 9).
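Equations (8) to (10) can be traced numerically. The sketch below is only an illustration under stated assumptions: it uses the reference-category dummies of model (3) on a hypothetical balanced grid, a random placeholder outcome, and two arbitrary equality constraints; the helper names `apc_dummy_design` and `cglim` are ours. It computes \({\varvec{c}}_0\) from a singular value decomposition, obtains two CGLIM estimators, and applies the projection in (10) to each.

```python
import numpy as np

def apc_dummy_design(age, year):
    """Intercept plus age, period and cohort dummies (smallest level omitted)."""
    cohort = year - age
    blocks = [np.ones((age.size, 1))]
    for values in (age, year, cohort):
        levels = np.unique(values)[1:]
        blocks.append((values[:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def cglim(X, y, j, k):
    """Least squares under the single equality constraint beta_j = beta_k."""
    R = np.delete(np.eye(X.shape[1]), k, axis=1)  # beta = R @ theta
    R[k, j if j < k else j - 1] = 1.0             # ties beta_k to beta_j
    theta = np.linalg.lstsq(X @ R, y, rcond=None)[0]
    return R @ theta

grid = np.meshgrid(np.arange(25, 31), np.arange(2000, 2005), indexing="ij")
age, year = grid[0].ravel(), grid[1].ravel()
X = apc_dummy_design(age, year)
y = np.random.default_rng(1).normal(size=X.shape[0])  # placeholder outcome

c0 = np.linalg.svd(X)[2][-1]   # unit-length vector with X @ c0 = 0
b1 = cglim(X, y, 1, 2)         # two adjacent age effects set equal
b2 = cglim(X, y, 10, 11)       # two adjacent cohort effects set equal

P = np.eye(X.shape[1]) - np.outer(c0, c0)   # projection used in (10)
ie1, ie2 = P @ b1, P @ b2
diff = b1 - b2                              # a multiple of c0, as in (8)
```

Both projections return the same vector, which in this sketch is also the minimum-norm least squares solution. Note that the result depends on how the dummies are coded, so this toy parameterization should not be read as a replication of the estimates reported in the literature.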

Yang et al. (2004) conducted an empirical analysis of US female mortality rates and found that the IE yields results that are very similar to those of the CGLIM estimators with sensible equality constraints. Browning et al. (2012) performed an APC analysis on the UK female labor force participation rate. They applied the intrinsic estimator and they estimated a CGLIM model in which they assumed that age effects are the same at ages 39 and 40.Footnote 4 This last assumption is a bit similar to our prime working age assumption. They found that the two estimation methods yielded similar age, period and cohort profiles of female labor force participation. These findings provide support for the intrinsic estimator approach which does not depend on the chosen normalization of e.g. age effects.

3.3.2 Proxy Variable Approach

Euwals et al. (2011) assume that the period effect on female labor force participation is associated with the unemployment rate. This is because an increase in the unemployment rate discourages those without a job from joining the labor market. They reason that replacing the year dummies by the unemployment rate per education group in that year is a viable way to prevent perfect multicollinearity. We will also estimate a model using the unemployment rate as a proxy variable. Unfortunately, we do not have access to unemployment data per education group covering 1985–2014, which is why we use Eurostat’s unemployment rate for women between 25 and 74 years old to estimate the model. Notice that this APC model is over-identified because, compared to model (3), it replaces all period dummies with one variable: the aggregate unemployment rate. In the results section we will test for the validity of those overidentifying restrictions.
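The over-identification of the proxy model can be made concrete by comparing column ranks. In the sketch below the grid of ages and years is hypothetical and the unemployment figures are made-up placeholders, not Eurostat data; the point is only that replacing the period dummies by one variable yields a full-rank design whose column space is strictly smaller than that of model (3).

```python
import numpy as np

def dummies(values):
    """Dummies for every level of `values` except the smallest (reference)."""
    levels = np.unique(values)[1:]
    return (values[:, None] == levels[None, :]).astype(float)

grid = np.meshgrid(np.arange(25, 31), np.arange(2000, 2006), indexing="ij")
age, year = grid[0].ravel(), grid[1].ravel()
cohort = year - age
const = np.ones((age.size, 1))

# Made-up unemployment rates (in %), one per year; NOT real Eurostat figures.
unemp_by_year = {2000: 3.1, 2001: 2.6, 2002: 3.4, 2003: 4.8, 2004: 5.3, 2005: 5.1}
unemp = np.array([unemp_by_year[int(t)] for t in year])[:, None]

X_full = np.hstack([const, dummies(age), dummies(year), dummies(cohort)])  # model (3)
X_proxy = np.hstack([const, dummies(age), unemp, dummies(cohort)])         # proxy model

rank_full = np.linalg.matrix_rank(X_full)
rank_proxy = np.linalg.matrix_rank(X_proxy)
n_overid = rank_full - rank_proxy  # restrictions beyond the one needed for identification
```

With six periods in this toy design the proxy model imposes three overidentifying restrictions. The design would fail to be of full rank only if the proxy were an exact linear function of calendar year.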

3.3.3 Functional Form Approach

Another approach, introduced by Fitzenberger and Wunderlich (2004), is to abandon the nonparametric model (6) and to replace the age, period or cohort function (linear term plus the detrended dummies) by a nonlinear function. Euwals et al. (2011) apply this so-called functional form (specification) approach to an APC model of female labor force participation. They correctly observe that over the past years, there have been large increases in the female labor market participation rate. Part of this increase can be attributed to cohort effects. However, it seems rather unlikely that similar growth will continue forever, as the maximum participation rate is clearly 100%. Therefore, Euwals et al. (2011) assumed that the cohort effect is positively related to the logarithm of the variable ‘year of birth’.

3.3.4 Prime Working Age Assumption

As we already pointed out in Sect. 2, Fig. 1 shows that for most generations labor force participation is rather constant for women who are between 45 and 50 years old. This finding is confirmed by Euwals et al. (2011) who present a similar figure based on the Dutch Labor Force Survey 1992–2004.Footnote 5 Although age and period effects cannot be disentangled in Fig. 1, this finding provides prima-facie evidence for the prime working age hypothesis which says that female labor force participation remains constant around these ages. There appears to exist a prime working age for middle aged women because, for most, their children are no longer very young and they are not yet approaching retirement. So we will assume that age effects on labor force participation do not change between age 45 and 50. To model this, we will create dummies for each year of age except for the ages 45–50 and we will make one additional dummy for anyone between the ages of 45 and 50.

Figure 1 also suggests that labor force participation rises between ages 30 and 45. This finding is consistent with those obtained by Lucassen (2004) who claims that women gradually return to the labor market after they have children. She finds that among those who return to the labor market, most return before age 45. However, there is a sizable fraction who return to the labor market between age 40 and 45 (the average age of return is equal to 43). Therefore we choose as prime working age interval ‘45–50’ and not, say, ‘41–46’. In a sensitivity analysis we will investigate how the results are affected by considering an alternative prime working age interval of 41–46 instead of 45–50.

Essentially, the prime working age model is a CGLIM model in which we address the APC problem by making an assumption on adjacent age effects. Notice also that the prime working age model is overidentified because it places restrictions on six adjacent age effects.
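The counting argument can be illustrated as follows. Setting six adjacent age effects equal amounts to five equality restrictions, of which one is needed for identification, leaving four overidentifying restrictions. The sketch below checks this on a hypothetical grid of ages and years (the grid and the helper names are assumptions for illustration) by pooling ages 45 to 50 into a single dummy.

```python
import numpy as np

def dummies(values):
    """Dummies for every level of `values` except the smallest (reference)."""
    levels = np.unique(values)[1:]
    return (values[:, None] == levels[None, :]).astype(float)

def prime_age_dummies(age, lo=45, hi=50):
    """Age dummies with the ages lo..hi pooled into one category."""
    pooled = np.where((age >= lo) & (age <= hi), lo, age)
    return dummies(pooled)

# Hypothetical balanced grid: ages 40..55 observed in each year 2000..2007.
grid = np.meshgrid(np.arange(40, 56), np.arange(2000, 2008), indexing="ij")
age, year = grid[0].ravel(), grid[1].ravel()
cohort = year - age
const = np.ones((age.size, 1))

X_full = np.hstack([const, dummies(age), dummies(year), dummies(cohort)])
X_prime = np.hstack([const, prime_age_dummies(age), dummies(year), dummies(cohort)])

rank_full = np.linalg.matrix_rank(X_full)     # one less than its column count
rank_prime = np.linalg.matrix_rank(X_prime)   # full column rank
n_overid = rank_full - rank_prime             # overidentifying restrictions
```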

3.4 Empirical Models

We will first estimate (7) to find the detrended age-, period- and cohort profiles of female labor force participation.Footnote 6 We will enrich this specification by including a set of control variables \({\varvec{x}}_{it}\).Footnote 7\(^,\)Footnote 8 Next to model (7), we will also consider the following regression models:

  1. The “prime working age model” in which we assume that ceteris paribus the labor force participation rate does not change between age 45 and 50;

  2. A model using the nationwide unemployment rate, \(unemp_t\), as a proxy variable for the period effects:

    $$\begin{aligned} y_{it}&= \phi + {\varvec{\mu }}' {\varvec{x}}_{it} + \sum ^{64}_{\alpha = 26} \beta _\alpha D^A_\alpha (a_{it}) + \gamma unemp_t + \sum ^{1963}_{\kappa = 1925} \delta _\kappa D^C_\kappa (c_{i}) + \varepsilon _{it} \end{aligned}$$
    (11)

  3. The “Intrinsic Estimator (IE) model” described in Sect. 3.3.1 (with control variables included).

We will estimate the parameters of the models by means of pooled OLS. Since we have panel data at our disposal, we will compute robust standard errors clustered at the individual level. Notice that we are basically estimating linear probability models as our dependent variable (female labor force participation) is binary. We also estimated the probit versions of the models presented above. It turns out that the probit and linear probability models yield very similar age, period and cohort profiles. Moreover, the linear probability models generated predictions outside the unit interval for less than 5 percent of the sample. We therefore abstain from presenting the estimation results based on the probit models.
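For readers unfamiliar with the estimator, the following sketch shows pooled OLS for a linear probability model with standard errors clustered at the individual level. It is a minimal illustration on simulated data: the function name `pooled_ols_cluster`, the data-generating process and the absence of a finite-sample correction are all assumptions of the sketch, not a description of the software used for the results in this paper.

```python
import numpy as np

def pooled_ols_cluster(X, y, groups):
    """Pooled OLS with standard errors clustered on `groups`.

    Plain sandwich formula, without a finite-sample correction.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros_like(XtX_inv)
    for g in np.unique(groups):
        score = X[groups == g].T @ resid[groups == g]  # X_g' u_g
        meat += np.outer(score, score)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated panel: 200 individuals, 3 waves each, binary outcome.
rng = np.random.default_rng(42)
n_id, n_wave = 200, 3
ids = np.repeat(np.arange(n_id), n_wave)
x = rng.normal(size=ids.size)
u_i = np.repeat(rng.normal(size=n_id), n_wave)  # individual effect
y = (0.3 * x + u_i + rng.normal(size=ids.size) > 0).astype(float)
X = np.column_stack([np.ones(ids.size), x])

beta, se_cluster = pooled_ols_cluster(X, y, ids)
# One cluster per observation reduces to heteroskedasticity-robust errors.
_, se_robust = pooled_ols_cluster(X, y, np.arange(ids.size))
fitted = X @ beta
share_outside = np.mean((fitted < 0) | (fitted > 1))  # predictions outside [0, 1]
```

Clustering on the individual allows the error terms of the same respondent to be correlated across waves, which is the reason for preferring it over conventional standard errors in a panel setting.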

4 Results

4.1 Detrended Age-, Period- and Cohort Profiles

The estimates of the linear trend parameters of model (7) can be found in Table 2 while the detrended age-, period- and cohort effects (i.e. the coefficients \(\tilde{\beta }_\alpha \), \(\tilde{\gamma }_\tau \) and \(\tilde{\delta }_\kappa \) for all \(\alpha \), \(\tau \) and \(\kappa \)) are displayed in Fig. 2. The estimated sum of the age and time trend parameters, \(\widehat{\beta + \gamma }\), is very small (0.0007) and does not differ significantly from zero. This result does not necessarily indicate the absence of linear trends in time and age. There could be a negative age and a positive time trend (or vice versa) which cancel each other out. The parameter estimate on cohort should be interpreted as the sum of linear effects in year of birth and time: \(\widehat{\delta + \gamma } = 0.0133\). Notably, its positive value does not imply that younger cohorts work more than the older ones or that female labor force participation increases over time.

The detrended age profile (see Fig. 2) is convex around age 30 and hump shaped after that age with a rather flat part around the middle age. The convexity of the detrended age profile at young ages is consistent with the ‘child valley’ hypothesis (see Sect. 2) which says that (in the past) some young married women temporarily stopped working when their first child was born. However, since the age-, period- and cohort trend parameters are not separately identified, we cannot unambiguously conclude that there is an actual dip in the female labor force participation rate around age 30. The concavity of the detrended age profile after the prime child bearing ages reassures us that the data complies with the previous literature on female labor force participation which says that labor force participation drops at older ages because of retirement. The detrended age profile also provides some information about the validity of the prime working age hypothesis presented in Sect. 3.3.4. The prime working age hypothesis of constant labor force participation rates around middle ages (45–50) implies that the detrended age profile should be linear around those ages (or that the second derivative of the age function is equal to 0). According to Fig. 2, the detrended age profile indeed seems to be linear around ages 45–50.
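The link between a flat age effect and a linear detrended profile can be illustrated with a stylized example. The age profile in the sketch below is entirely hypothetical (rising until age 45, flat between 45 and 50, falling afterwards); it shows that removing a linear trend leaves second differences unchanged, so that the detrended profile is linear, but not necessarily flat, over the prime working ages.

```python
import numpy as np

ages = np.arange(25, 65)
# Hypothetical profile: rising until 45, flat on 45..50, falling afterwards.
profile = (np.where(ages < 45, 0.02 * (ages - 45), 0.0)
           - np.where(ages > 50, 0.001 * (ages - 50) ** 2, 0.0))

# Remove the linear trend by regressing the profile on a constant and age.
Z = np.column_stack([np.ones(ages.size), ages])
detrended = profile - Z @ np.linalg.lstsq(Z, profile, rcond=None)[0]

d2_raw = np.diff(profile, n=2)          # second differences of the profile
d2_detrended = np.diff(detrended, n=2)  # identical after detrending

# Second differences centred on ages 46..49 only use ages within 45..50.
centre = ages[1:-1]
prime = (centre > 45) & (centre < 50)
slope_in_prime = detrended[ages == 50][0] - detrended[ages == 45][0]
```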

Table 2 Age-, period- and cohort effects on participation

Fig. 2 Detrended age-, period- and cohort profiles

The detrended cohort function seems to be convex. This result runs counter to the hypothesis put forward by Euwals et al. (2011), who assumed that the cohort effects are positively related to the logarithm of the variable ‘year of birth’ (see Sect. 3.3.3). Figure 2 does not confirm this hypothesis because the logarithm is a concave function. It is interesting to see that we can reject the model of Euwals et al. (2011) without making any point identifying assumption. We also estimated the model of Euwals et al. (2011) and found an implausible negative and significant estimate for the \(\ln (birthyear)\)-coefficient.Footnote 9 In light of the convex cohort profile, this result is not really surprising (\(-\ln (birthyear)\) is a convex function). In addition, we obtained implausible estimates for the age and time profiles.

The detrended period profile appears to fluctuate around zero with rather small coefficients, which is not surprising as period effects have been shown to make only a small difference in the previous literature.Footnote 10

4.2 Prime Working Age Model

Now, we consider APC models in which at least one assumption is made in order to fully identify the age, period and cohort profiles. We choose as baseline the prime working age model, which assumes that the age effect is constant between age 45 and 50. Its age, period and cohort profiles are displayed in Fig. 3, whereas the parameter estimates corresponding to the control variables are presented in the first column of Table 3. Not surprisingly, female labor force participation seems to be positively associated with education level and negatively correlated with marital status and with the presence of children. Table 3 also summarizes the results of some statistical tests. From Table 3 it becomes clear that the other APC models yield almost the same parameter estimates for the control variables as the prime working age model.Footnote 11 As the prime working age model is overidentified, we have carried out a test of overidentifying restrictions. It turns out that the overidentifying restrictions are not rejected (see row ‘p value misspecification test prime working age assumption (\(\chi ^2(4)\))’). From Table 3 it also becomes clear that the age, period and cohort dummy variables are jointly significant. This result implies that any model which ignores either age-, period- or cohort effects does not adequately describe patterns in Dutch female labor force participation.
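The four degrees of freedom of the test can be understood by counting restrictions. The sketch below assumes single-year age dummies (an assumption made here for illustration; the helper `equality_restrictions` is hypothetical): a constant age effect on 45–50 then amounts to five adjacent-age equalities, of which one is needed to identify the model.

```python
def equality_restrictions(first_age, last_age):
    """Adjacent-age equalities that make the age effect constant on an interval."""
    return [(age, age + 1) for age in range(first_age, last_age)]


# Restrictions implied by a constant age effect between ages 45 and 50,
# assuming one dummy variable per single year of age.
restrictions = equality_restrictions(45, 50)

# The linear APC model needs one restriction for point identification.
needed_for_identification = 1
overidentifying = len(restrictions) - needed_for_identification
```

Under these assumptions `overidentifying` equals 4, which matches the \(\chi ^2(4)\) distribution of the misspecification test reported in Table 3.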

The shape of the age profile is plausible: it shows a ‘child valley’ in the labor force participation rate around age 30 and a hump shape after that age. It also turns out that labor force participation is at its maximum at the prime working age (45–50); it starts to decline thereafter due to retirement and/or disability. The time profile shows a slight positive trend. Apparently, the period effects cannot be exclusively attributed to unanticipated business cycle shocks: in that case we would not have observed any trend in the period effects. According to the cohort profile, younger generations work, ceteris paribus, more often than the older ones, potentially due to changed social norms. Notice that our regression model includes dummy variables indicating the education level of the respondent. This implies that the positive trend in the cohort effect cannot be attributed to improved educational attainment.

Table 3 Estimation results of four APC models
Fig. 3 Age, cohort and year effects on female labor force participation: prime working age model

4.3 Comparison with Other APC Models

In addition to the prime working age model we have estimated the following APC models: (1) the IE model; (2) the Deaton–Paxson model; and (3) the ‘unemployment model’ [cf. Eq. (11)]. The age, period and cohort profiles of the four models are displayed in Fig. 4. The IE and prime working age models generate similar predictions. This result is also obtained by Yang et al. (2004) and Browning et al. (2012). They find that the APC profiles of the IE model are similar to those of CGLIM models which are based on plausible equality restrictions.Footnote 12 As we said before, the prime working age model can be seen as a specific CGLIM model which imposes the restriction that the age effect is constant between ages 45 and 50. We think that the similarity of the APC profiles obtained by the IE and prime working age models confirms the plausibility of the prime working age model. As the IE model is just-identified and the prime working age model is over-identified, we can compare the fit of the two models on the basis of the adjusted \(R^2\) and the Akaike information criterion.Footnote 13 On the basis of those two criteria (a slightly higher adjusted \(R^2\) and a slightly lower Akaike information criterion), we should prefer the prime working age model over the IE model. However, one can also argue, like Yang et al. (2004) and Browning et al. (2012), that one should opt for the IE model because it does not rely on normalization(s) of the age effects.
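To see how an over-identified model can outperform a just-identified one on both criteria, consider the following sketch. It uses textbook definitions of the adjusted \(R^2\) and of the AIC for a linear model with Gaussian errors (up to an additive constant); the exact definitions used for Table 3 are not restated in this section, and all numbers below are hypothetical.

```python
import math


def adjusted_r2(rss, tss, n_obs, n_params):
    """Adjusted R^2; n_params counts slope parameters, excluding the intercept."""
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params - 1)


def gaussian_aic(rss, n_obs, n_params):
    """AIC of a Gaussian linear model, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


# Hypothetical fit statistics (not taken from Table 3).
n_obs, tss = 10_000, 2500.0
just_identified = {"rss": 2000.0, "n_params": 100}
over_identified = {"rss": 2000.5, "n_params": 96}  # four extra restrictions

adj_r2_just = adjusted_r2(just_identified["rss"], tss, n_obs, just_identified["n_params"])
adj_r2_over = adjusted_r2(over_identified["rss"], tss, n_obs, over_identified["n_params"])
aic_just = gaussian_aic(just_identified["rss"], n_obs, just_identified["n_params"])
aic_over = gaussian_aic(over_identified["rss"], n_obs, over_identified["n_params"])
```

The restricted model cannot have a smaller residual sum of squares, but both criteria reward the four parameters it saves. In this example the small loss of fit is outweighed, so the over-identified model has the higher adjusted \(R^2\) and the lower AIC.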

Contrary to the IE model and prime working age model, the Deaton–Paxson model yields a period profile which, by assumption, does not exhibit any trend. Like Browning et al. (2012), we believe that the cyclicality assumption underlying the Deaton–Paxson model might not be correct in this context because some policy changes (e.g. changes in the tax legislation) might have affected labor market behavior of all women in the same way. The Deaton–Paxson model ignores the positive linear trend in the period effects, i.e. it assumes that \(\gamma =0\) in Eq. (7). Consequently, the Deaton–Paxson model implicitly imposes a higher linear age and cohort trend than the IE and prime working age models. However, it should also be stressed that the Deaton–Paxson and IE models produce APC profiles which are not dramatically different.
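The mechanics behind the higher age and cohort trends can be sketched with the two identified sums from Table 2, \(\widehat{\beta + \gamma } = 0.0007\) and \(\widehat{\delta + \gamma } = 0.0133\). These values are used purely for illustration, because the models in this section also include control variables. Any assumed value of \(\gamma \) determines the other two trends; the positive value of \(\gamma \) below is hypothetical and is not an estimate from the IE or prime working age model.

```python
def implied_trends(age_plus_period, cohort_plus_period, gamma):
    """Split the identified sums into (age, period, cohort) trends for a given gamma."""
    beta = age_plus_period - gamma
    delta = cohort_plus_period - gamma
    return beta, gamma, delta


SUM_AGE_PERIOD = 0.0007     # reported estimate of beta + gamma (Table 2)
SUM_COHORT_PERIOD = 0.0133  # reported estimate of delta + gamma (Table 2)

# Deaton-Paxson normalization: no linear trend in the period effects.
deaton_paxson = implied_trends(SUM_AGE_PERIOD, SUM_COHORT_PERIOD, gamma=0.0)

# Hypothetical alternative with a positive period trend (illustrative value).
positive_period = implied_trends(SUM_AGE_PERIOD, SUM_COHORT_PERIOD, gamma=0.002)
```

With \(\gamma = 0\) the full identified sums are assigned to the age and cohort trends. Any positive period trend lowers both of them by the same amount, and in this illustration it even turns the implied age trend negative.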

The age and cohort profiles of the Deaton–Paxson and the unemployment models are very similar. According to the unemployment model (11), the aggregate unemployment rate has a small negative effect on female labor force participation but this estimate does not significantly differ from zero. As a result, period effects are almost absent in this model. A misspecification test indicates that the over-identified unemployment model should be rejected against any just identified APC model such as the Deaton–Paxson model or the IE model (see row ‘p value misspecification test year effects’ in the last column of Table 3).

4.4 Robustness Checks

We estimated two alternative models in order to check the robustness of our results. Up to now, our estimation sample consisted of women born between 1924 and 1963. We did not consider younger generations because Fig. 1 suggested that the ‘child valley’ in the labor force participation rates has almost completely disappeared for them. In other words, interaction between age and cohort effects might be relevant, which additive APC models do not take into account. Since the modeling of such interaction effects is a very complicated exercise, we decided to consider only a limited number of generations in our sample. In the first sensitivity analysis, we extend the estimation sample by including women born between 1964 and 1983. The APC profiles are presented in Fig. 5. The main results, which we summarized in the previous subsection, still stand: the IE and prime working age models yield very similar age-, period- and cohort profiles and the shape of those profiles is not dramatically affected by the inclusion of younger cohorts in the sample. We still see a slight positive trend in the period effects. This result runs counter to the Deaton–Paxson and unemployment models, which still generate similar results. Finally, the prime working age model still has the best fit in terms of the adjusted \(R^2\) and the Akaike information criterion.

In the second sensitivity analysis we investigate how the results are affected by considering an alternative prime working age interval of 41–46 instead of 45–50. Figure 6 shows that the alternative choice of the prime working age interval leads to quite different and rather implausible APC profiles. First, according to the new model, labor force participation increased by 30 percentage points between 1985 and 2014 because of period effects. We do not observe such a strong positive trend in the baseline prime working age model (5 percentage points, see Fig. 4).Footnote 14 Simultaneously, labor force participation drops by almost 20 percentage points between age 48 and 55 (ceteris paribus) according to the alternative prime working age model. This strong drop in labor force participation, which we do not observe in the baseline model (5 percentage points), cannot be explained by early retirement because most early retirement schemes which prevailed during the sample period did not allow for early retirement before age 55. We find it hard to believe that the sharp decline in labor force participation between age 48 and 55 can be explained by either disability or unemployment. Finally, it should be mentioned that the alternative prime working age model predicts rather small cohort effects: labor force participation of women born in 1963 is 15 percentage points higher than for those born in 1930 (ceteris paribus). The corresponding number for the baseline prime working age model is about 40 percentage points.

In Sect. 3.3.4 we already justified our choice of ‘45–50’ as the prime working age interval instead of ‘41–46’. Figure 1 also provides some information about how the alternative choice of the prime working age interval will affect the age-, period- and cohort profiles. According to the OSA data, the labor force participation rate of the generation born between 1954 and 1958 (labeled as 1956) increased almost linearly from about 40% in 1985 to 80% in 2002 (when this generation was around 46 years old) and did not change significantly between 2002 and 2008. In other words, women of this generation increased their labor force participation between age 40 (year 1996) and 46 (year 2002) and not between age 45 and 50. A similar pattern can also be observed for the 1951 and 1946 cohorts. These changes along ‘cohort curves’ might be attributed to age effects, period effects or a combination of both. If one adopts a prime working age interval of 41–46, one basically assumes that the rise in labor force participation of the 1956 generation between age 41 and 46 can mainly be explained by time effects, which explains the finding presented in the bottom panel of Fig. 6. Notice also that labor force participation rates between ages 45 and 50 did not change much for the 1961, 1956, and 1941 generations.

5 Conclusions

In the past four decades, female labor force participation rose dramatically in the Netherlands from under 35% to over 75%. One could wonder which factors have contributed to this increase. In order to answer this question, it is important to come up with models which consistently estimate age-, period- and cohort effects. Such effects cannot be disentangled without making at least one identifying assumption. In this paper we propose the prime working age model which normalizes the age effects between 45 and 50 to be the same. Figure 1 provides some prima facie evidence that this normalization might be plausible. Moreover, we find that the detrended age profile is approximately locally linear around the ages 45–50 which is a necessary (but not sufficient) condition for the validity of the prime working age hypothesis. We compare the predictions of this model with those of three other APC models: the Intrinsic Estimator (IE) model, the Deaton–Paxson model and the unemployment model. It turns out that the predicted APC profiles of the IE and prime working age models are similar and plausible. Browning et al. (2012) and Yang et al. (2004) obtain a similar result. They advocate the IE approach by stating that its predictions agree with APC models which are based on plausible normalizations. According to the prime working age and IE models there is a slight positive linear trend in the period effects. This result invalidates the cyclicality assumption underlying the Deaton–Paxson model which precludes such a trend. The positive trend in the period effects seems plausible because policy measures (e.g. changes in tax legislation) have been taken during the sample period to stimulate labor supply by women. These policy changes might have affected labor market behavior of all women in the same way. Finally, it should also be mentioned that the prime working age model yields a lower AIC than the Deaton–Paxson model, indicating a better fit.

Future research should pay more attention to modeling interactions between age-, period- and cohort effects. For instance, Fig. 1 suggests that women from younger cohorts retire later and leave the labor market less frequently when they have their first child than women from older cohorts.

Fig. 4 Age, cohort and period effects on female labor force participation: results of four different models; prime working age interval: 45–50

Fig. 5 Age, cohort and period effects on female labor force participation: results of four different models; younger cohorts included in sample

Fig. 6 Age, cohort and period effects on female labor force participation: results of four different models; prime working age interval: 41–46