Simultaneity in binary outcome models with an application to employment for couples

Two of Peter Schmidt’s many contributions to econometrics have been to introduce a simultaneous logit model for bivariate binary outcomes and to study estimation of dynamic linear fixed effects panel data models using short panels. In this paper, we study a dynamic panel data version of the bivariate model introduced in Schmidt and Strauss (Econometrica 43:745–755, 1975) that allows for lagged dependent variables and fixed effects as in Ahn and Schmidt (J Econom 68:5–27, 1995). We combine a conditional likelihood approach with a method of moments approach to obtain an estimation strategy for the resulting model. We apply this estimation strategy to a simple model for the intra-household relationship in employment. Our main conclusion is that the within-household dependence in employment differs significantly by the ethnicity composition of the couple even after one allows for unobserved household specific heterogeneity.


Introduction
A large recent literature has been concerned with econometric models in which binary outcomes interact with each other. The papers by Bresnahan and Reiss (1991) and Tamer (2003) are early examples of this. In those papers, the dependence is due to strategic interactions between economic agents. This literature was predated by Schmidt and Strauss (1975) who proposed a reduced form statistical model that has the feature that the conditional distribution of each binary variable depends on the outcome of the other.
At the same time, a large econometric literature has been concerned with estimation of linear panel data models with fixed effects and lagged dependent variables. This literature dates back to Nickell (1981) and Anderson and Hsiao (1982). The paper by Ahn and Schmidt (1995) is an important contribution to this literature. This paper combines insights from these literatures by illustrating how the simultaneous binary outcome model in Schmidt and Strauss (1975) can be modified to allow for panel data with individual specific fixed effects and lagged dependent variables. The main contribution of the paper is to develop a toolbox of estimation procedures that can be used to estimate the resulting models.
Methodologically, the paper fits into the literature that is concerned with estimation of standard nonlinear panel data models with fixed effects using short panels. This literature has a long history in econometrics. The main problem to be solved is that treating the fixed effects as parameters to be estimated will typically lead to inconsistent estimation of all the model parameters. The literature has developed a number of methods to deal with this. One approach for parametric models is to try to construct a non-trivial sufficient statistic for the fixed effect. If such a sufficient statistic exists, then conditional maximum likelihood (conditional on this sufficient statistic) can typically be used to estimate the parameters of the model. This approach was, for example, taken by Rasch (1960) and Hausman et al. (1984) for the logit model and the Poisson regression model, respectively. Manski (1987) proposed a conditional maximum score estimator for the semiparametric binary response model with fixed effects, which can be thought of as a generalization of the conditional maximum likelihood approach. Honoré and Kyriazidou (2000) adapted both the conditional maximum likelihood and the conditional maximum score methods to binary outcome models with lagged dependent variables and fixed effects. A second strand of the literature has studied specific semiparametric models and has been able to find moment conditions which do not depend on the fixed effects, and which can therefore be used to estimate the model parameters via generalized method of moments. See, for example, Honoré (1992), Chamberlain (1992), Kyriazidou (1997), Wooldridge (1997), Kyriazidou (2001) and Hu (2002). More recently, Johnson (2004), Kitazawa (2013), Honoré et al. (2021), Honoré and Weidner (2022) and Davezies et al. (2022) have derived moment conditions for parametric logit-type models with fixed effects, for which the conditional likelihood approach cannot be applied.
In this paper, we study estimation of a dynamic fixed effects panel data version of the Schmidt-Strauss model. It turns out that although the conditional likelihood approach can be applied to identify and estimate some of the parameters of the model, it does not identify the key parameter that captures the dependence between the binary outcomes. On the other hand, it turns out that one can construct moment conditions that do depend on this parameter, which can therefore be estimated by generalized method of moments.
As an empirical illustration of the models and methods studied in this paper, we investigate the joint determination of husbands' and wives' employment. In this context, it is natural to allow for the possibility that the outcome for each spouse is related to the outcome of the other, which makes it natural to consider the Schmidt-Strauss framework. The specific empirical question is how the parameter that captures the dependence between outcomes for husbands and wives differs by the ethnicity of the couple, and whether it varies over time. Since there is likely persistence in employment, and since some of this persistence might be due to heterogeneity as opposed to true state dependence, we study this question using dynamic panel data versions of the model proposed by Schmidt and Strauss (1975).
The paper is organized as follows: In Sect. 2, we present the Schmidt and Strauss (1975) model. In Sect. 3, we discuss the data. Section 4 presents simple evidence for the intra-household dependence in couples' employment by ethnicity. Section 5 develops and discusses a conditional likelihood approach for estimating a version of the Schmidt and Strauss model that incorporates lagged dependent variables as well as fixed effects. Section 6 discusses how the method of moments approach of Honoré and Weidner (2022) can be used to identify the dependence parameter. In Sect. 7, we compare the fixed effects approach to a correlated random effects approach in the spirit of Wooldridge (2005). Section 8 concludes. The Appendix provides moment conditions for a special case of the model in Sect. 6.
Schmidt and Strauss (1975) proposed a cross-sectional simultaneous equations logit model in which two binary variables, $y_{1,i}$ and $y_{2,i}$, for a unit $i$ are each distributed according to a logit model conditional on the other variable and on a set of explanatory variables:
$$P\left(y_{1,i} = 1 \mid y_{2,i}, x_{1,i}, x_{2,i}\right) = \Lambda\left(x_{1,i}'\beta_1 + \rho\, y_{2,i}\right),$$
$$P\left(y_{2,i} = 1 \mid y_{1,i}, x_{1,i}, x_{2,i}\right) = \Lambda\left(x_{2,i}'\beta_2 + \rho\, y_{1,i}\right). \tag{1}$$
Here $x_{1,i}$ and $x_{2,i}$ are vectors of explanatory variables, $\beta_1$, $\beta_2$ and $\rho$ are parameters to be estimated, and $\Lambda(\cdot)$ is the logistic cumulative distribution function. The parameter $\rho$ captures the dependence between $y_{1,i}$ and $y_{2,i}$. Schmidt and Strauss (1975) show that this model cannot be generalized to allow for different values for $\rho$ in the distribution of $y_{1,i}$ given $y_{2,i}$ and in the distribution of $y_{2,i}$ given $y_{1,i}$. In this sense, $\rho$ resembles the covariance between two random variables. When the parameter $\rho$ is positive (negative), the probability that $y_{1,i}$ equals one is higher (lower) conditional on $y_{2,i}$ being one than conditional on $y_{2,i}$ being zero. The same holds for the probability that $y_{2,i}$ is one conditional on $y_{1,i}$. Holding the explanatory variables fixed, a positive (negative) $\rho$ therefore corresponds to a positive (negative) statistical association between $y_{1,i}$ and $y_{2,i}$.
The simultaneous logit model of Schmidt and Strauss (1975) has been applied in a variety of cross-sectional studies and in various fields such as labor economics (for example, by Lehrer and Stokes (1985) to study the determinants of different aspects of a chosen occupation), urban economics (for example, by Boehm (1981) to study the effects of various variables on the choice to own or rent and on expected future mobility), health economics (for example, by Akin et al. (1981) to study the use of different kinds of health services, and by Wang and Rosenman (2007) to study the need for health insurance on one hand and actual purchase of health insurance on the other), transportation (for example, by Ye et al. (2007) to study the relationship between mode of transportation and trip chaining), political economy (for example, by Kau et al. (1982) to study the interactions between congressional voting, campaign contributions and electoral margins), and demography (for example, by Koo and Janowitz (1983) to study the relationship between the probability of dissolving a marriage and of having a child).
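The conditional probabilities in Eq. (1) emerge from a statistical model in which $y_{1,i}$ and $y_{2,i}$ have the joint probability distribution
$$P\left(y_{1,i} = c_1, y_{2,i} = c_2 \mid x_{1,i}, x_{2,i}\right) = \frac{\exp\left(c_1 x_{1,i}'\beta_1 + c_2 x_{2,i}'\beta_2 + c_1 c_2 \rho\right)}{1 + \exp\left(x_{1,i}'\beta_1\right) + \exp\left(x_{2,i}'\beta_2\right) + \exp\left(x_{1,i}'\beta_1 + x_{2,i}'\beta_2 + \rho\right)}, \quad c_1, c_2 \in \{0, 1\}. \tag{2}$$
Another way to see that $\rho$ measures the dependence between $y_{1,i}$ and $y_{2,i}$ in Eq. (2) is to note that
$$\rho = \log\left(\frac{P\left(y_{1,i} = 1, y_{2,i} = 1 \mid x_{1,i}, x_{2,i}\right) P\left(y_{1,i} = 0, y_{2,i} = 0 \mid x_{1,i}, x_{2,i}\right)}{P\left(y_{1,i} = 1, y_{2,i} = 0 \mid x_{1,i}, x_{2,i}\right) P\left(y_{1,i} = 0, y_{2,i} = 1 \mid x_{1,i}, x_{2,i}\right)}\right). \tag{3}$$
Therefore, $\log P\left(y_{1,i} = c_1, y_{2,i} = c_2 \mid x_{1,i}, x_{2,i}\right)$ is supermodular or submodular depending on whether $\rho > 0$ or $\rho < 0$. To understand how the magnitude of $\rho$, as opposed to its sign, translates into other measures of dependence, one can consider the following thought experiment: Suppose that, for a given $\rho$, $\beta_1$ and $\beta_2$ above are chosen such that $y_{1,i}$ and $y_{2,i}$ are Bernoulli, each with probability of success equal to 0.5. The correlation between $y_{1,i}$ and $y_{2,i}$ then relates to $\rho$ as depicted in Fig. 1.
Fig. 1 The Relationship between ρ and the Correlation Coefficient. The figure shows the correlation between two Bernoulli random variables from the model in Eq. (2), each with probability of success equal to 1/2, as a function of the parameter ρ
Below, we apply the model of Schmidt and Strauss (1975) (and its panel data extensions) to an empirical study of husbands' and wives' employment status. In this context, $i$ denotes the identity of the household, and $y_{1,i}$ and $y_{2,i}$ will denote the employment status of the wife and the husband, respectively. The next section introduces the data.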
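The properties of the Schmidt-Strauss model described in this section can be checked numerically. The sketch below is illustrative only: the function names and the index values 0.3 and -0.5 are hypothetical, and `index1` and `index2` stand in for the linear indices of the two equations. It builds the joint distribution, confirms that the log odds ratio equals ρ and that the implied conditional probability is a logit, and evaluates the thought experiment with both marginal probabilities equal to 0.5. In that symmetric case, which corresponds to setting both indices to -ρ/2, a short calculation gives a correlation of tanh(ρ/4).

```python
import math


def logistic(v):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-v))


def joint_probs(index1, index2, rho):
    """Joint distribution of (y1, y2) in a Schmidt-Strauss model.

    index1 and index2 are the linear indices of the two equations.
    Returns a dict keyed by (c1, c2) with c1, c2 in {0, 1}.
    """
    weights = {
        (c1, c2): math.exp(c1 * index1 + c2 * index2 + c1 * c2 * rho)
        for c1 in (0, 1)
        for c2 in (0, 1)
    }
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def log_odds_ratio(p):
    """Log odds ratio of a 2x2 joint distribution."""
    return math.log(p[(1, 1)] * p[(0, 0)] / (p[(1, 0)] * p[(0, 1)]))


def correlation(p):
    """Correlation coefficient between y1 and y2."""
    m1 = p[(1, 0)] + p[(1, 1)]  # P(y1 = 1)
    m2 = p[(0, 1)] + p[(1, 1)]  # P(y2 = 1)
    cov = p[(1, 1)] - m1 * m2
    return cov / math.sqrt(m1 * (1 - m1) * m2 * (1 - m2))


rho = 1.2
p = joint_probs(0.3, -0.5, rho)  # hypothetical index values

# P(y1 = 1 | y2 = 1) implied by the joint distribution
cond_y1_given_y2 = p[(1, 1)] / (p[(0, 1)] + p[(1, 1)])

# Thought experiment: both marginals equal 0.5 when both indices are -rho/2
balanced = joint_probs(-rho / 2, -rho / 2, rho)
```

Here `log_odds_ratio(p)` recovers `rho`, `cond_y1_given_y2` equals `logistic(0.3 + rho)`, and `correlation(balanced)` equals `math.tanh(rho / 4)` up to floating point error.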

Data
For the analysis in this paper, we use the Current Population Survey (CPS) Basic Monthly micro data from the 40 years between January of 1982 and December of 2021. The data are sourced from https://www.ipums.org/ (Flood et al. 2021). The monthly CPS has a panel design. Households are interviewed for four consecutive months, then not interviewed for eight months, and finally interviewed for four more consecutive months. We identify households with one head of household and one married or unmarried partner (of the head). The data consist of these heads and partners provided that they are of different sex and are both between the ages of 25 and 65 (inclusive). Below, we sometimes refer to the partners as husbands and wives or as spouses although they are not always legally married. Since our ultimate goal is to investigate the dynamics of employment status, and since a number of observations are missing in the last four interview months, we restrict the sample to the first four interview months, and we only use households that are in the sample in all four of those months.
We define four race/ethnicity groups: White, Black, Hispanic, and Other. Below we interchangeably refer to these groups as "race," "ethnicity" or "race/ethnicity". The couples are then grouped into five groups based on the race/ethnicity of the two partners: White, Black, Hispanic, Other, and Mixed Race. For example, "White" will refer to a couple where both spouses are White, and "Mixed" will refer to a couple where the wife and husband have different ethnicities. We refer to these groups as the "ethnicity mix" (or sometimes just the "ethnicity") of the couple. Table 1 presents summary statistics for the variables used in this paper. The first is a dummy variable for working, defined as the employment status being "At work". The remaining variables are age in years, a dummy variable for the presence of children under the age of 5, a dummy variable for any children, and dummy variables for three education levels: high school or less, some college, and college degree or more. Note that we report the number of individuals. Since this is a balanced panel with four time periods, the number of observations is larger than the number of individuals by a factor of four.

Summary statistics
We start by presenting summary statistics for the joint probability of working by ethnicity. The first panel of Table 2 is for the whole sample, while the next two panels are for the subsamples of couples without children and with children. Our main takeaway from this table is that there is a large difference in these probabilities across the ethnicities, with Hispanic-Hispanic couples looking quite different from the others.
Table 2 aggregates the data for all years. In Fig. 2 we plot the joint probability of working over time for each ethnicity. These are depicted in the four leftmost plots. The two plots to the right are the marginal probabilities of working for the husbands and wives. Again, the main takeaway is that there are interesting differences across ethnicities, with Hispanics and, to a lesser extent, Blacks standing out. In terms of the evolution of the probabilities over time, the most distinct feature is the increase in the employment of women in the first part of the sample. This is seen in the marginal probabilities as well as the joint probabilities. It is also interesting that the 2008 recession had a large impact on the employment of men, but almost no effect for the women.
The left panel of Fig. 3 displays the correlation between the spouses' employment over time. The reported correlation is a five year centered moving average. The correlation is always positive for all of the ethnicities. For Blacks and Whites, it remained more or less stable over time, while it decreased dramatically for the other groups, especially for Hispanics and for Others. It is difficult to compare correlations of different pairs of binary variables when the marginal probabilities differ across the pairs. In the right panel of Fig. 3, we therefore present the five year centered moving average of the estimate of the parameter ρ in a Schmidt-Strauss model with no explanatory variables. Here, ρ is calculated by the sample analog of Eq. (3). The estimated trend for ρ is similar to that for the correlation, although ρ shows a larger difference between Whites and Blacks.
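As a rough illustration of the two calculations described here, the sketch below computes a sample analog of the log odds ratio from the four cell counts of a 2x2 employment table, and then smooths a series of yearly estimates with a five year centered moving average. The function names and the counts in the tests are hypothetical, and the treatment of the first and last two years (left undefined) is an assumption, since the text does not say how the endpoints are handled.

```python
import math


def rho_hat(n00, n01, n10, n11):
    """Sample analog of the log odds ratio in a 2x2 table.

    n_jk is the number of couples with wife's employment j and
    husband's employment k (hypothetical labelling convention).
    """
    if min(n00, n01, n10, n11) <= 0:
        raise ValueError("all four cells must be non-empty")
    return math.log((n11 * n00) / (n10 * n01))


def centered_moving_average(values, window=5):
    """Centered moving average; entries without a full window are None."""
    half = window // 2
    smoothed = []
    for k in range(len(values)):
        lo, hi = k - half, k + half + 1
        if lo < 0 or hi > len(values):
            smoothed.append(None)  # assumption: endpoints left undefined
        else:
            smoothed.append(sum(values[lo:hi]) / window)
    return smoothed
```

With no explanatory variables, replacing the four probabilities in the log odds ratio by sample frequencies gives the same value as using the raw counts, because the sample size cancels.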

Static cross-sectional Schmidt-Strauss models
It is clear from the evidence in Sect. 4.1 that there is a strong relationship between employment of husbands and of wives. In this section, we document that this persists after controlling for a set of observable characteristics. Specifically, in the first four columns of Table 3, we present the results from estimating separate single-equation logit models for employment for husbands and for wives as well as the results from maximum likelihood estimation of the Schmidt-Strauss model in Eq. (2). The explanatory variables are dummy variables for the presence of children younger than 5, for any children, for the person's own ethnicity, for the education categories "some college" and "college and above," and dummy variables for the ethnicity of the couple. The estimation also controls for year dummies, the age and the age-squared of both the husband and the wife, as well as the interaction of the ages. The last four columns present the results from estimating the same models after also including the ethnicity and the education variables of the spouse as explanatory variables.
Table 3 note: The dependent variable is working and the parameters are estimated by maximum likelihood. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021.
The results in Table 3 clearly suggest that there is a positive association between the employment of husbands and wives after controlling for observed characteristics. In order to investigate whether this association varies systematically across ethnicities, we re-estimate the model in the last two columns of Table 3 separately for each ethnicity. In Table 4, we report the estimated ρ's. The most striking finding is that the estimated ρ for Whites is much larger than for other ethnicities, while the estimate for Hispanics is the lowest. This is also reflected in counterfactual marginal effects. Specifically, for each ethnicity, we calculate the average probabilities implied by the model that a wife works conditional on whether her husband works or not. The difference in these average probabilities is 18 percentage points for Whites, 8 for Hispanics, and between 11 and 14 for each of the other three groups. The corresponding counterfactual marginal effects for husbands are 10 percentage points for Whites, 4 for Hispanics, and between 6 and 9 percentage points for the other groups. This ordering is consistent with that found in Fig. 3.
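One plausible reading of the counterfactual marginal effect described above is the sample average of the difference between the model-implied probability that a person works when the partner works and when the partner does not. The sketch below illustrates that reading only; it is not the authors' code. The function names are hypothetical, and the list `wife_indices` holds made-up values standing in for each wife's estimated linear index.

```python
import math


def logistic(v):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-v))


def average_partner_effect(own_indices, rho):
    """Average of P(works | partner works) - P(works | partner does not).

    own_indices holds each person's estimated linear index,
    excluding the term that involves the partner's outcome.
    """
    diffs = [logistic(v + rho) - logistic(v) for v in own_indices]
    return sum(diffs) / len(diffs)


# Hypothetical indices for five wives and a hypothetical dependence parameter
wife_indices = [-0.4, 0.1, 0.8, 1.5, -1.2]
effect = average_partner_effect(wife_indices, rho=0.9)
```

Because the logistic function is nonlinear, the same ρ implies different effects in percentage points for groups with different index distributions, which is one reason the effects for wives and husbands reported in the text need not coincide.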
Table 4 note: The dependent variable is working and the parameters are estimated by maximum likelihood using the same specification as in Table 3. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021. Standard errors are clustered at the household level
Figure 3 above suggested a dramatic fall in the association between the employment of wives and husbands for households where both the wife and the husband are Hispanic, and for households where each spouse is of "other ethnicity". To investigate whether this holds after controlling for observable covariates, we estimate the model in the last two columns of Table 3 for each ethnicity and for rolling 5-year time-spans. The estimated ρ coefficients are presented in Fig. 4. Qualitatively, the pattern in Fig. 4 is similar to that in Fig. 3: The association between the employment of wives and husbands has been falling for Hispanics and for Others, while it has been relatively stable for White, Black and Mixed couples.

Dynamic panel data Schmidt-Strauss models
In the Schmidt-Strauss models estimated in Table 3, the only avenue for interdependence between the employment of wives and husbands (conditional on the observed characteristics) is through the parameter ρ. If the employment of a partner actually also depends on the lagged employment of both partners, then this will be captured by the estimate of ρ.
Fig. 4 Evolution of ρ over Time by Household Ethnicity. The dependent variable is working and the parameters are estimated by maximum likelihood using the same specification as in Table 3. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021 and the estimation is done over five year centered rolling windows

Fig. 5 Evolution of γ's over Time by Household Ethnicity. The dependent variable is working and the parameters are estimated by maximum likelihood using the same specification as in Table 5. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021 and the estimation is done over five year centered rolling windows
In order to investigate the role of dynamics, we first estimate the Schmidt-Strauss model in the last two columns of Table 3 after including an individual's own as well as the partner's lagged employment as explanatory variables. Specifically, we estimate the model
$$P\left(y_{1,it} = 1 \mid y_{2,it}, y_{1,it-1}, y_{2,it-1}, x_{1,it}, x_{2,it}\right) = \Lambda\left(x_{1,it}'\beta_1 + \gamma_{11} y_{1,it-1} + \gamma_{12} y_{2,it-1} + \rho\, y_{2,it}\right),$$
$$P\left(y_{2,it} = 1 \mid y_{1,it}, y_{1,it-1}, y_{2,it-1}, x_{1,it}, x_{2,it}\right) = \Lambda\left(x_{2,it}'\beta_2 + \gamma_{21} y_{1,it-1} + \gamma_{22} y_{2,it-1} + \rho\, y_{1,it}\right). \tag{4}$$
The results are presented in Table 5. Since the lagged values of the dependent variable are not observed in the first time period, we do the estimation using waves two through four of our dataset. The results in Table 5 suggest that each partner's employment depends strongly and positively on her or his own lagged employment, and that it depends negatively on the partner's lagged employment (after controlling for the observed covariates). In combination, these will introduce a negative correlation in the contemporaneous employment status, which, in turn, would lead to a downward bias in the estimate of ρ when these dynamic interactions are not controlled for in the model. This is reflected in the higher estimate of ρ in the model that allows for lagged employment of both partners as explanatory variables as in Eq. (4).
Table 5 note: The dependent variable is working and the parameters are estimated by maximum likelihood. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021. Coefficients on year dummies, husband's and wife's age, their interaction and their squares are not reported. Standard errors are clustered at the household level
Since controlling for the lagged employment status of both partners dramatically changes the estimate of ρ when we use the full sample, we next investigate whether the same is true across ethnicities. Specifically, we estimate the same specification as in Table 5 separately for each ethnicity group. Table 6 reports the estimated coefficients on the lagged employment variables as well as the estimated ρ. In this specification, Hispanics and Blacks are quite similar to each other in terms of the contemporaneous interdependence between the employment status of the two partners (measured by ρ) as well as in terms of the dynamic interdependence (measured by the γ's).
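To make the dynamic specification concrete, the following sketch evaluates one household's log-likelihood contribution under the assumption that Eq. (4) has the Schmidt-Strauss form with each partner's own and the other partner's lagged employment added to the two linear indices. The function names, the argument layout and the numbers in the tests are hypothetical. The first wave only supplies the lags, matching the use of waves two through four described in the text.

```python
import math


def outcome_log_prob(c1, c2, index1, index2, rho):
    """Log of P(y1 = c1, y2 = c2) for given linear indices and rho."""
    log_den = math.log(
        1.0
        + math.exp(index1)
        + math.exp(index2)
        + math.exp(index1 + index2 + rho)
    )
    return c1 * index1 + c2 * index2 + c1 * c2 * rho - log_den


def household_log_likelihood(y1, y2, xb1, xb2, gamma, rho):
    """Log-likelihood contribution of one household.

    y1, y2: employment indicators of the two partners, one entry per wave.
    xb1, xb2: covariate parts of the two indices, one entry per wave.
    gamma: tuple (g11, g12, g21, g22) of coefficients on lagged employment.
    The first wave only provides lagged values.
    """
    g11, g12, g21, g22 = gamma
    total = 0.0
    for t in range(1, len(y1)):
        index1 = xb1[t] + g11 * y1[t - 1] + g12 * y2[t - 1]
        index2 = xb2[t] + g21 * y1[t - 1] + g22 * y2[t - 1]
        total += outcome_log_prob(y1[t], y2[t], index1, index2, rho)
    return total
```

Summing `household_log_likelihood` over households and maximizing over the parameters would give a pooled maximum likelihood estimator of this kind. Clustering of standard errors at the household level is a separate step that is not shown here.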
Table 6 note: The dependent variable is working and the parameters are estimated by maximum likelihood using the same specification as in Table 5. The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021. Standard errors are clustered at the household level
The evolution of the estimates of the parameters that govern the dynamics and the interdependence is shown in Figs. 5 and 6. Specifically, we estimate the Schmidt-Strauss model in Table 5 for each ethnicity over rolling 5-year time-spans and plot the estimates of the γ's and of ρ against time. Comparing the patterns in Fig. 6 to the patterns in Fig. 4, we see that Black and Hispanic couples are more similar. This is consistent with the finding in Table 6. Interestingly, the estimated ρ's for Hispanics and for Others are now much more stable over time, while the ρ for Whites is now trending up.
It is well-understood that it can be difficult to disentangle state dependence (the causal dependence of a variable at one point in time on its value in the previous period) from unobserved heterogeneity. Intuition suggests that it is also difficult to distinguish between the effect of ρ and the effect of unobserved heterogeneity that is correlated between the husband and wife in the same household. These issues raise the question of whether it is possible to semiparametrically identify ρ and the coefficients on the lagged dependent variables in a model that allows for fixed effects. In the next section, we therefore investigate whether it is possible to identify and estimate the parameters of a model that allows for fixed effects in the dynamic Schmidt-Strauss framework.

Dynamic panel data Schmidt-Strauss models with fixed effects
Honoré and Kyriazidou (2019) adapt the Schmidt-Strauss model discussed in Sect. 2 to a static panel data setting where each outcome can also depend on unit-specific fixed effects. Specifically, they assume that
$$P\left(y_{1,it} = 1 \mid y_{2,it}, \{y_{1,is}, y_{2,is}\}_{s<t}, \{x_{1,is}\}_{s=1}^{T}, \{x_{2,is}\}_{s=1}^{T}, \alpha_{1,i}, \alpha_{2,i}\right) = \Lambda\left(\alpha_{1,i} + x_{1,it}'\beta_1 + \rho\, y_{2,it}\right) \tag{5}$$
and
$$P\left(y_{2,it} = 1 \mid y_{1,it}, \{y_{1,is}, y_{2,is}\}_{s<t}, \{x_{1,is}\}_{s=1}^{T}, \{x_{2,is}\}_{s=1}^{T}, \alpha_{1,i}, \alpha_{2,i}\right) = \Lambda\left(\alpha_{2,i} + x_{2,it}'\beta_2 + \rho\, y_{1,it}\right). \tag{6}$$
In this model, $\alpha_{1,i}$ and $\alpha_{2,i}$ are the fixed effects, $x_{1,it}$ and $x_{2,it}$ are strictly exogenous explanatory variables, and $\rho$ is the cross-equation dependence parameter, which, as in Schmidt and Strauss (1975), needs to be the same in the two equations given the structure in Eqs. (5) and (6). Following Schmidt and Strauss (1975), it can be shown that
$$P\left(y_{1,it} = c_1, y_{2,it} = c_2 \mid \{y_{1,is}, y_{2,is}\}_{s<t}, \{x_{1,is}\}_{s=1}^{T}, \{x_{2,is}\}_{s=1}^{T}, \alpha_{1,i}, \alpha_{2,i}\right) = \frac{\exp\left(c_1\left(\alpha_{1,i} + x_{1,it}'\beta_1\right) + c_2\left(\alpha_{2,i} + x_{2,it}'\beta_2\right) + c_1 c_2 \rho\right)}{1 + \exp\left(\alpha_{1,i} + x_{1,it}'\beta_1\right) + \exp\left(\alpha_{2,i} + x_{2,it}'\beta_2\right) + \exp\left(\alpha_{1,i} + x_{1,it}'\beta_1 + \alpha_{2,i} + x_{2,it}'\beta_2 + \rho\right)}.$$
Honoré and Kyriazidou (2019) show that a conditional likelihood argument can be used to identify and estimate $\beta_1$, $\beta_2$, and $\rho$ with as few as $T = 2$ time periods. Indeed, $\rho$ can be allowed to be time dependent in Eqs. (5) and (6). Honoré and Kyriazidou (2019) also consider a vector autoregressive simultaneous logit model:
$$P\left(y_{1,it} = 1 \mid y_{2,it}, \{y_{1,is}, y_{2,is}\}_{s<t}, \alpha_{1i}, \alpha_{2i}\right) = \Lambda\left(\alpha_{1i} + \gamma_{11} y_{1,it-1} + \gamma_{12} y_{2,it-1} + \rho\, y_{2,it}\right),$$
$$P\left(y_{2,it} = 1 \mid y_{1,it}, \{y_{1,is}, y_{2,is}\}_{s<t}, \alpha_{1i}, \alpha_{2i}\right) = \Lambda\left(\alpha_{2i} + \gamma_{21} y_{1,it-1} + \gamma_{22} y_{2,it-1} + \rho\, y_{1,it}\right). \tag{7}$$
This model is arguably the most relevant fixed effects specification for the application in this paper. For each individual, we only use data from four months, so with the exception of time-dummies, there is essentially no exogenous variability in the explanatory variables over time. Moreover, we use one time period to provide the initial conditions, and the effect of time variables is probably not important over a three month period.
Honoré and Kyriazidou (2019) show that $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22})$ is identified in the model given in Eq. (7) with a total of four time periods (including the one that delivers the initial condition). However, the conditioning argument that leads to the identification eliminates the parameter $\rho$ along with the fixed effects, $\alpha_{1i}$ and $\alpha_{2i}$. On the positive side, this implies that one can allow the parameter $\rho$ in Eq. (7) to be individual-specific. On the other hand, $\rho$ may be the parameter of interest in many applications, including the one considered here. This makes it problematic that the conditioning argument eliminates it along with $\alpha_{1i}$ and $\alpha_{2i}$.
In the next subsection, we first generalize the results in Honoré and Kyriazidou (2019) to show that using a conditional likelihood approach to eliminate $\alpha_{1i}$ and $\alpha_{2i}$ in Eq. (7) will also eliminate $\rho$ for all values of $T$. The conditional likelihood approach is then illustrated empirically by obtaining estimates of the γ's in Eq. (7) in the context of husbands' and wives' employment. Since the simultaneity parameter, $\rho$, is not generally identified from a conditional likelihood approach, we next consider a restricted version of the model, in which the two individual fixed effects are the same, except for an additive constant which is the same across all pairs. In our application, we interpret this as a model with household specific fixed effects. This model is also illustrated empirically.

Conditional likelihood for dynamic Schmidt-Strauss model with fixed effects
The traditional approach to estimating nonlinear fixed effects models is to find a sufficient statistic for the fixed effects, and then to construct a conditional likelihood function conditioning on the sufficient statistic. By construction, this conditional likelihood function will not depend on the fixed effects and it may or may not depend on some or all of the parameters of interest. In this subsection, we consider the conditional likelihood approach for the model in Eq. (7). This extends the analysis in Honoré and Kyriazidou (2019).
We consider a situation in which a pair of outcomes $(y_{1,t}, y_{2,t})$ from Eq. (7) are observed for $T$ periods. We also assume that the initial condition, $(y_{1,0}, y_{2,0})$, is observed. We denote the probability distribution of $(y_{1,0}, y_{2,0})$ by $p(y_{1,0}, y_{2,0}, \alpha_1, \alpha_2)$, and we do not assume that it is necessarily generated by the same model. For notational simplicity, we let $z_{1,t} = \gamma_{11} y_{1,t} + \gamma_{12} y_{2,t}$ and $z_{2,t} = \gamma_{21} y_{1,t} + \gamma_{22} y_{2,t}$.
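For reference, the probability of an observed sequence conditional on the initial condition can be written out explicitly. The display below is a reconstruction from the Schmidt-Strauss joint distribution and the definitions of $z_{1,t}$ and $z_{2,t}$ just given, not a quotation from the source, so the exact form should be read as an assumption. Each period contributes one factor whose numerator depends on the current outcomes and whose denominator depends only on the previous period's outcomes.

```latex
P\left(\{y_{1,t}, y_{2,t}\}_{t=1}^{T} \,\middle|\, y_{1,0}, y_{2,0}, \alpha_1, \alpha_2\right)
= \prod_{t=1}^{T}
\frac{\exp\left(y_{1,t}\left(z_{1,t-1} + \alpha_1\right)
              + y_{2,t}\left(z_{2,t-1} + \alpha_2\right)
              + y_{1,t}\, y_{2,t}\, \rho\right)}
     {1 + \exp\left(z_{1,t-1} + \alpha_1\right)
        + \exp\left(z_{2,t-1} + \alpha_2\right)
        + \exp\left(z_{1,t-1} + \alpha_1 + z_{2,t-1} + \alpha_2 + \rho\right)}
```

Multiplying this expression by the distribution of the initial condition gives the joint probability of the full sequence, and that extra factor is common to any two sequences with the same initial condition.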
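The key question is whether the α's cancel in the ratio of the probabilities of two different sequences with the same initial conditions. In the numerator, the α's cancel if two sequences have the same $\sum_{t=1}^{T} y_{1,t}$ and the same $\sum_{t=1}^{T} y_{2,t}$. In the denominator, each combination of $(y_{1,t}, y_{2,t})$ must appear equally often. The latter is the same as saying that $\sum_{t=1}^{T-1} y_{1,t}$, $\sum_{t=1}^{T-1} y_{2,t}$, and $\sum_{t=1}^{T-1} y_{1,t} y_{2,t}$ must be the same. This suggests the sufficient statistic
$$\left(y_{1,0},\ y_{2,0},\ \sum_{t=1}^{T-1} y_{1,t},\ \sum_{t=1}^{T-1} y_{2,t},\ \sum_{t=1}^{T-1} y_{1,t} y_{2,t},\ y_{1,T},\ y_{2,T}\right)$$
and the conditional likelihood function (for a given observation with fixed effects $\alpha_1$ and $\alpha_2$) is therefore
$$\frac{\prod_{t=1}^{T} \exp\left(y_{1,t}\left(\gamma_{11} y_{1,t-1} + \gamma_{12} y_{2,t-1}\right)\right) \exp\left(y_{2,t}\left(\gamma_{21} y_{1,t-1} + \gamma_{22} y_{2,t-1}\right)\right)}{\sum_{\{c_t, d_t\} \in B} \prod_{t=1}^{T} \exp\left(c_t\left(\gamma_{11} c_{t-1} + \gamma_{12} d_{t-1}\right)\right) \exp\left(d_t\left(\gamma_{21} c_{t-1} + \gamma_{22} d_{t-1}\right)\right)} \tag{8}$$
where $B$ is the set of all sequences, $\{c_t, d_t\}_{t=0}^{T}$, such that
$$c_0 = y_{1,0},\quad d_0 = y_{2,0},\quad \sum_{t=1}^{T-1} c_t = \sum_{t=1}^{T-1} y_{1,t},\quad \sum_{t=1}^{T-1} d_t = \sum_{t=1}^{T-1} y_{2,t},\quad \sum_{t=1}^{T-1} c_t d_t = \sum_{t=1}^{T-1} y_{1,t} y_{2,t},\quad c_T = y_{1,T},\quad d_T = y_{2,T}.$$
Note that not only does α drop out of the conditional likelihood, but so does ρ. In other words, a conditional likelihood approach does not identify ρ for any $T$. Also note that the conditional likelihood is constant if $T < 3$, so at least three periods are needed in addition to the one providing the initial conditions. We finally note that the argument above is unchanged if one replaces $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, $\gamma_{22}$, and $\rho$ with functions of exogenous covariates as long as the functions do not change over time. For example, in the application some of these parameters could be functions of the level of education or of the presence of children.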
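The cancellation argument in this subsection can be checked by brute force for small $T$. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the vector autoregressive simultaneous logit form for each period, uses hypothetical function names and parameter values, and represents a sequence as a tuple of (y1, y2) pairs for periods 0 through T. It computes the probability of a sequence conditional on the sufficient statistic by enumerating every sequence, so the fixed effects and ρ enter the calculation explicitly and can be seen to have no effect on the result.

```python
import itertools
import math

OUTCOMES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def sequence_prob(seq, gamma, rho, alpha1, alpha2):
    """P(y_1, ..., y_T | y_0, alpha1, alpha2) for seq = (y_0, ..., y_T)."""
    g11, g12, g21, g22 = gamma
    prob = 1.0
    for (lag1, lag2), (c1, c2) in zip(seq, seq[1:]):
        z1 = g11 * lag1 + g12 * lag2 + alpha1
        z2 = g21 * lag1 + g22 * lag2 + alpha2
        den = 1.0 + math.exp(z1) + math.exp(z2) + math.exp(z1 + z2 + rho)
        prob *= math.exp(c1 * z1 + c2 * z2 + c1 * c2 * rho) / den
    return prob


def sufficient_statistic(seq):
    """Initial condition, sums over periods 1..T-1, and final outcome."""
    inner = seq[1:-1]
    return (
        seq[0],
        sum(a for a, _ in inner),
        sum(b for _, b in inner),
        sum(a * b for a, b in inner),
        seq[-1],
    )


def conditional_prob(seq, gamma, rho, alpha1, alpha2):
    """Probability of seq given its sufficient statistic, by enumeration."""
    stat = sufficient_statistic(seq)
    n_periods = len(seq) - 1
    total = 0.0
    for tail in itertools.product(OUTCOMES, repeat=n_periods):
        candidate = (seq[0],) + tail
        if sufficient_statistic(candidate) == stat:
            total += sequence_prob(candidate, gamma, rho, alpha1, alpha2)
    return sequence_prob(seq, gamma, rho, alpha1, alpha2) / total


gamma = (1.0, -0.3, -0.2, 0.8)  # hypothetical values
seq = ((1, 0), (1, 0), (0, 1), (1, 1))  # T = 3 plus the initial condition
p_a = conditional_prob(seq, gamma, rho=0.5, alpha1=-1.0, alpha2=2.0)
p_b = conditional_prob(seq, gamma, rho=-2.0, alpha1=3.0, alpha2=-0.7)
```

The two values `p_a` and `p_b` agree up to floating point error even though ρ and the fixed effects differ, while changing `gamma` changes the result. With only two periods after the initial condition, the comparison set contains the observed sequence alone and the conditional probability is one, in line with the remark that the conditional likelihood is constant if $T < 3$.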

Empirical illustration
In Table 7, we present the results from estimating $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, and $\gamma_{22}$ using the conditional likelihood approach discussed above for the full sample as well as by ethnicity. As one might expect, these parameters are much lower in the fixed effects specification than those reported in Table 6, where we do not allow for unobserved heterogeneity. Figure 7 shows the results of estimating the model on rolling 5-year sub-samples for each ethnicity. The estimates are fairly stable over time, and not very different across ethnicities. Overall, there is strong evidence that, after controlling for fixed effects, an individual's own lagged employment has a positive effect. The effect of the spouse's lagged employment tends to be negative and smaller in magnitude.
other. Chountas and Kyriazidou (2021) pursue such a strategy for the conditional likelihood in a multinomial multivariate model with discrete explanatory variables. In the case of continuous explanatory variables, one may use the kernel weight approach introduced in Honoré and Kyriazidou (2000), although this would lead to an estimator that converges more slowly than the usual $\sqrt{n}$ rate. As a comparison, Chountas and Kyriazidou (2021)

Conditional likelihood for dynamic Schmidt-Strauss model with restricted fixed effects
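In this subsection, we investigate whether additional identification can be obtained by assuming that $\alpha_1 = \alpha$ and $\alpha_2 = \alpha + \kappa$ for some constant $\kappa$, which does not vary across units. Our motivation is to see whether this will allow for identification of $\rho$.
As above, the key question is whether the unit-specific α's cancel in the ratio of the probabilities of two different sequences with the same initial conditions. In the numerator, the α's cancel if the two sequences have the same $\sum_{t=1}^{T} y_{1,t} + \sum_{t=1}^{T} y_{2,t}$. In the denominator, each combination of $(y_{1,t}, y_{2,t})$ must appear equally often. The latter is the same as saying that $\sum_{t=1}^{T-1} y_{1,t}$, $\sum_{t=1}^{T-1} y_{2,t}$, and $\sum_{t=1}^{T-1} y_{1,t} y_{2,t}$ must be the same. This suggests the sufficient statistic
$$\left(y_{1,0},\ y_{2,0},\ \sum_{t=1}^{T-1} y_{1,t},\ \sum_{t=1}^{T-1} y_{2,t},\ \sum_{t=1}^{T-1} y_{1,t} y_{2,t},\ y_{1,T} + y_{2,T}\right).$$
The difference from the case where the α's are unrestricted is that we do not need to condition on $y_{1,T}$ and $y_{2,T}$, but only on the sum. The implication is that a conditional likelihood approach will lead to more sequences being compared to each other.
The conditional likelihood function (for a given individual) is
$$\frac{\prod_{t=1}^{T} \exp\left(y_{1,t}\left(\gamma_{11} y_{1,t-1} + \gamma_{12} y_{2,t-1}\right)\right) \exp\left(y_{2,t}\left(\gamma_{21} y_{1,t-1} + \gamma_{22} y_{2,t-1} + \kappa\right)\right)}{\sum_{\{c_t, d_t\} \in B} \prod_{t=1}^{T} \exp\left(c_t\left(\gamma_{11} c_{t-1} + \gamma_{12} d_{t-1}\right)\right) \exp\left(d_t\left(\gamma_{21} c_{t-1} + \gamma_{22} d_{t-1} + \kappa\right)\right)} \tag{9}$$
where $B$ is the set of all sequences, $\{c_t, d_t\}_{t=0}^{T}$, such that
$$c_0 = y_{1,0},\quad d_0 = y_{2,0},\quad \sum_{t=1}^{T-1} c_t = \sum_{t=1}^{T-1} y_{1,t},\quad \sum_{t=1}^{T-1} d_t = \sum_{t=1}^{T-1} y_{2,t},\quad \sum_{t=1}^{T-1} c_t d_t = \sum_{t=1}^{T-1} y_{1,t} y_{2,t},\quad c_T + d_T = y_{1,T} + y_{2,T}.$$
Note that while $\alpha$ and $\rho$ drop out of this expression, $\kappa$ does not. Also note that this argument is unchanged if one replaces $\kappa$ with some function of predetermined covariates as long as the function does not change over time. The same is true for the parameters $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, and $\gamma_{22}$.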
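The effect of restricting the fixed effects can also be illustrated by enumeration. The sketch below is a hypothetical illustration with made-up parameter values, again representing a sequence as a tuple of (y1, y2) pairs for periods 0 through T. It compares the comparison set under the unrestricted statistic (which conditions on both final outcomes) with the one under the restricted statistic (which conditions only on their sum), and then checks which parameters affect the conditional probability in the restricted model.

```python
import itertools
import math

OUTCOMES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def sequence_prob(seq, gamma, rho, alpha, kappa):
    """P(y_1, ..., y_T | y_0) with fixed effects alpha and alpha + kappa."""
    g11, g12, g21, g22 = gamma
    prob = 1.0
    for (lag1, lag2), (c1, c2) in zip(seq, seq[1:]):
        z1 = g11 * lag1 + g12 * lag2 + alpha
        z2 = g21 * lag1 + g22 * lag2 + alpha + kappa
        den = 1.0 + math.exp(z1) + math.exp(z2) + math.exp(z1 + z2 + rho)
        prob *= math.exp(c1 * z1 + c2 * z2 + c1 * c2 * rho) / den
    return prob


def inner_sums(seq):
    """Sums of y1, y2 and y1*y2 over periods 1..T-1."""
    inner = seq[1:-1]
    return (
        sum(a for a, _ in inner),
        sum(b for _, b in inner),
        sum(a * b for a, b in inner),
    )


def unrestricted_statistic(seq):
    return (seq[0], inner_sums(seq), seq[-1])


def restricted_statistic(seq):
    return (seq[0], inner_sums(seq), seq[-1][0] + seq[-1][1])


def comparison_set(seq, statistic):
    """All sequences sharing the value of the statistic with seq."""
    target = statistic(seq)
    tails = itertools.product(OUTCOMES, repeat=len(seq) - 1)
    candidates = ((seq[0],) + tail for tail in tails)
    return [c for c in candidates if statistic(c) == target]


def conditional_prob(seq, gamma, rho, alpha, kappa):
    """Probability of seq given the restricted sufficient statistic."""
    group = comparison_set(seq, restricted_statistic)
    total = sum(sequence_prob(c, gamma, rho, alpha, kappa) for c in group)
    return sequence_prob(seq, gamma, rho, alpha, kappa) / total


gamma = (1.0, -0.3, -0.2, 0.8)  # hypothetical values
seq = ((1, 0), (1, 0), (0, 1), (1, 0))
n_unrestricted = len(comparison_set(seq, unrestricted_statistic))
n_restricted = len(comparison_set(seq, restricted_statistic))
```

For this example sequence the unrestricted statistic leaves two sequences to compare, while the restricted statistic leaves four, which illustrates why more sequences are compared to each other under the restriction. Evaluating `conditional_prob` at different values of `alpha` and `rho` gives the same number, whereas changing `kappa` changes it.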

Empirical illustration
In Table 8, we present the results from estimating $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, and $\gamma_{22}$ using the conditional likelihood approach discussed above for the full sample as well as by ethnicity. The fixed effects estimates are again lower than those reported in Table 6, which did not allow for unobserved heterogeneity, but they are larger than the ones reported in Table 7, which were obtained without restricting the fixed effects for the husbands and the wives. Since the conditional likelihood in Eq. (9) uses more observations than the one in Eq. (8), we would expect the estimated standard errors to be smaller in Table 8 than in Table 7. Figure 8 shows the results of estimating the model on rolling 5-year sub-samples for each ethnicity. The estimates are fairly stable over time, and not very different across ethnicities.
The dependent variable is working and the parameters are estimated by maximizing the conditional likelihood in Eq. (9). The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021.

Moment conditions for the dynamic Schmidt-Strauss model with fixed effects
In panel data models with fixed effects, it is sometimes possible to construct moment conditions that do not depend on the fixed effects. When that is the case, one can consider estimating the common parameters of the model by generalized method of moments. The dynamic linear panel data model is a simple example of this; see, for example, Hsiao (1981) or Holtz-Eakin et al. (1988). Applications of this idea to nonlinear models include Honoré (1992), Kyriazidou (2001), Hu (2002) and Kitazawa (2013).⁷ Bonhomme (2012) proposes a general approach for constructing such moment conditions and Honoré and Weidner (2022) develop a specific numeric strategy for determining whether such moment conditions can be constructed in particular models with discrete outcomes. In this section, we report the results from applying the approach in Honoré and Weidner (2022) to determine whether there are moments that can be used to identify and estimate $\rho$ in a Schmidt-Strauss model with lagged dependent variables and fixed effects. We consider two versions of the model
$$P\left( y_{1,t} = c_1,\, y_{2,t} = c_2 \,\middle|\, \{y_{1,s}, y_{2,s}\}_{s<t},\, \{x_{1,s}\}_{s=1}^{T},\, \{x_{2,s}\}_{s=1}^{T},\, \alpha_1,\, \alpha_2 \right) = \frac{\exp\left( c_1 \left( z_{1,t} + \alpha_1 \right) + c_2 \left( z_{2,t} + \alpha_2 \right) + c_1 c_2 \rho \right)}{1 + \exp\left( z_{1,t} + \alpha_1 \right) + \exp\left( z_{2,t} + \alpha_2 \right) + \exp\left( z_{1,t} + \alpha_1 + z_{2,t} + \alpha_2 + \rho \right)}$$
for $t = 1, 2, 3$ and $c_1, c_2 \in \{0, 1\}$, where $z_{1,t} = x_{1,t} \beta_1 + y_{1,t-1} \gamma_{11} + y_{2,t-1} \gamma_{12}$ and $z_{2,t} = x_{2,t} \beta_2 + y_{1,t-1} \gamma_{21} + y_{2,t-1} \gamma_{22}$. In one version, $\alpha_1$ and $\alpha_2$ are unrestricted as in Sect. 5.1, while the other version restricts them to be identical except for an additive constant as in Sect. 5.3. Note that these are the same models as in Sects. 5.1 and 5.3, except that we here allow for strictly exogenous covariates. Table 9 reports the number of moment conditions for each of the two versions of the model when one has 3, 4 or 5 time periods of observations in addition to the one that provides the initial conditions.
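To make the role of $\rho$ in the period-$t$ probabilities concrete, the following is a minimal sketch that evaluates the four joint probabilities for given index values. The function names and the numerical inputs are hypothetical; `z1` and `z2` stand for $z_{1,t}$ and $z_{2,t}$. It follows directly from the formula above that the log odds ratio $\log\left[ P_{11} P_{00} / (P_{10} P_{01}) \right]$ equals $\rho$ whatever the values of the indices and the fixed effects.

```python
import math

def joint_probs(z1, z2, alpha1, alpha2, rho):
    """Return {(c1, c2): P(y1=c1, y2=c2)} for one period of the model."""
    a, b = z1 + alpha1, z2 + alpha2
    denom = 1 + math.exp(a) + math.exp(b) + math.exp(a + b + rho)
    return {(c1, c2): math.exp(c1 * a + c2 * b + c1 * c2 * rho) / denom
            for c1 in (0, 1) for c2 in (0, 1)}

def log_odds_ratio(p):
    """Log odds ratio of the 2x2 table of joint probabilities."""
    return math.log(p[1, 1] * p[0, 0] / (p[1, 0] * p[0, 1]))

# Illustrative values only: the four probabilities sum to one and the
# log odds ratio recovers rho irrespective of z1, z2, alpha1 and alpha2.
p = joint_probs(z1=0.3, z2=-0.7, alpha1=1.2, alpha2=-0.4, rho=0.8)
print(sum(p.values()), log_odds_ratio(p))
```

Setting `alpha2 = alpha1 + kappa` in the call gives the restricted version of the model.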
The data used in this paper have a total of four consecutive time periods, and the results for $T = 3$ are therefore the relevant ones here. In the empirical illustration in Sects. 5.2 and 5.4, we have no strictly exogenous time-varying explanatory variables, so according to the calculation reported in Table 9, there will be no moment conditions that depend on $\rho$ when the fixed effects are left unrestricted. On the other hand, there will be six such moment conditions for each initial condition when the fixed effects are restricted. With more than three time periods (in addition to the one providing the initial conditions), the results suggest that there are moment conditions that depend on $\rho$ even when the fixed effects are unrestricted. While introducing explanatory variables changes the number of moment conditions, it does not change the answer to the question of whether there exist moment conditions that depend on $\rho$ for a given value of $T$.
Results from the numerical counting of moment conditions for the dynamic simultaneous logit are reported. Four different model specifications are considered: additional exogenous regressors are present ($x_{k,t} \neq 0$) or not ($x_{k,t} = 0$), and the fixed effects $(\alpha_1, \alpha_2)$ are unrestricted or restricted ($\alpha_2 = \alpha_1 + \kappa$). For each of those four specifications and each value of $T$ we report $n_{\text{tot}} / n_{\text{para}} / n_{\rho}$, where $n_{\text{tot}}$ is the total number of moment conditions available, $n_{\text{para}}$ is the number of moment conditions available that depend on any of the common parameters $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}, \beta_1, \beta_2, \rho, \kappa)$, and $n_{\rho}$ is the number of moment conditions available that depend on the parameter $\rho$. All results are for one fixed value of the initial condition $(y_{1,0}, y_{2,0})$, but the number of moment conditions is independent of the initial condition. Notice that for $T = 3$ and unrestricted $(\alpha_1, \alpha_2)$ we have $n_{\rho} = 0$, and in general we believe that the parameter $\rho$ is not identified in that case. However, for either $T > 3$ or restricted $\alpha_2 = \alpha_1 + \kappa$ we find that $n_{\rho} > 0$ and the parameter $\rho$ can be identified and estimated from those moment conditions.

Moment conditions For
It is not always easy to derive analytical expressions for the moment conditions. For the empirical application in Sects. 5.2 and 5.4 of this paper, $T$ is three and there are no strictly exogenous time-varying explanatory variables. In order to make statements about $\rho$, we therefore have to limit attention to the model in which the fixed effect is household specific in the sense that $\alpha_2 = \alpha_1 + \kappa$.
As mentioned above, there will be a total of 45 moment conditions in this case. One can write these as six that depend on $\rho$, 36 that depend on some of the common parameters in the model but not on $\rho$, and three that do not depend on any of the parameters in the model. In principle, one may need to use all of these moments to construct an efficient GMM estimator. On the other hand, we can already identify the $\gamma$'s and $\kappa$ from the conditional likelihood approach in Sect. 5.3, so we only need to use one moment⁸ that depends on $\rho$ in order to (inefficiently) estimate $\rho$. We therefore focus on finding the six linearly independent moment conditions that depend on $\rho$. Unfortunately, these will not be unique. For example, adding a linear combination of moment conditions that do not depend on $\rho$ to one of the six that do will leave us with six linearly independent moment conditions that depend on $\rho$. This also means that some of the moment conditions can be extremely complicated.
Fortunately, it turns out that for the model considered here, one can find six linearly independent moment conditions (for each initial condition) which all depend on $\rho$, and where each only depends on five of the 64 possible sequences. They are given in the Appendix, and we use those to estimate $\rho$ in the next subsection. These moment conditions are linear in $\exp(\rho)$.

Empirical illustration
In this subsection, we illustrate how the method of moments approach discussed above can be used to estimate $\rho$ in the dynamic Schmidt-Strauss model with restricted fixed effects. We proceed in two steps. We first estimate the $\gamma$'s and $\kappa$ using the conditional likelihood approach. We then fix the $\gamma$'s and $\kappa$ at those estimates and estimate $\rho$ by generalized method of moments using the moment conditions in the Appendix. As weighting matrix, we use the inverse of a diagonal matrix that has the variances of the moments evaluated at $\rho = 0$ on the diagonal. This choice is arbitrary and may lead to statistical inefficiency, but $\rho = 0$ is a natural benchmark, and the hope is that using a diagonal matrix will alleviate small sample issues resulting from estimation of an efficient weighting matrix.⁹ Since the moment conditions are linear in $\exp(\rho)$, the GMM objective function will be quadratic in $\exp(\rho)$. This implies that it is numerically well behaved and that $\rho$ is actually identified from it. On the other hand, the solution for $\exp(\rho)$ can sometimes be negative in finite samples. For the estimation below, we search over values of $\rho$ between $-2$ and $4$.
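The second step can be sketched as follows. This is a minimal illustration, not the code used for the estimates reported here. It assumes that the sample average of moment $j$ has already been decomposed as $a_j + b_j P$ with $P = \exp(\rho)$, which is possible because the moments are linear in $\exp(\rho)$; the names `a`, `b`, `weights` and `gmm_rho` are hypothetical, and `weights` holds the diagonal of the weighting matrix (the inverse variances of the moments at $\rho = 0$).

```python
import numpy as np

def gmm_rho(a, b, weights, rho_grid=None):
    """Minimise sum_j w_j * (a_j + b_j * P)^2 over P = exp(rho).

    Returns the unconstrained minimiser in P (which may be negative)
    and the minimiser in rho over a grid.
    """
    a, b, w = (np.asarray(v, dtype=float) for v in (a, b, weights))
    # Closed-form minimiser of the quadratic objective in P.
    p_closed = -np.sum(w * a * b) / np.sum(w * b * b)
    if rho_grid is None:
        # Search range for rho taken from the text: between -2 and 4.
        rho_grid = np.linspace(-2.0, 4.0, 6001)
    p_grid = np.exp(rho_grid)
    resid = a[None, :] + p_grid[:, None] * b[None, :]
    objective = np.sum(w[None, :] * resid ** 2, axis=1)
    rho_hat = float(rho_grid[np.argmin(objective)])
    return float(p_closed), rho_hat
```

When the closed-form solution for $P$ is positive and inside the search range, the grid minimiser is simply the grid point closest to $\log P$. When it is negative, the objective is increasing in $P$ on the positive half-line, so the grid search returns the lower end of the range, which is one way to see why a bounded search over $\rho$ is used.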
The results of the estimation of $\rho$ are presented in Table 10. Compared to the estimates of $\rho$ presented in Table 6, the fixed effects estimates are much smaller. This suggests that the household specific fixed effect captures much more of the intra-household correlation than the observed characteristics.
The dependent variable is working. The parameter $\rho$ is estimated by generalized method of moments using the moment conditions in the Appendix, and the $\gamma$'s and $\kappa$ [...]
Figure 9 presents the results of estimating $\rho$ separately for each ethnicity over rolling 5-year periods. The estimates for Whites seem fairly stable over time and are statistically significantly different from 0 in all time periods.¹⁰ When testing at a 5% level of significance, the estimates for the other ethnicities are statistically significantly different from 0 in only six of 144 cases (four for Blacks and two for Others).

Dynamic Schmidt-Strauss models with correlated random effects
The calculations reported above establish that $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}, \kappa, \rho)$ in the model in Sect. 5.3 are semiparametrically identified without assumptions on $\alpha$. In such cases, Wooldridge (2005) has proposed estimating $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}, \kappa, \rho)$ by maximum likelihood conditional on the initial observations, $(y_{1,0}, y_{2,0})$, after modeling the distribution of $\alpha$ conditional on those initial observations. This approach is in the spirit of Mundlak (1978) and Chamberlain (1982) and is known as a correlated random effects approach. See also Wooldridge (2019). If the conditional distribution of $\alpha$ given the initial conditions is sufficiently flexible, then one might interpret this approach as a semiparametric sieve maximum likelihood estimator. Table 11 shows the estimates of $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}, \kappa, \rho)$ that we obtain from the correlated random effects approach after modelling $\alpha$ conditional on $(y_{1,0}, y_{2,0})$ as
$$\alpha = \delta_0 + y_{1,0} \delta_1 + y_{2,0} \delta_2 + y_{1,0} y_{2,0} \delta_3 + \nu, \tag{10}$$
The dependent variable is working and the parameters are estimated by maximizing the likelihood function conditional on the initial conditions and under the assumption that $\alpha$ is distributed as in Eq. (10). The data are from IPUMS CPS and cover a balanced panel of couples where each individual's age is between 25 and 65. The data cover the period between 1982 and 2021.
The estimates of $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22})$ in Table 11 are larger in magnitude than those reported in Table 8, but the overall pattern is similar. The coefficients on one's own past employment for women and for men, $\gamma_{11}$ and $\gamma_{22}$, are positive and of the same magnitude, and the coefficients on the spouse's past employment for women and for men, $\gamma_{12}$ and $\gamma_{21}$, are negative and of the same magnitude. Moreover, these coefficients [...] Table 11 show the same pattern as the estimates in Table 10. Whites have the largest coefficient, while the estimates for Blacks and Hispanics are much lower.
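The likelihood contribution of one household under a correlated random effects specification of this kind can be sketched as below. This is a minimal illustration under two assumptions that are not taken from the text: the error $\nu$ in the equation for $\alpha$ is treated as normal with mean zero and standard deviation `sigma`, and the integral over $\nu$ is approximated by Gauss-Hermite quadrature. The function name and argument names are hypothetical.

```python
import math
import numpy as np

def cre_loglik_household(seq, y0, gammas, kappa, rho, delta, sigma, n_nodes=32):
    """Log-likelihood of one household's sequence conditional on y0.

    Assumption (not from the source): nu ~ N(0, sigma^2), integrated out
    with an n_nodes-point Gauss-Hermite rule.
    """
    # Gauss-Hermite: E[f(nu)] is approximated by
    # sum_k w_k * f(sqrt(2) * sigma * x_k) / sqrt(pi).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    nu = math.sqrt(2.0) * sigma * x
    d0, d1, d2, d3 = delta
    # alpha as a function of the initial conditions, one value per node.
    alpha = d0 + d1 * y0[0] + d2 * y0[1] + d3 * y0[0] * y0[1] + nu
    g11, g12, g21, g22 = gammas
    lik = np.ones_like(alpha)
    lag = y0
    for c1, c2 in seq:
        # Restricted fixed effects: alpha_1 = alpha, alpha_2 = alpha + kappa.
        z1 = g11 * lag[0] + g12 * lag[1] + alpha
        z2 = g21 * lag[0] + g22 * lag[1] + alpha + kappa
        denom = 1 + np.exp(z1) + np.exp(z2) + np.exp(z1 + z2 + rho)
        lik = lik * np.exp(c1 * z1 + c2 * z2 + c1 * c2 * rho) / denom
        lag = (c1, c2)
    return math.log(np.sum(w * lik) / math.sqrt(math.pi))
```

Summing this function over households and maximizing over the $\gamma$'s, $\kappa$, $\rho$, the $\delta$'s and `sigma` would give an estimator of the kind described above. A more flexible distribution for $\nu$ would change only the quadrature step.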
The parameters estimated based on the correlated random effects approach have less sampling uncertainty than the fixed effects estimators in Sect. 5.3 (presumably because they are based on additional assumptions). Figures 10 and 11 show the results of estimating the model using rolling 5-year sub-samples for each ethnicity. The estimates are fairly stable over time, and not very different across ethnicities. In terms of patterns, the results from estimating the $\gamma$'s presented in Fig. 10 mainly differ from the fixed effects estimates presented in Fig. 8 by displaying a clearer upward trend in the husband's coefficient on his own past employment, $\gamma_{22}$. The estimates also tend to have less sampling uncertainty. Again, this is to be expected because the correlated random effects approach imposes additional structure relative to the fixed effects approach. The correlated random effects estimates of the $\rho$'s presented in Fig. 11 are also noticeably less volatile than the GMM estimates in Fig. 9.
The results are in Table 12.
The table gives the probability limit of the correlated random effects estimator for various distributions of the fixed effect when $(\gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}, \rho, \kappa) = (2.5, -1.5, -1.5, 2.5, 1, 2)$ and $y_{1,0}$ and $y_{2,0}$ are independent and equal to 1 with probability $1/2$.
The probability limits in Table 12 illustrate that the correlated random effects approach can provide a very good approximation when the distribution of the heterogeneity ($\alpha$) is well-approximated by the assumed functional form, but also that bias can be a much larger source of estimation error than sampling variance for the kind of sample sizes considered here.

Conclusion
Two of Peter Schmidt's many contributions to econometrics have been to introduce an econometric model for simultaneous binary outcomes and to study the estimation of dynamic linear fixed effects panel data models using short panels. In this paper, we combine aspects of this research by studying panel data versions of the model introduced in Schmidt and Strauss (1975) that allow for lagged dependent variables and fixed effects, and we apply existing as well as new methods to investigate the joint behavior of employment of husbands and wives. On the methodological side, we first use the conditional likelihood approach of Honoré and Kyriazidou (2019) to construct a likelihood function that does not depend on the fixed effects of the model. While this conditional likelihood can be used to estimate the other parameters of the model when the total number of time periods is at least four, it turns out that it does not depend on the parameter ρ, which in the Schmidt-Strauss model captures the inter-equation dependence. As a result, our conditional likelihood approach cannot be used to estimate this parameter. We therefore next use the approach in Honoré and Weidner (2022) to study whether one can construct moment conditions that can be used to estimate ρ. We find that it is in principle possible to estimate the common parameters of such models when the total number of time periods for each individual is at least five. To construct moment conditions for four time periods, it is necessary to restrict the model. We do this by restricting the fixed effects for the two outcomes to be equal, except for an additive constant.
On the empirical side, we apply existing methods like those developed in Schmidt and Strauss (1975), as well as the estimation methods developed in this paper, to estimate a simple model for the relationship of employment of husbands and wives. Our main conclusion is that the parameter that captures the intra-household dependence in employment varies by the ethnicity composition of the couple and over time, even after one allows for unobserved household specific heterogeneity.


Appendix: Moment conditions
In this Appendix, we explicitly present the six moment conditions discussed in Sect. 6.1. To simplify the notation, we write $\Gamma_{ij} = \exp(\gamma_{ij})$, $B = \exp(\beta)$, and $P = \exp(\rho)$.