Identification and semiparametric estimation of a finite horizon dynamic discrete choice model with a terminating action


We study identification and estimation of finite-horizon dynamic discrete choice models with a terminal action. We first demonstrate a new set of conditions for the identification of agents’ time preferences. Then we prove conditions under which the per-period utilities are identified for all actions in the agent’s choice-set, without having to normalize the utility for one of the actions. Finally, we develop a computationally tractable semiparametric estimator. The estimator uses a two-step approach that does not use either backward induction or forward simulation. Our methodology can be implemented using standard statistical packages without the need to write specialized computational routines, as it involves linear (or nonlinear) projections only. Monte Carlo studies demonstrate the superior performance of our estimator compared with existing two-step estimation methods. Monte Carlo studies further demonstrate that the ability to identify the per-period utilities for all actions is crucial for counterfactual predictions. As an empirical illustration, we apply the estimator to the optimal default behavior of subprime mortgage borrowers, and the results show that the ability to identify the discount factor, rather than assuming an arbitrary number as typically done in the literature, is also crucial for obtaining correct counterfactual predictions. These findings highlight the empirical relevance of key identification results of the paper.




  1. 1.

    Our approach of nonparametrically estimating agents’ expectations and using the estimates to recover their preferences is closely related to Ahn and Manski (1993) and Manski (1991, 1993 and 2000), who examine agents’ responses to their expectations in models with uncertainty or endogenous social effects.

  2. 2.

    Note that we use t to denote the loan’s age, not calendar time.

  3. 3.

    To be precise, we must assume that the lifetime value of choosing default does not depend on the sequence of optimal actions (of the considered model) to be taken in the future periods. The most straightforward case where this assumption would be satisfied is when agents make no further decisions once default is chosen in the current period. However, we need not interpret the terminating action as literally resulting in the agent not having to make any future decisions. Rather, the utility from default can be regarded as the “reduced-form” representation of the ex ante value function from a separate dynamic decision problem the borrower solves after foreclosure.

  4. 4.

    Alternatively, a functional form assumption on how the utilities depend on time t would also allow us to exploit non-stationarity in the CCPs for identification. We do not focus on this approach since we are interested in nonparametric identification.

  5. 5.

    We omit discussing the identification of \(u_{T}(\cdot,\cdot)\), as it is obvious that, if \(u_{T}\neq u\), then \(u_{T}(\cdot,\cdot)\) is identified if and only if decisions from the final period are observed.

  6. 6.

    We are grateful to Günter Hitsch, who encouraged us to present the formal argument supporting this statement.

  7. 7.

    When the per-period utilities are parametric functions, this problem reduces to a simpler nonlinear instrumental variable setup. Although in our model the per-period utilities can be nonparametrically identified, we chose to adhere to a semiparametric model (with parametrically specified utilities) for inference in the econometric section as well as in our Monte Carlo and application. We believe the semiparametric model to be more common in applied work.

  8. 8.

    Under additional technical conditions, our estimation procedure can be extended to the case where utility functions are nonparametrically specified.

  9. 9.

    While our focus in the paper is on a single-agent finite-horizon dynamic discrete choice model with a terminating action, our estimation framework extends straightforwardly to multi-agent settings (assuming identification, which requires additional conditions for games compared to single-agent models) as well as to finite dependence problems. For instance, to handle finite dependence we have to increase the number of terms included in the second-stage IV regression, because under finite dependence the choice-specific value function is determined by the sequence of discounted future utilities in the “dependence window” as well as by the expected ex ante value function.

  10. 10.

    In our Monte Carlo exercises, we specify the per period utility function to be a linear function of the state variables, and assume a type I extreme value distribution for the idiosyncratic payoff shocks. Because the type I extreme value distribution has an infinitely differentiable density, the continuation value function in period T−1 is infinitely differentiable. Continuing the backward induction, we can establish that, in the context of our specification, all continuation value functions are infinitely differentiable. This implies that the (optimal) convergence rate of the nonparametric estimator approaches the parametric convergence rate.

  11. 11.

    In particular, if we use a degree-K polynomial to estimate the nonparametric regression, then the bias corresponds to the magnitude of the residual from numerical approximation of the estimated function via such a polynomial. For infinitely differentiable functions, this residual can be majorized by a power function of the degree K, e.g., \(c^{K}\), where c depends on the support of the state space (see Judd 1998). At the same time, the variance of the estimator has the magnitude O(K/J). This permits choosing the degree K of the approximating polynomials to be up to the order of J/log J. In practice, this choice can be made using cross-validation.

  12. 12.

    MATLAB code for the Monte Carlo exercises is provided as an Online Appendix.

  13. 13.

    We chose these specific parameter values in order for the state variables to have constant variance over time, although this feature is certainly not important to our method.

  14. 14.

    To be more precise, we set \(u(0,s)=u_{T}(0,s)\) and \(u(k,s)+2=u_{T}(k,s)\) for k=1,2. Although the condition specified in the theorem states \(u(k,s)=u_{T}(k,s)\) for k=0,1,2, the theorem in fact holds as long as the researcher knows the exact relationship between \(u(\cdot,\cdot)\) and \(u_{T}(\cdot,\cdot)\) (as can easily be seen in the proof of the theorem). In the Monte Carlo study, the researcher is assumed to know that the final period’s utility from prepayment or payment is higher than the earlier periods’ utility from the same action by 2, and to impose that knowledge during estimation, so the theorem still holds.

  15. 15.

    Data size is likely to become less of a binding constraint in many increasingly popular empirical settings in which large consumer panels are available.

  16. 16.

    We will investigate the implications of the identification regarding β for counterfactual predictions in our application, since estimation of β is the main focus of the application.

  17. 17.

    To be precise, when we increase u(0,s) by c (in all periods), also adding (1−β)c to u(1,s) and u(2,s) for t<T and adding c to u(1,s) and u(2,s) for the final period T ensures that the CCPs remain unaffected.

  18. 18.

    To be more precise, we focus on the Hotz and Miller estimator augmented with the conditional choice simulator proposed by Hotz et al. (1994). The original Hotz and Miller estimator requires computing conditional choice probabilities for all feasible future paths, which can become impractical when, for example, some observable state variables are continuous. To address this problem, Hotz et al. (1994) proposed using forward simulation so that only the simulated paths of future choices need to be considered. Since forward simulation clearly speeds up estimation, we use the Hotz and Miller estimator augmented with forward simulation, instead of the original Hotz and Miller estimator, in our comparison. For simplicity, we still refer to the estimator in the comparison group as Hotz and Miller.

  19. 19.

    In these tables, we assume that the researcher uses correctly specified state transitions for HM and BBL.

  20. 20.

    We used Hermite polynomials of the third degree to estimate the CCPs for our proposed estimator as well as for our implementation of HM and BBL. For our proposed estimator, we also used the same basis functions to compute the projection of the continuation value.

  21. 21.

    We find that results for BBL2 and BBL3 are also sensitive to the choice of initial values, but we do not present separate cases depending on the initial values for those due to space constraints. Results are available upon request.

  22. 22.

    We decided to use 250 draws for simulation since we found that using fewer than 250 simulation draws occasionally resulted in implausible estimates.

  23. 23.

    We found that initial guesses close to the true parameter values typically lead to sensible estimates, but not always.

  24. 24.

    The results on mean bias and standard errors, reported in the Online Appendix, also show that our proposed estimator outperforms the competing methods on both measures.

  25. 25.

    The fact that our estimator is easier to code, reducing the risk of analyst error, is hard to incorporate into a Monte Carlo, but would likely increase the precision and reduce the bias of our estimator relative to its rivals in applications.

  26. 26.

    Atlanta, Boston, Charlotte, Chicago, Cleveland, Dallas, Denver, Detroit, Las Vegas, Los Angeles, Miami, Minneapolis, New York, Phoenix, Portland, San Diego, San Francisco, Seattle, Tampa, and Washington D.C.

  27. 27.

    In a cash-out refinance, the borrower pays off the balance on the existing loan with a larger loan, and receives cash for the difference.

  28. 28.

    Specifically, for these loans, we observe the “front-end debt-to-income” ratio—defined as the ratio of monthly mortgage-related payments to the borrower’s income—from which we can impute the borrower’s income based on the fact that we also observe the numerator of the ratio.

  29. 29.

    It is possible for the maturity of a loan to be renegotiated if the loan goes through a loan modification program. However, loan modifications occur very rarely; most modifications involve a rate reduction or principal reduction rather than an extension of the loan term; and, importantly, whether a loan modification is granted is at the discretion of lenders, not borrowers. Therefore, it seems reasonable to assume that T is seen as exogenously fixed from the perspective of a borrower.

  30. 30.

    Under non-recourse, lenders cannot go after a defaulter’s assets other than the mortgage collateral (i.e., the house), which lowers the perceived cost of default to the borrower.

  31. 31.

    The value 0.993 corresponds to \(1/(1+\bar {r})\), where \(\bar {r}\) is the empirical mean of the monthly interest rate that borrowers pay on their mortgages.

  32. 32.

    The proof is in the Appendix.
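Footnote 11 notes that the polynomial degree K trades off approximation bias against variance and that, in practice, K can be chosen by cross-validation. The following is a minimal sketch of that idea, not the authors' code: the data-generating function, the noise level, the five-fold split, and the helper names `cv_mse` and `choose_degree` are all illustrative assumptions, and the Hermite basis is used only because footnote 20 mentions Hermite polynomials.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def cv_mse(s, y, degree, n_folds=5):
    """Out-of-fold mean squared error of a degree-`degree` Hermite series fit."""
    folds = np.arange(len(s)) % n_folds  # s is i.i.d., so this split is as good as random
    sq_err = 0.0
    for f in range(n_folds):
        train, test = folds != f, folds == f
        coef = He.hermefit(s[train], y[train], degree)  # least-squares series fit
        sq_err += np.sum((He.hermeval(s[test], coef) - y[test]) ** 2)
    return sq_err / len(s)


def choose_degree(s, y, max_degree=10):
    """Return the degree K in 1..max_degree with the smallest cross-validated MSE."""
    scores = [cv_mse(s, y, k) for k in range(1, max_degree + 1)]
    return 1 + int(np.argmin(scores)), scores


# Hypothetical smooth regression function observed with noise (an assumption).
rng = np.random.default_rng(0)
J = 400
s = rng.uniform(-1.0, 1.0, size=J)
y = np.sin(3.0 * s) + 0.1 * rng.standard_normal(J)

best_k, scores = choose_degree(s, y)
print("cross-validated degree:", best_k)
```

Low degrees are penalized through bias and high degrees through variance, so the cross-validated error is typically minimized at an intermediate K.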


  1. Aguirregabiria, V (2010). Another look at the identification of dynamic discrete decision processes: an application to retirement behavior. Journal of Business & Economic Statistics, 28(2), 201–218.

  2. Aguirregabiria, V, & Magesan, A (2013). Euler equations for the estimation of dynamic discrete choice structural models. Advances in Econometrics, 31, 3–44.

  3. Aguirregabiria, V, & Suzuki, J (2014). Identification and counterfactuals in dynamic models of market entry and exit. Quantitative Marketing and Economics, 12(3), 267–304.

  4. Ai, C, & Chen, X (2003). Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica, 71(6), 1795–1843.

  5. Ahn, H, & Manski, C (1993). Distribution theory for the analysis of binary choice under uncertainty with nonparametric estimation of expectations. Journal of Econometrics, 56(3), 291–321.

  6. Altug, S, & Miller, R (1998). The effect of work experience on female wages and labour supply. Review of Economic Studies, 65(1), 45–85.

  7. Andrews, D (1991). Asymptotic normality of series estimators for nonparametric and semiparametric regression models. Econometrica, 59(2), 307–345.

  8. Arcidiacono, P, Bayer, P, Blevins, J, & Ellickson, P (2015). Estimation of dynamic discrete choice models in continuous time with an application to retail competition. Forthcoming in Review of Economic Studies.

  9. Arcidiacono, P, & Miller, R (2011). CCP estimation of dynamic discrete choice models with unobserved heterogeneity. Econometrica, 79, 1823–1867.

  10. Arcidiacono, P, & Miller, R (2015a). Identifying dynamic discrete choice models off short panels. Working paper.

  11. Arcidiacono, P, & Miller, R (2015b). Nonstationary dynamic models with finite dependence. Working paper.

  12. Bajari, P, Benkard, L, & Levin, J (2007). Estimating dynamic models of imperfect competition. Econometrica, 75(5), 1331–1370.

  13. Bajari, P, Hong, H, & Nekipelov, D (2013). Game theory and econometrics: A survey of some recent research. In Advances in economics and econometrics: tenth world congress of econometric society (Vol. 3). Cambridge University Press.

  14. Beauchamp, A (2015). Regulation, imperfect competition, and the U.S. Abortion Market. International Economic Review, 56(3), 963–996.

  15. Chen, X (2007). Large sample sieve estimation of semi-nonparametric models. Handbook of Econometrics, 7, 5549–5632.

  16. Chen, X, Linton, O, & van Keilegom, I (2003). Estimation of semiparametric models when the criterion function is not smooth. Econometrica, 71(5), 1591–1608.

  17. Chen, X, Chernozhukov, V, Lee, S, & Newey, W (2014). Local identification of nonparametric and semiparametric models. Econometrica, 82(2), 785–809.

  18. Chung, D, Steenburgh, T, & Sudhir, K (2014). Do bonuses enhance sales productivity? A dynamic structural analysis of bonus-based compensation plans. Marketing Science, 33(2), 165–187.

  19. Dubé, J-P, Hitsch, G, & Jindal, P (2014). The joint identification of utility and discount functions from stated choice data: an application to durable goods adoption. Quantitative Marketing and Economics, 12, 331–377.

  20. Duffie, D, & Singleton, KJ (1997). An econometric model of the term structure of interest-rate swap yields. Journal of Finance, 52(4), 1287–1321.

  21. Eckstein, Z, & Wolpin, K (1999). Why youths drop out of high school: The impact of preferences, opportunities, and abilities. Econometrica, 67(6), 1295–1339.

  22. Fang, H, & Wang, Y (2015). Estimating dynamic discrete choice models with hyperbolic discounting, with an application to mammography decisions. International Economic Review, 56(2), 565–596.

  23. Frederick, S, Loewenstein, G, & O’Donoghue, T (2002). Time discounting and time preference: A critical review. Journal of Economic Literature, 40, 350–401.

  24. Härdle, W (1990). Applied nonparametric regression. Cambridge University Press.

  25. Hausman, J (1979). Individual discount rates and the purchase and utilization of energy-using durables. Bell Journal of Economics, 10(1), 33–54.

  26. Heckman, J, & Navarro, S (2007). Dynamic discrete choice and dynamic treatment effects. Journal of Econometrics, 136(2), 341–396.

  27. Hotz, J, & Miller, R (1993). Conditional choice probabilities and the estimation of dynamic models. Review of Economic Studies, 60(3), 497–529.

  28. Hotz, J, Miller, R, Sanders, S, & Smith, J (1994). A simulation estimator for dynamic models of discrete choice. Review of Economic Studies, 61(2), 265–289.

  29. Joensen, JS (2009). Academic and labor market success: The impact of student employment, abilities, and preferences. Working paper.

  30. Judd, K (1998). Numerical methods in economics. MIT Press.

  31. Kalouptsidi, M (2014). Time to build and fluctuations in bulk shipping. American Economic Review, 104(2), 564–608.

  32. Kalouptsidi, M, Scott, P, & Souza-Rodrigues, E (2016). Identification of counterfactuals in dynamic discrete choice models. Working paper.

  33. Magnac, T, & Thesmar, D (2002). Identifying dynamic discrete decision processes. Econometrica, 70(2), 801–816.

  34. Mammen, E, Rothe, C, & Schienle, M (2012). Nonparametric regression with nonparametrically generated covariates. Annals of Statistics, 40(2), 1132–1170.

  35. Manski, C (1991). Nonparametric estimation of expectations in the analysis of discrete choice under uncertainty. In: Nonparametric and semiparametric methods in econometrics and statistics: proceedings of the fifth international symposium in economic theory and econometrics. Cambridge University Press.

  36. Manski, C (1993). Identification of endogenous social effects: The reflection problem. Review of Economic Studies, 60(3), 531–542.

  37. Manski, C (2000). Identification problems and decisions under ambiguity: Empirical analysis of treatment response and normative analysis of treatment choice. Journal of Econometrics, 95(2), 415–442.

  38. Newey, W (1997). Convergence rates and asymptotic normality for series estimators. Journal of Econometrics, 79, 147–168.

  39. Newey, W, & Powell, J (2003). Instrumental variable estimation of nonparametric models. Econometrica, 71(5), 1565–1578.

  40. Norets, A, & Takahashi, S (2013). On the surjectivity of the mapping between utilities and choice probabilities. Quantitative Economics, 4, 149–155.

  41. Norets, A, & Tang, X (2014). Semiparametric inference in dynamic binary choice models. Review of Economic Studies, 81(3), 1229–1262.

  42. Pagan, A, & Ullah, A (1999). Nonparametric econometrics. Cambridge University Press.

  43. Pollard, D (1984). Convergence of stochastic processes. Springer-Verlag.

  44. Pesendorfer, M, & Schmidt-Dengler, P (2008). Asymptotic least squares estimators for dynamic games. Review of Economic Studies, 75(3), 901–928.

  45. Rust, J (1987). Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher. Econometrica, 55(5), 999–1033.

  46. Rust, J (1994). Structural estimation of Markov decision processes. In Engle, R, & McFadden, D (Eds.) Handbook of econometrics, Vol. 4. Amsterdam: North-Holland.

  47. Rust, J, & Phelan, C (1997). How social security and medicare affect retirement behavior in a world of incomplete markets. Econometrica, 65(4), 781–831.

  48. Scott, P (2013). Dynamic discrete choice estimation of agricultural land use. Working paper.

  49. van der Vaart, A, & Wellner, J (1996). Weak convergence and empirical processes. Springer.

  50. Wong, WH, & Shen, X (1995). Probability inequalities for likelihood ratios and convergence rates of sieve MLEs. Annals of Statistics, 23(2), 339–362.

  51. Yao, S, Mela, CF, Chiang, J, & Chen, Y (2012). Determining consumers’ discount rates with field studies. Journal of Marketing Research, 49(6), 822–841.



We are grateful to the editor and anonymous reviewers for their insightful comments and constructive suggestions. The paper has also benefited from helpful comments by seminar participants at Chicago Booth, Olin Business School, Stanford GSB, Berkeley ARE, IO fest, Cirpée Conference on Industrial Organization, and Conference on “Recent Contributions to Inference in Game Theoretic Models” at University College London. All remaining errors are our own.

Author information



Corresponding author

Correspondence to Minjung Park.

Additional information

Chenghuan Sean Chu’s work on this paper was conducted before employment at Facebook.



Appendix A: Optimal policy functions

Proposition 1

Under Assumption 1 there exists a unique decision rule \(D_{t}^{\ast }(s_{t},\varepsilon _{t})\) supported on A for t=1,2,…,T that solves the maximization problem

$$\sup\limits_{(D_{1},D_{2},\ldots,D_{T})\in A^{T}}V_{1,\sigma}(s_{1}). $$

Proof 3

Our argument uses backward induction. In the final period (at mortgage maturity) the borrower faces a static optimization problem of choosing among \(V_{T}(0,s_{T})+\varepsilon_{0,T}\), \(V_{T}(1,s_{T})+\varepsilon_{1,T}\), and \(V_{T}(2,s_{T})+\varepsilon_{2,T}\). The optimal decision delivers the highest payoff, yielding the decision rule \(D_{T}^{\ast }(s_{T},\varepsilon _{T})=\text {arg\,max}_{k\in A}\{V_{T}(k,s_{T}) + \varepsilon _{k,T}\}\). Provided that the payoff shocks are idiosyncratic and have a continuous distribution, the optimal choice probabilities are characterized by continuous functions of \((V_{T}(k,s_{T}),\,k\in A)\). Knowing the optimal decision rule in period T, we can obtain the choice-specific value function in period T−1 as

$$\begin{array}{@{}rcl@{}} \textstyle V_{T-1}(k,s_{T-1})&=&u(k,s_{T-1})+\beta E\left[ \sum\limits_{k^{\prime}\in A}\mathbf{1}\{D_{T}^{\ast}=k^{\prime}\}\left( V_{T}(k^{\prime},s_{T})+\varepsilon_{k^{\prime},T}\right) \right.\\ &&\quad\quad\quad\quad\quad\quad\quad\quad\left.\big\vert\,s_{T-1},a_{T-1}=k\vphantom{\sum\limits_{k^{\prime}\in A}}\right] . \end{array} $$

Provided that the T-th period optimal decision has already been derived, the optimal decision problem in T−1 becomes a static choice among three alternatives. Its solution, again, trivially exists and is (almost surely) unique because the distribution of \(\varepsilon_{T-1}\) is continuous. We iterate this procedure back to t = 1. □
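The backward recursion in this proof can be sketched numerically. The example below is an illustration under stated assumptions rather than the authors' implementation: it assumes i.i.d. type I extreme value shocks (as in the Monte Carlo design of footnote 10), a finite state space, and that action 0 is terminating so that its choice-specific value is just its utility (the interpretation in footnote 3). The function `solve_ccps`, the random utilities, and the transition matrices are hypothetical. As a by-product, the sketch checks the utility shift described in footnote 17, which should leave all choice probabilities unchanged.

```python
import numpy as np

EULER = 0.5772156649015329  # mean of a standard type I extreme value shock


def solve_ccps(u, u_T, P, beta, T):
    """Backward induction with i.i.d. type I extreme value shocks.

    u, u_T : (3, S) arrays holding u(k, s) for t < T and u_T(k, s) for t = T.
    P      : (3, S, S) array, P[k, s, s2] = Pr(s_{t+1} = s2 | s_t = s, a_t = k).
    Action 0 terminates the problem, so V_t(0, s) equals its utility.
    Returns a list of (3, S) choice-probability arrays for t = 1, ..., T.
    """
    ccps = []
    ex_ante = None
    for t in range(T, 0, -1):
        V = (u_T if t == T else u).astype(float).copy()
        if t < T:
            # Continuing actions add the discounted expected ex ante value.
            V[1:] += beta * (P[1:] @ ex_ante)
        vmax = V.max(axis=0)
        logsum = vmax + np.log(np.exp(V - vmax).sum(axis=0))  # stable log-sum-exp
        ccps.append(np.exp(V - logsum))  # logit choice probabilities in period t
        ex_ante = EULER + logsum  # E[max_k {V_t(k, s) + eps_k}]
    return ccps[::-1]


# Hypothetical primitives (assumptions made only for this illustration).
rng = np.random.default_rng(0)
S, T, beta, c = 5, 6, 0.95, 1.7
u = rng.normal(size=(3, S))
u_T = rng.normal(size=(3, S))
P = rng.random((3, S, S))
P /= P.sum(axis=2, keepdims=True)

ccps = solve_ccps(u, u_T, P, beta, T)

# Footnote 17: add c to u(0, s), (1 - beta) c to u(1, s) and u(2, s) for t < T,
# and c to every final-period utility.
u_shift = u.copy()
u_shift[0] += c
u_shift[1:] += (1.0 - beta) * c
ccps_shifted = solve_ccps(u_shift, u_T + c, P, beta, T)

max_gap = max(np.abs(p - q).max() for p, q in zip(ccps, ccps_shifted))
print(f"largest CCP difference after the shift: {max_gap:.2e}")
```

The shift raises every choice-specific value by exactly c in every period, which is why observed choices alone cannot distinguish the two sets of utilities.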

Appendix B: Lemma 1

Under our assumptions, the system of equations

$$\begin{array}{@{}rcl@{}} && \sigma_{0}(z_{1},z_{2})=\bar{\sigma}_{0},\\ && \sigma_{1}(z_{1},z_{2})=\bar{\sigma}_{1} \end{array} $$

has a unique solution if and only if \(\bar {\sigma }_{0}+\bar {\sigma }_{1}<1\).

This result generalizes that in Hotz and Miller (1993) to general full support distributions, and is also proved in Norets and Takahashi (2013). For completeness of exposition, we provide the proof.
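In the special case of i.i.d. type I extreme value errors (an assumption made here only for illustration; the lemma itself covers general full-support distributions), the unique solution has a closed form: with \(z_{k}\) read as the value difference between action k and action 0, as in the proof below, \(z_{k}=\log(\bar{\sigma}_{k}/\bar{\sigma}_{0})\). The sketch below uses hypothetical helper names `choice_probs` and `invert_ccps` and makes the role of the condition \(\bar{\sigma}_{0}+\bar{\sigma}_{1}<1\) explicit.

```python
import math


def choice_probs(z1, z2):
    """Logit probabilities (sigma_0, sigma_1, sigma_2) given z_k = V_k - V_0."""
    denom = 1.0 + math.exp(z1) + math.exp(z2)
    return 1.0 / denom, math.exp(z1) / denom, math.exp(z2) / denom


def invert_ccps(sigma0, sigma1):
    """Recover (z1, z2) from (sigma_0, sigma_1) in the logit case."""
    if not (sigma0 > 0.0 and sigma1 > 0.0 and sigma0 + sigma1 < 1.0):
        # Without this condition sigma_2 <= 0 and no finite z2 solves the system.
        raise ValueError("need sigma0 > 0, sigma1 > 0 and sigma0 + sigma1 < 1")
    sigma2 = 1.0 - sigma0 - sigma1
    return math.log(sigma1 / sigma0), math.log(sigma2 / sigma0)


z1, z2 = invert_ccps(0.2, 0.5)
print(choice_probs(z1, z2))  # reproduces the probabilities fed into the inversion
```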

Proof 4

Consider partial derivatives

$$\begin{array}{@{}rcl@{}} \frac{\partial\sigma_{0}(z_{1},z_{2})}{\partial z_{1}} & =&-{\int}_{-\infty }^{+\infty}\frac{\partial^{2}F_{\varepsilon}}{\partial\varepsilon_{0} \partial\varepsilon_{1}}(\varepsilon_{0},\varepsilon_{0}-z_{1},\varepsilon_{0}-z_{2})\,d\varepsilon_{0},\;\;\\ \frac{\partial\sigma_{0}(z_{1},z_{2})}{\partial z_{2}} & =&-{\int}_{-\infty }^{+\infty}\frac{\partial^{2}F_{\varepsilon}}{\partial\varepsilon_{0} \partial\varepsilon_{2}}(\varepsilon_{0},\varepsilon_{0}-z_{1},\varepsilon_{0}-z_{2})\,d\varepsilon_{0}. \end{array} $$

Similarly, we can find that

$$\begin{array}{@{}rcl@{}} \frac{\partial\sigma_{1}(z_{1},z_{2})}{\partial z_{1}}&= & {\int}_{-\infty }^{+\infty}\frac{\partial^{2}F_{\varepsilon}}{\partial\varepsilon_{0} \partial\varepsilon_{1}}(z_{1}+\varepsilon_{1},\varepsilon_{1},z_{1} -z_{2}+\varepsilon_{1})\,d\varepsilon_{1}\\ && +{\int}_{-\infty}^{+\infty}\frac{\partial^{2}F_{\varepsilon}}{\partial \varepsilon_{1}\partial\varepsilon_{2}}(z_{1}+\varepsilon_{1},\varepsilon_{1},z_{1}-z_{2}+\varepsilon_{1})\,d\varepsilon_{1}\\ & =&{\int}_{-\infty}^{+\infty}\left( \frac{\partial^{2}F_{\varepsilon} }{\partial\varepsilon_{0}\partial\varepsilon_{1}}+\frac{\partial ^{2}F_{\varepsilon}}{\partial\varepsilon_{1}\partial\varepsilon_{2}}\right) (\varepsilon_{0},\varepsilon_{0}-z_{1},\varepsilon_{0}-z_{2})\,d\varepsilon_{0}, \end{array} $$


$$\frac{\partial\sigma_{1}(z_{1},z_{2})}{\partial z_{2}}=-{\int}_{-\infty }^{+\infty}\frac{\partial^{2}F_{\varepsilon}}{\partial\varepsilon_{1} \partial\varepsilon_{2}}(\varepsilon_{0},\varepsilon_{0}-z_{1},\varepsilon_{0}-z_{2})\,d\varepsilon_{0}. $$

We assumed that the joint distribution of errors has a continuous density with a full support on \({\mathbb {R}}^{3}\). Provided that \(\frac {\partial \sigma _{0}(z_{1},z_{2})}{\partial z_{1}}\,\frac {\partial \sigma _{0}(z_{1},z_{2} )}{\partial z_{2}}>0\), the mapping \(z_{1}\mapsto z_{2}\) implicitly defined by the equation \(\sigma _{0}(z_{1},z_{2})=\bar {\sigma }_{0}\) is invertible. Moreover, if we denote this mapping \(z_{2}=m_{0}(z_{1},\bar {\sigma }_{0})\), then using the result regarding the derivative of the inverse function, we can conclude that

$$\frac{\partial m_{0}(z_{1},\bar{\sigma}_{0})}{\partial z_{1}}\leq0. $$

Similarly, we can define a map \(z_{2}=m_{1}(z_{1},\bar {\sigma }_{1})\) and, using the result regarding the derivative of the inverse function, conclude that

$$\frac{\partial m_{1}(z_{1},\bar{\sigma}_{1})}{\partial z_{1}}\geq0. $$

We can explore the asymptotic behavior of both maps. Consider \(m_{0}\) first. Suppose that \(z_{1}\rightarrow -\infty \). Then \(\lim \limits _{z_{1} \rightarrow -\infty }m_{0}(z_{1},\bar {\sigma }_{0})=z_{2}^{\ast }\), where \(z_{2}^{\ast }\) solves \(\int \mathbf {1}\{\varepsilon _{0}\geq z_{2}^{\ast }+\varepsilon _{2}\}F_{\varepsilon }(d\varepsilon \,)=\bar {\sigma }_{0}\). Also let \(z_{1}^{\ast }\) solve \(\int \mathbf {1}\{\varepsilon _{0}\geq z_{1}^{\ast }+\varepsilon _{1}\}F_{\varepsilon }(d\varepsilon \,)=\bar {\sigma }_{0}\). Then \(\lim \limits _{z_{1}\rightarrow z_{1}^{\ast }}m_{0}(z_{1},\bar {\sigma }_{0})=-\infty \).

Next consider \(m_{1}\). Suppose that \(z_{2}^{\ast \ast }\) is the solution of \(\int \mathbf {1}\{\varepsilon _{1}\geq z_{2}^{\ast \ast }+\varepsilon _{2}\}F_{\varepsilon }(d\varepsilon )=\bar {\sigma }_{1}\). Then as \(z_{1}\rightarrow +\infty \), the map asymptotically approaches the line: \(m_{1}(z_{1},\bar {\sigma }_{1})\rightarrow z_{1}+z_{2}^{\ast \ast }\). Suppose that \(z_{1}^{\ast \ast }\) is the solution of \(\int \mathbf {1}\{z_{1}^{\ast \ast }+\varepsilon _{1}\geq \varepsilon _{0}\}F_{\varepsilon }(d\varepsilon \,)=\bar {\sigma }_{1}\). Then \(\lim \limits _{z_{1}\rightarrow z_{1}^{\ast \ast } }m_{1}(z_{1},\bar {\sigma }_{1})=-\infty \). Thus \(m_{0}\) is a continuous strictly decreasing mapping from \((-\infty ,\,z_{1}^{\ast }]\) into \((-\infty ,z_{2}^{\ast }]\) and \(m_{1}\) is a continuous strictly increasing mapping from \([z_{1}^{\ast \ast },+\infty )\) into the real line.

Provided that both curves are continuous and monotone, they intersect if and only if their projections on the \(z_{1}\) and \(z_{2}\) axes overlap. The projections on the \(z_{2}\) axis are guaranteed to overlap (\((-\infty ,\,z_{2}^{\ast }]\subset {\mathbb {R}}\)). The projections on the \(z_{1}\) axis will overlap if and only if \(z_{1}^{\ast \ast }<z_{1}^{\ast }\). Given that the function \(\sigma (z)=\int \mathbf {1}\{\varepsilon _{0}-\varepsilon _{1}\leq z\}F_{\varepsilon }(d\varepsilon \,)\) is strictly monotone in z, \(z_{1}^{\ast \ast }<z_{1}^{\ast }\) if and only if \(\bar {\sigma }_{0}+\bar {\sigma }_{1}<1\). This proves the statement of Lemma 1. □

Appendix C: Asymptotic theory for the plug-in estimator

Section 3 outlined the structure of the two-step plug-in estimator for the structural parameters, which include the per-period payoffs and the discount factor. This Appendix provides the asymptotic theory for the constructed estimator. We assume a parametric specification for the per-period utility, although our theory allows for an immediate extension to a nonparametric specification of the per-period utility. A key requirement of the plug-in semiparametric procedure is that the first-stage nonparametric estimator of the policy functions converge at a sufficiently fast rate. Our results for the consistency and the convergence rate of the first-stage estimator rely on the results in Wong and Shen (1995), Andrews (1991), and Newey (1997).

To assure consistency and a fast convergence rate for the first-stage estimator, we need the following assumption.

Assumption 2

  1. (i)

    In addition to the Markov assumption (Assumption 1.iii), for each period t the distribution of states \(s_{t}\mid s_{t-1}\) is identical across borrowers and over time, and the choice probabilities \(\sigma _{k,t}(\cdot )\) are uniformly bounded away from 0 and 1 for each k=0,1,2. The state space \({\mathcal {S}}\) is compact.

  2. (ii)

    The eigenvalues of \(E[q^{L}(s_{t})\,q^{L\prime }(s_{t})\mid a_{t}]\) are bounded away from zero uniformly over L, and \(|q_{l}(\cdot )|\leq C\) for all l.

  3. (iii)

    \(\frac {\sigma _{k,t}(s)}{\sigma _{0,t}(s)}\) belongs to a separable functional space with basis \(\{q_{l}(\cdot )\}_{l=1}^{\infty }\) . For each t≤T and k∈{1,2} the selected series terms provide a uniformly good approximation for the probability ratio

    $$\sup\limits_{s\in{\mathcal{S}}}\left\Vert \log\,\frac{\sigma_{k,t}(s)} {\sigma_{0,t}(s)}-\text{proj}\left( \log\,\frac{\sigma_{k,t}(s)}{\sigma_{0,t}(s)}\,\big\vert\,q^{L}(\cdot)\right) \right\Vert =O(L^{-\alpha}) $$

    for some \(\alpha \geq \frac {1}{2}\).

Assumption 2 can be verified for particular classes of polynomials and sieves (see Chen 2007). Assumption 2 implies the following result establishing the consistency and convergence rate of the first-stage estimator for the policy functions (see footnote 32).

Theorem 3

Under Assumptions 1 and 2, the estimator (4) is consistent uniformly over s:

$$\sup\limits_{s\in{\mathcal{S}}}\left\Vert \widehat{\sigma}_{k,t}(s)-{\sigma }_{k,t}(s)\right\Vert =o_{P}\left( J^{-1/4}\right) $$

provided that L→∞ with \(\frac {J}{L\log (J)}\rightarrow \infty \) as J→∞.
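To make the first stage concrete, the sketch below fits a series multinomial logit in which \(\log(\sigma_{k,t}(s)/\sigma_{0,t}(s))\) is approximated by a third-degree Hermite polynomial in a scalar state, in the spirit of Assumption 2(iii) and footnote 20. It is an illustrative assumption, not a reproduction of estimator (4), which is not shown in this section: the data-generating coefficients `B_true`, the sample size, and the Newton-Raphson routine `fit_sieve_logit` are all hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def ccps_from_coef(s, B):
    """(J, 3) probabilities (sigma_0, sigma_1, sigma_2) implied by coefficients B."""
    Q = hermevander(s, B.shape[0] - 1)  # basis terms q^L(s)
    expz = np.exp(Q @ B)  # exp of fitted log(sigma_k / sigma_0), k = 1, 2
    denom = 1.0 + expz.sum(axis=1, keepdims=True)
    return np.hstack([1.0 / denom, expz / denom])


def fit_sieve_logit(s, a, degree=3, max_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood for the series multinomial logit."""
    Q = hermevander(s, degree)
    L = Q.shape[1]
    Y = np.stack([a == 1, a == 2], axis=1).astype(float)
    B = np.zeros((L, 2))
    for _ in range(max_iter):
        P = ccps_from_coef(s, B)[:, 1:]
        grad = (Q.T @ (Y - P)).ravel(order="F")  # score, stacked by choice
        H = np.zeros((2 * L, 2 * L))  # negative Hessian of the log-likelihood
        for j in range(2):
            for k in range(2):
                w = P[:, j] * (float(j == k) - P[:, k])
                H[j * L:(j + 1) * L, k * L:(k + 1) * L] = Q.T @ (Q * w[:, None])
        step = np.linalg.solve(H, grad)
        B += step.reshape((L, 2), order="F")
        if np.abs(step).max() < tol:
            break
    return B


# Simulated cross-section of J loans in one period (hypothetical design).
rng = np.random.default_rng(0)
J = 20000
s = rng.uniform(-1.0, 1.0, size=J)
B_true = np.array([[0.5, -0.3], [1.0, -0.8], [-0.5, 0.0], [0.0, 0.6]])
cum = np.cumsum(ccps_from_coef(s, B_true), axis=1)
a = (rng.random(J)[:, None] > cum[:, :2]).sum(axis=1)  # choices in {0, 1, 2}

B_hat = fit_sieve_logit(s, a)
grid = np.linspace(-0.9, 0.9, 19)
max_err = np.abs(ccps_from_coef(grid, B_hat) - ccps_from_coef(grid, B_true)).max()
print(f"largest CCP error on the grid: {max_err:.3f}")
```

Because the fit is a concave maximum likelihood problem in the series coefficients, a few Newton steps typically suffice, and the same routine can be rerun period by period to obtain \(\widehat{\sigma}_{k,t}\) for each t.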

The asymptotics in this theorem are in terms of the number of loans J, reflecting the fact that each loan is observed only once for a given t. We use the estimated first-stage policy functions as inputs for the estimation of the second-stage structural parameters. Our approach is based on applying existing plug-in implementations for estimating the system of Eq. 5. These techniques involve constructing nonparametric elements based on a statistical model (in our case, the policy functions) that are then plugged into a fully parametric second step. Estimation in the second step is commonly performed by means of a weighted minimum distance procedure, with weights that are chosen optimally to maximize the efficiency of the resulting estimator.

To establish the asymptotic properties of the designed procedure we impose the following assumptions.

Assumption 3

  1. (i)

    Parameter space Θ is a compact subset of \({\mathbb {R}}^{p}\).

  2. (ii)

    The per-period payoff is Lipschitz-continuous in parameters.

  3. (iii)

    The variance of the one-period-ahead policy function is bounded (\(\sup \limits _{s\in {\mathcal {S}}}E\left [ \sigma _{k,t+1}(s_{t+1} )^{2}\,|\,s_{t}=s\right ] <1\)) and strictly positive (\(\inf \limits _{s\in {\mathcal {S}}}E\left [ \sigma _{k,t+1}(s_{t+1})^{2}\,|\,s_{t}=s\right ] >0\)) for any t<T.

Under this assumption and the technical assumption described in the Appendix, which restricts the complexity of the class of functions that is associated with our “nonparametric multinomial logit” estimator, we can use the results regarding semiparametric plug-in estimators in Ai and Chen (2003) and Chen et al. (2003), and establish the following result for the estimator for the second-stage structural parameters.

Theorem 4

Under Assumptions 1, 2 and 3, the estimator (5) is consistent and asymptotically normal:

$$\sqrt{JT^{\ast}}\left( \left( \hat{\theta}(a),\hat{\beta}\right) -\left( \theta_{0}(a),\beta\right) \right) \overset{d}{\longrightarrow}N(0,\,V). $$

where variance V is determined by the functional structure of the model.

The result of this theorem follows from Theorem 3.1 in Ai and Chen (2003). A significant difference between (5) used for our estimation and the conditional moment equations implied by infinite-horizon Markov dynamic decision processes is that the one-period-ahead values in our moment equations are estimated separately. As a result, the estimated choice-specific value function and the ex ante value function can be considered to be unrelated nonparametric objects (in contrast to infinite-horizon dynamics, in which the two are connected via a fixed point). This feature facilitates the evaluation of the asymptotic variance.

An explicit expression for the variance can be obtained as follows. We introduce

$$J_{k}(\sigma_{0,t},\sigma_{1,t},\sigma_{0,t+1},\sigma_{1,t+1},s)=\left( \frac{\partial F_{k}}{\partial\sigma_{0,t}},\,\frac{\partial F_{k}} {\partial\sigma_{1,t}},\,\frac{\partial F}{\partial\sigma_{0,t+1}} ,\,\frac{\partial F}{\partial\sigma_{1,t+1}}\right)^{\prime} $$

, \(J(s)=(J_{1}(s),\,J_{2}(s))\), and

$$\begin{array} [c]{l} M(s)=\\ E\!\left[ \!\!\left. \left( \begin{array} [c]{cccc} \frac{\partial u(s_{t};\theta(1))}{\partial\theta(1)} & 0 & -\frac{\partial u(s_{t};\theta(0))}{\partial\theta(0)}+\beta\frac{\partial u(s_{t+1} ;\theta(0))}{\partial\theta(0)} & F(s_{t+1})+u(s_{t+1};\theta(0))\\ 0 & \frac{\partial u(s_{t};\theta(2))}{\partial\theta(2)} & -\frac{\partial u(s_{t};\theta(0))}{\partial\theta(0)}+\beta\frac{\partial u(s_{t+1} ;\theta(0))}{\partial\theta(0)} & F(s_{t+1})+u(s_{t+1};\theta(0)) \end{array} \right) \right\vert \;s_{t}\,=\,s\!\right] { \!,} \end{array} $$

as well as

$${\Omega}(s)=\text{Var}\left( \left( \widehat{\sigma}_{0,t}(s_{t} ),\,\widehat{\sigma}_{1,t}(s_{t}),\,\widehat{\sigma}_{0,t+1}(s_{t+1} ),\,\widehat{\sigma}_{1,t+1}(s_{t+1})\right) ^{\prime}\,\big\vert\,s_{t} =s\right) . $$

Then, the variance of the second-stage estimates is determined by the sampling noise and the error from the first stage estimates:

$$V=E\left[ M(s_{t})^{-1}\left( E\left[ Z_{t}^{\prime}\text{Var}(\epsilon_{t}|s_{t})Z_{t}\,|\,s_{t}\right] +J(s_{t})\,{\Omega}(s_{t})\,J(s_{t})^{\prime }\right) M(s_{t})^{-1\prime}\,\right] . $$

As an alternative to using the asymptotic formula, we can use the subsampling approach to estimate the variance.
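A minimal sketch of the subsampling idea, under assumptions that are not part of the text above: the function name `subsample_variance`, the generic `estimator` callable, the subsample size `b`, and the number of draws are all hypothetical, and the rescaling by b/n presumes a root-n convergence rate. It illustrates the general technique rather than the exact procedure used for the second-stage estimates.

```python
import numpy as np

def subsample_variance(data, estimator, b, n_draws=500, seed=0):
    """Subsampling estimate of the variance of estimator(data).

    data      : array whose first axis indexes observations (n rows)
    estimator : hypothetical callable mapping a data array to a scalar or vector
    b         : subsample size (a tuning choice, b < n)
    Assumes the estimator converges at rate sqrt(n).
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    draws = []
    for _ in range(n_draws):
        # Draw b observations without replacement and re-estimate.
        idx = rng.choice(n, size=b, replace=False)
        draws.append(np.atleast_1d(estimator(data[idx])))
    draws = np.asarray(draws)
    # Covariance of the size-b estimates, rescaled from size b to size n.
    cov_b = np.atleast_2d(np.cov(draws, rowvar=False))
    return (b / n) * cov_b

# Usage with the sample mean as a stand-in estimator.
x = np.random.default_rng(123).normal(scale=2.0, size=5000)
print(subsample_variance(x, np.mean, b=200))
```

For the sample mean the output can be compared with the textbook value (variance divided by n), which gives a quick sanity check before applying the same routine to a more involved estimator.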

Appendix D: Proof of Theorem 3

In this proof, n denotes the number of borrowers observed t periods after mortgage origination. We introduce the notation for the trinomial logit function \(\ell (z_{1},z_{2})=\frac {\exp (z_{1})} {1+\exp (z_{1})+\exp (z_{2})}\). By \(\tilde {\sigma }_{k,t}^{L}(s)\) we denote the choice probability

$$\tilde{\sigma}_{k,t}^{L}(s)=\ell(\tilde{r}^{L\prime}(t,k)q^{L}(s),\,\tilde {r}^{L\prime}(t,j)q^{L}(s)),\;\;j\neq k $$

where \(\tilde {r}^{L}(t,k)\) are the coefficients of the projection of the log probability ratio \(\log \,\frac {\sigma _{k,t}(s)}{\sigma _{0,t}(s)}\) on the first L orthogonal polynomials. We also denote \(\tilde {\sigma }_{0,t}^{L} (s)=1-\tilde {\sigma }_{1,t}^{L}(s)-\tilde {\sigma }_{2,t}^{L}(s)\). We note that \(\left\vert \frac {\partial \ell }{\partial z_{1}}\right\vert ,\;\left\vert \frac {\partial \ell }{\partial z_{2}}\right\vert \leq \frac {1}{2}.\) Thus, \(\frac {1}{2}\) is a uniform Lipschitz constant and

$$\begin{array}{@{}rcl@{}} \sup\limits_{s\in{\mathcal{S}}}|\tilde{\sigma}_{k,t}^{L}(s)\!\!\! && -\sigma _{k,t}(s)|=\sup\limits_{s\in{\mathcal{S}}}\left\vert \ell\left( \log \frac{\tilde{\sigma}_{k,t}^{L}(s)}{\tilde{\sigma}_{0,t}^{L}(s)},\,\log \frac{\tilde{\sigma}_{j,t}^{L}(s)}{\tilde{\sigma}_{0,t}^{L}(s)}\right) -\ell\left( \log\frac{\sigma_{k,t}(s)}{\sigma_{0,t}(s)},\,\log\frac {\sigma_{j,t}(s)}{\sigma_{0,t}(s)}\right) \right\vert \\ &&\leq\frac{1}{2}\,\sqrt{\sup\limits_{s\in{\mathcal{S}}}\left\vert \log \frac{\tilde{\sigma}_{1,t}^{L}(s)}{\tilde{\sigma}_{0,t}^{L}(s)}-\log \frac{\sigma_{1,t}(s)}{\sigma_{0,t}(s)}\right\vert^{2}+\sup\limits_{s\in {\mathcal{S}}}\left\vert \log\frac{\tilde{\sigma}_{2,t}^{L}(s)}{\tilde{\sigma }_{0,t}^{L}(s)}-\log\frac{\sigma_{2,t}(s)}{\sigma_{0,t}(s)}\right\vert^{2} }=O(L^{-\alpha}) \end{array} $$

for some \(\alpha \geq \frac {1}{2}.\) This guarantees the quality of approximation of the choice probability using a logit transformation of the series expansion.
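The two ingredients of this bound can be checked numerically. The sketch below is illustrative only: the state space [-1, 1], the two log-odds functions in `true_log_odds`, the Legendre basis, and the use of least squares on a grid as the projection are all assumptions made for the example. It evaluates the partial derivatives of the trinomial logit on a grid and compares the sup-norm error of the logit-transformed series approximation with the Lipschitz bound.

```python
import numpy as np
from numpy.polynomial import legendre

def ell(z1, z2):
    """Trinomial logit exp(z1) / (1 + exp(z1) + exp(z2)), computed stably."""
    m = np.maximum(0.0, np.maximum(z1, z2))
    e1, e2 = np.exp(z1 - m), np.exp(z2 - m)
    return e1 / (np.exp(-m) + e1 + e2)

def max_abs_gradient(lim=8.0, n=401):
    """Largest absolute partial derivative of ell over a grid of (z1, z2)."""
    z1, z2 = np.meshgrid(np.linspace(-lim, lim, n), np.linspace(-lim, lim, n))
    p1, p2 = ell(z1, z2), ell(z2, z1)
    d1 = p1 * (1.0 - p1)   # derivative with respect to z1
    d2 = -p1 * p2          # derivative with respect to z2
    return max(np.abs(d1).max(), np.abs(d2).max())

def true_log_odds(s):
    """Hypothetical smooth log-odds log(sigma_k / sigma_0), k = 1, 2."""
    return np.sin(2.0 * s) - 0.5, 0.8 * np.cos(3.0 * s) - 1.0

def sieve_sup_error(L, n_grid=2001):
    """Sup error of the series-logit approximation and its Lipschitz bound."""
    s = np.linspace(-1.0, 1.0, n_grid)
    q = legendre.legvander(s, L - 1)          # first L Legendre polynomials
    z1, z2 = true_log_odds(s)
    r1 = np.linalg.lstsq(q, z1, rcond=None)[0]
    r2 = np.linalg.lstsq(q, z2, rcond=None)[0]
    err = np.max(np.abs(ell(q @ r1, q @ r2) - ell(z1, z2)))
    d1 = np.max(np.abs(q @ r1 - z1))
    d2 = np.max(np.abs(q @ r2 - z2))
    return err, 0.5 * np.sqrt(d1**2 + d2**2)

print("max |gradient| on grid:", max_abs_gradient())
for L in (3, 6, 12):
    print(L, *sieve_sup_error(L))
```

On the grid the derivatives stay below 1/4, so the constant 1/2 used in the proof is conservative, and the approximation error shrinks as more series terms are added.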

Now we omit the index t in the variables (whenever the period under consideration is known), use \({r_{k}^{L}}\) in place of \(r^{L}(t,k)\), and construct the function

$$\begin{array}{@{}rcl@{}} \rho(a,s;{r_{1}^{L}},{r_{2}^{L}})&= & \left( \mathbf{1}\{a=1\}-\mathbf{1} \{a=0\}\right) \ell\left( r_{1}^{L\prime}\,q^{L}(s),\;r_{2}^{L\prime} \,q^{L}(s)\right) \\ && +\left( \mathbf{1}\{a=2\}-\mathbf{1}\{a=0\}\right) \ell\left( r_{2}^{L\prime}\,q^{L}(s),\;r_{1}^{L\prime}\,q^{L}(s)\right) . \end{array} $$

Then we can express the sample quasi-likelihood as

$$\widehat{Q}({r_{1}^{L}},{r_{2}^{L}})=E_{n}\left[ \rho(a,s;{r_{1}^{L}},r_{2}^{L})\right] +E_{n}\left[ \mathbf{1}\{a=0\}\right] , $$

where we adopt the empirical-process notation \(E_{n}[\cdot ]=\frac {1}{n}{\sum }_{i=1}^{n}\). We also introduce the population likelihood with the series expansion

$${Q}({r_{1}^{L}},{r_{2}^{L}})=E\left[ \rho(a,s;{r_{1}^{L}},{r_{2}^{L}})\right] +E\left[ \mathbf{1}\{a=0\}\right] . $$
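To make the definition of ρ concrete, the sketch below evaluates the sample quasi-likelihood for given index values. It assumes the indices \(r_{k}^{L\prime}q^{L}(s)\) have already been computed and are passed in as arrays `z1` and `z2`; the function names `rho` and `q_hat` and the simulated inputs are hypothetical. It also checks an identity that follows directly from the definitions: ρ plus the indicator of a = 0 equals the fitted probability of the action actually chosen, so the sample quasi-likelihood is the average fitted probability of the observed choices.

```python
import numpy as np

def ell(z1, z2):
    """Trinomial logit exp(z1) / (1 + exp(z1) + exp(z2)), computed stably."""
    m = np.maximum(0.0, np.maximum(z1, z2))
    e1, e2 = np.exp(z1 - m), np.exp(z2 - m)
    return e1 / (np.exp(-m) + e1 + e2)

def rho(a, z1, z2):
    """rho(a, s; r1, r2), with z_k standing for r_k' q^L(s)."""
    i0 = (a == 0).astype(float)
    i1 = (a == 1).astype(float)
    i2 = (a == 2).astype(float)
    return (i1 - i0) * ell(z1, z2) + (i2 - i0) * ell(z2, z1)

def q_hat(a, z1, z2):
    """Sample quasi-likelihood E_n[rho] + E_n[1{a = 0}]."""
    return np.mean(rho(a, z1, z2)) + np.mean(a == 0)

# Simulated actions in {0, 1, 2} and index values (illustrative only).
rng = np.random.default_rng(1)
a = rng.integers(0, 3, size=50)
z1, z2 = rng.normal(size=50), rng.normal(size=50)

p1, p2 = ell(z1, z2), ell(z2, z1)
chosen = np.where(a == 1, p1, np.where(a == 2, p2, 1.0 - p1 - p2))
print(q_hat(a, z1, z2), chosen.mean())
```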

Consider function

$$\begin{array}{@{}rcl@{}} f(a,s;\,{r_{1}^{L}},{r_{2}^{L}},\tilde{r}_{1}^{L},\tilde{r}_{2}^{L})&=&\rho (a,s;{r_{1}^{L}},{r_{2}^{L}})-\rho(a,s;\tilde{r}_{1}^{L},\tilde{r}_{2}^{L} )-E[\rho(a,s;{r_{1}^{L}},{r_{2}^{L}})]\\&&+E[\rho(a,s;\tilde{r}_{1}^{L},\tilde{r} _{2}^{L})]. \end{array} $$

Since we established that the function \(\ell (\cdot ,\cdot )\) is Lipschitz, we can evaluate

$$\text{Var}\left( f(a,s;\,{r_{1}^{L}},{r_{2}^{L}},\tilde{r}_{1}^{L},\tilde{r}_{2}^{L})\right) =O\left( L\sup\limits_{k=1,2,\,l\leq L}\Vert r_{k,l} -\tilde{r}_{k,l}\Vert\right) =O(L). $$

where \({r_{k}^{L}}=(r_{k,1},\ldots ,r_{k,L})\). Next we impose a technical assumption that allows us to establish consistency of estimator (4).

Assumption 4

Consider the class of functions indexed by n

$$\begin{array}{@{}rcl@{}} {\mathcal{F}}_{n}&=&\left\{ f(\cdot,\cdot;\,r_{1}^{L_{n}},r_{2}^{L_{n}} ,\tilde{r}_{1}^{L_{n}},\tilde{r}_{2}^{L_{n}})-E[f(\cdot,\cdot;\,r_{1}^{L_{n} },r_{2}^{L_{n}},\tilde{r}_{1}^{L_{n}},\tilde{r}_{2}^{L_{n}})],\;r_{k,l} \in{\Theta},\right.\\&&\left.l\leq L_{n},\,k=1,2\vphantom{\frac{}{}}\right\} , \end{array} $$

where Θ is a compact subset of \(\mathbb {R}\) and \(\tilde {r}_{1} ^{L_{n}}\) and \(\tilde {r}_{2}^{L_{n}}\) are the coefficients of the projections of the population probability ratios on \(L_{n}\) series terms. Then for each \(L_{n}\rightarrow \infty \) such that \(n/(L_{n}\log n)\rightarrow \infty \), the \(\mathbf {L}_{1}\) covering number N for the class \({\mathcal {F}}_{n}\) has the following bound

$$\log\,N\left( \delta,\,{\mathcal{F}}_{n},\,\mathbf{L}_{1}\right) \leq An^{r_{0}}\log\frac{1}{\delta}, $$

where \(0<r_{0}\leq \frac {3}{4}\), and \(r_{0}\downarrow 0\) is assumed to correspond to the factor \(\log n\).

This condition restricts the complexity of the functions created by logit transformations of series expansions. By construction, any \(f\in {\mathcal {F}}_{n}\) is bounded, \(|f|<1\). We established that \(\text{Var}(f)=O(L_{n})\) for \(f\in {\mathcal {F}}_{n}\). The symmetrization inequality (30) in Pollard (1984) holds if \(\varepsilon _{n}/(16n\,\mu _{n}^{2})\leq \frac {1}{2}\). This will occur if \(\frac {n\epsilon _{n}}{N^{2} }\rightarrow \infty \). Provided that the symmetrization inequality holds, we can follow the steps of Theorem 37 in Pollard (1984) to establish the tail bound on the deviations of the sample average of f via a combination of the Hoeffding inequality and the covering number for the class \({\mathcal {F}}_{n} \). As a result, we obtain that

$$\begin{array}{@{}rcl@{}} && P\left( \sup\limits_{f\in{\mathcal{F}}_{n}}\frac{1}{n}\biggl\|E_{n} [f(\cdot)]\biggr\|>8\mu_{n}\right) \\ && \leq2\exp\left( An^{r_{0}}\log\frac{1}{\mu_{n}}\right) \exp\left( -\frac{1}{128}\frac{n{\mu_{n}^{2}}}{L_{n}}\right) +P\left( \sup\limits_{f\in {\mathcal{F}}_{n}}\frac{1}{n}\biggl\|E_{n}[f(\cdot)]^{2}\biggr\|>64L_{n} \right) . \end{array} $$

The second term can be evaluated with the aid of Lemma 33 in Pollard (1984):

$$P\left( \sup\limits_{f\in{\mathcal{F}}_{n}}\frac{1}{n}\biggl\|E_{n} [f(\cdot)]^{2}\biggr\|>64L_{n}\right) \leq4\exp\left( An^{2r_{0}}\log \frac{1}{L_{n}}\right) \exp(-nL_{n}). $$

As a result, we find that

$$\begin{array}{@{}rcl@{}} && P\left( \sup\limits_{f\in{\mathcal{F}}_{n}}\frac{1}{n}\biggl\|E_{n} [f(\cdot)]\biggr\|>8\mu_{n}\right) \\ && \leq2\exp\left( An^{r_{0}}\log\frac{1}{\mu_{n}}\right) \exp\left( -\frac{1}{128}\frac{n{\mu_{n}^{2}}}{L_{n}}\right) +4\exp\left( An^{r_{0}} \log\frac{1}{L_{n}}-nL_{n}\right) . \end{array} $$

We start the analysis with the first term. Consider the case with \(r_{0}>0\). Then the log of the first term takes the form

$$An^{r_{0}}\log(1/\mu_{n})-\frac{1}{128}\frac{n{\mu_{n}^{2}}}{L_{n}}. $$

Then one needs that \(\frac {n{\mu _{n}^{2}}}{L_{n}\,n^{r_{0}}\log \,n} \rightarrow \infty \) if \(r_{0}>0\) and \(\frac {n{\mu _{n}^{2}}}{L_{n}\,\log ^{2} \,n}\rightarrow \infty \) if \(r_{0}\downarrow 0\). Hence the first term is o(1). This condition also guarantees that the second term vanishes. We note also that the CLT applies to the term \(E_{n}\left [ \mathbf {1}\{a=0\}\right ] =E\left [ \mathbf {1}\{a=0\}\right ] +O_{p}(\frac {1}{\sqrt {n}})\). Now for some slowly diverging sequence \(\delta _{n}\) such that \(\mu _{n}=\delta _{n}\sqrt {\frac {L_{n}\,n^{r_{0}}\log \,n}{n}}\rightarrow 0\), we establish that

$$\begin{array}{@{}rcl@{}} \sup\limits_{(r_{1}^{L_{n}},r_{2}^{L_{n}})\in{\Theta}^{L_{n}}\times{\Theta}^{L_{n}}}\!\!\!&&\left\Vert \widehat{Q}({r_{1}^{L}},{r_{2}^{L}})-{Q}({r_{1}^{L}},r_{2} ^{L})+\widehat{Q}(\tilde{r}_{1}^{L},\tilde{r}_{2}^{L})-{Q}(\tilde{r}_{1}^{L},\tilde{r}_{2}^{L})\right\Vert \\&&=O_{p}\left( \mu_{n}+\frac{1}{\sqrt{n}}\right) =o_{p}(1). \end{array} $$

Thus, the sample quasi-likelihood converges uniformly to the population quasi-likelihood and the estimated choice probabilities are uniformly consistent over \(\mathcal {S}\). To establish the rate for the estimated choice probabilities, we consider a neighborhood of the population projections defined by \(\sup \limits _{k=1,2,\,l\leq L_{n}}\Vert r_{k,l}-\tilde {r} _{k,l}\Vert \leq \varepsilon \). Using Lemma 2.3.1 from van der Vaart and Wellner (1996), we can find that

$$E\left[ \sup\limits_{f\in{\mathcal{F}}_{n}}\sqrt{n}E_{n}[f(\cdot)]\right] \leq Cn^{r_{0}/2}\sqrt{L_{n}}\varepsilon\log\frac{1}{\sqrt{L_{n}}\varepsilon }, $$

for some constant C. Using Theorem 3.4.1 from van der Vaart and Wellner (1996) and the derived inequality, we can express the convergence rate for the estimated parameters of the approximated choice probabilities as \(\rho _{n}^{2}n^{r_{0}/2}\sqrt {L_{n}}\frac {1}{\rho _{n}}\log \,\frac {\rho _{n}}{\sqrt {L_{n}}}\leq \sqrt {n}\). Then

$$\sup\limits_{s\in{\mathcal{S}}}\left\Vert \widehat{\sigma}_{k,t} (s)-\tilde{\sigma}_{k,t}(s)\right\Vert =O_{p}\left( \frac{L_{n}}{\rho_{n} }\right) . $$

To attain the rate \(o_{p}(n^{-1/4})\) we need to ensure that \(\frac {L_{n}} {\rho _{n}}=o(n^{-1/4})\). To this end, we choose \(\delta _{n}\rightarrow 0\) and set \(L_{n}=\delta _{n}n^{-1/4}\rho _{n}\). Then the rate constraint can be re-written as

$${\rho_{n}^{2}}n^{-3/8+r_{0}/2}\,\frac{\log\,\frac{n^{1/4}\sqrt{\rho_{n}}} {\sqrt{\delta_{n}}}}{\frac{n^{1/4}\sqrt{\rho_{n}}}{\sqrt{\delta_{n}}}} \leq\sqrt{n}. $$

Since \(\lim \limits _{x\rightarrow \infty }\log \,x\,/\,x=0\), we conclude that \(\rho _{n}=O(n^{7/8-r_{0}/2})\), meaning that \(L_{n}=o(n^{5/8-r_{0}/2})\). We note that the slowest rate for the choice of \(L_{n}\) has to satisfy

$$\frac{n^{1-r_{0}}{\mu_{n}^{2}}}{L_{n}\log\,n}\rightarrow\infty, $$

for \(\mu _{n}\rightarrow 0\). Thus, the estimator with rate \(o(n^{-1/4})\) is attainable if \(r_{0}<3/4\). Using the triangle inequality and our previous result, we find that

$$\sup\limits_{s\in{\mathcal{S}}}\left\Vert \widehat{\sigma}_{k,t}(s)-{\sigma }_{k,t}(s)\right\Vert =O_{p}\left( \frac{L_{n}}{\rho_{n}}+L_{n}^{-\alpha }\right) =o_{p}\left( n^{-1/4}\right) , $$

if \(\alpha \geq 1\).

Cite this article

Bajari, P., Chu, C.S., Nekipelov, D. et al. Identification and semiparametric estimation of a finite horizon dynamic discrete choice model with a terminating action. Quant Mark Econ 14, 271–323 (2016).

Keywords


  • Finite horizon optimal stopping problem
  • Time preferences
  • Semiparametric estimation

JEL Classification

  • C14
  • C18
  • C50