Abstract
Duration data often suffer from both left-truncation and right-censoring. We show how both deficiencies can be overcome at the same time when estimating the hazard rate nonparametrically by kernel smoothing with the nearest-neighbor bandwidth. Smoothing Turnbull’s estimator of the cumulative hazard rate, we derive strong uniform consistency of the estimate from Hoeffding’s inequality, applied to a generalized empirical distribution function. We also apply our estimator to rating transitions of corporate loans in Germany.
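The estimation idea described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes the familiar form in which the increments \(\delta_{i}/Y(X_{i})\) of the cumulative hazard estimate are smoothed by a kernel, where \(Y(s)\) counts the subjects with \(L_{j}\le s\le X_{j}\) (the risk set under left-truncation), and the bandwidth at \(t\) is the distance from \(t\) to its \(k\)-th nearest observed duration. The Epanechnikov kernel, all function names, and the simulation design (unit exponential durations, uniform truncation times, exponential censoring times) are assumptions made for illustration only.

```python
import random


def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (an illustrative choice of kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def risk_set_size(s, trunc, obs):
    """Number of subjects under observation at time s, i.e. L_j <= s <= X_j."""
    return sum(1 for lj, xj in zip(trunc, obs) if lj <= s <= xj)


def nn_hazard(t, obs, trunc, event, k):
    """Kernel-smoothed hazard estimate at t with a k-nearest-neighbor bandwidth."""
    # Bandwidth: distance from t to its k-th nearest observed duration.
    r = sorted(abs(t - xi) for xi in obs)[k - 1]
    if r <= 0.0:
        raise ValueError("bandwidth is zero; increase k")
    total = 0.0
    for xi, di in zip(obs, event):
        weight = epanechnikov((t - xi) / r)
        if di and weight > 0.0:
            # Increment of the cumulative hazard estimate at an uncensored time.
            total += weight / risk_set_size(xi, trunc, obs)
    return total / r


def simulate(n, seed=0):
    """Draw n left-truncated, right-censored durations with true hazard rate 1."""
    rng = random.Random(seed)
    obs, trunc, event = [], [], []
    while len(obs) < n:
        t = rng.expovariate(1.0)   # duration of interest (hypothetical design)
        c = rng.expovariate(0.3)   # censoring time (hypothetical design)
        l = rng.random()           # truncation time (hypothetical design)
        x = min(t, c)
        if l <= x:                 # a subject is only seen if L <= X
            obs.append(x)
            trunc.append(l)
            event.append(t <= c)
    return obs, trunc, event
```

With `obs, trunc, event = simulate(1500)`, a call such as `nn_hazard(1.0, obs, trunc, event, k=300)` should return a value close to the true hazard rate of 1, up to sampling noise.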
References
Bluhm, C., Overbeck, L., Wagner, C.: An Introduction to Credit Risk Modeling. Chapman & Hall, London (2002)
Einmahl, U., Mason, D.: Uniform in bandwidth consistency of kernel-type function estimators. Ann. Stat. 33, 1380–1403 (2005)
Gefeller, O., Dette, H.: Nearest neighbour kernel estimation of the hazard function from censored data. J. Stat. Comput. Simul. 43, 93–101 (1992)
Gefeller, O., Weißbach, R., Bregenzer, T.: The implementation of a data-driven selection procedure for the smoothing parameter in nonparametric hazard rate estimation using SAS/IML software. In: Friedl, H., Berghold, A., Kauermann, G. (eds.) Proceedings of the 13th SAS European Users Group International Conference, Stockholm, pp. 1288–1300. SAS Institute, Cary (1996)
Goto, F.: Achieving semiparametric efficiency bounds in left-censored duration models. Econometrica 64(2), 439–442 (1996)
Grillenzoni, C.: Robust nonparametric estimation of the intensity function of point data. AStA Adv. Stat. Anal. 92, 117–134 (2008)
Hewitt, E., Savage, L.: Symmetric measures on Cartesian products. Trans. Am. Math. Soc. 80, 470–501 (1955)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30 (1963)
Kiefer, N., Larson, C.: A simulation estimator for testing the time homogeneity of credit rating transitions. J. Empir. Finance 14, 818–835 (2007)
Kim, Y.-D., James, L., Weißbach, R.: Bayesian analysis of multi-state event history data: Beta–Dirichlet process prior. Biometrika 99, 127–140 (2012)
Knüppel, L., Hermsen, O.: Median split, k-group split, and optimality in continuous populations. AStA Adv. Stat. Anal. 94, 53–74 (2010)
Li, D., Li, Q.: Nonparametric/semiparametric estimation and testing of econometric models with data dependent smoothing parameters. J. Econom. 157, 179–190 (2010)
Merton, R.: On the pricing of corporate debt: the risk structure of interest rates. J. Finance 29, 449–470 (1974)
Schäfer, H.: Local convergence of empirical measures in the random censorship situation with application to density and rate estimators. Ann. Stat. 14, 1240–1245 (1986)
Shorack, G., Wellner, J.: Empirical Processes with Application to Statistics. Wiley, New York (1986)
Silverman, B.: Density Estimation. Chapman & Hall, London (1986)
Stute, W.: Almost sure representations of the product-limit estimator for truncated data. Ann. Stat. 21, 146–156 (1993)
Turnbull, B.W.: The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc., Ser. B, Stat. Methodol. 38, 290–295 (1976)
Weißbach, R., Mollenhauer, T.: Modelling rating transitions. J. Korean Stat. Soc. 4, 469–485 (2011)
Weißbach, R., Pfahlberg, A., Gefeller, O.: Double-smoothing in kernel hazard rate estimation. Methods Inf. Med. 47, 167–173 (2008)
Weißbach, R., Tschiersch, P., Lawrenz, C.: Testing time-homogeneity of rating transitions after origination of debt. Empir. Econ. 36, 575–596 (2009)
Weißbach, R., Walter, R.: A likelihood ratio test for stationarity of rating transitions. J. Econom. 155, 188–194 (2010)
Weißbach, R.: A general kernel functional estimator with general bandwidth—strong consistency and applications. J. Nonparametr. Stat. 18, 1–12 (2006)
Wied, D., Weißbach, R.: Consistency of the kernel density estimator—a survey. Stat. Pap. 53, 1–21 (2012)
Woodroofe, M.: Estimating a distribution function with truncated data. Ann. Stat. 13, 163–177 (1985)
Acknowledgement
Financial support by Deutsche Forschungsgemeinschaft is gratefully acknowledged (SFB 823 and Grant WE3573/2).
Appendices
Appendix A: Proof of Theorem 2
The proof of Theorem 2 proceeds in four steps. First, for an interval \(I:=[a,b]\subseteq[A,B]\), we establish an exponential bound for the distribution of the difference \(|\varPsi _{n}^{*}(I) - \varPsi(I)|\):
for all \(p>0\), \(\varepsilon>0\), \(n\in\mathbb{N}_{>0}\) and for each fixed \(I\subseteq[A,B]\) with \(\varPsi(I)\le p\).
Because of definition (4) and the boundedness of \(0 \le \varDelta ^{x}_{i} \le \varDelta _{\max} < \infty\)
is the arithmetic mean of the n independent and bounded random variables for each fixed I⊆[A,B], distributed as
The expectation, the variance and the bound of \(T_{I}\) can then be calculated for fixed \(I\subseteq[A,B]\) with \(\varPsi(I)\le p\).
The expectation of \(T_{I}\) follows from assumption (B3):
From assumption (B1), we get the following bound on \(|T_{I}|\) on \([A,B]\):
The variance of \(T_{I}\) can be obtained from the expectation (11) and the bound (12) as follows:
Combining (10), (11), (12), and (13) with the inequality of Hoeffding (1963) yields the following upper bound:
for each fixed interval I⊆[A,B] with Ψ(I)≤p.
In the second step we derive the inequality
almost surely for a constant \(C>\sqrt{2(2\varDelta _{\max} M + \varPsi(B))}\) and large n.
On the right-hand side of inequality (9), \(p\) and \(\varepsilon\) can be replaced by \(p_{n}\) and \(\varepsilon_{n}:=C\sqrt{\log{(n)}p_{n}/n}\) for \(C>0\) and \(n>1\), which changes the upper bound to
The series \((A_{n})\) is then summable, starting from some large \(n<\infty\), only if the exponent \(\beta_{n}:=(C^{2}p_{n})/(2g(p_{n}+\varepsilon_{n}))>1\). From \(\varepsilon_{n}/p_{n} = C\sqrt{\log(n)/(np_{n})}\) and the assumptions on \(p_{n}\) it follows that \(\varepsilon_{n}/p_{n}\rightarrow 0\) and \(p_{n}/(p_{n}+\varepsilon_{n})\rightarrow 1\) as \(n\rightarrow\infty\). The condition \(\beta_{n}>1\) can then be achieved with \(C^{2}/(2g)>1\), i.e. \(C > \sqrt{2g}\).
Consequently, the series \((A_{n})\) is summable from some large \(n<\infty\) onward, and only for \(C>\sqrt{2g}\). For each \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\) there then exist \(C>\sqrt{2g}\) and \(m<\infty\), \(m \in \mathbb{N}\), such that \(\sum_{n=m}^{\infty} P(|\varPsi _{n}^{*}(I)-\varPsi(I)| > \varepsilon_{n}) < \sum_{n=m}^{\infty} A_{n} < \infty\), while for all \(m < \infty\), \(m \in \mathbb{N}\), we have \(\sum_{n=1}^{m} P(|\varPsi _{n}^{*}(I)-\varPsi(I)| > \varepsilon_{n}) \le m < \infty\).
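The behavior of the exponent can be checked numerically. The sketch below is only an illustration under stated assumptions: the value \(g=1\) is a placeholder for the constant \(g\), and \(p_{n}=n^{-1/2}\) is a hypothetical sequence chosen because it satisfies \(p_{n}\rightarrow 0\) and \(\log(n)/(np_{n})\rightarrow 0\). It evaluates \(\beta_{n}\) for one constant below and one above \(\sqrt{2g}\).

```python
import math


def beta_n(n, C, g, p_n):
    """Exponent beta_n = C^2 p_n / (2 g (p_n + eps_n)), eps_n = C sqrt(log(n) p_n / n)."""
    eps_n = C * math.sqrt(math.log(n) * p_n / n)
    return C * C * p_n / (2.0 * g * (p_n + eps_n))


def p_seq(n):
    """Hypothetical sequence p_n = n^(-1/2); not taken from the paper."""
    return n ** -0.5


g = 1.0                       # placeholder value for the constant g
for C in (1.2, 1.6):          # sqrt(2 g) is roughly 1.414 for g = 1
    for n in (10**4, 10**6, 10**9, 10**12):
        value = beta_n(n, C, g, p_seq(n))
        print(f"C={C}, n={n:.0e}: beta_n={value:.4f}")
```

Because \(p_{n}/(p_{n}+\varepsilon_{n})<1\), the exponent approaches \(C^{2}/(2g)\) from below. It can therefore exceed 1 for large \(n\) only when \(C>\sqrt{2g}\), and for smaller \(C\) it stays below 1 for every \(n\).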
Because of the summability of \(P(|\varPsi _{n}^{*}(I)-\varPsi(I)| > \varepsilon_{n})\),
results from the Borel–Cantelli lemma for \(C>\sqrt{2g}\), i.e. almost surely \(|\varPsi _{n}^{*}(I)-\varPsi(I)|\) exceeds \(\varepsilon_{n}\) for at most finitely many \(n\). For large \(n\) and for all \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\), we derive almost surely that \(|\varPsi _{n}^{*}(I)-\varPsi(I)| \le C\sqrt{\log{(n)}p_{n}/n}\).
The same inequality holds for the supremum of \(|\varPsi _{n}^{*}(I)-\varPsi(I)|\) on [A,B]: \(\sup_{I \subseteq [A,B], \varPsi(I)\le p_{n}}|\varPsi _{n}^{*}(I)-\varPsi(I)| \le C\sqrt{\log{(n)}p_{n}/n}\) for \(C>\sqrt{2g}\) and large n almost surely.
Using the results above we prove the following inequality in a third step:
almost surely for some \(C>D_{G}\cdot M\) and large \(n\).
From assumption (B4) and the limit superior formulation of Hewitt and Savage (1955) we obtain the upper bound
almost surely for \(C_{1}'>D_{G}\), large \(n\) and all \(x\in[A,B]\). These bounds can be rewritten for \(G_{n}(x)\) as follows:
From assumption (B4) we have \(\inf_{t\in[A,B]}G(t)>0\). Because \(\sqrt {\log {(n)} / n} \rightarrow 0\), the following inequalities hold for \(x\in[A,B]\) and large \(n\):
and
The following bounds for \(\varPsi _{n}^{*}(I)-\varPsi(I)\) and \(\varPsi _{n}^{*}(I)\) result from (14) almost surely for \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\), large \(n\) and \(C_{2}'>\sqrt{2\cdot(2\varDelta _{\max} M + \varPsi(B))}\):
and consequently
We then obtain the following equation from assumption (B2) almost surely for each \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\) and large \(n\):
Since \(p_{n} + C_{2}'\sqrt{\log{(n)}p_{n}/n} = p_{n}[1 + C_{2}'\sqrt{\log{(n)}/(p_{n}n)}]\), the term \(C_{2}'\sqrt{\log{(n)}/(p_{n}n)}\) can be neglected for large \(n\) because of the assumptions on \(p_{n}\). For large \(n\), we can also neglect the term \(\sqrt {\log {(n)} / n}\) in the numerator. For all \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\) and for large \(n\), we derive the inequality
almost surely.
The desired bound
results for some \(C>D_{G}\cdot M\) and large \(n\) almost surely.
In a final step we examine the expression \(\sup_{I \subseteq [A,B], \varPsi(I)\le p_{n}}|\varPsi _{n}(I)-\varPsi (I)|\). This overall difference can be represented by the sum of the deviations of the empirical and theoretical measures \(\varPsi _{n}(I)\) and \(\varPsi(I)\) from the preliminary measure \(\varPsi _{n}^{*}( I)\) as follows:
Because the ratio \(p_{n}\sqrt{\log (n)/n}/\sqrt{\log (n)p_{n}/n} = \sqrt{p_{n}}\) approaches zero, the inequality \(p_{n}\sqrt{\log (n)/n} \le \sqrt{\log (n)p_{n}/n}\) holds for large \(n\).
The previously mentioned upper bounds of \(|\varPsi _{n}(I)-\varPsi _{n}^{*}(I)|\) and \(|\varPsi _{n}^{*}(I)-\varPsi (I)|\) imply the existence of a constant \(C > \sqrt{2\cdot(2\varDelta _{\max} M + \varPsi(B))} + D_{G} \cdot M\), such that almost surely for large n
Due to the symmetry of \(\varPsi _{n}(I)\), the limit superior formulation of the convergence follows from Hewitt and Savage (1955).
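The final step can be summarized in one chain of inequalities. The display below is a sketch pieced together from the bounds and constants quoted in the text and is not a reproduction of the original displays. The names \(C_{1}\) and \(C_{2}\) for the constants of the third and second step are introduced here for illustration, and the form \(C_{1}p_{n}\sqrt{\log(n)/n}\) of the third-step bound is inferred from the comparison made in the final step. All suprema run over \(I\subseteq[A,B]\) with \(\varPsi(I)\le p_{n}\), and the inequalities are understood almost surely for large \(n\).

\[
\begin{aligned}
\sup_{I}|\varPsi_{n}(I)-\varPsi(I)|
 &\le \sup_{I}|\varPsi_{n}(I)-\varPsi_{n}^{*}(I)| + \sup_{I}|\varPsi_{n}^{*}(I)-\varPsi(I)| \\
 &\le C_{1}\,p_{n}\sqrt{\log(n)/n} + C_{2}\sqrt{\log(n)p_{n}/n} \\
 &\le (C_{1}+C_{2})\sqrt{\log(n)p_{n}/n},
\end{aligned}
\]

where \(C_{1}>D_{G}\cdot M\) and \(C_{2}>\sqrt{2\cdot(2\varDelta_{\max}M+\varPsi(B))}\). The last line uses \(p_{n}\sqrt{\log(n)/n}\le\sqrt{\log(n)p_{n}/n}\) for large \(n\), and the sum \(C_{1}+C_{2}\) matches the lower bound stated for the constant \(C\).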
Appendix B: Proof of Corollary 3
The boundedness of the \(\varDelta ^{x}_{i}\) for each \(x\in[A,B]\) and conditions (B1) and (B2) follow from the definition of \(\varDelta ^{x}_{i}\), because the variables \(\varDelta ^{x}_{i}\) do not depend on \(x\).
The consistency of the estimator \(G_{n}(\cdot)\), condition (B4), can be shown by a slight modification of the law of the iterated logarithm (see Shorack and Wellner 1986, p. 504).
Assumption (A2) for \(F(\cdot)\) implies that the cumulative hazard rate \(\varLambda(\cdot)\) is strictly increasing and that the hazard rate \(\lambda(\cdot)\) is strictly positive on \([A,B]\).
Now only condition (B3) remains to be verified. We note that the vectors \(S_{i}=(X_{i},L_{i},\delta_{i})\), \(i=1,\ldots,n\), are observable only if \(L_{i}\le X_{i}\). Hence, we derive the following conditional expectation:
where \(F^{X,\delta}(x,y)=P(X\le x,\delta\le y\mid L\le X)\) is the conditional distribution function of \((X,\delta)\).
The integral \(\int_{x_{1} \in I} dF^{X, \delta}(x_{1},1)\) for the intervals \(I:=[a,b]\subseteq[A,B]\) can now be calculated. First we express the probability \(P(X_{i}\in I,\delta_{i}=1\mid L_{i}\le X_{i})\) in terms of the non-observable vector \((T_{i},L_{i},C_{i})\) as follows:
where \(\alpha=P(L_{i}\le X_{i})\). Hence, we express the probabilities \(P(X_{i}\in I,\delta_{i}=1\mid L_{i}\le X_{i})\) and \(P(T_{i}\in I,L_{i}\le T_{i}\le C_{i})\) as the following expectations of Bernoulli variables:
and
It follows from expressions (16), (17), and (18) that \(dF^{X,\delta}(x,1)=\alpha^{-1}F^{L}(j)(1-F^{C}(j))\,dF(x)\). Consequently, the expectation (15) can be written as follows:
Obviously, conditions (B1)–(B4) hold and the local convergence
follows for a constant \(D \le 2(\sqrt{2\cdot(2M + \varLambda(B))} + 2 M)\).
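The relation between the observable probability \(P(X_{i}\in I,\delta_{i}=1\mid L_{i}\le X_{i})\) and the latent vector \((T_{i},L_{i},C_{i})\) used in the verification of (B3) can be checked by simulation. The sketch below rests on assumptions that are not taken from the paper: \(T\), \(L\) and \(C\) are drawn independently, with unit exponential durations, uniform truncation times on \([0,1]\) and exponential censoring times with rate 0.5, and the distribution functions in the density relation are read as evaluated at \(x\). Under these assumptions it compares a Monte Carlo estimate of the conditional probability with \(\alpha^{-1}\int_{I}F^{L}(x)(1-F^{C}(x))\,dF(x)\).

```python
import math
import random


def simulate_fraction(n, a, b, seed=0):
    """Monte Carlo estimates of P(X in [a, b], delta = 1 | L <= X) and of alpha."""
    rng = random.Random(seed)
    observed = 0
    hits = 0
    for _ in range(n):
        t = rng.expovariate(1.0)   # T with F(x) = 1 - exp(-x)       (assumed)
        c = rng.expovariate(0.5)   # C with F^C(x) = 1 - exp(-x / 2) (assumed)
        l = rng.random()           # L with F^L(x) = min(x, 1)       (assumed)
        x = min(t, c)
        if l <= x:                 # the subject enters the sample
            observed += 1
            if t <= c and a <= x <= b:
                hits += 1
    return hits / observed, observed / n


def integral_formula(a, b, alpha, steps=20000):
    """alpha^{-1} times the integral of F^L(x) (1 - F^C(x)) dF(x) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h      # midpoint rule
        total += min(x, 1.0) * math.exp(-0.5 * x) * math.exp(-x) * h
    return total / alpha


# For this design X = min(T, C) is exponential with rate 1.5, so
# alpha = P(L <= X) = E[min(X, 1)] = (1 - exp(-1.5)) / 1.5.
alpha_exact = (1.0 - math.exp(-1.5)) / 1.5
mc, alpha_mc = simulate_fraction(200_000, 0.5, 1.5, seed=42)
formula = integral_formula(0.5, 1.5, alpha_exact)
print(f"simulated: {mc:.4f}, formula: {formula:.4f}")
print(f"alpha simulated: {alpha_mc:.4f}, alpha exact: {alpha_exact:.4f}")
```

The two printed probabilities should agree up to Monte Carlo error, as should the two values of \(\alpha\).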
Weißbach, R., Poniatowski, W. & Krämer, W. Nearest neighbor hazard estimation with left-truncated duration data. AStA Adv Stat Anal 97, 33–47 (2013). https://doi.org/10.1007/s10182-012-0194-5
DOI: https://doi.org/10.1007/s10182-012-0194-5