Analysis of an outcome-dependent enriched sample: hypothesis tests

Vahl, Christopher I.; Kang, Qing

doi:10.1007/s10260-014-0285-4

Published: 19 September 2014

Volume 24, pages 387–409 (2015)

Statistical Methods & Applications

Christopher I. Vahl¹ &
Qing Kang²

Abstract

An outcome-dependent sample is generated by a stratified survey design where the stratification depends on the outcome. It is also known as a case–control sample in epidemiological studies and a choice-based sample in econometrical studies. An outcome-dependent enriched sample (ODE) results from combining an outcome-dependent sample with an independently collected random sample. Consider the situation where the conditional probability of a categorical outcome given its covariates follows an explicit model with an unknown parameter whereas the marginal probability of the outcome and its covariates are left unspecified. Profile-likelihood (PL) and weighted-likelihood (WL) methods have been employed to estimate the model parameter from an ODE sample. This article develops the PL- and WL-based families of tests on the model parameter from an ODE sample. Asymptotic properties of their test statistics are derived. The PL likelihood-ratio, Wald and score tests are shown to obey classical inference, i.e. their test statistics are asymptotically equivalent and Chi-squared distributed. In contrast, the WL likelihood-ratio statistic asymptotically has a weighted Chi-squared distribution and is not equivalent to the WL Wald and score statistics. Our theoretical derivation and simulation show that tests based on these new statistics carry nominal type I error and good power. Advantages of ODE sampling together with the implementation of the PL and WL methods are demonstrated in an illustrative example.

References

Agresti A (2002) Categorical data analysis. Wiley-Interscience, Hoboken
Breslow NE, Cain KC (1988) Logistic regression for two-stage case–control data. Biometrika 75:11–20
Breslow NE (1996) Statistics in epidemiology: the case–control study. J Am Stat Assoc 91:14–28
Breslow N, McNeney B, Wellner JA (2003) Large sample theory for semiparametric regression models with two-phase, outcome dependent sampling. Ann Stat 31:1110–1139
Chatterjee N, Chen HY, Breslow NE (2003) A pseudoscore estimator for regression problems with two-phase sampling. J Am Stat Assoc 98:158–168
Chatterjee N, Chen YH (2007) Maximum likelihood inference on a mixed conditionally and marginally specified regression model for genetic epidemiologic studies with two-phase sampling. J R Stat Soc B 69:123–142
Chen HY (2003) A note on the prospective analysis of outcome-dependent samples. J R Stat Soc B 65:575–584
Cosslett SR (1981a) Efficient estimation of discrete-choice models. In: Manski C, McFadden D (eds) Structural analysis of discrete data with econometric applications. The MIT Press, Cambridge, pp 51–111
Cosslett SR (1981b) Maximum likelihood estimator for choice-based samples. Econometrica 49:1289–1316
Doll R, Hill AB (1950) Smoking and carcinoma of the lung. Br Med J 221:739–748
Doll R, Peto R, Boreham J, Sutherland I (2004) Mortality in relation to smoking: 40 years’ observations on male British doctors. Br Med J 328:1519–1527
Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York
Holt D, Ewings PD (1989) Logistic models for contingency tables. In: Skinner CJ, Holt D, Smith TMF (eds) Analysis of complex surveys. Wiley, New York, pp 261–279
Johnson NL, Kotz S (1970) Continuous univariate distributions, vol 2. Houghton Mifflin, Boston
Kang Q, Nelson PI, Vahl CI (2010) Parameter estimation from an outcome-dependent sample using weighted likelihood method. Statist Sinica 20:1529–1550
Kullback S (1997) Information theory and statistics. Dover Publications, New York
Manski CF, Lerman SR (1977) The estimation of choice probabilities from choice based samples. Econometrica 45:1977–1988
Manski CF, McFadden D (1981) Alternative estimators and sample designs for discrete choice analysis. In: Manski C, McFadden D (eds) Structural analysis of discrete data with econometric applications. The MIT Press, Cambridge, MA, pp 2–50
Manski CF, Thompson TS (1989) Estimation of best predictors of binary response. J Econ 40:97–123
Morgenthaler S, Vardi Y (1986) Choice-based samples: a nonparametric approach. J Econ 32:109–125
Prentice RL, Pyke R (1979) Logistic disease incidence models and case–control studies. Biometrika 66:403–411
Rao CR (1973) Linear statistical inference and its applications, 2nd edn. Wiley, New York
Rao JNK, Thomas DR (1989) Chi-squared tests for contingency table. In: Skinner CJ, Holt D, Smith TMF (eds) Analysis of complex surveys. Wiley, New York, pp 89–114
Roberts G, Rao JNK, Kumar S (1987) Logistic regression analysis of sample survey data. Biometrika 74:1–12
Rose S, van der Laan MJ (2009) Why match? Investigating matched case–control study designs with causal effect estimation. Int J Biostat 5: Article 1
Scott A, Wild C (1986) Fitting logistic models under case–control or choice based sampling. J R Stat Soc B 48:170–182
Scott AJ, Wild CJ (1997) Fitting regression models to case–control data by maximum likelihood. Biometrika 84:57–71
Vardi Y (1985) Empirical distributions in selection bias models. Ann Stat 13:178–203
Wang XF, Zhou HB (2006) A semiparametric empirical likelihood method for biased sampling schemes with auxiliary covariates. Biometrics 62:1149–1160
Zhou H, Weaver MA, Qin H, Longnecker MP, Wang MC (2002) A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome. Biometrics 58:413–421
Zhou H, Song R, Wu YS, Qin J (2011) Statistical inference for a two-stage outcome-dependent sampling design with a continuous outcome. Biometrics 67:194–202

Acknowledgments

We thank Paul I. Nelson for his constructive comments on this paper. We also thank the anonymous reviewer and the associate editor for their insightful comments and suggestions.

Author information

Authors and Affiliations

Department of Statistics, Kansas State University, 101 Dickens Hall, Manhattan, KS, 66506, USA
Christopher I. Vahl
117 Firethorn Lane, Manhattan, KS, 66503, USA
Qing Kang

Corresponding author

Correspondence to Christopher I. Vahl.

Appendices

Appendix A: Proof of Theorem 1

First we adopt Rao’s (1973, sec. 6e) strategy to convert ${ LR }^{W}$ into a quadratic form. Note that $\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})=\mathbf{0}$ and $\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}(\hat{{\varvec{\upbeta }}}^{W}))=\mathbf{0}$. The chain rule implies $\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}({\varvec{\upbeta }}))=\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }})\mathbf{M}^{\prime }$. Subject $\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})$ and $\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}(\hat{{\varvec{\upbeta }}}^{W}))$ to first-order Taylor-series expansions at ${\varvec{\uptheta }}^{*}$ and ${\varvec{\upbeta }}^{*}$, respectively, and apply Lemma 1. This leads to

$$\begin{aligned} \hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*}&= -\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{H}^{W})^{-1}+O_p (N^{-1}), \nonumber \\ \hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*}&= -\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}+O_p (N^{-1}). \end{aligned}$$

(6)

Perform second-order Taylor-series expansions on $l_N^W ({\varvec{\uptheta }}^{*})$ at $\hat{{\varvec{\uptheta }}}^{W}$ and $\hat{{\varvec{\upbeta }}}^{W}$, separately, and take the difference. This leads to

$$\begin{aligned} { LR }^{W}+N[(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})\mathbf{H}^{W}(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})^{\prime }-(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})\mathbf{MH}^{W}\mathbf{M}^{\prime }(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})^{\prime }]+O_p (N^{-1/2})=0. \end{aligned}$$

Plugging Formula (6) into the above equation yields

$$\begin{aligned} { LR }^{W}=N\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})\mathbf{O}^{W}[\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})]^{\prime }+o_p (1). \end{aligned}$$
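The substitution can be spelled out as a worked step. This is a sketch that assumes $\mathbf{O}^{W}=\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}\mathbf{M}-(\mathbf{H}^{W})^{-1}$, the analogue of $\mathbf{O}^{P}$ given below for the PL case, and that $\mathbf{H}^{W}$ is symmetric. The shorthand $\mathbf{g}=\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})$ is introduced here only for readability. Since $\mathbf{g}=O_p (N^{-1/2})$, Formula (6) gives

$$\begin{aligned} N(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})\mathbf{H}^{W}(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})^{\prime }&= N\mathbf{g}(\mathbf{H}^{W})^{-1}\mathbf{g}^{\prime }+O_p (N^{-1/2}), \\ N(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})\mathbf{MH}^{W}\mathbf{M}^{\prime }(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})^{\prime }&= N\mathbf{g}\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}\mathbf{M}\mathbf{g}^{\prime }+O_p (N^{-1/2}), \end{aligned}$$

so that ${ LR }^{W}=N\mathbf{g}[\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}\mathbf{M}-(\mathbf{H}^{W})^{-1}]\mathbf{g}^{\prime }+o_p (1)$, which is the quadratic form above under the assumed expression for $\mathbf{O}^{W}$.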

From Johnson and Kotz (1970, pp. 150–151), ${ LR }^{W}$ has the same asymptotic distribution as $\sum _{i=1}^q {[e_i^W \chi _i^2 (1,0)]} $, where $e_1^W \ge \cdots \ge e_q^W $ are the eigenvalues of $\mathbf{O}^{W}\mathbf{V}^{W}$. Note that $-\mathbf{O}^{W}\mathbf{H}^{W}$ is idempotent of rank $r$. By Lemma 1, both $\mathbf{V}^{W}$ and $-\mathbf{H}^{W}$ are $p.d.$ Hence we find $\mathbf{O}^{W}$ to be positive semi-definite of rank $r$ and, subsequently, the eigenvalues of $\mathbf{O}^{W}\mathbf{V}^{W}$ satisfy $e_1^W \ge \cdots \ge e_r^W >0$ and $e_{r+1}^W =\cdots =e_q^W =0$. This completes the proof of Theorem 1$(i)$.
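A weighted Chi-squared limit of this kind has no closed-form tail probability in general, so a p-value is often approximated numerically. The following is a minimal Monte Carlo sketch, not the authors' implementation. It assumes $\mathbf{O}^{W}=\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}\mathbf{M}-(\mathbf{H}^{W})^{-1}$ (mirroring the form of $\mathbf{O}^{P}$), and the function name `weighted_chi2_pvalue` and its arguments are hypothetical; in practice `H` and `V` would be replaced by consistent estimates.

```python
import numpy as np

def weighted_chi2_pvalue(stat, H, V, M, n_draws=200_000, seed=0):
    """Approximate P(sum_i e_i * chi2_1 >= stat) by simulation.

    The weights e_i are the positive eigenvalues of O V, with
    O = M'(M H M')^{-1} M - H^{-1} (an assumed form, see lead-in).
    """
    H, V, M = (np.asarray(x, dtype=float) for x in (H, V, M))
    O = M.T @ np.linalg.solve(M @ H @ M.T, M) - np.linalg.inv(H)
    eig = np.linalg.eigvals(O @ V).real      # real in theory; drop rounding noise
    weights = eig[eig > 1e-10]               # keep the r positive eigenvalues
    rng = np.random.default_rng(seed)
    # Each row is one draw of independent central chi-squared(1) variables.
    draws = rng.chisquare(1, size=(n_draws, weights.size)) @ weights
    return weights, float(np.mean(draws >= stat))

# Toy inputs: q = 3 parameters, the first one fixed under the null (r = 1).
H = -np.eye(3)
V = np.diag([2.0, 1.0, 1.0])
M = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
weights, p = weighted_chi2_pvalue(2 * 3.841459, H, V, M)
print(weights, p)
```

With these toy inputs there is a single positive weight, so the statistic is a scaled Chi-squared variable with one degree of freedom and the simulated p-value should be close to 0.05.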

A similar strategy is used to derive the limiting distribution of ${ LR }^{P}$. Briefly, we have

$$\begin{aligned} { LR }^{P}=N\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})\mathbf{O}^{P}[\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})]^{\prime }+o_p (1), \end{aligned}$$

where $\mathbf{O}^{P}=\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{N}-(\mathbf{H}^{P})^{-1}$, $\mathbf{N}=diag(\mathbf{M},\mathbf{I}_K )$. To prove Theorem 1(ii), it suffices to show that $\mathbf{O}^{P}\mathbf{V}^{P}$ is idempotent of rank $r$. By Lemma 1, $\exists {\varvec{\Gamma }}$ such that . Given the fact that the last $K$ rows of $\mathbf{N}$ are $(\mathbf{0}\quad \mathbf{I}_K )$ and $\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}\mathbf{N}^{\prime }=\mathbf{N}^{\prime }$, we obtain

$$\begin{aligned} \mathbf{O}^{P}\mathbf{V}^{P}=-\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}+\mathbf{I}_{q+K}. \end{aligned}$$

Obviously, $-\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}+\mathbf{I}_{q+K} $ is idempotent of rank $r$.
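The idempotency and rank claim can be checked numerically. The sketch below is illustrative only: it uses an arbitrary negative-definite matrix as a stand-in for $\mathbf{H}^{P}$, assumes $\mathbf{M}=(\mathbf{0}\quad \mathbf{I}_{q-r})$ (the case where the null hypothesis fixes the first $r$ parameters), and the dimensions and the helper name `complement_projector` are hypothetical.

```python
import numpy as np

def complement_projector(H, N_mat):
    """Return I - N'(N H N')^{-1} N H for a full-row-rank N."""
    k = H.shape[0]
    return np.eye(k) - N_mat.T @ np.linalg.solve(N_mat @ H @ N_mat.T, N_mat @ H)

# Hypothetical dimensions: q model parameters, r of them tested, K outcome categories.
q, r, K = 4, 2, 3
M = np.hstack([np.zeros((q - r, r)), np.eye(q - r)])   # assumed form of M
N_mat = np.block([[M, np.zeros((q - r, K))],
                  [np.zeros((K, q)), np.eye(K)]])      # N = diag(M, I_K)

rng = np.random.default_rng(1)
A = rng.standard_normal((q + K, q + K))
H = -(A @ A.T + np.eye(q + K))                         # stand-in for H^P (negative definite)

P = complement_projector(H, N_mat)
print(np.allclose(P @ P, P), np.linalg.matrix_rank(P), round(float(np.trace(P)), 6))
```

The rank equals the trace here because the matrix is idempotent: the trace is $(q+K)-(q-r+K)=r$.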

Appendix B: Proof of Theorem 3

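Partition $\mathbf{H}^{W}$ in accordance with ${\varvec{\uptheta }}=({\begin{array}{ll} {\varvec{\upalpha }}&{} {\varvec{\upbeta }} \\ \end{array} })$ into four submatrices: $\mathbf{H}_{11}^W $, $\mathbf{H}_{12}^W $, $\mathbf{H}_{21}^W $, $\mathbf{H}_{22}^W $. Let $\mathbf{Q}^{W}=(\mathbf{I}_r \quad -\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1})$, $\mathbf{H}^{W11}=[\mathbf{H}_{11}^W -\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W ]^{-1}$, and $\mathbf{R}=(\mathbf{I}_r \quad \mathbf{0})^{\prime }$, so that ${\varvec{\uptheta }}\mathbf{R}={\varvec{\upalpha }}$. Theorem 8.5.11 of Harville (1997) states that $(\mathbf{H}^{W})^{-1}=\left( {\begin{array}{cc} \mathbf{H}^{W11}&{} -\mathbf{H}^{W11}\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1} \\ -(\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W \mathbf{H}^{W11}&{} (\mathbf{H}_{22}^W )^{-1}+(\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W \mathbf{H}^{W11}\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1} \\ \end{array} }\right) $, which yields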

$$\begin{aligned} (\mathbf{H}^{W})^{-1}\mathbf{R}=\left( \mathbf{Q}^{W}\right) ^{\prime }\mathbf{H}^{W11}. \end{aligned}$$

(7)
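Formula (7) is a statement about the first $r$ columns of a partitioned inverse and is easy to verify numerically. The sketch below uses a random symmetric negative-definite matrix as a stand-in for $\mathbf{H}^{W}$ and takes $\mathbf{R}=(\mathbf{I}_r \quad \mathbf{0})^{\prime }$, which is an assumption consistent with how $\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a}$ is used; the function name `formula7_sides` is hypothetical.

```python
import numpy as np

def formula7_sides(H, r):
    """Return (H^{-1} R, Q' H^{11}) for the leading r-by-r partition of H."""
    q = H.shape[0]
    H11, H12 = H[:r, :r], H[:r, r:]
    H21, H22 = H[r:, :r], H[r:, r:]
    H22_inv = np.linalg.inv(H22)
    Q = np.hstack([np.eye(r), -H12 @ H22_inv])          # Q = (I_r, -H12 H22^{-1})
    H_sup11 = np.linalg.inv(H11 - H12 @ H22_inv @ H21)  # inverse Schur complement
    R = np.vstack([np.eye(r), np.zeros((q - r, r))])    # assumed selector of alpha
    return np.linalg.inv(H) @ R, Q.T @ H_sup11

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
H = -(A @ A.T + np.eye(5))          # symmetric negative definite stand-in
lhs, rhs = formula7_sides(H, r=2)
print(np.allclose(lhs, rhs))
```

The equality relies on the symmetry of the matrix, since the lower block of the left-hand side involves $\mathbf{H}_{21}^W$ while $\mathbf{Q}^{W}$ is built from $\mathbf{H}_{12}^W$.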

According to Lemma 1, Slutsky’s theorem and Formula (7), we have

$$\begin{aligned} Score^{W}&= N\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})[\mathbf{Q}^{W}\mathbf{V}^{W} (\mathbf{Q}^{W})^{\prime }]^{-1}[\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})]^{\prime }+o_p (1), \nonumber \\ Wald^{W}&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})[\mathbf{R}^{\prime }(\mathbf{H}^{W})^{-1}\mathbf{V}^{W}(\mathbf{H}^{W})^{-1}\mathbf{R}]^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})^{\prime }+o_p (1) \nonumber \\&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})[\mathbf{H}^{W11}\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }\mathbf{H}^{W11}]^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})^{\prime }+o_p (1) \nonumber \\&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})(\mathbf{H}^{W11})^{-1}[\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }]^{-1}(\mathbf{H}^{W11})^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}\!-\!\mathbf{a})^{\prime }\!+\!o_p (1).\nonumber \\ \end{aligned}$$

(8)

Perform first-order Taylor-series expansions on $\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})$ and $\nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})$ at ${\varvec{\upbeta }}^{*}$, respectively. It follows from $\nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})=\mathbf{0}$ that

$$\begin{aligned} \nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})=\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }+O_p (N^{-1}). \end{aligned}$$

(9)

Also note that $\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})=\mathbf{0}$. Performing a first-order Taylor-series expansion on $\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})$ at $({\begin{array}{ll} \mathbf{a}&{} {{\varvec{\upbeta }}^{*}} \\ \end{array} })$ and applying Formula (7) leads to

$$\begin{aligned} \nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }=-(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})(\mathbf{H}^{W11})^{-1}+O_p (N^{-1}). \end{aligned}$$

(10)

It is thus seen from Formulas (8), (9) and (10) that $Score^{W}=Wald^{W}+o_p (1)$. The asymptotic distribution of $Score^{W}$ and $Wald^{W}$ is a direct result of Lemma 1 and Johnson and Kotz (1970, pp. 150–151). This completes the proof of Theorem 3$(i)$.
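The chain of equalities for $Wald^{W}$ in Formula (8) can be illustrated with a small numerical sketch. It computes the leading term of the statistic in two ways, from the sandwich form in the second line of (8) and from the partitioned form in the last line, using known matrices in place of estimates. The function names are hypothetical and $\mathbf{R}=(\mathbf{I}_r \quad \mathbf{0})^{\prime }$ is assumed.

```python
import numpy as np

def wald_sandwich(theta_hat, a, H, V, n_obs):
    """N (theta_hat R - a) [R' H^{-1} V H^{-1} R]^{-1} (theta_hat R - a)'."""
    theta_hat, a = np.asarray(theta_hat, float), np.asarray(a, float)
    H, V = np.asarray(H, float), np.asarray(V, float)
    q, r = theta_hat.size, a.size
    R = np.vstack([np.eye(r), np.zeros((q - r, r))])  # assumed selector of alpha
    d = theta_hat @ R - a
    H_inv_R = np.linalg.solve(H, R)                   # H^{-1} R
    cov = H_inv_R.T @ V @ H_inv_R                     # uses symmetry of H
    return float(n_obs * d @ np.linalg.solve(cov, d))

def wald_partitioned(theta_hat, a, H, V, n_obs):
    """Same quantity via (H^{11})^{-1} [Q V Q']^{-1} (H^{11})^{-1}."""
    theta_hat, a = np.asarray(theta_hat, float), np.asarray(a, float)
    H, V = np.asarray(H, float), np.asarray(V, float)
    r = a.size
    H11, H12, H21, H22 = H[:r, :r], H[:r, r:], H[r:, :r], H[r:, r:]
    H22_inv = np.linalg.inv(H22)
    Q = np.hstack([np.eye(r), -H12 @ H22_inv])
    schur = H11 - H12 @ H22_inv @ H21                 # equals (H^{11})^{-1}
    u = (theta_hat[:r] - a) @ schur
    return float(n_obs * u @ np.linalg.solve(Q @ V @ Q.T, u))

rng = np.random.default_rng(4)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
H = -(A @ A.T + np.eye(4))      # stand-in for H^W
V = B @ B.T + np.eye(4)         # stand-in for V^W
theta_hat = rng.standard_normal(4)
w1 = wald_sandwich(theta_hat, np.zeros(2), H, V, 50)
w2 = wald_partitioned(theta_hat, np.zeros(2), H, V, 50)
print(np.isclose(w1, w2))
```

The agreement of the two values is exactly Formula (7) applied inside the sandwich covariance.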

To prove Theorem 3(ii), first note that $\mathbf{MH}^{W}\mathbf{M}^{\prime }=\mathbf{H}_{22}^W $. Perform second-order Taylor-series expansions on $l_N^W ({\varvec{\uptheta }}^{*})$ and $l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})$ at $\hat{{\varvec{\uptheta }}}^{W}$ and $\hat{{\varvec{\upbeta }}}^{W}$, separately, and take the difference. This generates

$$\begin{aligned} { LR }^{W}&= 2N[l_N^W ({\varvec{\uptheta }}^{*})-l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})]-N(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})\mathbf{H}^{W}(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})^{\prime }\\&+N(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})\mathbf{H}_{22}^W (\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})^{\prime }+O_p (N^{-1/2}). \end{aligned}$$

For ${\varvec{\upalpha }}^{*}=\mathbf{a}+N^{-1/2}{\varvec{\Delta }}$, we have

$$\begin{aligned} l_N^W ({\varvec{\uptheta }}^{*})-l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})&= N^{-1/2}{\varvec{\Delta }}[\nabla _{\varvec{\upalpha }} l_N^W ({\varvec{\uptheta }}^{*})]^{\prime }-0.5N^{-1}{\varvec{\Delta }}\mathbf{H}_{11}^W {\varvec{\Delta }}^{\prime }+O_p (N^{{-3}/2}), \\ {\varvec{\upbeta }}^{*}-\hat{{\varvec{\upbeta }}}^{W}&= \nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{H}_{22}^W )^{-1}+O_p (N^{-1})\\&= [\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\uptheta }}^{*})-N^{-1/2}{\varvec{\Delta }}\mathbf{H}_{12}^W ](\mathbf{H}_{22}^W )^{-1}+O_p (N^{-1}). \end{aligned}$$

Recall that $\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*}=-\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{H}^{W})^{-1}+O_p (N^{-1})$. Collecting the information above, we then convert ${ LR }^{W}$ to a quadratic form as

$$\begin{aligned} { LR }^{W}&= -[N^{1/2}\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{Q}^{W})^{\prime }-{\varvec{\Delta }}(\mathbf{H}^{W11})^{-1}]\mathbf{H}^{W11}[N^{1/2}\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{Q}^{W})^{\prime }\nonumber \\&-{\varvec{\Delta }}(\mathbf{H}^{W11})^{-1}]^{\prime }+o_p (1) \nonumber \\&= N\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }(-\mathbf{H}^{W11})\mathbf{Q}^{W}[\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})]^{\prime }+o_p (1). \end{aligned}$$

(11)

Apply Cholesky decomposition to the $r\times r\,p.d.$ matrix $\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }$ to get $\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }=\mathbf{LL}^{\prime }$. Let $e_1^W \ge \cdots \ge e_r^W >0$ be the eigenvalues of $-\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{L}$ and let $\mathbf{P}$ be the associated orthogonal matrix of eigenvectors, i.e. $-\mathbf{P}^{\prime }\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{LP}=diag(e_1^W ,\ldots ,e_r^W )$. Further denote $\mathbf{p}_i $ as the $i$th row vector of $\mathbf{P}$. From Johnson and Kotz (1970, pp. 150–151) and Lemma 1, ${ LR }^{W}$ has a limiting distribution of $\sum _{i=1}^r {[e_i^W \chi _i^2 (1,\varphi _i )]} $, $\varphi _i =0.5{\varvec{\Delta }}(\mathbf{L}^{\prime }\mathbf{H}^{W11})^{-1}(\mathbf{p}_i )^{\prime }\mathbf{p}_i (\mathbf{H}^{W11}\mathbf{L})^{-1}{\varvec{\Delta }}^{\prime }$. Because $(\mathbf{L}^{\prime }\mathbf{H}^{W11})^{-1}\mathbf{P}^{\prime }\mathbf{P}(\mathbf{H}^{W11}\mathbf{L})^{-1}$ is $p.d.$, $\varphi _1 =\cdots =\varphi _r =0$ iff ${\varvec{\Delta }}=\mathbf{0}$. It is easy to see that $\mathbf{O}^{W}\mathbf{V}^{W}=-(\mathbf{Q}^{W})^{\prime }\mathbf{H}^{W11}\mathbf{Q}^{W}\mathbf{V}^{W}$ has the same nonzero eigenvalues as $-\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{L}$. This completes the proof of Theorem 3(ii).
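The eigenvalue claim in the last step can be checked numerically. The sketch below is an illustration under assumptions: random positive-definite and negative-definite matrices stand in for $\mathbf{V}^{W}$ and $\mathbf{H}^{W}$, $\mathbf{M}=(\mathbf{0}\quad \mathbf{I}_{q-r})$ is assumed, and $\mathbf{O}^{W}$ is taken as $\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}\mathbf{M}-(\mathbf{H}^{W})^{-1}$. The function name `lr_weights` is hypothetical.

```python
import numpy as np

def lr_weights(H, V, r):
    """Compare eigenvalues of -L'H^{11}L with those of -Q'H^{11}Q V."""
    q = H.shape[0]
    H11, H12 = H[:r, :r], H[:r, r:]
    H21, H22 = H[r:, :r], H[r:, r:]
    H22_inv = np.linalg.inv(H22)
    Q = np.hstack([np.eye(r), -H12 @ H22_inv])
    H_sup11 = np.linalg.inv(H11 - H12 @ H22_inv @ H21)
    L = np.linalg.cholesky(Q @ V @ Q.T)                 # Q V Q' = L L'
    small = np.sort(np.linalg.eigvalsh(-L.T @ H_sup11 @ L))
    O_from_Q = -Q.T @ H_sup11 @ Q
    big = np.sort(np.linalg.eigvals(O_from_Q @ V).real)
    # Assumed closed form of O^W with M = (0, I_{q-r}).
    M = np.hstack([np.zeros((q - r, r)), np.eye(q - r)])
    O_from_M = M.T @ np.linalg.solve(M @ H @ M.T, M) - np.linalg.inv(H)
    return small, big, O_from_Q, O_from_M

rng = np.random.default_rng(3)
q, r = 5, 2
A, B = rng.standard_normal((q, q)), rng.standard_normal((q, q))
H = -(A @ A.T + np.eye(q))
V = B @ B.T + np.eye(q)
small, big, O_from_Q, O_from_M = lr_weights(H, V, r)
print(small, big[-r:])
```

The $q\times q$ product carries $q-r$ additional zero eigenvalues, so only its $r$ largest eigenvalues are compared with those of the $r\times r$ matrix.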

With respect to Theorem 3(iii), partition $\mathbf{H}^{P}$ into $\mathbf{H}_{11}^P $, $\mathbf{H}_{12}^P $, $\mathbf{H}_{21}^P $, $\mathbf{H}_{22}^P $ by separating out ${\varvec{\upalpha }}$ from ${\varvec{\upbeta }}$ and ${\varvec{\upxi }}_{+Y} $. Set $\mathbf{Q}^{P}=(\mathbf{I}_r \quad -\mathbf{H}_{12}^P (\mathbf{H}_{22}^P )^{-1})$ and $\mathbf{H}^{P11}=[\mathbf{H}_{11}^P -\mathbf{H}_{12}^P (\mathbf{H}_{22}^P )^{-1}\mathbf{H}_{21}^P ]^{-1}$. The fact that for some ${\varvec{\Gamma }}$ assures that $-\mathbf{H}^{P11}=[\mathbf{Q}^{P}\mathbf{V}^{P}(\mathbf{Q}^{P})^{\prime }]^{-1}$ (we leave the proof of this equation to the reader). Our formulation of $Score^{P}$ is a direct application of this equality. Analogous to the proof for Theorem 3$(i)$, we have

$$\begin{aligned} Wald^{P}=Score^{P}+o_p (1)=\chi ^{2}(r,-0.5{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}{\varvec{\Delta }}^{\prime })+o_p (1). \end{aligned}$$

Like Formula (11), ${ LR }^{P}$ can be converted to a quadratic form as

$$\begin{aligned} { LR }^{P}&= -[N^{1/2}\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})(\mathbf{Q}^{P})^{\prime }-{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}]\mathbf{H}^{P11}[N^{1/2}\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})(\mathbf{Q}^{P})^{\prime }\\&-{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}]^{\prime }+o_p (1) \\&= -N\nabla _{\varvec{\upupsilon }} l_N^P (\mathbf{a},{\varvec{\upbeta }}^{*},{\varvec{\upxi }}_{+Y}^*)(\mathbf{Q}^{P})^{\prime }\mathbf{H}^{P11}\mathbf{Q}^{P}[\nabla _{\varvec{\upupsilon }} l_N^P (\mathbf{a},{\varvec{\upbeta }}^{*},{\varvec{\upxi }}_{+Y}^*)]^{\prime }+o_p (1) \\&= -N\nabla _{\varvec{\upalpha }} l_N^P (\mathbf{a},\hat{{\varvec{\upbeta }}}^{P},{\breve{{\varvec{\upxi }}}}_{+Y} )\mathbf{H}^{P11}[\nabla _{\varvec{\upalpha }} l_N^P (\mathbf{a},\hat{{\varvec{\upbeta }}}^{P},{\breve{{\varvec{\upxi }}}}_{+Y} )]^{\prime }+o_p (1)=Score^{P}+o_p (1). \end{aligned}$$

About this article

Cite this article

Vahl, C.I., Kang, Q. Analysis of an outcome-dependent enriched sample: hypothesis tests. Stat Methods Appl 24, 387–409 (2015). https://doi.org/10.1007/s10260-014-0285-4

Accepted: 11 September 2014
Published: 19 September 2014
Issue Date: September 2015
DOI: https://doi.org/10.1007/s10260-014-0285-4
