Constrained Maximum Likelihood Estimation under Logistic Regression Models Based on Case-Control Data
Abstract
We study constrained maximum likelihood estimation and tests under the logistic regression model based on case-control data. Our approach is based on the semiparametric profile log likelihood function under a two-sample semiparametric model, which is equivalent to the assumed logistic regression model. We show that the semiparametric likelihood ratio statistic, the Lagrangian multiplier statistic, and the Wald statistic are asymptotically equivalent and that they have an asymptotic chi-squared distribution under the null hypothesis and an asymptotic noncentral chi-squared distribution under local alternatives to the null hypothesis. Moreover, we demonstrate that the three test statistics and their asymptotic distributions may be obtained by fitting the prospective logistic regression model to case-control data. We present simulation results and analyses of two real data sets.
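As an illustration of the last claim, the following is a minimal sketch (not the authors' code) that fits the ordinary prospective logistic model to simulated case-control data and computes the likelihood ratio, Lagrangian multiplier (score), and Wald statistics. It assumes a single scalar covariate and the simple constraint H0: beta = 0; the function names, sample sizes, and simulated distributions are all hypothetical choices made for the example. The fitted intercept reflects the case-control sampling fractions and should not be read as the population intercept.

```python
import math
import random


def loglik(a, b, xs, ys):
    """Prospective logistic log likelihood for intercept a and slope b."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = a + b * x
        # log(1 + exp(eta)) written in a numerically stable form
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


def score_and_info(a, b, xs, ys):
    """Score vector (ua, ub) and information entries (iaa, iab, ibb)."""
    ua = ub = iaa = iab = ibb = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        w = p * (1.0 - p)
        ua += y - p
        ub += (y - p) * x
        iaa += w
        iab += w * x
        ibb += w * x * x
    return (ua, ub), (iaa, iab, ibb)


def fit(xs, ys, iters=50, tol=1e-10):
    """Unconstrained MLE by Newton-Raphson (2x2 system solved in closed form)."""
    a = b = 0.0
    for _ in range(iters):
        (ua, ub), (iaa, iab, ibb) = score_and_info(a, b, xs, ys)
        det = iaa * ibb - iab * iab
        da = (ibb * ua - iab * ub) / det
        db = (iaa * ub - iab * ua) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def three_tests(xs, ys):
    """Return (LR, score, Wald) statistics for H0: b = 0, each on 1 df."""
    n, n1 = len(ys), sum(ys)
    a_hat, b_hat = fit(xs, ys)
    # Constrained MLE under H0: only the intercept is free
    a0 = math.log(n1 / (n - n1))
    lr = 2.0 * (loglik(a_hat, b_hat, xs, ys) - loglik(a0, 0.0, xs, ys))
    # Wald: b_hat^2 divided by the (b, b) entry of the inverse information
    _, (iaa, iab, ibb) = score_and_info(a_hat, b_hat, xs, ys)
    wald = b_hat ** 2 * (iaa * ibb - iab ** 2) / iaa
    # Score: evaluated at the constrained estimate (a0, 0)
    (_, ub0), (jaa, jab, jbb) = score_and_info(a0, 0.0, xs, ys)
    score = ub0 ** 2 * jaa / (jaa * jbb - jab ** 2)
    return lr, score, wald


# Hypothetical case-control sample: controls ~ N(0, 1), cases ~ N(0.5, 1),
# so the case density is an exponential tilt of the control density.
random.seed(0)
controls = [random.gauss(0.0, 1.0) for _ in range(200)]
cases = [random.gauss(0.5, 1.0) for _ in range(200)]
xs = controls + cases
ys = [0] * len(controls) + [1] * len(cases)
lr, score, wald = three_tests(xs, ys)
print(f"LR={lr:.3f}  score={score:.3f}  Wald={wald:.3f}")
```

Each statistic would be compared with a chi-squared distribution on one degree of freedom. The three values generally differ in finite samples; the abstract's equivalence result is asymptotic.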
AMS Subject Classification
Primary 62G05, 62G10, 62G20
Keywords
Biased sampling problem; Case-control data; Chi-squared; Consistency; Constrained estimation; Fisher information; Lagrangian multiplier statistic; Local alternative; Mixture sampling; Profile likelihood; Semiparametric likelihood ratio statistic; Wald statistic