Journal of Statistical Theory and Practice, Volume 2, Issue 3, pp 369–383

Constrained Maximum Likelihood Estimation under Logistic Regression Models Based on Case-Control Data

  • Biao Zhang


We study constrained maximum likelihood estimation and tests under the logistic regression model based on case-control data. Our approach is based on the semiparametric profile log likelihood function under a two-sample semiparametric model, which is equivalent to the assumed logistic regression model. We show that the semiparametric likelihood ratio statistic, the Lagrangian multiplier statistic, and the Wald statistic are asymptotically equivalent, and that they have an asymptotic chi-squared distribution under the null hypothesis and an asymptotic noncentral chi-squared distribution under local alternatives to the null hypothesis. Moreover, we demonstrate that the three test statistics and their asymptotic distributions may be obtained by fitting the prospective logistic regression model to case-control data. We present simulation results and analyses of two real data sets.
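As a rough illustration of the last point, the sketch below fits a prospective logistic regression to simulated case-control data and computes likelihood ratio, Wald, and score (Lagrangian multiplier) statistics. It is not the paper's semiparametric procedure: it assumes the simplest kind of constraint (a subset of coefficients equal to zero), the function names `fit_logistic` and `three_tests` are hypothetical, and the simulated normal covariates are an assumption made only so the example runs.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of a prospective logistic model.

    Returns the coefficient estimate, the maximized log likelihood,
    and the observed information matrix at the estimate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)
        info = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, loglik, info


def three_tests(X, y, idx):
    """LR, Wald, and score statistics for H0: beta[idx] = 0 (hypothetical helper)."""
    idx = np.atleast_1d(np.asarray(idx))
    keep = np.setdiff1d(np.arange(X.shape[1]), idx)

    beta_full, ll_full, info_full = fit_logistic(X, y)
    beta_null, ll_null, _ = fit_logistic(X[:, keep], y)

    # Likelihood ratio: twice the gap between unconstrained and constrained fits.
    lr = 2.0 * (ll_full - ll_null)

    # Wald: quadratic form in the unconstrained estimate of the tested block.
    cov_full = np.linalg.inv(info_full)
    b = beta_full[idx]
    wald = b @ np.linalg.solve(cov_full[np.ix_(idx, idx)], b)

    # Score: gradient and information evaluated at the constrained estimate.
    beta0 = np.zeros(X.shape[1])
    beta0[keep] = beta_null
    p0 = 1.0 / (1.0 + np.exp(-X @ beta0))
    u = X.T @ (y - p0)
    info0 = X.T @ (X * (p0 * (1.0 - p0))[:, None])
    score = u @ np.linalg.solve(info0, u)
    return lr, wald, score


# Simulated case-control sample (an assumption for illustration only):
# controls have covariates N(0, I); cases have the first covariate shifted by 1,
# so the density ratio is log-linear in x1 and does not depend on x2.
rng = np.random.default_rng(0)
n0, n1 = 200, 200
controls = rng.normal(size=(n0, 2))
cases = rng.normal(size=(n1, 2))
cases[:, 0] += 1.0
X = np.column_stack([np.ones(n0 + n1), np.vstack([controls, cases])])
y = np.concatenate([np.zeros(n0), np.ones(n1)])

lr_null, wald_null, score_null = three_tests(X, y, idx=[2])  # H0 true
lr_alt, wald_alt, score_alt = three_tests(X, y, idx=[1])     # H0 false
print("x2 (no effect):", lr_null, wald_null, score_null)
print("x1 (effect):   ", lr_alt, wald_alt, score_alt)
```

Each statistic would be compared with a chi-squared distribution whose degrees of freedom equal the number of constrained coefficients (one here, so the 5% critical value is about 3.84). In a sample of this size the three values for a given hypothesis should be close to one another, in line with the asymptotic equivalence stated above.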

AMS Subject Classification

Primary 62G05, 62G10, 62G20


Keywords

Biased sampling problem; Case-control data; Chi-squared; Consistency; Constrained estimation; Fisher information; Lagrangian multiplier statistic; Local alternative; Mixture sampling; Profile likelihood; Semiparametric likelihood ratio statistic; Wald statistic




Copyright information

© Grace Scientific Publishing 2008

Authors and Affiliations

  1. Department of Mathematics, University of Toledo, Toledo, USA
