A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data
Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y, X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated by the case-control sampling, and to avoid the bias that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data and improves the precision of estimation compared with using only the controls. A simulation study and an empirical example are used to illustrate the methodology.
Keywords: Biased samples; B-splines; Homoscedastic regression; Nonparametric regression; Regression splines; Secondary data; Secondary phenotypes; Two-stage samples
The research of Jennings, Wei and Carroll was supported by a grant from the National Cancer Institute (R37-CA057030). This publication is based in part on work supported by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST).