Statistics in Biosciences, Volume 5, Issue 2, pp 250-260

A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data

  • Suzan Gazioglu, Department of Mathematical Sciences, Montana Tech of the University of Montana
  • Jiawei Wei, Beijing Novartis Pharma Co. Ltd.
  • Elizabeth M. Jennings, Department of Statistics, Texas A&M University
  • Raymond J. Carroll, Department of Statistics, Texas A&M University (corresponding author)


Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y,X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated due to the case-control sampling, and to avoid the biased sampling that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data, and improves precision of estimation compared to using only the controls. A simulation study and an empirical example are used to illustrate the methodology.
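
To make the controls-only baseline mentioned above concrete, the following is a minimal sketch of a penalized regression spline fit of Y on X using controls alone. It does not reproduce the all-data estimator developed in the paper. Everything specific here is an assumption made for illustration: the simulated population, the true curve sin(2x), the logistic disease model and its coefficients, the truncated linear basis, the number of knots, and the smoothing parameter `lam`.

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Design matrix with columns 1, x, (x - k_1)_+, ..., (x - k_K)_+."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def fit_penalized_spline(x, y, knots, lam):
    """Penalized least squares: ridge penalty on knot coefficients only."""
    B = truncated_linear_basis(x, knots)
    # Intercept and slope are left unpenalized; each knot coefficient gets weight 1.
    penalty = np.diag([0.0, 0.0] + [1.0] * len(knots))
    return np.linalg.solve(B.T @ B + lam * penalty, B.T @ y)

# Hypothetical population: homoscedastic regression of Y on X.
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(-2.0, 2.0, n)
y = np.sin(2.0 * x) + rng.normal(0.0, 0.5, n)

# Hypothetical rare disease whose risk depends on both Y and X.
logit = -4.0 + 1.0 * y + 0.5 * x
d = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# Controls-only analysis: discard the cases and fit the spline to D = 0.
controls = ~d
knots = np.linspace(-1.75, 1.75, 15)
beta_hat = fit_penalized_spline(x[controls], y[controls], knots, lam=1.0)

# Evaluate the fitted curve on a grid inside the range of X.
grid = np.linspace(-1.5, 1.5, 61)
fit = truncated_linear_basis(grid, knots) @ beta_hat
```

Because the simulated disease is rare, the controls resemble the population and the fitted curve tracks the assumed true curve closely. The price of this approach is that the cases are thrown away, which is the loss of precision the abstract refers to.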


Biased samples; B-splines; Homoscedastic regression; Nonparametric regression; Regression splines; Secondary data; Secondary phenotypes; Two-stage samples