Identifying Gene–Environment Interactions Associated with Prognosis Using Penalized Quantile Regression

  • Guohua Wang
  • Yinjun Zhao
  • Qingzhao Zhang
  • Yangguang Zang
  • Sanguo Zang
  • Shuangge Ma
Chapter
Part of the Contributions to Statistics book series (CONTRIB.STAT.)

Abstract

In the omics era, it is well recognized that, for complex traits and outcomes, interactions between genetic and environmental factors (G×E interactions) have important implications beyond the main effects. Most existing interaction analyses have focused on continuous and categorical traits. Prognosis is of essential importance for complex diseases; however, because prognosis outcomes are considerably more complex, they have been studied less. In existing interaction analyses of prognosis outcomes, the most common practice is to fit marginal (semi)parametric models (for example, the Cox model) using likelihood-based estimation and then identify important interactions based on significance level. Such an approach has limitations. First, data contamination is not uncommon, and with likelihood-based estimation even a single contaminated observation can result in severely biased estimation and misleading conclusions. Second, when the sample size is not large, the significance-based approach may not be reliable. To overcome these limitations, in this study we adopt quantile-based estimation, which is robust to data contamination. Two techniques are adopted to accommodate right censoring. For identifying important interactions, we adopt penalization as an alternative to significance level. An efficient computational algorithm is developed. Simulation shows that the proposed method can significantly outperform the alternative. We analyze a lung cancer prognosis study with gene expression measurements.
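
The abstract does not state the objective function, so the following is only an illustrative sketch of the kind of criterion it describes: a weighted quantile check loss combined with a minimax concave penalty (MCP). The function names, the use of precomputed inverse-probability-of-censoring weights, and the choice to penalize every coefficient are assumptions made for illustration, not the chapter's actual formulation.

```python
import numpy as np

def check_loss(residual, tau=0.5):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    return residual * (tau - (residual < 0))

def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty, applied elementwise to |beta|."""
    b = np.abs(np.asarray(beta, dtype=float))
    return np.where(
        b <= gamma * lam,
        lam * b - b ** 2 / (2.0 * gamma),  # concave region near zero
        0.5 * gamma * lam ** 2,            # flat region: no extra shrinkage
    )

def penalized_objective(beta, X, y, weights, tau, lam, gamma=3.0):
    """Weighted check loss plus MCP on all coefficients.

    `weights` is assumed to hold precomputed censoring weights
    (zero for censored observations); how they are estimated is
    not specified in the abstract. Penalizing every coefficient
    is also an assumption of this sketch.
    """
    resid = y - X @ beta
    loss = np.sum(weights * check_loss(resid, tau))
    return loss + np.sum(mcp_penalty(beta, lam, gamma))
```

In a G×E setting, the columns of `X` would typically hold environmental factors, gene measurements, and their products, with `y` the (log) survival time; that layout is likewise an assumption here.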

Keywords

Quantile Regression; Inverse Probability; Prognosis Outcome; Data Contamination; Minimax Concave Penalty

Notes

Acknowledgements

We thank the organizers and participants of “The Fourth International Workshop on the Perspectives on High-dimensional Data Analysis.” The authors were supported by the China Postdoctoral Science Foundation (2014M550799), National Science Foundation of China (11401561), National Social Science Foundation of China (13CTJ001, 13&ZD148), National Institutes of Health (CA165923, CA191383, CA016359), and U.S. VA Cooperative Studies Program of the Department of Veterans Affairs, Office of Research and Development.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Guohua Wang (1, 2)
  • Yinjun Zhao (3, 4)
  • Qingzhao Zhang (3, 4)
  • Yangguang Zang (1, 2)
  • Sanguo Zang (1, 2)
  • Shuangge Ma (3, 4)

  1. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
  2. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
  3. Department of Biostatistics, Yale University, New Haven, USA
  4. VA Cooperative Studies Program Coordinating Center, West Haven, USA
