Integrative analysis of GWAS and transcriptomics data reveal key genes for non-small lung cancer

  • Original Paper
Medical Oncology

Abstract

Lung cancer is one of the world’s most common and deadly cancers. Its two main types are non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC), and NSCLC accounts for more than 85% of cases. Genetic factors play a significant role in NSCLC risk, and a growing number of studies examine risk factors at the molecular level. The aim of this study is to build a pipeline that integrates genome-wide association study (GWAS) data and transcriptomics data with machine learning to identify genetic risk factors for NSCLC. GWAS datasets and GWAS summary data covering lung carcinoma genetic variants in the European population were downloaded from the GWAS Catalog. Using the GWAS summary data, functional analysis of significant SNPs was then performed with the FUMA GWAS web server. Transcriptomics data from people with and without NSCLC were used to build a machine learning model that identifies the key genes that help predict NSCLC. The top up-regulated and down-regulated genes were identified with the BART Cancer web server, and the mechanistic roles of these genes were validated by literature review. Through this integrative, machine-learning-based analysis of GWAS and transcriptomics data, we identified multiple SNPs and genes related to NSCLC. The computational pipeline may facilitate biomarker discovery for NSCLC and other diseases.
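
The two selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the study's pipeline. It assumes the conventional genome-wide significance cutoff of 5e-8, which the abstract does not state, and it substitutes a simple log2 fold-change ranking for the machine learning model and the BART Cancer analysis. All function names, field names, and data values are hypothetical and exist only for illustration.

```python
import math
from statistics import mean

# Conventional genome-wide significance threshold (an assumption; the
# abstract does not state the cutoff the study used).
GENOME_WIDE_P = 5e-8


def significant_snps(summary_rows, p_threshold=GENOME_WIDE_P):
    """Return SNP IDs whose p-value is below the threshold, smallest p first."""
    hits = [row for row in summary_rows if row["p"] < p_threshold]
    return [row["snp"] for row in sorted(hits, key=lambda row: row["p"])]


def rank_genes_by_log2_fold_change(case_expr, control_expr, pseudocount=1.0):
    """Rank genes by log2(mean case / mean control), most up-regulated first.

    This is a simple stand-in for the study's machine learning model.
    The pseudocount avoids division by zero for unexpressed genes.
    """
    scores = {}
    for gene in case_expr:
        case_mean = mean(case_expr[gene]) + pseudocount
        control_mean = mean(control_expr[gene]) + pseudocount
        scores[gene] = math.log2(case_mean / control_mean)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, for illustration only (not from the study).
toy_summary = [
    {"snp": "rs_toy_1", "p": 3e-9},
    {"snp": "rs_toy_2", "p": 0.02},
    {"snp": "rs_toy_3", "p": 1e-12},
]
toy_case = {"GENE_A": [30.0, 34.0], "GENE_B": [2.0, 4.0]}
toy_control = {"GENE_A": [7.0, 7.0], "GENE_B": [15.0, 15.0]}

top_snps = significant_snps(toy_summary)
ranked_genes = rank_genes_by_log2_fold_change(toy_case, toy_control)
```

In this toy input, `rs_toy_2` is dropped because its p-value is above the cutoff, and `GENE_A` ranks ahead of `GENE_B` because its mean expression is higher in the case group than in the control group.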

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to Xiangxiong Feng.

Ethics declarations

Competing interests

The authors have not disclosed any competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feng, X. Integrative analysis of GWAS and transcriptomics data reveal key genes for non-small lung cancer. Med Oncol 40, 270 (2023). https://doi.org/10.1007/s12032-023-02139-x

  • DOI: https://doi.org/10.1007/s12032-023-02139-x
