Highly accurate two-gene signature for gastric cancer

  • Original Paper
Medical Oncology

Abstract

Large amounts of expression data have been generated by high-throughput experimental techniques such as microarrays. No single algorithm is widely accepted as a suitable method for mining gene expression data. Therefore, integrating different algorithms and extracting more useful information from expression data are key problems in biomarker identification. Here, we used three machine learning algorithms to select feature genes from gene expression profiling data of gastric cancer (GC). The genes common to all three selections (their intersection) were then taken as the candidate feature gene set for tree building and tree pruning with a Decision Tree (DT) algorithm. Real-time quantitative PCR and immunohistochemistry (IHC) staining were used to validate the relative expression levels of the candidate feature genes. Receiver operating characteristic curves were used to analyse the classification sensitivity and specificity of the feature genes. The Class Information Index, Information Gain Index and Relief algorithms selected 174, 202 and 149 feature genes, respectively, of which 32 genes were common to all three. Using the DT algorithm to derive classification rule sets, we identified COL2A1 and ATP4B as candidate biomarkers of GC. The expression levels of these two genes were validated by real-time PCR and IHC, with high sensitivity (>90 %) and specificity (>90 %) in both training and test samples. We introduce, for the first time, an integrated and systematic data-mining model for identifying biomarkers from gene expression data. The two-gene signature obtained by our predictive model could be used to recognize the biological characteristics of GC.
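The workflow described in the abstract (score genes, intersect the selections, derive a rule, report sensitivity and specificity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names G1 to G5, the expression values, the thresholds and the direction of each rule are all hypothetical, and only an information-gain score is shown as a stand-in for the three selection algorithms.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting samples at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in (left, right) if p)
    return entropy(labels) - remainder


def common_genes(*selections):
    """Genes chosen by every selection method (set intersection)."""
    return set.intersection(*(set(s) for s in selections))


def two_gene_rule(sample):
    """Hypothetical two-level decision rule; thresholds are invented."""
    if sample["G2"] > 5.0:
        return "tumor"
    if sample["G3"] < 3.0:
        return "tumor"
    return "normal"


def sensitivity_specificity(predicted, actual, positive="tumor"):
    """Return (sensitivity, specificity) for binary predictions."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy gene lists standing in for the output of three selection methods.
selected = common_genes(
    ["G1", "G2", "G3"],
    ["G2", "G3", "G4"],
    ["G3", "G2", "G5"],
)

# Invented expression values; one tumor and one normal sample are
# deliberately misclassified by the rule.
samples = [
    {"G2": 8.1, "G3": 1.0, "label": "tumor"},
    {"G2": 7.4, "G3": 2.2, "label": "tumor"},
    {"G2": 2.0, "G3": 1.5, "label": "tumor"},
    {"G2": 3.0, "G3": 6.0, "label": "tumor"},
    {"G2": 1.9, "G3": 7.5, "label": "normal"},
    {"G2": 2.5, "G3": 8.0, "label": "normal"},
    {"G2": 6.0, "G3": 9.0, "label": "normal"},
    {"G2": 1.0, "G3": 6.6, "label": "normal"},
]

predicted = [two_gene_rule(s) for s in samples]
actual = [s["label"] for s in samples]
sens, spec = sensitivity_specificity(predicted, actual)
print(sorted(selected), sens, spec)
```

In a real analysis the candidate thresholds would be learned by the decision tree from training data and evaluated on held-out test samples, rather than fixed by hand as they are here.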

Acknowledgments

We thank Prof. Jiangeng Li (Academy of Electronic Information & Control Engineering, Beijing University of Technology, Beijing, China) for great help in processing the gene expression data using machine learning algorithms.

Conflict of interest

None.

Author information

Corresponding author

Correspondence to Guorong Zheng.

About this article

Cite this article

Yan, Z., Xu, W., Xiong, Y. et al. Highly accurate two-gene signature for gastric cancer. Med Oncol 30, 584 (2013). https://doi.org/10.1007/s12032-013-0584-x

  • DOI: https://doi.org/10.1007/s12032-013-0584-x
