Pathway-based microarray analysis for robust disease classification

  • ICONIP2010
Neural Computing and Applications

Abstract

The advent of high-throughput technology has made it possible to measure genome-wide expression profiles, providing a new basis for microarray-based diagnosis of disease states. Numerous methods have been proposed to identify biomarkers that can accurately discriminate between case and control classes. Many of these methods use only a subset of ranked genes in each pathway and may not fully represent the classification boundary between the two disease classes. This study proposes using negatively correlated feature sets (NCFS) to obtain more relevant features, in the form of phenotype-correlated genes (PCOGs), and to infer pathway activities from them. The two pathway activity inference schemes that use NCFS significantly improved the power of pathway markers to discriminate between two phenotype classes in microarray expression datasets of breast cancer. In particular, the NCFS-i method provided better contrasting features for classification. The improvement was consistent for all pathways used, in both within- and across-dataset validations. The two proposed NCFS-based methods clearly outperformed other pathway-based classifiers in terms of both ROC area and discriminative score. That is, identifying PCOGs within each pathway, especially with the NCFS-i method, helps to reduce noisy or variable measurements, leading to a higher-performing and more robust classifier. In summary, we have demonstrated that effectively incorporating pathway information into expression-based disease diagnosis, together with NCFS, can provide more discriminative and more robust models.
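The general idea of scoring a pathway from phenotype-correlated genes can be illustrated with a minimal sketch. It is not the authors' NCFS-i implementation, whose details are not given in the abstract. It assumes that PCOGs are selected with a Welch-style t statistic and a hypothetical cutoff of 2.0, that pathway activity is the mean expression of positively correlated genes minus that of negatively correlated genes, and that ROC area is computed by pairwise comparison of cases and controls. All function names, gene names, and expression values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def t_score(case, control):
    """Welch-style t statistic; assumes each group has >= 2 non-constant values."""
    var_case, var_ctrl = stdev(case) ** 2, stdev(control) ** 2
    return (mean(case) - mean(control)) / sqrt(
        var_case / len(case) + var_ctrl / len(control)
    )


def split_pcogs(expr, labels, threshold=2.0):
    """Split pathway genes into positively and negatively correlated sets.

    expr maps gene -> expression values (one per sample);
    labels holds 1 for case and 0 for control. The threshold is an assumption.
    """
    pos, neg = [], []
    for gene, values in expr.items():
        case = [v for v, y in zip(values, labels) if y == 1]
        ctrl = [v for v, y in zip(values, labels) if y == 0]
        t = t_score(case, ctrl)
        if t >= threshold:
            pos.append(gene)
        elif t <= -threshold:
            neg.append(gene)
    return pos, neg


def pathway_activity(expr, pos, neg, n_samples):
    """Per-sample activity: mean of the positive set minus mean of the negative set."""
    activity = []
    for i in range(n_samples):
        up = mean(expr[g][i] for g in pos) if pos else 0.0
        down = mean(expr[g][i] for g in neg) if neg else 0.0
        activity.append(up - down)
    return activity


def roc_area(scores, labels):
    """ROC area as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    ctrls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in ctrls)
    return wins / (len(cases) * len(ctrls))


# Toy pathway with three hypothetical genes and six samples (3 cases, 3 controls).
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "geneA": [5.0, 5.2, 4.9, 3.0, 3.1, 2.9],  # higher in cases
    "geneB": [1.0, 1.2, 0.9, 3.0, 3.2, 2.8],  # lower in cases
    "geneC": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],  # uninformative
}
pos, neg = split_pcogs(expr, labels)
activity = pathway_activity(expr, pos, neg, len(labels))
auc = roc_area(activity, labels)
print(pos, neg, auc)
```

In this toy example the uninformative gene is excluded from both sets, so it cannot dilute the pathway score. This mirrors the abstract's point that restricting a pathway to PCOGs reduces noisy measurements.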

Acknowledgments

The main author (PS) gratefully acknowledges financial support from the National Research Council of Thailand; the School of Information Technology, King Mongkut’s University of Technology Thonburi; and Burapha University during his doctoral study at King Mongkut’s University of Technology Thonburi. PS is especially thankful to Mr. Ponlavit Larpeampaisarl, who helped implement the script for PCOG identification and activity inference.

Conflict of interests

The authors declare that they have no competing interests.

Author information

Corresponding author

Correspondence to Jonathan H. Chan.

About this article

Cite this article

Sootanan, P., Prom-on, S., Meechai, A. et al. Pathway-based microarray analysis for robust disease classification. Neural Comput & Applic 21, 649–660 (2012). https://doi.org/10.1007/s00521-011-0662-y

  • DOI: https://doi.org/10.1007/s00521-011-0662-y
