Computational Diagnostics with Gene Expression Profiles

  • Protocol
Bioinformatics

Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 453))

Abstract

Gene expression profiling using micro-arrays is a modern approach for molecular diagnostics. In clinical micro-array studies, researchers aim to predict disease type, survival, or treatment response using gene expression profiles. In this process, they encounter a series of obstacles and pitfalls. This chapter reviews fundamental issues from machine learning and recommends a procedure for the computational aspects of a clinical micro-array study.

Copyright information

© 2008 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Lottaz, C., Kostka, D., Markowetz, F., Spang, R. (2008). Computational Diagnostics with Gene Expression Profiles. In: Keith, J.M. (ed.) Bioinformatics. Methods in Molecular Biology™, vol 453. Humana Press. https://doi.org/10.1007/978-1-60327-429-6_15

  • DOI: https://doi.org/10.1007/978-1-60327-429-6_15

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60327-428-9

  • Online ISBN: 978-1-60327-429-6

  • eBook Packages: Springer Protocols
