A Novel Approach for Effective Learning of Cluster Structures with Biological Data Applications

  • Conference paper
Data Mining and Bioinformatics (VDMB 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4316)

Abstract

Recently, DNA microarray gene expression studies have been actively pursued to systematically mine unknown biological knowledge hidden in large volumes of gene expression data. In particular, the problem of finding groups of co-expressed genes or samples has been widely investigated because it is useful for characterizing unknown gene functions and for more sophisticated tasks such as modeling biological pathways. Nevertheless, identifying good clusters remains difficult in practice, since many clustering methods require the user to choose the number of target clusters arbitrarily. In this paper we propose a novel approach to systematically identifying good candidates for the number of clusters, so that arbitrariness in cluster generation is minimized. Our experimental results on both a synthetic dataset and a real gene expression dataset show the applicability and usefulness of this approach in microarray data mining.
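
To make the problem concrete, the sketch below shows one generic way to rank candidate cluster numbers instead of fixing one arbitrarily. It is not the method proposed in the paper, which the abstract does not describe. It swaps in a common stand-in: k-means scored by mean silhouette width. The toy points and all function names (`kmeans`, `mean_silhouette`, `farthest_point_init`) are assumptions made for illustration only.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0 by convention."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.sqrt(sq_dist(p, q)) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.sqrt(sq_dist(p, q)) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: three tight groups of four points each.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (0, 10), (1, 10), (0, 11), (1, 11)]

# Score each candidate number of clusters, then rank the candidates.
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 7)}
ranked = sorted(scores, key=scores.get, reverse=True)
best_k = ranked[0]
print("candidates ranked by silhouette:", ranked)
```

On this toy data the top-ranked candidate is 3, matching the three groups built into the points. Real expression data is noisier, so the full ranking is usually more informative than the single best value.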

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shin, M. (2006). A Novel Approach for Effective Learning of Cluster Structures with Biological Data Applications. In: Dalkilic, M.M., Kim, S., Yang, J. (eds) Data Mining and Bioinformatics. VDMB 2006. Lecture Notes in Computer Science, vol 4316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11960669_2

  • DOI: https://doi.org/10.1007/11960669_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68970-6

  • Online ISBN: 978-3-540-68971-3

  • eBook Packages: Computer Science, Computer Science (R0)
