IAPR International Conference on Pattern Recognition in Bioinformatics

PRIB 2012: Pattern Recognition in Bioinformatics, pp 59–70

A Unified Adaptive Co-identification Framework for High-D Expression Data

  • Shuzhong Zhang,
  • Kun Wang,
  • Cody Ashby,
  • Bilian Chen &
  • Xiuzhen Huang
Part of the Lecture Notes in Computer Science book series (LNBI, volume 7632)

Abstract

High-throughput techniques are producing large-scale, high-dimensional genome-wide gene expression data (e.g., 4D data with genes vs. timepoints vs. conditions vs. tissues). This creates a growing demand for effective methods to partition the data into biologically relevant groups. Current clustering and co-clustering approaches have limitations: they can be very time-consuming and work only for low-dimensional expression datasets. In this work, we introduce a new notion of “co-identification”, which allows systematic identification of genes participating in different functional groups under different conditions or at different developmental stages. The key contribution of our work is a unified computational framework for co-identification that enables clustering to be high-dimensional and adaptive. Our framework is based upon a generic optimization model and a general optimization method termed Maximum Block Improvement. Test results on yeast and Arabidopsis expression data are presented to demonstrate the efficiency and effectiveness of our approach.
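
The abstract names Maximum Block Improvement (MBI) but does not spell out the optimization model. As a rough illustration only, the sketch below applies the MBI idea as it is usually described (at each iteration, compute the best update for every block of variables with the others held fixed, then accept only the single update that improves the objective most) to a toy 2D co-clustering problem. The squared-error objective, the restriction to a matrix rather than a higher-order tensor, and all function names are assumptions made for this example; none of them is taken from the paper.

```python
import numpy as np

def objective(A, rows, cols, C):
    # Squared error between A and its block-constant approximation
    # (assumed objective, not the paper's model).
    return float(((A - C[np.ix_(rows, cols)]) ** 2).sum())

def best_centers(A, rows, cols, C):
    # Optimal centers for fixed assignments: the mean of each block.
    # Empty blocks keep their previous value.
    new = C.copy()
    for k in range(C.shape[0]):
        for l in range(C.shape[1]):
            block = A[np.ix_(rows == k, cols == l)]
            if block.size:
                new[k, l] = block.mean()
    return new

def best_rows(A, cols, C):
    # Optimal row assignment for fixed column assignment and centers.
    expanded = C[:, cols]  # shape (k_rows, n_cols)
    cost = ((A[:, None, :] - expanded[None, :, :]) ** 2).sum(axis=2)
    return cost.argmin(axis=1)

def best_cols(A, rows, C):
    # Same computation on the transposed problem.
    return best_rows(A.T, rows, C.T)

def mbi_cocluster(A, k_rows, k_cols, max_iter=100, tol=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    rows = rng.integers(k_rows, size=A.shape[0])
    cols = rng.integers(k_cols, size=A.shape[1])
    C = best_centers(A, rows, cols, np.zeros((k_rows, k_cols)))
    current = objective(A, rows, cols, C)
    for _ in range(max_iter):
        # One candidate per block: centers, row labels, column labels.
        candidates = [
            (rows, cols, best_centers(A, rows, cols, C)),
            (best_rows(A, cols, C), cols, C),
            (rows, best_cols(A, rows, C), C),
        ]
        values = [objective(A, r, c, M) for r, c, M in candidates]
        best = int(np.argmin(values))
        if current - values[best] <= tol:
            break  # no block yields a meaningful improvement
        # MBI step: update only the block with the largest improvement.
        rows, cols, C = candidates[best]
        current = values[best]
    return rows, cols, C, current

# Toy data with a planted 2 x 2 block structure.
A = np.kron(np.array([[0.0, 5.0], [5.0, 0.0]]), np.ones((3, 4)))
rows, cols, C, value = mbi_cocluster(A, 2, 2)
print(rows, cols, value)
```

Accepting only the best single-block update per iteration is what separates this scheme from cycling through the blocks in a fixed order. The objective never increases in this sketch, but like other local search methods it can stop at a local optimum, so the planted structure is not guaranteed to be recovered.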

Keywords

  • Gene Expression Data
  • Gene Expression Dataset
  • Classical Cluster
  • Generalize Maximum Entropy
  • Yeast Gene Expression

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This research is supported by grants from NIH NCRR (5P20RR016460-11) and NIGMS (8P20GM103429-11).

Author information

Authors and Affiliations

  1. University of Minnesota, Minneapolis, MN, 55455, USA

    Shuzhong Zhang

  2. Xiamen University, Xiamen, 361000, China

    Bilian Chen

  3. Arkansas State University, Jonesboro, AR, 72467, USA

    Kun Wang, Cody Ashby & Xiuzhen Huang

Editor information

Editors and Affiliations

  1. Institute of Medical Science, University of Tokyo, 4-6-1, Shirokanedai, 108-8639, Minato-ku, Tokyo, Japan

    Tetsuo Shibuya

  2. Department of Mathematical Informatics, The University of Tokyo, 7-3-1 Hongo, 113-8654, Bunkyo-ku, Tokyo, Japan

    Hisashi Kashima

  3. Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Ookayama, 152-8550, Meguro-ku, Tokyo, Japan

    Jun Sese

  4. Bioinformatics Project, National Institute of Biomedical Innovation, 7-6-8 Saito-Asagi, 567-0085, Suita, Osaka, Japan

    Shandar Ahmad

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, S., Wang, K., Ashby, C., Chen, B., Huang, X. (2012). A Unified Adaptive Co-identification Framework for High-D Expression Data. In: Shibuya, T., Kashima, H., Sese, J., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2012. Lecture Notes in Computer Science, vol 7632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34123-6_6

  • DOI: https://doi.org/10.1007/978-3-642-34123-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34122-9

  • Online ISBN: 978-3-642-34123-6

  • eBook Packages: Computer Science (R0)

Published in cooperation with the International Association for Pattern Recognition (http://www.iapr.org/)
