Gene Relation Finding Through Mining Microarray Data and Literature

  • Hei-Chia Wang
  • Yi-Shiun Lee
  • Tian-Hsiang Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4070)


Finding gene relations has become important in research, since such relations can help biologists determine a gene's function. This article describes our proposal to combine microarray data and literature to find the relations among genes. The proposed method emphasizes the combined use of microarray data and literature rather than microarray data alone. Currently, many scholars use clustering algorithms to analyze microarray data, but these algorithms can find only genes that share the same expression pattern, not the transcriptional relations between genes. Moreover, most traditional approaches involve all-against-all comparisons, which are time-consuming. To reduce the comparison time and to find more relations in a microarray, we propose a method that expands the microarray data and uses association-rule algorithms to find all possible rules first. Literature text mining is then used to select the most suitable rules. In this way, a suitable gene group is selected and the number of gene comparisons is reduced sharply. Finally, we apply a dynamic Bayesian network (DBN) to find the interactions among the selected genes. Unlike other techniques, this method not only reduces the comparison complexity but also reveals more mutual interactions among genes.
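The association-rule step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the discretization threshold, the item naming scheme (`gene_up`, `gene_down`), the restriction to single-antecedent rules, and the toy expression values are all assumptions made for illustration only.

```python
from itertools import permutations

def discretize(expression, threshold=1.0):
    """Turn each sample's log-ratios into a set of items such as 'geneA_up'.

    The threshold of 1.0 is an assumed cutoff, not a value from the paper.
    """
    transactions = []
    for sample in expression:
        items = set()
        for gene, value in sample.items():
            if value >= threshold:
                items.add(f"{gene}_up")
            elif value <= -threshold:
                items.add(f"{gene}_down")
        transactions.append(items)
    return transactions

def pairwise_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        if count_a == 0:
            continue
        support = count_ab / n            # fraction of samples containing both items
        confidence = count_ab / count_a   # estimate of P(b | a)
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

# Hypothetical toy data: four samples, three genes (log expression ratios).
samples = [
    {"geneA": 2.0, "geneB": 1.5, "geneC": 0.1},
    {"geneA": 1.2, "geneB": 1.1, "geneC": -1.4},
    {"geneA": 1.8, "geneB": 2.2, "geneC": 0.0},
    {"geneA": 0.2, "geneB": 1.3, "geneC": -0.1},
]
transactions = discretize(samples)
rules = pairwise_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support}, confidence={confidence})")
```

In a pipeline like the one the abstract describes, rules found this way would only be candidates; the literature-mining step would then filter them before any dynamic Bayesian network is fitted to the remaining gene group.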


Keywords: Microarray Data; Association Rule; Vector Space Model; Dynamic Bayesian Network; Retrieval Module
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hei-Chia Wang (1)
  • Yi-Shiun Lee (1)
  • Tian-Hsiang Huang (1)
  1. Institute of Information Management, National Cheng Kung University, Tainan, Taiwan
