A Multi-information Based Gene Scoring Method for Analysis of Gene Expression Data

  • Hsieh-Hui Yu
  • Vincent S. Tseng
  • Jiin-Haur Chuang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4070)


Hepatitis B virus (HBV) infection is a worldwide health problem, with more than 1 million people dying from liver cirrhosis and hepatocellular carcinoma (HCC) each year. HBV infection can lead to progression from a normal liver to severe cirrhosis, a process that is insidious and asymptomatic in most cases. The recent development of DNA microarray technology gives biomedical researchers a molecular-level view in which thousands of genes can be observed simultaneously. How to efficiently extract useful information from such large-scale gene expression data is an important issue. Although a number of studies have addressed this issue, they typically rely on complicated statistical hypotheses. In this paper, we propose a multi-information-based methodology for scoring genes from their microarray expression levels. The concept of multi-information here is to combine different scoring functions in different tiers when analyzing gene expression. The proposed methods rank genes according to their degree of relevance to the targeted disease, so as to form a precise basis for prediction. Experimental results show that our approach delivers accurate predictions, as assessed against QRT-PCR results.
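The idea of combining several scoring functions to rank genes can be illustrated with a minimal sketch. The abstract does not specify the scoring functions or how the tiers are arranged, so everything below is an assumption made for illustration: the two scorers (`signal_to_noise`, `mean_difference`), the rank-averaging step used in place of the paper's tiered combination, and the toy expression values and gene names are all hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(a, b):
    """Absolute difference of class means over the sum of class standard deviations."""
    denom = stdev(a) + stdev(b)
    return abs(mean(a) - mean(b)) / denom if denom else 0.0

def mean_difference(a, b):
    """Absolute difference of class means."""
    return abs(mean(a) - mean(b))

def ranks(scores):
    """Map each gene to its rank under one scorer (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: i + 1 for i, gene in enumerate(ordered)}

def combined_ranking(expr, labels, scorers):
    """Rank genes by their average rank across all scorers.

    Rank averaging is an assumed stand-in for the paper's tiered scheme,
    which the abstract does not describe.
    """
    per_scorer = []
    for scorer in scorers:
        scores = {}
        for gene, values in expr.items():
            diseased = [v for v, lab in zip(values, labels) if lab == 1]
            normal = [v for v, lab in zip(values, labels) if lab == 0]
            scores[gene] = scorer(diseased, normal)
        per_scorer.append(ranks(scores))
    avg_rank = {g: mean(r[g] for r in per_scorer) for g in expr}
    return sorted(expr, key=lambda g: avg_rank[g])

# Hypothetical toy data: three genes measured on six samples.
expr = {
    "gene_a": [9.0, 9.5, 10.0, 2.0, 2.5, 3.0],
    "gene_b": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],
    "gene_c": [7.0, 3.0, 8.0, 4.0, 6.0, 5.0],
}
labels = [1, 1, 1, 0, 0, 0]  # 1 = diseased sample, 0 = normal sample

ranking = combined_ranking(expr, labels, [signal_to_noise, mean_difference])
print(ranking)
```

Genes whose expression separates the two sample classes most clearly under both scorers come first in the returned list. Averaging ranks rather than raw scores keeps scorers with different numeric scales comparable.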


Liver Cirrhosis; Gene Expression Data; Growth Hormone Receptor; Oligonucleotide Array; Hierarchical Agglomerative Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hsieh-Hui Yu (1)
  • Vincent S. Tseng (1)
  • Jiin-Haur Chuang (2)
  1. Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
  2. Department of Surgery and Internal Medicine, Chang Gung Memorial Hospital at Kaohsiung, Kaohsiung, Taiwan
