Ranking Function Based on Higher Order Statistics (RF-HOS) for Two-Sample Microarray Experiments

  • Jahangheer Shaik
  • Mohammed Yeasin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4463)


This paper proposes a novel ranking function, RF-HOS, that incorporates higher-order cumulants to find differentially expressed genes. Traditional ranking functions assume a data distribution (e.g., normal) and use only the first two cumulants for statistical significance analysis. Ranking functions based on second-order statistics are often inadequate for small-sample data such as microarray data, where the relatively small number of samples makes it hard to estimate the parameters accurately and so causes inaccuracies in the ranking of genes. The proposed ranking function is based on higher-order statistics (HOS), which account for both amplitude and phase information. Incorporating HOS relaxes the symmetry implicitly assumed by a Gaussian distribution. The performance of RF-HOS is compared against other well-known ranking functions designed for ranking genes in two-sample microarray experiments.
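The contrast between second-order and higher-order statistics can be made concrete with a small sketch. It is not the RF-HOS ranking function, whose formula this abstract does not give. It only shows a conventional second-order ranking (Welch's t-statistic, chosen here as an assumed baseline) next to simple estimates of the first four cumulants of one gene's expression values. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean


def sample_cumulants(x):
    """Return simple (biased, divide-by-n) estimates of the first four cumulants."""
    n = len(x)
    m = mean(x)
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    # k1 = mean, k2 = variance, k3 = third central moment (asymmetry),
    # k4 = m4 - 3 * m2^2 (zero for a Gaussian, as is k3).
    return m, m2, m3, m4 - 3.0 * m2 ** 2


def welch_t(a, b):
    """Second-order baseline: Welch's t-statistic, using only means and variances."""
    ma, mb = mean(a), mean(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    # Assumes at least one group has non-zero variance.
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))


def rank_by_abs_t(genes):
    """Rank gene names by |t|, largest first. genes maps name -> (group_a, group_b)."""
    return sorted(genes, key=lambda g: abs(welch_t(*genes[g])), reverse=True)


# Hypothetical toy data: two genes, three samples per condition.
toy = {
    "g1": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "g2": ([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]),
}
print(rank_by_abs_t(toy))
print(sample_cumulants([1.0, 2.0, 3.0, 4.0, 5.0]))
```

With only a handful of samples per condition, the third and fourth cumulant estimates are noisy, which is the small-sample difficulty the abstract mentions. How RF-HOS combines these quantities into a ranking score is described in the paper itself, not here.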


Two-sample microarray data; Higher order statistics; Differentially expressed genes





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jahangheer Shaik (1)
  • Mohammed Yeasin (1)
  1. Computer Vision, Pattern and Image Analysis Lab (www.cvpia.org), Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152
