Advances in Data Analysis and Classification, Volume 12, Issue 4, pp 917–936

Rank-based classifiers for extremely high-dimensional gene expression data

  • Ludwig Lausser
  • Florian Schmid
  • Lyn-Rouven Schirra
  • Adalbert F. X. Wilhelm
  • Hans A. Kestler
Regular Article


Predicting phenotypes on the basis of gene expression profiles is a classification task that is becoming increasingly important in the field of precision medicine. Although these expression signals are real-valued, it is questionable whether they can be analyzed on an interval scale. As with many biological signals, their influence on, e.g., protein levels is usually non-linear and can thus be misinterpreted. In this article we study gene expression profiles with up to 54,000 dimensions. We analyze these measurements on an ordinal scale by replacing the real-valued profiles by their ranks. This type of rank transformation can be used to construct invariant classifiers that are not affected by noise induced by data transformations, which can occur in the measurement setup. Our 10 \(\times\) 10-fold cross-validation experiments on 86 different data sets and 19 different classification models indicate that classifiers largely benefit from this transformation. Random forests and support vector machines in particular achieve improved classification results on a significant majority of data sets.
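The rank transformation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `rank_transform`, the example values, and the use of average ranks for ties are assumptions made here for illustration. The sketch replaces one profile by its within-profile ranks and checks that a strictly increasing, non-linear distortion of the measurements leaves the ranks unchanged.

```python
import math


def rank_transform(profile):
    """Replace each value in one expression profile by its rank (1 = smallest).

    Ties receive the average of the ranks they span. This tie rule is an
    assumption; the abstract does not state how ties are handled.
    """
    # Indices of the profile, ordered by increasing expression value.
    order = sorted(range(len(profile)), key=lambda i: profile[i])
    ranks = [0.0] * len(profile)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value is tied with the current one.
        while (end + 1 < len(order)
               and profile[order[end + 1]] == profile[order[start]]):
            end += 1
        # Positions are 0-based, ranks are 1-based.
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical expression values for one sample (made up for illustration).
profile = [7.2, 0.4, 3.1, 3.1, 12.8]

# A strictly increasing, non-linear distortion of the same measurements.
distorted = [math.log2(1.0 + 5.0 * x) for x in profile]

ranks_original = rank_transform(profile)
ranks_distorted = rank_transform(distorted)

print(ranks_original)
print(ranks_distorted == ranks_original)
```

Because any classifier trained on the ranks sees identical inputs for both versions of the sample, its prediction cannot change under such a distortion. The invariance covers only order-preserving transformations applied within a single profile.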


Keywords

  • Rank-based classification
  • Invariance
  • High-dimensional data
  • Gene expression data

Mathematics Subject Classification

  • 62H30 Classification and discrimination; cluster analysis
  • 68T10 Pattern recognition, speech recognition
  • 92C40 Biochemistry, molecular biology



The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under Grant Agreement No. 602783, the German Research Foundation (DFG, SFB 1074 project Z1 to HAK), and the Federal Ministry of Education and Research (BMBF, Gerontosys II, Forschungskern SyStaR, project ID 0315894A and e:Med, SYMBOL-HF, Grant ID 01ZX1407A), all to HAK.

Supplementary material

Supplementary material 1: 11634_2016_277_MOESM1_ESM.pdf (PDF, 230 KB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Ludwig Lausser (1)
  • Florian Schmid (1)
  • Lyn-Rouven Schirra (1, 3)
  • Adalbert F. X. Wilhelm (4)
  • Hans A. Kestler (1, 2; corresponding author)

  1. Institute of Medical Systems Biology, Ulm University, Ulm, Germany
  2. Leibniz Institute on Aging – Fritz Lipmann Institute, Jena, Germany
  3. Institute of Number Theory and Probability Theory, Ulm University, Ulm, Germany
  4. Department of Psychology and Methods, Jacobs University, Bremen, Germany
