Interpretable Classifiers in Precision Medicine: Feature Selection and Multi-class Categorization

  • Lyn-Rouven Schirra
  • Florian Schmid
  • Hans A. Kestler
  • Ludwig Lausser
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9896)


Growing insight into the molecular nature of diseases leads to the definition of finer-grained diagnostic classes. While this allows for better-adapted drugs and treatments, it also changes the diagnostic task from binary to multi-categorial decisions. Keeping the corresponding multi-class architectures accurate and interpretable is currently one of the key tasks in molecular diagnostics.

In this work, we specifically address the question of to what extent biomarkers that characterize pairwise differences among classes correspond to biomarkers that discriminate one class from all remaining classes. We compare one-against-one and one-against-all architectures of feature-selecting base classifiers. They are validated for their classification performance and the stability of their feature selection.
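
The two architectures compared above differ in how a multi-class task is split into binary subproblems, and each subproblem may select its own features. The following is a minimal sketch of that idea, not the authors' method: the abstract does not name the base classifier or the selection criterion, so `select_features` (a mean-difference filter), the Jaccard overlap, and the toy data are all assumptions made purely for illustration.

```python
from itertools import combinations


def one_against_one(classes):
    """One binary subproblem per unordered pair of classes."""
    return [((a,), (b,)) for a, b in combinations(classes, 2)]


def one_against_all(classes):
    """One binary subproblem per class: that class versus all others."""
    return [((c,), tuple(o for o in classes if o != c)) for c in classes]


def mean(values):
    return sum(values) / len(values)


def select_features(samples, labels, positive, negative, k):
    """Hypothetical filter: keep the k features whose class means differ most."""
    pos = [x for x, y in zip(samples, labels) if y in positive]
    neg = [x for x, y in zip(samples, labels) if y in negative]
    n_features = len(samples[0])
    scores = [
        abs(mean([x[j] for x in pos]) - mean([x[j] for x in neg]))
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Overlap of two feature sets (1.0 means identical selections)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up toy data: three classes, three features, two samples per class.
classes = ["A", "B", "C"]
samples = [
    [5.0, 0.1, 0.0], [5.2, 0.0, 0.1],  # class A
    [0.0, 3.0, 0.1], [0.1, 3.1, 0.0],  # class B
    [0.0, 0.1, 4.0], [0.1, 0.0, 4.2],  # class C
]
labels = ["A", "A", "B", "B", "C", "C"]

# Features selected by every subproblem of each architecture.
ovo = {task: select_features(samples, labels, *task, k=2)
       for task in one_against_one(classes)}
ova = {task: select_features(samples, labels, *task, k=2)
       for task in one_against_all(classes)}

# Compare the markers for "A vs B" with the markers for "A vs rest".
pairwise = ovo[(("A",), ("B",))]
versus_rest = ova[(("A",), ("B", "C"))]
print(sorted(pairwise), sorted(versus_rest), jaccard(pairwise, versus_rest))
```

On this toy data the pairwise subproblem A vs. B keeps features 0 and 1, while A vs. rest keeps features 0 and 2, so the two selections only partly overlap. Any real comparison would replace the filter with the actual feature-selecting base classifier and repeat the selection over resampled data to judge its stability.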


Feature Selection; Ensemble Member; Base Classifier; Feature Selection Process; Selection Stability
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under grant agreement n° 602783, the German Research Foundation (DFG, SFB 1074 project Z1), and the Federal Ministry of Education and Research (BMBF, Gerontosys II, Forschungskern SyStaR, ID 0315894A and e:Med, SYMBOL-HF, ID 01ZX1407A), all to HAK.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Lyn-Rouven Schirra (1, 2)
  • Florian Schmid (1)
  • Hans A. Kestler (1)
  • Ludwig Lausser (1)
  1. Institute of Medical Systems Biology, Ulm University, Ulm, Germany
  2. Institute of Number Theory and Probability Theory, Ulm University, Ulm, Germany
