An Integrated Clustering and Classification Approach for the Analysis of Tumor Patient Data

  • Stephan M. Winkler
  • Michael Affenzeller
  • Herbert Stekel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8111)

Abstract

Standard patient parameters, tumor markers, and tumor diagnosis records are used to identify prediction models for tumor markers as well as for cancer diagnoses. In this paper we present a hybrid clustering and classification approach that first identifies data clusters (using standard patient data and tumor markers) and then learns prediction models on the basis of these data clusters. The resulting clusters are analyzed and their homogeneity is calculated; the models learned on the basis of these clusters are tested and compared to each other with respect to classification accuracy and variable impacts.
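
The two-stage idea (cluster first, then learn one model per cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: k-means stands in for the clustering step, a nearest-class-mean rule stands in for the learned classifiers, homogeneity is taken to be the share of the majority class in a cluster, and the function names and the toy data are hypothetical.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))

def kmeans(points, k, iters=50):
    """Plain k-means. Centers start at evenly spaced data points, a
    simplification chosen only to keep this sketch reproducible."""
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def purity(labels):
    """Share of the majority class (assumed homogeneity measure)."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def fit(X, y, k):
    """Stage 1: cluster the samples. Stage 2: one model per cluster."""
    centers = kmeans(X, k)
    members = [[] for _ in range(k)]
    for p, label in zip(X, y):
        members[nearest(p, centers)].append((p, label))
    models, homogeneity = [], []
    for m in members:
        by_class = {}
        for p, label in m:
            by_class.setdefault(label, []).append(p)
        # Per-cluster model: the mean feature vector of each class.
        models.append({label: mean(ps) for label, ps in by_class.items()})
        homogeneity.append(purity([label for _, label in m]) if m else None)
    return centers, models, homogeneity

def predict(p, centers, models):
    """Route p to the nearest non-empty cluster, then apply its model."""
    usable = [i for i, m in enumerate(models) if m]
    c = min(usable, key=lambda i: sq_dist(p, centers[i]))
    return min(models[c], key=lambda label: sq_dist(p, models[c][label]))

# Hypothetical toy data: two groups along the first feature; the class
# depends on the second feature, with opposite direction in each group.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 1.0), (0.3, 1.1),
     (5.0, 0.0), (5.2, 0.2), (5.1, 1.0), (4.9, 1.2)]
y = [0, 0, 1, 1, 1, 1, 0, 0]

centers, models, homogeneity = fit(X, y, k=2)
print(homogeneity)
print(predict((0.1, 0.05), centers, models))
print(predict((5.0, 0.1), centers, models))
```

In this toy setting a single global nearest-class-mean model could not separate the classes, because the relation between the second feature and the class is reversed between the two groups; the per-cluster models can. Whether real patient data behaves this way is exactly what a comparison of per-cluster accuracy would have to show.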

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Stephan M. Winkler (1)
  • Michael Affenzeller (1)
  • Herbert Stekel (2)
  1. Heuristic and Evolutionary Algorithms Laboratory; Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg, Austria
  2. Central Laboratory, General Hospital Linz, Linz, Austria