On the Identification of Virtual Tumor Markers and Tumor Diagnosis Predictors Using Evolutionary Algorithms

  • Stephan M. Winkler
  • Michael Affenzeller
  • Gabriel K. Kronberger
  • Michael Kommenda
  • Stefan Wagner
  • Witold Jacak
  • Herbert Stekel
Part of the Topics in Intelligent Engineering and Informatics book series (TIEI, volume 6)


In this chapter we present the results of empirical research on the data-based identification of estimation models for tumor markers and cancer diagnoses. Based on patients’ data records, which include standard blood parameters, tumor markers, and information about tumor diagnoses, we have trained mathematical models that represent virtual tumor markers and predictors for cancer diagnoses, respectively. We used a medical database compiled at the Central Laboratory of the General Hospital Linz, Austria, and applied several data-based modeling approaches to identify mathematical models that estimate selected tumor marker values from routinely available blood values; specifically, estimators for the tumor markers AFP, CA-125, CA15-3, CEA, CYFRA, and PSA have been identified and are discussed here. Furthermore, several data-based modeling approaches implemented in HeuristicLab have been applied to identify estimators for selected cancer diagnoses: linear regression, k-nearest neighbor learning, artificial neural networks, and support vector machines (all optimized using evolutionary algorithms), as well as genetic programming. The investigated diagnoses of breast cancer, melanoma, and respiratory system cancer can be estimated correctly in up to 81%, 74%, and 91% of the analyzed test cases, respectively; without tumor markers, up to 75%, 74%, and 87% of the test samples are estimated correctly.
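The idea of optimizing a classifier with an evolutionary algorithm can be illustrated with a minimal sketch. It is not the HeuristicLab implementation used in the chapter: the data are synthetic, all function names and parameter values (population size, mutation rate, k) are assumptions chosen for illustration, and the classifier is a plain k-nearest neighbor vote. A simple genetic algorithm searches for a binary feature mask that maximizes validation accuracy.

```python
import random

def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training samples on the selected features."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(mask, train_X, train_y, eval_X, eval_y, k=3):
    """Fraction of evaluation samples classified correctly with the masked features."""
    features = [i for i, bit in enumerate(mask) if bit]
    if not features:
        return 0.0  # an empty feature set cannot classify anything
    hits = sum(
        knn_predict(train_X, train_y, x, features, k) == y
        for x, y in zip(eval_X, eval_y)
    )
    return hits / len(eval_y)

def evolve_feature_mask(train_X, train_y, val_X, val_y, n_features,
                        pop_size=20, generations=15, mutation_rate=0.1, rng=None):
    """Hypothetical GA: truncation selection, one-point crossover, bit-flip mutation."""
    rng = rng or random.Random(0)
    fitness = lambda m: accuracy(m, train_X, train_y, val_X, val_y)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    population[0] = [1] * n_features  # include the "use everything" baseline
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]
        children = [scored[0]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

def make_data(n, rng):
    """Synthetic stand-in for patient records: 2 informative and 4 noisy features."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [label * 2.0 + rng.gauss(0, 0.5) for _ in range(2)]
        noise = [rng.gauss(0, 3.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

rng = random.Random(42)
train_X, train_y = make_data(60, rng)
val_X, val_y = make_data(30, rng)
best_mask = evolve_feature_mask(train_X, train_y, val_X, val_y, n_features=6, rng=rng)
best_acc = accuracy(best_mask, train_X, train_y, val_X, val_y)
print("selected features:", best_mask, "validation accuracy:", best_acc)
```

Because the mask is selected on the validation samples, the printed accuracy is optimistic; an independent test partition would be needed to obtain figures comparable to the test accuracies reported above.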


Keywords: Support Vector Machine, Feature Selection, Tumor Marker, Evolutionary Algorithm, Modeling Task
These keywords were added by machine and not by the authors.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Stephan M. Winkler (1)
  • Michael Affenzeller (1)
  • Gabriel K. Kronberger (1)
  • Michael Kommenda (1)
  • Stefan Wagner (1)
  • Witold Jacak (1)
  • Herbert Stekel (2)

  1. Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, School of Informatics, Communication and Media, Hagenberg, Austria
  2. Central Laboratory, General Hospital Linz, Linz, Austria
