Fast Training of Support Vector Machines for Survival Analysis

  • Sebastian Pölsterl
  • Nassir Navab
  • Amin Katouzian
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9285)

Abstract

Survival analysis is commonly used in medical research to identify important predictors of adverse events and to develop guidelines for patient treatment. When applied to large amounts of patient data, efficient optimization routines become a necessity. We propose efficient training algorithms for three kinds of linear survival support vector machines: 1) ranking-based, 2) regression-based, and 3) combined ranking and regression. We perform optimization in the primal using truncated Newton optimization and use order statistic trees to lower the computational cost of training. We also extend the same optimization technique to non-linear models. Our results demonstrate the superiority of the proposed optimization scheme over existing training algorithms, which fail on large datasets because of their inherently high time and space complexity. We validate the proposed survival models on six real-world datasets and show that pure ranking-based approaches outperform regression and hybrid models.
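
The abstract does not state the objective being optimized, so the following is only a minimal sketch of what a ranking-based survival SVM loss could look like. It assumes a squared hinge loss over comparable pairs (pairs in which the sample with the shorter time has an observed event) and the convention that a higher score means longer survival. The function names `comparable_pairs` and `ranking_objective` and the parameter `gamma` are hypothetical. The sketch enumerates pairs naively in quadratic time, which is the cost that order statistic trees are meant to reduce; it does not implement the tree-based or truncated Newton parts.

```python
from typing import List, Sequence, Tuple


def comparable_pairs(
    time: Sequence[float], event: Sequence[bool]
) -> List[Tuple[int, int]]:
    """Return pairs (i, j) where sample i outlived sample j and j's event was observed."""
    pairs = []
    for j, (t_j, e_j) in enumerate(zip(time, event)):
        if not e_j:
            continue  # a censored sample cannot be the earlier member of a pair
        for i, t_i in enumerate(time):
            if t_i > t_j:
                pairs.append((i, j))
    return pairs


def ranking_objective(
    w: Sequence[float],
    X: Sequence[Sequence[float]],
    time: Sequence[float],
    event: Sequence[bool],
    gamma: float = 1.0,
) -> float:
    """Assumed objective: 0.5 * ||w||^2 + 0.5 * gamma * sum of squared hinge losses."""
    # Linear score for every sample; higher is assumed to mean longer survival.
    scores = [sum(w_k * x_k for w_k, x_k in zip(w, x)) for x in X]
    loss = 0.0
    for i, j in comparable_pairs(time, event):
        # Penalize pairs where the longer-lived sample is not ranked
        # at least one unit above the shorter-lived one.
        margin = 1.0 - (scores[i] - scores[j])
        if margin > 0.0:
            loss += margin * margin
    regularizer = 0.5 * sum(w_k * w_k for w_k in w)
    return regularizer + 0.5 * gamma * loss
```

With `n` samples the pair enumeration visits up to `n * (n - 1)` index combinations, so a sketch like this is only practical for small datasets.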

Keywords

Survival analysis, Support vector machine, Optimization

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sebastian Pölsterl (1)
  • Nassir Navab (1, 2)
  • Amin Katouzian (1)
  1. Chair for Computer Aided Medical Procedures, Technische Universität München, Munich, Germany
  2. Johns Hopkins University, Baltimore, USA
