Information Systems Frontiers, Volume 11, Issue 4, pp 391–403

Interactive survival analysis with the OCDM system: From development to application

  • Sebastian Klenk
  • Jürgen Dippon
  • Peter Fritz
  • Gunther Heidemann


Medical data mining is actively pursued in computer science and statistical research but not in medical practice. The reasons for this lie in the difficulties of handling and statistically analyzing medical data. We have developed a system that allows practitioners in the field to analyze their data interactively without the assistance of statisticians or data mining experts. In this paper we introduce data mining of medical data and show how it can be achieved for survival data. We demonstrate how to solve common problems of interactive survival analysis by presenting the Online Clinical Data Mining (OCDM) system. The main focus is on similarity-based queries, a new method for selecting similar cases based on their covariates and the influence of these covariates on survival.
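The idea of a similarity-based query can be illustrated with a minimal sketch. The abstract does not give the distance measure used in OCDM, so everything below is an assumption made for illustration: covariate differences are scaled by regression coefficients (here a hypothetical vector `beta`, as might come from a fitted Cox model), so that covariates with more influence on survival count more toward the distance. The function names, the example cases, and the coefficient values are all hypothetical.

```python
import math

def weighted_distance(x, y, beta):
    """Euclidean distance after scaling each covariate difference by its
    regression coefficient (an assumed form, not the paper's definition)."""
    return math.sqrt(sum((b * (xi - yi)) ** 2 for xi, yi, b in zip(x, y, beta)))

def similar_cases(query, cases, beta, k=2):
    """Return the k stored cases closest to the query under weighted_distance."""
    ranked = sorted(
        cases,
        key=lambda case: weighted_distance(query, case["covariates"], beta),
    )
    return ranked[:k]

# Hypothetical coefficients for three covariates; the third has no
# influence on survival in this toy example, so its weight is zero.
beta = [0.04, 0.9, 0.0]

# Hypothetical stored cases with three covariates each.
cases = [
    {"id": "A", "covariates": [60, 2, 5]},
    {"id": "B", "covariates": [61, 3, 5]},
    {"id": "C", "covariates": [75, 2, 90]},
    {"id": "D", "covariates": [60, 2, 40]},
]

query = [60, 2, 0]
nearest = similar_cases(query, cases, beta, k=2)
nearest_ids = [case["id"] for case in nearest]
```

In this toy setting, cases that differ from the query only in the zero-weight covariate are treated as identical to it, which is the behavior one would expect from a distance driven by influence on survival rather than by raw covariate values.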


Medical data mining · Survival analysis · Regression based distance measures · User centered data mining



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Sebastian Klenk (1)
  • Jürgen Dippon (2)
  • Peter Fritz (3)
  • Gunther Heidemann (1)

  1. Intelligent Systems Department, Stuttgart University, Stuttgart, Germany
  2. Department of Mathematics, Institute for Stochastics and Applications (ISA), Stuttgart University, Stuttgart, Germany
  3. Robert-Bosch-Krankenhaus Stuttgart, Institute for Digital Medicine, Stuttgart, Germany
