Exploiting Latent Embeddings of Nominal Clinical Data for Predicting Hospital Readmission


Hospital readmissions place a heavy burden not only on the health care system but also on patients, since complications after discharge generally lead to additional strain. Estimating the risk of readmission after discharge from inpatient care has been the subject of several publications in recent years. In those publications, the authors mostly tried to infer the readmission risk (within a certain time frame) directly from clinical data recorded in medical routine, such as primary diagnosis, co-morbidities, length of stay, or questionnaires. Instead of using these data directly as inputs to a prediction model, we exploit latent embeddings for the nominal parts of the data (e.g., diagnosis and procedure codes). Such latent embeddings have been used with great success in natural language processing and can be constructed in a preprocessing step. Our experiments show that a prediction model exploiting these latent embeddings can improve readmission prediction.
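To make the two-stage idea concrete, the following is a minimal sketch of learning embeddings for nominal codes in a preprocessing step and then feeding them to a logistic regression. It is not the authors' method: the embedding step here is assumed to be a truncated SVD of a log-scaled code co-occurrence matrix, the patient representation is assumed to be the mean of the code embeddings, and the toy admissions, code strings, and labels are invented purely for illustration.

```python
import numpy as np

# Hypothetical toy data: each admission is a list of nominal codes plus a
# readmission label. Codes and labels are invented for illustration only.
admissions = [
    (["I50", "E11", "N18"], 1),
    (["I50", "N18"], 1),
    (["I50", "E11"], 1),
    (["S72", "Z47"], 0),
    (["S72", "M16"], 0),
    (["M16", "Z47"], 0),
]

def build_embeddings(code_lists, dim):
    """Embed codes via truncated SVD of a log-scaled co-occurrence matrix.

    Assumption: this stands in for whatever embedding method is actually used.
    """
    vocab = sorted({c for codes in code_lists for c in codes})
    index = {c: i for i, c in enumerate(vocab)}
    cooc = np.zeros((len(vocab), len(vocab)))
    for codes in code_lists:
        for a in codes:
            for b in codes:
                if a != b:
                    cooc[index[a], index[b]] += 1.0
    u, s, _ = np.linalg.svd(np.log1p(cooc))
    # Keep the top `dim` directions, scaled by the root of the singular values.
    return index, u[:, :dim] * np.sqrt(s[:dim])

def featurize(codes, index, emb):
    """Represent one admission as the mean of its code embeddings."""
    return emb[[index[c] for c in codes]].mean(axis=0)

def fit_logistic(x, y, lr=0.5, steps=2000):
    """Plain batch gradient descent on the logistic loss."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * x.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

code_lists = [codes for codes, _ in admissions]
labels = np.array([label for _, label in admissions], dtype=float)

# Stage 1 (preprocessing): learn the latent embeddings from the codes alone.
index, emb = build_embeddings(code_lists, dim=2)

# Stage 2: train the prediction model on the embedded admissions.
features = np.stack([featurize(codes, index, emb) for codes in code_lists])
w, b = fit_logistic(features, labels)
risk = 1.0 / (1.0 + np.exp(-(features @ w + b)))
```

The point of the design is that the embedding step never sees the readmission labels, so it could in principle be fitted on a much larger pool of unlabeled records before the prediction model is trained.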




  1. Admission and discharge reason, therapy (also medication) and department codes.

  2. Primary diagnosis, secondary diagnosis, LOINC Lab, therapies/medication, admission reason, discharge reason and department codes.




The project receives funding from the German Federal Ministry of Economics and Technology; Grant Number 01MT14001A.

Author information



Corresponding author

Correspondence to Denis Krompaß.


Cite this article

Krompaß, D., Esteban, C., Tresp, V. et al. Exploiting Latent Embeddings of Nominal Clinical Data for Predicting Hospital Readmission. Künstl Intell 29, 153–159 (2015). https://doi.org/10.1007/s13218-014-0344-x



  • Hospital readmission
  • Latent embeddings
  • Latent factors
  • Logistic regression
  • Neural network