KAFE: Knowledge and Frequency Adapted Embeddings

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13164)

Abstract

Word embeddings are widely used in Natural Language Processing (NLP) applications. Training typically involves iterative gradient updates of each word vector, which makes word frequency a major factor in embedding quality: in general, the embeddings of words with few training occurrences end up being of poor quality. This is problematic because rare and frequent words, even when semantically similar, might end up far from each other in the embedding space.

In this study, we develop KAFE (Knowledge And Frequency adapted Embeddings), which combines adversarial principles and a knowledge graph to efficiently represent both frequent and rare words. The goal of adversarial training in KAFE is to minimize the spatial distinguishability (separability) of frequent and rare words in the embedding space. The knowledge graph encourages the embedding to follow the structure of the domain-specific hierarchy, providing an informative prior that is particularly important for words with little training data. We demonstrate the performance of KAFE in representing clinical diagnoses using real-world Electronic Health Records (EHR) data coupled with a knowledge graph. EHRs are notorious for including ever-increasing numbers of rare concepts that are important to consider when defining the state of the patient for various downstream applications. Our experiments demonstrate better intelligibility through visualisation, as well as higher prediction and stability scores for KAFE than for state-of-the-art methods.
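
To make the notion of separability concrete, the following is a minimal sketch, not the KAFE training procedure itself. It fits a simple logistic-regression discriminator that tries to tell rare from frequent words using only their embedding vectors; high accuracy means the two groups occupy distinguishable regions. The embeddings are synthetic, and the function name `discriminator_accuracy`, the dimensions, and the offset of 3.0 are all illustrative assumptions.

```python
import numpy as np

def discriminator_accuracy(emb, is_rare, steps=500, lr=0.1):
    """Fit a logistic-regression discriminator (rare vs. frequent) by
    gradient descent and return its accuracy on the same data."""
    X = np.hstack([emb, np.ones((emb.shape[0], 1))])  # append a bias column
    y = is_rare.astype(float)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        logits = np.clip(X @ w, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (X.T @ (p - y)) / len(y)    # gradient of the mean log-loss
    pred = (X @ w) > 0
    return float(np.mean(pred == is_rare))

# Synthetic embeddings (assumption): 100 frequent and 100 rare words, 8 dims.
rng = np.random.default_rng(0)
frequent = rng.normal(size=(100, 8))
rare_shifted = rng.normal(size=(100, 8)) + 3.0  # rare words in their own region
rare_mixed = rng.normal(size=(100, 8))          # rare words mixed with frequent ones
is_rare = np.array([False] * 100 + [True] * 100)

acc_shifted = discriminator_accuracy(np.vstack([frequent, rare_shifted]), is_rare)
acc_mixed = discriminator_accuracy(np.vstack([frequent, rare_mixed]), is_rare)

print(f"separable embedding:     discriminator accuracy = {acc_shifted:.2f}")
print(f"frequency-mixed embedding: discriminator accuracy = {acc_mixed:.2f}")
```

In an adversarial setup of the kind the abstract describes, the embedding would be updated to push such a discriminator toward chance-level accuracy; the sketch only measures separability and does not perform that update.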

Corresponding author

Correspondence to Awais Ashfaq.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Ashfaq, A., Lingman, M., Nowaczyk, S. (2022). KAFE: Knowledge and Frequency Adapted Embeddings. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol 13164. Springer, Cham. https://doi.org/10.1007/978-3-030-95470-3_10

  • DOI: https://doi.org/10.1007/978-3-030-95470-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95469-7

  • Online ISBN: 978-3-030-95470-3

  • eBook Packages: Computer Science, Computer Science (R0)
