KAFE: Knowledge and Frequency Adapted Embeddings

Ashfaq, Awais; Lingman, Markus; Nowaczyk, Slawomir

doi:10.1007/978-3-030-95470-3_10


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13164)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science


Abstract

Word embeddings are widely used in Natural Language Processing (NLP) applications. Training typically involves iterative gradient updates of each word vector, which makes word frequency a major factor in embedding quality: in general, the embeddings of words with few training occurrences end up being of poor quality. This is problematic because rare and frequent words, albeit semantically similar, might end up far from each other in the embedding space.
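To make the frequency effect concrete, the following minimal sketch counts how many (center, context) training pairs each token contributes as the center word in a skip-gram style pass over a sequence. The helper name `updates_per_word`, the window size, and the toy sequence of diagnosis-like codes are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def updates_per_word(tokens, window=1):
    """Count the (center, context) pairs in which each token is the center word.

    In skip-gram style training, each such pair triggers one gradient update
    of the center word's vector, so this count is a rough proxy for how much
    training signal a word receives in one pass.
    """
    counts = Counter()
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        counts[center] += (hi - lo) - 1  # context positions, excluding the center
    return counts


# Hypothetical toy sequence: one code is frequent, one appears only once.
tokens = ["I10", "E11", "I10", "I10", "Q87", "E11", "I10"]
counts = updates_per_word(tokens, window=1)
for word, n in counts.most_common():
    print(word, n)
```

In this toy pass the code that occurs once receives far fewer updates than the code that occurs four times, which is the imbalance the abstract refers to.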

In this study, we develop KAFE (Knowledge And Frequency adapted Embeddings), which combines adversarial principles and a knowledge graph to efficiently represent both frequent and rare words. The goal of adversarial training in KAFE is to minimize the spatial distinguishability (separability) of frequent and rare words in the embedding space. The knowledge graph encourages the embedding to follow the structure of the domain-specific hierarchy, providing an informative prior that is particularly important for words with little training data. We demonstrate the performance of KAFE in representing clinical diagnoses using real-world Electronic Health Records (EHR) data coupled with a knowledge graph. EHRs are notorious for including ever-increasing numbers of rare concepts that are important to consider when defining the state of the patient for various downstream applications. Our experiments demonstrate better intelligibility through visualisation, as well as higher prediction and stability scores for KAFE than for state-of-the-art methods.
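The abstract does not give the loss functions, so the following is only a sketch of how the two ideas could be combined in one objective, not the authors' formulation. It assumes a linear discriminator that guesses whether an embedding belongs to a rare word, and a squared distance between each code and its parent as a stand-in for the knowledge-graph prior. The function names, the weights `lam` and `mu`, and the toy data are hypothetical.

```python
import numpy as np


def discriminator_loss(emb, is_rare, w, b):
    """Binary cross-entropy of a linear rare-vs-frequent discriminator."""
    logits = emb @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12  # guards against log(0)
    return -np.mean(
        is_rare * np.log(probs + eps) + (1 - is_rare) * np.log(1 - probs + eps)
    )


def hierarchy_penalty(emb, parent):
    """Mean squared distance between each embedding and its parent's embedding.

    parent[i] is the row index of the parent of item i, or -1 for a root.
    (Assumed stand-in for a knowledge-graph prior.)
    """
    child = np.flatnonzero(parent >= 0)
    diff = emb[child] - emb[parent[child]]
    return np.mean(np.sum(diff ** 2, axis=1))


def embedding_objective(task_loss, emb, is_rare, parent, w, b, lam=0.1, mu=0.1):
    """Objective minimized with respect to the embeddings.

    The minus sign rewards embeddings that make the discriminator's job
    harder, i.e. that make rare and frequent words less separable.
    """
    adv = discriminator_loss(emb, is_rare, w, b)
    prior = hierarchy_penalty(emb, parent)
    return task_loss - lam * adv + mu * prior


# Toy data: three 2-d embeddings; items 1 and 2 are children of item 0.
emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
is_rare = np.array([0.0, 0.0, 1.0])
parent = np.array([-1, 0, 0])
w, b = np.zeros(2), 0.0  # an untrained discriminator outputs 0.5 everywhere

print(embedding_objective(1.0, emb, is_rare, parent, w, b))
```

In an adversarial setup of this kind, the discriminator parameters `w` and `b` would be updated to minimize `discriminator_loss`, alternating with embedding updates that minimize `embedding_objective`; the training loop is omitted here.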




Author information

Authors and Affiliations

Center for Applied Intelligent Systems Research, Halmstad University, Halmstad, Sweden
Awais Ashfaq & Slawomir Nowaczyk
Halland Hospital, Region Halland, Sweden
Awais Ashfaq & Markus Lingman
Department of Molecular and Clinical Medicine/Cardiology, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
Markus Lingman

Corresponding author

Correspondence to Awais Ashfaq.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Giuseppe Nicosia
Department of Computer Science, University of Reading, Reading, UK
Varun Ojha
Department of Computer Science, University of Oxford, Oxford, UK
Emanuele La Malfa
Cambridge Judge Business School, University of Cambridge, Cambridge, UK
Gabriele La Malfa
Department of Biochemistry, University of Cambridge, Cambridge, UK
Giorgio Jansen
Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA
Panos M. Pardalos
University of Catania, Catania, Italy
Giovanni Giuffrida
Department of Informatics, Dana-Farber Cancer Institute, Boston, MA, USA
Renato Umeton

Cite this paper

Ashfaq, A., Lingman, M., Nowaczyk, S. (2022). KAFE: Knowledge and Frequency Adapted Embeddings. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol 13164. Springer, Cham. https://doi.org/10.1007/978-3-030-95470-3_10

DOI: https://doi.org/10.1007/978-3-030-95470-3_10
Published: 02 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95469-7
Online ISBN: 978-3-030-95470-3
eBook Packages: Computer Science, Computer Science (R0)
