A Tri-Partite Neural Document Language Model for Semantic Information Retrieval

  • Gia-Hung Nguyen
  • Lynda Tamine
  • Laure Soulier
  • Nathalie Souf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10843)

Abstract

Previous work in information retrieval has shown that using evidence, such as concepts and relations, from external knowledge sources can enhance retrieval performance. Recently, deep neural approaches have emerged as state-of-the-art models for capturing word semantics. This paper presents a new tri-partite neural document language framework that leverages explicit knowledge to jointly constrain the learning of word, concept, and document representations, in order to tackle issues including polysemy and granularity mismatch. We show the effectiveness of the framework in various IR tasks.

Keywords

Semantic information retrieval; Knowledge source; Deep learning

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Université de Toulouse, UPS-IRIT, Toulouse, France
  2. Sorbonne Université, CNRS - LIP6 UMR 7606, Paris, France
