SQE-GAN: A Supervised Query Expansion Scheme via GAN

Fu, Tianle; Tian, Qi; Li, Hui

doi:10.1007/978-3-030-72240-1_25

Tianle Fu, Qi Tian & Hui Li (ORCID: orcid.org/0000-0003-2382-6289)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12657)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Existing Supervised Query Expansion (SQE) methods spend much of their time on term feature extraction, yet generate sub-optimal expansion terms. In this paper, we introduce Generative Adversarial Nets (GANs) and propose a GAN-based SQE method (SQE-GAN) to obtain helpful query expansion terms. We unify two types of models in query expansion: the generative model and the discriminative one. The generative model focuses on predicting relevant terms given a query, while the discriminative model focuses on predicting relevancy given a query-term pair. We iteratively optimize both models through a game between them. In addition, a BiLSTM layer is adopted to encode the utility of a term with respect to the query. As a result, the costly feature calculation in SQE schemes is avoided, so that efficiency can be significantly improved. Moreover, by introducing GANs into expansion, the expanded terms can be more effective with respect to the eventual needs of the user. Our experimental results demonstrate that SQE-GAN can be 37.3% faster than state-of-the-art SQE solutions while outperforming some recently proposed neural models in retrieval quality.
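
The game between the two models can be illustrated with a toy sketch. This is not the authors' implementation: it omits the BiLSTM encoder, replaces both networks with hypothetical linear scorers over hand-made term features, and assumes a policy-gradient (REINFORCE) update for the generator in the style of IRGAN (Wang et al.), which appears in the reference list. All function names, the reward shaping, and the hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train(features, relevant, epochs=200, lr=0.1, seed=0):
    """Toy adversarial training for a single query (illustrative only).

    features: one vector per candidate term (hypothetical query-term features).
    relevant: indices of terms labelled as good expansions (the supervision).
    Returns (theta, phi): generator and discriminator weights.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    theta = [0.0] * dim  # generator: softmax over candidate terms
    phi = [0.0] * dim    # discriminator: sigmoid relevancy of a query-term pair
    idx = list(range(len(features)))
    for _ in range(epochs):
        probs = softmax([dot(theta, f) for f in features])
        # Discriminator step: a labelled term is a positive example,
        # a term sampled from the generator is a negative example.
        pos = rng.choice(relevant)
        neg = rng.choices(idx, weights=probs)[0]
        for t, label in ((pos, 1.0), (neg, 0.0)):
            err = label - sigmoid(dot(phi, features[t]))
            phi = [p + lr * err * x for p, x in zip(phi, features[t])]
        # Generator step (REINFORCE): sample a term and reward it with the
        # discriminator's belief, rescaled to [-1, 1] (an assumed shaping).
        t = rng.choices(idx, weights=probs)[0]
        reward = 2.0 * sigmoid(dot(phi, features[t])) - 1.0
        # Gradient of log softmax w.r.t. theta: x_t - sum_j p_j * x_j
        expected = [sum(p * f[k] for p, f in zip(probs, features))
                    for k in range(dim)]
        theta = [w + lr * reward * (x - e)
                 for w, x, e in zip(theta, features[t], expected)]
    return theta, phi


def expand(theta, features, terms, k=2):
    """Return the k candidate terms the generator ranks highest."""
    probs = softmax([dot(theta, f) for f in features])
    ranked = sorted(zip(terms, probs), key=lambda tp: tp[1], reverse=True)
    return [term for term, _ in ranked[:k]]
```

Calling `train` on a handful of candidate terms with known good expansions, then `expand` with the learned `theta`, yields the generator's top-ranked terms for that query. The sketch needs no per-term feature engineering at ranking time beyond the vectors it is given, which loosely mirrors the abstract's point that learned term representations replace costly feature calculation.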

This work is supported by the National Natural Science Foundation of China (No. 61972309), the CCF-Huawei Database System Innovation Research Plan (No. 2020010B), the Key Scientific Research Program of Shaanxi Provincial Department of Education (No. 20JY014), and the Natural Science Basic Research Program of Shaanxi (No. 2020JM-575).

Notes

1. Any embedding technique can be adopted, e.g., BERT [5], ELMo [17], Word2Vec [16].

References

1. Amati, G.: Probability models for information retrieval based on divergence from randomness. Univ. Glasgow 20(4), 357–389 (2003)
2. Burges, C.J.C., et al.: Learning to rank using gradient descent. In: ICML, pp. 89–96 (2005)
3. Cao, G., Nie, J.Y., Gao, J., Robertson, S.: Selecting good expansion terms for pseudo-relevance feedback. In: SIGIR, pp. 243–250 (2008)
4. Carpineto, C., Romano, G.: A survey of automatic query expansion in information retrieval. ACM Comput. Surv. 44(1), 1–50 (2013)
5. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL, pp. 4171–4186 (2019)
6. Gao, J., Xu, G., Xu, J.: Query expansion using path-constrained random walks. In: SIGIR, pp. 563–572 (2013)
7. Imani, A., Vakili, A., Montazer, A., Shakery, A.: Deep neural networks for query expansion using word embeddings. In: ECIR, pp. 203–210 (2019)
8. Joachims, T.: Optimizing search engines using clickthrough data. In: KDD, pp. 133–142 (2002)
9. Joachims, T.: Training linear SVMs in linear time. In: KDD, pp. 217–226 (2006)
10. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR, pp. 4401–4410 (2019)
11. Lee, C.J., Chen, R.C., Kao, S.H., Cheng, P.J.: A term dependency-based approach for query terms ranking. In: CIKM, pp. 1267–1276 (2009)
12. Li, J., Luong, M., Jurafsky, D.: A hierarchical neural autoencoder for paragraphs and documents. In: ACL, pp. 1106–1115 (2015)
13. Luong, T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural machine translation. In: EMNLP, pp. 1412–1421 (2015)
14. Lv, Y., Zhai, C.X., Chen, W.: A boosting approach to improving pseudo-relevance feedback. In: SIGIR, pp. 165–174 (2011)
15. Manning, C.D.: Introduction to information retrieval. J. Am. Soc. Inf. Sci. Technol. 61, 852–853 (2009)
16. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: ICLR (2013)
17. Peters, M.E., et al.: Deep contextualized word representations. In: NAACL, pp. 2227–2237 (2018)
18. Lavrenko, V., Croft, W.B.: Relevance-based language models. In: SIGIR, pp. 120–127 (2001)
19. Wang, J., et al.: IRGAN: a minimax game for unifying generative and discriminative information retrieval models. In: SIGIR, pp. 515–524 (2017)
20. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
21. Zaiem, S., Sadat, F.: Sequence to sequence learning for query expansion. In: AAAI, pp. 10075–10076 (2019)
22. Zhang, Z., Wang, Q., Si, L., Gao, J.: Learning for efficient supervised query expansion via two-stage feature selection. In: SIGIR, pp. 265–274 (2016)

Author information

Authors and Affiliations

School of Cyber Engineering, Xidian University, Xi’an, China
Tianle Fu, Qi Tian & Hui Li

Corresponding author

Correspondence to Hui Li.

Editor information

Editors and Affiliations

Radboud University Nijmegen, Nijmegen, The Netherlands
Djoerd Hiemstra
Department of Computer Science, Katholieke Universiteit Leuven, Heverlee, Belgium
Marie-Francine Moens
Toulouse, Toulouse Institute of Computer Science Research, Toulouse, France
Josiane Mothe
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
Raffaele Perego
Leipzig University, Leipzig, Germany
Martin Potthast
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
Fabrizio Sebastiani

About this paper

Cite this paper

Fu, T., Tian, Q., Li, H. (2021). SQE-GAN: A Supervised Query Expansion Scheme via GAN. In: Hiemstra, D., Moens, MF., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12657. Springer, Cham. https://doi.org/10.1007/978-3-030-72240-1_25

DOI: https://doi.org/10.1007/978-3-030-72240-1_25
Published: 30 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72239-5
Online ISBN: 978-3-030-72240-1
eBook Packages: Computer Science, Computer Science (R0)
