trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes

Maurya, Kaushal Kumar; Desarkar, Maunendra Sankar; Gupta, Manish; Agrawal, Puneet

doi:10.1007/s10618-023-00966-0

trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes

Published: 07 August 2023

Volume 37, pages 2306–2329, (2023)
Cite this article

Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Kaushal Kumar Maurya ORCID: orcid.org/0000-0003-0249-6508¹,
Maunendra Sankar Desarkar¹,
Manish Gupta² &
…
Puneet Agrawal²

252 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Query auto-completion (QAC) aims at suggesting plausible completions for a given query prefix. Traditionally, QAC systems have leveraged tries curated from historical query logs to suggest most popular completions. In this context, there are two specific scenarios that are difficult to handle for any QAC system: short prefixes (which are inherently ambiguous) and unseen prefixes. Recently, personalized Natural Language Generation (NLG) models have been proposed to leverage previous session queries as context for addressing these two challenges. However, such NLG models suffer from two drawbacks: (1) some of the previous session queries could be noisy and irrelevant to the user intent for the current prefix, and (2) NLG models cannot directly incorporate historical query popularity. This motivates us to propose a novel NLG model for QAC, Trie-NLG, which jointly leverages popularity signals from trie and personalization signals from previous session queries. We train the Trie-NLG model by augmenting the prefix with rich context comprising of recent session queries and top trie completions. This simple modeling approach overcomes the limitations of trie-based and NLG-based approaches, and leads to state-of-the-art performance. We evaluate the Trie-NLG model using two large QAC datasets. On average, our model achieves huge \(\sim\)57% and \(\sim\)14% boost in MRR over the popular trie-based lookup and the strong BART-based baseline methods, respectively. We make our code publicly available at https://github.com/kaushal0494/Trie-NLG.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Cohort-based personalized query auto-completion

Article 01 September 2019

Query Suggestion

Beyond Click Graph: Topic Modeling for Search Engine Query Log Analysis

Availability of data and materials

We make our code available as supplementary information. It is in line with our claims in the paper and can be trusted to reproduce the results. We use one public dataset (AOL) and a Bing dataset. Since Bing data is proprietary, it cannot be made publicly available.

Code availability

We make our code publicly available and it is attached as supplementary data.

Notes

A small percent of suggestions are not prefix preserving; in this work, we focus on prefix-preserving suggestions only.
https://github.com/jma127/pyltr.
https://github.com/asyml/texar.
https://huggingface.co/.
https://huggingface.co/blog/how-to-generate#appendix.
https://github.com/kaushal0494/Trie-NLG.

References

Bar-Yossef Z, Kraus N (2011) Context-sensitive query auto-completion. In: Proceedings of the 20th international conference on world wide web (pp. 107–116)
Bhatia S,Majumdar D, Mitra P (2011) Query suggestions in the absence of query logs. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval (pp. 795–804)
Bonchi F, Perego R, Silvestri F, Vahabi H, Venturini R (2012) Efficient query recommendations in the long tail via center-piece subgraphs. In: Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval (pp. 345–354)
Burges CJ (2010) From ranknet to lambdarank to lambdamart: an overview. Learning 11:23–581
Google Scholar
Cai F, De Rijke M (2016) Query auto completion in information retrieval. Universiteit van Amsterdam [Host]
Cai F , Liang S , De Rijke, M (2014) Time-sensitive personalized query auto-completion. In: Proceedings of the 23rd ACM international conference on conference on information and knowledge management (pp. 1599–1608)
Dehghani M , Rothe S , Alfonseca E , Fleury P (2017) Learning to attend, copy, and generate for session-based query suggestion. Proceedings of the 2017 ACM on conference on information and knowledge management (pp. 1747–1756)
Gehman S , Gururangan S , Sap M , Choi Y , Smith NA (2020) Real Toxicity Prompts: evaluating neural toxic degeneration in language models. Findings of the association for computational linguistics: Emnlp 2020 (pp. 3356–3369). Online Association for Computational Linguistics
Hofmann K, Mitra B, Radlinski F, Shokouhi M (2014) An eye-tracking study of user interactions with query auto completion. Proceedings of the 23rd ACM international conference on conference on information and knowledge management (pp. 549–558)
Hsu B-J, Ottaviano G (2013) Space-efficient data structures for top-k completion. In: Proceedings of the 22nd international conference on world wide web (pp. 583–594)
Huang C-K, Chien L-F, Oyang Y-J (2003) Relevant term suggestion in interactive web search based on contextual information in query session logs. J Am Soc Inf Sci Technol 54(7):638–649
Article Google Scholar
Jiang J-Y , Ke Y-Y , Chien P-Y , Cheng P-J (2014) Learning user reformulation behavior for query auto-completion. In: Proceedings of the 37th international ACM SIGIIR conference on research & development in information retrieval (pp. 445–454)
Lewis M , Liu Y , Goyal N , Ghazvininejad M , Mohamed A , Levy O, Zettlemoyer, L (2020) Bart: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 7871–7880)
Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N et al (2020) Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv Neural Inf Process Syst 33:9459–9474
Google Scholar
Maxwell D , Bailey P , Hawking D (2017) Large-scale generative query autocompletion. In: Proceedings of the 22nd australasian document computing symposium (pp. 1–8)
Mei Q , Zhou D , Church K (2008) Query suggestion using hitting time. In:: Proceedings of the 17th acm conference on information and knowledge management (pp. 469–478)
Mitra B, Craswell N (2015) Query auto-completion for rare prefixes. In: Proceedings of the 24th acm international on conference on information and knowledge management (pp. 1755–1758)
Mitra B , Shokouhi M , Radlinski F, Hofmann K (2014) On user interactions with query auto-completion. In: Proceedings of the 37th international ACM SIGIR conference on research & development in information retrieval (pp. 1055–1058)
Mustar A, Lamprier S, Piwowarski B (2020) Using bert and bart for query suggestion. Circle
Park DH, Chiba R(2017) A neural language model for query auto-completion. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval (pp. 1189–1192)
Pass G , Chowdhury A , Torgeson C (2006) A picture of search. In: Proceedings of the 1st international conference on scalable information systems (pp. 1–es)
Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M et al (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21(140):1–67
MathSciNet Google Scholar
Rosset C , Jose D , Ghosh G , Mitra B Tiwary S (2018) Optimizing query evaluations using reinforcement learning for web search. In: The 41st international ACM SIGIR conference on research & development in information retrieval (pp. 1193–1196)
Sadikov E , Madhavan J , Wang L , Halevy A (2010) Clustering query refinements by user intent. In: Proceedings of the 19th international conference on world wide web (pp. 841–850)
Shokouhi M (2013) Learning to personalize query auto-completion. In: Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval (pp. 103–112)
Shokouhi M, Radinsky K (2012) Time-sensitive query auto-completion. In: Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval (pp. 601–610)
Sordoni A , Bengio Y , Vahabi H , Lioma C , Grue Simonsen J , Nie J-Y (2015) A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. In: Proceedings of the 24th ACM international on conference on information and knowledge management (pp. 553–562)
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al (2017) Attention is all you need. Adv Neural Inf Process Syst. Vol 30
Wang P-W, Zhang H, Mohan V, Dhillon IS, Kolter JZ (2018) Realtime query completion via deep language models. ecom@ sigir
Wu Q, Burges CJ, Svore KM, Gao J (2010) Adapting boosting for information retrieval measures. Inf Retrieval 13(3):254–270
Article Google Scholar
Yadav N, Sen R, Hill DN, Mazumdar A, Dhillon IS (2021) Session-aware query auto-completion using extreme multi-label ranking. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining (pp. 3835–3844)
Yin D, Tan J, Zhang Z, Deng H, Huang S, Chen J (2020) Learning to generate personalized query auto-completions via a multi-view multi-task attentive approach. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 2998–3007)
Zhou G, Zhu X, Song C, Fan Y, Zhu H , Ma X, et al (2018) Deep interest network for click-through rate prediction. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 1059–1068)

Download references

Funding

This research was partially funded by Microsoft Academic Partnership Grant 2022.

Author information

Authors and Affiliations

IIT Hyderabad, Hyderabad, Telangana, 502284, India
Kaushal Kumar Maurya & Maunendra Sankar Desarkar
Microsoft India, Hyderabad, Telangana, 500032, India
Manish Gupta & Puneet Agrawal

Authors

Kaushal Kumar Maurya
View author publications
You can also search for this author in PubMed Google Scholar
Maunendra Sankar Desarkar
View author publications
You can also search for this author in PubMed Google Scholar
Manish Gupta
View author publications
You can also search for this author in PubMed Google Scholar
Puneet Agrawal
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors contributed to the design of the project. Specifically, Conceptualization, Methodology, Data preparation: [KKM]; Writing—original draft preparation: [KKM, MG, MSD]; Writing—review and editing: [everyone]; Funding acquisition, Resources, Supervision: [MG, MSD, PA].

Corresponding author

Correspondence to Kaushal Kumar Maurya.

Ethics declarations

Conflicts of interest

The authors have no competing interests to declare relevant to this article’s content.

Consent for publication

All the authors of this work have been informed of the submission, and everyone supports it.

Additional information

Responsible editor: Charalampos Tsourakakis.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Maurya, K.K., Desarkar, M.S., Gupta, M. et al. trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes. Data Min Knowl Disc 37, 2306–2329 (2023). https://doi.org/10.1007/s10618-023-00966-0

Download citation

Received: 10 February 2023
Accepted: 13 July 2023
Published: 07 August 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10618-023-00966-0

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes

Abstract

Access this article

Similar content being viewed by others

Cohort-based personalized query auto-completion

Query Suggestion

Beyond Click Graph: Topic Modeling for Search Engine Query Log Analysis

Availability of data and materials

Code availability

Notes

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflicts of interest

Consent for publication

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes

Abstract

Access this article

Similar content being viewed by others

Cohort-based personalized query auto-completion

Query Suggestion

Beyond Click Graph: Topic Modeling for Search Engine Query Log Analysis

Availability of data and materials

Code availability

Notes

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflicts of interest

Consent for publication

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation