A Study of Query Term Deletion Using Large-Scale E-commerce Search Logs

  • Bishan Yang
  • Nish Parikh
  • Gyanit Singh
  • Neel Sundaresan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)


Query term deletion is a commonly used strategy for query rewriting. In this paper, we study the problem of query term deletion using large-scale e-commerce search logs. Specifically, we focus on queries that do not lead to user clicks and aim to predict, by term deletion, a reduced query that does lead to clicks. Accurate prediction of term deletion can help users recover from poor search results and improve the shopping experience. To achieve this, we use various term-dependent and query-dependent measures as features and build a classifier to predict which term is most likely to be deleted from a given query. Our approach is data-driven. We investigate the large-scale query history and the document collection, verify the usefulness of previously proposed features, and propose incorporating query category information into the term deletion predictors. We observe that training within-category classifiers can result in much better performance than training a unified classifier. We validate our approach using a large collection of query session logs from a leading e-commerce site and demonstrate that it provides promising performance in query term deletion prediction.
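
The prediction setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two features (`rarity`, `position`), the toy query log, the category name `fashion`, and the hand-set weights are all assumptions made for illustration. In the paper's setting the weights would be learned by a classifier from session logs, with one weight vector per query category rather than a single unified one.

```python
import math
from collections import Counter


def deletion_candidates(query):
    """Return one (deleted_term, reduced_query) pair per term position."""
    terms = query.split()
    return [
        (term, " ".join(terms[:i] + terms[i + 1:]))
        for i, term in enumerate(terms)
    ]


def term_features(terms, i, term_counts, num_queries):
    """Toy features for the term at position i (illustrative, not the paper's)."""
    # Smoothed inverse query frequency: terms rare in the log score higher.
    rarity = math.log((num_queries + 1) / (term_counts.get(terms[i], 0) + 1))
    # Relative position in the query, from 0.0 (first) to 1.0 (last).
    position = i / max(len(terms) - 1, 1)
    return {"rarity": rarity, "position": position}


def predict_deletion(query, category, term_counts, num_queries, weights_by_category):
    """Pick the term with the highest linear score under the category's weights."""
    terms = query.split()
    # Fall back to the unified weight vector when the category has none.
    weights = weights_by_category.get(category, weights_by_category["unified"])
    scores = []
    for i in range(len(terms)):
        feats = term_features(terms, i, term_counts, num_queries)
        scores.append(sum(weights[name] * value for name, value in feats.items()))
    best = max(range(len(terms)), key=scores.__getitem__)
    return deletion_candidates(query)[best]


# Toy query log standing in for real search history (hypothetical data).
log = ["red leather wallet", "leather wallet", "mens wallet", "leather jacket"]
# Count the number of logged queries that contain each term.
term_counts = Counter(t for q in log for t in set(q.split()))

# Hand-set weights (assumed); in practice they would be learned from click logs.
weights_by_category = {
    "unified": {"rarity": 1.0, "position": 0.0},
    "fashion": {"rarity": 1.0, "position": -0.5},
}

result = predict_deletion(
    "vintage leather wallet", "fashion", term_counts, len(log), weights_by_category
)
print(result)  # prints ('vintage', 'leather wallet')
```

Keeping a separate weight vector per category, with a unified fallback, mirrors the abstract's comparison between within-category and unified classifiers; the sketch makes no claim about which features or learner the paper actually uses.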


Keywords: Mutual Information; Noun Phrase; Query Expansion; Query Suggestion; Search Session
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Bishan Yang (1)
  • Nish Parikh (2)
  • Gyanit Singh (2)
  • Neel Sundaresan (2)
  1. Computer Science Department, Cornell University, USA
  2. eBay Research Labs, USA
