Semantic Query Suggestion Based on Optimized Random Forests

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 764)

Abstract

Query suggestion is an integral part of Web search engines. Data-driven approaches to query suggestion aim to identify queries that are more relevant to users based on term frequencies, and hence cannot fully capture the underlying semantic intent of queries. Semantic query suggestion seeks to identify relevant queries by taking the semantic concepts contained in user queries into account. In this paper, we propose a machine learning approach to semantic query suggestion based on Random Forests. The presented scheme employs an optimized Random Forest algorithm based on multi-objective simulated annealing and weighted voting. In this scheme, multi-objective simulated annealing is used to tune the parameters of the Random Forest algorithm, i.e., the number of trees forming the ensemble and the number of features considered for the split at each node. In addition, weighted voting is used to combine the predictions of the trees according to their predictive performance. The predictive performance of the proposed scheme is compared with that of conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines, and Random Forest) and ensemble learning methods (such as AdaBoost, Bagging, and Random Subspace). The experimental results on semantic query suggestion indicate that the proposed scheme outperforms the compared methods.
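
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names (`weighted_vote`, `anneal`, `toy_cost`, `neighbor`), the parameter bounds, the cooling schedule, and the toy cost function are all assumptions made for illustration. In particular, the two objectives (error and ensemble size) are collapsed into a single weighted sum here, which is only one simple way to treat a multi-objective problem, and the error term is a made-up formula rather than a measured validation error.

```python
import math
import random
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Combine one predicted label per tree using per-tree weights.

    The weights are assumed to reflect each tree's predictive performance,
    for example its accuracy on held-out data.
    """
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("need exactly one weight per tree")
    scores = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        scores[label] += weight
    # The label with the highest total weight wins; ties go to the smallest label.
    return max(sorted(scores), key=lambda label: scores[label])


def anneal(cost, start, neighbor, steps=200, t0=1.0, cooling=0.95, seed=0):
    """Generic simulated annealing that minimizes `cost`, starting from `start`."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Accept improvements always; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def toy_cost(params):
    """Hypothetical scalarized objective: made-up error plus an ensemble-size penalty."""
    n_trees, n_features = params
    toy_error = 1.0 / n_trees + 0.01 * (n_features - 4) ** 2  # stand-in, not real data
    return toy_error + 0.001 * n_trees


def neighbor(params, rng):
    """Perturb one of the two parameters, staying inside assumed bounds."""
    n_trees, n_features = params
    if rng.random() < 0.5:
        n_trees = min(200, max(1, n_trees + rng.choice((-5, 5))))
    else:
        n_features = min(10, max(1, n_features + rng.choice((-1, 1))))
    return n_trees, n_features


# Weighted voting can overrule a plain majority: two weak trees say "b",
# one strong tree says "a", and "a" wins with total weight 0.9 against 0.7.
print(weighted_vote(["a", "b", "b"], [0.9, 0.3, 0.4]))  # prints "a"

best_params, best_cost = anneal(toy_cost, (10, 1), neighbor)
print(best_params, best_cost)
```

In a real setting, `toy_cost` would be replaced by a function that trains a forest with the given number of trees and features per split and returns its measured validation error alongside whatever second objective is being traded off.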

Keywords

  • Query suggestion
  • Random Forests
  • Ensemble learning

Author information

Corresponding author

Correspondence to Aytuğ Onan.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Onan, A. (2019). Semantic Query Suggestion Based on Optimized Random Forests. In: Silhavy, R. (eds) Artificial Intelligence and Algorithms in Intelligent Systems. CSOC2018 2018. Advances in Intelligent Systems and Computing, vol 764. Springer, Cham. https://doi.org/10.1007/978-3-319-91189-2_10