Abstract
Verbose or colloquial queries account for a small but non-negligible proportion of queries in modern search, and they are common on other platforms such as Community Question Answering (CQA), where answerers often include URLs in their answers to provide further information. We first define questions that are resolved (or largely explained) by the web pages linked in their answers as navigational questions, and we simulate them as verbose queries to evaluate the performance of search engines, treating the linked web pages as the relevant documents. We then experiment with identifying new navigational questions in CQA and demonstrate that navigational intent detection can be effectively automated using textual features and a set of metadata features. Lastly, to effectively identify relevant navigational questions, we present a hybrid approach that blends several language modelling techniques: the classic (query-likelihood) language model, the state-of-the-art translation-based language model, and our proposed intent-based language model. Our experiments on two real-world datasets show that the proposed mixture language model yields a significant performance boost over the state-of-the-art language modelling approach.
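The abstract does not give the form of the mixture, so the following is only a minimal sketch of how three component language models might be blended by linear interpolation. The function names, the mixture weights, the Jelinek-Mercer smoothing step, the toy translation table, and the representation of the intent-based model as a plain word-probability dictionary are all assumptions made for illustration, not the authors' formulation.

```python
import math
from collections import Counter


def ml_prob(word, tokens):
    """Maximum-likelihood estimate of P(word | tokens)."""
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0


def translation_prob(word, tokens, trans_table):
    """Translation-based estimate: sum over t of T(word | t) * P_ml(t | tokens).

    trans_table maps (word, t) pairs to hypothetical translation probabilities.
    """
    total = len(tokens)
    if total == 0:
        return 0.0
    counts = Counter(tokens)
    return sum(trans_table.get((word, t), 0.0) * c / total
               for t, c in counts.items())


def mixture_log_score(query, candidate, collection, trans_table, intent_model,
                      weights=(0.5, 0.3, 0.2), lam=0.2):
    """Log-probability of the query under a linear mixture of three models.

    weights = (query-likelihood, translation, intent) and should sum to 1.
    lam is an assumed Jelinek-Mercer weight on the collection model.
    intent_model is a hypothetical dict of word probabilities.
    """
    w_ql, w_tr, w_in = weights
    score = 0.0
    for word in query:
        p = (w_ql * ml_prob(word, candidate)
             + w_tr * translation_prob(word, candidate, trans_table)
             + w_in * intent_model.get(word, 0.0))
        p = (1 - lam) * p + lam * ml_prob(word, collection)
        score += math.log(p) if p > 0 else float("-inf")
    return score


# Toy data, invented purely for illustration.
question_a = ["reset", "router", "password"]
question_b = ["bake", "bread", "recipe"]
collection = question_a + question_b + ["change"]
trans_table = {("change", "reset"): 0.5}  # hypothetical T(change | reset)
intent_model = {}                         # empty placeholder intent model
query = ["change", "router", "password"]

score_a = mixture_log_score(query, question_a, collection, trans_table, intent_model)
score_b = mixture_log_score(query, question_b, collection, trans_table, intent_model)
```

In this toy setup the query shares two words with question_a and is linked to a third through the translation table, so question_a receives the higher score. How the weights are set, and what the intent-based component actually estimates, would have to come from the paper itself.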
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, L., Zhang, D., Levene, M. (2013). Understanding and Exploiting User’s Navigational Intent in Community Question Answering. In: Banchs, R.E., Silvestri, F., Liu, TY., Zhang, M., Gao, S., Lang, J. (eds) Information Retrieval Technology. AIRS 2013. Lecture Notes in Computer Science, vol 8281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45068-6_34
DOI: https://doi.org/10.1007/978-3-642-45068-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45067-9
Online ISBN: 978-3-642-45068-6
eBook Packages: Computer Science, Computer Science (R0)