ICWE 2005: Web Engineering, pp. 152–160
An Article Language Model for BBS Search
Abstract
Bulletin Board Systems (BBS), like blogs, newsgroups, and online forums, are online broadcasting spaces where people exchange ideas and make announcements. As BBS become valuable repositories of knowledge and information, effective BBS search engines are required to make that information universally accessible and useful. However, techniques that have proven successful for web search are not well suited to searching BBS articles because of the nature of BBS. In this paper, we propose a novel article language model (LM) for building an effective BBS search engine. We investigate the differences between BBS articles and web pages, then extend the traditional LM with an author LM and a category LM. The article LM is powerful in the sense that it combines the three LMs into a single framework. Experimental results show that our article LM substantially outperforms both the INQUERY algorithm and the traditional LM.
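The abstract does not state how the article, author, and category LMs are combined. As a rough illustration only, the sketch below assumes a linear interpolation of unigram models under query-likelihood scoring, with a collection model added for smoothing. The function names, the mixture weights, and the collection model are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_lm(tokens):
    """Return a maximum-likelihood unigram model P(w) for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # An empty token list yields a model that assigns probability 0 everywhere.
    return lambda w: counts[w] / total if total else 0.0


def article_score(query, article_lm, author_lm, category_lm, collection_lm,
                  weights=(0.5, 0.2, 0.2, 0.1)):
    """Log query likelihood under an ASSUMED linear mixture of four models.

    The weights are illustrative placeholders, not values from the paper.
    """
    w_art, w_auth, w_cat, w_coll = weights
    score = 0.0
    for term in query:
        p = (w_art * article_lm(term) + w_auth * author_lm(term)
             + w_cat * category_lm(term) + w_coll * collection_lm(term))
        if p <= 0.0:
            return float("-inf")  # term unseen in every component model
        score += math.log(p)
    return score


# Toy data: two articles, each with an author history and a board category.
art1 = "python list sort".split()
auth1 = "python list sort python dict".split()
cat1 = "python java sort compile list".split()
art2 = "movie review tonight".split()
auth2 = "movie review tonight actor".split()
cat2 = "movie actor film review".split()
collection = unigram_lm(art1 + auth1 + cat1 + art2 + auth2 + cat2)

query = ["python", "sort"]
score1 = article_score(query, unigram_lm(art1), unigram_lm(auth1),
                       unigram_lm(cat1), collection)
score2 = article_score(query, unigram_lm(art2), unigram_lm(auth2),
                       unigram_lm(cat2), collection)
print(score1 > score2)
```

Under these assumptions, an article whose own text, author history, and category all contain the query terms scores higher than one that matches only through the collection model. A real system would tune the weights on held-out queries.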