A Novel Ranking Technique Based on Page Queries
Keyword-based information retrieval finds webpages using queries composed of keywords. However, because keywords capture only part of the information a user needs, keyword-based methods often fail to return the intended results. Furthermore, users generally have to try several keyword selections, because they cannot know in advance which keywords will retrieve the information they actually want. In this paper, we propose a novel algorithm, called PQ_Rank, which finds intended webpages more accurately than existing keyword-based algorithms. To rank webpages more effectively, it considers not only keywords but also all of the words contained in webpages, which we call page queries. Experimental results show that PQ_Rank outperforms PageRank, the well-known algorithm used by Google, in terms of mean average precision (MAP), average recall, and normalized discounted cumulative gain (NDCG).
Keywords: Information retrieval, Page query, Grouping webpages, Ranking technique
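The abstract compares rankings by MAP, average recall, and NDCG but does not define them. As a rough illustration only, the sketch below computes the standard textbook forms of these three measures for a ranked list of pages. The function names, the binary relevance sets, the graded `gains` dictionary, and the log2 discount are assumptions made for this example; the paper's actual evaluation setup (cutoffs, gain values, test collection) may differ.

```python
import math


def average_precision(ranked, relevant):
    """Average precision for one query: the mean of precision@k taken at
    each rank k where a relevant page appears (assumed binary relevance)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, page in enumerate(ranked, start=1):
        if page in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant)


def mean_average_precision(runs):
    """MAP over several queries; runs is a list of (ranked, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def recall(ranked, relevant):
    """Fraction of the relevant pages that appear anywhere in the ranking."""
    if not relevant:
        return 0.0
    return len(set(ranked) & set(relevant)) / len(relevant)


def ndcg(ranked, gains, k=None):
    """NDCG with gain / log2(rank + 1) discounting (an assumed, common variant).
    gains maps a page to its graded relevance; unlisted pages count as 0."""
    k = len(ranked) if k is None else k
    dcg = sum(gains.get(page, 0) / math.log2(i + 1)
              for i, page in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking of four pages, two of which are relevant.
    ranked = ["a", "b", "c", "d"]
    relevant = {"a", "c"}
    print(average_precision(ranked, relevant))
    print(recall(ranked, relevant))
    print(ndcg(ranked, {"a": 1, "c": 1}))
```

Averaging `recall` over all test queries gives an average recall figure, and `mean_average_precision` does the same for average precision. All three measures reward rankings that place relevant pages near the top, which is why they are commonly used to compare two ranking algorithms on the same queries.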