Database and Expert Systems Applications

Volume 4080 of the series Lecture Notes in Computer Science pp 842-852

Clustering of Search Engine Keywords Using Access Logs

  • Shingo Otsuka, Institute of Industrial Science, The University of Tokyo
  • Masaru Kitsuregawa, Institute of Industrial Science, The University of Tokyo

It has become possible for users to obtain many kinds of information simply by entering search keyword(s) that represent the topic they are interested in. However, users cannot always come up with appropriate search keyword(s). In this paper, we study methods of clustering search keywords using Web access logs (called panel logs), which are collected URL histories of Japanese users (called panels) selected without statistical deviation, in a manner similar to surveys of TV audience ratings. Unlike existing systems, in which related search keywords are extracted based on the set of URLs viewed by users after they enter their original search keyword(s), we propose two novel methods of clustering search keywords. One is based on Web communities (sets of similar Web pages); the other is based on the set of nouns obtained by morphological analysis of Web pages. According to the evaluation results, our proposed methods can extract more related search keywords than the method based on URLs.
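
The general idea of grouping keywords by the features of the pages viewed after a search can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering algorithm the authors use, so everything below is an assumption made for illustration: the Jaccard similarity, the single-link merging with a fixed threshold, the function names `jaccard` and `cluster_keywords`, and the toy data. Each keyword maps to a feature set, which could hold URLs, Web community identifiers, or nouns.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_keywords(features, threshold=0.5):
    """Group keywords whose feature sets are similar.

    features: dict mapping keyword -> set of features (URLs, community
    IDs, or nouns). Two keywords are linked when their Jaccard similarity
    is at least `threshold`; clusters are the connected components
    (single-link clustering). This is an illustrative choice, not the
    method described in the paper.
    """
    parent = {k: k for k in features}

    def find(k):
        # Follow parent links to the cluster representative.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for k in features:
        clusters.setdefault(find(k), set()).add(k)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy data: nouns extracted from pages viewed after each search.
noun_features = {
    "cheap flights": {"airline", "ticket", "fare", "airport"},
    "air tickets": {"airline", "ticket", "fare", "booking"},
    "ramen recipe": {"noodle", "soup", "recipe"},
}

clusters = cluster_keywords(noun_features, threshold=0.5)
print(clusters)
```

In this toy input, the first two keywords share three of five distinct nouns and end up in the same cluster, while the third stays alone. Swapping the noun sets for sets of URLs or community identifiers would leave the code unchanged; only the feature extraction differs.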