Trend and Behavior Detection from Web Queries
In this chapter, we demonstrate the type and nature of query characteristics that can be mined from web server logs. Based on a study of over half a million queries (spanning four academic years) to a university’s website, we show that the vocabulary (terms) generated from these queries does not follow a well-defined Zipf distribution. However, regularities in term frequency and ranking correlations suggest that piecewise polynomial data fits are reasonable for representing trends.
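As an illustration of the kind of analysis described above, the following is a minimal sketch that ranks query terms by frequency and fits a separate line to log(frequency) versus log(rank) on each rank segment. It is not the chapter's own procedure: the function names, the whitespace tokenization, the analyst-chosen `breakpoints`, and the use of degree-1 polynomials are all assumptions made for illustration. Under an ideal Zipf distribution every segment would share a slope near -1, so clearly different slopes across segments would be one sign of the departure from Zipf behavior reported here.

```python
import math
from collections import Counter


def rank_frequency(queries):
    """Return term frequencies sorted in descending order (rank 1 first).

    Assumes simple lowercase, whitespace-delimited tokenization.
    """
    counts = Counter(term for q in queries for term in q.lower().split())
    return sorted(counts.values(), reverse=True)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def piecewise_loglog_fit(freqs, breakpoints):
    """Fit one line per rank segment in log-log space.

    `breakpoints` are hypothetical rank cut-offs chosen by the analyst.
    Returns tuples (first_rank, last_rank, slope, intercept).
    """
    log_rank = [math.log(r) for r in range(1, len(freqs) + 1)]
    log_freq = [math.log(f) for f in freqs]
    edges = [0] + list(breakpoints) + [len(freqs)]
    fits = []
    for lo, hi in zip(edges, edges[1:]):
        if hi - lo >= 2:  # a line needs at least two points
            slope, intercept = fit_line(log_rank[lo:hi], log_freq[lo:hi])
            fits.append((lo + 1, hi, slope, intercept))
    return fits
```

In use, one would pass the logged query strings to `rank_frequency` and then compare the per-segment slopes returned by `piecewise_loglog_fit`. Higher-degree polynomials could be substituted for the per-segment lines if the curvature within a segment warrants it.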
Keywords: Search Engine, Word Pair, Word Association, Query Statement, Behavior Detection