Encyclopedia of Social Network Analysis and Mining

2014 Edition
Editors: Reda Alhajj, Jon Rokne

Weblog Analysis

  • Matthias Hagen
  • Benno Stein
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-6170-8_129

Synonyms

Caching; Learning to rank; Partitioned index; Query auto-completion; Query expansion; Query log analysis; Query segmentation; Query sessions; Query spelling correction; Query suggestion; Search engine; Search missions; Sharding; User interaction

Glossary

Weblog

File containing the time-stamped interactions of users with a search engine (i.e., queries and clicks)

Query

Keywords submitted to a search engine

Click

Selecting a search result for further inspection

Index

A data structure that serves fast lookups. For web search, it is typically implemented as an inverted index that maps each keyword to the documents that contain it

Posting List

One line of an inverted index (i.e., for a given keyword, the documents that contain it)
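
As an illustration of the Index and Posting List entries, here is a minimal Python sketch of an inverted index. The function names `build_inverted_index` and `lookup`, the whitespace tokenization, and the sample documents are assumptions made for this example; a production search engine would use far more elaborate tokenization, compression, and index partitioning.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each keyword to the sorted list of IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumption: lowercase whitespace tokenization stands in for real parsing.
        for keyword in text.lower().split():
            index[keyword].add(doc_id)
    # Each sorted list of document IDs is one posting list.
    return {keyword: sorted(doc_ids) for keyword, doc_ids in index.items()}

def lookup(index, query):
    """Return the IDs of documents that contain every keyword of the query."""
    postings = [set(index.get(keyword, [])) for keyword in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

# Hypothetical toy collection, keyed by document ID.
documents = {
    1: "web search engine",
    2: "query log analysis",
    3: "search engine query log",
}
index = build_inverted_index(documents)
print(index["search"])             # posting list for the keyword "search"
print(lookup(index, "query log"))  # documents containing both keywords
```

Answering a multi-keyword query then amounts to intersecting the posting lists of its keywords, which is why the posting lists are kept sorted in this sketch.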

Definition

Weblog analysis is the task of mining interesting patterns from the user interactions logged by a web search engine. Such log files usually contain time-stamped submitted queries and clicks on results. At a search engine site, analyzing these...
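
To make the notion of time-stamped queries and clicks concrete, the following Python sketch counts how often each query occurs in a small log. The tab-separated record layout (timestamp, user ID, action, payload), the action labels `QUERY` and `CLICK`, and the sample lines are assumptions for illustration only; actual search engine logs differ in format and content.

```python
from collections import Counter
from datetime import datetime

def parse_line(line):
    """Split one log line into (timestamp, user_id, action, payload).

    Assumed layout: ISO timestamp, user ID, action, payload, tab-separated.
    """
    timestamp, user_id, action, payload = line.rstrip("\n").split("\t")
    return datetime.fromisoformat(timestamp), user_id, action, payload

def query_frequencies(lines):
    """Count how often each query string was submitted, ignoring case."""
    counts = Counter()
    for line in lines:
        _, _, action, payload = parse_line(line)
        if action == "QUERY":  # click records are skipped here
            counts[payload.lower()] += 1
    return counts

# Hypothetical sample log with three queries and one click.
sample_log = [
    "2013-05-06T10:00:00\tu1\tQUERY\tweblog analysis",
    "2013-05-06T10:00:07\tu1\tCLICK\thttp://example.org/a",
    "2013-05-06T10:02:30\tu2\tQUERY\tWeblog Analysis",
    "2013-05-06T10:03:00\tu2\tQUERY\tquery segmentation",
]
top_queries = query_frequencies(sample_log).most_common(1)
print(top_queries)
```

Query frequency counts of this kind are only a starting point; the parsed timestamps and user IDs are what further analyses of the log would build on.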


Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Matthias Hagen (1)
  • Benno Stein (1)
  1. Bauhaus-Universität Weimar, Weimar, Germany