Encyclopedia of Social Network Analysis and Mining

2018 Edition | Editors: Reda Alhajj, Jon Rokne

Weblog Analysis

  • Matthias Hagen
  • Benno Stein
Reference work entry
DOI: https://doi.org/10.1007/978-1-4939-7131-2_129

Synonyms

Glossary

Web log

File containing the time-stamped interactions of users with a search engine (i.e., queries and clicks).

Query

Keywords submitted to a search engine.

Click

Selecting a search result for further inspection.

Index

A data structure to serve fast lookups. For web search typically implemented as an inverted index that matches keywords to documents that contain the keyword.

Posting list

One line of an inverted index (i.e., for a given keyword, the documents that contain it).
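
The Index and Posting list entries can be illustrated with a small sketch. The following Python fragment builds a toy inverted index, assuming lowercase whitespace tokenization. The function names `build_inverted_index` and `lookup` and the sample documents are hypothetical and do not describe any particular search engine.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to its posting list: the sorted IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumption: simple lowercase whitespace tokenization.
        for keyword in text.lower().split():
            postings[keyword].add(doc_id)
    return {keyword: sorted(ids) for keyword, ids in postings.items()}


def lookup(index, query):
    """Return the IDs of documents that contain every keyword of the query."""
    keywords = query.lower().split()
    if not keywords:
        return []
    result = set(index.get(keywords[0], []))
    for keyword in keywords[1:]:
        # Intersect posting lists so only documents matching all keywords remain.
        result &= set(index.get(keyword, []))
    return sorted(result)


# Hypothetical toy collection.
documents = {
    1: "web log analysis",
    2: "query log mining",
    3: "web search engine",
}
index = build_inverted_index(documents)
print(index["log"])              # posting list for the keyword "log"
print(lookup(index, "web log"))  # documents containing both keywords
```

Each value of `index` plays the role of one posting list. Production indexes typically add compression and ranking information, which this sketch omits.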

Definition

Web log analysis is the task of mining interesting patterns from the user interactions logged by a web search engine. Such log files usually contain time-stamped submitted queries and clicks on results. At...
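
As a minimal sketch of such mining, the following Python fragment counts how often each query was submitted and how many clicks followed it. It assumes a hypothetical tab-separated log layout (timestamp, user ID, action, value) in chronological order. Real search engine logs differ in layout, and all names here are illustrative only.

```python
from collections import Counter

# Hypothetical log excerpt: timestamp, user ID, action, value (tab-separated).
SAMPLE_LOG = (
    "2016-07-06T10:00:00\tu1\tquery\tweblog analysis\n"
    "2016-07-06T10:00:05\tu1\tclick\thttp://example.org/a\n"
    "2016-07-06T10:01:00\tu2\tquery\tweblog analysis\n"
    "2016-07-06T10:02:00\tu2\tquery\tinverted index\n"
    "2016-07-06T10:02:07\tu2\tclick\thttp://example.org/b\n"
)


def mine_log(log_text):
    """Return (query frequencies, clicks per query) for a chronological log."""
    query_counts = Counter()
    click_counts = Counter()
    last_query = {}  # most recent query of each user
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        timestamp, user, action, value = line.split("\t")
        if action == "query":
            query_counts[value] += 1
            last_query[user] = value
        elif action == "click" and user in last_query:
            # Assumption: a click belongs to the user's most recent query.
            click_counts[last_query[user]] += 1
    return query_counts, click_counts


query_counts, click_counts = mine_log(SAMPLE_LOG)
print(query_counts.most_common())
print(click_counts.most_common())
```

Attributing a click to the user's latest query is a simplification. It ignores session boundaries and result ranks.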

Copyright information

© Springer Science+Business Media LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Bauhaus-Universität Weimar, Weimar, Germany

Section editors and affiliations

  • Thomas Gottron (1)
  • Stefan Schlobach (2)
  • Steffen Staab (3)

  1. Institute for Web Science and Technologies, Universität Koblenz-Landau, Koblenz, Germany
  2. VU, Amsterdam, The Netherlands
  3. Institute for Web Science and Technologies, Universität Koblenz-Landau, Koblenz, Germany