Weblog Analysis

Synonyms

Caching; Learning to rank; Partitioned index; Query auto-completion; Query expansion; Query log analysis; Query segmentation; Query sessions; Query spelling correction; Query suggestion; Search engine; Search missions; Sharding; User interaction

Glossary

Weblog:

File containing the time-stamped interactions of users with a search engine (i.e., queries and clicks)

Query:

Keywords submitted to a search engine

Click:

Selecting a search result for further inspection

Index:

A data structure that supports fast lookups. In web search it is typically implemented as an inverted index, which maps each keyword to the documents that contain it

Posting List:

One entry of an inverted index (i.e., for a given keyword, the list of documents that contain it)
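
The Index and Posting List entries above can be illustrated with a minimal sketch. It assumes whitespace tokenization and a tiny in-memory collection; the names build_inverted_index and lookup and the sample documents are hypothetical and serve only as illustration, not as a description of how any production search engine stores its index.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to its posting list (sorted ids of documents containing it)."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        for keyword in text.lower().split():
            postings[keyword].add(doc_id)
    return {keyword: sorted(ids) for keyword, ids in postings.items()}


def lookup(index, query):
    """Return the ids of documents that contain every keyword of the query."""
    keywords = query.lower().split()
    if not keywords:
        return []
    # Intersect the posting lists of all query keywords.
    result = set(index.get(keywords[0], []))
    for keyword in keywords[1:]:
        result &= set(index.get(keyword, []))
    return sorted(result)


# Hypothetical three-document collection.
documents = {
    1: "web search engine",
    2: "search log analysis",
    3: "web log mining",
}
index = build_inverted_index(documents)
```

Here index["search"] is the posting list for the keyword "search", and lookup(index, "web log") intersects two posting lists to answer a two-keyword query. Real indexes usually add compression, term positions, and ranking information, all of which this sketch leaves out.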

Definition

Weblog analysis is the task of mining interesting patterns from the user interactions logged by a web search engine. Such log files usually contain time-stamped submitted queries and clicks on results. At a search engine site, analyzing these...

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Hagen, M., Stein, B. (2014). Weblog Analysis. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6170-8_129
