Explaining Query Modifications

An Alternative Interpretation of Term Addition and Removal
  • Vera Hollink
  • Jiyin He
  • Arjen de Vries
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7224)


In the course of a search session, searchers often modify their queries several times. In most previous work analyzing search logs, the addition of terms to a query is identified with query specification and the removal of terms with query generalization. By analyzing the result sets that motivated searchers to make modifications, we show that this interpretation is not always correct. In fact, our experiments indicate that in the majority of cases the modifications have the opposite functions. Terms are often removed to get rid of irrelevant results matching only part of the query and thus to make the result set more specific. Similarly, terms are often added to retrieve more diverse results. We propose an alternative interpretation of term additions and removals and show that it explains the deviant modification behavior that was observed.
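As an illustration of the distinction drawn above, the following minimal sketch labels the change between two consecutive queries purely by their term sets. It is not the authors' method: the function name `classify_modification`, the label strings, and the whitespace tokenization are all assumptions made for this example. Under the traditional reading, an "addition" would be taken as specification and a "removal" as generalization; the abstract argues that this mapping often fails once the result sets are taken into account.

```python
def classify_modification(previous_query: str, current_query: str) -> str:
    """Label the change from previous_query to current_query by term sets.

    Hypothetical helper for illustration only: terms are lowercased and
    split on whitespace, which ignores phrases, stemming, and operators.
    """
    prev_terms = set(previous_query.lower().split())
    curr_terms = set(current_query.lower().split())

    if prev_terms == curr_terms:
        return "unchanged"
    if prev_terms < curr_terms:      # strict subset: only new terms appeared
        return "addition"
    if curr_terms < prev_terms:      # strict superset: only terms were dropped
        return "removal"
    if prev_terms & curr_terms:      # some terms kept, some swapped
        return "substitution"
    return "new"                     # no terms in common


if __name__ == "__main__":
    print(classify_modification("jaguar", "jaguar car"))
    print(classify_modification("red jaguar car", "jaguar car"))
```

A label produced this way describes only the surface form of the modification. Deciding whether the searcher intended to narrow or broaden the results would additionally require looking at the result list that prompted the change, as the abstract describes.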


Keywords: Query Term, Original Query, Result List, Search Session, Coherence Score





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Vera Hollink (1)
  • Jiyin He (1)
  • Arjen de Vries (1)
  1. Centrum Wiskunde en Informatica, Amsterdam, The Netherlands
