Interactive Information Retrieval Using Clustering and Spatial Proximity

  • Anton Leuski
  • James Allan

Abstract

A web-based search engine responds to a user’s query with a list of documents. This list can be viewed as the engine’s model of the user’s idea of relevance—the engine ‘believes’ that the first document is the most likely to be relevant, the second is slightly less likely, and so on. We extend this idea to an interactive setting where the system accepts the user’s feedback and adjusts its relevance model. We develop three specific models that are integrated as part of a system we call Lighthouse. The models incorporate document clustering and a spring-embedding visualization of inter-document similarity. We show that if a searcher were to use Lighthouse in ways consistent with the model, the expected effectiveness improves—i.e., the relevant documents are found more quickly in comparison to existing methods.

Keywords: clustering, information organization, information retrieval, information visualization

Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Anton Leuski (1)
  • James Allan (2)
  1. Information Sciences Institute, University of Southern California, USA
  2. Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst, USA
