WIDIT: Integrated Approach to HARD Topic Search

  • Kiduk Yang
  • Ning Yu
  • Hui Zhang
  • Shahrier Akram
  • Ivan Record
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)

Abstract

The Web Information Discovery Tool (WIDIT) Laboratory at the Indiana University School of Library and Information Science, whose basic approach is to combine multiple methods and to leverage multiple sources of evidence, participated in the HARD track of the 2005 Text REtrieval Conference (HARD-2005). Our goal was to investigate methods of dealing effectively with HARD topics by exploring a variety of query expansion strategies, the results of which were combined via an automatic fusion optimization process. We hypothesized that the “difficulty” of a topic is often due to a lack of appropriate query terms and/or the system's misguided emphasis on non-pivotal query terms. Thus, our first-tier solution was to devise a wide range of query expansion methods that not only enrich the query with useful term additions but also identify important query terms. Our automatic query expansion included such techniques as noun phrase extraction, synonym identification, definition term extraction, keyword extraction by overlapping sliding window, and Web query expansion. The results of automatic expansion were used to solicit user feedback, which was then used in a post-retrieval reranking process. The paper describes our participation in HARD-2005 and is organized as follows: Section 2 gives an overview of the HARD track, Section 3 describes the WIDIT approach to HARD-2005, and Section 4 discusses the results and their implications, followed by concluding remarks in Section 5.
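
As a rough illustration of what combining the results of several query expansion runs can look like, the sketch below fuses ranked lists with a weighted sum of min-max normalized scores, a common fusion formula. The abstract does not specify the fusion formula or how the weights are optimized, so everything here is an assumption for illustration: the function names, the run names, the scores, and the weights are hypothetical and are not taken from the WIDIT system.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1] so that runs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_sum_fusion(runs, weights):
    """Fuse runs by summing weighted, normalized scores per document.

    `runs` maps a run name to {doc_id: score}; `weights` maps a run name
    to its fusion weight (hypothetical values; runs without an entry get 1.0).
    """
    fused = defaultdict(float)
    for name, scores in runs.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weights.get(name, 1.0) * s
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Toy data: two hypothetical expansion runs over four documents.
runs = {
    "noun_phrase": {"d1": 2.0, "d2": 1.0, "d3": 0.0},
    "web_expansion": {"d2": 10.0, "d3": 5.0, "d4": 0.0},
}
weights = {"noun_phrase": 1.0, "web_expansion": 0.5}
ranking = weighted_sum_fusion(runs, weights)
```

In this toy setup a document retrieved by both runs (`d2`) can tie with the top document of a single run (`d1`), which is the kind of effect that tuning the weights is meant to control. How the weights are actually chosen is left open here.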

Keywords

Noun Phrase, Relevance Feedback, Query Term, Query Expansion, Query Length
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kiduk Yang (1)
  • Ning Yu (1)
  • Hui Zhang (1)
  • Shahrier Akram (1)
  • Ivan Record (1)

  1. School of Library and Information Science, Indiana University, Bloomington, U.S.A.
