University of Otago at INEX 2010

  • Xiang-Fei Jia
  • David Alexander
  • Vaughn Wood
  • Andrew Trotman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6932)


In this paper, we describe the University of Otago’s participation in the Ad Hoc, Link-the-Wiki, Efficiency, and Data Centric Tracks of INEX 2010. In the Link-the-Wiki Track, we show that the simpler relevance summation method works better for producing Best Entry Points (BEP). In the Ad Hoc Track, we discuss the effect of various stemming algorithms. In the Efficiency Track, we compare three query pruning algorithms and discuss other efficiency-related issues. Finally, in the Data Centric Track, we compare the BM25 and Divergence ranking functions.


Keywords (added by machine, not by the authors): Term Frequency, Jaccard Index, Query Evaluation, Relevance Score, Pruning Algorithm





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Xiang-Fei Jia (1)
  • David Alexander (1)
  • Vaughn Wood (1)
  • Andrew Trotman (1)
  1. Computer Science, University of Otago, Dunedin, New Zealand
