Article

Journal of Intelligent Information Systems, Volume 34, Issue 3, pp 249-274

Passage extraction and result combination for genomics information retrieval

  • Qinmin Hu, Department of Computer Science & Engineering, York University
  • Jimmy Xiangji Huang, School of Information Technology, York University (corresponding author)

Abstract

In this paper, we first propose passage extraction algorithms that build indices from which more accurate passages can be returned as query answers. Second, we propose a basic and an improved result combination method, each of which combines the results retrieved from different indices in order to select and merge relevant passages as output. For passage extraction, we propose three new algorithms: paragraphParsed, sentenceParsed and wordSentenceParsed. For result combination, we propose a novel method that uses factor analysis to generate a better baseline result for combination by finding hidden common factors that can be used to estimate the importance of keywords and keyword associations. Finally, we report experimental results that confirm the effectiveness and superiority of the factor analysis based method for result combination. Our proposed approaches achieve excellent results on the TREC 2006 and 2007 Genomics data sets, suggesting a promising avenue for constructing high-performance information retrieval systems in biomedicine.

Keywords

  • Information retrieval
  • Passage extraction
  • Result combination
  • Linear regression
  • Factor analysis
  • Genomics