Abstract
Effective methods for selecting query expansion terms are important for improving the accuracy and efficiency of Pseudo-Relevance Feedback (PRF) based automatic query expansion in information retrieval systems. These methods remove irrelevant and redundant terms from the top-ranked feedback documents retrieved for a user query. Individual term selection methods have been widely investigated as a way to improve PRF performance. However, it remains challenging to find a single term selection method that outperforms the others in most cases. In this paper, we first explore how far overall performance can be improved using individual term selection methods. Second, we propose a model that combines multiple term selection methods using a variety of rank combination approaches. Third, semantic filtering is used to remove semantically irrelevant terms from the combined output of the term selection methods. Fourth, a genetic algorithm is used to find an optimal combination of the query terms and the candidate expansion terms produced by rank combination and semantic filtering. Our experimental results demonstrate that the proposed approaches achieve a significant improvement over each individual term selection method and over related state-of-the-art approaches.
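As an illustration of the rank combination step, the following is a minimal sketch that merges ranked candidate term lists with a Borda count. The abstract does not say which rank combination approaches the paper uses, so the choice of Borda count, the function name `borda_aggregate`, and the sample term lists are all assumptions made for this example only.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine several ranked term lists into one list with a Borda count.

    Each ranking lists candidate terms from best to worst. A term at
    zero-based position i in a list of n terms earns n - i points; a term
    absent from a list earns nothing from that list.
    (Assumption: Borda count stands in for the paper's unspecified
    rank combination approaches.)
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, term in enumerate(ranking):
            scores[term] += n - position
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda term: (-scores[term], term))


# Hypothetical rankings produced by three term selection methods.
rankings = [
    ["retrieval", "index", "query", "rank"],
    ["query", "retrieval", "rank", "index"],
    ["retrieval", "query", "corpus", "index"],
]
combined = borda_aggregate(rankings)
print(combined)
```

In this sketch a term that ranks well under several methods rises to the top of the combined list, while a term proposed by only one method is pushed down. That is the general motivation for combining selection methods rather than relying on any single one.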
Rights and permissions
This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
About this article
Cite this article
Singh, J. Ranks Aggregation and Semantic Genetic Approach based Hybrid Model for Query Expansion. Int J Comput Intell Syst 10, 34–55 (2017). https://doi.org/10.2991/ijcis.2017.10.1.4
DOI: https://doi.org/10.2991/ijcis.2017.10.1.4