Abstract
The web holds a vast amount of information that is updated every second, so finding a relevant document requires efficient screening. Because a user's needs vary with location, intention and purpose, retrieving a suitable response is a challenge. To address this challenge, information retrieval techniques have been combined with machine learning and deep learning models. We propose the QeCSO algorithm for efficient retrieval of relevant responses. An Attention-based Bi-directional LSTM (ATT-BLSTM) improves retrieval of the relevant document by correlating the semantics of the query with those of the content. Expanding the query further yields a steep improvement in retrieval. To do this, the output of the ATT-BLSTM is given as input to a meta-heuristic algorithm, cuckoo search, which identifies the exact term with which to expand the query and moves the search closer to an optimal solution. We compare the performance of our approach with that of other models using accuracy and F-measure as evaluation metrics, applying the model to the SQuAD 1.1 dataset. The results show that the proposed algorithm achieves an accuracy of 95.8% in retrieving relevant responses, together with an increase in F-measure.
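The abstract does not spell out how cuckoo search picks the expansion terms, so the following is only a generic sketch of cuckoo search with Lévy flights, not the QeCSO implementation. It assumes each candidate term already carries a relevance score (standing in for the ATT-BLSTM output), that a nest is a vector of per-term weights, and that the top `k` weighted terms form the expansion. The function names, parameters, terms and scores are all hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    # Guard against a zero denominator in the (practically impossible) case v == 0.
    return u / max(abs(v), 1e-12) ** (1 / beta)


def select_terms(weights, terms, k):
    """Return the k candidate terms with the largest weights."""
    ranked = sorted(range(len(terms)), key=lambda i: weights[i], reverse=True)
    return [terms[i] for i in ranked[:k]]


def fitness(weights, terms, relevance, k):
    """Toy fitness (an assumption): summed relevance of the selected terms."""
    return sum(relevance[t] for t in select_terms(weights, terms, k))


def cuckoo_search(terms, relevance, k=2, n_nests=10, pa=0.25,
                  alpha=0.1, iters=200, seed=0):
    """Search for k expansion terms; returns (chosen_terms, best_fitness)."""
    rng = random.Random(seed)
    dim = len(terms)

    def random_nest():
        return [rng.random() for _ in range(dim)]

    nests = [random_nest() for _ in range(n_nests)]
    scores = [fitness(n, terms, relevance, k) for n in nests]

    for _ in range(iters):
        # Lay a new egg: perturb a random nest with a Levy flight, clipped to [0, 1].
        i = rng.randrange(n_nests)
        candidate = [min(1.0, max(0.0, x + alpha * levy_step(rng)))
                     for x in nests[i]]
        cand_score = fitness(candidate, terms, relevance, k)

        # Compare with a randomly chosen nest and keep the better solution.
        j = rng.randrange(n_nests)
        if cand_score > scores[j]:
            nests[j], scores[j] = candidate, cand_score

        # Abandon a fraction pa of the worst nests and rebuild them at random.
        worst = sorted(range(n_nests), key=lambda n: scores[n])[:int(pa * n_nests)]
        for w in worst:
            nests[w] = random_nest()
            scores[w] = fitness(nests[w], terms, relevance, k)

    best = max(range(n_nests), key=lambda n: scores[n])
    return select_terms(nests[best], terms, k), scores[best]


if __name__ == "__main__":
    # Invented candidate terms and scores for illustration only.
    relevance = {"paris": 0.9, "city": 0.6, "europe": 0.5,
                 "weather": 0.1, "football": 0.05}
    chosen, score = cuckoo_search(list(relevance), relevance, k=2)
    print("expanded query terms:", chosen, "fitness:", score)
```

With a fitness this simple, sorting the terms by score would give the same answer directly; a meta-heuristic becomes worthwhile only when the fitness depends on how the chosen terms interact, for example through the retrieval quality of the expanded query as a whole.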
Cite this article
Lilian, J.F., Sundarakantham, K. & Shalinie, S.M. QeCSO: Design of hybrid Cuckoo Search based Query expansion model for efficient information retrieval. Sādhanā 46, 181 (2021). https://doi.org/10.1007/s12046-021-01706-0
DOI: https://doi.org/10.1007/s12046-021-01706-0