
QeCSO: Design of hybrid Cuckoo Search based Query expansion model for efficient information retrieval

Sādhanā

Abstract

The web holds a vast amount of information that is updated every second, so finding a relevant document requires efficient filtering. Because the user’s need varies with location, intention and purpose, retrieving a suitable response is a challenge. Information retrieval techniques built on machine learning and deep learning models have been put forward to address this challenge. We propose the QeCSO algorithm for efficient retrieval of relevant responses. An Attention-based Bi-directional LSTM (ATT-BLSTM) improves retrieval of the relevant document by correlating the semantics of the query with those of the content. Expanding the query further yields a steep improvement in retrieval. To do this, the output of the ATT-BLSTM is given as input to a meta-heuristic algorithm, cuckoo search, which retrieves the exact term with which to expand the query and moves the search closer to an optimal solution. The approach is evaluated on the SQuAD 1.1 dataset and compared with other models using accuracy and F-measure as evaluation metrics. The results show that the proposed algorithm achieves an accuracy of 95.8% in retrieving relevant responses, with an increase in F-measure.
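The abstract does not give the encoding, fitness function or parameters used in QeCSO, so the following is only a minimal sketch of how a cuckoo search with Lévy flights could select query expansion terms. It assumes each candidate term already carries a relevance score (standing in here for the ATT-BLSTM output), that a nest is a set of k distinct terms, and that fitness is the sum of the selected scores. The function names, parameter values and example scores are all hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def fitness(nest, scores):
    """Assumed fitness: total relevance score of the chosen terms."""
    return sum(scores[t] for t in nest)


def mutate(nest, terms, n_swap, rng):
    """Swap n_swap terms of a nest for terms that are not yet in it."""
    new = list(nest)
    unused = [t for t in terms if t not in nest]
    for pos in rng.sample(range(len(new)), n_swap):
        if not unused:
            break
        new[pos] = unused.pop(rng.randrange(len(unused)))
    return new


def cuckoo_expand(scores, k=3, n_nests=8, pa=0.25, iters=200, seed=0):
    """Return k expansion terms found by a simple discrete cuckoo search."""
    rng = random.Random(seed)
    terms = list(scores)
    nests = [rng.sample(terms, k) for _ in range(n_nests)]
    for _ in range(iters):
        # A cuckoo lays an egg: perturb a random nest, with the number of
        # swapped terms driven by the size of a Lévy step.
        step = min(abs(levy_step(rng)), float(k))
        egg = mutate(nests[rng.randrange(n_nests)], terms,
                     min(k, 1 + int(step)), rng)
        # The egg replaces a randomly chosen nest only if it is fitter.
        j = rng.randrange(n_nests)
        if fitness(egg, scores) > fitness(nests[j], scores):
            nests[j] = egg
        # A fraction pa of the worst nests is abandoned and rebuilt at random.
        nests.sort(key=lambda n: fitness(n, scores), reverse=True)
        for idx in range(n_nests - int(pa * n_nests), n_nests):
            nests[idx] = rng.sample(terms, k)
    return max(nests, key=lambda n: fitness(n, scores))


if __name__ == "__main__":
    # Made-up candidate terms and scores, for illustration only.
    candidates = {"retrieval": 0.9, "weather": 0.1, "ranking": 0.8,
                  "recipe": 0.3, "index": 0.7, "holiday": 0.2}
    print(sorted(cuckoo_expand(candidates, k=3)))
```

Because the worst nests are the ones abandoned, the best nest found so far always survives an iteration. A real system would replace the summed score with a retrieval-quality measure computed on the expanded query.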

Author information

Corresponding author

Correspondence to J Felicia Lilian.

About this article

Cite this article

Lilian, J.F., Sundarakantham, K. & Shalinie, S.M. QeCSO: Design of hybrid Cuckoo Search based Query expansion model for efficient information retrieval. Sādhanā 46, 181 (2021). https://doi.org/10.1007/s12046-021-01706-0

  • DOI: https://doi.org/10.1007/s12046-021-01706-0
