A Firefly Algorithm-based Approach for Pseudo-Relevance Feedback: Application to Medical Database

  • Systems-Level Quality Improvement
Journal of Medical Systems

Abstract

Search queries rely heavily on incomplete and imprecise keywords whose sense is difficult to disambiguate, and this difficulty causes search systems to fail to retrieve the desired information. One of the most powerful and promising methods for overcoming this shortcoming and improving the performance of search engines is Query Expansion, whereby the user’s original query is augmented with new keywords that best characterize the user’s information needs and produce a more useful query. In this paper, a new Firefly Algorithm-based approach is proposed to enhance the retrieval effectiveness of query expansion while maintaining low computational complexity. In contrast to the existing literature, the proposed approach uses a Firefly Algorithm to find the best expanded query among a set of expanded query candidates. Moreover, this new approach allows the length of the expanded query to be determined empirically. Experimental results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient than state-of-the-art approaches.
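The idea of searching a set of expanded query candidates with a Firefly Algorithm can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the encoding, the fitness function, or the parameter values, so everything below is an assumption. The sketch uses the generic firefly movement rule (a dimmer firefly moves toward a brighter one with attractiveness beta0 * exp(-gamma * r^2), plus a small random step), a hypothetical function name `firefly_select`, made-up candidate term scores, and a toy fitness that simply sums the scores of the selected terms.

```python
import math
import random


def firefly_select(term_scores, k, n_fireflies=15, n_iter=40,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Pick k expansion terms with a firefly-style search (toy sketch)."""
    rng = random.Random(seed)
    terms = list(term_scores)
    dim = len(terms)

    def decode(pos):
        # Assumed encoding: the k largest coordinates select the terms.
        idx = sorted(range(dim), key=lambda d: pos[d], reverse=True)[:k]
        return [terms[d] for d in sorted(idx)]

    def fitness(pos):
        # Toy objective (assumption): total score of the selected terms.
        return sum(term_scores[t] for t in decode(pos))

    # Each firefly is a real-valued vector with one coordinate per term.
    swarm = [[rng.random() for _ in range(dim)] for _ in range(n_fireflies)]
    light = [fitness(p) for p in swarm]
    best_i = max(range(n_fireflies), key=lambda i: light[i])
    best_pos, best_fit = list(swarm[best_i]), light[best_i]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:
                    # Move firefly i toward the brighter firefly j.
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [a + beta * (b - a)
                                + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    light[i] = fitness(swarm[i])
                    if light[i] > best_fit:
                        best_pos, best_fit = list(swarm[i]), light[i]
    return decode(best_pos), best_fit


# Hypothetical data: a user query and scored candidate terms, such as
# might be extracted from the top-ranked (pseudo-relevant) documents.
original_query = ["heart", "attack"]
candidates = {"myocardial": 0.9, "infarction": 0.8, "cardiac": 0.7,
              "chest": 0.4, "pain": 0.3, "patient": 0.1}
added, score = firefly_select(candidates, k=3)
expanded_query = original_query + added
```

With an additive fitness like this one, sorting the terms by score would find the best subset directly; a metaheuristic becomes useful only when the fitness depends on the expanded query as a whole, for example on how well it retrieves documents. The length `k` is fixed here for simplicity, whereas the abstract states that the expanded query length is determined empirically.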

Author information

Corresponding author

Correspondence to Ilyes Khennak.

Additional information

This article is part of the Topical Collection on Systems-Level Quality Improvement

About this article

Cite this article

Khennak, I., Drias, H. A Firefly Algorithm-based Approach for Pseudo-Relevance Feedback: Application to Medical Database. J Med Syst 40, 240 (2016). https://doi.org/10.1007/s10916-016-0603-5

  • DOI: https://doi.org/10.1007/s10916-016-0603-5
