Pseudo-Relevance Feedback for Information Retrieval in Medicine Using Genetic Algorithms

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10752)

Abstract

Pseudo-Relevance Feedback is a method for improving search engine results. It automatically extracts information from the results of an initial search, uses that information to expand the original query, and runs the expanded query as a second search. In this paper, we apply a genetic algorithm to improve Pseudo-Relevance Feedback for searching medical texts. First, a set of candidate terms is constructed by extracting keywords from the documents returned by the initial search with the original query. Then, our proposed genetic algorithm selects seed terms from the candidate term set, and these are merged with the original query to create a new query. The new query is searched again, returning a final ranked list of documents. Experimental results on the TREC 2014 CDS dataset show that the proposed method outperforms a baseline Pseudo-Relevance Feedback method that does not use a genetic algorithm.
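The pipeline described in the abstract (initial search, candidate term extraction, genetic selection of expansion terms, second search) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy corpus, the term-frequency scorer, the function names, the genetic operators, and above all the fitness function are assumptions, since the abstract does not specify them.

```python
import random
from collections import Counter

# Toy corpus (hypothetical; the paper evaluates on the TREC 2014 CDS dataset).
DOCS = [
    "chest pain and shortness of breath suggest myocardial infarction",
    "myocardial infarction treatment includes aspirin and reperfusion",
    "aspirin reduces risk after myocardial infarction",
    "knee pain treatment with physical therapy",
    "shortness of breath in asthma patients",
]
STOPWORDS = {"and", "of", "in", "with", "after", "includes"}

def tokenize(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]

def score(query_terms, doc):
    # Plain term-frequency overlap; an assumed stand-in for a real ranking model.
    tf = Counter(tokenize(doc))
    return sum(tf[t] for t in query_terms)

def search(query_terms, docs):
    # Document indices ranked by descending score (ties keep corpus order).
    return sorted(range(len(docs)), key=lambda i: -score(query_terms, docs[i]))

def candidate_terms(query_terms, docs, top_k):
    # Keywords from the top-k documents of the initial search, minus query terms.
    feedback_ids = search(query_terms, docs)[:top_k]
    terms = []
    for i in feedback_ids:
        for t in tokenize(docs[i]):
            if t not in query_terms and t not in terms:
                terms.append(t)
    return terms, feedback_ids

def fitness(mask, candidates, query_terms, docs, feedback_ids):
    # ASSUMED fitness: how much better the expanded query scores the feedback
    # documents than the remaining ones, with a small penalty per added term.
    expanded = query_terms + [t for t, bit in zip(candidates, mask) if bit]
    rest = [i for i in range(len(docs)) if i not in feedback_ids]
    fb = sum(score(expanded, docs[i]) for i in feedback_ids) / len(feedback_ids)
    other = sum(score(expanded, docs[i]) for i in rest) / max(len(rest), 1)
    return fb - other - 0.05 * sum(mask)

def select_terms(candidates, evaluate, rng, pop_size=20, generations=30, p_mut=0.1):
    # Bit-mask genetic algorithm: truncation selection, one-point crossover,
    # bit-flip mutation. Requires at least two candidate terms.
    n = len(candidates)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
        pop = elite + children
    best = max(pop, key=evaluate)
    return [t for t, bit in zip(candidates, best) if bit]

def prf_with_ga(query, docs, top_k=3, seed=0):
    query_terms = tokenize(query)
    candidates, feedback_ids = candidate_terms(query_terms, docs, top_k)
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    evaluate = lambda mask: fitness(mask, candidates, query_terms, docs, feedback_ids)
    new_query = query_terms + select_terms(candidates, evaluate, rng)
    return new_query, search(new_query, docs)  # expanded query, final ranking

new_query, ranking = prf_with_ga("myocardial infarction", DOCS)
print(new_query, ranking)
```

The fitness used here relies only on the pseudo-relevant documents, because no relevance judgments are available at query time. A real system would likely use a stronger ranking function and a fitness designed for the collection at hand.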


Notes

  1. http://eric.ed.gov.

  2. http://www.findastronomy.com.

  3. http://www.infotopia.info.

  4. https://www.ncbi.nlm.nih.gov/gquery/.

  5. https://lucene.apache.org/.

  6. https://www.nlm.nih.gov/mesh/.

  7. https://www.ncbi.nlm.nih.gov/pmc/.

Acknowledgments

This work is funded by Vietnam National University at Ho Chi Minh City under grant number B2016-42-01.

Author information

Corresponding author

Correspondence to Lanh Nguyen.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Nguyen, L., Cao, T. (2018). Pseudo-Relevance Feedback for Information Retrieval in Medicine Using Genetic Algorithms. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10752. Springer, Cham. https://doi.org/10.1007/978-3-319-75420-8_38

  • DOI: https://doi.org/10.1007/978-3-319-75420-8_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75419-2

  • Online ISBN: 978-3-319-75420-8

  • eBook Packages: Computer Science (R0)
