Payoffs and pitfalls in using knowledge-bases for consumer health search

  • Jimmy
  • Guido Zuccon
  • Bevan Koopman
Knowledge Graphs and Semantics in Text Analysis and Retrieval

Abstract

Consumer health search (CHS) is a challenging domain: vocabulary mismatch and the considerable domain expertise required hamper people's ability to formulate effective queries. We posit that using knowledge bases for query reformulation may help alleviate this problem. How to exploit knowledge bases for effective CHS is nontrivial, involving a range of key choices and design decisions, many of which are not explored in the literature. Here we rigorously and empirically evaluate the impact these choices have on retrieval effectiveness. A state-of-the-art knowledge-base retrieval model, the Entity Query Feature Expansion model, was used to evaluate these choices, which include: which knowledge base to use (specialised vs. general purpose), how to construct the knowledge base, how to extract entities from queries and map them to entities in the knowledge base, which part of the knowledge base to use for query expansion, and whether to augment the knowledge base search process with relevance feedback. While knowledge base retrieval has been proposed as a solution for CHS, this paper delves into the finer details of doing this effectively, highlighting both payoffs and pitfalls. It aims to provide lessons to others in advancing the state of the art in CHS.
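
To make two of the choices listed above concrete (mapping query text to knowledge-base entities, and choosing which part of the knowledge base feeds the expansion), the following is a minimal sketch only. It is not the Entity Query Feature Expansion model and not the authors' implementation: the toy knowledge base, its field names (`aliases`, `categories`), and the functions `link_entities` and `expand_query` are all hypothetical, and entity linking is reduced to exact phrase matching.

```python
from typing import Dict, List, Sequence

# Hypothetical toy knowledge base: entity title -> fields usable for expansion.
KnowledgeBase = Dict[str, Dict[str, List[str]]]

TOY_KB: KnowledgeBase = {
    "Myocardial infarction": {
        "aliases": ["heart attack", "MI"],
        "categories": ["ischemic heart diseases"],
    },
    "Hypertension": {
        "aliases": ["high blood pressure"],
        "categories": ["vascular diseases"],
    },
}


def link_entities(query: str, kb: KnowledgeBase) -> List[str]:
    """Return titles of entities whose title or alias occurs in the query.

    Assumption: exact, case-insensitive phrase matching on whitespace
    boundaries stands in for a real entity linker.
    """
    padded = f" {query.lower()} "
    linked = []
    for title, fields in kb.items():
        surface_forms = [title] + fields.get("aliases", [])
        if any(f" {form.lower()} " in padded for form in surface_forms):
            linked.append(title)
    return linked


def expand_query(
    query: str,
    kb: KnowledgeBase,
    sources: Sequence[str] = ("title", "aliases"),
) -> List[str]:
    """Append terms from the chosen knowledge-base fields of linked entities."""
    terms = query.lower().split()
    for title in link_entities(query, kb):
        for source in sources:
            texts = [title] if source == "title" else kb[title].get(source, [])
            for text in texts:
                for term in text.lower().split():
                    if term not in terms:  # keep each term once, in order
                        terms.append(term)
    return terms


if __name__ == "__main__":
    print(expand_query("heart attack symptoms", TOY_KB))
    print(expand_query("heart attack symptoms", TOY_KB, sources=("categories",)))
```

Changing the `sources` argument switches which part of the knowledge base contributes expansion terms, which is the kind of design decision the paper evaluates empirically.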

Keywords

Knowledge base, Knowledge graph, Query expansion, Consumer health search

Notes

Acknowledgements

Jimmy is sponsored by the Indonesia Endowment Fund for Education (Lembaga Pengelola Dana Pendidikan/LPDP). Guido Zuccon is the recipient of an Australian Research Council DECRA Research Fellowship (DE180101579) and a Google Faculty Research Award.

Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Queensland University of Technology (QUT), Brisbane, Australia
  2. University of Surabaya (UBAYA), Surabaya, Indonesia
  3. Australian e-Health Research Centre, CSIRO, Canberra, Australia
