Overview of BioASQ 2022: The Tenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2022)

Abstract

This paper presents an overview of the tenth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2022. BioASQ is an ongoing series of challenges that promotes advances in the domain of large-scale biomedical semantic indexing and question answering. In this edition, the challenge was composed of the three established tasks a, b and Synergy, and a new task named DisTEMIST for automatic semantic annotation and grounding of diseases from clinical content in Spanish, a key concept for semantic indexing and search engines of literature and clinical records. This year, BioASQ received more than 170 distinct systems from 38 teams in total for the four different tasks of the challenge. As in previous years, the majority of the competing systems outperformed the strong baselines, indicating the continuous advancement of the state of the art in this domain.

Notes

  1. https://pubmed.ncbi.nlm.nih.gov/

  2. https://www.nlm.nih.gov/pubs/techbull/nd21/nd21_medline_2022.html

  3. https://plantl.mineco.gob.es

  4. https://doi.org/10.5281/zenodo.6458078

  5. https://scielo.org/

  6. https://doi.org/10.5281/zenodo.6408476

  7. https://doi.org/10.5281/zenodo.6458114

  8. https://huggingface.co/Wellcome/WellcomeBertMesh

  9. http://participants-area.bioasq.org/results/10a/

  10. http://participants-area.bioasq.org/Tasks/b/eval_meas_2021/

  11. http://participants-area.bioasq.org/results/10b/phaseA/

  12. http://participants-area.bioasq.org/results/10b/phaseB/

  13. http://participants-area.bioasq.org/results/synergy_v2022/

Acknowledgments

Google was a proud sponsor of the BioASQ Challenge in 2021. The tenth edition of BioASQ is also sponsored by Atypon Systems Inc. BioASQ is grateful to NLM for providing the baselines for task 10a and to the CMU team for providing the baselines for task 10b. The DisTEMIST track was supported by the Spanish Plan for the Advancement of Language Technologies (Plan TL) and the Secretaría de Estado de Digitalización e Inteligencia Artificial (SEDIA), by the European Union’s Horizon Europe Coordination & Support Action under Grant Agreement No. 101058779, and by AI4PROFHEALTH (PID2020-119266RA-I00).

Author information

Corresponding author

Correspondence to Anastasios Nentidis.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Nentidis, A. et al. (2022). Overview of BioASQ 2022: The Tenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering. In: Barrón-Cedeño, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2022. Lecture Notes in Computer Science, vol 13390. Springer, Cham. https://doi.org/10.1007/978-3-031-13643-6_22

  • DOI: https://doi.org/10.1007/978-3-031-13643-6_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13642-9

  • Online ISBN: 978-3-031-13643-6

  • eBook Packages: Computer Science, Computer Science (R0)
