Abstract
This paper presents an overview of the tenth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2022. BioASQ is an ongoing series of challenges that promotes advances in large-scale biomedical semantic indexing and question answering. In this edition, the challenge comprised the three established tasks a, b, and Synergy, and a new task named DisTEMIST for automatic semantic annotation and grounding of diseases from clinical content in Spanish; diseases are a key concept for semantic indexing and for search engines over literature and clinical records. This year, BioASQ received more than 170 distinct systems from 38 teams in total for the four tasks of the challenge. As in previous years, the majority of the competing systems outperformed the strong baselines, indicating the continuous advancement of the state of the art in this domain.
Acknowledgments
Google was a proud sponsor of the BioASQ Challenge in 2021. The tenth edition of BioASQ is also sponsored by Atypon Systems Inc. BioASQ is grateful to NLM for providing the baselines for task 10a and to the CMU team for providing the baselines for task 10b. The DisTEMIST track was supported by the Spanish Plan for the Advancement of Language Technologies (Plan TL) and the Secretaría de Estado de Digitalización e Inteligencia Artificial (SEDIA), by the European Union’s Horizon Europe Coordination & Support Action under Grant Agreement No 101058779, and by AI4PROFHEALTH (PID2020-119266RA-I00).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nentidis, A. et al. (2022). Overview of BioASQ 2022: The Tenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering. In: Barrón-Cedeño, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2022. Lecture Notes in Computer Science, vol 13390. Springer, Cham. https://doi.org/10.1007/978-3-031-13643-6_22
DOI: https://doi.org/10.1007/978-3-031-13643-6_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13642-9
Online ISBN: 978-3-031-13643-6
eBook Packages: Computer Science (R0)