MeSHLabeler and DeepMeSH: Recent Progress in Large-Scale MeSH Indexing
The US National Library of Medicine (NLM) uses the Medical Subject Headings (MeSH) (see Note 1) to index almost all 24 million citations in MEDLINE, which greatly facilitates biomedical information retrieval and text mining. Large-scale automatic MeSH indexing poses challenges on two sides: the MeSH side and the citation side. On the MeSH side, each citation is annotated with only 12 (on average) of all 28,000 MeSH terms. On the citation side, all existing methods, including the Medical Text Indexer (MTI) developed by NLM, represent text as a bag of words, which cannot capture semantic and context-dependent information well. To address these two challenges, we developed MeSHLabeler and DeepMeSH. Using a “learning to rank” (LTR) framework, MeSHLabeler integrates multiple types of information to address the challenge on the MeSH side, while DeepMeSH integrates deep semantic representation to address the challenge on the citation side. MeSHLabeler achieved first place in both the BioASQ2 and BioASQ3 challenges, and DeepMeSH achieved first place in both the BioASQ4 and BioASQ5 challenges. DeepMeSH is available at http://datamining-iip.fudan.edu.cn/deepmesh.
Key words: MeSH indexing, Text categorization, Multi-label classification, Medical subject headings, MEDLINE, Machine learning
This work has been partially supported by the National Natural Science Foundation of China (Grant No. 61572139), MEXT KAKENHI #16H02868, and FiDiPro by Tekes.