Abstract
Ontologies have proven useful in Information Retrieval, and in recent years the biomedical informatics community has acknowledged their utility. However, building and updating ontologies manually is a long and tedious task. This paper proposes a system that allows any search engine to develop its own semantic layer by applying ontology learning techniques to Web snippets, and applies it to a well-known medical digital library, PubMed. The new system (SemPubMed) automatically builds new ontology fragments related to the user’s query and then reformulates queries using the new concepts in order to improve information retrieval. The system has undergone a twofold evaluation. On the one hand, we have evaluated the quality of the modular ontologies built by the system. On the other hand, we have studied how the semantic reformulation of the queries improves the quality of the results returned by PubMed, in terms of both precision and recall. The results show that adding a semantic layer to PubMed improves query reformulation and the predicted ranking score.
References
Baruzzo A, Casoto P, Challapalli P, Dattolo A, Pudota N, Tasso C (2009) Toward semantic digital libraries: exploiting Web 2.0 and semantic services in cultural heritage. J Digit Inf 10(6)
Ben Mustapha N, Aufaure M, Baazaoui Zghal H, Ben Ghezala H (2012) Modular ontological warehouse for adaptative information search. In: MEDI 2012, pp 79–90
Ben Mustapha N, Aufaure M-A, Baazaoui-Zghal H, Ben-Ghzala H (2011) Contextual ontology module learning from Web snippets and past user queries. In: Procs. of the 15th int. conf. on knowledge-based and intelligent information and engineering systems, KES’11, pp 538–547
Berland M, Charniak E (1999) Finding parts in very large corpora. In: Proceedings of the 37th annual meeting of the association for computational linguistics, ACL ’99, pp 57–64
Bettembourg C, Diot C, Burgun A, Dameron O (2012) GO2PUB: querying PubMed with semantic expansion of gene ontology terms. J Biomed Semant 3:7
Boldi P, Bonchi F, Castillo C, Vigna S (2011) Query reformulation mining: models, patterns and applications. J Inf Ret Arch 14(3):257–289
Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press
Corby O, Dieng-Kuntz R, Faron-Zucker C (2004) Querying the semantic web with Corese search engine. In: de Mántaras RL, Saitta L (eds) Proceedings of European conference on artificial intelligence (ECAI 2004). IOS Press, pp 705–709
Elloumi-Chaabene M, Ben Mustapha N, Baazaoui Zghal H, Moreno A, Sánchez D (2011) Semantic-based composition of modular ontologies applied to web query reformulation. In: Proceedings of the 6th international conference on software and data technologies ICSOFT, pp 305–308
Kafsi S, Ben Mustapha N, Baazaoui Zghal H, Moreno A (2012) Sem-PubMed: a semantic medical digital library that integrates ontology learning and query reformulation. KES, pp 1932–1941
Kiefer S, Rauch J, Albertoni R, Attene M, Giannini F, Marini S, Schneider L, Mesquita C, Xing X (2011) An ontology-driven search module for accessing chronic pathology literature. OTM Workshops, pp 382–391
Mastora A, Monopoli M, Kapidakis S (2008) Exploring query formulation and reformulation: a preliminary study to map users’ search behaviour. In: Christensen-Dalsgaard B et al (eds) ECDL 2008, LNCS 5173, pp 427–430
Mayr P, Mutschke P, Petras V (2007) Reducing semantic complexity in distributed digital libraries: treatment of term vagueness and document reranking. GESIS-IZ Social Science Information Centre, pp 213–234
Ferran N, Mor E, Minguillon J (2005) Towards personalization in digital libraries through ontologies. Libr Manag 26(4–5):206–217
Perez-Carballo J, Xie I (2011) Design principles of help systems for digital libraries. University of Wisconsin-Milwaukee
Price C, Summers R (2010) Decision support in large-scale healthcare information systems: the challenge of integrating ontologies. Int J Biomed Eng Technol 3(3–4):375–392
Sanchez D, Moreno A (2007) Bringing taxonomic structure to large digital libraries. Int J Metadata Semant Ontol 2(2):112–122
Sanchez D, Moreno A (2008) Pattern-based automatic taxonomy learning from the Web. AI Commun 21(1):27–48
Sanchez D, Moreno A (2008) Learning non-taxonomic relationships from web documents for domain ontology construction. Data Knowl Eng J 64:600–623
Sanchez D, Moreno A, Del Vasto-Terrientes L (2012) Learning relation axioms from text: an automatic Web-based approach. Expert Syst Appl 39:5792–5805
Suomela S, Kekalainen J (2005) Ontology as a search-tool: a study of real user’s query formulation with and without conceptual support. In: Proceedings of ECIR 2005. LNCS 3408. Springer, pp 315–329
Tan P-N, Steinbach M, Kumar V (2005) Introduction to data mining. Addison Wesley
Thinn Mya Mya Swe (2011) Intelligent information retrieval within digital library using domain ontology. Computer Science & Information Technology (CS & IT), vol 1–2
Turney P (2001) Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. In: Proceedings of the 12th European conference on machine learning, pp 491–510
Vallet D, Fernandez M, Castells P (2005) An ontology-based information retrieval model. In: Gomez-Perez A, Euzenat J (eds) Proceedings of ESWC 2005. LNCS 3532. Springer, pp 455–470
Yu H, Kim T, Oh J, Ko I, Kim S (2009) RefMed: relevance feedback retrieval system for PubMed. CIKM, pp 2099–2100
Zhao P, Zhang M, Yang D, Tang S (2005) Finding hidden semantics behind reference linkages: an ontological approach for scientific digital libraries. DASFAA, pp 699–710
Acknowledgements
This research work has been supported by the Spanish-Tunisian AECID project number A/030058/10, “A framework for the integration of Ontology Learning and Semantic Search”.
The authors acknowledge the work and contributions of Nesrine Ben Mustapha and Safa Kafsi in the previous stages of this work [10].
Cite this article
Baazaoui Zghal, H., Moreno, A. A system for information retrieval in a medical digital library based on modular ontologies and query reformulation. Multimed Tools Appl 72, 2393–2412 (2014). https://doi.org/10.1007/s11042-013-1527-4
DOI: https://doi.org/10.1007/s11042-013-1527-4