Semantic Facets for Scientific Information Retrieval

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 475)


We present an information retrieval system for scientific publications that lets users filter results by semantic facets. The system relies on sentence-level semantic annotations that identify specific semantic relations in texts, such as methods, definitions, and hypotheses, which correspond to common information needs related to scientific literature. The annotations are obtained with a rule-based method that identifies linguistic clues organized into a linguistic ontology. The system is implemented on the Solr search server and offers efficient search and navigation in scientific papers.
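The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clue patterns, the facet labels, and the field name `semantic_facet` are all hypothetical, and a flat dictionary of regular expressions stands in for the linguistic ontology. Only the Solr request parameters (`q`, `fq`, `facet`, `facet.field`, `wt`) are standard.

```python
import re
from urllib.parse import urlencode

# Hypothetical linguistic clues per facet. The paper's actual clues and
# ontology are not given in this text; these patterns are illustrative only.
CLUES = {
    "method": [r"\bwe (use|used|apply|applied)\b", r"\bmethod\b"],
    "definition": [r"\bis defined as\b", r"\brefers to\b"],
    "hypothesis": [r"\bwe hypothesi[sz]e\b", r"\bwe assume\b"],
}


def annotate(sentence):
    """Return the sorted facet labels whose clue patterns match the sentence."""
    return sorted(
        facet
        for facet, patterns in CLUES.items()
        if any(re.search(p, sentence, re.IGNORECASE) for p in patterns)
    )


def facet_query(text, facet=None, field="semantic_facet"):
    """Build a query string for a Solr /select request.

    Facet counts are requested on `field` (an assumed field name); when
    `facet` is given, a filter query restricts results to that facet value.
    """
    params = [
        ("q", text),
        ("facet", "true"),
        ("facet.field", field),
        ("wt", "json"),
    ]
    if facet is not None:
        params.append(("fq", f'{field}:"{facet}"'))
    return urlencode(params)


if __name__ == "__main__":
    print(annotate("We hypothesize that X is defined as Y."))
    print(facet_query("gene expression", facet="method"))
```

In this sketch each indexed sentence would carry its facet labels in a multi-valued field, so that selecting a facet in the interface translates into a single filter query on that field.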


Semantic annotation, Information retrieval, Faceted search, Semantic facets, Solr



We thank Benoît Macaluso of the Observatoire des Sciences et des Technologies (OST), Montreal, Canada, for harvesting and providing the PLOS dataset.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. CIRST, Université du Québec à Montréal, Montreal, Canada
