Abstract
Based on state of the art machine learning techniques, GROBID (GeneRation Of BIbliographic Data) performs reliable bibliographic data extractions from scholar articles combined with multi-level term extractions. These two types of extraction present synergies and correspond to complementary descriptions of an article. This tool is viewed as a component for enhancing the existing and the future large repositories of technical and scientific publications.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Peng, F., McCallum, A.: Accurate Information Extraction from Research Papers using Conditional Random Fields. In: Proceedings of HLT-NAACL (2004)
McCallum, A., Kachites, A.: MALLET: A Machine Learning for Language Toolkit (2002)
Tomokiyo, T., Hurst, M.: A language model approach to keyphrase extraction. In: Proceedings of ACL Workshop on Multiword Expressions (2003)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lopez, P. (2009). GROBID: Combining Automatic Bibliographic Data Recognition and Term Extraction for Scholarship Publications. In: Agosti, M., Borbinha, J., Kapidakis, S., Papatheodorou, C., Tsakonas, G. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2009. Lecture Notes in Computer Science, vol 5714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04346-8_62
Download citation
DOI: https://doi.org/10.1007/978-3-642-04346-8_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04345-1
Online ISBN: 978-3-642-04346-8
eBook Packages: Computer ScienceComputer Science (R0)