Neural ParsCit: a deep learning-based reference string parser

Prasad, Animesh; Kaur, Manpreet; Kan, Min-Yen

doi:10.1007/s00799-018-0242-1

Published: 19 May 2018

International Journal on Digital Libraries, Volume 19, pages 323–337 (2018)

Abstract

We present a deep learning approach for a core digital libraries task: parsing bibliographic reference strings. We deploy the state-of-the-art long short-term memory (LSTM) neural network architecture, a variant of the recurrent neural network, to capture long-range dependencies in reference strings. We explore word embeddings and character-based word embeddings as an alternative to handcrafted features. We incrementally experiment with features, architectural configurations, and the diversity of the dataset. Our final model is an LSTM-based architecture that layers a linear-chain conditional random field (CRF) over the LSTM output. In extensive experiments on English in-domain (computer science) and out-of-domain (humanities) test cases, as well as on multilingual data, our results show a significant gain (\(p<0.01\)) over the reported state-of-the-art CRF-only parser.
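To illustrate what layering a linear-chain CRF over per-token scores buys, here is a minimal sketch of Viterbi decoding in plain Python. It is not the authors' implementation: the function name `viterbi_decode`, the two-label set, and every score below are hypothetical values chosen for illustration. The emission scores stand in for what an LSTM might output per token, and the transition scores stand in for learned CRF label-to-label weights.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions[i][y]   : score of label y for token i (assumed LSTM output)
    transitions[p][y] : score of moving from label p to label y
    """
    if not emissions:
        return []
    # best[i][y] holds the best score of any label path ending in y at token i.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[i - 1][y] is the best predecessor of label y at token i
    for scores in emissions[1:]:
        prev = best[-1]
        cur, ptr = {}, {}
        for y in labels:
            p = max(labels, key=lambda q: prev[q] + transitions[q][y])
            cur[y] = prev[p] + transitions[p][y] + scores[y]
            ptr[y] = p
        best.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for three tokens of a reference string.
labels = ["author", "title"]
emissions = [
    {"author": 2.0, "title": 0.0},
    {"author": 0.9, "title": 1.0},  # locally ambiguous token
    {"author": 2.0, "title": 0.0},
]
# Assumed transition weights: staying in a field is rewarded, switching is penalised.
transitions = {
    "author": {"author": 0.5, "title": -0.5},
    "title": {"author": -0.5, "title": 0.5},
}

# Per-token argmax ignores label context; Viterbi scores the whole sequence.
greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, labels)
print(greedy)   # ['author', 'title', 'author']
print(decoded)  # ['author', 'author', 'author']
```

In this toy case the per-token argmax flips the middle token to `title`, while the sequence-level decoding keeps it inside the `author` field because the transition scores outweigh the small emission difference. That is the kind of label consistency a CRF layer is meant to add on top of independent per-token predictions.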

Notes

http://www.mendeley.com/.
Code and data available at https://github.com/WING-NUS/Neural-ParsCit.
https://code.google.com/archive/p/word2vec/.
https://www.doi.org/.

Acknowledgements

We would like to acknowledge the support of the NExT research grant funds, supported by the National Research Foundation, Prime Minister’s Office, Singapore, under its IRC @ SG Funding Initiative. We would also like to gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX Titan X GPU used for this research. We also acknowledge Muthu Kumar Chandrasekaran for insightful feedback on data handling and editing, along with Kishaloy Halder and Wenqiang Lei for their help in the ongoing integration with the current ParsCit pipeline.

Author information

Authors and Affiliations

School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore, 11741, Singapore
Animesh Prasad, Manpreet Kaur & Min-Yen Kan

Corresponding author

Correspondence to Animesh Prasad.

About this article

Cite this article

Prasad, A., Kaur, M. & Kan, MY. Neural ParsCit: a deep learning-based reference string parser. Int J Digit Libr 19, 323–337 (2018). https://doi.org/10.1007/s00799-018-0242-1

Received: 18 October 2016
Revised: 04 April 2018
Accepted: 11 April 2018
Published: 19 May 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s00799-018-0242-1
