Abstract
In this paper, a method for the automatic identification of research data references in publications is proposed for automatically generating research data repositories. The International Conference on Language Resources and Evaluation (LREC) requires authors to list research data references separately from other publication references. The goal of our research is to automate the discrimination process. We investigated the reference lists in LREC papers and the citation contexts to find characteristic features that are useful for identifying research data references. We confirmed that key phrases appeared in the citation contexts and the bibliographical elements in the reference lists. Our proposed method uses the presence or absence of key phrases to identify research data references. Experiments on LREC proceedings papers proved the effectiveness of using key phrases in the citation context.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
LREC Author’s kit (2016). https://www.lrec2016.lrec-conf.org/en/submission/authors-kit/
Ahtaridis, E., Cieri, C., DiPersio, D.: LDC language resource database: Building a bibliographic database. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), pp. 1723–1728. European Language Resources Association (ELRA), Istanbul, May 2012
Calzolari, N. et al. (eds.): Proceedings of LREC 2016, 2018, and 2020. http://www.lrec-conf.org/proceedings/
Choukri, K., Arranz, V., Hamon, O., Park, J.: Using the international standard language resource number: practical and technical aspects. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), pp. 50–54. European Language Resources Association (ELRA), Istanbul, May 2012
Kozawa, S., Tohyama, H., Uchimoto, K., Matsubara, S.: Collection of usage information for language resources from academic articles. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010). European Language Resources Association (ELRA), Valletta, May 2010
Mapelli, V., Popescu, V., Liu, L., Choukri, K.: Language resource citation: the ISLRN dissemination and further developments. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pp. 1610–1613. European Language Resources Association (ELRA), Portorož, May 2016
Namba, H.: Construction of an academic resource repository. In: Proceedings of Toward Effective Support for Academic Information Search Workshop, pp. 8–14 (2018)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Tohyama, H., Kozawa, S., Uchimoto, K., Matsubara, S., Isahara, H.: Construction of an infrastructure for providing users with suitable language resources. In: Coling 2008: Companion volume: Posters, pp. 119–122. Coling 2008 Organizing Committee, Manchester, August 2008
Zinn, C.: Squib: The language resource switchboard. Comput. Linguist. 44(4), 631–639 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ikoma, T., Matsubara, S. (2020). Identification of Research Data References Based on Citation Contexts. In: Ishita, E., Pang, N.L.S., Zhou, L. (eds) Digital Libraries at Times of Massive Societal Transition. ICADL 2020. Lecture Notes in Computer Science(), vol 12504. Springer, Cham. https://doi.org/10.1007/978-3-030-64452-9_13
Download citation
DOI: https://doi.org/10.1007/978-3-030-64452-9_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64451-2
Online ISBN: 978-3-030-64452-9
eBook Packages: Computer ScienceComputer Science (R0)