Abstract
Natural language processing is used for solving a wide variety of problems. Some scholars and interest groups working with language resources are not well versed in programming, so there is a need for a good graphical framework that allows users to quickly design and test natural language processing pipelines without the need for programming. The existing frameworks do not satisfy all the requirements for such a tool. We, therefore, propose a new framework that provides a simple way for its users to build language processing pipelines. It also allows a simple programming language agnostic way for adding new modules, which will help the adoption by natural language processing developers and researchers. The main parts of the proposed framework consist of (a) a pluggable Docker-based architecture, (b) a general data model, and (c) APIs description along with the graphical user interface. The proposed design is being used for implementation of a new natural language processing framework, called ANGLEr.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
The number of downloads was recorded by https://sourceforge.net.
References
Apache: Opennlp (2010). http://opennlp.apache.org
Bird, S., Loper, E.: NLTK: the natural language toolkit. Association for Computational Linguistics (2004)
Cunningham, H.: Gate, a general architecture for text engineering. Comput. Humanit. 36(2), 223–254 (2002)
Demšar, J., et al.: Orange: data mining toolbox in Python. J. Mach. Learn. Res. 14(1), 2349–2353 (2013)
Ferrucci, D., Lally, A.: UIMA: an architectural approach to unstructured information processing in the corporate research environment. Nat. Lang. Eng. 10(3–4), 327–348 (2004)
Nance, J., Baumgartner, P.: gobbli: a uniform interface to deep learning for text in Python. J. Open Source Softw. 6(62), 2395 (2021)
Qi, P., Zhang, Y., Zhang, Y., Bolton, J., Manning, C.D.: Stanza: a Python natural language processing toolkit for many human languages. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 101–108 (2020)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Knez, T., Bajec, M., Žitnik, S. (2022). ANGLEr: A Next-Generation Natural Language Exploratory Framework. In: Guizzardi, R., Ralyté, J., Franch, X. (eds) Research Challenges in Information Science. RCIS 2022. Lecture Notes in Business Information Processing, vol 446. Springer, Cham. https://doi.org/10.1007/978-3-031-05760-1_53
Download citation
DOI: https://doi.org/10.1007/978-3-031-05760-1_53
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05759-5
Online ISBN: 978-3-031-05760-1
eBook Packages: Computer ScienceComputer Science (R0)