LG-Starship: A Framework for Text Analysis

Maisto, Alessandro

doi:10.1007/978-3-030-44038-1_38

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1150)

Included in the following conference series:

Workshops of the International Conference on Advanced Information Networking and Applications

Abstract

In this work we present a new framework for the analysis of Italian texts that can help linguists perform rapid text analysis. The framework, called LG-Starship, performs both statistical and rule-based analysis. The idea is to build modular software that includes the basic algorithms needed to perform different kinds of analysis. The framework will include a Preprocessing Module, a POS Tagging and Lemmatization Module, a Statistic Module, a Semantic Module based on Distributional Analysis algorithms, and a Syntactic Module, which analyzes the syntactic structure of a selected sentence and tags the verbs and their arguments with semantic labels. The objective of the framework is to build an “all-in-one” platform for NLP that allows any kind of user to perform basic and advanced text analysis.
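
As an illustration of the modular design described above, the following is a minimal sketch of how a preprocessing step, a statistics step, and a distributional step might be chained. It is not LG-Starship's code: the abstract does not specify the algorithms or the implementation language, so every function name here (`preprocess`, `term_frequencies`, `cooccurrence_vectors`, `cosine`, `analyze`) is hypothetical, and the window-based co-occurrence count is only one common way to build distributional vectors. POS tagging, lemmatization, and the syntactic module are omitted.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Hypothetical preprocessing step: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())


def term_frequencies(tokens):
    """Hypothetical statistics step: count how often each token occurs."""
    return Counter(tokens)


def cooccurrence_vectors(tokens, window=2):
    """Hypothetical semantic step: for each word, count the words that
    appear within `window` positions on either side of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def analyze(text, window=2):
    """Chain the steps: preprocessing, then statistics, then distributional vectors."""
    tokens = preprocess(text)
    return {
        "tokens": tokens,
        "frequencies": term_frequencies(tokens),
        "vectors": cooccurrence_vectors(tokens, window),
    }
```

Calling `analyze("il gatto mangia il pesce il cane mangia la carne")` returns the token list, the frequency table, and one co-occurrence vector per word; `cosine` can then compare two of those vectors, for example the ones for "gatto" and "cane". Keeping each step as a separate function mirrors the idea of independent modules that can be run alone or combined.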

Author information

Authors and Affiliations

University of Salerno, via Giovanni Paolo II, 132, 84084, Fisciano, SA, Italy
Alessandro Maisto

Corresponding author

Correspondence to Alessandro Maisto.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
Flora Amato
Department of Political Science, University of Campania Luigi Vanvitelli, Caserta, Italy
Francesco Moscato
Faculty of Business Administration, Rissho University, Tokyo, Japan
Tomoya Enokido
Department of Advanced Sciences, Hosei University, Tokyo, Japan
Makoto Takizawa

About this paper

Cite this paper

Maisto, A. (2020). LG-Starship: A Framework for Text Analysis. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds) Web, Artificial Intelligence and Network Applications. WAINA 2020. Advances in Intelligent Systems and Computing, vol 1150. Springer, Cham. https://doi.org/10.1007/978-3-030-44038-1_38

DOI: https://doi.org/10.1007/978-3-030-44038-1_38
Published: 31 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44037-4
Online ISBN: 978-3-030-44038-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
