Abstract
To teach nominal phraseological expressions to foreign-language learners, we must first build a corpus from which the target sequences can be extracted. Modeling and disambiguation are at the heart of this extraction. In this article, we discuss how these two procedures are set up and show how the data are processed in NooJ. At the end of the article, our quantitative and qualitative analyses show that the extraction gives positive results.
Notes
1. In NooJ, angle brackets (<>) placed around a term allow us to find all the occurrences of this term and its variants.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Yang, T. (2018). Automatic Extraction of the Phraseology Through NooJ. In: Mbarki, S., Mourchid, M., Silberztein, M. (eds) Formalizing Natural Languages with NooJ and Its Natural Language Processing Applications. NooJ 2017. Communications in Computer and Information Science, vol 811. Springer, Cham. https://doi.org/10.1007/978-3-319-73420-0_14
DOI: https://doi.org/10.1007/978-3-319-73420-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73419-4
Online ISBN: 978-3-319-73420-0
eBook Packages: Computer Science (R0)