Abstract
Current term recognition algorithms have centred mostly on the notion of term based on the assumption that terms are monoreferential and as such independent of context. The characteristics and behaviour of terms in real texts are, however, far removed from this ideal because factors such as text type or communicative situation greatly influence the linguistic realisation of a concept. Context, therefore, is important for the correct identification of terms (Dubuc and Lauriston, 1997). Based on this assumption, we have shifted our emphasis from terms towards surrounding linguistic context, namely verbs, as verbs are considered the central elements in the sentence. More specifically, we have set out to examine whether verbs, and verbal syntax in particular, could help us towards the task of automatic term recognition. Our findings suggest that term occurrence varies significantly in different argument structures and different syntactic positions. Additionally, deviant grammatical structures have proved rich environments for terms. The analysis was carried out in three different specialised subcorpora in order to explore how the effectiveness of verbal syntax as a potential indicator of term occurrence can be constrained by factors such as subject matter and text type.
References
Ananiadou S. (1994) A Methodology for Automatic Term Recognition. Proceedings of COLING-94, pp. 1034–1038.
Bonzi S. (1990) Syntactic Patterns in Scientific Sublanguages: A Study of Four Disciplines. Journal of the American Society for Information Science, 41, pp. 121–131.
Bourigault D. (1992) Surface Grammatical Analysis for the Extraction of Terminological Noun Phrases. Proceedings of COLING-92, pp. 977–981.
Bross I.D.J., Shapiro P.A., Anderson B.B. (1972) How Information is Carried in Scientific Sublanguages. Science, 176, pp. 1303–1307.
Butler B., Isaacs A. (eds.) (1993) A Dictionary of Finance. Oxford University Press, Oxford.
Cabré M. T., Estopa R., Vivaldi J. (2001) Automatic Term Detection: A Review of Current Systems. In Bourigault, D., Jacquemin, C., and L'Homme, M. (eds.), Recent Advances in Computational Terminology, John Benjamins Publishing Company, Amsterdam/Philadelphia, pp. 53–89.
Chafe W. (1970) Meaning and the Structure of Language. University of Chicago Press, Chicago.
Cohen D.J. (1995) Highlights: Language- and Domain-independent Automatic Indexing for Abstracting. Journal of the American Society for Information Science, 46(3), pp. 162–174.
Curzon L.B. (1982) A Dictionary of Law. McDonald and Evans Ltd., Plymouth.
Dagan I., Church K. (1995) Termight: Identifying and Translating Technical Terminology. Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, EACL'95, pp. 34–39.
Daille B., Gaussier E., Langé J.-M. (1994) Towards Automatic Extraction of Monolingual and Bilingual Terminology. Proceedings of the 15th International Conference on Computational Linguistics, COLING'94, pp. 515–521.
Damerau F.J. (1993) Generating and Evaluating Domain-oriented Multi-word Terms from Texts. Information Processing and Management, 29(4), pp. 433–447.
Dubuc R., Lauriston A. (1997) Terms and Contexts. In Wright, S.E. and Budin, G. (eds.), Handbook of Terminology Management, Volume 1. John Benjamins Publishing Company, Amsterdam/Philadelphia, pp. 80–87.
Enguehard C., Pantera L. (1994) Automatic Natural Acquisition of a Terminology. Journal of Quantitative Linguistics, 2(1), pp. 27–32.
Fillmore C. (1968) The Case for Case. In Bach, E. and Harms, R.T. (eds.), Universals in Linguistic Theory, North Holland, New York, pp. 1–88.
Frantzi K., Ananiadou S. (1996) Extracting Nested Collocations. Proceedings of the 16th International Conference on Computational Linguistics, COLING'96, pp. 41–46.
Graham G. (1995) The 3-D Visual Dictionary of Computing. Foster City CA, IDG Books from MaranGraphics.
Haas S.W., He S. (1993) Toward the Automatic Identification of Sublanguage Vocabulary. Information Processing and Management, 29(6), pp. 721–731.
Harris Z.S. (1968) Mathematical Structures of Language. Wiley, New York.
Herbst R. (1966) Dictionary of Commercial, Financial and Legal Terms. Translegal Ltd., Switzerland.
Hornby A.S. (1974) Oxford Advanced Learner's Dictionary of Current English. Oxford University Press, London.
Jacquemin C., Royaute J. (1994) Retrieving Terms and their Variants in a Lexicalised Unification-Based Framework. Proceedings 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'94), Springer Verlag, Dublin, Berlin, pp. 132–141.
Jackson K.G., Feinberg R. (1981) Dictionary of Electrical Engineering. Butterworth and Co., London.
Johansson S. (1989) Frequency Analysis of English Vocabulary and Grammar: Based on the LOB Corpus. Volume 1: Tag Frequencies and Word Frequencies. Clarendon, Oxford.
Kageura K., Yoshioka M., Nozue T. (1998) Towards a Common Testbed for Corpus-based Computational Terminology. Proceedings of Computerm'98, pp. 81–85.
Lauriston A. (1996) Automatic Term Recognition: Performance of Linguistic and Statistical Techniques. PhD thesis, University of Manchester Institute of Science and Technology, Manchester.
Leech G. (1993) 100 Million Words of English. English Today, 9, pp. 9–15.
Lehrberger J. (1986) Sublanguage Analysis. In Grishman, R. and Kittredge, R. (eds.), Analysing Language in Restricted Domains: Sublanguage Description and Processing, Lawrence Erlbaum Associates, Hillsdale, New Jersey, pp. 19–59.
Maynard D., Ananiadou S. (1999) Identifying Contextual Information for Multi-word Term Extraction. In Proceedings of TKE'99: Terminology and Knowledge Engineering, TermNet, Vienna, pp. 212–221.
Nkwenti-Azeh B. (1994) Positional and Combinatorial Characteristics of Terms: Consequences for Corpus-based Terminography. Terminology, 1(1), pp. 61–97.
Oueslati R., Frath P., Rousselot F. (1996) Term Identification and Knowledge Extraction. Proceedings of NLP + IA 96, Moncton, N.B., Canada, pp. 191–196.
Pearson J. (1998) Terms in Context. John Benjamins Publishing Company, Amsterdam/Philadelphia.
Smadja F.A. (1991) Retrieving Collocations from Text: Xtract. Computational Linguistics, 19(1), pp. 144–177.
Vollnhals O. (1984) Elsevier's Dictionary of Personal and Office Computing. Elsevier Science Publishers B.V., Netherlands.
Wüster E. (1978) Einführung in die Allgemeine Terminologielehre und Terminologische Lexikographie, 2 volumes. Springer, Wien.
Cite this article
Eumeridou, E., Nkwenti-Azeh, B. & McNaught, J. An Analysis of Verb Subcategorization Frames in Three Special Language Corpora with a View towards Automatic Term Recognition. Computers and the Humanities 38, 37–60 (2004). https://doi.org/10.1023/B:CHUM.0000009278.73498.f4