An Analysis of Verb Subcategorization Frames in Three Special Language Corpora with a View towards Automatic Term Recognition

Computers and the Humanities

Abstract

Current term recognition algorithms have centred mostly on the notion of term based on the assumption that terms are monoreferential and as such independent of context. The characteristics and behaviour of terms in real texts are, however, far removed from this ideal because factors such as text type or communicative situation greatly influence the linguistic realisation of a concept. Context, therefore, is important for the correct identification of terms (Dubuc and Lauriston, 1997). Based on this assumption, we have shifted our emphasis from terms towards surrounding linguistic context, namely verbs, as verbs are considered the central elements in the sentence. More specifically, we have set out to examine whether verbs, and verbal syntax in particular, could help us towards the task of automatic term recognition. Our findings suggest that term occurrence varies significantly in different argument structures and different syntactic positions. Additionally, deviant grammatical structures have proved rich environments for terms. The analysis was carried out in three different specialised subcorpora in order to explore how the effectiveness of verbal syntax as a potential indicator of term occurrence can be constrained by factors such as subject matter and text type.

References

  • Ananiadou S. (1994) A Methodology for Automatic Term Recognition. Proceedings of COLING-94, pp. 1034–1038.

  • Bonzi S. (1990) Syntactic Patterns in Scientific Sublanguages: A Study of Four Disciplines. Journal of the American Society for Information Science, 41, pp. 121–131.

  • Bourigault D. (1992) Surface Grammatical Analysis for the Extraction of Terminological Noun Phrases. Proceedings of COLING-92, pp. 977–981.

  • Bross I.D.J., Shapiro P.A., Anderson B.B. (1972) How Information is Carried in Scientific Sublanguages. Science, 176, pp. 1303–1307.

  • Butler B., Isaacs A. (eds.) (1993) A Dictionary of Finance. Oxford University Press, Oxford.

  • Cabré M. T., Estopa R., Vivaldi J. (2001) Automatic Term Detection: A Review of Current Systems. In Bourigault, D., Jacquemin, C., and L'Homme, M. (eds.), Recent Advances in Computational Terminology, John Benjamins Publishing Company, Amsterdam/Philadelphia, pp. 53–89.

  • Chafe W. (1970) Meaning and the Structure of Language. University of Chicago Press, Chicago.

  • Cohen D.J. (1995) Highlights: Language- and Domain-independent Automatic Indexing for Abstracting. Journal of the American Society for Information Science, 46(3), pp. 162–174.

  • Curzon L.B. (1982) A Dictionary of Law. McDonald and Evans Ltd., Plymouth.

  • Dagan I., Church K. (1995) Termight: Identifying and Translating Technical Terminology. Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, EACL'95, pp. 34–39.

  • Daille B., Gaussier E., Langé J.-M. (1994) Towards Automatic Extraction of Monolingual and Bilingual Terminology. Proceedings of the 15th International Conference on Computational Linguistics, COLING'94, pp. 515–521.

  • Damerau F.J. (1993) Generating and Evaluating Domain-oriented Multi-word Terms from Texts. Information Processing and Management, 29(4), pp. 433–447.

  • Dubuc R., Lauriston A. (1997) Terms and Contexts. In Wright, S.E. and Budin, G. (eds.), Handbook of Terminology Management, Volume 1. John Benjamins Publishing Company, Amsterdam/Philadelphia, pp. 80–87.

  • Enguehard C., Pantera L. (1994) Automatic Natural Acquisition of a Terminology. Journal of Quantitative Linguistics, 2(1), pp. 27–32.

  • Fillmore C. (1968) The Case for Case. In Bach, E. and Harms, R.T. (eds.), Universals in Linguistic Theory, North Holland, New York, pp. 1–88.

  • Franzi K., Ananiadou S. (1996) Extracting Nested Collocations. Proceedings of the 16th International Conference on Computational Linguistics, COLING'96, pp. 41–46.

  • Graham G. (1995) The 3-D Visual Dictionary of Computing. Foster City CA, IDG Books from MaranGraphics.

  • Haas S.W., He S. (1993) Toward the Automatic Identification of Sublanguage Vocabulary. Information Processing and Management, 29(6), pp. 721–731.

  • Harris Z.S. (1968) Mathematical Structures of Language. Wiley, New York.

  • Herbst R. (1966) Dictionary of Commercial, Financial and Legal Terms. Translegal Ltd., Switzerland.

  • Hornby A.S. (1974) Oxford Advanced Learner's Dictionary of Current English. Oxford University Press, London.

  • Jacquemin C., Royaute J. (1994) Retrieving Terms and their Variants in a Lexicalised Unification-Based Framework. Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'94), Springer Verlag, Dublin, Berlin, pp. 132–141.

  • Jackson K.G., Feinberg R. (1981) Dictionary of Electrical Engineering. Butterworth and Co., London.

  • Johansson S. (1989) Frequency Analysis of English Vocabulary and Grammar: Based on the LOB Corpus. Volume 1: Tag Frequencies and Word Frequencies. Clarendon, Oxford.

  • Kageura K., Yioshioka M., Nozue T. (1998) Towards a Common Testbed for Subcorpus-based Computational Terminology. Proceedings of Computerm'98, pp. 81–85.

  • Lauriston A. (1996) Automatic Term Recognition: Performance of Linguistic and Statistical Techniques. PhD thesis, University of Manchester Institute of Science and Technology, Manchester.

  • Leech G. (1993) 100 Million Words of English. English Today, 9, pp. 9–15.

  • Lehrberger J. (1986) Sublanguage Analysis. In Grishman, R. and Kittredge, R. (eds.), Analysing Language in Restricted Domains: Sublanguage Description and Processing, Lawrence Erlbaum Associates, Hillsdale, New Jersey, pp. 19–59.

  • Maynard D., Ananiadou S. (1999) Identifying Contextual Information for Multi-word Term Extraction. In Proceedings of TKE'99: Terminology and Knowledge Engineering, TermNet, Vienna, pp. 212–221.

  • Nkwenti-Azeh B. (1994) Positional and Combinatorial Characteristics of Terms: Consequences for Subcorpus-based Terminography. Terminology, 1(1), pp. 61–97.

  • Oueslati R., Frath P., Rousselot F. (1996) Term Identification and Knowledge Extraction. Proceedings of NLP + IA 96, Moncton, N.B., Canada, pp. 191–196.

  • Pearson J. (1998) Terms in Context. John Benjamins Publishing Company, Amsterdam/Philadelphia.

  • Smadja F.A. (1991) Retrieving Collocations from Text: Xtract. Computational Linguistics, 19(1), pp. 144–177.

  • Vollnhals O. (1984) Elsevier's Dictionary of Personal and Office Computing. Elsevier Science Publishers B.V., Netherlands.

  • Wüster E. (1978) Einführung in die Allgemeine Terminologielehre und Terminologische Lexikographie, 2 volumes. Springer, Wien.

Cite this article

Eumeridou, E., Nkwenti-Azeh, B. & McNaught, J. An Analysis of Verb Subcategorization Frames in Three Special Language Corpora with a View towards Automatic Term Recognition. Computers and the Humanities 38, 37–60 (2004). https://doi.org/10.1023/B:CHUM.0000009278.73498.f4

  • DOI: https://doi.org/10.1023/B:CHUM.0000009278.73498.f4
