A Natural Language Toolkit: Reconciling Theory with Practice

  • Branimir Boguraev
Part of the Studies in Linguistics and Philosophy book series (SLAP, volume 35)


Generalized Phrase Structure Grammar (GPSG) is a highly restrictive theory of natural language syntax, characterised by a complex interaction between its various rule types and constraints. Motivated by a desire for declarative semantics, the theory defines these as applying simultaneously in the process of licensing local trees. As a result, as far as practical implementations of GPSG are concerned, the theory loses its apparent efficient parsability and becomes computationally intractable. This paper describes one aspect of a UK collaborative effort to produce a general-purpose morphological and syntactic analyser for English within the theoretical framework of Generalized Phrase Structure Grammar, namely the development of a tractable grammatical formalism with clear semantics, capable of supporting the task of writing a substantial grammar. The paper outlines the intellectual and pragmatic background of the development effort and traces the incremental evolution of this formalism, following discussions of the fundamental issues of rule interpretation, the feature system, grammar organisation, parser strategy, the environment for grammar writing and support, and the construction of a lexicon linked to the grammar. Particular emphasis is placed on the question of how theoretical standpoints have been reconciled with practical constraints, and how the commitment to deliver a functional morphological and syntactic analyser with wide coverage of English has influenced the current state of the grammatical formalism.


Lexical Entry; Computational Linguistics; Default Rule; Grammar Formalism; Phrase Structure Grammar
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© D. Reidel Publishing Company 1988

Authors and Affiliations

  • Branimir Boguraev (1)
  1. Computer Laboratory, University of Cambridge, Cambridge, England
