
Abstract

Part I has explained the mechanism of natural communication, based on the [2+1] level structure of the Slim theory of language and the different kinds of language signs. Part II turns to the combinatorial buildup of complex signs within the syntax component of grammar. The methods are those of formal language theory, a wide field reaching far into the foundations of mathematics and logic. The purpose here is to introduce the linguistically relevant concepts and formalisms as simply as possible, explaining their historical origin and motivation as well as their respective strengths and weaknesses. Formal proofs are kept to a minimum.


Notes

  1. In formal language theory, the lexicon of an artificial language is sometimes called the alphabet, a word a letter, and a sentence a word. From a linguistic point of view this practice is unnecessarily misleading. Therefore a basic expression of an artificial or a natural language is here uniformly called a word (even if the word consists of only a single letter, e.g., a), and a complete well-formed expression is here uniformly called a sentence (even if it consists only of a sequence of one-letter words, e.g., aaabbb).

  2. In other words: the free monoid over LX equals \(\mathrm{LX}^{\mathsf{+}} \cup \{\varepsilon\}\) (Harrison 1978, p. 3).
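As a concrete illustration (a minimal Python sketch, not part of the text): for a toy lexicon LX = {a, b}, the positive closure LX+ contains all non-empty concatenations, while the Kleene closure LX* adds the empty sequence ε.

```python
from itertools import product

def positive_closure(lexicon, max_len):
    """Enumerate LX+ : all non-empty concatenations up to max_len words."""
    for n in range(1, max_len + 1):
        for combo in product(lexicon, repeat=n):
            yield "".join(combo)

def kleene_closure(lexicon, max_len):
    """Enumerate LX* = LX+ ∪ {ε}, i.e., the free monoid over the lexicon."""
    yield ""  # the empty sequence ε
    yield from positive_closure(lexicon, max_len)

lx = ["a", "b"]
print(list(kleene_closure(lx, 2)))
# → ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The only difference between the two closures is the single element ε, exactly as the note states.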

  3. The subsets of infinite sets may themselves be infinite. For example, the even numbers 2, 4, 6, … form an infinite subset of the natural numbers 1, 2, 3, 4, 5, …. The latter are formed from the finite lexicon of the digits 1, 2, 3, 4, 5, 6, 7, 8, 9, and 0 by means of concatenation, e.g., 12 or 21.

  4. This is because an explicit list of the well-formed sentences is by nature finite. It would therefore be impossible to list, for example, all the natural numbers. Instead, the infinitely many surfaces of possible natural numbers are produced from the digits via the structural principle of concatenation.
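The point can be made concrete with a small sketch (illustrative Python, assuming nothing beyond the note itself): rather than attempting an impossible finite list, the surfaces of the natural numbers are enumerated by concatenating digits.

```python
from itertools import count, product

DIGITS = "0123456789"

def number_surfaces():
    """Generate the surfaces of the natural numbers by concatenating
    digits of increasing length, instead of listing them all."""
    for length in count(1):
        for combo in product(DIGITS, repeat=length):
            if combo[0] != "0":  # exclude leading zeros
                yield "".join(combo)

gen = number_surfaces()
print([next(gen) for _ in range(12)])
# → ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
```

The generator never terminates, reflecting that the language of number surfaces is infinite although its lexicon (the ten digits) is finite.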

  5. A detailed introduction to PS Grammar is given in Chap. 8.

  6. This latter approach is taken by Left-Associative Grammar (LA Grammar, Chap. 10).

  7. An algebraic definition of C Grammar is provided in 7.4.2, of PS Grammar in 8.1.1, and of LA Grammar in 10.2.1, respectively.

  8. The relation between C and PS Grammar is described in Sect. 9.2. The different language hierarchies of PS and LA Grammar are compared in Chap. 12.

  9. For example, Chomsky originally thought that the recoverability condition of deletions would keep transformational grammar decidable (see Sect. 8.5). However, Peters and Ritchie proved in 1972 that TG is undecidable despite this condition.

     When Gazdar (1981) proposed the additional formalism of metarules for context-free PS Grammar, he formulated the finite closure condition to ensure that metarules would not increase complexity beyond context-free. However, the condition was widely rejected as linguistically unmotivated, leading Uszkoreit and Peters (1986) to the conclusion that GPSG is in fact undecidable.

  10. In linguistics, examples of ungrammatical structures are marked with an asterisk *, a convention which dates back at least to Bloomfield (1933).

  11. The mathematical properties of informal descriptions, on the other hand, cannot be investigated because their structures are not sufficiently clearly specified.

  12. Programs which are not based on a declarative specification may still run. However, as long as it is not clear which of their properties are theoretically necessary and which are an accidental result of the programming environment and the programmer’s idiosyncrasies, such programs – called hacks – are of little theoretical interest. From a practical point of view, they are difficult to scale up and hard to debug. The relation between grammar systems and their implementation is further discussed in Sect. 15.1.

  13. Quechua is an indigenous language of South America.

  14. A good intuitive summary may be found in Geach (1972). See also Lambek (1958) and Bar-Hillel (1964), Chap. 14, pp. 185–189.

  15. In contrast, square_root is not a function but a relation, because it may assign more than one value to an argument in its domain. The square root of 4, for example, has two values, namely 2 and −2.
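A minimal sketch (illustrative; modeling the relation as a set-valued mapping is an assumption of this example, not the book's formalism) of why square_root is a relation rather than a function:

```python
import math

def square_root(x):
    """The square-root *relation*: returns the set of all real r with
    r * r == x. A relation may assign more than one value, whereas a
    function assigns at most one."""
    if x < 0:
        return set()      # no real roots
    if x == 0:
        return {0.0}
    r = math.sqrt(x)
    return {r, -r}        # two values for every positive argument

print(square_root(4))     # the two roots 2.0 and -2.0
```

Since square_root(4) yields two values, the mapping fails the single-value requirement that defines a function.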

  16. An alternative algebraic definition of C Grammar may be found in Bar-Hillel (1964), p. 188.

  17. The names and the number of elementary categories (here, u and v) are in principle unrestricted. For example, Ajdukiewicz used only one elementary category, Geach and Montague used two, and others three.
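For illustration, the functor-argument cancellation that operates over such categories can be sketched as a toy Python model (the tuple encoding and the name cancel are assumptions of this example, not the book's notation; arguments are written before the slash, as in note 20):

```python
# A complex category (u/v) is encoded as the tuple (argument, result):
# it combines with an expression of category u to yield one of category v.

def cancel(functor, argument):
    """If the functor category takes this argument, cancel to the
    result category; otherwise the combination is undefined (None)."""
    if isinstance(functor, tuple) and functor[0] == argument:
        return functor[1]
    return None

u, v = "u", "v"
functor = (u, v)            # the complex category (u/v)
print(cancel(functor, u))   # → 'v'   (successful cancellation)
print(cancel(functor, v))   # → None  (category mismatch)
```

Which strings serve as elementary categories is immaterial to the mechanism; only the functor-argument fit matters, which is why the number and names of elementary categories are in principle unrestricted.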

  18. The term fragment is used to refer to that subset of a natural language which a given formal grammar is designed to handle.

  19. Sect. 19.4 and CoL, pp. 292–295.

  20. For simplicity and consistency, our notation differs from Montague’s in that the distinction between syntactic categories and semantic types is omitted, with arguments positioned before the slash.

  21. For an attempt see SCG.

References

  • Ajdukiewicz, K. (1935) “Die syntaktische Konnexität,” Studia Philosophica 1:1–27

  • Bar-Hillel, Y. (1953) “Some Linguistic Problems Connected with Machine Translation,” Philosophy of Science 20:217–225

  • Bar-Hillel, Y. (1964) Language and Information. Selected Essays on Their Theory and Application, Reading: Addison-Wesley

  • Bloomfield, L. (1933) Language, New York: Holt, Rinehart, and Winston

  • Gazdar, G. (1981) “Unbounded Dependencies and Coordinate Structure,” Linguistic Inquiry 12.2:155–184

  • Geach, P. (1972) “A Program for Syntax,” in D. Davidson and G. Harman (eds.), 483–497

  • Halliday, M.A.K. (1985) An Introduction to Functional Grammar, London: Edward Arnold

  • Harrison, M. (1978) Introduction to Formal Language Theory, Reading: Addison-Wesley

  • Kleene, S.C. (1952) Introduction to Metamathematics, Amsterdam: North-Holland

  • Lamb, S. (1996) Outline of Stratificational Grammar, Washington: Georgetown University Press

  • Lambek, J. (1958) “The Mathematics of Sentence Structure,” The American Mathematical Monthly 65:154–170

  • Leśniewski, S. (1929) “Grundzüge eines neuen Systems der Grundlagen der Mathematik,” Fundamenta Mathematicae 14:1–81

  • Tesnière, L. (1959) Éléments de syntaxe structurale, Paris: Éditions Klincksieck

  • Uszkoreit, H., and S. Peters (1986) “On Some Formal Properties of Metarules,” Report CSLI-85-43, Stanford University: Center for the Study of Language and Information


Exercises

Section 7.1

  1. How is the notion of a language defined in formal grammar?

  2. Explain the notion of a free monoid as it relates to formal grammar.

  3. What is the difference between positive closure and Kleene closure?

  4. In what sense can a generative grammar be viewed as a filter?

  5. Explain the role of recursion in the derivation of aaaabbbb using definition 7.1.3.

  6. Why is PS Grammar called a generative grammar?

  7. What is an algebraic definition and what is its purpose?

  8. What is the difference between elementary, derived, and semi-formal formalisms?

  9. What is the reason for the development of derived formalisms?

Section 7.2

  1. Explain the difference in well-formedness between artificial and natural languages.

  2. Why is a formal characterization of grammatical well-formedness a descriptive goal of theoretical linguistics?

  3. Name three reasons for using formal grammar in modern linguistics.

  4. Why is the use of formal grammars a necessary, but not a sufficient, condition for a successful language analysis?

Section 7.3

  1. Under what circumstances is a formal grammar descriptively adequate?

  2. What is meant by the mathematical complexity of a grammar formalism and why is it relevant for practical work?

  3. What is the difference between functional and nonfunctional grammar theories?

  4. Which three aspects should be jointly taken into account in the development of a generative grammar and why?

Section 7.4

  1. Who invented C Grammar, when, and for what purpose?

  2. When was C Grammar first applied to natural language and by whom?

  3. What is the structure of a logical function?

  4. Give an algebraic definition of C Grammar.

  5. Explain the interpretation of complex C Grammar categories as functors.

  6. Why is the set of categories in C Grammar infinite and the lexicon finite?

  7. Name the formal principle allowing the C Grammar 7.4.4 to generate infinitely many expressions even though its lexicon and its rule set are finite.

  8. Why is the grammar formalism defined in 7.4.4 called bidirectional C Grammar?

  9. Would it be advisable to use C Grammar as the syntactic component of the Slim theory of language?

Section 7.5

  1. Why is C Grammar prototypical of a lexical approach?

  2. What is meant by a fragment of a natural language in formal grammar?

  3. Explain the relation between a functional interpretation of complex categories in C Grammar and the model-theoretic interpretation of natural language.

  4. Explain the recursive structure in the C Grammar 7.5.4.

  5. Explain how the semantic interpretation of C Grammar works in principle.

  6. Extend the C Grammar 7.5.4 to generate the sentences The man sent the girl a letter, The girl received a letter from the man, The girl was sent a letter by the man. Explain the semantic motivation of your categories.

  7. Why are there no large-scale descriptions of natural language in C Grammar?

  8. Why are there no efficient implementations of C Grammar?

  9. Why is the absence of efficient implementations a serious methodological problem for C Grammar?

  10. Does C Grammar provide a mechanism of natural communication? Would it be suitable as a component of such a mechanism?


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Hausser, R. (2014). Formal Grammar. In: Foundations of Computational Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41431-2_7

  • Print ISBN: 978-3-642-41430-5

  • Online ISBN: 978-3-642-41431-2
