From MT to Computational Linguistics and Natural Language Processing

Automating Linguistics

Part of the book series: History of Computing (HC)

Abstract

Automatic syntactic analysis constituted the theoretical basis of “the new linguistics”, as recommended by ALPAC, and ensured the legitimacy of computational linguistics. It also conditioned the emergence of the Chomskyan program, which was in fact closely associated with the horizon of retrospection created by MT and can therefore be identified as one of MT’s horizons of projection.

The development of syntactic analysis in MT centres was the result of three main syntactic approaches, which interacted with each other at one point or another:

  (i) The distributionalists’ approach, especially that of Hockett and Harris (see Chap. 4)

  (ii) Bar-Hillel’s approach, directly inspired by Carnap

  (iii) Chomsky’s approach

Notes

  1. The first mathematisation of language corresponds to the rise of formalisation promoted by the Vienna Circle and is characterised by the interaction between algorithms and the formal languages that emerged from mathematical logic (see Introduction, Chap. 1).

  2. The linguists commonly referred to as the Neo-Bloomfieldians were followers of Boas, Sapir and Bloomfield. They were linguistic anthropologists who focused on the description of American Indian languages and who shared an inductive approach and the use of distributional analysis inaugurated by Bloomfield, whose interest in the mathematisation of language they also shared (see Chap. 4).

  3. The notion of constituent structure was first made explicit by Bloomfield (1933), Harris (1946) and Wells (1947). A constituent is any word or group of words that enters into some larger construction (i.e. a larger group of words). Immediate constituents (ICs) are the constituents of which any given construction is directly formed. For example, the utterance “The old man who lives there has gone to his son’s house” has two immediate constituents: “the old man who lives there” and “has gone to his son’s house”. “Old man” is an IC of “old man who lives there”, but not of the whole utterance. By contrast, neither “there has” nor “man who” is a constituent. The ICs of a given construction are its constituents on the next lower level. Those on any still lower level are constituents but not immediate constituents (from Gleason 1969, pp. 128–148).

    IC-analysis is a syntactic method; as such, its goal is to find the best possible organisation of any given utterance. The end result of IC-analysis is often presented as a diagram that reveals the hierarchical immediate-constituent structure of the sentence at hand (see, for example, Fig. 6.1).

  4. For a formal definition of syntactic connexity, see Ajdukiewicz (1935).

  5. Lemmatisation is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word’s lemma, or dictionary form. For example, the verb “to buy” may appear as “buy”, “buys”, “bought” or “buying”. The lemma “buy” corresponds to the dictionary form.

  6. “Parsing” comes from the Latin pars orationis (part of speech). To parse means to break something down into its parts. A syntactic parser is a program that scans a sequence (generally a sentence) and analyses it into its syntactic components according to the rules of a grammar.

  7. Lucien Tesnière (1893–1954), a French linguist and specialist in Slavic languages, is the author of two books: Esquisse d’une syntaxe structurale (1953) and Éléments de syntaxe structurale (1959). Certain concepts advanced in these works, such as valence and dependency structure, inspired several models of formal linguistics and of MT in France, in the USA, and in the countries of the Soviet Bloc.

  8. Garvin contrasted the heuristic method with the algorithmic method. The algorithmic method is deterministic and complete: it takes into account all the instructions necessary to get from one point to another and necessarily leads to the result. Garvin recommended instead a heuristic method, which helps to discover the result rather than reaching it directly, for example by resorting to arbitrary choices or to learning based on trial and error.

  9. Maurice Gross (1934–2001) was a pioneer in MT in France. He also worked with Chomsky at MIT and Harris at the University of Pennsylvania during the early 1960s. He played an important role in disseminating information on formal grammars in France and largely contributed to spreading Harris’s works and integrating them into French linguistics (see Chaps. 8 and 9).

  10. Peters and Ritchie argue that transformational grammars are too powerful, insofar as they generate all kinds of languages, including nonrecursive languages. The problem originates in the fact that these grammars apply their derivation rules an unlimited number of times, in particular for short sentences. One result of this study is to show that the linguists’ intuition, according to which natural languages are recursive, is empirically founded.

  11. Post’s canonical systems (1943) belong to the research of the 1930s and 1940s that attempted to characterise formally the concept of algorithm as applied to mathematics. Turing machines and Post’s production systems thus define the same computable functions. Devised as systems for manipulating strings of characters, Post’s systems comprise a triplet:

    • A finite alphabet, from which strings (or words) are built

    • A set of initial words

    • A set of rules for manipulating character strings (or production rules)

    Chomsky’s rewriting rules are directly inspired by Post’s systems although, as Pullum and Scholz (2007) remarked, Chomsky never cites Post’s key technical papers. See Partee (1978, pp. 167–168), referring to Chomsky and Miller (1963, Sect. 4), for a presentation of the formalisation of these rewriting systems.

  12. See Marcus (1988) on this point.

  13. See Cori and Marandin (2001) on reciprocal borrowing between data processing and formal grammars, in particular generative grammar.

  14. In 1973, the Association for Machine Translation and Computational Linguistics (AMTCL) was renamed the Association for Computational Linguistics (ACL), thus dropping the reference to MT. A bulletin, The Finite String, was created in 1964, and the association started to organise conferences every two years; the first took place in Denver, Colorado, in August 1963. The journal Mechanical Translation was called Mechanical Translation and Computational Linguistics from 1965 until 1974 (when Yngve gave up its direction), then American Journal of Computational Linguistics from 1974 to 1983, and finally Computational Linguistics from 1984 onwards.

  15. During the Cold War, American work on MT was well known to Russian researchers, who started their first experiments in the wake of the first demonstration of MT on an IBM computer in New York in 1954. Some Russian work was based on American phrase-structure grammars and syntactic parsers, while other methods, more original and anchored in the Russian tradition, were founded on semantic grounds (see Chap. 7, §7.1.3). Conversely, from 1957 the Americans started translating Soviet works within the Joint Publications Research Service. The Slavist Kenneth E. Harper of UCLA compiled a state-of-the-art survey of Soviet works, conferences and journals (Harper 1961). However, American MT researchers did not draw inspiration from Soviet methods, unlike the French, who used both American and Russian methods (see Chap. 8, §8.6). The other European countries of the Soviet Bloc only started work on MT in the mid-1960s and did not develop original methods.

  16. According to Murray (1993), the Congress organising committee, made up of Morris Halle, William Locke, Horace Lunt and Edward Klima, reserved a plenary session for Chomsky, although he was a generation younger than the four other invited speakers. In addition, he was allotted four times more space in the proceedings. As for the Neo-Bloomfieldians, they boycotted the congress, acknowledging their defeat. According to Barsky (2011), Chomsky’s and Harris’s biographer, Harris gave up his plenary session to Chomsky.

  17. The Macy Conferences were organised by the pioneers of cybernetics, among them McCulloch and Rosenblueth, to support interdisciplinary meetings under the aegis of the Josiah Macy Jr. Foundation, which was created in 1930 and specialised in medical research. Only the last five conferences have been published.

  18. This term refers to the traditional opposition between “strong AI”, which holds that the machine is capable of reproducing a cognitive behaviour or simulating an organism and its relationships of adaptation to an environment, and “weak AI”, which holds that the machine can simulate a fragment of “synthetic” intelligence whose composition is completely different from human intelligence, but whose result, i.e. the production of representations, is identical to what human intelligence would produce.

  19. It should be noted that, in 1951, Kleene showed that this part of McCulloch and Pitts’s work does not hold up mathematically. Thanks to Maarten Bullynck for this remark.

  20. See Chap. 7 below for a full presentation of this MT group.
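
The immediate-constituent analysis described in note 3 can be illustrated with a minimal Python sketch. It is not taken from the chapter or from Gleason: the nested-list encoding and the names `UTTERANCE`, `immediate_constituents` and `constituents` are hypothetical, and only the top-level split and the status of “old man” follow the note. Every finer bracketing is an assumption made for illustration.

```python
from typing import List, Union

# A constituent is either a single word (str) or a construction,
# encoded as the list of its immediate constituents.
Tree = Union[str, List["Tree"]]

# Bracketing of the example utterance from note 3. The first split and the
# grouping "old man" follow the note; all lower-level splits are assumptions.
UTTERANCE: Tree = [
    ["the", [["old", "man"], ["who", ["lives", "there"]]]],
    [["has", "gone"], ["to", [["his", "son's"], "house"]]],
]


def words(tree: Tree) -> List[str]:
    """Return the words of a constituent in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree for word in words(child)]


def text(tree: Tree) -> str:
    """Render a constituent as a plain string."""
    return " ".join(words(tree))


def immediate_constituents(tree: Tree) -> List[str]:
    """Constituents on the next lower level only."""
    if isinstance(tree, str):
        return []
    return [text(child) for child in tree]


def constituents(tree: Tree) -> List[str]:
    """Constituents on every lower level, excluding the construction itself."""
    if isinstance(tree, str):
        return []
    found: List[str] = []
    for child in tree:
        found.append(text(child))
        found.extend(constituents(child))
    return found


if __name__ == "__main__":
    print(immediate_constituents(UTTERANCE))
    print("old man" in constituents(UTTERANCE))
    print("there has" in constituents(UTTERANCE))
```

Under this encoding, a string such as “there has” can never be returned as a constituent, because its two words belong to different branches of the tree; this is the property that the hierarchical diagram of an IC-analysis is meant to make visible.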

Bibliography

  • Ajdukiewicz, K. 1935. Die syntaktische Konnexität. Studia philosophica 1: 1–27.

  • Akhmanova, O.S., I.A. Mel’čuk, R.M. Frumkina, and E.V. Paducheva. 1963. Exact Methods in Linguistic Research. Berkeley: University of California Press.

  • ALPAC. 1966. Language and Machines: Computers in Translation and Linguistics. A report by the Automatic Language Processing Advisory Committee (ALPAC), National Academy of Sciences, National Research Council.

  • Bar-Hillel, Y. 1953b. A Quasi-Arithmetic Notation for Syntactic Description. Language 29: 47–58.

  • ———. 1955. Idioms. In Machine Translation of Languages, 14 Essays, ed. W.N. Locke and A.D. Booth, 183–193. Cambridge, MA/New York: MIT/Wiley.

  • ———. 1960. The Present Status of Automatic Translation of Languages. In Advances in Computers 1, ed. F.C. Alt, 91–141. London: Academic.

  • Barsky, R. 2011. Zellig Harris. From American Linguistics to Socialist Zionism. Cambridge, MA: MIT Press.

  • Bloomfield, L. 1933. Language. New York: H. Holt and Company.

  • Chomsky, N. 1956. Three Models for the Description of Language. IRE (Institute of Radio Engineers) Transactions on Information Theory IT-3: 113–124.

  • ———. 1957. Syntactic Structures. London: Mouton.

  • Chomsky, N., and G.A. Miller. 1963. Introduction to the Formal Analysis of Natural Languages. In Handbook of Mathematical Psychology 2, ed. D. Luce, R. Bush, and E. Galanter, 269–321. New York: Wiley.

  • Chomsky, N., and M.-P. Schützenberger. 1963. The Algebraic Theory of Context-Free Languages. In Computer Programming and Formal Systems, Studies in Logic and the Foundations of Mathematics, vol. 14, ed. P. Braffort and D. Hirschberg, 118–161. Amsterdam: North-Holland Publ. Co.

  • Cori, M., and J. Léon. 2002. La constitution du TAL. Etude historique des dénominations et des concepts. Traitement Automatique des Langues 43 (3): 21–55.

  • Cori, M., and J.-M. Marandin. 2001. La linguistique au contact de l’informatique: de la construction de grammaire aux grammaires de construction. Histoire Épistémologie Langage 23 (1): 49–79.

  • Dupuy, J.-P. 1994. Aux origines des sciences cognitives. Paris: La Découverte.

  • Garvin, P. 1968. Machine translation Today: The Fulcrum Approach and Heuristics. Lingua 21: 162–182.

  • Gleason, H.A. 1969. An Introduction to descriptive linguistics. Revised Edition. London/New York/Sydney/Toronto: Holt Rinehart Winston.

  • Gross, M. 1964. The equivalences of models of language used in the fields of mechanical translation and information retrieval. Information Storage and Retrieval 2 (1): 43–57.

  • Harper, K.E. 1961. Soviet research in machine translation. In Proceedings of the National Symposium on Machine translation, ed. H.P. Edmundson, 2–12. Los Angeles: University of California.

  • Harris, Z.S. 1946. From Morpheme to Utterance. Language 22 (3): 161–183.

  • ———. 1951a. Methods in Structural Linguistics. Chicago: The University of Chicago Press.

  • Hays, D.G. 1964. Dependency theory: A formalism and some observations. Language 40 (4): 511–525.

  • Heims, S.J. 1993. Constructing a Social Science for Postwar America: The Cybernetics Group, 1946–1953. Cambridge, MA: The MIT Press.

  • Hockett, C.F., and R. Ascher. 1964. The Human Revolution. Current Anthropology 5: 135.

  • Hutchins, W.J. 1986a. Machine Translation: Past, Present, Future. Chichester: Ellis Horwood Ltd.

  • Lamb, S.M. 1962. On the mechanisation of syntactic analysis. In Proceedings of the International Conference on Machine Translation and Applied Language Analysis, Teddington 1961, 673–686. London: HMSO.

  • Lecerf, Y. 1960. Programme des conflits, modèle des conflits. La Traduction automatique 4–5: 17–36.

  • Locke, W.N., and A.D. Booth, eds. 1955. Machine Translation of Languages, 14 Essays. Cambridge, MA/New York: MIT/Wiley.

  • Marcus, S. 1988. Mathématiques et linguistique. Mathématiques et sciences humaines 103: 7–21.

  • Murray, S.O. 1993. Theory Groups and the Study of Language in North America. Amsterdam: Benjamins (SiHoLS 69).

  • Partee Hall, B. 1978. Fundamentals of Mathematics for Linguistics. Dordrecht: D. Reidel Publishing Company.

  • Pélissier, A., and A. Tête. 1995. Sciences cognitives. Textes fondateurs (1943–1950). Paris: PUF.

  • Peters, S., and R.W. Ritchie. 1973. On the Generative Power of Transformational Grammars. Information Sciences 6: 49–83.

  • Post, E.L. 1943. Formal Reductions of the General Combinatorial Decision Problem. American Journal of Mathematics 65 (2): 197–215.

  • Pratt, V. 1987. Thinking Machines. The Evolution of Artificial Intelligence. Oxford: Basil Blackwell.

  • Pullum, G.K., and B. Scholz. 2007. Review article: Tracking the origins of transformational grammar Marcus Tomalin, 2006, Linguistics and the formal sciences: The origins of generative grammar (Cambridge Studies in Linguistics 110). Cambridge: Cambridge University Press. Journal of Linguistics 43 (3): 701–723.

  • Rumelhart, D.E., J.L. McClelland, and The PDP Research Group. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vols. 1 & 2. Cambridge, MA: MIT Press.

  • Savitch, W.J., E. Bach, W. Marsh, and N.G. Safran, eds. 1987. The Formal Complexity of Natural Language. Dordrecht: D. Reidel.

  • Segal, J. 2003. Le Zéro et le Un. Histoire de la notion scientifique d’information au XXe siècle. Paris: Éditions Syllepse.

  • Shannon, C.E. 1950. A Chess-Playing Machine. Scientific American 182 (2): 48–51.

  • Shannon, C.E., and W. Weaver. 1949. The Mathematical Theory of Communication. Urbana: University of Illinois Press.

  • Tesnière, L. 1953. Esquisse d’une syntaxe structurale. Paris: Klincksieck.

  • ———. 1959. Éléments de syntaxe structurale. Paris: Klincksieck.

  • Turing, A.M. 1950. Computing Machinery and Intelligence. Mind 59: 433–460.

  • Varela, F. 1989. Invitation aux sciences cognitives. Paris: Éditions du Seuil.

  • Waltz, D.L., and J.B. Pollack. 1985. Massively Parallel Parsing: A Strongly Interactive Model of Natural Language Interpretation. Cognitive Science 9 (1): 51–74.

  • Weaver, W. 1955. Translation. In Machine Translation of Languages, 14 Essays, ed. W.N. Locke and A.D. Booth, 15–23. Cambridge, MA/New York: MIT/Wiley.

  • Wells, R.S. 1947. Immediate Constituents. Language 23: 81–117.

  • Winograd, T. 1972. Understanding Natural Language. New York: Academic.

  • Yngve, V.H. 1955. Syntax and the Problem of Multiple Meaning. In Machine Translation of Languages, 14 Essays, ed. W.N. Locke and A.D. Booth, 208–226. Cambridge, MA/New York: MIT/Wiley.

  • ———. 1959. The COMIT System for Mechanical Translation. In IFIP Congress 1959, Paris, France, 183–187.

  • ———. 1960. A Model and an Hypothesis for Language Structure. Proceedings of the American Philosophical Society 104 (5): 444–466.

  • ———. 1964. Implications of Mechanical Translation Research. Proceedings of the American Philosophical Society 108 (5): 275–281.

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Léon, J. (2021). From MT to Computational Linguistics and Natural Language Processing. In: Automating Linguistics. History of Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-70642-5_6

  • DOI: https://doi.org/10.1007/978-3-030-70642-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-70641-8

  • Online ISBN: 978-3-030-70642-5

  • eBook Packages: Computer Science, Computer Science (R0)
