Abstract
Having described the possibilities of a purely technological approach to natural language processing in the previous chapter, we turn next to its limitations and to ways of overcoming them by using linguistic knowledge. Section 2.1 explains the structures underlying the use of textual databases. Section 2.2 shows how linguistic methods can improve retrieval from textual databases. Section 2.3 shows how different applications require linguistic knowledge to different degrees in order to be practically useful. Section 2.4 explains the notion of language pairs in machine translation and describes the direct and the transfer approaches. A third approach to machine translation, the interlingua approach, as well as computer-based systems for aiding translation, are described in Section 2.5.
References
A widely used notation for specifying patterns of letter sequences is that of regular expressions, as implemented in Unix.
See G. Salton 1989, p. 248.
STAIRS is an acronym for Storage and Information Retrieval System, a software product developed and distributed by IBM.
Some database systems already use thesauri, though with mixed results. Commercially available lexica are in fact likely to lower precision without improving recall. For example, in Webster's New Collegiate Dictionary, the word car is related to vehicle, carriage, cart, chariot, railroad car, streetcar, automobile, cage of an elevator, and part of an airship or balloon. With the exception of automobile, all of these would only lower precision without improving recall.
Operations which are performed while the user is interacting with the system are called on-the-fly operations, in contrast to batch-mode operations, e.g. building up an index, which are run when the system is closed to public use.
As mentioned in the Introduction III, Eliza is based on the primitive mechanics of predefined sentence patterns, yet may startle the naive user by giving the appearance of understanding, both on the level of language and of human empathy.
See 2.5.5, 3.
The alternative between smart and solid solutions will be illustrated with statistically-based (Sections 13.4, 13.5) and rule-based (Chapter 14) systems of word form recognition.
Presuming a given pair of languages.
Another example is the United Nations, which generates a volume of similar magnitude.
For this reason a translation from French into Danish will require a French→Danish dictionary, but hardly a Danish→French dictionary. Because of language-specific lexical gaps, idioms, etc., the vocabulary of the two languages is not strictly one-to-one, for which reason the two dictionaries are not really symmetric.
P. Wheeler & V. Lawson 1982.
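The note on thesauri above claims that expanding a query for car with every related dictionary entry lowers precision without improving recall. A minimal sketch of that effect follows. The five documents, the relevance judgment, and the helper names `retrieve` and `precision_recall` are invented for illustration and do not come from the chapter; only the related terms are taken from the note.

```python
# Toy illustration only: the documents and relevance judgments are invented.
DOCS = {
    "d1": "the car was parked outside",
    "d2": "a new automobile was presented",
    "d3": "the chariot race in ancient rome",
    "d4": "the cage of an elevator was repaired",
    "d5": "the streetcar line was extended",
}
# Hypothetical judgment: a user asking for "car" wants documents about automobiles.
RELEVANT = {"d1", "d2"}

# Most of the related terms listed in the note (multi-word entries kept as phrases).
RELATED = ["vehicle", "carriage", "cart", "chariot", "railroad car",
           "streetcar", "automobile", "cage of an elevator"]


def retrieve(terms):
    """Return ids of documents containing any term as a whole word or phrase."""
    hits = set()
    for doc_id, text in DOCS.items():
        padded = f" {text} "  # pad so that only whole-word matches count
        if any(f" {term} " in padded for term in terms):
            hits.add(doc_id)
    return hits


def precision_recall(retrieved, relevant):
    """Precision = relevant hits / all hits; recall = relevant hits / all relevant."""
    true_pos = len(retrieved & relevant)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    return precision, recall


plain = retrieve(["car"])
with_automobile = retrieve(["car", "automobile"])
with_all = retrieve(["car"] + RELATED)

for label, hits in [("car only", plain),
                    ("car + automobile", with_automobile),
                    ("car + all related terms", with_all)]:
    p, r = precision_recall(hits, RELEVANT)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

In this toy collection, adding automobile raises recall, while adding the remaining terms leaves recall unchanged and pulls in the chariot, elevator, and streetcar documents, so precision drops.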
© 1999 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hausser, R. (1999). Technology and grammar. In: Foundations of Computational Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-03920-5_3
DOI: https://doi.org/10.1007/978-3-662-03920-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-03922-9
Online ISBN: 978-3-662-03920-5
eBook Packages: Springer Book Archive