Introduction to String Analysis

Harris, Zellig S.

doi:10.1007/978-94-017-6059-1_17

Part of the book series: Formal Linguistics Series (FLIS)

Abstract

String analysis has developed out of an attempt to carry out syntactic analysis on a computer, just as, some ten years earlier, transformational analysis developed out of the attempt to normalize texts for discourse analysis. The arrangement of syntax for computability, following in part the method presented in ‘From Morpheme to Utterance’ (Language 22 (1946), 161–83; Paper VI of this volume), was based on an effective procedure for finding in each sentence a sequence (in general, broken) of words which was itself a sentence, belonging to a certain set of minimal sentence structures. This minimal sentence was called the center of the given sentence, and its meaning had an important and central relation to the meaning of the given sentence; this relation can be specified independently of the given sentence. The remainder of the sentence consisted of adjunctions to the center or to the adjunctions; an effective procedure was presented for an ordered determining of these adjunctions, and the ordered adjunctions had an interpretation independent of the given sentence. The original version of this analysis, made for the Univac sentence-decomposing program of 1959, is given in Computable Syntactic Analysis (TDAP 15), 1959 (Paper XVI of this volume).¹

Notes

A formal treatment of string analysis is given in String Analysis of Sentence Structure, Mouton & Co., The Hague, 1962.
Square brackets enclose material which is necessary to the center of the given sentence but which appears as an adjunct in other, related, centers: if the noun here were not a singular count noun, the bracketed word would be an adjunct (e. g. ... decomposes each theory ... would have as center ... decomposes theory ...). The center includes the object required by its verb. After a certain subset of verbs which includes decompose, subdivide, the object is N into M, where N indicates nouns and M indicates mass or aggregate nouns (e. g. nothingness, list) or plural nouns (e. g. sentences) or count-noun and (or plus or with) N. (Count-noun indicates nouns that require the, a, each, etc.)
If the word is where, when, etc. what follows is a full sentence structure.
For present purposes it will suffice to say that a grammatical class is a class of words or affixes, selected so that its members will have the same occurrence in respect to other grammatical classes in the formulation of centers and adjuncts. A structure of a sequence of words is a mapping of it onto its grammatical classes. E. g. ‘of the center’ is an adjunct, and PN is its structure.
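
The idea of a structure as a mapping from words onto grammatical classes can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon, the article symbol T, the function name `structure`, and the choice to absorb a required article into the following noun (so that 'of the center' comes out as P N) are all hypothetical and not taken from the original analysis.

```python
# Hypothetical toy lexicon: word -> grammatical class symbol.
# P (preposition) and N (noun) follow the note; T (article) is an assumed symbol.
LEXICON = {"of": "P", "the": "T", "center": "N", "in": "P", "sentences": "N"}


def structure(words, absorb_articles=True):
    """Map a sequence of words onto its sequence of class symbols.

    With absorb_articles=True, an article directly before a noun is treated
    as part of that noun. This treatment is an assumption of the sketch.
    """
    classes = [LEXICON[w] for w in words]
    if absorb_articles:
        merged = []
        for i, symbol in enumerate(classes):
            following = classes[i + 1] if i + 1 < len(classes) else None
            if symbol == "T" and following == "N":
                continue  # skip the article; the noun stands for both
            merged.append(symbol)
        classes = merged
    return " ".join(classes)


print(structure(["of", "the", "center"]))
print(structure(["of", "the", "center"], absorb_articles=False))
```

The first call gives the structure P N named in the note; the second shows the unmerged class sequence for comparison.
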
Many sentence-adjunct structures may also occur at specified interior points of the structure they adjoin. Conceivably, there may be adjuncts which occur at a distance from their adjoined element. In English this happens almost only when two or more adjuncts occur in succession, e. g. V PN PN.
E. g. in the case of count-nouns, the article (the, a, each, one, etc.) is required (hence not an adjunct) before them in most positions, but it is an adjunct (since it is not required) before them in some cases when they open a verb object.
Henceforth, ‘adjunct’ and ‘center’ will often be used instead of ‘adjunct-structure’ and ‘center-structure’.
The interest in selecting the center, therefore, depends entirely on the possibility of finding a reasonably simple set of adjunct-structures whose permitted combinations, when extracted from the sentences of the language, leave a reasonably simple set of center structures. The considerations here are comparable to those that determine how classes and constructions are set up in the usual structural (descriptive) linguistics.
The center-structure may be called the center string (of grammatical classes) and the adjunct structures may be called substrings (each consisting of a sequence of one or more grammatical class symbols). In many languages it will be seen that all strings are connected except for the insertion of (connected) other strings. Thus, if string X consists of initial and final parts X₁ and X₂, we may find Y₁ X₁ X₂ Y₂ but not Y₁ X₁ Y₂ X₂. If the latter form occurs in a language, it may be that the words of X₁ and X₂ contain markers (affixes, sub-class membership, etc.) indicating that they go together, so that we can collect these related parts by permutation and obtain an artificial form Y₁ X₁ X₂ Y₂. On the basis of comments by Henry Hiz, it may be more correct to say that restricted intercalation may occur in various conditions, perhaps always provided that there are some grammatical features or sub-class relations (such as a word in one section being a classifier of a word in another) which would make it possible (for the hearer) to collect the sections that belong to one string.
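
The connectedness condition in this note (Y₁ X₁ X₂ Y₂ is allowed, Y₁ X₁ Y₂ X₂ is not) amounts to requiring that the parts of different strings never cross. A minimal sketch of such a check follows, assuming each position in the sequence has already been labeled with the string it belongs to; the labeling scheme and the function name `is_properly_nested` are hypothetical.

```python
def is_properly_nested(labels):
    """Return True if no string is interrupted by part of an unfinished string.

    `labels` gives, for each position, the name of the string that the
    element at that position belongs to, e.g. ["Y", "X", "X", "Y"].
    """
    last_index = {label: i for i, label in enumerate(labels)}
    open_strings = []  # stack of strings that have started but not yet ended
    for i, label in enumerate(labels):
        # Discard strings whose final part has already been passed.
        while open_strings and last_index[open_strings[-1]] < i:
            open_strings.pop()
        if label in open_strings:
            if open_strings[-1] != label:
                # Another unfinished string lies between two parts of `label`.
                return False
        else:
            open_strings.append(label)  # a new string is inserted here
    return True


print(is_properly_nested(["Y", "X", "X", "Y"]))  # Y1 X1 X2 Y2
print(is_properly_nested(["Y", "X", "Y", "X"]))  # Y1 X1 Y2 X2
```

The check deliberately ignores the markers mentioned in the note (affixes, sub-class membership) that would let a hearer regroup crossed parts; it tests only the plain connectedness condition.
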
Recognizing and characterizing a family of related and similar substrings is aided by a tentative use of transformational criteria. For instance, adjoined to the right of N we find various adjuncts which begin with wh- (or P wh-): who met him, whom she met, which surprised him, which he doubted, on whom he relied, whom he relied on, near which he lived, where he lived, etc. We can describe these as one structure (or family of structures) if we say that after wh- there always follows a sentence one of whose nouns has been replaced by the pronominal morpheme after wh-: -o replacing a subject noun, -om replacing an object or adjunct noun, -ich replacing any noun, -ere replacing in plus noun, etc. We then have to say that the adjunct structure is wh- plus the pronominal morpheme plus a sentence minus the pronouned noun; and if the pronouned noun had a P before it, that P may appear before the wh-.
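
The description of the wh- adjunct as 'wh- plus the pronominal morpheme plus a sentence minus the pronouned noun' can be sketched as a small function. The role names, the toy preposition list, and the function name `wh_adjunct` are assumptions made for illustration; only the four morphemes and the example phrases come from the note.

```python
# Pronominal morphemes after wh-, as listed in the note; the role names
# used as keys are hypothetical labels.
WH_MORPHEMES = {"subject": "o", "object": "om", "any": "ich", "place": "ere"}
PREPOSITIONS = {"on", "near", "in"}  # assumed toy list


def wh_adjunct(sentence, noun_index, role, front_preposition=False):
    """Build a wh- adjunct from a sentence by pronouning one of its nouns.

    `sentence` is a list of words and `noun_index` marks the noun to replace.
    For role "place" the morpheme replaces the preposition plus the noun.
    """
    words = list(sentence)
    del words[noun_index]  # the sentence minus the pronouned noun
    fronted = []
    has_prep = noun_index > 0 and words[noun_index - 1] in PREPOSITIONS
    if has_prep and (role == "place" or front_preposition):
        prep = words.pop(noun_index - 1)
        if role != "place":
            fronted = [prep]  # the P may appear before the wh-
    return " ".join(fronted + ["wh" + WH_MORPHEMES[role]] + words)


print(wh_adjunct(["he", "relied", "on", "N"], 3, "object"))
print(wh_adjunct(["he", "relied", "on", "N"], 3, "object", front_preposition=True))
print(wh_adjunct(["he", "lived", "in", "N"], 3, "place"))
```

Here "N" is a placeholder for the noun being pronouned; the three calls reproduce 'whom he relied on', 'on whom he relied', and 'where he lived' from the note's examples.
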
More generally, we may note certain unusual structures which occur only in the neighborhood of particular other symbols or structures. For example, N t (t: tense or auxiliary) occurs after conjunctions or in a sequence of matched or related sentences: He won’t go but I may. We also find it after certain (matched) sentence structures which have conjunctions before them: Since he won’t go, I may. We can say that the N t is a variant form of N t V (plus object of the verb), the V being that of the matched sentence: i. e. it is morphophonemically N t V with zero variant of the same V that occurs in the corresponding positions of the matched sentence: Since he won’t go, I may (go).
It should be recognized, as pointed out by Henry Hiz, that the properties of adjuncts presented here reflect to some extent the particular situation in English. In languages in which word-order is more free, and in which inflection is more important, other properties may appear more characteristic.
Note that we do not have to add ‘or sequences’, for the strings are not defined in terms of sequences, except insofar as a certain sequence of morphemes or morpheme classes may be defined as a member of a word class which is used in defining a string. Beyond this point, we can say about any sequence of grammatical classes either that it is part of the composition of some string, or else that part of it belongs to one string and the next part of it is the beginning or end of another string which is insertable at the given point.
It may be useful to distinguish center-and-adjunct analysis from constituent analysis of sentences, as generally used in structural linguistics. In constituent analysis every sentence is decomposed into parts which are not themselves sentences. Each of these parts is further decomposed into one or more parts which are either the same as itself (with possibly other material in addition), or else which are different and smaller (i. e. are at a deeper level) than itself. Thus: Sentence = noun-phrase ⌢ verb-phrase (the concatenation mark ⌢ indicates succession); or, if we use (M) to indicate the possible presence of sentence-modifiers and conjunctioned clauses: Sentence = (M) ⌢ noun-phrase ⌢ (M) ⌢ verb-phrase ⌢ (M). Of the verb phrase we may say (in English): verb-phrase = tense ⌢ verb ⌢ object. Further: noun-phrase = noun, or else article ⌢ noun, or else the ⌢ adjective. But in addition: noun-phrase = noun-phrase ⌢ adjective or wh-clause; here we have an entity which is decomposed into a structurally identical entity plus something else. In constituent analysis, this latter is only one of the various types of decomposition, and is specifically not the first decomposition of a sentence; whereas in center-and-adjunct analysis we precisely use this decomposition, in a strong form, to separate out the center.
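
The contrast drawn in this note can be illustrated with the decomposition rules it lists. The sketch below encodes those rules as a dictionary, picks out the ones in which an entity is decomposed into 'a structurally identical entity plus something else', and applies only those in reverse to strip the added material. The data layout and the function names are assumptions; this is not the original decomposition procedure.

```python
# Decomposition rules from the note, written as lists of symbols.
RULES = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "verb-phrase": [["tense", "verb", "object"]],
    "noun-phrase": [
        ["noun"],
        ["article", "noun"],
        ["the", "adjective"],
        ["noun-phrase", "adjective"],
        ["noun-phrase", "wh-clause"],
    ],
}


def self_embedding_rules(rules):
    """Rules whose right-hand side contains the symbol being decomposed."""
    return [(lhs, rhs) for lhs, alts in rules.items() for rhs in alts if lhs in rhs]


def excise_adjuncts(symbols, rules):
    """Repeatedly replace 'entity plus something else' by the entity alone."""
    symbols = list(symbols)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in self_embedding_rules(rules):
            n = len(rhs)
            for i in range(len(symbols) - n + 1):
                if symbols[i:i + n] == rhs:
                    symbols[i:i + n] = [lhs]  # keep only the identical entity
                    changed = True
                    break
    return symbols


print(self_embedding_rules(RULES))
print(excise_adjuncts(["noun-phrase", "adjective", "wh-clause", "verb-phrase"], RULES))
```

Each replacement shortens the sequence, so the loop terminates; what remains after excision is the sequence without the added adjective and wh-clause, which is the sense in which this type of decomposition separates out a center.
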

Author information

Authors and Affiliations

University of Pennsylvania, USA
Zellig S. Harris

Cite this chapter

Harris, Z.S. (1970). Introduction to String Analysis. In: Papers in Structural and Transformational Linguistics. Formal Linguistics Series. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-6059-1_17

DOI: https://doi.org/10.1007/978-94-017-6059-1_17
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-5716-4
Online ISBN: 978-94-017-6059-1
eBook Packages: Springer Book Archive
