Agentive (para)synthetic compounds in Russian: a quantitative study of rival constructions


The paper compares two rival word-formation constructions that give rise to compound agent nouns in Russian, namely (para)synthetic compounds formed with the agentive suffixes -ec and -tel’, such as basnopisec ‘fable writer’ and bytopisatel’ ‘everyday-life writer’. To understand what distinguishes these constructions, compounds in -ec and -tel’ are analyzed according to a number of formal and semantic criteria: the part of speech and semantic role of the non-verbal element of the compound, the transitivity and formal aspect of the verbal base of the compound, the animacy of the compound’s referent, and the semantics of the compound. The study is supported by statistical analyses, namely conditional inference trees and random forests, which help discriminate the behavior of the rival constructions and determine which parameters are most relevant for the comparison. To understand whether diachronic and/or stylistic factors also affect the survival of the rival constructions, the data are checked in the Russian National Corpus, which provides information about the texts in which the compounds occur, such as their creation date and textual genre. Finally, the productivity of the rival word-formation constructions in modern Russian is discussed in terms of both diachronic changes and the restrictions to which the two constructions are subject. The analyses demonstrate that the two constructions differ significantly in their semantics, in their diachronic and stylistic distribution, and in their productivity, which prevents one construction from completely ousting the other in modern Russian.



  1.

    See “Abbreviations” at the end of the paper for the full list of the abbreviations employed.

  2.

    The linking vowel is usually either -o- or -e-. The linking vowel -e- appears after palatalized and unpaired consonants (Švedova 1980, § 585). More rarely, other linking elements are used, i.e., -u-, -uch-, -ech-, -i-, -ja- (Švedova 1980, § 585).

  3.

    The term “synthetic” is replaced by “verbal”, “deverbal”, or “secondary” within formal accounts of such compounds (cf. Roeper and Siegel 1978; Selkirk 1982; Di Sciullo 1992, 2005; Scalise 1994).

  4.

    I would like to express my gratitude to Olga Lyashevskaya (School of Linguistics, NRU HSE, Moscow), who provided access to the RNC word-formation database, which is currently not open to the public.

  5.

    I have excluded noun-based compounds in -ec from the analysis because denominal word-formation does not constitute an area of functional overlap between the two constructions, as the suffix -tel’ is not employed in denominal word-formation (see Sect. 3.1).

  6.

    Compounds in -ec also include seven compounds formed with the suffix -l-ec, which is the result of metanalysis (cf. Luschützky 2011:90) and is closely related to -ec.

  7.

    Adjectival and adverbial bases are kept together because it is not always possible to establish with certainty the categorial status of such bases in compounds (cf. Bogdanov 2011:167).

  8.

    Intransitive bases are all unergative, with the sole exception of the verb žit’ ‘live’, which is unaccusative and is found in three compounds with the suffix -tel’.

  9.

    The strong correlation of the suffix -tel’ with transitive verbal bases is also pointed out by Andrews (1996:99, 101).

  10.

    The event indicated by the verbal base is something that takes place habitually: a fable writer (basnopisec) is one who writes fables habitually, as a professional; a fire extinguisher (ognetušitel’) is an instrument that is habitually employed to extinguish fire, and so on.

  11.

    In Švedova (1980, § 216), it is claimed that the suffix -ec can be attached to both imperfective and perfective bases. However, I have not found cases of perfective bases in my data.

  12.

    Prefixed verbal bases are rarely found in -ec compounds (the only exception in my data being compounds ending in -prochodec: pro- ‘forth, through’ + chodit’ ‘go’), while they are much more common in -tel’ compounds (cf. also Švedova 1980, §§ 211, 216, 217).

  13.

    The compound bronenosec also has the meaning of ‘battleship’.

  14.

    In Fig. 1, Prototypical Agents are abbreviated as “ag”, Carriers of State as “cos”, and Instruments as “instr”.

  15.

    Exact values are obtained by applying the function round().

  16.

    The RNC main subcorpus contains texts from the 18th century to the present day belonging to different genres (fiction, drama, memoirs and biographies, journalism and literary criticism, scientific and popular scientific texts, instructional texts, religious and philosophical texts, technical texts, business and jurisprudence texts, letters and diaries), for a total of over 200 million words.

  17.

    The correlation between diachrony and textual genres cannot be directly verified in the RNC interface, but it is possible to create a subcorpus including a certain time span and check the texts contained in that time span.

  18.

    See Bauer (2005) for an overview of different productivity theories.

  19.

    Hapax legomena are words occurring only once in a given text.

  20.

    Note that some of the compounds in the database might be older than the 18th century. However, the search in the main subcorpus of the RNC does not allow access to older texts.

  21.

    The number of texts and, consequently, the number of tokens included in the main subcorpus of the RNC varies across time spans: 4,726,499 tokens for the 18th century, 53,090,226 tokens for the 19th century, and 141,267,193 tokens for the 20th century.

  22.

    For the dictionary search, I have used the site, which includes several dictionaries of modern Russian, among which are Ušakov (1935–1940), Ožegov and Švedova (1996), Kuznecov (1998), and Efremova (2000).

  23.

    Cf. also Švedova (1980, § 559), where it is suggested that the productivity of the construction in -ec in compounding is limited to certain endings, i.e., -tvorec, -boec, -pisec, -borec, and especially -nosec and -ljubec.
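
The notes on hapax legomena and on the unequal size of the RNC time spans can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the function names per_million and hapax_legomena, the raw hit counts, and the toy token list are all hypothetical and serve only to show why raw counts from periods of different size need to be normalized before they are compared. Only the three subcorpus sizes are taken from the notes above.

```python
from collections import Counter

# Token counts of the RNC main subcorpus per century, as reported in the notes.
SUBCORPUS_TOKENS = {
    "18th": 4_726_499,
    "19th": 53_090_226,
    "20th": 141_267_193,
}


def per_million(raw_count: int, period: str) -> float:
    """Return occurrences per million tokens for the given period."""
    return raw_count / SUBCORPUS_TOKENS[period] * 1_000_000


def hapax_legomena(tokens: list[str]) -> list[str]:
    """Return, in sorted order, the word forms that occur exactly once."""
    counts = Counter(tokens)
    return sorted(word for word, n in counts.items() if n == 1)


if __name__ == "__main__":
    # Hypothetical raw counts of one compound, invented for illustration only.
    hypothetical_hits = {"18th": 10, "19th": 10, "20th": 10}
    for period, hits in hypothetical_hits.items():
        print(period, round(per_million(hits, period), 2))

    # Toy token list, not corpus data.
    sample = ["basnopisec", "bytopisatel'", "basnopisec", "ognetušitel'"]
    print(hapax_legomena(sample))
```

With identical raw counts, the normalized frequency is highest for the 18th century simply because that subcorpus is the smallest, which is why comparisons across time spans are better made on normalized rather than raw figures.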


