On the Syntax and Translation of Finnish Discourse Clitics

Chapter in: Shall We Play the Festschrift Game?

Abstract

Finnish has a set of morphemes called discourse clitics, which attach to words and express functions such as contrasting and reminding. This paper builds a formal grammar that specifies the syntax and morphology of these clitics. The grammar is written in GF (Grammatical Framework), which distinguishes between abstract syntax (tree structures) and concrete syntax (surface structures such as strings). The abstract syntax of clitics defines their contribution to the discourse semantics of sentences, in particular the topic-focus structure. The concrete syntax defines their realization in Finnish. We also show a second concrete syntax, for English, which makes it possible to translate between Finnish discourse clitics and corresponding devices in English. The paper presents the complete GF code of a small grammar demonstrating the main ideas and gives a link to a web demo for translation. Theoretically, the work can be seen as a synthesis of the Montague semantics for clitics proposed by Karttunen and Karttunen (1976) and their explanation in terms of dialogue games, following the model of Carlson (1993).

One criterion is to think of the description as material for machine translation—that is the level of specificity I’d like to achieve. The description of the clitics should support translation between correct uses of clitics and corresponding devices in other languages. (Carlson 1993 : 5)
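The division of labour between abstract and concrete syntax can be illustrated with a minimal sketch. It is written in Python rather than GF and is not the grammar of the paper: the tree types `Clause` and `Also`, the lexicon keys, and the functions `lin_fin` and `lin_eng` are hypothetical names chosen for illustration. One abstract tree records which argument carries an 'also' focus; the Finnish linearization realizes the focus with the clitic kin, and the English one with the word too (one possible rendering, assumed here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """Abstract clause: language-neutral keys, no surface strings."""
    subject: str
    verb: str
    obj: str


@dataclass(frozen=True)
class Also:
    """Abstract tree: an 'also' focus on one argument of a clause."""
    clause: Clause
    focus: str  # "subject" or "obj"


# Toy lexicons (illustrative only): fixed word forms, no inflection.
FIN = {"jussi": "Jussi", "drink": "juo", "milk": "maitoa"}
ENG = {"jussi": "Jussi", "drink": "drinks", "milk": "milk"}


def _words(tree: Also, lexicon: dict) -> dict:
    """Look up the surface word for each slot of the clause."""
    c = tree.clause
    return {
        "subject": lexicon[c.subject],
        "verb": lexicon[c.verb],
        "obj": lexicon[c.obj],
    }


def lin_fin(tree: Also) -> str:
    """Finnish concrete syntax: the clitic kin binds to the focused word."""
    w = _words(tree, FIN)
    w[tree.focus] += "kin"
    return " ".join([w["subject"], w["verb"], w["obj"]])


def lin_eng(tree: Also) -> str:
    """English concrete syntax: a separate word after the focused word."""
    w = _words(tree, ENG)
    w[tree.focus] += " too"
    return " ".join([w["subject"], w["verb"], w["obj"]])


tree = Also(Clause("jussi", "drink", "milk"), focus="subject")
print(lin_fin(tree))  # Jussikin juo maitoa
print(lin_eng(tree))  # Jussi too drinks milk
```

Because both functions consume the same tree, translation in this toy setting amounts to linearizing one abstract tree in two languages. Parsing a string back into a tree, which a real GF grammar also provides, is left out of the sketch.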

Notes

  1. The full list is ko, pa, han, and s. The combinations kos, kohan, pas, and pahan also exist. The ones we study here were chosen for their high frequency and clearly distinguishable meanings.

  2. http://www.grammaticalframework.org/demos/finnish-clitics/.

  3. http://translate.google.com.

  4. http://www.sunda.fi/eng/translator.html.

  5. The Finnish reference grammar (Hakulinen et al. 2005) also calls kin a focus particle, whereas the others are called “tonal particles” (“sävypartikkeli”).

  6. The verb does not easily take the focus clitic when topicalized: juokin Jussi maitoa (‘Jussi actually does even drink milk’) is strange. On the other hand, taidankin tästä lähteä (‘I think I will leave now’) is correct, maybe because the subject is omitted. Tulikin talvi (‘the winter came, after all’) is also correct, maybe because there is an omitted formal subject different from talvi (‘the winter’). We leave room for overgeneration here to keep the rules simple.

  7. Many other permutations are possible, since Finnish has “free word order”. Notice, however, that this does not mean free variation, since each word order has its own meaning and may, consequently, have its own translation.

  8. Adding the other clitic of this class, kaan, will not change this, since it is in complementary distribution with kin depending on the polarity of the sentence; the positive kin is ‘also’, and the negative kaan is ‘either’. Negative polarity is usually expressed by sentence negation, but can also appear in unnegated questions.

  9. There are two other ways of dealing with the vowel harmony of clitics in GF. One is to introduce the clitics directly as forms in inflection tables. This, however, leads to prohibitively large tables: every noun then has 3,744 forms (26 case-number combinations × 6 possessive suffixes (including none) × 3 focus clitics (kin, kaan, none) × 8 topic clitics (all combinations including none)); the number of distinct forms is a little lower, since some combinations of case and possessive suffix produce the same string. The other way is to leave the decision to a separate lexical synthesis procedure (unlexing) after grammar-based linearization. This helps keep the grammar simple, but makes the overall system more complex. One complication is that the vowel harmony of compound nouns, which are very common in Finnish, is impossible to decide from a string alone, without knowing the compound boundary. The parameter-based all-GF solution used here gives good quality with a reasonable table size. The classic implementation of Finnish morphology by Koskenniemi (1983) treats clitics as lexical forms to preserve accuracy, but avoids the explosion of the lexicon because its run-time representation is a finite-state automaton rather than an explicit table. Our solution similarly results in an automaton at run time, if we add a lexical analysis phase needed for restoring the binding tokens following the ideas of Huet (2005).

  10. A full Finnish grammar has many more dependencies, in particular for verbs; even nouns have 30 forms in the GF resource grammar.

  11. The Finnish resource grammar uses regular-expression pattern matching to define a set of much more powerful lexical paradigms, which infer the complete inflection from just the dictionary form for 87% of nouns and 96% of verbs (Détrez and Ranta 2012).

  12. In a wider perspective, our approach can be seen in relation to the “quantifying in” idea of Montague (1974), which was developed for the clitic kin in Karttunen and Karttunen (1976). The common idea is that the clitic does not primarily attach to a word, but to an entire clause, from which a selected word is picked for the final, concrete attachment. Rather than bound variables, we use the idea of “slash categories” of GPSG (Gazdar et al. 1985): categories that have “gaps” in which syntactic constructions can insert new material.

  13. http://kaino.kotus.fi/sanat/nykysuomi/.

  14. Other “non-standard” languages represented in the resource grammar library are Amharic, Arabic, Hindi/Urdu, Maltese, Nepali, Persian, Punjabi, Swahili, and Thai.

  15. Carlson (1993) presents this as a consequence of the general rule that “-kin/-kAAn cannot modify the polarity alone”. Interestingly, this rule seems to be getting less strict, at least for two-syllable plural forms: a Google search finds, for example, the natural-sounding Maapallo kyllä selviää, vaikka me emmekään selviäisi (‘The globe will certainly survive, even if we did not survive ourselves’; Web version of the newspaper Keskisuomalainen, May 2008).

  16. See http://www.grammaticalframework.org/demos/finnish-clitics/.
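The table-size arithmetic and the harmony problem described in note 9 can be made concrete with a short sketch. It is Python, not GF, and does not reproduce the grammar of the paper: the `Harmony` parameter, the `attach` function, and the naive string heuristic `guess_harmony` are assumptions made for illustration. The sketch treats harmony as a parameter supplied together with the host word, and shows why a guess from the string alone goes wrong for a compound such as maantie (‘highway’, maan + tie), whose harmony follows its last part.

```python
from enum import Enum


class Harmony(Enum):
    BACK = "back"
    FRONT = "front"


# Clitic shapes per harmony value; kin has the same shape in both.
CLITICS = {
    "ko": {Harmony.BACK: "ko", Harmony.FRONT: "kö"},
    "pa": {Harmony.BACK: "pa", Harmony.FRONT: "pä"},
    "han": {Harmony.BACK: "han", Harmony.FRONT: "hän"},
    "kaan": {Harmony.BACK: "kaan", Harmony.FRONT: "kään"},
    "kin": {Harmony.BACK: "kin", Harmony.FRONT: "kin"},
}


def attach(host: str, harmony: Harmony, clitic: str) -> str:
    """Bind a clitic to a host whose harmony is given as a parameter."""
    return host + CLITICS[clitic][harmony]


def guess_harmony(word: str) -> Harmony:
    """Naive string-only guess: back if the word contains a, o, or u.

    It has no access to compound boundaries, so it can be wrong.
    """
    if any(ch in "aou" for ch in word.lower()):
        return Harmony.BACK
    return Harmony.FRONT


def full_table_size(cases=26, possessives=6, focus=3, topic=8) -> int:
    """Forms per noun if every clitic combination were tabulated."""
    return cases * possessives * focus * topic


print(full_table_size())                        # 3744
print(attach("talo", Harmony.BACK, "ko"))       # taloko
print(attach("kylä", Harmony.FRONT, "kaan"))    # kyläkään

# The compound maantie takes front harmony from its last part, tie,
# but the string-only guess sees the a of maan and answers BACK.
print(guess_harmony("maantie"))                 # Harmony.BACK
print(attach("maantie", Harmony.FRONT, "ko"))   # maantiekö
```

Carrying one harmony value per host word costs a single extra parameter, whereas tabulating every clitic combination multiplies the table size by the number of focus and topic clitic options.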

References

  • Carlson, Lauri. 1983. Dialogue games: An approach to discourse analysis. Dordrecht: Reidel.

  • Carlson, Lauri. 1993. Dialogue games with Finnish clitics. In Yearbook of the Linguistic Society of Finland, eds. Maria Vilkuna and Susanna Shore. Helsinki: SKY.

  • Détrez, Grégoire, and Aarne Ranta. 2012. Smart paradigms and the predictability and complexity of inflectional morphology. In EACL 2012.

  • Diderichsen, Paul. 1962. Elementær dansk grammatik. København: Gyldendal.

  • Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum, and Ivan A. Sag. 1985. Generalized phrase structure grammar. Oxford: Basil Blackwell.

  • Hakulinen, Auli, Maria Vilkuna, Riitta Korhonen, Vesa Koivisto, Tarja Riitta Heinonen, and Irja Alho. 2005. Iso suomen kielioppi. Helsinki: Suomalaisen Kirjallisuuden Seura.

  • Huet, Gerard. 2005. A functional toolkit for morphological and phonological processing, application to a Sanskrit tagger. Journal of Functional Programming 15: 573–614.

  • Karttunen, Frances, and Lauri Karttunen. 1976. The clitic -kin/-kaan in Finnish. Texas Linguistic Forum 5: 89–118.

  • Koskenniemi, Kimmo. 1983. Two-level morphology: A general computational model for word-form recognition and production. Doctoral diss., University of Helsinki.

  • Montague, Richard. 1974. Formal philosophy. New Haven: Yale University Press. Collected papers edited by Richmond H. Thomason.

  • Nevis, Joel A. 1986. Finnish particle clitics and general clitic theory. Doctoral diss., Department of Linguistics, Ohio State University, Columbus.

  • Ranta, Aarne. 2004. Grammatical Framework: A type-theoretical grammar formalism. Journal of Functional Programming 14: 145–189. http://www.cse.chalmers.se/~aarne/articles/gf-jfp.pdf

  • Ranta, Aarne. 2009. The GF resource grammar library. Linguistics in Language Technology 2. http://elanguage.net/journals/index.php/lilt/article/viewFile/214/158.

  • Ranta, Aarne. 2011. Grammatical framework: Programming with multilingual grammars. Stanford: CSLI.

  • Zwicky, Arnold. 1977. On clitics. Indiana University Linguistic Club 5: 89–118.

Acknowledgements

I am grateful to Janet Pierrehumbert and Atro Voutilainen for useful and encouraging comments on the first version of this paper.

Author information

Corresponding author

Correspondence to Aarne Ranta.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ranta, A. (2012). On the Syntax and Translation of Finnish Discourse Clitics. In: Santos, D., Lindén, K., Ng’ang’a, W. (eds) Shall We Play the Festschrift Game?. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30773-7_14

  • DOI: https://doi.org/10.1007/978-3-642-30773-7_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30772-0

  • Online ISBN: 978-3-642-30773-7

  • eBook Packages: Computer Science, Computer Science (R0)
