Going Beyond “Aboutness”: A Quantitative Analysis of Sputnik Czech Republic

Taming the Corpus

Abstract

This paper attempts to unpack the “alternativeness” of Sputnik Czech Republic, an online news and opinion portal that targets a Czech-speaking audience. The overarching principle of the analysis is prominence, a concept from the corpus linguistic method of keyword analysis. Multi-level Discourse Prominence Analysis (MLDPA), which combines quantitative data with concepts from critical discourse analysis and cognitive linguistics, extends the applicability of prominence beyond the lexicon to multiple levels of language and sheds light on the overarching rhetoric and ideology of a text. The centerpiece of MLDPA is “keymorph analysis,” which applies the cognitive linguistic notion of morphemes as meaning-bearing units (Janda 1993; Janda and Clancy 2006) to the existing corpus linguistic method of keyword analysis. MLDPA helps identify and objectivize the ideological content of news in media that create the impression of objective and well-balanced news.

Notes

  1.

    https://sputniknews.com/docs/about/index.html, accessed September 22, 2018.

  2.

    The use of podle parallels the use of “neutral structuring verbs” (Caldas-Coulthard, 1994) that “introduce a saying without evaluating it explicitly” (Machin and Mayer, 2012, p. 59).

  3.

    The show is cited incorrectly as “Le Lene” instead of the actual “Le Iene.”

  4.

    Emphasis in bold style by the authors.

  5.

    All the examples from SPUCz used in this article were last checked and were present on the web on June 22, 2018.

  6.

    The construction připravovat + infinitive is unnatural, though not entirely erroneous, in Czech.

  7.

    Filip is the current chairman of the Czech Communist Party.

  8.

    “[a] word form which recurs within the text in question will be more likely to be key in it.” (Scott & Tribble, 2006).

  9.

    Extraction of KWs is the first statistical step (“keywords are pointers, that is all” (Scott, 2010)). KWs are often further analyzed with other methods of corpus linguistics (e.g., collocation profiles and “semantic prosody” (Stewart, 2010)).

  10.

    Several statistical tests can be used to compare relative frequencies and to determine whether the difference between them is statistically significant, such as the log-likelihood, chi-squared, or Fisher exact tests (cf. Bertels & Speelman, 2013). However, statistical significance expressed by a p-value is a necessary but not sufficient condition of prominence. Given that these tests are typically asymptotically valid, p-values (esp. when computed on large data sets) do not tell us whether the difference between the frequencies carries any descriptive value (cf. Wilson, 2013). As a result, tests are often accompanied by an effect size estimate, such as the Difference Index (DIN): a ratio (multiplied by 100) of the difference between the relative frequencies of an item in the target text and in the reference corpus to the mean of those relative frequencies (cf. Fidler & Cvrček, 2015).

  11.

    The term KWs therefore differs from query terms in search engines or cultural keywords (Williams, 1976). The identification of KWs has a clear quantitative basis; “…it is less subject to the vagaries of subjective judgments of cultural importance … [and] it does not rely on researchers selecting items that might be important… but can reveal items that researchers did not know to be important in the first place.” (Culpeper & Demmen, 2015, p. 90)

  12.

    Further discussion of the influence of the reference corpus on the results of KWA can be found in Scott (2013).

  13.

    While the target corpus may be biased towards the presence of words formed from these stems, it allows us to focus on the image of these countries specifically (especially Russia and Ukraine).

  14.

    Both corpora are available upon request at www.korpus.cz.

  15.

    The significance level used in this study was set to 0.001 and the minimum effect size was set to DIN = 75.

  16.

    This procedure involves the level of prominence (DIN), the number of prominent units, and the number of all content words in a sentence. It investigates sentence types that are likely to attract reader attention by measuring the density of KLs.

  17.

    For example, the lemma hrad ‘castle’ can appear in multiple word forms in Standard Czech: hrad (nom/acc sg), hradu (gen/dat sg), hradě (loc sg), hradem (instr sg), hrady (nom/acc/voc/instr pl), hradů (gen pl), hradům (dat pl), and hradech (loc pl).

  18.

    Here, we only discuss common nouns, as they are most likely to be associated with the representation of entities, individuals, and events.

  19.

    Proper nouns and adjectives directly derived from them are not discussed here.

  20.

    Cf. “collocations create connotations” (Stubbs, 2005, p. 14). The contextual properties of keywords are thus examined by their links (Scott & Tribble, 2006) to other keywords (i.e., co-occurrence of KWs within a textual span).

  21.

    The collocates were searched within a span of three words on either side of the KWIC and were ranked first by logDice and then by frequency.

  22.

    Collocates here are lemmas that are not necessarily keyed.

  23.

    The appearance of KWs referring to presidents among the collocations is expected, as the major seed words include names of presidents (e.g., Putin and Poroshenko).

  24.

    We excluded the remaining adverbs: zahraničně, which occurs as part of the descriptive phrases zahraničně-politický/-ekonomický/-obchodní ‘internationally-politically/-economically/-commercially,’ and the adverb odkladně (used in neodkladně ‘urgently’).

  25.

    Subjects were manually checked and categorized.

  26.

    The subjects were manually identified and include instances where the subject is implicit and/or is mentioned in the surrounding discourse.

  27.

    DIN here (marked with the asterisk) is calculated differently than for KLs. The prominence of each case is calculated relative to all occurrences of a given lemma in SPUCz and SYN2015, respectively (i.e., not relative to the number of tokens in the corpus), as in Table 10.17.

  28.

    The instrumental case is highly collocated with the preposition s ‘with’ in Czech.

  29.

    The sentences were examined by each co-author independently first. The co-authors then discussed their differences and reached a mutually acceptable categorization.
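
Notes 10 and 15 describe a two-part filter for prominence: a significance test on the frequency difference plus a minimum effect size (DIN). The following is a minimal sketch of that idea, not the authors' implementation. It assumes the log-likelihood test (one of the tests listed in note 10) and the chi-squared critical value of 10.83 for one degree of freedom at p < 0.001. The function names and the `normalize_by` switch are hypothetical: `"mean"` follows the wording of note 10 literally, while `"sum"` divides by the sum of the two relative frequencies and keeps DIN between -100 and 100. Which variant the study used should be checked against Fidler & Cvrček (2015).

```python
import math

CHI2_CRITICAL_P001 = 10.83  # chi-squared critical value, 1 d.f., p < 0.001
MIN_DIN = 75.0              # minimum effect size stated in note 15


def din(count_t, size_t, count_r, size_r, normalize_by="sum"):
    """Difference Index of an item in a target text vs. a reference corpus.

    normalize_by="mean" follows note 10 literally; "sum" (an assumption)
    bounds the result to the interval [-100, 100].
    """
    if normalize_by not in ("sum", "mean"):
        raise ValueError("normalize_by must be 'sum' or 'mean'")
    rel_t = count_t / size_t  # relative frequency in the target text
    rel_r = count_r / size_r  # relative frequency in the reference corpus
    total = rel_t + rel_r
    if total == 0:
        return 0.0
    denominator = total if normalize_by == "sum" else total / 2
    return 100 * (rel_t - rel_r) / denominator


def log_likelihood(count_t, size_t, count_r, size_r):
    """Log-likelihood (G2) statistic for one item in two corpora."""
    total = count_t + count_r
    expected_t = size_t * total / (size_t + size_r)
    expected_r = size_r * total / (size_t + size_r)
    g2 = 0.0
    if count_t > 0:
        g2 += count_t * math.log(count_t / expected_t)
    if count_r > 0:
        g2 += count_r * math.log(count_r / expected_r)
    return 2 * g2


def is_prominent(count_t, size_t, count_r, size_r):
    """True if the item passes both the significance and effect-size cut-offs."""
    significant = log_likelihood(count_t, size_t, count_r, size_r) >= CHI2_CRITICAL_P001
    large_effect = din(count_t, size_t, count_r, size_r) >= MIN_DIN
    return significant and large_effect


# Invented toy counts: a 100,000-token target text vs. a 1,000,000-token reference.
print(is_prominent(30, 100_000, 10, 1_000_000))     # frequent and strongly over-represented
print(is_prominent(150, 100_000, 1000, 1_000_000))  # significant, but small effect size
print(is_prominent(2, 100_000, 1, 1_000_000))       # large ratio, but too rare to be significant
```

The second and third toy items illustrate why neither criterion is sufficient alone: one is statistically significant with a small DIN, the other has a large DIN but too few occurrences to reach significance.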

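The collocation procedure in note 21 (a span of three words on either side of the node, ranking first by logDice and then by frequency) can be sketched as follows. This is an illustrative assumption rather than the tool the authors used: it takes an already lemmatized token list, computes frequencies over that list only, and uses the standard logDice formula 14 + log2(2·f_xy / (f_x + f_y)). The function name `ranked_collocates` and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def ranked_collocates(tokens, node, span=3):
    """Rank collocates of `node` by logDice, then by co-occurrence frequency."""
    freq = Counter(tokens)
    n = len(tokens)
    # Collect every position within `span` tokens of some occurrence of the node.
    # Using a set counts each position once, even when two windows overlap.
    positions = set()
    for i, tok in enumerate(tokens):
        if tok == node:
            for j in range(max(0, i - span), min(n, i + span + 1)):
                if tokens[j] != node:
                    positions.add(j)
    cooccurrence = Counter(tokens[j] for j in positions)
    rows = []
    for word, f_xy in cooccurrence.items():
        log_dice = 14 + math.log2(2 * f_xy / (freq[node] + freq[word]))
        rows.append((word, log_dice, f_xy))
    # Primary key: logDice (descending); secondary key: frequency (descending);
    # the word itself is a final tie-breaker so the order is deterministic.
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows


# Invented toy sequence of lemmas; "node" stands in for a keyword.
toy = ["w1", "node", "w2", "w3", "w4", "w5", "w2", "node", "w6"]
for word, log_dice, f_xy in ranked_collocates(toy, "node"):
    print(f"{word}\t{log_dice:.2f}\t{f_xy}")
```

Because logDice depends only on the frequencies of the node, the collocate, and their co-occurrence, scores stay comparable across corpora of different sizes, which is useful when a small target corpus is set against a large reference corpus.
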
References

  • Altshuler, D. (2010). Aspect in English and Russian flashback discourses. Oslo Studies in Language, 2, 75–107.

  • Baker, P., & McEnery, T. (2005). A corpus-based approach to discourse of refugees and asylum seekers in UN and newspaper texts. Journal of Language and Politics, 4(2), 197–226.

  • Baker, P. (2005). The public discourse of gay men. London: Routledge.

  • Baker, P. (2009). The question is, how cruel is it? Keywords in debates on fox hunting in the British House of Commons. In D. Archer (Ed.), What’s in a word-list? (pp. 125–136). London: Ashgate.

  • Bertels, A., & Speelman, D. (2013). ‘Keywords method’ versus ‘Calcul des Spécificités’. International Journal of Corpus Linguistics, 18(4), 536–560.

  • Biber, D. (1993). Using register-diversified corpora for general language studies. Computational Linguistics, 19(2), 219–241.

  • Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.

  • Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and written English. Harlow, UK: Longman.

  • Caldas-Coulthard, C. (1994). On reporting reporting: The representation of speech in factual and factional narratives. In M. Coulthard (Ed.), Advances in written text analysis (pp. 295–308). London: Routledge.

  • Chvany, C. (1990). Verbal aspect, discourse saliency, and the so-called perfect of result in Modern Russian. In N. B. Thelin (Ed.), Verbal aspect in discourse (pp. 213–236). Amsterdam: John Benjamins.

  • Culpeper, J. (2002). Computers, language and characterisation: An analysis of six characters in Romeo and Juliet. In U. Melander-Marttala, C. Östman, & M. Kytö (Eds.), Conversation in life and in literature: Papers from the ASLA symposium (Vol. 15, pp. 11–30). Uppsala, Sweden: Association Suédoise de Linguistique Appliquée.

  • Culpeper, J. (2009). Keyness: Words, parts-of-speech and semantic categories in the character-talk of Shakespeare’s Romeo and Juliet. International Journal of Corpus Linguistics, 14(1), 29–59.

  • Culpeper, J., & Demmen, J. (2015). Keywords. In D. Biber & R. Reppen (Eds.), The Cambridge handbook of English corpus linguistics (pp. 90–105). Cambridge, UK: Cambridge University Press.

  • Cvrček, V., & Fidler, M. (forthcoming). More than keywords: Discourse prominence analysis of the Russian web portal Sputnik Czech Republic. In A. Salamurovič & M. Berrocal (Eds.), Language in politics in Slavic-speaking countries.

  • Desclés, J.-P., & Guentschéva, Z. (1990). Discourse analysis of aorist and imperfect in Bulgarian and French. In N. B. Thelin (Ed.), Verbal aspect in discourse (pp. 237–261). Amsterdam: John Benjamins.

  • Fidler, M., & Cvrček, V. (2015). A data-driven analysis of reader viewpoints: Reconstructing the historical reader using keyword analysis. Journal of Slavic Linguistics, 23(2), 197–239.

  • Fairclough, N. (1995). Media discourse. London: Hodder Education.

  • Fidler, M., & Cvrček, V. (2017). Keymorph analysis, or how morphosyntax informs discourse. Corpus Linguistics and Linguistic Theory. https://doi.org/10.1515/cllt-2016-0073. Accessed 29 Sept 2018.

  • Fielder, G. (1990). Narrative context and Russian aspect. In N. B. Thelin (Ed.), Verbal aspect in discourse (pp. 263–284). Amsterdam: John Benjamins.

  • Fischer-Starcke, B. (2009). Keywords and frequent phrases of Jane Austen’s Pride and Prejudice. A corpus-stylistic analysis. International Journal of Corpus Linguistics, 14(4), 492–523.

  • Groll, E. (2014). Kremlin’s ‘Sputnik’ newswire is the BuzzFeed of propaganda. Foreign Policy. https://foreignpolicy.com/2014/11/10/kremlins-sputnik-newswire-is-the-buzzfeed-of-propaganda/. Accessed 3 July 2017.

  • Heritage, T. (2013, December 9). Putin dissolves state news agency, tightens grip on Russia media. Reuters World News. http://www.reuters.com/article/us-russia-media-idUSBRE9B80I120131209. Accessed 17 July 2017.

  • Hopper, P., & Thompson, S. (1980). Transitivity. Language, 56(2), 251–299.

  • Jäger, S., & Maier, F. (2016). Analysing discourses and dispositives: A Foucauldian approach to theory and methodology. In R. Wodak & M. Meyer (Eds.), Methods of critical discourse studies (3rd ed., pp. 109–136). London: Sage.

  • Jakobson, R. (1984). Contribution to the general theory of case: General meanings of the Russian cases. In L. R. Waugh & M. Halle (Eds.), Roman Jakobson. Russian and Slavic grammar. Studies 1931–1981 (pp. 59–103). Berlin: Mouton.

  • Janda, L. A. (1993). The shape of the indirect object in Central and Eastern Europe. The Slavic and East European Journal, 37(4), 533–563.

  • Janda, L. A., & Clancy, S. (2006). The case book for Czech. Bloomington, IN: Slavica.

  • Kresin, S. (1998). Deixis and thematic hierarchies in Russian narrative discourse. Journal of Pragmatics, 30(4), 421–435.

  • Křen, M., Cvrček, V., Čapka, T., Čermáková, A., Hnátková, M., Chlumská, L., et al. (2016). SYN2015: Representative corpus of contemporary written Czech. In N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, et al. (Eds.), Proceedings of the tenth international conference on language resources and evaluation (LREC’16) (pp. 2522–2528). Portorož, Slovenia: ELRA. http://www.lrec-conf.org/proceedings/lrec2016/index.html. Accessed 29 Sept 2018.

  • MacFarquhar, N. (2016, August 28). A powerful Russian weapon: The spread of false stories. The New York Times. https://www.nytimes.com/2016/08/29/world/europe/russia-sweden-disinformation.html. Accessed 17 July 2017.

  • Machin, D., & Mayer, A. (2012). How to do critical discourse analysis: A multimodal introduction. Los Angeles: Sage.

  • Mahlberg, M. (2007). Clusters, key clusters and local textual functions in Dickens. Corpora, 2(1), 1–31.

  • Scott, M. (2010). Problems in investigating keyness, or cleansing the undergrowth and marking out tails…. In M. Bondi & M. Scott (Eds.), Keyness in texts (pp. 43–57). Amsterdam: John Benjamins.

  • Scott, M. (2013). WordSmith tools manual. Version 7.0. Liverpool, UK: Lexical Analysis Software. http://www.lexically.net/downloads. Accessed 29 Sept 2018.

  • Scott, M., & Tribble, C. (2006). Textual patterns: Keyword and corpus analysis in language education. Amsterdam: John Benjamins.

  • Smoleňová, I. (2015, June). The pro-Russian disinformation campaign in the Czech Republic and Slovakia. Types of media spreading pro-Russian propaganda, their characteristics and frequently used narratives. Prague Security Studies Institute (PSSI). http://www.pssi.cz/download/docs/253_is-pro-russian-campaign.pdf. Accessed 17 July 2017.

  • Sonnenhauser, B. (2008). Aspect interpretation in Russian—A pragmatic account. Journal of Pragmatics, 40(12), 2077–2099.

  • Stewart, D. (2010). Semantic prosody. A critical evaluation. New York: Routledge.

  • Straková, J., Straka, M., & Hajič, J. (2014). Open-source tools for morphology, lemmatization, POS tagging and named entity recognition. In Proceedings of 52nd annual meeting of the Association for Computational Linguistics: System demonstrations, Baltimore, Maryland, June 2014 (pp. 13–18). Stroudsburg, PA: Association for Computational Linguistics.

  • Stubbs, M. (2005). Conrad in the computer: Examples of quantitative stylistic methods. Language and Literature, 14(1), 5–24.

  • Tabbert, U. (2015). Crime and corpus. The linguistic representation of crime in the press. Philadelphia: John Benjamins.

  • Ueda, M. (1992). The interaction between clause-level parameters and context in Russian morphosyntax: Genitive of negation and predicate adjectives. Munich, Germany: Otto Sagner.

  • Walker, B. (2010). Wmatrix, key concepts and the narrator in Julian Barnes’s Talking It Over. In D. McIntyre & B. Busse (Eds.), Language and style (pp. 364–387). Basingstoke, UK: Palgrave Education.

  • Williams, R. (1976). Keywords: A vocabulary of culture and society. New York: Oxford University Press.

  • Wilson, A. (2013). Embracing Bayes factors for key item analysis in corpus linguistics. In M. Bieswanger & A. Koll-Stobbe (Eds.), New approaches to the study of linguistic variability. Language competence and language awareness in Europe (pp. 3–11). Frankfurt, Germany: Peter Lang.

Acknowledgments

This paper was supported in part by the program Progres Q08 Czech National Corpus, implemented at the Faculty of Arts, Charles University, and by the Brown University Humanities Research Funds. The authors would also like to thank Katie Krafft for data collection.

Corresponding author

Correspondence to Václav Cvrček.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Fidler, M., Cvrček, V. (2018). Going Beyond “Aboutness”: A Quantitative Analysis of Sputnik Czech Republic. In: Fidler, M., Cvrček, V. (eds) Taming the Corpus. Quantitative Methods in the Humanities and Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-98017-1_10
