Abstract
The paper reports experiments on building a statistical model of a Russian text corpus on musicology. We construct a topic model based on Latent Dirichlet Allocation and process the corpus data with the GenSim statistical toolkit. The experimental results allow us to distinguish general and special topics that describe the conceptual structure of the corpus, and to analyze paradigmatic and syntagmatic relations between lemmata within topics.
The research discussed in the paper is supported by grant № 30.38.305.2014 of St.-Petersburg State University, «Quantitative linguistic parameters for defining stylistic characteristics and subject area of texts».
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mitrofanova, O. (2015). Probabilistic Topic Modeling of the Russian Text Corpus on Musicology. In: Eismont, P., Konstantinova, N. (eds) Language, Music, and Computing. LMAC 2015. Communications in Computer and Information Science, vol 561. Springer, Cham. https://doi.org/10.1007/978-3-319-27498-0_6
DOI: https://doi.org/10.1007/978-3-319-27498-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27497-3
Online ISBN: 978-3-319-27498-0
eBook Packages: Computer Science, Computer Science (R0)