On Capturing Functional Style of Texts with Part-of-speech Trigrams

  • Conference paper
Creativity in Intelligent Technologies and Data Science (CIT&DS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1909)

Abstract

This article is dedicated to automatic detection of the functional style of natural language texts. Part-of-speech N-grams are selected as text features that capture word order, which in Russian depends on functional style. The approach was tested on the task of classifying texts by functional style and within a content-oriented book recommender system, which uses basic and modified probabilistic topic modeling and selects writings based on the similarity of their style to that of the input. Successful style-based book selection showed that an expectation gap in recommender systems can be filled, since it became possible to match books by style across genre boundaries. The results are applicable to classifying texts by functional style in library and publishing software, to personalized selection of writings in recommender systems for news, scientific articles and fiction, and to further style modelling in dialogue systems and conversational communicative robots, so that an appropriate style can be selected depending on the interlocutor's style, the pragmatics of the polylogue and the participants' roles.
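To make the feature idea concrete, the following is a minimal sketch of how part-of-speech trigram frequencies could be extracted and compared. It is an illustration under stated assumptions, not the authors' implementation: the function names `pos_trigram_profile` and `cosine_similarity`, the choice of cosine similarity, the OpenCorpora-style tag labels and the hand-written tag sequences are all hypothetical, and the texts are assumed to be already tagged by some morphological analyzer.

```python
from collections import Counter
from math import sqrt


def pos_trigram_profile(tags):
    """Return relative frequencies of consecutive part-of-speech trigrams."""
    trigrams = list(zip(tags, tags[1:], tags[2:]))
    counts = Counter(trigrams)
    total = len(trigrams)
    return {tri: n / total for tri, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Hypothetical, hand-written tag sequences (not data from the paper).
text_a = ["NOUN", "VERB", "ADJF", "NOUN", "PREP", "NOUN", "VERB", "ADJF", "NOUN"]
text_b = ["ADJF", "NOUN", "VERB", "ADJF", "NOUN", "PREP", "NOUN"]
text_c = ["VERB", "NPRO", "ADVB", "VERB", "NPRO", "ADVB", "VERB"]

profile_a = pos_trigram_profile(text_a)
profile_b = pos_trigram_profile(text_b)
profile_c = pos_trigram_profile(text_c)

sim_ab = cosine_similarity(profile_a, profile_b)  # texts share several trigrams
sim_ac = cosine_similarity(profile_a, profile_c)  # texts share no trigrams

print(f"similarity(a, b) = {sim_ab:.3f}")
print(f"similarity(a, c) = {sim_ac:.3f}")
```

In this sketch, texts with similar word-order patterns share more trigrams and therefore score higher. A classifier or a recommender could consume such profiles as feature vectors, although the abstract does not specify how the features are weighted or compared.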

Acknowledgements

This research was partly supported by Russian Science Foundation grant No. 19-18-00547, https://rscf.ru/project/19-18-00547/.

Author information

Corresponding author

Correspondence to Liliya Volkova.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Volkova, L., Lanko, A., Romanov, V. (2023). On Capturing Functional Style of Texts with Part-of-speech Trigrams. In: Kravets, A.G., Shcherbakov, M.V., Groumpos, P.P. (eds) Creativity in Intelligent Technologies and Data Science. CIT&DS 2023. Communications in Computer and Information Science, vol 1909. Springer, Cham. https://doi.org/10.1007/978-3-031-44615-3_7

  • DOI: https://doi.org/10.1007/978-3-031-44615-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44614-6

  • Online ISBN: 978-3-031-44615-3

  • eBook Packages: Computer Science, Computer Science (R0)
