Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features

Menshikova, Alla; Kocharov, Daniil

doi:10.1007/978-3-030-34518-1_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1119)

Included in the following conference series:

Conference on Artificial Intelligence and Natural Language

Abstract

The paper compares three approaches to prosodic boundary prediction in Russian text: a rule-based method, a statistical classifier, and a deep learning model. All three aim to predict every possible prosodic boundary in a text from morphological and syntactic information. All features are described in terms of the Universal Dependencies framework and obtained with the SyntaxNet parser. The rule-based method works bottom-up, using the edges of syntactic groups and applying both data-driven and hand-written linguistic rules. For the machine learning methods, a conditional random fields (CRF) classifier and a bidirectional LSTM (BiLSTM) model were built, using the following features: part-of-speech tag, syntactic dependency type, syntactic relation embedding, and the presence of a syntactic link between the current word and its adjacent words. The experimental material was the CORPRES corpus, which contains over 30 hours of professionally read speech. Used separately, morphological features are slightly superior to syntactic ones, and combining them improves the results. BiLSTM yields the highest F₁ measure, 90.4, compared with 88.8 for CRF and 83.1 for the rule-based method.
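To make the feature set in the abstract more concrete, the following is a minimal sketch of how per-word features could be derived from a Universal Dependencies parse, and how an F₁ measure over boundary labels could be computed. It is an illustration under stated assumptions, not the authors' implementation: the `Token` class, the function names, the feature keys, and the hand-made parse of the toy sentence are all hypothetical, and the syntactic relation embedding is left out because the abstract does not say how it was built.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    """One word of a dependency-parsed sentence (hypothetical container)."""
    form: str    # word form
    upos: str    # Universal Dependencies part-of-speech tag
    head: int    # 1-based index of the syntactic head, 0 for the root
    deprel: str  # Universal Dependencies relation to the head


def linked(tokens: List[Token], i: int, j: int) -> bool:
    """True if tokens i and j (0-based) are joined directly by a dependency arc."""
    if not (0 <= i < len(tokens) and 0 <= j < len(tokens)):
        return False
    return tokens[i].head == j + 1 or tokens[j].head == i + 1


def token_features(tokens: List[Token], i: int) -> Dict[str, object]:
    """Features for word i: POS tag, dependency type, links to the neighbours."""
    return {
        "upos": tokens[i].upos,
        "deprel": tokens[i].deprel,
        "link_prev": linked(tokens, i, i - 1),
        "link_next": linked(tokens, i, i + 1),
    }


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (1 = a boundary follows the word)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sentence with a hand-made parse (an assumption, not parser output).
sentence = [
    Token("Мама", "NOUN", 3, "nsubj"),
    Token("долго", "ADV", 3, "advmod"),
    Token("мыла", "VERB", 0, "root"),
    Token("раму", "NOUN", 3, "obj"),
]
features = [token_features(sentence, i) for i in range(len(sentence))]
```

Feature dictionaries of this shape could be passed to a sequence labeller such as a CRF, or mapped to vectors for a BiLSTM; which encoding the paper actually used is not stated in the abstract.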

Author information

Authors and Affiliations

Saint Petersburg State University, St. Petersburg, 199034, Russia
Alla Menshikova & Daniil Kocharov

Corresponding author

Correspondence to Alla Menshikova.

Editor information

Editors and Affiliations

Krasovskii Institute of Mathematics and Mechanics, Yekaterinburg, Russia
Dmitry Ustalov
ITMO University, St. Petersburg, Russia
Andrey Filchenkov
Computer Science, University of Helsinki, Helsinki, Finland
Lidia Pivovarova

About this paper

Cite this paper

Menshikova, A., Kocharov, D. (2019). Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features. In: Ustalov, D., Filchenkov, A., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2019. Communications in Computer and Information Science, vol 1119. Springer, Cham. https://doi.org/10.1007/978-3-030-34518-1_9

DOI: https://doi.org/10.1007/978-3-030-34518-1_9
Published: 13 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34517-4
Online ISBN: 978-3-030-34518-1
eBook Packages: Computer Science, Computer Science (R0)
