Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features

  • Conference paper
Artificial Intelligence and Natural Language (AINL 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1119))

Abstract

This paper compares three approaches to predicting prosodic boundaries in Russian text: a rule-based method, a statistical classifier, and a deep learning model. All three aim to predict every possible prosodic boundary in a text from morphological and syntactic information. All features are described in terms of the Universal Dependencies framework and are obtained with the SyntaxNet parser. The rule-based method works bottom-up, using information about the edges of syntactic groups and applying both data-driven and hand-written linguistic rules. For the machine learning methods, a conditional random fields (CRF) classifier and a bidirectional LSTM (BiLSTM) model were built, using the following features: part-of-speech tag, syntactic dependency type, syntactic relation embedding, and the presence of a syntactic link between the current word and its adjacent words. The experimental material was the CORPRES corpus, which contains over 30 hours of professionally read speech. Used separately, morphological features are slightly superior to syntactic ones, and combining the two improves the results. The BiLSTM yields the highest F1 measure, 90.4, compared with 88.8 for the CRF and 83.1 for the rule-based method.
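
To make the feature set concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each word is available as a CoNLL-U-style record produced by a Universal Dependencies parser, and it derives three of the features named above: part-of-speech tag, dependency type, and the presence of a syntactic link to the adjacent words. The `Token` type, the function names, and the toy sentence are all hypothetical. The F1 helper uses the standard harmonic mean of precision and recall over boundary labels; the abstract does not state how the reported scores were computed.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    """Hypothetical CoNLL-U-style record for one word."""
    idx: int     # 1-based position in the sentence
    form: str    # word form
    upos: str    # Universal Dependencies part-of-speech tag
    head: int    # position of the syntactic head, 0 for the root
    deprel: str  # Universal Dependencies relation to the head


def linked(a: Token, b: Token) -> bool:
    """True if one of the two tokens is the syntactic head of the other."""
    return a.head == b.idx or b.head == a.idx


def token_features(sent: List[Token], i: int) -> Dict[str, object]:
    """Features for the word at list index i (an assumed feature layout)."""
    tok = sent[i]
    feats: Dict[str, object] = {"upos": tok.upos, "deprel": tok.deprel}
    if i > 0:
        feats["link_prev"] = linked(tok, sent[i - 1])
    if i + 1 < len(sent):
        feats["link_next"] = linked(tok, sent[i + 1])
    return feats


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (1 = boundary after the word)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sentence with a hand-written parse, for illustration only.
SENT = [
    Token(1, "Мама", "NOUN", 2, "nsubj"),
    Token(2, "мыла", "VERB", 0, "root"),
    Token(3, "раму", "NOUN", 2, "obj"),
]

FEATURES = [token_features(SENT, i) for i in range(len(SENT))]
```

Feature dictionaries of this shape could be passed to a CRF toolkit or encoded as inputs to a sequence model. The syntactic relation embedding mentioned in the abstract is not covered by this sketch.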

Author information

Corresponding author

Correspondence to Alla Menshikova.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Menshikova, A., Kocharov, D. (2019). Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features. In: Ustalov, D., Filchenkov, A., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2019. Communications in Computer and Information Science, vol 1119. Springer, Cham. https://doi.org/10.1007/978-3-030-34518-1_9

  • DOI: https://doi.org/10.1007/978-3-030-34518-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34517-4

  • Online ISBN: 978-3-030-34518-1

  • eBook Packages: Computer Science, Computer Science (R0)
