Abstract
The paper compares three approaches to prosodic boundary prediction in Russian text: a rule-governed method, a statistical classifier, and a deep learning model. All three aim to predict every possible prosodic boundary in a text from morphological and syntactic information. All features are described in terms of the Universal Dependencies framework and were obtained with the SyntaxNet parser. The rule-governed method runs bottom-up, using information about the edges of syntactic groups and applying both data-driven and hand-written linguistic rules. For the machine learning methods, a conditional random fields (CRF) classifier and a bidirectional LSTM (BiLSTM) model were built, with the following features: part-of-speech tag, syntactic dependency type, syntactic relation embedding, and the presence of a syntactic link between the current word and its adjacent words. The experimental material was taken from the CORPRES corpus, which contains over 30 hours of professionally read speech. Used separately, morphological features are slightly superior to syntactic ones, and combining the two improves the results. The BiLSTM yields the highest F1 measure, 90.4, compared with 88.8 for the CRF and 83.1 for the rule-based method.
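To make the feature set named in the abstract more concrete, the following is a minimal Python sketch of how per-word features of this kind could be derived from a dependency parse. It is an illustration under stated assumptions, not the authors' implementation: the `Token` container, the `linked` and `token_features` helpers, the feature names, and the toy parse are all hypothetical, and the syntactic relation embedding is left out.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    """One word of a dependency-parsed sentence (hypothetical container)."""
    index: int   # 1-based position in the sentence
    form: str    # surface word form
    upos: str    # Universal Dependencies part-of-speech tag
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # Universal Dependencies relation to the head


def linked(a: Token, b: Token) -> bool:
    """Return True if one of the two tokens is the syntactic head of the other."""
    return a.head == b.index or b.head == a.index


def token_features(sentence: List[Token]) -> List[Dict[str, object]]:
    """Build one feature dictionary per token (feature names are assumptions)."""
    features = []
    for i, tok in enumerate(sentence):
        prev_tok = sentence[i - 1] if i > 0 else None
        next_tok = sentence[i + 1] if i + 1 < len(sentence) else None
        features.append({
            "upos": tok.upos,
            "deprel": tok.deprel,
            # Is there a direct dependency arc to the neighbouring word?
            "link_prev": prev_tok is not None and linked(tok, prev_tok),
            "link_next": next_tok is not None and linked(tok, next_tok),
        })
    return features


# Toy sentence "Он читает быстро" with a hand-made parse (illustrative only).
sentence = [
    Token(1, "Он", "PRON", 2, "nsubj"),
    Token(2, "читает", "VERB", 0, "root"),
    Token(3, "быстро", "ADV", 2, "advmod"),
]
feats = token_features(sentence)
```

Dictionaries of this shape could then be passed to a sequence labeller that marks each word as followed or not followed by a boundary; how the paper actually encodes its features for the CRF and the BiLSTM is not specified in the abstract.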
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Menshikova, A., Kocharov, D. (2019). Prosodic Boundaries Prediction in Russian Using Morphological and Syntactic Features. In: Ustalov, D., Filchenkov, A., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2019. Communications in Computer and Information Science, vol 1119. Springer, Cham. https://doi.org/10.1007/978-3-030-34518-1_9
DOI: https://doi.org/10.1007/978-3-030-34518-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34517-4
Online ISBN: 978-3-030-34518-1
eBook Packages: Computer Science (R0)