Central Audio-Library of the University of Novi Sad
This paper presents the Central Audio Library of the University of Novi Sad (CABUNS), a project aimed at the automated creation of audio editions of textbooks, presentations and other course material using new text-to-speech synthesis technology for the Serbian language. The paper describes the architecture and features of the developed system from the points of view of both teachers and assistants, who upload course material to the CABUNS server, and students, who can download the audio editions and listen to them (and view them) on their computers and mobile phones. Examples of the first audio editions of textbooks and PowerPoint presentations for the course Acoustics and Audio Engineering are presented. The paper also analyzes the advantages and drawbacks of this new learning technology, which has the potential to contribute greatly to the quality of higher education, as well as to education at other levels. Finally, the paper presents the most recent results in the development of text-to-speech synthesis enabling voice conversion, which means that it will soon be possible to produce an audio edition in the voice of the textbook's author or of the person delivering the lecture.
Keywords: Audio library · Audio editions of textbooks · New digital learning technologies · Text-to-speech synthesis · Deep neural networks · Voice conversion
The work described in this paper was supported in part by the Ministry of Education, Science and Technological Development of the Republic of Serbia, within the project “Development of Dialogue Systems for Serbian and Other South Slavic Languages”, and the Provincial Secretariat for Higher Education and Scientific Research, within the project “Central Audio-Library of the University of Novi Sad”, No. 114-451-2570/2016-02.