
Evaluation on Noise Reduction in Subtitle Generator for Videos

  • Conference paper
  • Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS 2022)

Abstract

Watching movies and videos on the internet now serves many needs, including learning, entertainment, and research. Artificial intelligence is widely applied to translation and speech recognition, and these fields continue to develop in many research directions, such as speech-to-text applications that operate on specific audio files. However, existing studies usually focus on improving processing speed and the accuracy of converting the words in an audio file into text, not on clarifying the voice within the audio file so that it can be recognized easily and accurately. Likewise, no tool automatically creates subtitles for videos for free; subtitles must be created manually by setting timestamps and adding text, which is time-consuming for long movies or videos. This study therefore proposes a new approach that combines audio processing for noise reduction and noise removal with audio-to-text recognition to build a tool that generates subtitles automatically with high accuracy. The results are experimental, intended to open a research direction that can be developed into viable applications for generating video subtitles without manual work, with an accuracy of about 80%.
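The pipeline the abstract describes (denoise the audio, transcribe it, then emit timed subtitles) ends in a subtitle-assembly step. A minimal sketch of that final step is shown below, assuming the earlier denoising and speech-to-text stages (not shown) have produced a list of `(start, end, text)` segments; the segment data here is a hypothetical placeholder, not output from the authors' system.

```python
# Sketch of the subtitle-assembly step: turn transcribed segments
# (start/end in seconds, text) into a SubRip (.srt) document.
# In the full pipeline these segments would come from a noise-reduction
# pass followed by an ASR engine, which are not shown here.

def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT 'HH:MM:SS,mmm' timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Assemble numbered SRT cues from (start, end, text) triples."""
    cues = []
    for i, (start, end, text) in enumerate(segments, start=1):
        cues.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(cues)

# Placeholder segments standing in for real ASR output.
demo = [(0.0, 2.5, "Hello and welcome."),
        (2.5, 5.0, "This is an auto-generated subtitle.")]
print(to_srt(demo))
```

The cue numbering and `HH:MM:SS,mmm` timestamps follow the SubRip convention, which most video players accept directly.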


Notes

  1. https://github.com/alexkay/spek, accessed on 10 March 2022.

  2. https://pypi.org/project/PyQt5/.


Author information

Corresponding author

Correspondence to Dien Thanh Tran.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Nguyen, H.T., Thanh, T.N.L., Ngoc, T.L., Le, A.D., Tran, D.T. (2022). Evaluation on Noise Reduction in Subtitle Generator for Videos. In: Barolli, L. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2022. Lecture Notes in Networks and Systems, vol 496. Springer, Cham. https://doi.org/10.1007/978-3-031-08819-3_14
