Abstract
Automatic summarization is the task of condensing data into a shorter version while preserving its key informational components and meaning. In this paper, we introduce Tech-Talk-Sum, a model that combines BERT (Bidirectional Encoder Representations from Transformers) with the attention mechanism to summarize technology talk videos. We first introduce the technology talk datasets constructed from YouTube, comprising short- and long-talk videos. Second, we explore various sentence representations derived from BERT's output; using the top hidden layer to represent sentences proved the best choice for our datasets. The BERT outputs are fed forward to a Bi-LSTM network to build local context vectors. In addition, we build a document encoder layer that leverages BERT and the self-attention mechanism to express the semantics of a video caption and to form the global context vector. Third, a unidirectional LSTM is added to bridge the local and global sentence contexts and predict each sentence's salience score. Finally, video summaries are generated based on these scores. We trained a single unified model on the long-talk video datasets and used ROUGE to evaluate our proposed methods. The experimental results demonstrate that our model has generalization ability, outperforming the baselines and achieving state-of-the-art results for both long and short videos.
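The extractive pipeline described above can be sketched in simplified form: sentence embeddings (standing in for BERT top-layer outputs) are self-attended to form a global document context vector, concatenated with per-sentence local vectors (here the raw embeddings, in place of the paper's Bi-LSTM encoder), and scored with a sigmoid; the top-scoring sentences form the summary. This is a minimal illustration under those stated simplifications, not the authors' implementation; all function names and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_global(S):
    # S: (n_sentences, d) sentence embeddings (stand-in for BERT top-layer outputs).
    scores = S @ S.T / np.sqrt(S.shape[1])   # scaled dot-product self-attention
    A = softmax(scores, axis=-1)
    attended = A @ S                          # (n, d) contextualized sentence vectors
    return attended.mean(axis=0)              # (d,) global document context vector

def salience_scores(S, w):
    # Concatenate each local sentence vector with the shared global context,
    # then score each sentence with a sigmoid over a linear projection.
    g = self_attention_global(S)
    feats = np.concatenate([S, np.tile(g, (S.shape[0], 1))], axis=1)  # [local; global]
    return 1.0 / (1.0 + np.exp(-(feats @ w)))

rng = np.random.default_rng(0)
S = rng.normal(size=(5, 8))            # 5 caption sentences, 8-dim embeddings
w = rng.normal(size=16)                # scoring weights over concatenated features
scores = salience_scores(S, w)         # one salience score in (0, 1) per sentence
summary_idx = np.argsort(-scores)[:2]  # keep the 2 highest-scoring sentences
```

In the paper, the local vectors come from a Bi-LSTM over BERT outputs and a unidirectional LSTM fuses the two contexts before scoring; the sketch collapses those steps into a single linear scorer to keep the ranking idea visible.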
Acknowledgements
The authors would like to thank the Ministry of Science and Technology of Taiwan for supporting our research under grant No. MOST105-2923-S-008-001-MY3.
Cite this article
Chootong, C., Shih, T.K. Tech-Talk-Sum: fine-tuning extractive summarization and enhancing BERT text contextualization for technological talk videos. Multimed Tools Appl 81, 31295–31312 (2022). https://doi.org/10.1007/s11042-022-12812-4