Abstract
Automatic summarization is the task of condensing data into a shorter version while preserving its key informational components and meaning. In this paper, we introduce Tech-Talk-Sum, a model that combines BERT (Bidirectional Encoder Representations from Transformers) with the attention mechanism to summarize technology talk videos. We first introduce the technology talk datasets constructed from YouTube, comprising short- and long-talk videos. Second, we explore various sentence representations derived from BERT's output; using the top hidden layer to represent sentences proved the best choice for our datasets. The BERT outputs are fed forward to a Bi-LSTM network to build local context vectors. In addition, we build a document encoder layer that leverages BERT and the self-attention mechanism to express the semantics of a video caption and to form the global context vector. Third, a unidirectional LSTM is added to bridge the local and global sentence contexts and predict each sentence's salience score. Finally, video summaries are generated based on these scores. We trained a single unified model on the long-talk video datasets and used ROUGE to evaluate our proposed methods. The experimental results demonstrate that our model has generalization ability, outperforming the baselines and achieving state-of-the-art results for both long and short videos.
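The extractive pipeline described above can be sketched in simplified form: sentence embeddings (standing in for BERT top-layer outputs) are self-attended to form a global document context vector, concatenated with per-sentence local vectors (here the raw embeddings, in place of the paper's Bi-LSTM encoder), and scored with a sigmoid; the top-scoring sentences form the summary. This is a minimal illustration under those stated simplifications, not the authors' implementation; all function names and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_global(S):
    # S: (n_sentences, d) sentence embeddings (stand-in for BERT top-layer outputs).
    scores = S @ S.T / np.sqrt(S.shape[1])   # scaled dot-product self-attention
    A = softmax(scores, axis=-1)
    attended = A @ S                          # (n, d) contextualized sentence vectors
    return attended.mean(axis=0)              # (d,) global document context vector

def salience_scores(S, w):
    # Concatenate each local sentence vector with the shared global context,
    # then score each sentence with a sigmoid over a linear projection.
    g = self_attention_global(S)
    feats = np.concatenate([S, np.tile(g, (S.shape[0], 1))], axis=1)  # [local; global]
    return 1.0 / (1.0 + np.exp(-(feats @ w)))

rng = np.random.default_rng(0)
S = rng.normal(size=(5, 8))            # 5 caption sentences, 8-dim embeddings
w = rng.normal(size=16)                # scoring weights over concatenated features
scores = salience_scores(S, w)         # one salience score in (0, 1) per sentence
summary_idx = np.argsort(-scores)[:2]  # keep the 2 highest-scoring sentences
```

In the paper, the local vectors come from a Bi-LSTM over BERT outputs and a unidirectional LSTM fuses the two contexts before scoring; the sketch collapses those steps into a single linear scorer to keep the ranking idea visible.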
Acknowledgements
The authors would like to thank the Ministry of Science and Technology of Taiwan for supporting our research under grant No. MOST105-2923-S-008-001-MY3.
Cite this article
Chootong, C., Shih, T.K. Tech-Talk-Sum: fine-tuning extractive summarization and enhancing BERT text contextualization for technological talk videos. Multimed Tools Appl 81, 31295–31312 (2022). https://doi.org/10.1007/s11042-022-12812-4