Abstract
We propose a hybrid neural network hidden Markov model (NN-HMM) approach for automatic story segmentation. A story is treated as an instance of an underlying topic (a hidden state), and words are generated from the distribution of that topic. A transition from one topic to another indicates a story boundary. Unlike the traditional HMM approach, in which the emission probability of each state is calculated from a topic-dependent language model, we use a deep neural network (DNN) to directly map the word distribution into topic posterior probabilities. DNNs are known to learn meaningful continuous features for words and hence have better discriminative and generalization capability than n-gram models. Specifically, we investigate three neural network structures: a feed-forward neural network, a recurrent neural network with long short-term memory cells (LSTM-RNN), and a modified LSTM-RNN with multi-task learning ability. Experimental results on the TDT2 corpus show that the proposed NN-HMM approach significantly outperforms the traditional HMM approach and achieves state-of-the-art performance in story segmentation.
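The decoding idea behind a hybrid NN-HMM can be sketched as follows: the network's per-sentence topic posteriors are converted into scaled likelihoods (dividing by the topic priors, as in hybrid NN-HMM speech recognition), Viterbi decoding finds the best topic sequence, and a story boundary is hypothesized wherever the decoded topic changes. This is a minimal illustrative sketch, not the authors' implementation; the function name, the uniform transition model, and the `stay_prob` parameter are assumptions made for clarity.

```python
import numpy as np

def viterbi_story_segmentation(topic_posteriors, topic_priors, stay_prob=0.9):
    """Hypothesize story boundaries by Viterbi decoding over topic states.

    topic_posteriors: (T, K) array of per-sentence topic posteriors,
        e.g. softmax outputs of a DNN or LSTM-RNN over K topics.
    topic_priors: (K,) topic prior probabilities.
    stay_prob: assumed probability of staying in the same topic.

    Returns (topic_path, boundary_indices).
    """
    T, K = topic_posteriors.shape
    # Hybrid trick: scaled likelihood p(x|z) is proportional to p(z|x) / p(z)
    log_lik = np.log(topic_posteriors + 1e-12) - np.log(topic_priors + 1e-12)

    # Simple transition model: stay with stay_prob, switch uniformly otherwise
    switch_prob = (1.0 - stay_prob) / (K - 1)
    log_trans = np.full((K, K), np.log(switch_prob))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    # Viterbi forward pass with back-pointers
    delta = np.zeros((T, K))
    psi = np.zeros((T, K), dtype=int)
    delta[0] = np.log(1.0 / K) + log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (prev topic, cur topic)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(K)] + log_lik[t]

    # Backtrace the best topic path
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]

    # A boundary is placed wherever the decoded topic changes
    boundaries = [t for t in range(1, T) if path[t] != path[t - 1]]
    return path, boundaries
```

For example, four sentences whose posteriors clearly favor topic 0 and then topic 1 yield a single boundary between sentences 2 and 3; the `stay_prob` penalty suppresses spurious boundaries when a single sentence's posterior is noisy.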
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61571363), the Aeronautical Science Foundation of China (20155553038 and 20155553040), and the Science and Technology on Avionics Integration Laboratory.
Cite this article
Yu, J., Xie, L., Xiao, X. et al. A hybrid neural network hidden Markov model approach for automatic story segmentation. J Ambient Intell Human Comput 8, 925–936 (2017). https://doi.org/10.1007/s12652-017-0501-9