Abstract
The hidden Markov model (HMM) is a popular technique for story segmentation, in which the hidden Markov states represent topics. The number of hidden states has to be set manually, but this number is often unknown. This paper proposes a nonparametric approach, called SHDP-HMM, to address this problem. By placing a hierarchical Dirichlet process (HDP) prior on transition matrices over countably infinite state spaces, SHDP-HMM infers the number of hidden states from the data automatically. In addition, to better model topic duration, we introduce a self-transition bias parameter that reduces the transition probabilities among redundant hidden states. Given a stream of text, a Gibbs sampler labels the hidden states with topic classes, and each position where the topic shifts indicates a story boundary. Experiments show that the proposed SHDP-HMM approach outperforms traditional HMM-based approaches and that the number of hidden states can be inferred automatically from the data.
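As a rough illustration of the two ideas in the abstract, the sketch below draws a transition matrix from a truncated (weak-limit) approximation of a sticky HDP prior and reads story boundaries off a sequence of state labels. It is a minimal sketch under stated assumptions, not the sampler used in the paper: the function names, the truncation level `num_states`, and the hyperparameter names `gamma`, `alpha`, and `kappa` are all illustrative, and the infinite state space is replaced by a finite truncation.

```python
import random


def sample_dirichlet(concentrations, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(draws)
    return [d / total for d in draws]


def sample_sticky_transitions(num_states, gamma, alpha, kappa, rng):
    """Weak-limit sketch of a sticky HDP prior over transition rows.

    beta  ~ Dir(gamma / L, ..., gamma / L)   shared global state weights
    pi_j  ~ Dir(alpha * beta + kappa * e_j)  row j, with extra mass kappa
                                             on the self-transition
    All parameter names here are assumptions made for illustration.
    """
    beta = sample_dirichlet([gamma / num_states] * num_states, rng)
    rows = []
    for j in range(num_states):
        conc = []
        for k, weight in enumerate(beta):
            # Tiny floor keeps every Gamma shape parameter strictly positive.
            base = max(alpha * weight, 1e-12)
            conc.append(base + (kappa if k == j else 0.0))
        rows.append(sample_dirichlet(conc, rng))
    return beta, rows


def boundaries_from_states(states):
    """Return every index t whose state differs from the state at t - 1."""
    return [t for t in range(1, len(states)) if states[t] != states[t - 1]]


if __name__ == "__main__":
    rng = random.Random(0)
    _, rows = sample_sticky_transitions(5, gamma=5.0, alpha=1.0, kappa=50.0, rng=rng)
    for j, row in enumerate(rows):
        print(j, round(row[j], 3))  # self-transition probability of state j
    print(boundaries_from_states([0, 0, 1, 1, 1, 2]))
```

With a large `kappa` relative to `alpha`, each row concentrates on its diagonal entry, which is the self-transition bias that discourages rapid switching among redundant states. Setting `kappa` to zero recovers the non-sticky weak-limit prior.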
Funding
This work was supported by the Key Projects of Colleges and Universities in Henan Province (Grant No. 20A520028), the China Scholarship Council (Grant No. 202008410014), and the Science and Technology Breakthrough Project of Henan Science and Technology Department (Grant No. 192102210249).
Author information
Contributions
Conception of the study: Jia Yu; Analysis and preprocessing of data: Hongxiang Shao; Algorithm design and experiment performing: Jia Yu; Drafting the manuscript: Jia Yu; Revising the manuscript: Hongxiang Shao.
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Conflict of Interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yu, J., Shao, H. Broadcast news story segmentation using sticky hierarchical dirichlet process. Appl Intell 52, 12788–12800 (2022). https://doi.org/10.1007/s10489-021-03098-4
DOI: https://doi.org/10.1007/s10489-021-03098-4