Broadcast news story segmentation using sticky hierarchical dirichlet process

Applied Intelligence

Abstract

The hidden Markov model (HMM) is a popular technique for story segmentation, in which the hidden states represent topics. The number of hidden states has to be set manually, but this number is often unknown. This paper proposes a nonparametric approach, called SHDP-HMM, to address this problem. By placing an HDP prior on transition matrices over a countably infinite state space, SHDP-HMM infers the number of hidden states from the data automatically. In addition, to better model topic durations, we introduce a self-transition bias parameter that reduces the transition probabilities among redundant hidden states. Given a stream of text, a Gibbs sampler labels the hidden states with topic classes, and each position where the topic shifts indicates a story boundary. Experiments show that the proposed SHDP-HMM approach outperforms traditional HMM-based approaches and that the number of hidden states can be inferred automatically from the data.
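
The role of the self-transition bias can be illustrated with a small sketch. It is not the authors' implementation: it assumes a finite ("weak-limit") truncation of the HDP prior, it samples only from the prior rather than running the Gibbs sampler, and the function names and the hyperparameter names `gamma`, `alpha` and `kappa` are illustrative choices that do not come from the abstract. Each transition row is drawn from a Dirichlet distribution whose self-transition entry receives extra mass `kappa`, and story boundaries are read off as the positions where the sampled state changes.

```python
import numpy as np


def sample_sticky_transitions(num_states, gamma, alpha, kappa, rng):
    """Draw a transition matrix from a truncated sticky HDP-HMM prior (sketch)."""
    # Global state weights shared by every row (finite approximation of the top-level DP).
    beta = rng.dirichlet(np.full(num_states, gamma / num_states))
    # Row j has concentration alpha * beta + kappa * e_j, so kappa favours staying in state j.
    concentration = alpha * beta + kappa * np.eye(num_states)
    # Dirichlet parameters must be strictly positive; clamp in case of underflow.
    concentration = np.maximum(concentration, 1e-12)
    return np.vstack([rng.dirichlet(row) for row in concentration])


def sample_states(transitions, length, rng):
    """Sample a hidden state (topic) sequence from the Markov chain."""
    num_states = transitions.shape[0]
    states = np.empty(length, dtype=int)
    states[0] = rng.integers(num_states)
    for t in range(1, length):
        states[t] = rng.choice(num_states, p=transitions[states[t - 1]])
    return states


def story_boundaries(states):
    """Return every index t at which the state differs from the state at t - 1."""
    states = np.asarray(states)
    return (np.flatnonzero(states[1:] != states[:-1]) + 1).tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sticky = sample_sticky_transitions(10, gamma=5.0, alpha=1.0, kappa=50.0, rng=rng)
    topics = sample_states(sticky, length=200, rng=rng)
    print("boundaries:", story_boundaries(topics))
```

With `kappa = 0` the rows reduce to an ordinary truncated HDP-HMM prior, and the sampled chain tends to switch states far more often, which is the rapid topic switching that the self-transition bias is meant to suppress.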

Notes

  1. http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2001S93

  2. http://julius.sourceforge.jp/.

Funding

This work was supported by the Key Projects of Colleges and Universities in Henan Province (Grant No. 20A520028), the China Scholarship Council (Grant No. 202008410014) and the Science and Technology Breakthrough Project of Henan Science and Technology Department (Grant No. 192102210249).

Author information

Contributions

Conception of the study: Jia Yu; analysis and preprocessing of data: Hongxiang Shao; algorithm design and experiments: Jia Yu; drafting the manuscript: Jia Yu; revising the manuscript: Hongxiang Shao.

Corresponding author

Correspondence to Hongxiang Shao.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflict of Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, J., Shao, H. Broadcast news story segmentation using sticky hierarchical dirichlet process. Appl Intell 52, 12788–12800 (2022). https://doi.org/10.1007/s10489-021-03098-4

  • DOI: https://doi.org/10.1007/s10489-021-03098-4