Abstract
In this chapter, we present state-of-the-art machine learning approaches for speech and language processing, with an emphasis on topic models for structural learning and temporal modeling from unlabeled sequential patterns. In general, speech and language processing involves extensive knowledge of statistical models. A flexible, scalable, and robust system is required to cope with heterogeneous and nonstationary environments in the era of big data. This chapter begins with an introduction to unsupervised speech and language processing based on factor analysis and independent component analysis. Unsupervised learning is then generalized to a latent variable model known as the topic model. The evolution of topic models is surveyed in an organized fashion: from latent semantic analysis to the hierarchical Dirichlet process, from non-Bayesian parametric models to Bayesian nonparametric models, and from single-layer models to hierarchical tree models. Inference approaches based on variational Bayes and Gibbs sampling are introduced. We present several case studies on topic modeling for speech and language applications, including language modeling, document modeling, segmentation, and summarization.
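As a concrete illustration of the Gibbs-sampling inference mentioned in the abstract, the following is a minimal sketch of a collapsed Gibbs sampler for latent Dirichlet allocation. It is not the chapter's implementation; function and variable names are illustrative, and documents are assumed to be lists of integer word ids.

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for latent Dirichlet allocation (sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns z, the sampled topic assignment for every word position.
    """
    rng = random.Random(seed)
    # Count tables: document-topic counts, topic-word counts, topic totals.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * vocab_size for _ in range(n_topics)]
    n_k = [0] * n_topics
    # Random initialization of topic assignments, recorded in the counts.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1; n_kw[k][w] -= 1; n_k[k] -= 1
                # Full conditional p(z = t | rest), up to a constant:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                           / (n_k[t] + vocab_size * beta)
                           for t in range(n_topics)]
                # Sample a new topic and restore the counts.
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    return z
```

On a toy corpus whose documents use disjoint vocabularies, the sampler tends to assign each document's words to a single dominant topic, which is the clustering behavior that the chapter's topic-model applications build upon.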
Keywords
- Markov Chain Monte Carlo
- Independent Component Analysis
- Topic Model
- Latent Dirichlet Allocation
- Latent Semantic Analysis
Copyright information
© 2015 The Author(s)
About this chapter
Cite this chapter
Chien, J.T. (2015). Topic Modeling for Speech and Language Processing. In: Peters, G., Matsui, T. (eds) Modern Methodology and Applications in Spatial-Temporal Modeling. SpringerBriefs in Statistics. Springer, Tokyo. https://doi.org/10.1007/978-4-431-55339-7_4
Print ISBN: 978-4-431-55338-0
Online ISBN: 978-4-431-55339-7
eBook Packages: Mathematics and Statistics (R0)