Text-Independent Phone Segmentation Method Using Gaussian Function

Hoang, Dac-Thang; Wang, Hsiao-Chuan

doi:10.1007/978-3-319-02741-8_11

Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 244)

Abstract

In this paper, an effective method is proposed for automatic phone segmentation of speech signals without prior information about the transcript of the utterance. Spectral change is used as the criterion for hypothesizing phone boundaries. Because a Gaussian function can measure the similarity of two vectors, a dissimilarity function is derived from it to measure the variation of the speech spectrum between the mean feature vectors before and after a candidate location. Peaks in the dissimilarity curve indicate the locations of phone boundaries. Experiments on the TIMIT corpus show that the proposed method is more accurate than previous methods.
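
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the full paper is not available in this preview, so the `1 - similarity` form of the dissimilarity, the window length `half_window`, the width `sigma`, the peak `threshold`, and all function names below are assumptions chosen for illustration. Each frame is assumed to be a short list of spectral features.

```python
import math

def mean_vector(frames):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def gaussian_similarity(u, v, sigma=1.0):
    """exp(-||u - v||^2 / (2 sigma^2)): 1.0 for identical vectors, near 0 for distant ones."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def dissimilarity_curve(features, half_window=3, sigma=1.0):
    """Assumed form: d[t] = 1 - similarity(mean of frames before t, mean of frames from t on)."""
    curve = [0.0] * len(features)
    for t in range(half_window, len(features) - half_window + 1):
        before = mean_vector(features[t - half_window:t])
        after = mean_vector(features[t:t + half_window])
        curve[t] = 1.0 - gaussian_similarity(before, after, sigma)
    return curve

def pick_peaks(curve, threshold=0.5):
    """Indices of local maxima of the curve that exceed the (assumed) threshold."""
    return [
        t for t in range(1, len(curve) - 1)
        if curve[t] > threshold and curve[t] > curve[t - 1] and curve[t] >= curve[t + 1]
    ]

# Toy input: ten frames of one "phone" followed by ten frames of another.
frames = [[0.0, 0.0]] * 10 + [[3.0, 3.0]] * 10
boundaries = pick_peaks(dissimilarity_curve(frames))
print(boundaries)  # [10]
```

On this toy input the curve peaks at frame 10, where the spectrum changes. With real speech features the window length, `sigma`, and threshold would need tuning, and this sketch makes no claim about the values used in the paper.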

Author information

Authors and Affiliations

Department of Electrical Engineering, National Tsing Hua University, No. 101, Kuang-Fu Road, Hsinchu, Taiwan, 30013
Dac-Thang Hoang & Hsiao-Chuan Wang
Department of Network System, Institute of Information Technology, No. 18 Hoang Quoc Viet Road, Hanoi, Vietnam
Dac-Thang Hoang

Editor information

Editors and Affiliations

School of Knowledge Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Van Nam Huynh
UMR CNRS 7253 Heudiasyc, Universite de Technologie de Compiegne, Compiegne Cedex, France
Thierry Denoeux
Faculty of Information Technology, Hanoi National University of Education, Hanoi, Vietnam
Dang Hung Tran
Faculty of Information Technology, University of Engineering and Technology, Hanoi, Vietnam
Anh Cuong Le
Faculty of Information Technology, University of Engineering and Technology, Hanoi, Vietnam
Son Bao Pham

About this paper

Cite this paper

Hoang, DT., Wang, HC. (2014). Text-Independent Phone Segmentation Method Using Gaussian Function. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 244. Springer, Cham. https://doi.org/10.1007/978-3-319-02741-8_11

DOI: https://doi.org/10.1007/978-3-319-02741-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02740-1
Online ISBN: 978-3-319-02741-8
eBook Packages: Engineering, Engineering (R0)
