
Text-Independent Phone Segmentation Method Using Gaussian Function

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 244)

Abstract

This paper proposes an effective method for automatic phone segmentation of speech signals that requires no prior information about the transcript of the utterance. Spectral change is used as the criterion for hypothesizing phone boundaries. Because a Gaussian function can measure the similarity of two vectors, a dissimilarity function is derived from it to measure the variation of the speech spectrum between the mean feature vectors before and after the considered location. Peaks in the dissimilarity curve indicate the locations of phone boundaries. Experiments on the TIMIT corpus show that the proposed method is more accurate than previous methods.
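
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the similarity is taken as exp(-||a - b||^2 / (2 sigma^2)), the dissimilarity as one minus that similarity, and the names `half_window`, `sigma`, and `threshold` are hypothetical parameters rather than the authors' settings. The peak picker is a plain local-maximum test, not necessarily the one used in the paper.

```python
import numpy as np

def gaussian_similarity(a, b, sigma=1.0):
    """Assumed similarity: exp(-||a - b||^2 / (2 * sigma^2)), in (0, 1]."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def dissimilarity_curve(features, half_window=3, sigma=1.0):
    """Compare the mean feature vector before and after each frame index.

    features: array of shape (n_frames, n_dims), e.g. spectral features.
    Returns an array of length n_frames; positions too close to the edges
    to fit a full window on both sides are left at 0.
    """
    features = np.asarray(features, dtype=float)
    n = len(features)
    curve = np.zeros(n)
    for t in range(half_window, n - half_window + 1):
        before = features[t - half_window:t].mean(axis=0)
        after = features[t:t + half_window].mean(axis=0)
        # Assumed dissimilarity: 1 - similarity (illustrative choice).
        curve[t] = 1.0 - gaussian_similarity(before, after, sigma)
    return curve

def pick_peaks(curve, threshold=0.1):
    """Return indices that are local maxima above a hypothetical threshold."""
    return [
        t for t in range(1, len(curve) - 1)
        if curve[t] > threshold
        and curve[t] > curve[t - 1]
        and curve[t] >= curve[t + 1]
    ]

# Toy input: 10 frames near one spectrum, then 10 frames near another.
toy_features = np.vstack([np.zeros((10, 2)), np.full((10, 2), 3.0)])
boundaries = pick_peaks(dissimilarity_curve(toy_features))
```

On this toy input the curve rises as the two windows straddle the change and peaks where the first window holds only the old frames and the second only the new ones. Real speech features would need a tuned `sigma` and window length, which the abstract does not specify.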




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Hoang, DT., Wang, HC. (2014). Text-Independent Phone Segmentation Method Using Gaussian Function. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 244. Springer, Cham. https://doi.org/10.1007/978-3-319-02741-8_11


  • DOI: https://doi.org/10.1007/978-3-319-02741-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02740-1

  • Online ISBN: 978-3-319-02741-8

  • eBook Packages: Engineering, Engineering (R0)
