A new approach to speech segmentation based on the maximum likelihood

Šarić, Z. M.; Turajlić, S. R.

doi:10.1007/BF01213958

Published: September 1995

Volume 14, pages 615–632, (1995)

Circuits, Systems and Signal Processing

Z. M. Šarić¹ &
S. R. Turajlić²

Abstract

Successful speech recognition depends heavily on appropriate speech segmentation. Sequential detection of abrupt changes performs poorly on signals with relatively short stationary intervals, as is the case with speech signals, and an off-line maximum likelihood segmentation algorithm can improve on it. This paper presents a new segmentation algorithm. For an a priori known number of segments, the algorithm determines the signal partition for which the sum of segment distortions is minimal. A generalized maximum likelihood distortion measure is introduced, and it has proven particularly efficient on short signal segments. When the number of segments is unknown, it is estimated by comparing the reduction of the distortion. An analysis of the asymptotic properties of the distortion sequence led to the definition of the presented segmentation algorithm. The introduced measure can be applied to both AR and ARMA models. The segmentation algorithm is verified on test signals as well as on a natural speech signal, for which a pitch-synchronous framing scheme is applied. The experimental results also include a comparison of the AR and ARMA model-based segmentations. The first results show that ARMA model-based segmentation gives somewhat better results than the AR model algorithm.
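
The idea of choosing, for a known number of segments, the partition with minimal total distortion can be illustrated with a small dynamic-programming sketch. This is not the authors' algorithm: the abstract does not give the generalized maximum likelihood distortion measure, so the code substitutes the familiar Gaussian AR term (number of predicted samples times the log of the residual variance). The function names, the boundary grid `step`, and the threshold rule in `choose_num_segments` are all assumptions made for illustration.

```python
import numpy as np

def ar_distortion(x, order=2):
    """Stand-in distortion: (n - order) * log(residual variance) of a
    least-squares AR fit. NOT the paper's generalized ML measure."""
    n = len(x)
    if n <= 2 * order:
        return np.inf  # too short for a meaningful fit
    # Column k holds x[t-1-k] for t = order .. n-1.
    X = np.column_stack([x[order - k - 1 : n - k - 1] for k in range(order)])
    y = x[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    var = max(np.mean((y - X @ coef) ** 2), 1e-12)
    return (n - order) * np.log(var)

def segment(x, max_segments, order=2, step=20):
    """Minimise the summed distortion over partitions into K segments,
    for every K = 1..max_segments, with boundaries restricted to a grid."""
    n = len(x)
    grid = list(range(0, n, step)) + [n]   # candidate boundary positions
    m = len(grid)
    cost = np.full((m, m), np.inf)
    for i in range(m):
        for j in range(i + 1, m):
            cost[i, j] = ar_distortion(x[grid[i]:grid[j]], order)
    # D[k, j]: best total distortion of x[:grid[j]] split into k segments.
    D = np.full((max_segments + 1, m), np.inf)
    D[0, 0] = 0.0
    back = np.zeros((max_segments + 1, m), dtype=int)
    for k in range(1, max_segments + 1):
        for j in range(k, m):
            cand = D[k - 1, :j] + cost[:j, j]
            back[k, j] = int(np.argmin(cand))
            D[k, j] = cand[back[k, j]]
    # Backtrack the interior boundaries for each feasible K.
    bounds = {}
    for K in range(1, max_segments + 1):
        if not np.isfinite(D[K, m - 1]):
            continue
        j, cuts = m - 1, []
        for k in range(K, 0, -1):
            j = back[k, j]
            cuts.append(grid[j])
        bounds[K] = sorted(cuts)[1:]       # drop the leading 0
    return D[:, m - 1], bounds

def choose_num_segments(totals, threshold):
    """Hypothetical rule: keep adding segments while the distortion
    reduction exceeds `threshold` (totals[k] is the total for k segments)."""
    k = 1
    while k + 1 < len(totals) and totals[k] - totals[k + 1] > threshold:
        k += 1
    return k
```

Calling `segment(x, max_segments=3)` on a one-dimensional NumPy array returns the minimal total distortion for each segment count together with the corresponding interior boundaries. Those totals can then be passed to `choose_num_segments` with a threshold suited to the data. Restricting boundaries to a grid keeps the number of AR fits manageable, at the cost of boundary resolution.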

Author information

Authors and Affiliations

Institute for Applied Mathematics and Electronics, Kneza Miloša 73, 11000, Beograd, Yugoslavia
Z. M. Šarić
Department of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, 11000, Beograd, Yugoslavia
S. R. Turajlić

Additional information

Research supported in part by the Mathematical Institute of the Serbian Science Academy and Serbian Science Foundation.

About this article

Cite this article

Šarić, Z.M., Turajlić, S.R. A new approach to speech segmentation based on the maximum likelihood. Circuits Systems and Signal Process 14, 615–632 (1995). https://doi.org/10.1007/BF01213958

Received: 31 March 1993
Accepted: 22 November 1993
Issue Date: September 1995
DOI: https://doi.org/10.1007/BF01213958
