Post-processing of Automatic Segmentation of Speech Using Dynamic Programming

  • Marcin Szymański
  • Stefan Grocholewski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)


Building unit-selection speech synthesisers requires precise annotation of large speech corpora. Manual segmentation of speech is very laborious, hence the need for automatic segmentation algorithms. Because the common HMM-based method has been observed to be prone to systematic errors, boundary refinement approaches, such as boundary-specific correction, have been introduced.

Last year, a dynamic programming fine-tuning approach was proposed that combined two sources of information: the boundary error distribution and boundary MFCC statistical models. In this paper we verify the usefulness of incorporating additional data: boundary energy dynamics models and signal periodicity information.
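The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a dynamic programming refinement of this kind might look, not the authors' implementation. It assumes boundaries are expressed as frame indices, that each boundary may move by at most `max_shift` frames from its initial (for example HMM-derived) position, and that two hypothetical scoring callbacks are available: `shift_logp(i, s)`, a log-score for shifting boundary `i` by `s` frames (standing in for the boundary error distribution), and `model_logp(i, pos)`, a log-score for placing boundary `i` at frame `pos` (standing in for a boundary model over MFCC, energy, or periodicity features). All names here are illustrative assumptions.

```python
import math


def refine_boundaries(initial, shift_logp, model_logp, max_shift):
    """Choose one shift per boundary so that the summed log-scores are
    maximal while the refined boundaries stay strictly increasing.

    initial    -- list of initial boundary positions (frame indices)
    shift_logp -- hypothetical callback: log-score of shifting boundary i by s
    model_logp -- hypothetical callback: log-score of boundary i at frame pos
    max_shift  -- largest allowed shift, in frames, in either direction
    """
    n = len(initial)
    if n == 0:
        return []
    shifts = list(range(-max_shift, max_shift + 1))
    m = len(shifts)
    # best[i][k]: best total score for boundaries 0..i with boundary i at shifts[k]
    best = [[-math.inf] * m for _ in range(n)]
    # back[i][k]: index of the shift chosen for boundary i-1 on that best path
    back = [[-1] * m for _ in range(n)]

    for i in range(n):
        for k, s in enumerate(shifts):
            pos = initial[i] + s
            local = shift_logp(i, s) + model_logp(i, pos)
            if i == 0:
                best[i][k] = local
                continue
            for j, prev_shift in enumerate(shifts):
                prev_pos = initial[i - 1] + prev_shift
                # Enforce ordering: the previous boundary must come earlier.
                if prev_pos < pos and best[i - 1][j] > -math.inf:
                    cand = best[i - 1][j] + local
                    if cand > best[i][k]:
                        best[i][k] = cand
                        back[i][k] = j

    # Backtrack from the best final state.
    k = max(range(m), key=lambda idx: best[n - 1][idx])
    refined = [0] * n
    for i in range(n - 1, -1, -1):
        refined[i] = initial[i] + shifts[k]
        k = back[i][k]
    return refined
```

The ordering constraint is what makes this a joint optimisation rather than an independent per-boundary search: a shift that looks best for one boundary can be rejected if it would cross its neighbour. Adding further information sources, as the paper investigates, would in this sketch amount to adding more terms to the local score.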


Keywords (machine-generated, not supplied by the authors): Speech Recognition; Automatic Segmentation; Manual Segmentation; Speech Synthesis; Segmentation Task





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Marcin Szymański (1)
  • Stefan Grocholewski (1)
  1. Institute of Computing Science, Poznan University of Technology, Poznań, Poland
