Circuits, Systems, and Signal Processing, Volume 37, Issue 5, pp 2074–2097

Significance of Differenced EGG Signal as a Spectrum in Phase Difference Computation for the Estimation of Glottal Closure Instants

  • G. Anushiya Rachel
  • N. Sripriya
  • P. Vijayalakshmi
  • T. Nagarajan

Abstract

Estimation of glottal closure instants (GCIs) from an electroglottograph (EGG) signal can aid clinical applications involving the diagnosis and treatment of speech pathologies, and can also serve as ground truth for assessing algorithms that estimate GCIs from speech signals. To this end, the current work proposes a phase-difference-based approach to estimating GCIs from EGG signals, in which the symmetrized, differenced EGG (DEGG) signal is considered to be the Fourier transform of an arbitrary even signal. The DEGG signal possesses sharp negative valleys at the GCIs, and since the symmetrized DEGG is assumed to be a spectrum, these valleys correspond to zeros that lie outside the unit circle. The angular locations of these zeros, and in turn the locations of the GCIs, can be derived from the phase-difference spectrum, which takes a value of around \(2\pi\) at these locations; the derivation is elaborated in the paper. The proposed algorithm is compared with the existing time of excitation generator, the high quality time of excitation algorithm, and the singularity in EGG by multiscale analysis algorithm, in terms of identification rate, miss rate, false alarm rate, and identification accuracy, on normal and pathological EGG. The proposed algorithm is observed to outperform the others, with an identification rate of 98.28% on normal EGG and 96.90% on pathological EGG.
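The abstract does not define the four evaluation measures it names. The sketch below shows one way they could be computed, assuming the definitions commonly used in the GCI literature: a larynx cycle is the interval between the midpoints of consecutive reference GCIs, a cycle with exactly one estimate is an identification, a cycle with none is a miss, a cycle with more than one is a false alarm, and identification accuracy is the standard deviation of the timing error over identified cycles. The function name `gci_metrics`, the use of sample indices, and the treatment of the first and last cycles are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def gci_metrics(ref, est):
    """Score estimated GCIs against reference GCIs (hypothetical helper).

    ref, est: GCI locations in the same unit (e.g. sample indices).
    Assumes at least two reference GCIs.
    """
    ref = np.sort(np.asarray(ref, dtype=float))
    est = np.sort(np.asarray(est, dtype=float))

    # Cycle boundaries: midpoints between consecutive reference GCIs.
    mid = (ref[:-1] + ref[1:]) / 2.0
    # Assumption: the first and last cycles are made symmetric about
    # their reference GCI, since they have no outer neighbour.
    lo = np.concatenate(([2.0 * ref[0] - mid[0]], mid))
    hi = np.concatenate((mid, [2.0 * ref[-1] - mid[-1]]))

    hits = misses = false_alarms = 0
    errors = []
    for r, a, b in zip(ref, lo, hi):
        inside = est[(est >= a) & (est < b)]
        if inside.size == 1:
            hits += 1
            errors.append(inside[0] - r)  # timing error of the single estimate
        elif inside.size == 0:
            misses += 1
        else:
            false_alarms += 1

    n = ref.size
    return {
        "identification_rate": 100.0 * hits / n,
        "miss_rate": 100.0 * misses / n,
        "false_alarm_rate": 100.0 * false_alarms / n,
        # Standard deviation of timing error over identified cycles only.
        "identification_accuracy": float(np.std(errors)) if errors else float("nan"),
    }


if __name__ == "__main__":
    reference = [100, 200, 300, 400]
    estimated = [102, 298, 305, 402]  # one miss (cycle 2), one false alarm (cycle 3)
    print(gci_metrics(reference, estimated))
```

Under these assumed definitions the three rates sum to 100%, so an identification rate such as the reported 98.28% leaves 1.72% to be shared between misses and false alarms.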

Keywords

  • Glottal closure instants
  • EGG signal
  • Fourier transform phase
  • Phase difference
  • Group delay

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • G. Anushiya Rachel (1)
  • N. Sripriya (1)
  • P. Vijayalakshmi (1)
  • T. Nagarajan (1)
  1. Speech Lab, SSN College of Engineering, Chennai, India
