Abstract
Estimation of glottal closure instants (GCIs) from an electroglottograph (EGG) signal can aid clinical applications involving the diagnosis and treatment of speech pathologies, and can also serve as ground truth for assessing algorithms that estimate GCIs from speech signals. To this end, the current work proposes a phase-difference-based approach to estimating GCIs from EGG signals, in which the symmetrized, differenced EGG (DEGG) signal is treated as the Fourier transform of an arbitrary even signal. The DEGG signal possesses sharp negative valleys at the GCIs and, since the symmetrized DEGG is assumed to be a spectrum, these valleys correspond to zeros that lie outside the unit circle. The angular locations of these zeros, and in turn the locations of the GCIs, can be derived from the phase-difference spectrum, which takes a value of around \(2\pi \) at these locations; the derivation is elaborated in the paper. The proposed algorithm is compared with the existing time of excitation generator, the high quality time of excitation algorithm, and the singularity in EGG by multiscale analysis algorithm, in terms of the identification, miss, and false alarm rates, and the identification accuracy, on normal and pathological EGG. The proposed algorithm is observed to outperform the rest, with an identification rate of 98.28% in normal EGG and 96.90% in pathological EGG.
References
G. Anushiya Rachel, V.S. Solomi, K. Naveenkumar, P. Vijayalakshmi, T. Nagarajan, A small-footprint context-independent HMM-based synthesizer for Tamil. Int. J. Speech Technol. 18(3), 405–418 (2015)
G. Anushiya Rachel, S. Sreenidhi, P. Vijayalakshmi, T. Nagarajan, Incorporation of happiness into neutral speech by modifying emotive-keywords, in IEEE Region 10 Conference (TENCON) (2014), pp. 1–6
W. Barry, M. Putzer, Saarbrucken voice database, Institute of Phonetics, University of Saarland (2016). http://www.stimmdatenbank.coli.unisaarland.de. Accessed 19 March 2016
A. Bouzid, N. Ellouze, Multiscale product of electroglottogram signal for glottal closure and opening instant detection, in IMACS Multiconference on Computational Engineering in Systems Applications (2006), pp. 106–109
D.G. Childers, D.M. Hooks, G.P. Moore, L. Eskenazi, A.L. Lalwani, Electroglottography and vocal fold physiology. J. Speech Hear. Res. 33(2), 245–254 (1990)
T. Drugman, T. Dutoit, Glottal closure and opening instant detection from speech signals, in INTERSPEECH (2009), pp. 2891–2894
N.D. Gaubitch, P.A. Naylor, Spatiotemporal averaging method for enhancement of reverberant speech, in 15th International Conference on Digital Signal Processing (2007), pp. 607–610
D.M. Howard, Variation of electrolaryngographically derived closed quotient for trained and untrained adult female singers. J. Voice 9(2), 121–1223 (1995)
M. Huckvale, Speech Filing system: tools for speech, Tech. Rep. (University College of London, London, 2004)
J. Kominek, A. Black, The CMU arctic speech databases, in 5th ISCA Speech Synthesis Workshop (2004), pp. 223–224
A.I. Koutrouvelis, G.P. Kafentzis, N.D. Gaubitch, R. Heusdens, A fast method for high-resolution voiced/unvoiced detection and glottal closure/opening instant estimation of speech. IEEE Trans. Audio Speech Lang. Process. 24(2), 316–328 (2016)
M.A. Little, D.A.E. Costello, M.L. Harries, Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures. J. Voice 25(1), 21–31 (2011)
H.A. Murthy, B. Yegnanarayana, Group delay functions and its applications in speech technology. Sadhana 36(5), 745–782 (2011)
K.S.R. Murty, B. Yegnanarayana, Epoch extraction from speech signals. IEEE Trans. Audio Speech Lang. Process. 16(8), 1602–1613 (2008)
T. Nagarajan, H.A. Murthy, R.M. Hegde, Segmentation of speech into syllable-like units, in Eurospeech (2003), pp. 2893–2896
P.A. Naylor, A. Kounoudes, J. Gudnason, M. Brookes, Estimation of glottal closure instants in voiced speech using the DYPSA algorithm. IEEE Trans. Audio Speech Lang. Process. 15(1), 34–43 (2007)
A.V. Oppenheim, R.W. Schafer, Discrete-Time Signal Processing (Prentice-Hall, Englewood Cliffs, 2000)
A.P. Prathosh, T.V. Ananthapadmanabha, A.G. Ramakrishnan, Epoch extraction based on integrated linear prediction residual using plosion index. IEEE Trans. Audio Speech Lang. Process. 21(12), 2471–2480 (2013)
J.G. Proakis, D.G. Manolakis, Digital Signal Processing (Pearson, London, 1992)
K. Ramesh, S.R.M. Prasanna, D. Govind, Detection of glottal opening instants using Hilbert envelope, in INTERSPEECH (2013), pp. 44–48
K.S. Rao, Unconstrained pitch contour modification using instants of significant excitation. Circuits Syst. Sig. Process. 31(6), 2133–2152 (2012)
N. Sripriya, T. Nagarajan, Estimation of glottal closure instants by considering speech signal as a spectrum. IET Electron. Lett. 51(8), 649–651 (2015)
M.R.P. Thomas, N.D. Gaubitch, J. Gudnason, P.A. Naylor, A practical multichannel dereverberation algorithm using multichannel DYPSA and spatiotemporal averaging, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (2007), pp. 50–53
M.R.P. Thomas, J. Gudnason, P.A. Naylor, B. Geiser, P. Vary, Voice source estimation for artificial bandwidth extension of telephone speech, in IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (2010), pp. 4794–4797
M.R.P. Thomas, J. Gudnason, P.A. Naylor, Estimation of glottal closing and opening instants in voiced speech using the YAGA algorithm. IEEE Trans. Audio Speech Lang. Process. 20(1), 82–91 (2012)
M.R.P. Thomas, P.A. Naylor, The SIGMA algorithm: a glottal activity detector for electroglottographic signals. IEEE Trans. Audio Speech Lang. Process. 17(8), 1557–1566 (2009)
M.R.P. Thomas, P.A. Naylor, The SIGMA algorithm for estimation of reference-quality glottal closure instants from electroglottograph signals, in 16th European Signal Processing Conference (2008), pp. 1–5
B. Yegnanarayana, H.A. Murthy, Significance of group delay functions in spectrum estimation. IEEE Trans. Sig. Process. 40(9), 2281–2289 (1992)
B. Yegnanarayana, D. Saikia, T. Krishnan, Significance of group delay functions in signal reconstruction from spectral magnitude or phase. IEEE Trans. Acoust. Speech Sig. Process. 32(3), 610–623 (1984)
Appendix
1.1 Phase-Difference Spectrum Due to a Pole-Zero Pair: A Geometric Interpretation
The system function considered in (1) consists of a pole at the origin and a zero at a. If \(a = re^{j\omega _0}\), the zero lies at the angular location \(\omega _0\) with radius r. The contribution of this pole and zero to the Fourier transform (FT) phase can be explained using a geometric interpretation, as described below. The following notation is used:
- \(\theta _{p_0}, \theta _{p_1}\), and \(\theta _{p_2} \) refer to the contribution of the pole (p) to the FT phase measured at \(\omega _{0}, \omega _{1}\), and \(\omega _{2}\), respectively.
- \(\theta _{z_0}, \theta _{z_1}\), and \(\theta _{z_2} \) refer to the contribution of the zero (z) to the FT phase measured at \(\omega _{0}, \omega _{1}\), and \(\omega _{2}\), respectively.
- \(\theta (e^{j\omega _0}), \theta (e^{j\omega _1})\), and \(\theta \left( e^{j\omega _2}\right) \) refer to the FT phase due to the pole (p) and the zero (z) together, at \(\omega _{0}, \omega _{1}\), and \(\omega _{2}\), respectively.
- \(\theta _{p_1} = \theta _{p_0} - \delta \theta _{p_0}\) and \(\theta _{p_2} = \theta _{p_0} + \delta \theta _{p_0}\), where \(\delta \theta _{p_0}\) is an infinitesimal change to \(\theta _{p_0}\).
- \(\theta _{z_1} = \theta _{z_0} - \delta \theta _{z_0}\) and \(\theta _{z_2} = \theta _{z_0} + \delta \theta _{z_0}\), where \(\delta \theta _{z_0}\) is an infinitesimal change to \(\theta _{z_0}\).
- \(\omega _{1} = \omega _{0} - \delta \omega _0\) and \(\omega _{2} = \omega _{0} + \delta \omega _0\), where \(\delta \omega _{0}\) is an infinitesimal change to \(\omega _0\).
Let us consider four different cases for a zero that lies at an angular location \(\omega _0\) as follows:
- (a) FT phase at \(\omega _0\) due to a pole at \(r=0\) and a zero at \(0.0<r<1.0\) (inside the unit circle): From Fig. 14, the FT phase due to the pole (\(\theta _{p_0}\)) and the FT phase due to the zero (\(\theta _{z_0}\)) are equal. In general, the total FT phase due to a pole-zero pair at any frequency bin is the FT phase due to the zero minus the FT phase due to the pole [19]. The total phase is therefore,
$$\begin{aligned}
\theta \left( e^{j\omega _{0}}\right) &= \theta _{z_0} - \theta _{p_0} &&(10)\\
&= 0^{\circ } &&(11)
\end{aligned}$$
- (b) FT phase at the angular location of a zero that lies outside the unit circle: In this case, the FT phase due to the zero takes the opposite sign, as it is measured in the anticlockwise direction, as shown in Fig. 2. The total FT phase is then,
$$\begin{aligned}
\theta \left( e^{j\omega _{0}}\right) &= -\theta _{z_0} - \theta _{p_0} &&(12)\\
&= -180^{\circ } &&(13)
\end{aligned}$$
- (c) Phase at the angular location \(\omega _1 = \omega _0 - \delta \omega _0\) due to a zero that lies outside the unit circle at an angular location \(\omega _0\), shown in Fig. 15: At \(\omega _1\), with the zero's contribution carrying the same sign change as in case (b), the total FT phase is,
$$\begin{aligned}
\theta \left( e^{j\omega _1}\right) &= -\theta _{z_1} - \theta _{p_1} &&(14)\\
&= -\left[ \theta _{z_0} - \delta \theta _{z_0}\right] - \left[ \theta _{p_0} - \delta \theta _{p_0}\right] &&(15)\\
&= \left[ -\theta _{z_0} - \theta _{p_0}\right] + \left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(16)\\
&= -180^{\circ } + \left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(17)
\end{aligned}$$
where \(\delta \theta _{p_0}\) and \(\delta \theta _{z_0}\) are very small if \(\delta \omega _0\) is very small. Therefore, \(\theta (e^{j\omega _1}) \approx - 180^{\circ }\).
- (d) Phase at the angular location \(\omega _2 = \omega _0 + \delta \omega _0\) due to a zero that lies outside the unit circle at an angular location \(\omega _0\), shown in Fig. 16: At \(\omega _2\), the total FT phase is,
$$\begin{aligned}
\theta \left( e^{j\omega _2}\right) &= -\theta _{z_2} - \theta _{p_2} &&(18)\\
&= -\left[ \theta _{z_0} + \delta \theta _{z_0}\right] - \left[ \theta _{p_0} + \delta \theta _{p_0}\right] &&(19)\\
&= \left[ -\theta _{z_0} - \theta _{p_0}\right] - \left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(20)\\
&= -180^{\circ } - \left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(21)
\end{aligned}$$
where \(\delta \theta _{p_0}\) and \(\delta \theta _{z_0}\) are very small if \(\delta \omega _0\) is very small. Therefore, \(\theta \left( e^{j\omega _2}\right) \approx - 180^{\circ }\) (the absolute phase is slightly greater than \(\pi \)). Since the principal phase is defined between \({-}\pi \) and \(\pi \), the phase is wrapped in this case and its sign changes. Finally, \(\theta (e^{j\omega _2}) = +180^{\circ } - \left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] \).
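The four cases above can be checked numerically. The following is a minimal sketch, assuming that the system function in (1) has the form \(H(z) = (z - a)/z\) (a zero at a and a pole at the origin); the helper name `ft_phase_deg` and the values chosen for the radius, \(\omega _0\), and the offset are illustrative assumptions, not taken from the paper. NumPy's `np.angle` returns the principal phase, so the wrapping described in case (d) appears directly.

```python
import numpy as np

def ft_phase_deg(a, omega):
    """Principal-value phase (degrees) of H(e^{j*omega}), assuming H(z) = (z - a) / z."""
    z = np.exp(1j * omega)
    return float(np.degrees(np.angle((z - a) / z)))

omega0 = np.pi / 4   # angular location of the zero (arbitrary illustrative value)
delta = 1e-3         # small offset standing in for delta-omega_0

zero_in = 0.8 * np.exp(1j * omega0)    # case (a): zero inside the unit circle
zero_out = 1.25 * np.exp(1j * omega0)  # cases (c), (d): zero outside the unit circle

theta0_in = ft_phase_deg(zero_in, omega0)            # case (a), measured at omega_0
theta1_out = ft_phase_deg(zero_out, omega0 - delta)  # case (c), measured at omega_1
theta2_out = ft_phase_deg(zero_out, omega0 + delta)  # case (d), measured at omega_2

print(f"zero inside,  phase at omega_0: {theta0_in:+.4f} deg")
print(f"zero outside, phase at omega_1: {theta1_out:+.4f} deg")
print(f"zero outside, phase at omega_2: {theta2_out:+.4f} deg")
```

Under these assumptions the phase for the inside zero is essentially zero, while for the outside zero the two neighbouring values sit just inside \(-180^{\circ }\) and \(+180^{\circ }\), respectively, which is the sign change attributed to wrapping in case (d).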
1.2 Observations
- At \(\omega _0\), \(|\theta (e^{j\omega _0})| = 180^{\circ }\).
- At \(\omega _1 = \omega _{0} - \delta \omega _0\), \(|\theta (e^{j\omega _1})|\) is less than \(180^{\circ }\) by an amount \((\delta \theta _{p_0} + \delta \theta _{z_0})\), which is small if \(\delta \omega _0\) is small.
- At \(\omega _2 = \omega _{0} + \delta \omega _0\), \(|\theta \left( e^{j\omega _2}\right) |\) is greater than \(180^{\circ }\) by an amount \((\delta \theta _{p_0} + \delta \theta _{z_0})\), which is small if \(\delta \omega _0\) is small. Due to phase wrapping, there is a sign change and \(|\theta \left( e^{j\omega _2}\right) |\) becomes slightly less than \(180^{\circ }\).
- The phase difference (\(\tau \)), computed from the wrapped phase, is given by,
$$\begin{aligned}
\tau &= -\left[ \theta \left( e^{j\omega _{2}}\right) - \theta \left( e^{j\omega _{1}}\right) \right] &&(22)\\
&= -2\pi + 2\left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(23)
\end{aligned}$$
- If the phase is unwrapped, then,
$$\begin{aligned}
\tau = 2\left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] &&(24)
\end{aligned}$$
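The wrapped and unwrapped values of \(\tau \) in (23) and (24) differ by exactly \(2\pi \), which a short numerical experiment can illustrate. This is a sketch under the same assumption that the system function is \(H(z) = (z - a)/z\); the helper `ft_phase`, the radius 1.2, and the offset `delta` are hypothetical choices made only for the example. `np.unwrap` is used here as one convenient way to remove the \(2\pi \) jump between the two phase samples.

```python
import numpy as np

def ft_phase(a, omega):
    """Principal-value phase (radians) of H(e^{j*omega}), assuming H(z) = (z - a) / z."""
    z = np.exp(1j * omega)
    return np.angle((z - a) / z)

omega0, delta = np.pi / 3, 1e-2       # illustrative zero angle and frequency offset
a = 1.2 * np.exp(1j * omega0)         # zero outside the unit circle

# Phase at omega_1 = omega_0 - delta and omega_2 = omega_0 + delta
theta = np.array([ft_phase(a, omega0 - delta), ft_phase(a, omega0 + delta)])

# Phase difference from the wrapped (principal-value) phase, as in (22)
tau_wrapped = float(-(theta[1] - theta[0]))

# Phase difference after unwrapping, as in (24)
theta_unwrapped = np.unwrap(theta)
tau_unwrapped = float(-(theta_unwrapped[1] - theta_unwrapped[0]))

print(f"tau (wrapped)   = {tau_wrapped:.4f} rad")
print(f"tau (unwrapped) = {tau_unwrapped:.4f} rad")
print(f"difference      = {tau_unwrapped - tau_wrapped:.4f} rad")
```

With these values the wrapped phase difference lies close to \(-2\pi \), while the unwrapped one is a small positive number, consistent with the role of \(2\left[ \delta \theta _{p_0} + \delta \theta _{z_0}\right] \) as a small correction term.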
1.3 Zero on the Unit Circle: Phase and Phase-Difference Spectrum
Let us consider a single zero that lies on the unit circle itself, at an angular frequency \(\omega _0\), as shown in Fig. 17. With the geometric interpretation, the FT phase due to a singularity on the unit circle cannot be measured at the angular location of the singularity itself. However, by applying an appropriate condition to the FT phase at a neighboring angular location (say \(\omega _{1} = \omega _{0} - \delta \)), one can comment on the value of the FT phase in this case too, as explained below. Here, the FT phase at \(\omega _1\), due to the zero on the unit circle and the pole at the origin, is the sum of \(\theta _{p_{1}}\) and \(\theta _{z_{1}}\) (with a negative sign).
As \(\delta \) approaches 0, the line joining the singularity and the point at \(\omega _{1}\) becomes a tangent to the unit circle at \(\omega _0\). This implies that the FT phase in this case is \(\frac{\pi }{2}\). Similarly, the FT phase at \(\omega _{0} + \delta \) also has the value \(\frac{\pi }{2}\), but with the opposite sign. The phase difference at \(\omega _{0}\) can therefore be shown to have the value \(\pi \). Truncating a sequence (to make it finite) introduces zeros on the unit circle. These zeros introduce a discontinuity of magnitude \(\pi \) radians in the phase spectrum at the corresponding angular locations. Due to these discontinuities, the phase-difference spectrum is expected to reach the value \(\pi \) at these angular locations.
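The \(\pi \) discontinuity for a zero on the unit circle can be illustrated in the same way. The sketch below again assumes a system function of the form \(H(z) = (z - a)/z\), now with \(|a| = 1\); the helper `ft_phase` and all numeric values are assumptions made for the example. As a simple stand-in for the truncation effect, it also computes the zeros of a length-N rectangular window (the polynomial \(1 + z^{-1} + \dots + z^{-(N-1)}\)), which is an illustrative choice rather than the paper's own construction.

```python
import numpy as np

def ft_phase(a, omega):
    """Principal-value phase (radians) of H(e^{j*omega}), assuming H(z) = (z - a) / z."""
    z = np.exp(1j * omega)
    return np.angle((z - a) / z)

omega0, delta = np.pi / 3, 1e-3       # illustrative zero angle and small offset
zero_on = np.exp(1j * omega0)         # zero exactly on the unit circle

theta1 = float(ft_phase(zero_on, omega0 - delta))  # just below omega_0
theta2 = float(ft_phase(zero_on, omega0 + delta))  # just above omega_0
jump = abs(theta2 - theta1)                        # magnitude of the phase difference

print(f"phase just below omega_0: {theta1:+.4f} rad")
print(f"phase just above omega_0: {theta2:+.4f} rad")
print(f"|phase difference|:       {jump:.4f} rad (pi = {np.pi:.4f})")

# Zeros of a length-N rectangular window: roots of z^(N-1) + ... + z + 1
N = 8
window_zeros = np.roots(np.ones(N))
print("window zero radii:", np.round(np.abs(window_zeros), 6))
```

Under these assumptions the two phase samples are close to \(\mp \frac{\pi }{2}\), their difference approaches \(\pi \) in magnitude as `delta` shrinks, and every zero of the rectangular window has unit radius.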
Cite this article
Anushiya Rachel, G., Sripriya, N., Vijayalakshmi, P. et al. Significance of Differenced EGG Signal as a Spectrum in Phase Difference Computation for the Estimation of Glottal Closure Instants. Circuits Syst Signal Process 37, 2074–2097 (2018). https://doi.org/10.1007/s00034-017-0654-y