Abstract
This paper proposes a convolutional recurrent neural network (ConvRNN) with a wavelet-based kernel to improve the time–frequency localization of non-stationary signals. Time–frequency distributions (TFDs) localize the power spectral density components of a signal jointly over the time–frequency plane. For non-stationary signals (NSS) with differing features, poor time–frequency resolution is an inherent problem of existing TFDs, mainly because of inefficient segmentation methods and improper kernel adaptation. In the current work, the fractional Bézier–Bernstein polynomial function is applied to model the NSS, and its points of inflection are used to segment the signal. Statistical time–frequency features are extracted from the resulting segments and fed to a ConvRNN for better time–frequency localization. The ConvRNN computes a convolution between the input signal features and the proposed Newton–Raphson gradient algorithm (NRGA)-based wavelet function. The network is optimized with a hybrid method that combines the principles of normalized adaptive gradient descent and momentum-based optimization, with an additional normalization step to enhance convergence and stability. The ConvRNN weights are updated in both the forward and backward directions (resilient propagation) until a better correlation is achieved between the signal segments and the wavelet kernel. The proposed ConvRNN NRGA wavelet is observed to improve time–frequency localization compared with standard TFDs and state-of-the-art methodologies. The proposed ConvRNN model is also compared with other CNN and RNN architectures to aid practical interpretation.
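The segmentation idea mentioned above (cutting a signal at its points of inflection) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a uniformly sampled signal and detects inflection points directly from sign changes of the discrete second difference, rather than from a fitted Bézier–Bernstein model. The function names `inflection_indices` and `split_at` are hypothetical.

```python
import math


def inflection_indices(samples):
    """Return sample indices where the discrete second difference changes sign.

    Assumption: uniform sampling, so the second difference approximates the
    second derivative up to a constant factor.
    """
    # d2[i] is the second difference centred on samples[i + 1].
    d2 = [
        samples[i + 2] - 2 * samples[i + 1] + samples[i]
        for i in range(len(samples) - 2)
    ]
    points = []
    for i in range(1, len(d2)):
        # A strict sign change between neighbouring second differences
        # marks a transition between concave and convex behaviour.
        if d2[i - 1] * d2[i] < 0:
            points.append(i + 1)  # index of the first sample after the change
    return points


def split_at(samples, cut_points):
    """Split a sequence into consecutive segments at the given indices."""
    bounds = [0] + list(cut_points) + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Two periods of a sine wave, sampled off the zero crossings.
signal = [math.sin(2 * math.pi * (i + 0.5) / 50) for i in range(100)]
cuts = inflection_indices(signal)
segments = split_at(signal, cuts)
print(cuts)  # prints [25, 50, 75]
```

For a sine wave the inflection points coincide with the zero crossings, so the sketch yields four segments of 25 samples each. On noisy data the raw second difference changes sign very often, which is one reason a smooth polynomial model of the signal is fitted before inflection points are located.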
Data availability
The data and materials are available from the corresponding author on reasonable request.
Acknowledgements
The authors wish to thank the Research and Development Centre of Sri Vasavi Engineering College for providing financial support to carry out this work.
Funding
This research is supported by Sri Vasavi Engineering College, R&D department, Government of India.
Ethics declarations
Conflict of interest
The authors declare no financial or personal conflicts of interest that could have influenced this work.
About this article
Cite this article
Krishna, B.M., Satyanarayana, S.V.V., Satyanarayana, P.V.V. et al. Improving time–frequency resolution in non-stationary signal analysis using a convolutional recurrent neural network. SIViP (2024). https://doi.org/10.1007/s11760-024-03116-1
DOI: https://doi.org/10.1007/s11760-024-03116-1