Abstract
Articulatory synthesis reproduces speech by means of models of the vocal tract and of articulatory processes. Recent advances in Magnetic Resonance Imaging (MRI) have considerably improved the understanding of speech and of the shapes taken by the vocal tract. However, one of the main challenges in the field is acquiring image sequences that are both fast and of high quality. Since adopting more powerful acquisition devices might be financially unviable, we propose a method for the spatio-temporal resolution enhancement of the acquired sequences using only digital image processing techniques. The approach involves two stages: (1) temporal resolution enhancement by means of a motion-compensated interpolation technique; and (2) spatial resolution enhancement by means of a super-resolution image reconstruction technique. For the spatial resolution enhancement, inspired by two methods available in the literature, three adaptations of the Wiener filter are proposed: the statistical interpolation, the multi-temporal approach, and the adaptive Wiener filter. In all cases, a separable Markovian model and an isotropic model were compared for the characterization of the spatial correlation structures. Among the Wiener filter-based approaches, the adaptive Wiener filter performed best.
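As a rough illustration of the two spatial correlation models compared in the abstract, the sketch below evaluates a separable Markovian model and an isotropic model, and uses either one to compute Wiener weights for estimating a pixel from two noisy neighbours. The abstract does not give the exact formulation, so the exponential forms, the function names, and the values of `rho`, `sigma2` and `noise_var` are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def separable_markov_corr(dx, dy, rho=0.9, sigma2=1.0):
    """Assumed separable first-order Markov model: decay is applied per axis."""
    return sigma2 * (rho ** abs(dx)) * (rho ** abs(dy))


def isotropic_corr(dx, dy, rho=0.9, sigma2=1.0):
    """Assumed isotropic model: decay depends only on Euclidean distance."""
    return sigma2 * rho ** math.hypot(dx, dy)


def wiener_weights_2(corr, obs_a, obs_b, target, noise_var=0.01):
    """Wiener weights for estimating `target` from two noisy observations.

    Solves R w = p for the 2x2 case, where R is the autocorrelation of the
    observations (signal plus white noise) and p is their cross-correlation
    with the target pixel. Positions are (x, y) tuples.
    """
    r_aa = corr(0, 0) + noise_var
    r_bb = corr(0, 0) + noise_var
    r_ab = corr(obs_a[0] - obs_b[0], obs_a[1] - obs_b[1])
    p_a = corr(obs_a[0] - target[0], obs_a[1] - target[1])
    p_b = corr(obs_b[0] - target[0], obs_b[1] - target[1])
    det = r_aa * r_bb - r_ab * r_ab
    # Cramer's rule for the 2x2 linear system.
    w_a = (p_a * r_bb - r_ab * p_b) / det
    w_b = (r_aa * p_b - r_ab * p_a) / det
    return w_a, w_b


if __name__ == "__main__":
    # Along an axis the two models agree; on the diagonal they differ.
    print(separable_markov_corr(1, 0), isotropic_corr(1, 0))
    print(separable_markov_corr(1, 1), isotropic_corr(1, 1))
    # Weights for a pixel midway between two horizontal neighbours.
    print(wiener_weights_2(isotropic_corr, (-1, 0), (1, 0), (0, 0)))
```

Under these assumed forms the two models coincide along the image axes and differ only for diagonal displacements, where the isotropic model assigns a higher correlation. That difference changes the Wiener weights whenever the observed samples are not axis-aligned with the pixel being estimated.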
Acknowledgements
We would like to thank professors Antonio Teixeira and Augusto Silva from Instituto de Engenharia Eletronica e Telematica de Aveiro (IEETA) of Universidade de Aveiro, Portugal, for the vocal tract images used in this work. These images are part of the HERON Project—A Framework for Portuguese Articulatory Synthesis Research, POSI/PLP/57680/2004. Ana L.D. Martins is supported by FAPESP, Brazil, under grant number 2008/01348-2.
Cite this article
Martins, A.L.D., Mascarenhas, N.D.A. Spatio-Temporal Resolution Enhancement of Vocal Tract MRI Sequences—A Comparison Among Wiener Filter Based Methods. J Math Imaging Vis 45, 200–213 (2013). https://doi.org/10.1007/s10851-012-0389-0
DOI: https://doi.org/10.1007/s10851-012-0389-0