Contributions to a Quantitative Unsupervised Processing and Analysis of Tongue in Ultrasound Images

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12132)

Abstract

Speech production studies, and the knowledge they bring forward, are of paramount importance to advance a wide range of areas, including Phonetics, speech therapy, speech synthesis, and interaction. Several technologies have been considered to study static and dynamic features of the articulators and of speech motor control, such as electromagnetic articulography (EMA), real-time magnetic resonance imaging (RTMRI), and ultrasound (US) imaging. While the latest advances in RTMRI provide a wealth of data covering the full vocal tract, it is an expensive resource that requires specialized facilities. Ultrasound is a more affordable alternative for many contexts, enabling the acquisition of larger datasets, but it demands adequate computational approaches for processing and analysis. Although the literature is prolific in methods for tongue segmentation from US, the noisy nature of the images and the specificities of each equipment often result in poor performance on novel datasets, a matter that needs to be assessed before any large data acquisition, so that suitable acquisition and processing methods can be devised. Within a line of research studying how speech changes with age, this work describes the first results of an automatic method for tongue segmentation from US, along with a characterization of the main challenges posed by the image data. Even though improvements are still needed, particularly to ensure temporal coherence, at its current stage the method can already provide the data required for an automatic analysis of maximum tongue height, a relevant parameter for assessing speech changes in vowel production.
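To illustrate the last point: once a tongue contour has been extracted for a frame, computing the maximum tongue height reduces to locating the contour's topmost point relative to a fixed reference. The following is a minimal sketch under stated assumptions, not the paper's implementation; the function name, the reference row, and the pixel-to-millimetre calibration are all illustrative placeholders that would, in practice, come from the acquisition setup.

```python
import numpy as np

# Minimal sketch (not the authors' implementation): estimate maximum tongue
# height from a contour extracted for a single ultrasound frame. The contour
# is assumed to be an (N, 2) array of (x, y) pixel coordinates with the y
# axis growing downwards, as is usual for image data, so the highest tongue
# point corresponds to the smallest y value.

def max_tongue_height(contour_xy, reference_y, mm_per_pixel=1.0):
    """Return the maximum tongue height in millimetres.

    `reference_y` is an assumed fixed reference row (e.g., near the probe
    surface at the bottom of the image) and `mm_per_pixel` an assumed
    calibration factor; both depend on the equipment and setup.
    """
    contour_xy = np.asarray(contour_xy, dtype=float)
    top_y = contour_xy[:, 1].min()  # smallest row index = highest point
    return (reference_y - top_y) * mm_per_pixel

# Toy example: five contour points, reference row at y=400, 0.2 mm/pixel.
contour = [(50, 320), (90, 280), (130, 250), (170, 260), (210, 300)]
print(max_tongue_height(contour, reference_y=400, mm_per_pixel=0.2))  # 30.0
```

Applied per frame over a vowel production, the maximum of these values gives the per-vowel maximum tongue height; note, however, that without the temporal coherence the paper identifies as future work, frame-wise outliers can inflate this maximum.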



Acknowledgements

This research was financially supported by the projects VoxSenes (POCI-01-0145-FEDER-03082) and MEMNON (POCI-01-0145-FEDER-028976), funded by COMPETE2020 under POCI and FEDER and by national funds (OE) through FCT/MCTES; by SOCA – Smart Open Campus (CENTRO-01-0145-FEDER-000010, Portugal 2020 under POCI and FEDER); and by IEETA Research Unit funding (UIDB/00127/2020). Luciana Albuquerque's work is funded by FCT through grant SFRH/BD/115381/2016.

Author information

Corresponding author

Correspondence to Fábio Barros.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Barros, F., Valente, A.R., Albuquerque, L., Silva, S., Teixeira, A., Oliveira, C. (2020). Contributions to a Quantitative Unsupervised Processing and Analysis of Tongue in Ultrasound Images. In: Campilho, A., Karray, F., Wang, Z. (eds) Image Analysis and Recognition. ICIAR 2020. Lecture Notes in Computer Science, vol 12132. Springer, Cham. https://doi.org/10.1007/978-3-030-50516-5_15

  • DOI: https://doi.org/10.1007/978-3-030-50516-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-50515-8

  • Online ISBN: 978-3-030-50516-5

  • eBook Packages: Computer Science, Computer Science (R0)
