Abstract
In speech science, analyzing the shape of the tongue during human speech production is of great importance. Magnetic resonance imaging (MRI) is currently regarded as the preferred modality for acquiring dense 3D information about the human vocal tract. However, the desired shape information is not directly available from the acquired MRI data. In this chapter, we present a minimally supervised framework for extracting the tongue shape from a 3D MRI scan. It combines an image segmentation approach with a template fitting technique and produces a polygon mesh representation of the identified tongue shape. Our evaluation focuses on two aspects. First, we investigate whether the approach can be regarded as independent of changes in tongue shape caused by different speakers and phones. Second, we check whether an average user, who is not necessarily an anatomical expert, can obtain acceptable results. In both cases, our framework shows promising results.
Acknowledgements
This study uses data from work supported by EPSRC Healthcare Partnerships Grant number EP/I027696/1 (“Ultrax”).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Hewer, A., Wuhrer, S., Steiner, I., Richmond, K. (2016). Tongue Mesh Extraction from 3D MRI Data of the Human Vocal Tract. In: Breuß, M., Bruckstein, A., Maragos, P., Wuhrer, S. (eds) Perspectives in Shape Analysis. Mathematics and Visualization. Springer, Cham. https://doi.org/10.1007/978-3-319-24726-7_16
DOI: https://doi.org/10.1007/978-3-319-24726-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24724-3
Online ISBN: 978-3-319-24726-7
eBook Packages: Mathematics and Statistics (R0)