Automatic Annotation of an Ultrasound Corpus for Studying Tongue Movement

  • Samuel Silva
  • António Teixeira
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8814)


Silent speech interfaces offer an alternative means of interaction when the acoustic speech signal is absent (e.g., speech impairments) or unsuited to the current context (e.g., environmental noise). The goal is to use external data to infer or improve speech recognition. Surface electromyography (sEMG) is one of the modalities used to gather such data, but its applicability still needs further exploration, which requires methods that provide reference data about the phenomena under study. A notable example is the use of sEMG to detect tongue movements. For that purpose, a modality that allows the tongue to be observed, such as ultrasound imaging, must be acquired synchronously with the sEMG. In these experiments, manually annotating tongue movement in the ultrasound sequences, to allow systematic analysis of the sEMG signals, is mostly infeasible, mainly because of the size of the data involved and the need to maintain uniform annotation criteria. To address this task, we present an automatic method for detecting and annotating tongue movement in ultrasound sequences. A preliminary evaluation comparing the results with 72 manual annotations shows good agreement.


Audio Signal, Manual Annotation, Automatic Annotation, sEMG Signal, Tongue Movement
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. DETI/IEETA, University of Aveiro, Aveiro, Portugal
