International Journal of Speech Technology, Volume 15, Issue 1, pp 49–56

Experiments for the selection of sub-word units in the Basque context for semantic tasks

  • Nora Barroso
  • Karmele López de Ipiña
  • Carmen Hernández
  • Aitzol Ezeiza
  • Manuel Graña
Article

Abstract

The long-term goal of our project is the development of robust ASR systems in the Basque context, where French, Spanish and Basque (a minority language) coexist. The development of ASR systems involves dealing with issues such as Acoustic Phonetic Decoding (APD), Language Modelling (LM) and the development of appropriate Language Resources (LR). Consequently, these systems are generally highly language-dependent and require very large resources. This work focuses on the selection of appropriate sub-word units in under-resourced and noisy conditions. In particular, the work is currently oriented to Basque Broadcast News (BN), owing to the interest of digital mass media such as the trilingual Infozazpi radio station (located in the French Basque Country). To reduce the negative impact of this lack of resources, we apply several data optimization methodologies based on Matrix Covariance Estimation and Ontology-based approaches.
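
As an illustration of why covariance estimation matters when training data are scarce, the following is a minimal sketch and not the authors' method: the abstract does not specify which estimators are used. It assumes a simple shrinkage scheme that blends the sample covariance with a scaled identity matrix. The function name `shrinkage_covariance`, the mixing weight `alpha`, and the 13-dimensional feature size are all hypothetical choices made for this example.

```python
import numpy as np


def shrinkage_covariance(samples, alpha):
    """Blend the sample covariance with a scaled-identity target.

    Hypothetical helper for illustration only.
    alpha = 0 gives the plain sample covariance;
    alpha = 1 gives the identity scaled by the average variance.
    """
    samples = np.asarray(samples, dtype=float)
    n_features = samples.shape[1]
    # Rows are observations, columns are features.
    sample_cov = np.cov(samples, rowvar=False)
    # Target keeps the total variance (trace) of the sample covariance.
    target = (np.trace(sample_cov) / n_features) * np.eye(n_features)
    return (1.0 - alpha) * sample_cov + alpha * target


# Assumed toy setting: 5 feature vectors of dimension 13,
# i.e. fewer samples than dimensions.
rng = np.random.default_rng(0)
few_samples = rng.normal(size=(5, 13))

raw = np.cov(few_samples, rowvar=False)
regularized = shrinkage_covariance(few_samples, alpha=0.2)

# The raw estimate is rank-deficient (singular); the blended one is not.
print(np.linalg.matrix_rank(raw), np.linalg.matrix_rank(regularized))
```

With fewer samples than feature dimensions the plain sample covariance cannot be inverted, which is a problem for any classifier that relies on its inverse; mixing in a well-conditioned target is one common way to obtain a usable estimate.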

Keywords

Under-resourced languages; Sub-word units; Multilingual automatic speech recognition; Discriminant analysis; Matrix covariance estimation methods

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Nora Barroso (1)
  • Karmele López de Ipiña (2)
  • Carmen Hernández (2)
  • Aitzol Ezeiza (2)
  • Manuel Graña (2)

  1. Irunweb Enterprise, Irun, Spain
  2. Grupo de Inteligencia Computacional, UPV/EHU, Donostia, Spain
