Baby-Steps Towards Building a Spanglish Language Model

  • Juan Carlos Franco
  • Thamar Solorio
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4394)


Spanglish is the simultaneous or alternating use of traditional Spanish and English within the same conversational event. This interlanguage is common in U.S. populations with large percentages of Spanish speakers. Despite the popularity of this dialect and the widespread use of automated voice systems, there are currently no spoken dialog applications that can process Spanglish. In this paper we present the first attempt at creating a Spanglish language model.


Linguistic Feature; Speech Recognizer; Linguistic Phenomenon; Traditional Language; Statistical Language Modeling
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Juan Carlos Franco (1)
  • Thamar Solorio (1)
  1. University of Texas at El Paso, El Paso, TX 79912, USA
