Cognitive Approaches to Spoken Language Technology

Abstract

As evidenced by the contributions of the other authors in this volume, spoken language technology (SLT) has made great strides over the past 20 or so years. The introduction of data-driven machine-learning approaches to building statistical models for automatic speech recognition (ASR), unit selection inventories for text-to-speech synthesis (TTS) or interaction strategies for spoken language dialogue systems (SLDS) has given rise to a steady year-on-year improvement in system capabilities. Such continued incremental progress has also been underpinned by a regime of public benchmark testing sponsored by national funding agencies, such as DARPA, coupled with an ongoing increase in available computer power.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Department of Computer Science, Sheffield, UK
