Improving communication skills of children with autism through support of applied behavioral analysis treatments using multimedia computing: a survey


Naturalistic applied behavior analysis (ABA) techniques have been shown to help children with autism improve their communication skills. Recognizing that individuals who interact with children regularly are in a position to deliver treatments with profound effects, researchers have examined methodologies for training parents, teachers, and peers to implement them. These training programs are time-intensive and are often unable to support trainees after training ends. Technologies need to be examined to determine how they can aid the education and support process. Academic publications and publicly available training programs were reviewed to determine the types of participants, methodologies, and training durations that have been reported for instructing interventionists. These resources illustrate a need to make programs more accessible. To address this, selected computer science research is applied to methods of evaluating ABA implementations in order to recommend how the technologies could be used to make training and support programs more accessible. The review of instructional programs, both in research and in the community, illustrates the challenges of providing training in ABA methodologies. Modern research in multimedia data processing and machine learning could be applied to reduce the human cost of training and supporting individuals implementing ABA techniques. Using machine learning techniques to analyze video probes of naturalistic ABA treatment implementation could alleviate the human cost of evaluating fidelity, allowing for greater support for individuals interested in the treatments. In the future, these technologies could also be used to expand data collection and provide more perspective on the treatments.



The authors thank Arizona State University and the National Science Foundation for their funding support. This material is partially based upon work supported by the National Science Foundation under Grant Nos. 1069125 and 1828010.

Author information

Correspondence to Corey D. C. Heath.

Cite this article

Heath, C.D.C., McDaniel, T., Venkateswara, H. et al. Improving communication skills of children with autism through support of applied behavioral analysis treatments using multimedia computing: a survey. Univ Access Inf Soc (2020) doi:10.1007/s10209-019-00707-5


  • Applied behavior analysis
  • Pivotal response treatment
  • Autism spectrum disorder
  • Parent training
  • Multimedia data processing
  • Machine learning