Data-Driven Development and Evaluation of Enskill English


Cloud computing offers developers of learning environments access to unprecedented amounts of learner data. This makes possible data-driven development (D3) of learning environments. In the D3 approach, the learning environment is a data collection tool as well as a learning tool. It continually collects data from interactions with learners, and these data are used in ongoing evaluation and iterative development. Iterative development cycles become very rapid, limited by the time required to analyze data and deploy system updates. D3 is particularly relevant to fielded AIED systems that operate in uncontrolled conditions, where learners may behave in unexpected ways. This article presents two snapshot case studies in the data-driven development of Enskill® English, a system for learning to speak English as a foreign language. In the first trial, at the University of Novi Sad in Serbia, two versions of Enskill English’s dialogue system were tested simultaneously: the released version and a new version incorporating statistical natural language processing technology. A new version was then released, and data were collected in a second snapshot evaluation at the University of Split, Croatia. Data from learners in Latin America and Europe were analyzed for comparison. The evaluations provided preliminary evidence that Enskill English is helpful for learning spoken English skills, along with leading indicators that learner performance improves through practice with Enskill English. They also suggest that Enskill English can be extended to meet the needs of more advanced learners who wish to use English in a professional context. Broader recommendations for data-driven development of intelligent learning environments are presented.






The author and the Enskill English development team wish to thank Vesna Bulatović of the University of Novi Sad, Angelina Gašpar and Ani Grubišić of the University of Split, and the many students who participated in these studies. We also wish to thank Laureate Education for their permission to use their English simulation content in these studies. Finally, I wish to thank the reviewers and editors at the Journal of Artificial Intelligence in Education for providing valuable constructive feedback and seeing this effort through to publication.

Author information



Corresponding author

Correspondence to W. Lewis Johnson.


About this article


Cite this article

Johnson, W.L. Data-Driven Development and Evaluation of Enskill English. Int J Artif Intell Educ 29, 425–457 (2019).



Keywords

  • Development methodologies for AIED systems
  • Iterative design
  • Experimental studies
  • Educational design research
  • Computer-aided language learning
  • Dialogue systems