
Statewide implementation of automated writing evaluation: analyzing usage and associations with state test performance in grades 4-11

  • Research Article
  • Published in Educational Technology Research and Development, 2021

Abstract

Automated writing evaluation (AWE) software provides automatic feedback and scoring to support student writing and revising. The purpose of the present study was to analyze a statewide implementation of AWE software (n = 114,582) in grades 4-11. The goals of the study were to evaluate (a) the extent to which AWE features were used, (b) whether equity and access issues influenced AWE usage, and (c) whether AWE usage was associated with writing performance on a large-scale state writing assessment. Descriptive statistics and hierarchical linear modeling were used to answer the research questions. Results indicated that the main AWE feature (i.e., writing and revising essays) was used, but some features (peer review and independent lessons) were underutilized. School- and student-level demographic variables explained little variance in AWE usage. AWE usage was positively and statistically significantly associated with performance on the large-scale state writing assessment after controlling for prior performance and demographics. The study presents evidence that AWE can positively influence writing on a distal measure when implemented at scale. Implications for large-scale AWE implementation are discussed.
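To make the analytic approach concrete, the sketch below shows how a two-level hierarchical linear model of the kind described might be specified: presumably students (level 1) nested within schools (level 2), with state writing scores regressed on AWE usage while controlling for prior performance and demographics. This is a minimal illustration in R using the nlme package, not the authors' actual code; the data frame and all variable names (awe_data, writing_score, awe_essays, prior_score, frl, school_id) are hypothetical.

```r
# Minimal sketch of a two-level hierarchical linear model (students nested
# in schools). NOT the study's code; all data/variable names are hypothetical.
library(nlme)

# writing_score: state writing assessment score (outcome)
# awe_essays:    count of essays drafted/revised in the AWE system (usage)
# prior_score:   prior test performance (control)
# frl:           free/reduced-price lunch status (demographic control)
# school_id:     school identifier (level-2 grouping factor)
model <- lme(
  fixed  = writing_score ~ awe_essays + prior_score + frl,
  random = ~ 1 | school_id,   # random intercept for each school
  data   = awe_data,
  method = "REML"
)

summary(model)   # fixed effects: a positive, significant awe_essays
                 # coefficient mirrors the association reported above
VarCorr(model)   # variance at the school vs. student level
```

The random intercept absorbs between-school differences in average writing performance, so the awe_essays coefficient reflects the within-context association between usage and scores rather than differences among schools.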



Acknowledgements

This research was supported in part by Delegated Authority contract EDUC432914160001 from Measurement Incorporated® and by Grant R305H170046 from the Institute of Education Sciences, U.S. Department of Education, to the University of Delaware. The opinions expressed are those of the authors and do not represent the views of Measurement Incorporated, the Institute, or the U.S. Department of Education, and no official endorsement by these agencies should be inferred. Thank you to Drs. Christina Barbieri and Henry May for feedback on prior drafts.

Author information

Corresponding author

Correspondence to Andrew Potter.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (DOCX 1325 kb)


About this article


Cite this article

Potter, A., Wilson, J. Statewide implementation of automated writing evaluation: analyzing usage and associations with state test performance in grades 4-11. Education Tech Research Dev 69, 1557–1578 (2021). https://doi.org/10.1007/s11423-021-10004-9

