Computational Methods for Analysis of Language in Graduate and Undergraduate Student Texts

  • Conference paper
Higher Education for All. From Challenges to Novel Technology-Enhanced Solutions (HEFA 2017)

Abstract

Academic programs often require students to write a thesis or research proposal. Reviewing such texts is a heavy workload, especially in the early stages. We employ Natural Language Processing techniques to mine existing corpora of research proposals and theses in order to assess drafts written by college students in information technologies and computer science. In this chapter, we focus on specific sections of student writing. We first examine how ideas are connected by identifying patterns of entities. We then analyze the justification and conclusion sections, studying features such as whether the justification conveys importance and the level of speculative words in the conclusion. Experiments and results for each analysis are explained in detail. The analyses are independent of one another, so students could apply them as a set of tools to analyze and improve their own writing.

Notes

  1. https://www.sketchengine.co.uk/penn-treebank-tagset.

  2. NC: The sentence was not connected to research results.

  3. GT: The sentence is written in General Terms.

  4. VO: The vocabulary used in the sentence is not adequate.

  5. www.speech.sri.com/projects/srilm/.

  6. http://www.fit.vutbr.cz/~imikolov/rnnlm/.

  7. Click on the image “TURET 2.0”.

Author information

Corresponding author

Correspondence to Samuel González-López.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

González-López, S., López-López, A. (2018). Computational Methods for Analysis of Language in Graduate and Undergraduate Student Texts. In: Cristea, A., Bittencourt, I., Lima, F. (eds) Higher Education for All. From Challenges to Novel Technology-Enhanced Solutions. HEFA 2017. Communications in Computer and Information Science, vol 832. Springer, Cham. https://doi.org/10.1007/978-3-319-97934-2_3

  • DOI: https://doi.org/10.1007/978-3-319-97934-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-97933-5

  • Online ISBN: 978-3-319-97934-2

  • eBook Packages: Computer Science (R0)
