
Mining Domain Knowledge for Coherence Assessment of Students Proposal Drafts


Part of the Studies in Computational Intelligence book series (SCI, volume 524)


Academic programs often require students to write a thesis or research proposal. Reviewing such texts is a heavy workload for instructors, especially at the initial stages. One feature that instructors evaluate is coherence, i.e., the interrelationship of the various elements of the text. We present a coherence analyzer that employs latent semantic analysis (LSA) to mine existing corpora and then assess new drafts. We designed the analyzer as part of an Intelligent Tutoring System, considering seven common sections of a proposal. After mining domain knowledge, we ran experiments on graduate and undergraduate corpora to define a grading scale. A further experiment involving human reviewers was conducted to validate the process. The technique made it possible to evaluate the coherence of the different sections, with acceptable results that suggest the level reached so far is adequate to support online review. We also performed a novel exploration across sections, which uncovered an interrelationship consistent with the one described by authors on research methodology.
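To make the idea of LSA-based coherence scoring concrete, the following is a minimal sketch, not the analyzer described in the chapter. It assumes NumPy is available, uses a tiny made-up corpus, a simplified projection onto the top-k left singular vectors (no singular-value scaling), and scores a draft as the mean cosine similarity between adjacent sentences, which is one common way to use LSA for coherence. The function names, the corpus, and the value of `k` are all illustrative assumptions.

```python
import numpy as np

def build_term_doc_matrix(docs):
    """Build a term-by-document count matrix (whitespace tokens, lowercased)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return index, A

def fit_lsa(docs, k=2):
    """Mine the corpus: keep the top-k left singular vectors as term vectors."""
    index, A = build_term_doc_matrix(docs)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    return index, U[:, :k]

def fold_in(text, index, term_vectors):
    """Project a new text into the latent space by summing known term vectors."""
    vec = np.zeros(term_vectors.shape[1])
    for w in text.lower().split():
        if w in index:  # words unseen in the corpus are ignored
            vec += term_vectors[index[w]]
    return vec

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na > 0 and nb > 0 else 0.0

def coherence(sentences, index, term_vectors):
    """Mean cosine similarity between each pair of adjacent sentences."""
    vecs = [fold_in(s, index, term_vectors) for s in sentences]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0

# Hypothetical reference corpus (stands in for previously accepted proposals).
corpus = [
    "the research problem motivates the proposal",
    "the hypothesis answers the research problem",
    "the method tests the hypothesis",
    "results and conclusions follow from the method",
]
index, term_vectors = fit_lsa(corpus, k=2)

# Hypothetical new draft, already split into sentences.
draft = [
    "the proposal states the research problem",
    "the hypothesis answers the problem",
    "the method tests the hypothesis",
]
score = coherence(draft, index, term_vectors)
print(f"coherence score: {score:.3f}")
```

In a setting like the one the abstract describes, a score of this kind would still need to be mapped to a grading scale calibrated on real corpora; the sketch stops at the raw similarity.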


  • Coherence
  • Writing support
  • Latent semantic analysis
  • Intelligent tutoring system

  • DOI: 10.1007/978-3-319-02738-8_9
  • Chapter length: 27 pages



  • Data mining
  • Intelligent tutoring system
  • Latent semantic analysis
  • Latent semantic indexing
  • Non-negative matrix factorization
  • Probabilistic latent semantic analysis
  • Student progress module
  • Singular value decomposition




We thank the reviewers: Rene Castro M., Claudia I. Esquivel L., J. Miguel García G., Ramón Cárdenas G., Israel Chávez G., Orlando Madrid M., and Raúl Beltran Q. This research was supported by CONACYT, México, through the scholarship 1124002 for the first author. The second author was partially supported by SNI, México.


Corresponding author

Correspondence to Aurelio López-López.


Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

López, S.G., López-López, A. (2014). Mining Domain Knowledge for Coherence Assessment of Students Proposal Drafts. In: Peña-Ayala, A. (eds) Educational Data Mining. Studies in Computational Intelligence, vol 524. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02737-1

  • Online ISBN: 978-3-319-02738-8

  • eBook Packages: Engineering, Engineering (R0)