Multi-class Text Complexity Evaluation via Deep Neural Networks

Part of the Lecture Notes in Computer Science book series (LNISA, volume 11872)

Abstract

Automatic Text Complexity Evaluation (ATE) is a natural language processing task that aims to assess the difficulty of texts, taking into account the many facets of complexity. Many papers tackle ATE with machine learning algorithms that classify texts as either complex or simple. In this paper, we try to go beyond the methodologies presented so far by introducing a preliminary system based on a deep neural network model whose objective is to classify sentences into more than two classes. Experiments have been carried out on a manually annotated corpus that was preprocessed to make it suitable for the scope of the paper. The results show that a finer-grained classification makes the ATE problem much harder to solve and exposes the weaknesses of the model in accomplishing the task correctly.
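
As an illustration only, and not the model described in the paper, the following sketch shows how a multi-class formulation differs from a binary one at the output stage: the classifier produces one score per complexity level, a softmax turns the scores into probabilities, and the predicted level is the most probable one. The number of levels (four), the example scores, and the function names are assumptions made for this sketch.

```python
import math

def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_level(scores):
    """Return the index of the most probable complexity level."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

def cross_entropy(scores, true_level):
    """Multi-class loss for one sentence: -log p(true level)."""
    return -math.log(softmax(scores)[true_level])

# Hypothetical scores for one sentence over four levels (0 = simplest).
example_scores = [0.2, 1.5, 0.3, -1.0]
print(predict_level(example_scores))
```

With two levels this reduces to the usual complex/simple decision. With K levels a uniform guess is right only 1/K of the time, which is one simple reason a finer-grained labelling is harder.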

Keywords

  • Automatic Text Complexity Evaluation
  • Deep neural network
  • Text simplification

Chapter length: 10 pages

Corresponding author

Correspondence to Alfredo Cuzzocrea.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cuzzocrea, A., Bosco, G.L., Pilato, G., Schicchi, D. (2019). Multi-class Text Complexity Evaluation via Deep Neural Networks. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. IDEAL 2019. Lecture Notes in Computer Science, vol 11872. Springer, Cham. https://doi.org/10.1007/978-3-030-33617-2_32

  • DOI: https://doi.org/10.1007/978-3-030-33617-2_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33616-5

  • Online ISBN: 978-3-030-33617-2

  • eBook Packages: Computer Science, Computer Science (R0)