YATS: Yet Another Text Simplifier

  • Daniel Ferrés
  • Montserrat Marimon
  • Horacio Saggion
  • Ahmed AbuRa’ed
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9612)

Abstract

We present a text simplifier for English that has been built with open-source software and has both lexical and syntactic simplification capabilities. The lexical simplifier uses a vector space model approach to obtain the most appropriate sense of a given word in a given context, and word-frequency simplicity measures to rank synonyms. The syntactic simplifier uses linguistically motivated, rule-based syntactic analysis and generation techniques that rely on part-of-speech tags and syntactic dependency information. Experimental results show good performance of the lexical simplification component when compared to a hard-to-beat baseline, good syntactic simplification accuracy, and, according to human assessment, improvements over the best reported results in the literature for a system with the same architecture as YATS.
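The two lexical steps named in the abstract (choosing a sense with a vector space model, then ranking that sense's synonyms by word frequency) can be illustrated with a minimal sketch. This is not the YATS implementation: the sense inventory `SENSES`, the frequency table `FREQ`, the bag-of-words vectors, and the function names are all hypothetical toy assumptions chosen only to make the idea concrete.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy sense inventory: each sense has a synonym set and a
# co-occurrence vector (assumed data, not taken from the paper).
SENSES = {
    "sanction": [
        {"synonyms": ["sanction", "approval", "permission"],
         "vector": Counter({"official": 2, "granted": 2, "consent": 1})},
        {"synonyms": ["sanction", "penalty", "punishment"],
         "vector": Counter({"imposed": 2, "violation": 2, "trade": 1})},
    ]
}

# Hypothetical corpus frequencies; higher frequency is treated as "simpler".
FREQ = {"sanction": 40, "approval": 300, "permission": 350,
        "penalty": 250, "punishment": 280}


def simplify_word(target, context_words):
    """Pick the sense closest to the context, then its most frequent synonym."""
    context = Counter(w for w in context_words if w != target)
    best_sense = max(SENSES[target], key=lambda s: cosine(context, s["vector"]))
    return max(best_sense["synonyms"], key=lambda w: FREQ.get(w, 0))


sentence = "the government imposed a sanction on trade after the violation"
print(simplify_word("sanction", sentence.split()))
```

In this toy setting the context shares words only with the second sense, so the most frequent synonym of that sense is selected. A real system would need much larger vectors and frequency lists, and would also have to check that the substitute fits the sentence morphologically.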

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Daniel Ferrés (1)
  • Montserrat Marimon (1)
  • Horacio Saggion (1)
  • Ahmed AbuRa’ed (1)
  1. TALN - DTIC, Universitat Pompeu Fabra, Barcelona, Spain
