Translating from Complex to Simplified Sentences

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6001)

Abstract

We address the problem of simplifying Portuguese texts at the sentence level by treating it as a “translation task”. We use the Statistical Machine Translation (SMT) framework to learn how to translate from complex to simplified sentences. Given a parallel corpus of original and simplified texts, aligned at the sentence level, we train a standard SMT system and evaluate the “translations” produced using both standard SMT metrics like BLEU and manual inspection. Results are promising according to both evaluations, showing that while the model is usually overcautious in producing simplifications, the overall quality of the sentences is not degraded and certain types of simplification operations, mainly lexical, are appropriately captured.
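To make the automatic evaluation concrete, the following is a minimal sketch of a corpus-level BLEU score computed between system outputs and reference simplifications. It is not the paper's evaluation script: it assumes a single reference per sentence, whitespace tokenization, and no smoothing, and the function names (`ngrams`, `corpus_bleu`) and the toy sentences are invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing.

    Assumption: inputs are parallel lists of token lists.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize outputs shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


# Invented toy pair: one lexical substitution ("postergar" -> "adiar").
complex_sentence = "o governo decidiu postergar a data do evento".split()
simplified_reference = "o governo decidiu adiar a data do evento".split()

# A system that copies the input unchanged (the "overcautious" extreme).
copy_score = corpus_bleu([complex_sentence], [simplified_reference])
# A system that reproduces the reference exactly.
exact_score = corpus_bleu([simplified_reference], [simplified_reference])
print(copy_score, exact_score)
```

In this toy case, a system that leaves the sentence untouched still receives a substantial score, because most n-grams are shared with the reference. This is one reason why n-gram metrics may be complemented by manual inspection, as the abstract describes.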

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Specia, L. (2010). Translating from Complex to Simplified Sentences. In: Pardo, T.A.S., Branco, A., Klautau, A., Vieira, R., de Lima, V.L.S. (eds) Computational Processing of the Portuguese Language. PROPOR 2010. Lecture Notes in Computer Science, vol 6001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12320-7_5

  • DOI: https://doi.org/10.1007/978-3-642-12320-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12319-1

  • Online ISBN: 978-3-642-12320-7

  • eBook Packages: Computer Science, Computer Science (R0)
