Self-Supervised Text Style Transfer with Rationale Prediction and Pretrained Transformers

Part of the Communications in Computer and Information Science book series (CCIS, volume 1734)


Sentiment transfer involves changing the sentiment of a sentence, such as from a positive to negative sentiment, while maintaining the informational content. Given the dearth of parallel corpora in this domain, sentiment transfer and other text rewriting tasks have been posed as unsupervised learning problems. In this paper we propose a self-supervised approach to sentiment or text style transfer. First, sentiment words are identified through an interpretable text classifier based on the method of rationales. Second, a pretrained BART model is fine-tuned as a denoising autoencoder to autoregressively reconstruct sentences in which sentiment words are masked. Third, the model is used to generate a parallel corpus, filtered using a sentiment classifier, which is used to fine-tune the model further in a self-supervised manner. Human and automatic evaluations show that on the Yelp sentiment transfer dataset the performance of our self-supervised approach is close to the state-of-the-art while the BART model performs substantially better than a sequence-to-sequence baseline. On a second dataset of Amazon reviews our approach scores high on fluency but struggles more to modify sentiment while maintaining sentence content. Rationale-based sentiment word identification obtains similar performance to the saliency-based sentiment word identification baseline on Yelp but underperforms it on Amazon. Our main contribution is to demonstrate the advantages of self-supervised learning for unsupervised text rewriting.
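The data-preparation side of the three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `TOY_LEXICON`, `toy_rationale`, and `toy_classifier` are hypothetical stand-ins for the learned rationale model and the sentiment classifier, the `<mask>` string is assumed to match the mask token of the BART tokenizer in use, and collapsing adjacent masked words into a single mask is a design choice made here rather than something the abstract specifies. The BART fine-tuning and generation steps are omitted.

```python
from typing import Callable, Iterable, List, Set, Tuple

MASK = "<mask>"  # assumed mask token string; check the tokenizer actually used

# Hypothetical toy lexicon standing in for the learned rationale model.
TOY_LEXICON = {"great": "pos", "delicious": "pos", "terrible": "neg", "bland": "neg"}


def toy_rationale(tokens: List[str]) -> Set[int]:
    """Return the positions flagged as sentiment words (stand-in for step 1)."""
    return {i for i, tok in enumerate(tokens) if tok.lower() in TOY_LEXICON}


def mask_sentiment_words(tokens: List[str], rationale: Set[int]) -> str:
    """Replace flagged tokens with MASK, merging adjacent masks into one."""
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if i in rationale:
            if not out or out[-1] != MASK:
                out.append(MASK)
        else:
            out.append(tok)
    return " ".join(out)


def denoising_pairs(
    sentences: Iterable[str],
    rationale_fn: Callable[[List[str]], Set[int]],
) -> List[Tuple[str, str]]:
    """Build (masked input, original target) pairs for denoising (step 2)."""
    pairs = []
    for sentence in sentences:
        tokens = sentence.split()
        pairs.append((mask_sentiment_words(tokens, rationale_fn(tokens)), sentence))
    return pairs


def toy_classifier(sentence: str) -> str:
    """Majority vote over lexicon hits (stand-in for the sentiment classifier)."""
    votes = [TOY_LEXICON[t.lower()] for t in sentence.split() if t.lower() in TOY_LEXICON]
    if not votes:
        return "neutral"
    return "pos" if votes.count("pos") >= votes.count("neg") else "neg"


def filter_parallel(
    candidates: Iterable[Tuple[str, str]],
    classify: Callable[[str], str],
    target: str,
) -> List[Tuple[str, str]]:
    """Keep generated pairs whose output is classified as the target (step 3)."""
    return [(src, out) for src, out in candidates if classify(out) == target]
```

In this sketch, `denoising_pairs` would supply the training pairs for the denoising autoencoder, and `filter_parallel` would be applied to (source, generated) pairs produced by the fine-tuned model before the second, self-supervised round of fine-tuning.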


  • Text style transfer
  • Self-supervised learning
  • Transformers


We use SACREBLEU [17] with default settings to calculate the BLEU score.


  1. Bastings, J., Aziz, W., Titov, I.: Interpretable neural predictions with differentiable binary variables. In: ACL, no. 1, pp. 2963–2977 (2019)

  2. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT, no. 1, pp. 4171–4186 (2019)

  3. He, J., Wang, X., Neubig, G., Berg-Kirkpatrick, T.: A probabilistic formulation of unsupervised text style transfer. In: ICLR (2020)

  4. He, R., McAuley, J.: Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Proceedings of the 25th International Conference on World Wide Web, pp. 507–517 (2016)

  5. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  6. Keskar, N.S., McCann, B., Varshney, L.R., Xiong, C., Socher, R.: CTRL: a conditional transformer language model for controllable generation (2019)

  7. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)

  8. Krishna, K., Wieting, J., Iyyer, M.: Reformulating unsupervised style transfer as paraphrase generation. In: EMNLP, pp. 737–762 (2020)

  9. Kumaraswamy, P.: A generalized probability density function for double-bounded random processes. J. Hydrol. 46(1–2), 79–88 (1980)

  10. Lample, G., Subramanian, S., Smith, E., Denoyer, L., Ranzato, M., Boureau, Y.L.: Multiple-attribute text rewriting. In: ICML (2019)

  11. Lei, T., Barzilay, R., Jaakkola, T.S.: Rationalizing neural predictions. In: EMNLP, pp. 107–117 (2016)

  12. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: ACL, pp. 7871–7880 (2020)

  13. Li, J., Jia, R., He, H., Liang, P.: Delete, retrieve, generate: a simple approach to sentiment and style transfer. In: NAACL-HLT, pp. 1865–1874 (2018)

  14. Logeswaran, L., Lee, H., Bengio, S.: Content preserving text generation with attribute controls. In: Advances in Neural Information Processing Systems, pp. 5108–5118 (2018)

  15. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)

  16. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: EMNLP, pp. 1532–1543 (2014)

  17. Post, M.: A call for clarity in reporting BLEU scores. In: WMT, pp. 186–191 (2018)

  18. Pryzant, R., Richard, D.M., Dass, N., Kurohashi, S., Jurafsky, D., Yang, D.: Automatically neutralizing subjective bias in text. In: AAAI (2020)

  19. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018)

  20. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)

  21. Rao, S., Tetreault, J.R.: Dear sir or madam, may I introduce the GYAFC dataset: corpus, benchmarks and metrics for formality style transfer. In: NAACL-HLT, pp. 129–140 (2018)

  22. Nogueira dos Santos, C., Melnyk, I., Padhi, I.: Fighting offensive language on social media with unsupervised text style transfer. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 189–194. Melbourne, Australia (2018)

  23. Sudhakar, A., Upadhyay, B., Maheswaran, A.: Transforming delete, retrieve, generate approach for controlled text style transfer. In: EMNLP/IJCNLP, no. 1, pp. 3267–3277 (2019)

  24. Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: ICML, pp. 3319–3328. PMLR (2017)

  25. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)

  26. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)

  27. Wang, K., Hua, H., Wan, X.: Controllable unsupervised text attribute transfer via editing entangled latent representation. In: Advances in Neural Information Processing Systems, pp. 11034–11044 (2019)

  28. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: EMNLP (Demos), pp. 38–45 (2020)

  29. Wu, X., Zhang, T., Zang, L., Han, J., Hu, S.: Mask and infill: applying masked language model for sentiment transfer. In: IJCAI, pp. 5271–5277 (2019)

Author information

Corresponding author

Correspondence to Jan Buys.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sinclair, N., Buys, J. (2022). Self-Supervised Text Style Transfer with Rationale Prediction and Pretrained Transformers. In: Pillay, A., Jembere, E., Gerber, A. (eds) Artificial Intelligence Research. SACAIR 2022. Communications in Computer and Information Science, vol 1734. Springer, Cham.

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22320-4

  • Online ISBN: 978-3-031-22321-1

  • eBook Packages: Computer Science (R0)