Models in the Wild: On Corruption Robustness of Neural NLP Systems

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11955)


Natural Language Processing models lack a unified approach to robustness testing. In this paper we introduce WildNLP, a framework for testing model stability in a natural setting where text corruptions such as keyboard errors or misspellings occur. We compare the robustness of deep learning models from four popular NLP tasks (Q&A, NLI, NER and Sentiment Analysis) by testing their performance on the aspects introduced in the framework. In particular, we focus on a comparison between recent state-of-the-art text representations and non-contextualized word embeddings. To improve robustness, we perform adversarial training on selected aspects and check whether the resulting improvement transfers to other corruption types. We find that high model performance does not ensure sufficient robustness, although modern embedding techniques help to improve it. We release the code of the WildNLP framework for the community.
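As a rough illustration of the kind of test the abstract describes, the sketch below applies a keyboard-error corruption to input text and measures how much a model's accuracy drops. It is not the WildNLP API: the names `QWERTY_NEIGHBOURS`, `corrupt_qwerty` and `accuracy_drop`, the partial neighbour map, and the `rate` and `seed` parameters are all assumptions made for this example.

```python
import random

# Hypothetical, partial map of QWERTY keys to adjacent keys.
# Illustrative only; not taken from the WildNLP code base.
QWERTY_NEIGHBOURS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "n": "bhjm", "s": "awedxz", "t": "rfgy",
}


def corrupt_qwerty(text, rate=0.1, seed=0):
    """Replace each mapped character with a neighbouring key with probability `rate`.

    Characters without an entry in the map are left unchanged. Replacements
    are always lowercase, so the case of a replaced letter is not preserved.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    out = []
    for ch in text:
        neighbours = QWERTY_NEIGHBOURS.get(ch.lower())
        if neighbours and rng.random() < rate:
            out.append(rng.choice(neighbours))
        else:
            out.append(ch)
    return "".join(out)


def accuracy_drop(predict, examples, corrupt):
    """Return clean accuracy minus accuracy on corrupted inputs.

    `predict` maps a text to a label; `examples` is a list of (text, label).
    """
    clean = sum(predict(t) == y for t, y in examples) / len(examples)
    noisy = sum(predict(corrupt(t)) == y for t, y in examples) / len(examples)
    return clean - noisy
```

A model would be plugged in through `predict`, for example `accuracy_drop(model_fn, dev_set, lambda t: corrupt_qwerty(t, rate=0.2))`, where `model_fn` and `dev_set` are placeholders for whatever classifier and labelled data are under test.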


Keywords: Natural Language Processing, Robustness, Adversarial examples, Deep learning



Barbara Rychalska and Dominika Basaj were financially supported by grant no. 2018/31/N/ST6/02273 funded by the National Science Centre, Poland. Our research was partially supported as a part of the RENOIR Project by the European Union’s Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie grant agreement No. 691152 and by the Ministry of Science and Higher Education (Poland), grant No. W34/H2020/2016.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Warsaw University of Technology, Warsaw, Poland
  2. Tooploox, Wrocław, Poland
