
Kernel-Based Generative Adversarial Networks for Weakly Supervised Learning

  • Danilo Croce
  • Giuseppe Castellucci
  • Roberto Basili
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11946)

Abstract

In recent years, Deep Learning methods have become very popular in NLP classification tasks, due to their ability to reach high performance while relying on very simple input representations. One drawback of training deep architectures is the large amount of annotated data required for effective training. One recent and promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs).
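The SS-GAN idea (introduced by Salimans et al.) extends the discriminator to K+1 output classes: labeled examples are trained on their true class, unlabeled examples are pushed toward any of the K "real" classes, and generated examples toward the extra "fake" class. A minimal NumPy sketch of that discriminator objective follows; the function name and the convention that the last logit indexes the fake class are illustrative assumptions, not the paper's code.

```python
import numpy as np

def log_softmax(logits):
    # numerically stable log-softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def ss_gan_discriminator_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """SS-GAN discriminator loss over K+1 classes (last index = 'fake').

    - labeled real examples: cross-entropy on their true class;
    - unlabeled real examples: should fall in any of the first K classes;
    - generated examples: should be assigned to the extra 'fake' class.
    """
    K = logits_labeled.shape[1] - 1
    # supervised term: cross-entropy on the K real classes
    lsm = log_softmax(logits_labeled)
    l_sup = -lsm[np.arange(len(labels)), labels].mean()
    # unsupervised term: -log p(real) = -log of the summed first-K probabilities
    lsm_u = log_softmax(logits_unlabeled)
    l_real = -np.log(np.exp(lsm_u[:, :K]).sum(axis=1)).mean()
    # fake term: -log p(fake) for generated samples
    lsm_f = log_softmax(logits_fake)
    l_fake = -lsm_f[:, K].mean()
    return l_sup + l_real + l_fake
```

A well-trained discriminator drives all three terms toward zero; in practice each term is minimized with stochastic gradients, alternating with generator updates.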

In this paper, an SS-GAN is shown to be effective in semantic processing tasks, operating over low-dimensional embeddings derived through the unsupervised approximation of rich Reproducing Kernel Hilbert Spaces. Preliminary analyses over a sentence classification task show that the proposed Kernel-based GAN achieves promising results when only 1% of labeled examples are used.
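The low-dimensional embeddings mentioned above can be obtained with the Nyström method (Williams and Seeger): sample m landmark examples, eigendecompose their Gram matrix, and project every example through it. A hedged sketch, assuming an RBF kernel for concreteness (the paper targets richer linguistic kernels, and the function names here are illustrative):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # squared Euclidean distances between rows of A and rows of B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_embed(X, landmarks, gamma=1.0, eps=1e-12):
    """Map X into a low-dimensional space approximating the kernel-induced
    RKHS via the Nystrom method: x -> k(x, landmarks) @ K_mm^{-1/2}."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    eigval, eigvec = np.linalg.eigh(K_mm)
    # keep numerically significant components and form K_mm^{-1/2}
    keep = eigval > eps
    W = eigvec[:, keep] / np.sqrt(eigval[keep])
    K_nm = rbf_kernel(X, landmarks, gamma)
    return K_nm @ W  # each row is a dense embedding of one example
```

Dot products between embedded rows approximate the original kernel values, so the SS-GAN can operate on these dense vectors instead of the full Gram matrix; when the landmarks are the whole sample, the reconstruction is exact.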

Keywords

Semi-supervised learning · Kernel-based deep architectures · Generative Adversarial Network


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Danilo Croce (1)
  • Giuseppe Castellucci (1)
  • Roberto Basili (1)

  1. Department of Enterprise Engineering, University of Roma Tor Vergata, Rome, Italy
