
Lifelong topic modeling with knowledge-enhanced adversarial network


Abstract

Lifelong topic modeling has attracted much attention in natural language processing (NLP), since it accumulates knowledge learned from past tasks for use in future ones. However, existing lifelong topic models often require complex derivations or exploit only part of the contextual information. In this study, we propose a knowledge-enhanced adversarial neural topic model (KATM) and extend it to LKATM for lifelong topic modeling. KATM employs a knowledge extractor that encourages the generator to learn interpretable document representations and retrieves knowledge from the generated documents. LKATM incorporates knowledge from previously trained KATMs into the current model, learning from prior models without catastrophic forgetting. Experiments on four benchmark text streams validate the effectiveness of KATM and LKATM in topic discovery and document classification.
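To make the architecture in the abstract concrete, the sketch below shows one plausible reading of the KATM setup: a generator maps Dirichlet-sampled topic proportions to a word distribution, a discriminator separates real from generated documents, and a knowledge extractor tries to recover the topic proportions from the generated document, which is the signal that pushes the generator toward interpretable representations. This is a minimal, hypothetical PyTorch sketch; all module names, dimensions, losses, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the KATM-style adversarial setup described above.
# All names (Generator, KnowledgeExtractor, Discriminator), the Dirichlet
# prior, dimensions, and losses are illustrative assumptions, not the
# authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, N_TOPICS, HIDDEN, BATCH = 2000, 50, 256, 32

class Generator(nn.Module):
    """Maps topic proportions theta to a word distribution (a fake document)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_TOPICS, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, VOCAB), nn.Softmax(dim=-1))

    def forward(self, theta):
        return self.net(theta)

class KnowledgeExtractor(nn.Module):
    """Recovers theta from a document; training it on generated documents
    pushes the generator toward interpretable representations."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VOCAB, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, N_TOPICS), nn.Softmax(dim=-1))

    def forward(self, doc):
        return self.net(doc)

class Discriminator(nn.Module):
    """Scores whether a word distribution comes from the real corpus."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VOCAB, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, 1))

    def forward(self, doc):
        return self.net(doc)

G, E, D = Generator(), KnowledgeExtractor(), Discriminator()
opt_g = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

# Stand-in for normalized bag-of-words document vectors.
real = torch.rand(BATCH, VOCAB)
real = real / real.sum(dim=-1, keepdim=True)
theta = torch.distributions.Dirichlet(torch.ones(N_TOPICS)).sample((BATCH,))

# Discriminator step: distinguish real documents from generated ones.
fake = G(theta).detach()
loss_d = bce(D(real), torch.ones(BATCH, 1)) + bce(D(fake), torch.zeros(BATCH, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator/extractor step: fool D, and make E recover theta from the
# generated document (the knowledge-retrieval signal from the abstract).
fake = G(theta)
loss_adv = bce(D(fake), torch.ones(BATCH, 1))
loss_know = F.kl_div((E(fake) + 1e-8).log(), theta, reduction="batchmean")
opt_g.zero_grad(); (loss_adv + loss_know).backward(); opt_g.step()
```

For the lifelong variant (LKATM), the abstract suggests a distillation-style constraint: one plausible realization would penalize the divergence between the outputs of the previously trained extractor and the current one on new documents, so that prior topic knowledge is retained without catastrophic forgetting.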




Acknowledgements

The research described in this paper was supported by the National Natural Science Foundation of China (61972426), Guangdong Basic and Applied Basic Research Foundation (2020A1515010536), the Hong Kong Research Grants Council (project no. PolyU 11204919), and an internal research grant from the Hong Kong Polytechnic University (project 1.9B0V).

Author information


Corresponding author

Correspondence to Yanghui Rao.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Zhang, X., Rao, Y. & Li, Q. Lifelong topic modeling with knowledge-enhanced adversarial network. World Wide Web 25, 219–238 (2022). https://doi.org/10.1007/s11280-021-00984-2
