Abstract
In recent years, with the rapid development of deep learning, natural language processing has made great progress in many areas. In the field of text generation, classical Chinese poetry, an important part of Chinese culture, has also attracted growing attention. However, existing neural-network-based approaches to classical Chinese poetry generation ignore the semantics carried by Chinese words. A Chinese sentence is a sequence of characters without spaces, so segmenting it properly is essential for understanding the original text correctly. Therefore, if a model knows how to segment a sentence into words, it can understand the meaning of the sentence more accurately. In this paper, we propose a novel model, WE-Transformer (Word-Enhanced Transformer), which generates classical Chinese poetry from vernacular Chinese in a supervised manner and incorporates external Chinese word segmentation knowledge. Our model learns word semantics from character embeddings with a bidirectional LSTM and improves the quality of the generated classical poems by augmenting the Transformer with extra word encoders. Experiments with both automatic and human evaluations demonstrate that our method outperforms the baselines and state-of-the-art models.
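To make the described architecture concrete, the following PyTorch sketch illustrates the core idea: word vectors are built from character embeddings with a bidirectional LSTM and exposed as an additional encoder memory alongside a standard character-level Transformer encoder. All module names, dimensions, and the pooling scheme here are illustrative assumptions for exposition, not the exact WE-Transformer configuration reported in the paper.

```python
import torch
import torch.nn as nn

class WordEncoder(nn.Module):
    """Builds one vector per segmented word from its character embeddings with a
    bidirectional LSTM (an illustrative assumption, not the paper's exact setup)."""
    def __init__(self, vocab_size, char_dim=256, word_dim=256):
        super().__init__()
        self.char_embed = nn.Embedding(vocab_size, char_dim, padding_idx=0)
        # BiLSTM over the characters of each word; the final forward and
        # backward hidden states are concatenated into a single word vector.
        self.bilstm = nn.LSTM(char_dim, word_dim // 2,
                              batch_first=True, bidirectional=True)

    def forward(self, word_char_ids):
        # word_char_ids: (num_words, max_word_len) character indices of one
        # sentence, pre-segmented into words by an external CWS tool.
        emb = self.char_embed(word_char_ids)          # (num_words, len, char_dim)
        _, (h_n, _) = self.bilstm(emb)                # (2, num_words, word_dim // 2)
        return torch.cat([h_n[0], h_n[1]], dim=-1)    # (num_words, word_dim)

class WordEnhancedEncoder(nn.Module):
    """Pairs a character-level Transformer encoder with the extra word encoder;
    a decoder could then attend to both memories."""
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=3):
        super().__init__()
        self.char_embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.char_encoder = nn.TransformerEncoder(layer, num_layers)
        self.word_encoder = WordEncoder(vocab_size, d_model, d_model)

    def forward(self, char_ids, word_char_ids):
        # char_ids: (1, num_chars) characters of the vernacular input sentence.
        char_memory = self.char_encoder(self.char_embed(char_ids))
        word_memory = self.word_encoder(word_char_ids).unsqueeze(0)
        return char_memory, word_memory

# Toy usage with a hypothetical 5000-character vocabulary.
enc = WordEnhancedEncoder(vocab_size=5000)
chars = torch.randint(1, 5000, (1, 12))   # a 12-character vernacular sentence
words = torch.randint(1, 5000, (5, 4))    # 5 segmented words, padded to 4 chars each
char_mem, word_mem = enc(chars, words)    # shapes: (1, 12, 256) and (1, 5, 256)
```

In this sketch the decoder is omitted; the intent is only to show how an external segmentation supplies a second, word-level memory in addition to the usual character-level Transformer representation.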
Data Availability
If this paper is accepted, we will release our data.
Code Availability
If this paper is accepted, we will release our code.
Acknowledgements
We would like to thank Yuzheng Xu for revising the manuscript and for preliminary discussions of the idea.
Funding
Not applicable.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, D., Chen, A.L.P. Classical Chinese poetry generation from vernacular Chinese: a word-enhanced supervised approach. Multimed Tools Appl 82, 39139–39156 (2023). https://doi.org/10.1007/s11042-023-15137-y