Chinese story generation of sentence format control based on multi-channel word embedding and novel data format

Data analytics and machine learning · Published in Soft Computing

Abstract

Generating stories in the Chinese language is very difficult, and so far no method reliably produces fluent articles. This paper proposes a novel approach to Chinese story generation that effectively controls the part-of-speech structure of generated sentences in order to imitate a writer's style. The proposal consists of three parts. First, sentence pre-processing discards the conventional arrangement of summary as input and text as output, and instead processes the data in a format built from the <SOS>, <MOS>, and <EOS> markers; the detailed method is defined in Section 4. The second part concerns vectorization. Traditional vectorization methods include Word2vec, FastText, LexVec, and GloVe; each method captures different semantic or grammatical information, so combining them enriches the representation of the input data. This paper therefore proposes a multi-channel word embedding, detailed in Section 5. The last part covers the optimization of the model architecture and the effective control of the sentence-generation process. The BERT model proposed by Google is adapted as the proposed model architecture. In addition, the softmax function is optimized to reduce the search time during training and to increase training speed. To achieve better performance, the model is further trained as a generative adversarial network, with the GAN architecture revised for the data set, as detailed in Section 6. After the model is trained, to control the structure of generated sentences effectively, this paper proposes a complete generation flowchart.
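The multi-channel word embedding described above can be sketched as stacking one embedding per vectorization method along a channel axis, much like the colour channels of an image. This is a minimal illustration, not the paper's implementation: the toy random vectors stand in for pretrained Word2vec, FastText, GloVe, and LexVec tables, and `multi_channel_embed` is a hypothetical helper name.

```python
import numpy as np

# Toy "pretrained" tables; in the paper these would be real Word2vec,
# FastText, GloVe and LexVec vectors (hypothetical data here).
rng = np.random.default_rng(0)
vocab = ["<SOS>", "<MOS>", "<EOS>", "story", "write"]
dim = 8
channels = {name: {w: rng.normal(size=dim) for w in vocab}
            for name in ["word2vec", "fasttext", "glove", "lexvec"]}

def multi_channel_embed(tokens):
    """Stack one embedding per method along a channel axis, producing a
    (n_channels, seq_len, dim) tensor the downstream model can consume."""
    return np.stack([[channels[name][t] for t in tokens]
                     for name in channels])

x = multi_channel_embed(["<SOS>", "story", "<EOS>"])
print(x.shape)  # (4, 3, 8)
```

A model reading this tensor can weight each channel separately, so complementary semantic and grammatical signals from the different embeddings are preserved rather than averaged away.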
In this process, based on the concept of the FP-Tree, all sentences in the data set are built into a tree structure, and the part-of-speech structure of the next sentence is restricted by combining model generation with the FP-Tree, as detailed in Section 7. The experimental results show that the proposed method effectively controls the results of Chinese story generation and produces sentences with better performance, as detailed in Section 8.
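The FP-Tree-based restriction can be pictured as a prefix tree over part-of-speech sequences: only tags that actually follow the current prefix in the data set are allowed as the model's next choice. The sketch below is an assumption-laden simplification (hypothetical POS sequences, no frequency pruning), intended only to show the constraint mechanism.

```python
# Minimal prefix tree over part-of-speech sequences, in the spirit of
# the FP-Tree construction; the corpus below is hypothetical.
class PosNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # how many training sentences pass through this node

def build_tree(sequences):
    root = PosNode()
    for seq in sequences:
        node = root
        for tag in seq:
            node = node.children.setdefault(tag, PosNode())
            node.count += 1
    return root

def allowed_next(root, prefix):
    """POS tags the tree permits after a generated prefix; the model's
    next-token choice would be restricted to words with these tags."""
    node = root
    for tag in prefix:
        if tag not in node.children:
            return set()
        node = node.children[tag]
    return set(node.children)

corpus = [["N", "V", "N"], ["N", "V", "ADJ"], ["ADV", "V", "N"]]
tree = build_tree(corpus)
print(sorted(allowed_next(tree, ["N", "V"])))  # ['ADJ', 'N']
```

During generation, masking the model's output distribution to the tags returned by `allowed_next` guarantees that every generated sentence follows a part-of-speech pattern observed in the training corpus.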



Author information

Correspondence to Jhe-Wei Lin.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lin, JW., Chang, RG. Chinese story generation of sentence format control based on multi-channel word embedding and novel data format. Soft Comput 26, 2179–2196 (2022). https://doi.org/10.1007/s00500-021-06548-w
