LongStory: Coherent, Complete and Length Controlled Long Story Generation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14646)

Abstract

A human author can write a story of any length without losing coherence, and they always bring the story to a proper ending, an ability that current language models lack. In this work, we present LongStory for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long- and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights of the long-term context (Memory) and the short-term context (Cheating), acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural position within a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator PlotMachines, in coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model’s ability to predict outcomes beyond its training data, and we validate our methodology by comparing its performance with variants of our model.
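
The full definitions of the CWC and LSP appear in the body of the paper; the sketch below is only a rough, non-authoritative illustration of the two ideas as the abstract states them. The function and variable names, the discourse-token strings, and the plain weighted sum are assumptions for illustration, not the authors' implementation.

    import torch

    # Hypothetical discourse tokens marking a paragraph's structural position in a
    # long story; the abstract only says LSP uses such tokens, so these strings are
    # placeholders rather than the paper's actual vocabulary.
    LSP_TOKENS = ["<front>", "<middle>", "<ending>"]

    def calibrate_contexts(memory, cheating, alpha, beta):
        """Toy stand-in for the CWC: mix the long-term context ("Memory") and the
        short-term context ("Cheating") with calibrated weights. In the paper the
        weights come from a learned calibrator; here they are plain scalars
        normalized to sum to 1, purely for illustration."""
        total = alpha + beta
        return (alpha / total) * memory + (beta / total) * cheating

    # Usage sketch: a mid-story paragraph with an arbitrary 0.3/0.7 split between
    # long-term and short-term context.
    memory = torch.randn(1, 16, 768)    # long-term context representation
    cheating = torch.randn(1, 16, 768)  # short-term context representation
    context = calibrate_contexts(memory, cheating, alpha=0.3, beta=0.7)
    position_hint = LSP_TOKENS[1]       # "<middle>" for a mid-story paragraph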


Notes

  1. We cover a test in Sect. 4.3.2 where \(\alpha \) is a learnable parameter rather than a constant hyperparameter. In this test, BERT-tiny determines \(\alpha \), \(\beta \), and \(\gamma \) independently and then divides each by their sum, so that the three weights add up to 1 (a toy sketch of this normalization follows these notes).

  2. https://huggingface.co/facebook/bart-large.

  3. We crawl this dataset from https://blog.reedsy.com/short-stories/.

  4. https://github.com/hrashkin/plotmachines.

  5. Note that the in-self-BLEU score is not the same as the self-BLEU score [6]. Self-BLEU takes one whole generated document as the hypothesis and the other generated documents as references, so it cannot capture repetitiveness within a single document (a toy sketch of in-self-BLEU also follows these notes).
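
Footnote 1 describes the weight normalization only verbally; the following is a minimal sketch of that step, assuming the three raw weights are plain scalars standing in for the BERT-tiny outputs.

    import torch

    def normalize_weights(alpha, beta, gamma):
        # Divide each raw weight by the sum of the three (cf. footnote 1) so
        # that the normalized weights add up to 1.
        total = alpha + beta + gamma
        return alpha / total, beta / total, gamma / total

    # e.g. raw outputs 2.0, 1.0, 1.0 -> normalized 0.5, 0.25, 0.25
    a, b, g = normalize_weights(torch.tensor(2.0), torch.tensor(1.0), torch.tensor(1.0))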
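
Footnote 5 contrasts in-self-BLEU with self-BLEU but does not spell out the computation on this page. The sketch below assumes in-self-BLEU scores each paragraph of a single generated document against the remaining paragraphs of the same document; the paragraph-level granularity and the NLTK-based implementation are assumptions, not the authors' code.

    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    def in_self_bleu(paragraphs):
        # Average BLEU of each paragraph against the other paragraphs of the SAME
        # document, as a proxy for within-document repetitiveness. Self-BLEU, by
        # contrast, takes one whole generated document as the hypothesis and the
        # other documents in the corpus as references.
        smooth = SmoothingFunction().method1
        scores = []
        for i, hypothesis in enumerate(paragraphs):
            references = [p.split() for j, p in enumerate(paragraphs) if j != i]
            scores.append(sentence_bleu(references, hypothesis.split(),
                                        smoothing_function=smooth))
        return sum(scores) / len(scores)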

References

  1. Beltagy, I., Peters, M.E., Cohan, A.: Longformer: the long-document transformer. arXiv preprint arXiv:2004.05150 (2020)

  2. Lin, Z., Riedl, M.O.: Plug-and-blend: a framework for plug-and-play controllable story generation with sketches. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 17, No. 1, pp. 58–65 (2021)

  3. Fan, A., Lewis, M., Dauphin, Y.: Hierarchical neural story generation. arXiv preprint arXiv:1805.04833 (2018)

  4. OpenAI: GPT-4 technical report (2023)

  5. Peng, N., Ghazvininejad, M., May, J., Knight, K.: Towards controllable story generation. In: Proceedings of the First Workshop on Storytelling, pp. 43–49 (2018)

  6. Rashkin, H., Celikyilmaz, A., Choi, Y., Gao, J.: PlotMachines: outline-conditioned generation with dynamic plot state tracking. arXiv preprint arXiv:2004.14967 (2020)

  7. Yang, K., Peng, N., Tian, Y., Klein, D.: Re3: generating longer stories with recursive reprompting and revision. arXiv preprint arXiv:2210.06774 (2022)

  8. Tang, C., Lin, C., Huang, H., Guerin, F., Zhang, Z.: EtriCA: event-triggered context-aware story generation augmented by cross attention. arXiv preprint arXiv:2210.12463 (2022)

  9. Brown, T., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901 (2020)

  10. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)

  11. Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. In: Text Mining: Applications and Theory, pp. 1–20 (2010)

  12. Wang, S., Durrett, G., Erk, K.: Narrative interpolation for generating and understanding stories. arXiv preprint arXiv:2008.07466 (2020)

  13. Yang, K., Klein, D., Peng, N., Tian, Y.: DOC: improving long story coherence with detailed outline control. arXiv preprint arXiv:2212.10077 (2022)

  14. Kryściński, W., Rajani, N., Agarwal, D., Xiong, C., Radev, D.: BookSum: a collection of datasets for long-form narrative summarization. arXiv preprint arXiv:2105.08209 (2021)

  15. Yao, L., Peng, N., Weischedel, R., Knight, K., Zhao, D., Yan, R.: Plan-and-write: towards better automatic storytelling. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 7378–7385 (2019)

  16. Alabdulkarim, A., Li, W., Martin, L.J., Riedl, M.O.: Goal-directed story generation: augmenting generative language models with reinforcement learning. arXiv preprint arXiv:2112.08593 (2021)

  17. Pradyumna, T., Murtaza, D., Lara, J.M., Mehta, A., Harrison, B.: Controllable neural story plot generation via reward shaping. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 5982–5988 (2019)

  18. Guan, J., Huang, F., Zhao, Z., Zhu, X., Huang, M.: A knowledge-enhanced pretraining model for commonsense story generation. Trans. Assoc. Comput. Linguist. 8, 93–108 (2020)

  19. Peng, X., Li, S., Wiegreffe, S., Riedl, M.: Inferring the reader: guiding automated story generation with commonsense reasoning. arXiv preprint arXiv:2105.01311 (2021)

  20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  21. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)

  22. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)

  23. Safovich, Y., Azaria, A.: Fiction sentence expansion and enhancement via focused objective and novelty curve sampling. In: 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pp. 835–843. IEEE (2020)

  24. Li, J., Bing, L., Qiu, L., Chen, D., Zhao, D., Yan, R.: Learning to write stories with thematic consistency and wording novelty. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 1715–1722 (2019)

  25. Hu, Z., Chan, H.P., Liu, J., Xiao, X., Wu, H., Huang, L.: PLANET: dynamic content planning in autoregressive transformers for long-form text generation. arXiv preprint arXiv:2203.09100 (2022)

  26. Yang, K., Klein, D.: FUDGE: controlled text generation with future discriminators. arXiv preprint arXiv:2104.05218 (2021)

  27. Sakaguchi, K., Bhagavatula, C., Bras, R.L., Tandon, N., Clark, P., Choi, Y.: proScript: partially ordered scripts generation via pre-trained language models. arXiv preprint arXiv:2104.08251 (2021)

  28. Budzianowski, P., Vulić, I.: Hello, it's GPT-2 - how can I help you? Towards the use of pretrained language models for task-oriented dialogue systems. arXiv preprint arXiv:1907.05774 (2019)

  29. Welleck, S., Kulikov, I., Kim, J., Pang, R.Y., Cho, K.: Consistency of a recurrent language model with respect to incomplete decoding. arXiv preprint arXiv:2002.02492 (2020)

  30. Zellers, R., et al.: Defending against neural fake news. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

  31. Guan, J., Mao, X., Fan, C., Liu, Z., Ding, W., Huang, M.: Long text generation by modeling sentence-level and discourse-level coherence. arXiv preprint arXiv:2105.08963 (2021)

  32. Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Advances in Neural Information Processing Systems, vol. 33, pp. 9459–9474 (2020)

  33. McCoy, R.T., Smolensky, P., Linzen, T., Gao, J., Celikyilmaz, A.: How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN. Trans. Assoc. Comput. Linguist. 11, 652–670 (2023)

Acknowledgements

We thank anonymous reviewers for their constructive and insightful comments. K. Jung is with ASRI, Seoul National University, Korea. This work was supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) [No. 2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics, and No. 2021-0-01343, Artificial Intelligence Graduate School Program (Seoul National University)].

Author information

Corresponding author

Correspondence to Kyomin Jung.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Park, K., Yang, N., Jung, K. (2024). LongStory: Coherent, Complete and Length Controlled Long Story Generation. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science (LNAI), vol. 14646. Springer, Singapore. https://doi.org/10.1007/978-981-97-2253-2_15

  • DOI: https://doi.org/10.1007/978-981-97-2253-2_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2252-5

  • Online ISBN: 978-981-97-2253-2

  • eBook Packages: Computer Science, Computer Science (R0)
