Exploiting time series based story plot popularity for movie success prediction

Multimedia Tools and Applications

Abstract

Predicting movie success during the development phase is challenging because very little information is available at that stage. The movie plot is established during development and is a crucial factor in determining a movie's success. In this research, novel time series based features are proposed for story popularity in order to predict movie success accurately. Multiple time series are generated to represent the sentiment of a story and its plot topics; together these are termed story popularity. A hybrid voting-based classifier is built from Gradient Boosting, Random Forest, Support Vector Machine, Multilayer Perceptron, and Deep Learning classifiers to forecast the success of movies in the development phase. Compared with the state of the art, the proposed features improve accuracy by 11.9% and achieve an F1 score of 75.1%. This study also conducts experiments that highlight the importance of story popularity at release time in relation to movie success.
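The voting step of such a hybrid classifier can be illustrated with a minimal sketch. It assumes hard (majority) voting over binary success labels; the abstract does not say whether the authors use hard or soft voting. The function name `majority_vote`, the classifier keys, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def majority_vote(predictions: Dict[str, Sequence[int]]) -> List[int]:
    """Combine per-classifier label predictions by hard (majority) voting.

    Ties are broken in favour of the smallest label, an arbitrary choice
    made only for this illustration.
    """
    if not predictions:
        raise ValueError("at least one classifier's predictions are required")
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    # zip(*...) yields, for each sample, one vote per classifier.
    for votes in zip(*predictions.values()):
        counts = Counter(votes)
        top = max(counts.values())
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


# Made-up predictions (1 = success, 0 = failure) for four hypothetical movies.
toy_predictions = {
    "gradient_boosting": [1, 0, 1, 0],
    "random_forest": [1, 0, 0, 0],
    "svm": [0, 0, 1, 1],
    "mlp": [1, 1, 1, 0],
    "deep_network": [1, 0, 0, 1],
}
ensemble_labels = majority_vote(toy_predictions)
print(ensemble_labels)
```

In practice each list would come from a trained model's predictions on the same held-out movies. A soft-voting variant would average predicted probabilities rather than count labels.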

Notes

  1. https://www.scriptbook.io/

  2. https://scikit-learn.org/stable/supervised_learning.html

References

  1. Abidi SMR, Xu Y, Ni J, Wang X, Zhang W (2020) Popularity prediction of movies: from statistical modeling to machine learning techniques. Multimedia Tools and Applications, pp 1–35

  2. Ahmad IS, Bakar AA, Yaakub MR (2020) Movie revenue prediction based on purchase intention mining using youtube trailer reviews. Inf Process Manag 57(5):102278

  3. Ahmed U, Waqas H, Afzal MT (2020) Pre-production box-office success quotient forecasting. Soft Comput 24(9):6635–6653

  4. Banik R (2017) The movies dataset, dataset on kaggle. Version 7:3

  5. Basiri ME, Nemati S, Abdar M, Cambria E, Acharya UR (2021) Abcdm: an attention-based bidirectional cnn-rnn deep model for sentiment analysis. Futur Gener Comput Syst 115:279–294

  6. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. Journal of Machine Learning Research 3(Jan):993–1022

  7. Chaturvedi S, Srivastava S, Roth D (2018) Where have i heard this story before? identifying narrative similarity in movie remakes. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Vol 2 (Short Papers). pp 673–678

  8. Chen K, Franko K, Sang R (2021) Structured model pruning of convolutional networks on tensor processing units. arXiv:2107.04191

  9. Chow PS (2020) Ghost in the (hollywood) machine: emergent applications of artificial intelligence...

  10. Dashtipour K, Gogate M, Li J, Jiang F, Kong B, Hussain A (2020) A hybrid persian sentiment analysis framework: integrating dependency grammar based rules and deep neural networks. Neurocomputing 380:1–10

  11. Dora L, Agrawal S, Panda R, Abraham A (2018) Nested cross-validation based adaptive sparse representation algorithm and its application to pathological brain classification. Expert Systems with Applications 114:313–321

  12. Eliashberg J, Hui S, Zhang S (2010) Green-lighting movie scripts: revenue forecasting and risk management. Ph.D. thesis, University of Pennsylvania

  13. Eliashberg J, Hui SK, Zhang ZJ (2014) Assessing box office performance using movie scripts: a kernel-based approach. IEEE Trans Knowl Data Eng 26(11):2639–2648

  14. Ertugrul AM, Karagoz P (2018) Movie genre classification from plot summaries using bidirectional lstm. In: 2018 IEEE 12th International Conference on Semantic Computing ICSC, IEEE. pp 248–251

  15. Fathalla A, Salah A, Li K, Li K, Francesco P (2020) Deep end-to-end learning for price prediction of second-hand items. Knowledge and Information Systems, pp 1–28

  16. Franses PH (2021) Modeling box office revenues of motion pictures. Technological Forecasting and Social Change 169:120812

  17. Gao Z, Malic V, Ma S, Shih P (2019) How to make a successful movie: factor analysis from both financial and critical perspectives. In: International Conference on Information, Springer. pp 669–678

  18. Gorinski PJ, Lapata M (2018) What’s this movie about? a joint neural network architecture for movie content analysis. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Vol 1 (Long Papers), pp 1770–1781

  19. Gross JA, Roberson WC, Foley-Cox JB (2021) Cs 230: film success prediction using nlp techniques

  20. Hunter I, David S, Smith S, Singh S (2016) Predicting box office from the screenplay: a text analytical approach. J Screenwriting 7(2):135–154

  21. Hutto CJ, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth International AAAI Conference on Weblogs and Social Media

  22. Hyndman RJ, Athanasopoulos G (2018) Forecasting: principles and practice. OTexts

  23. Jabrayilzade E, Arslan AP, Para H, Polatbilek O, Sezerer E, Tekir S (2020) A turkish topic modeling dataset for multi-label classification of movie genre. In: 2020 28th Signal Processing and Communications Applications Conference (SIU), IEEE. pp 1–5

  24. Kim DH (2021) What types of films are successful at the box office? predicting opening weekend and non-opening gross earnings of films. Journal of Media Business Studies, pp 1–21

  25. Kim T, Hong J, Kang P (2015) Box office forecasting using machine learning algorithms based on sns data. Int J Forecast 31(2):364–390

  26. Kim Y-J, Lee J-H, Cheong Y-G (2019) Prediction of a movie’s success from plot summaries using deep learning models. ACL 2019:127

  27. Lash MT, Zhao K (2016) Early predictions of movie success: the who, what, and when of profitability. J Manag Inf Syst 33(3):874–903

  28. Lee O-J, Jung JJ (2018) Explainable movie recommendation systems by using story-based similarity. In: IUI Workshops

  29. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press

  30. Moon S, Jalali N, Song R (2022) Green-lighting scripts in the movie pre-production stage: an application of consumption experience carryover theory. Journal of Business Research

  31. Mun MK, Chong CW (2018) Forecasting movie demand using total and split exponential smoothing. Jurnal Ekonomi Malaysia 52(2):81–94

  32. Mundra S, Dhingra A, Kapur A, Joshi D (2019) Prediction of a movie’s success using data mining techniques. In: Information and Communication Technology for Intelligent Systems, Springer. pp 219–227

  33. Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: ICML

  34. Nawar A, Toma NT, Mamun S, Kaiser MS, Mahmud M, Rahman MA (2021) Cross-content recommendation between movie and book using machine learning. In: 2021 IEEE 15th International Conference on Application of Information and Communication Technologies (AICT), IEEE. pp 1–6

  35. Parvandeh S, Yeh H-W, Paulus MP, McKinney B (2020) Consensus features nested cross-validation. bioRxiv, pp 2019–12

  36. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. Journal of Machine Learning Research 12(Oct):2825–2830

  37. Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. arXiv:1802.05365

  38. Rasmussen NV (2020) Data, camera, action: how algorithms are shaking up european screen production. AoIR Selected Papers of Internet Research

  39. Razeen F, Sankar S, Banu WA, Magesh S (2021) Predicting movie success using regression techniques. In: Intelligent Computing and Applications, Springer. pp 657–670

  40. Redfern N (2012) Genre trends at the us box office, 1991 to 2010. Eur J of Am Cult 31(2):145–167

  41. Ru Y, Li B, Liu J, Chai J (2018) An effective daily box office prediction model based on deep neural networks. Cogn Syst Res 52:182–191

  42. Ryoo JH, Wang X, Lu S (2021) Do spoilers really spoil? using topic modeling to measure the effect of spoiler reviews on box office revenue. J Mark 85(2):70–88

  43. Sarimax: Introduction (2020) https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_stata.html Accessed: 2020-02-30

  44. Seabold S, Perktold J (2010) Statsmodels: econometric and statistical modeling with python. In: Proceedings of the 9th Python in Science Conference. vol 57, Austin, TX. pp 61

  45. Silver-Lasky P (2004) Screenwriting for the 21st century. Sterling Publishing Company

  46. Usero B, Hernández V, Quintana C (2022) Social media mining for business intelligence analytics: an application for movie box office forecasting. In: Intelligent Computing, Springer. pp 981–999

  47. van Gerven M, Bohte S (2018) Artificial neural networks as models of neural information processing. Frontiers Media SA

  48. Wang Z, Zhang J, Ji S, Meng C, Li T, Zheng Y (2020) Predicting and ranking box office revenue of movies based on big data. Information Fusion

  49. Wei WW (2006) Time series analysis. In: The Oxford Handbook of Quantitative Methods in Psychology: vol 2

  50. Where data and the movie business meet (2020) https://www.the-numbers.com/, Accessed: 2020-02-30

  51. Xu L, Yu X, Gulliver TA (2021) Intelligent outage probability prediction for mobile iot networks based on an igwo-elman neural network. IEEE Trans Veh Technol 70(2):1365–1375

  52. Zhang C, Tian Y-X, Fan Z-P (2021) Forecasting the box offices of movies coming soon using social media analysis: a method based on improved bass models. Expert Systems with Applications, p 116241

  53. Zhou Q, Han R, Li T, Xia B (2019) Joint prediction of time series data in inventory management. Knowl and Inf Syst 61(2):905–929

Author information

Contributions

Muzammil Hussain Shahid: Conceptualization, Methodology, Writing (original draft preparation), Software, Validation, Formal Analysis, Data Curation. Muhammad Arshad Islam: Conceptualization, Supervision, Writing (review and editing). Mirza Beg: Conceptualization, Supervision, Writing (review and editing).

Corresponding author

Correspondence to Muzammil Hussain Shahid.

Ethics declarations

Conflicts of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Availability of Data and Material

The data that support the findings of this study are available on Kaggle [4]. Time series based data were derived from “the-numbers” [50], a resource available in the public domain.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

“Any great film is always driven by script, script, script.” Ridley Scott.

About this article

Cite this article

Shahid, M.H., Islam, M.A. & Beg, M. Exploiting time series based story plot popularity for movie success prediction. Multimed Tools Appl 82, 3509–3534 (2023). https://doi.org/10.1007/s11042-022-13219-x

  • DOI: https://doi.org/10.1007/s11042-022-13219-x
