Journal of Revenue and Pricing Management, Volume 16, Issue 3, pp 295–307

Are box office revenues equally unpredictable for all movies? Evidence from a Random forest-based model

Research Article


In this study we develop a model for early forecasting of box office receipts that, in addition to traditionally used regressors, uses several inputs that have never been used before but proved to be very useful predictors according to our variable importance analysis. The new predictors account for the power of actors and directors, as well as for the intensity of competition at the time of a movie's release. Instead of the Motion Picture Association of America (MPAA) ratings commonly used in movie success prediction, we formalized the textual information about the reasons for giving a movie its MPAA rating using word frequency and principal component analyses. The expert system is based on the Random forest algorithm, which outperformed a stepwise regression and a multilayer perceptron neural network. A regression tree-based diagnostic approach allowed us to detect heterogeneity in model accuracy across segments of the data and to assess the applicability of the model to different movie types.
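The diagnostic idea described above, checking whether forecast accuracy differs across segments of movies, can be illustrated with the simplest possible regression tree: a single split fitted to forecast errors. The sketch below is a minimal pure-Python illustration and not the authors' implementation. The function names `sse` and `best_split`, the choice of budget as the segmenting feature, and every number in the data are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(feature, errors):
    """Depth-one regression tree on forecast errors.

    Tries every midpoint between consecutive distinct feature values and
    returns (cost, threshold, mean_error_below, mean_error_above) for the
    split with the smallest total within-segment sum of squares.
    Returns None if the feature has no two distinct values.
    """
    pairs = sorted(zip(feature, errors))
    best = None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical feature values
        threshold = (lo + hi) / 2
        left = [e for _, e in pairs[:i]]
        right = [e for _, e in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    return best


# Hypothetical illustration data, NOT taken from the paper:
# production budget (million USD) and absolute percentage error of a forecast.
budgets = [2, 5, 8, 40, 60, 90, 150, 200]
ape = [0.95, 0.80, 0.88, 0.42, 0.35, 0.40, 0.30, 0.33]

cost, threshold, err_low, err_high = best_split(budgets, ape)
print(f"split at budget = {threshold}")
print(f"mean error below split: {err_low:.2f}, above split: {err_high:.2f}")
```

In this made-up data the split isolates low-budget movies as the segment with markedly larger errors. A full regression tree would apply the same search recursively and over several candidate features.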


data mining; sales forecasting; nonparametric methods; regression; neural networks; random forest



Copyright information

© Macmillan Publishers Ltd 2016

Authors and Affiliations

  1. National Research University Higher School of Economics, Saint Petersburg, Russia
