Movie films consumption in Brazil: an analysis of support vector machine classification

  • Marislei Nishijima
  • Nathalia Nieuwenhoff
  • Ricardo Pires
  • Patrícia R. Oliveira
Open Forum


We employ the support vector machine (SVM) classifier with several types of kernels to investigate whether observable characteristics of individuals and their households can describe their decision to watch films at theaters in Brazil. Using a large dataset of 340,000 individuals living in the metropolitan areas of a large developing economy, we carry out a Knowledge Discovery in Databases (KDD) process to classify film consumers, which correctly classifies 80% of instances. To reduce the degrees of freedom of the SVM and to identify the key determinants of film consumption, we apply Linear Discriminant Analysis (LDA). The main individual characteristics are age, education (which overlaps with being a student), income, and preferences for cultural goods. The main geographic characteristics are the timing of the sample, population concentration, and the supply of movie theaters. The results point to an ineffective policy for the sector during the period investigated.
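The idea of using a linear discriminant to rank determinants can be illustrated with a minimal two-class Fisher discriminant. This is only a sketch under stated assumptions, not the authors' pipeline: the data are synthetic, the feature names (`age`, `income`, `noise`) and group sizes are hypothetical, and all features are drawn with unit variance so that the magnitudes of the weights are roughly comparable.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter_matrix(rows, mu):
    """Sum of outer products of (row - mu): the within-class scatter."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fisher_direction(class0, class1):
    """Fisher's discriminant direction w = Sw^{-1} (mu1 - mu0)."""
    mu0, mu1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, mu0), scatter_matrix(class1, mu1)
    d = len(mu0)
    sw = [[s0[a][b] + s1[a][b] for b in range(d)] for a in range(d)]
    diff = [mu1[j] - mu0[j] for j in range(d)]
    return solve(sw, diff)

# Synthetic, hypothetical data: each row is [age, income, noise].
rng = random.Random(0)
FEATURES = ["age", "income", "noise"]
non_goers = [[rng.gauss(0.5, 1), rng.gauss(-1, 1), rng.gauss(0, 1)]
             for _ in range(500)]
goers = [[rng.gauss(-0.5, 1), rng.gauss(1, 1), rng.gauss(0, 1)]
         for _ in range(500)]

w = fisher_direction(non_goers, goers)

# Rank features by the absolute size of their discriminant weight.
ranking = sorted(zip(FEATURES, w), key=lambda p: abs(p[1]), reverse=True)
for name, weight in ranking:
    print(name, weight)
```

In this toy setting the feature with the largest group separation receives the largest weight in absolute value, which is the sense in which a discriminant direction can point to key determinants. In a real study the selected variables would then be passed to the SVM, and unscaled features would need standardizing before weights are compared.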


Film at theaters, SVM, LDA, KDD, Classification, Consumers, Individual data




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Institute of International Relations, University of Sao Paulo, São Paulo, Brazil
  2. School of Arts, Sciences and Humanities, University of Sao Paulo, Sao Paulo, Brazil
  3. Federal Institute of Sao Paulo, Sao Paulo, Brazil
