Machine learning in films: an approach towards automation in film censoring

  • Original Article
  • Journal of Data, Information and Management

Abstract

In most countries, a film must receive permission from the censor board before it can be shown in a theatre. The censor board evaluates the film, checks it for inappropriate content, and assigns it a rating. At present, human reviewers watch each film and rate it according to how appropriate it would be for a general audience. Given the growing range of applications of image and text processing, this paper presents how machine learning can combine these two technologies to rate films without human reviewers. The goal is to develop an efficient and accurate automated film censoring system that detects explicit language and inappropriate visual content in a film and assigns a rating such as ‘Universal’, ‘Universal Adult’ or ‘Adult’ based on the density of sensitive visual content and inappropriate language. The result is a computer application that accepts a movie file as input and produces the film's rating as output. This technology could be extended into a mobile application that accepts videos stored on a mobile phone and assigns a rating to them.
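As a rough illustration of the final rating step, the sketch below maps the density of flagged frames and flagged words to one of the three ratings named in the abstract. It is not the authors' method: the abstract gives no numeric cut-offs or combination rule, so the thresholds, the `FilmCounts` structure, and the choice to rate on the higher of the two densities are all assumptions made for this example. The detection stages that would produce the counts are outside its scope.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the abstract does not state numeric cut-offs.
ADULT_THRESHOLD = 0.05
UNIVERSAL_ADULT_THRESHOLD = 0.01


@dataclass
class FilmCounts:
    """Counts assumed to come from upstream image and text detectors."""
    total_frames: int
    flagged_frames: int
    total_words: int
    flagged_words: int


def density(flagged: int, total: int) -> float:
    """Fraction of flagged items; 0.0 when there is nothing to measure."""
    return flagged / total if total > 0 else 0.0


def rate_film(counts: FilmCounts) -> str:
    """Assign a rating from the higher of the visual and language densities.

    Taking the maximum is an assumption of this sketch, not a rule
    stated in the abstract.
    """
    visual = density(counts.flagged_frames, counts.total_frames)
    language = density(counts.flagged_words, counts.total_words)
    worst = max(visual, language)
    if worst >= ADULT_THRESHOLD:
        return "Adult"
    if worst >= UNIVERSAL_ADULT_THRESHOLD:
        return "Universal Adult"
    return "Universal"
```

For instance, under these assumed thresholds a film with 20 flagged frames out of 1000 and no flagged words would be rated ‘Universal Adult’.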

Data Availability

All relevant data and material are presented in the main paper.

Acknowledgements

The authors are grateful to the Department of Information and Communication Technology, DAIICT, SRM University, and Pandit Deendayal Petroleum University for permission to publish this research.

Author information

Contributions

All the authors made substantial contributions to this manuscript. KJ, MC, HP and MS participated in drafting the manuscript. KJ and MC wrote the main manuscript. All the authors discussed the results and their implications for the manuscript at all stages.

Corresponding author

Correspondence to Manan Shah.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jani, K., Chaudhuri, M., Patel, H. et al. Machine learning in films: an approach towards automation in film censoring. J. of Data, Inf. and Manag. 2, 55–64 (2020). https://doi.org/10.1007/s42488-019-00016-9

  • DOI: https://doi.org/10.1007/s42488-019-00016-9
