Method for Pornography Filtering in the WEB Based on Automatic Classification and Natural Language Processing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Abstract

The paper presents a method for detecting pornography in web pages based on natural language processing. The described classification method uses a feature set of single words and word groups. Syntactic analysis is performed to extract collocations. A modification of TF-IDF is used to weight terms. An evaluation and comparison of classification quality and performance are given.
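
The feature weighting mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions that do not come from the paper: the abstract does not specify the authors' TF-IDF modification, so plain TF-IDF is used here, and adjacent word pairs stand in for the collocations that the paper extracts by syntax analysis. The function names `extract_terms` and `tfidf_weights` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def extract_terms(tokens):
    """Return single words plus adjacent word pairs.

    Assumption: adjacent pairs are only a stand-in for the
    syntax-based collocations described in the abstract.
    """
    pairs = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + pairs


def tfidf_weights(documents):
    """Compute standard (unmodified) TF-IDF weights per document.

    `documents` is a list of token lists. The result is one dict per
    document that maps each term to its weight.
    """
    term_lists = [extract_terms(doc) for doc in documents]
    n_docs = len(term_lists)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))

    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized documents (hypothetical data).
docs = [
    ["free", "adult", "video"],
    ["free", "video", "lecture"],
    ["adult", "education", "course"],
]
weights = tfidf_weights(docs)
```

In this toy corpus the pair "free adult" occurs in one document only, so it receives a higher weight than the single word "free", which occurs in two. This is the kind of discriminative signal that word groups can add over single words.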

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Suvorov, R., Sochenkov, I., Tikhomirov, I. (2013). Method for Pornography Filtering in the WEB Based on Automatic Classification and Natural Language Processing. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_31

  • DOI: https://doi.org/10.1007/978-3-319-01931-4_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01930-7

  • Online ISBN: 978-3-319-01931-4

  • eBook Packages: Computer Science, Computer Science (R0)
