Abstract
Many people now use communication tools on the Web, but some users send harmful information to others. Most operators deal with harmful information manually, which is expensive. In this paper, we implement two-word co-occurrence filtering by applying the Bayesian filtering method used in spam filters. We propose grouping co-occurrence filtering based on Bayesian filtering and verify our approach experimentally. Grouping co-occurrence filtering detects harmful or safe documents at low cost. Our results suggest that grouping co-occurrence filtering is more stable and more accurate than co-occurrence filtering based on Bayesian filtering.
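As a rough illustration of the two-word co-occurrence idea, the sketch below scores a document with a naive Bayes model whose features are unordered word pairs rather than single words. It is not the authors' implementation: the class name `PairBayesFilter`, whitespace tokenization, add-one smoothing, and the labels `"harmful"` and `"safe"` are all assumptions made for this example. The grouping step proposed in the paper is not specified in the abstract, so it is not modeled here.

```python
import math
from collections import Counter
from itertools import combinations


def pair_features(text):
    """Return the set of unordered word pairs that co-occur in one document."""
    # Assumption: simple lowercase whitespace tokenization.
    words = sorted(set(text.lower().split()))
    return set(combinations(words, 2))


class PairBayesFilter:
    """Hypothetical naive Bayes filter over two-word co-occurrence features."""

    def __init__(self):
        self.pair_counts = {"harmful": Counter(), "safe": Counter()}
        self.doc_counts = {"harmful": 0, "safe": 0}

    def train(self, text, label):
        # Count each pair at most once per document.
        self.doc_counts[label] += 1
        self.pair_counts[label].update(pair_features(text))

    def harmful_probability(self, text):
        n_h = self.doc_counts["harmful"]
        n_s = self.doc_counts["safe"]
        # Prior log-odds, with add-one smoothing (an assumption).
        log_odds = math.log((n_h + 1) / (n_s + 1))
        for pair in pair_features(text):
            # Smoothed estimate of P(pair appears | class).
            p_h = (self.pair_counts["harmful"][pair] + 1) / (n_h + 2)
            p_s = (self.pair_counts["safe"][pair] + 1) / (n_s + 2)
            log_odds += math.log(p_h / p_s)
        # Clamp so that exp() cannot overflow on long documents.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Training calls `train(text, label)` once per labeled document, and `harmful_probability(text)` returns a value between 0 and 1. Because the number of word pairs grows quadratically with document length, a practical filter would likely restrict which pairs it keeps.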
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoshimura, T., Fujii, Y., Ito, T. (2012). Grouping Co-occurrence Filtering Based on Bayesian Filtering. In: Jiang, H., Ding, W., Ali, M., Wu, X. (eds) Advanced Research in Applied Artificial Intelligence. IEA/AIE 2012. Lecture Notes in Computer Science, vol 7345. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31087-4_4
DOI: https://doi.org/10.1007/978-3-642-31087-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31086-7
Online ISBN: 978-3-642-31087-4
eBook Packages: Computer Science, Computer Science (R0)