Abstract
People use social, business, and many other online platforms for a wide range of purposes, and a huge amount of data is transferred over networks as a result. The internet has made online communication and business easy and fast, and it is used worldwide. The same technology that serves legitimate purposes is also used for illegal activities such as terrorism, threats, copyright violation, phishing scams, fraud, and spam. Law enforcement agencies and departments are trying to counter these problems with a variety of techniques. This paper presents tools and techniques for detecting such illegal activities on online forums by identifying suspicious discussions, words, users, and groups. It discusses stop-word removal, stemming algorithms, suffix and affix stemmers, emotional algorithms, the Levenshtein algorithm, classification, brute-force algorithms, and some statistical formulas.
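As a rough illustration of how three of the techniques named above could fit together, the following Python sketch removes stop words, strips a few suffixes, and uses Levenshtein distance to match the remaining tokens against a watch list. It is only a minimal sketch under stated assumptions, not the method used in the paper: the word lists, the suffix list, the distance threshold, and the function names `strip_suffix`, `levenshtein`, and `flag_suspicious` are all hypothetical.

```python
import re

# Hypothetical, tiny word lists chosen for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "we", "will"}
SUSPICIOUS_TERMS = ("attack", "bomb", "fraud")
SUFFIXES = ("ing", "ed", "s")  # crude suffix stripping, not the Porter stemmer


def strip_suffix(word):
    """Remove the first matching suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def flag_suspicious(post, max_distance=1):
    """Return (stem, watch-list term) pairs whose edit distance is small."""
    tokens = re.findall(r"[a-z]+", post.lower())
    stems = [strip_suffix(t) for t in tokens if t not in STOP_WORDS]
    return [(stem, term)
            for stem in stems
            for term in SUSPICIOUS_TERMS
            if levenshtein(stem, term) <= max_distance]


print(flag_suspicious("We are planning the atack and bombs"))
```

Matching by edit distance rather than exact equality lets a misspelling such as "atack" still be compared with "attack"; the threshold of 1 is an arbitrary choice for this example and would need tuning on real forum data.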
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
ur Rasheed, H., Khan, F.H., Bashir, S., Fatima, I. (2019). Detecting Suspicious Discussion on Online Forums Using Data Mining. In: Bajwa, I., Kamareddine, F., Costa, A. (eds) Intelligent Technologies and Applications. INTAP 2018. Communications in Computer and Information Science, vol 932. Springer, Singapore. https://doi.org/10.1007/978-981-13-6052-7_23
DOI: https://doi.org/10.1007/978-981-13-6052-7_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6051-0
Online ISBN: 978-981-13-6052-7
eBook Packages: Computer Science, Computer Science (R0)