Abstract
This chapter presents the central theme and the big picture of the methods and technologies covered in this book (see Fig. 2.2). To help readers follow the security and forensics issues presented, together with their associated solutions, the content is organized as components of a forensic analysis framework. The framework analyzes online messages by integrating machine learning algorithms, natural language processing techniques, and social network analysis techniques to support cybercrime investigation.
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Iqbal, F., Debbabi, M., Fung, B.C.M. (2020). Messaging Forensics In Perspective. In: Machine Learning for Authorship Attribution and Cyber Forensics. International Series on Computer Entertainment and Media Technology. Springer, Cham. https://doi.org/10.1007/978-3-030-61675-5_2
DOI: https://doi.org/10.1007/978-3-030-61675-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61674-8
Online ISBN: 978-3-030-61675-5
eBook Packages: Mathematics and Statistics (R0)