Beyond Keyword Filtering for Message and Conversation Detection

  • D. B. Skillicorn
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3495)

Abstract

Keyword filtering is a commonly used way to select, from a set of intercepted messages, those that need further scrutiny. An obvious countermeasure is to replace words that might be on a keyword list with other words. We show that this strategy itself creates a signature in the altered messages that makes them readily detectable using several forms of matrix decomposition. Not only can unusual messages be detected, but sets of related messages can be detected as conversations, even when their endpoints have been obscured (for example, by using transient email addresses or stolen cell phones).
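The abstract does not say which decompositions are used or how messages are scored, so the following is only a minimal sketch of the general idea, assuming a singular value decomposition (SVD) of a message-by-word count matrix and a simple distance-from-origin score. The function name `svd_outlier_scores`, the toy vocabulary, and the counts are all hypothetical and are not taken from the paper.

```python
import numpy as np

def svd_outlier_scores(counts, k=2):
    """Score each message (row) by its distance from the origin
    in the space of the top-k singular components.

    Assumption: this scoring rule is an illustration only, not the
    paper's stated method.
    """
    X = np.asarray(counts, dtype=float)
    X = X - X.mean(axis=0)                 # centre each word column
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :k] * s[:k]              # message coordinates on top-k components
    return np.linalg.norm(coords, axis=1)  # larger = more unusual word usage

# Hypothetical toy data: rows are messages, columns are word counts.
vocab = ["meeting", "report", "lunch", "budget", "flower", "garden"]
messages = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [2, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 3, 2],   # words swapped for ones rarely used elsewhere
])

scores = svd_outlier_scores(messages)
most_unusual = int(np.argmax(scores))
print(most_unusual, np.round(scores, 2))
```

In this toy matrix the last message, the only one that uses the two otherwise unused words, receives the largest score. Real message collections would need a far larger vocabulary and some care over normalisation, which this sketch leaves out.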



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • D. B. Skillicorn, School of Computing, Queen’s University
