Detecting Near-Duplicate SPITs in Voice Mailboxes Using Hashes

  • Ge Zhang
  • Simone Fischer-Hübner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7001)


Spam over Internet Telephony (SPIT) is a threat to the use of Voice over IP (VoIP) systems. In one kind of SPIT, the attacker places unsolicited bulk calls to victims' voice mailboxes and leaves a prepared audio message in each. We detect this threat within a collaborative detection framework by comparing unknown VoIP flows with known SPIT samples, since the same audio message generates VoIP flows with the same flow patterns (e.g., the sequence of packet sizes). In practice, however, these patterns are not exactly identical: (1) a VoIP flow may be unexpectedly altered by network impairments (e.g., delay jitter and packet loss); and (2) a sophisticated SPITer may dynamically generate each flow, for example by employing a Text-To-Speech (TTS) synthesis engine to generate the speech audio instead of using a pre-recorded message. Thus, we measure the similarity among flows using locality-sensitive hash algorithms. A small distance between the hash digest of a flow x and that of a known SPIT sample suggests that flow x probably belongs to the same bulk as the known SPIT. Finally, we experimentally study the detection performance of the hash algorithms.
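
The comparison described above can be illustrated with a minimal sketch. The abstract does not specify the hash algorithms that were evaluated, so the code below substitutes a generic SimHash-style digest over k-grams of the packet-size sequence as a stand-in; the function names, the k-gram length, the 128-bit digest width, and the synthetic packet sizes are all assumptions made for illustration, not the authors' method or data.

```python
import hashlib
import random

DIGEST_BITS = 128  # assumed digest width; matches the 16-byte BLAKE2b output below


def flow_digest(packet_sizes, k=4):
    """SimHash-style digest of a flow (illustrative stand-in, not the paper's hash).

    Every run of k consecutive packet sizes votes +1 or -1 on each digest bit,
    so flows that share most of their k-grams end up with similar digests.
    """
    votes = [0] * DIGEST_BITS
    for i in range(len(packet_sizes) - k + 1):
        shingle = ",".join(map(str, packet_sizes[i:i + k])).encode()
        h = int.from_bytes(hashlib.blake2b(shingle, digest_size=16).digest(), "big")
        for bit in range(DIGEST_BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    digest = 0
    for bit, vote in enumerate(votes):
        if vote > 0:
            digest |= 1 << bit
    return digest


def hamming(a, b):
    """Number of bit positions in which two digests differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    rng = random.Random(7)
    # Synthetic packet sizes in bytes (hypothetical values, not measured traffic).
    known_spit = [rng.randint(20, 80) for _ in range(400)]
    # Same message with four packets lost in transit.
    lossy_copy = [s for i, s in enumerate(known_spit) if i % 100 != 50]
    # An unrelated call of the same length.
    unrelated = [rng.randint(20, 80) for _ in range(400)]

    reference = flow_digest(known_spit)
    print("distance to lossy copy:", hamming(reference, flow_digest(lossy_copy)))
    print("distance to unrelated flow:", hamming(reference, flow_digest(unrelated)))
```

In a sketch like this, a flow would be flagged when its distance to a known SPIT digest falls below a threshold; choosing that threshold trades false positives against false negatives, which is the kind of detection performance the experimental study examines.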


Packet Size, Short Message Service, Session Initiation Protocol, Packet Loss Rate, Equal Error Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ge Zhang (1)
  • Simone Fischer-Hübner (1)
  1. Karlstad University, Karlstad, Sweden
