Collusion Set Detection Through Outlier Discovery

  • Vandana P. Janeja
  • Vijayalakshmi Atluri
  • Jaideep Vaidya
  • Nabil R. Adam
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3495)


The ability to identify collusive malicious behavior is critical in today’s security environment. We pose the general problem of Collusion Set Detection (CSD): identifying sets of behavior that together satisfy some notion of “interesting behavior”. In this paper, we focus on a subset of the problem (called CSD′) by restricting our attention to outliers. In developing the solution, we make the following novel research contributions. First, we propose a suitable distance metric, called the collusion distance metric, and formally prove that it is indeed a distance metric. Using it, we propose a collusion distance-based outlier detection (CDB) algorithm that is capable of identifying the causal dimensions (n) responsible for the outlierness, and demonstrate that it improves both precision and recall when compared to Euclidean distance-based outlier detection. Second, we propose a solution to the CSD′ problem, which relies on the semantic relationships among the causal dimensions.
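
For orientation, the Euclidean distance-based baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of generic distance-based outlier detection, not the CDB algorithm or the collusion distance metric, neither of which is defined in the abstract. The function names and the `radius` and `min_fraction` parameters are assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_based_outliers(points, radius, min_fraction):
    """Return indices of points with too few neighbors within `radius`.

    A point is flagged when fewer than `min_fraction` of the other
    points lie within `radius` of it (hypothetical parameter names).
    """
    n = len(points)
    outliers = []
    for i, p in enumerate(points):
        # Count the other points that fall inside the radius around p.
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and euclidean(p, q) <= radius
        )
        if neighbors < min_fraction * (n - 1):
            outliers.append(i)
    return outliers


# Four tightly clustered points and one distant point (made-up data).
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]
print(distance_based_outliers(points, radius=1.0, min_fraction=0.5))
```

A detector of this kind reports which points are outliers but not which dimensions cause the outlierness, which is the gap the abstract says the CDB algorithm addresses.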


Keywords: Outlier Detection; Semantic Relationship; Local Outlier; Causal Dimension; Outlier Detection Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Vandana P. Janeja (1)
  • Vijayalakshmi Atluri (1)
  • Jaideep Vaidya (1)
  • Nabil R. Adam (1)
  1. MSIS Department and CIMIC, Rutgers University
