A Cross Datasets Referring Outlier Detection Model Applied to Suspicious Financial Transaction Discrimination

  • Tang Jun
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3917)


Outlier detection is a key element of intelligent financial surveillance systems, which aim to identify fraud and money laundering by discovering unusual patterns in customer behaviour. Detection procedures generally fall into two categories: comparing every transaction against its account history and, furthermore, comparing it against a peer group to determine whether the behaviour is unusual. The latter approach shows particular merit in efficiently extracting suspicious transactions and reducing the false positive rate. Peer group analysis depends largely on a cross-dataset outlier detection model. In this paper, we propose a cross-outlier detection model based on a distance definition that incorporates the features of financial transaction data. An approximation algorithm accompanying the model is provided to optimize the computation of the deviation of a tested data point from the reference dataset. An experiment on real bank data blended with synthetic outlier cases shows promising results: the model reduces the false positive rate while remarkably enhancing the discriminative rate.
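
The abstract does not give the paper's distance definition or its approximation algorithm, so the following is only a minimal sketch of the general idea of cross-dataset outlier detection, not the author's model. It assumes a plain Euclidean distance, a k-th nearest neighbour score, and a fixed threshold; the feature choice, the function names `kth_nearest_distance` and `cross_outlier_flags`, and the sample values are all hypothetical.

```python
import math


def kth_nearest_distance(point, reference, k):
    """Distance from `point` to its k-th nearest neighbour in `reference`."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the size of the reference set")
    distances = sorted(math.dist(point, ref) for ref in reference)
    return distances[k - 1]


def cross_outlier_flags(tested, reference, k=3, threshold=2.0):
    """Flag each tested point whose k-th nearest reference distance exceeds `threshold`.

    Assumption: a simple distance-based rule standing in for the paper's model.
    """
    return [kth_nearest_distance(p, reference, k) > threshold for p in tested]


# Hypothetical features per transaction: (log of amount, transactions per week).
peer_group = [(4.0, 2.0), (4.2, 3.0), (3.9, 2.5), (4.1, 2.2), (4.3, 2.8)]
account = [(4.1, 2.4), (9.0, 15.0)]

flags = cross_outlier_flags(account, peer_group, k=3, threshold=2.0)
print(flags)
```

Here the tested transactions come from one account and the reference dataset from its peer group, so a point is judged against other customers rather than against its own history. The brute-force scan over the reference set is the part an approximation algorithm would aim to speed up.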


False Positive Rate, Outlier Detection, Money Laundering, Account History, Suspicious Transaction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Tang Jun (1, 2)
  1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
  2. School of Information, Zhongnan University of Economics and Law, Wuhan, China
