Abstract
Data mining is one of the most effective methods for fraud detection. This is highlighted by the 25% of organizations that have suffered from economic crime [1]. This paper presents a case study using real-world data from a large retail company. We identify symptoms of fraud by looking for outliers. To identify both the outliers and the context in which they appear, we learn a regression tree. For a given node, we identify the outliers using the set of examples covered at that node, and we take the context to be the conjunction of the conditions on the path from the root to that node. Surprisingly, we observe that, at different nodes of the tree, some outliers disappear and new ones appear. From the business point of view, the outliers detected near the leaves of the tree are the most suspicious ones. These cases are difficult to detect because they are observed only in a given context, defined by the set of rules associated with the node.
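The node-wise procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy variance-reduction split, the boxplot (1.5 × IQR) outlier rule, the toy data, and the names `big_store` and `refund` are all assumptions made for illustration only.

```python
from statistics import mean, quantiles


def sse(values):
    """Sum of squared errors around the mean (regression-tree impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, features):
    """Return the (feature, threshold) that most reduces SSE, or None."""
    best, best_score = None, sse([r[target] for r in rows])
    for f in features:
        # Candidate thresholds: every observed value except the largest.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def iqr_outliers(rows, target, k=1.5):
    """Assumed rule: flag rows outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[target] for r in rows]
    if len(values) < 4:  # too few examples for meaningful quartiles
        return []
    q1, _, q3 = quantiles(values, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if not lo <= r[target] <= hi]


def walk(rows, target, features, path=(), depth=0, max_depth=2):
    """Yield (context, outliers) for every node of a greedily grown tree.

    The context is the conjunction of the conditions from the root to the node.
    """
    yield " AND ".join(path) or "root", iqr_outliers(rows, target)
    split = best_split(rows, target, features) if depth < max_depth else None
    if split is None:
        return
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    yield from walk(left, target, features, path + (f"{f} <= {t}",), depth + 1, max_depth)
    yield from walk(right, target, features, path + (f"{f} > {t}",), depth + 1, max_depth)


# Hypothetical toy data: small stores have low refunds, big stores high ones.
rows = [{"big_store": 0, "refund": v} for v in [10, 11, 12, 13, 14, 15, 16, 17, 30]]
rows += [{"big_store": 1, "refund": v} for v in range(100, 108)]

report = dict(walk(rows, "refund", ["big_store"]))
for context, outliers in report.items():
    print(context, [r["refund"] for r in outliers])
```

In this toy data the refund of 30 is unremarkable at the root, where it sits between the two store groups, and is flagged only in the context `big_store <= 0`. This mirrors the observation that outliers found deeper in the tree are visible only under the rules associated with their node.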
References
Skalak, S.: Global economic crime survey. Technical report, PwC (2014)
Jans, M., Lybaert, N., Vanhoof, K.: Data mining for fraud detection: toward an improvement on internal control systems? In: 30th Annual Congress European Accounting Association (EAA 2007)
Coderre, D.: Computer-Aided Fraud Prevention & Detection. Wiley, Hoboken (2009)
Torgo, L.: Data Mining with R: Learning with Case Studies, 1st edn. Chapman & Hall/CRC, Boca Raton (2010)
Bates, A.: Fraud risk management: developing a strategy for prevention, detection, and response. Technical report, KPMG Advisory Forensic (2006)
Stulb, D., Remnitz, D.: Big risks require big data thinking: global forensic data analytics survey 2014. Technical report, EY (2014)
Singh, K., Upadhyaya, S.: Outlier detection: applications and techniques. Int. J. Comput. Sci. Issues 9(3), 307–323 (2012)
Kristin, R.N., Matkovsky, I.P.: Using data mining techniques for fraud detection. Technical report, SAS Institute Inc. and Federal Data Corporation (1999)
Phua, C., Lee, V.C.S., Smith-Miles, K., Gayler, R.W.: A comprehensive survey of data mining-based fraud detection research. CoRR abs/1009.6119 (2010)
Hawkins, D.: Identification of Outliers. Monographs on Applied Probability and Statistics. Chapman & Hall, New York (1980)
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. 41(3), 1–58 (2009)
Gupta, M., Gao, J., Aggarwal, C.C., Han, J.: Outlier detection for temporal data: a survey. IEEE Trans. Knowl. Data Eng. 26(9), 2250–2267 (2014)
Anglia Ruskin University: NuMBerS: numerical methods for biosciences students. http://web.anglia.ac.uk/numbers/. Accessed 02 May 2016
Wells, J.T.: Corporate Fraud Handbook: Prevention and Detection, 2nd edn. Wiley, Hoboken (2007)
Gama, J., Carvalho, A., Faceli, K., Lorena, C., Oliveira, M.: Extração de Conhecimento de Dados - Data Mining, 1st edn. Silabo (2012)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman & Hall, New York (1984)
Therneau, T., Atkinson, B., Ripley, B.: rpart: Recursive Partitioning and Regression Trees. R package version 4.1-10 (2015)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2016)
Acknowledgments
This work was supported by the research project TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020, financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund, and by the European Commission through the project MAESTRA (ICT-2013-612944).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Ribeiro, R.P., Oliveira, R., Gama, J. (2016). Detection of Fraud Symptoms in the Retail Industry. In: Montes y Gómez, M., Escalante, H., Segura, A., Murillo, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2016. IBERAMIA 2016. Lecture Notes in Computer Science, vol 10022. Springer, Cham. https://doi.org/10.1007/978-3-319-47955-2_16
DOI: https://doi.org/10.1007/978-3-319-47955-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47954-5
Online ISBN: 978-3-319-47955-2
eBook Packages: Computer Science, Computer Science (R0)