Detection of Fraud Symptoms in the Retail Industry

Ribeiro, Rita P.; Oliveira, Ricardo; Gama, João

doi:10.1007/978-3-319-47955-2_16

Conference paper
First Online: 14 October 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10022)

Abstract

Data mining is one of the most effective methods for fraud detection. This is highlighted by 25% of organizations that have suffered from economic crimes [1]. This paper presents a case study using real-world data from a large retail company. We identify symptoms of fraud by looking for outliers. To identify the outliers and the context in which they appear, we learn a regression tree. For a given node, we identify the outliers using the set of examples covered by that node, and we take as context the conjunction of the conditions on the path from the root to the node. Surprisingly, we observe that at different nodes of the tree some outliers disappear and new ones appear. From the business point of view, the outliers detected near the leaves of the tree are the most suspicious ones. These cases are difficult to detect because they are observed only in a given context, defined by the set of rules associated with the node.
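The node-by-node procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the reference list suggests the study used R and rpart, whereas this sketch grows a tiny variance-reduction tree by hand; the boxplot (1.5 × IQR) rule is assumed as the outlier test; and the column names `store_size` and `refunds`, the toy rows, and all function names are hypothetical.

```python
from statistics import mean, quantiles

def sse(values):
    """Sum of squared errors around the mean (regression-tree impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target, features, min_rows):
    """Return the (feature, threshold) that most reduces total SSE, or None."""
    best, best_score = None, sse([r[target] for r in rows])
    for f in features:
        # Every distinct value except the largest is a candidate threshold.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if min(len(left), len(right)) < min_rows:
                continue
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def iqr_outliers(rows, target, k=1.5):
    """Rows whose target falls outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    if len(rows) < 4:
        return []
    q1, _, q3 = quantiles([r[target] for r in rows], n=4, method="inclusive")
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if r[target] < lo or r[target] > hi]

def node_outliers(rows, target, features, context=(), depth=0,
                  max_depth=3, min_rows=4):
    """Yield (context, outlier ids) for each node of a greedily grown tree.

    The context is the conjunction of split conditions from the root to the node.
    """
    ids = [r["id"] for r in iqr_outliers(rows, target)]
    yield " AND ".join(context) or "(root)", ids
    if depth >= max_depth:
        return
    split = best_split(rows, target, features, min_rows)
    if split is None:
        return
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    yield from node_outliers(left, target, features, context + (f"{f} <= {t}",),
                             depth + 1, max_depth, min_rows)
    yield from node_outliers(right, target, features, context + (f"{f} > {t}",),
                             depth + 1, max_depth, min_rows)

# Hypothetical toy data: small stores refund about 10, large stores about 100.
SMALL = [10, 11, 9, 10, 12, 25]
LARGE = [100, 98, 102, 101, 99, 100]
ROWS = [{"id": i + 1, "store_size": 1, "refunds": v} for i, v in enumerate(SMALL)]
ROWS += [{"id": i + 7, "store_size": 2, "refunds": v} for i, v in enumerate(LARGE)]

for ctx, ids in node_outliers(ROWS, "refunds", ["store_size"]):
    print(ctx, "->", ids)
```

On this toy data the root node flags nothing, because a refund of 25 sits well inside the overall spread, while the node with context `store_size <= 1` flags row 6. That mirrors the abstract's observation that some outliers become visible only within a specific context deeper in the tree.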

References

1. Skalak, S.: Global economic crime survey. Technical report, PwC (2014)
2. Jans, M., Lybaert, N., Vanhoof, K.: Data mining for fraud detection: toward an improvement on internal control systems? In: 30th Annual Congress European Accounting Association (EAA 2007)
3. Coderre, D.: Computer-Aided Fraud Prevention & Detection. Wiley, Hoboken (2009)
4. Torgo, L.: Data Mining with R: Learning with Case Studies, 1st edn. Chapman & Hall/CRC, Boca Raton (2010)
5. Bates, A.: Fraud risk management: developing a strategy for prevention, detection, and response. Technical report, KPMG Advisory Forensic (2006)
6. Stulb, D., Remnitz, D.: Big risks require big data thinking: global forensic data analytics survey 2014. Technical report, EY (2014)
7. Singh, K., Upadhyaya, S.: Outlier detection: applications and techniques. Int. J. Comput. Sci. Issues 9(3), 307–323 (2012)
8. Kristin, R.N., Matkovsky, I.P.: Using data mining techniques for fraud detection. Technical report, SAS Institute Inc. and Federal Data Corporation (1999)
9. Phua, C., Lee, V.C.S., Smith-Miles, K., Gayler, R.W.: A comprehensive survey of data mining-based fraud detection research. CoRR abs/1009.6119 (2010)
10. Hawkins, D.: Identification of Outliers. Monographs on Applied Probability and Statistics. Chapman & Hall, New York (1980)
11. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. 41(3), 1–58 (2009)
12. Gupta, M., Gao, J., Aggarwal, C.C., Han, J.: Outlier detection for temporal data: a survey. IEEE Trans. Knowl. Data Eng. 26(9), 2250–2267 (2014)
13. Anglia Ruskin University: NuMBerS: numerical methods for biosciences students. http://web.anglia.ac.uk/numbers/. Accessed 02 May 2016
14. Wells, J.T.: Corporate Fraud Handbook: Prevention and Detection, 2nd edn. Wiley, Hoboken (2007)
15. Gama, J., Carvalho, A., Faceli, K., Lorena, C., Oliveira, M.: Extração de Conhecimento de Dados - Data Mining, 1st edn. Silabo (2012)
16. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman & Hall, New York (1984)
17. Therneau, T., Atkinson, B., Ripley, B.: rpart: Recursive Partitioning and Regression Trees. R package version 4.1-10 (2015)
18. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2016)

Acknowledgments

This work was supported by the research project TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020, financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund, and by the European Commission through the project MAESTRA (ICT-2013-612944).

Author information

Authors and Affiliations

Faculty of Sciences, University of Porto, Porto, Portugal
Rita P. Ribeiro
LIAAD/INESC TEC, University of Porto, Porto, Portugal
Rita P. Ribeiro & João Gama
Faculty of Economics, University of Porto, Porto, Portugal
Ricardo Oliveira & João Gama

Corresponding authors

Correspondence to Rita P. Ribeiro or João Gama.

Editor information

Editors and Affiliations

INAOE, Tonantzintla, Mexico
Manuel Montes y Gómez
Astrofisica Optica y Electronica, INAOE, Puebla, Mexico
Hugo Jair Escalante
Universidad Nacional de Costa Rica, Heredia, Costa Rica
Alberto Segura
Universidad Nacional de Costa Rica, Heredia, Costa Rica
Juan de Dios Murillo

About this paper

Cite this paper

Ribeiro, R.P., Oliveira, R., Gama, J. (2016). Detection of Fraud Symptoms in the Retail Industry. In: Montes y Gómez, M., Escalante, H., Segura, A., Murillo, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2016. IBERAMIA 2016. Lecture Notes in Computer Science, vol 10022. Springer, Cham. https://doi.org/10.1007/978-3-319-47955-2_16

DOI: https://doi.org/10.1007/978-3-319-47955-2_16
Published: 14 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47954-5
Online ISBN: 978-3-319-47955-2
eBook Packages: Computer Science, Computer Science (R0)