Forensic Accounting Techniques with R: Uncovering Fraud and Knowing Your Data

  • Kevin Feasel

This video demonstrates a range of techniques used by forensic accountants and fraud examiners to uncover fraudulent journal entries and illegal activities. As data professionals, most of us will never unravel a Bernie Madoff scheme, but we can apply these same techniques in our own environments to learn more about our data. This video uses the R programming language to apply these fraud detection techniques and help you gain a better understanding of your data.

What You Will Learn

  • Summarize and review a new data set

  • Perform regression analysis using linear regression

  • Discover distributions of data, overall and between cohorts

  • Compare cohort behavior to discover outliers

  • Use distributions of first and last digits to test data set validity

Who This Video Is For

Data platform specialists and data scientists who are interested in identifying anomalies that may indicate fraud or an opportunity for deeper business insight. Some experience with Python or R and some knowledge of statistics will help, but neither is necessary to get value from the video.

This video teaches a variety of techniques for examining your data and drawing inferences that can help you detect fraud and malfeasance. You’ll begin with basic analytical techniques, including regression analysis. From there, you will learn how to use cohort analysis to find outliers between groups, which supports a data-driven approach to forensic investigation. Finally, you will review numeric techniques around data set validity, including rules around the distributions of the first and last digits in data sets.
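As a rough illustration of the first-digit idea, the sketch below compares the observed share of each leading digit with the share predicted by Benford's law, log10(1 + 1/d). This is an assumption about which rule is meant: the description above does not name a specific rule. The video itself works in R; Python is used here only to illustrate the idea, and the expense amounts and helper names are hypothetical.

```python
import math
from collections import Counter


def first_digit(amount):
    """Return the leading non-zero digit of a non-zero amount."""
    # Scientific notation puts the leading digit first, e.g. 0.042 -> '4.2...e-02'.
    return int(f"{abs(amount):.15e}"[0])


def first_digit_shares(amounts):
    """Share of values starting with each digit 1-9 (zeros are skipped)."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


def benford_expected():
    """Expected first-digit shares under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


# Hypothetical expense amounts, for illustration only.
amounts = [12.50, 130.00, 18.75, 240.10, 1999.99, 35.00, 101.25, 9.40]

observed = first_digit_shares(amounts)
expected = benford_expected()

for d in range(1, 10):
    print(d, round(observed[d], 3), round(expected[d], 3))
```

A large gap between the observed and expected shares is a prompt for closer review, not proof of fraud. A sample this small is far too short to support any conclusion, and not every data set is expected to follow this distribution.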

About The Author

Kevin Feasel

Kevin Feasel is a Microsoft Data Platform MVP and CTO at Envizage, where he specializes in data analytics with T-SQL and R, forcing Spark clusters to do his bidding, fighting with Kafka, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL and author of PolyBase Revealed (forthcoming). A resident of Durham, North Carolina, he can be found cycling the trails around the Triangle whenever the weather’s nice enough.


Supporting material

View source code at GitHub.

About this video

Author
Kevin Feasel
Total duration
46 min
Copyright information
© Kevin Feasel 2019

Video Transcript

Welcome to Forensic Analysis with R. My name is Kevin Feasel. I’m the CTO of Envizage Technologies. I also run a predictive analytics team. In addition, I’m a Microsoft Data Platform MVP.

We are tasked with reviewing expense reports for our company as part of an audit. Our company has never gone through this kind of audit before. But we have a trustworthy group, so no problem, right?

Now, our data looks a bit like this. We have 12 people in our sales department, and they travel to different cities throughout the year selling our products. The only thing we have here is a set of nine years of individual expense reports. Each expense report looks like the ones sampled here. We have the type of city, employee name, date, and amount.

Armed with this data, we will use a series of tools to audit our data. We will learn a series of data analysis techniques around summary and growth analysis. Next we will dive into the use of linear regression to review and predict results. From there we will use cardinality and cohort analysis to compare groups with one another. Finally, we will look at digit analysis, particularly the last and first digits in sequences of numbers.

You will need a couple of skills going into this course. First I expect some basic familiarity with statistics. We won’t get too deep into the weeds, and I will try to explain as much as I can along the way. But there may be terms I expect you to understand.

I also expect some familiarity with R. I won’t show you how to install R, and I do not intend this course as a primer on R. But the tools and techniques we use are pretty straightforward. So with that, it’s on with the show.