The value of the last digit: statistical fraud detection with digit analysis
Digit distributions are a popular tool for detecting taxpayers’ noncompliance and other fraud. In the early stages of digital analysis, Nigrini and Mittermaier (Audit J Pract Theory 16(2):52–67, 1997) used Benford’s Law (Benford in Proc Am Philos Soc 78:551–572, 1938) as a natural reference distribution. A justification of that hypothesis is known only for multiplicative sequences (Schatte in J Inf Process Cyber EIK 24:443–455, 1988). In applications, most number-generating processes are additive in nature, and no single choice of ‘a universal first-digit law’ seems plausible (Scott and Fasli in Benford’s law: an empirical investigation and a novel explanation. CSM Technical Report 349, Department of Computer Science, University of Essex, http://cswww.essex.ac.uk/technical-reports/2001/CSM-349.pdf, 2001). In that situation, some practitioners (e.g. financial authorities) resort to a last-digit analysis based on the hypothesis of a Laplace distribution, that is, a uniform distribution over the possible last digits. We prove that last digits are approximately uniform for distributions with an absolutely continuous distribution function. From a practical perspective, that result is of course only moderately interesting. For that reason, we also derive a result for ‘certain’ sums of lattice variables. That justification is provided in terms of stationary distributions.
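To make the two reference distributions mentioned above concrete, the following minimal sketch computes Benford’s first-digit probabilities, P(D = d) = log10(1 + 1/d), and applies a Pearson chi-square statistic to the last digits of a sample under the uniform hypothesis. It is an illustration under stated assumptions, not the procedure used in the paper: the function names are hypothetical, and the data are synthetic amounts drawn from a log-normal distribution (one example of an absolutely continuous distribution) and truncated to whole units.

```python
import math
import random
from collections import Counter


def benford_first_digit_probs():
    """Benford's law for the first digit: P(D = d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def last_digit_counts(amounts):
    """Return the counts of last digits 0..9 for non-negative integer amounts."""
    counts = Counter(a % 10 for a in amounts)
    return [counts.get(d, 0) for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts per cell."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Assumption: synthetic data, not taken from the paper. Amounts are drawn
# from a log-normal distribution and truncated to whole currency units.
rng = random.Random(0)
amounts = [int(rng.lognormvariate(7.0, 1.0)) for _ in range(20000)]

counts = last_digit_counts(amounts)
stat = chi_square_uniform(counts)

# With 10 cells there are 9 degrees of freedom; the 5% critical value of
# the chi-square distribution is roughly 16.92.
print("last-digit counts:", counts)
print("chi-square statistic:", round(stat, 2))
print("Benford P(first digit = 1):", round(benford_first_digit_probs()[1], 4))
```

In an audit setting, a statistic far above the critical value would flag the sample for closer inspection; it would not by itself establish fraud.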
Keywords: Fraud detection, Last digits, Digit analysis, Benford’s law
Mathematics Subject Classification (2000): 60B10, 62P20, 91B99
- Benford F (1938) The law of anomalous numbers. Proc Am Philos Soc 78: 551–572
- Nigrini MJ, Mittermaier LJ (1997) The use of Benford’s law as an aid in analytical procedures. Audit J Pract Theory 16(2): 52–67
- Scott PD, Fasli M (2001) Benford’s law: an empirical investigation and a novel explanation. CSM Technical Report 349, Department of Computer Science, University of Essex, http://cswww.essex.ac.uk/technical-reports/2001/CSM-349.pdf