Digital Forensics and the Big Data Deluge — Some Concerns Based on Ramsey Theory

  • Conference paper
Advances in Digital Forensics XVI (DigitalForensics 2020)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 589)

Abstract

Constructions of science that slowly change over time are deemed to be the basis of the reliability with which scientific knowledge is regarded. A potential paradigm shift based on big data is looming – many researchers believe that massive volumes of data have enough substance to capture knowledge without the theories needed in earlier epochs. Patterns in big data are deemed to be sufficient to make predictions about the future, as well as about the past as a form of understanding. This chapter uses an argument developed by Calude and Longo [6] to critically examine the belief system of the proponents of data-driven knowledge, especially as it applies to digital forensic science.

From Ramsey theory it follows that, if data is large enough, knowledge is imbued in the domain represented by the data purely based on the size of the data. The chapter concludes that it is generally impossible to distinguish between true domain knowledge and knowledge inferred from spurious patterns that must exist purely as a function of data size. In addition, what is deemed a significant pattern may be refuted by a pattern that has yet to be found. Hence, evidence based on patterns found in big data is tenuous at best. Digital forensics should therefore proceed with caution if it wants to embrace big data and the paradigms that evolve from and around big data.
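
The claim that patterns must exist purely as a function of data size can be illustrated with the smallest classical Ramsey result, R(3,3) = 6: however the 15 pairwise relations among six items are labelled with one of two values, three items whose relations all carry the same label are guaranteed to exist. The sketch below is an illustration only and is not taken from the chapter; the function names `has_mono_triangle` and `every_coloring_has_pattern` are hypothetical, and the brute-force search is practical only for very small sizes.

```python
from itertools import combinations, product


def has_mono_triangle(n, coloring):
    """Return True if three of the n items have pairwise relations
    that all carry the same label.

    `coloring` maps each pair (i, j) with i < j to a label (0 or 1).
    """
    for a, b, c in combinations(range(n), 3):
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            return True
    return False


def every_coloring_has_pattern(n):
    """Exhaustively check every two-label assignment on n items.

    Returns True only if no assignment avoids the pattern, that is,
    the pattern is forced by size alone and says nothing about the
    process that produced the labels.
    """
    pairs = list(combinations(range(n), 2))
    for labels in product((0, 1), repeat=len(pairs)):
        if not has_mono_triangle(n, dict(zip(pairs, labels))):
            return False
    return True


if __name__ == "__main__":
    print(every_coloring_has_pattern(5))  # five items: the pattern can be avoided
    print(every_coloring_has_pattern(6))  # six items: the pattern is unavoidable
```

With five items at least one labelling avoids the pattern, so the first call returns False; with six items none does, so the second returns True. Finding such a triple among six items therefore carries no information about the domain, which is the kind of spurious regularity the abstract warns about.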

References

  1. C. Anderson, The end of theory: The data deluge makes the scientific method obsolete, Wired, June 23, 2008.

  2. N. Beebe, Digital forensic research: The good, the bad and the unaddressed, in Advances in Digital Forensics V, G. Peterson and S. Shenoi (Eds.), Springer, Heidelberg, Germany, pp. 17–36, 2009.

  3. P. Blair, P. Fleming, D. Bensley, I. Smith, C. Bacon and E. Taylor, Plastic mattresses and sudden infant death syndrome, Lancet, vol. 345(8951), p. 720, 1995.

  4. R. Blanch, Report of the Inquiry into the Convictions of Kathleen Megan Folbigg, State of New South Wales, Parramatta, Australia (www.folbigginquiry.justice.nsw.gov.au/Documents/Report%20of%20the%20Inquiry%20into%20the%20convictions%20of%20Kathleen%20Megan%20Folbigg.pdf), 2019.

  5. J. Buolamwini and T. Gebru, Gender shades: Intersectional accuracy disparities in commercial gender classification, Proceedings of Machine Learning Research, vol. 81, pp. 77–91, 2018.

  6. C. Calude and G. Longo, The deluge of spurious correlations in big data, Foundations of Science, vol. 22(3), pp. 595–612, 2017.

  7. J. Clemens, Automatic classification of object code using machine learning, Digital Investigation, vol. 14(S1), pp. S156–S162, 2015.

  8. K. Crawford and T. Paglen, Excavating AI: The Politics of Training Sets for Machine Learning, Excavating AI (www.excavating.ai), September 19, 2019.

  9. S. D’Agostino, The architect of modern algorithms, Quanta Magazine, November 20, 2019.

  10. England and Wales Court of Appeal (Criminal Division), Regina v. Sally Clark, EWCA Crim 54, Case No. 1999/07495/Y3, Royal Courts of Justice, London, United Kingdom, October 2, 2000.

  11. England and Wales Court of Appeal (Criminal Division), Regina v. Sally Clark, EWCA Crim 1020, Case No. 2002/03824/Y3, Royal Courts of Justice, London, United Kingdom, April 11, 2003.

  12. M. Kestemont, M. Tschuggnall, E. Stamatatos, W. Daelemans, G. Specht and B. Potthast, Overview of the author identification task at PAN-2018: Cross-domain authorship attribution and style change detection, in Working Notes of CLEF 2018 – Conference and Labs of the Evaluation Forum, L. Cappellato, N. Ferro, J. Nie and L. Soulier (Eds.), Volume 2125, CEUR-WS.org, RWTH Aachen University, Aachen, Germany, 2018.

  13. W. Knight, Facebook’s head of AI says the field will soon “hit the wall,” Wired, December 4, 2019.

  14. P. Langley, The changing science of machine learning, Machine Learning, vol. 82(3), pp. 275–279, 2011.

  15. R. Meadow, Fatal abuse and smothering, in ABC of Child Abuse, R. Meadow (Ed.), BMJ Publishing Group, London, United Kingdom, pp. 27–29, 1997.

  16. F. Mitchell, The use of artificial intelligence in digital forensics: An introduction, Digital Evidence and Electronic Signature Law Review, vol. 7, pp. 35–41, 2010.

  17. F. Mitchell, An overview of artificial intelligence based pattern matching in a security and digital forensic context, in Cyberpatterns, C. Blackwell and H. Zhu (Eds.), Springer, Cham, Switzerland, pp. 215–222, 2014.

  18. M. Pollitt and A. Whitledge, Exploring big haystacks, in Advances in Digital Forensics II, M. Olivier and S. Shenoi (Eds.), Springer, Boston, Massachusetts, pp. 67–76, 2006.

  19. I. Raji and J. Buolamwini, Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products, Proceedings of the AAAI/ACM Conference on AI, Ethics and Society, pp. 429–435, 2019.

  20. F. Ramsey, On a problem of formal logic, Proceedings of the London Mathematical Society, vol. s2-30(1), pp. 264–286, 1930.

  21. Royal Statistical Society, Royal Statistical Society concerned by issues raised in Sally Clark case, News Release, London, United Kingdom, October 23, 2001.

  22. J. Smeaton, Reports of the Late John Smeaton, F.R.S., Made on Various Occasions, in the Course of his Employment as a Civil Engineer, Volume II, Longman, London, United Kingdom, 1812.

  23. J. Wulff, Artificial intelligence and law enforcement, Australasian Policing, vol. 10(1), pp. 16–23, 2018.

Author information

Correspondence to Martin Olivier.

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Olivier, M. (2020). Digital Forensics and the Big Data Deluge — Some Concerns Based on Ramsey Theory. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVI. DigitalForensics 2020. IFIP Advances in Information and Communication Technology, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-030-56223-6_1

  • DOI: https://doi.org/10.1007/978-3-030-56223-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-56222-9

  • Online ISBN: 978-3-030-56223-6

  • eBook Packages: Computer Science, Computer Science (R0)
