pp 1–4

Comments on: Data science, big data and statistics

  • Abigael C. Nachtsheim
  • John Stufken

We would first like to thank the authors for writing this thought-provoking article on such an important topic. This piece explores the intersection of data science and statistics in a world increasingly concerned with the analysis of massive data sets. The authors consider seven areas in which the increased presence of big data may alter and expand traditional statistical approaches; they give an overview of the emerging field of data science; and they provide two examples of statistical analyses driven by big data. Finally, they provide some insight into the future of statistics, imagining it as one piece of the multi-faceted and evolving field of data science.

The authors stress the role that big data has and will continue to have in the development of data science as a field of study, and in the ways that statistics as a discipline must adapt. We would like to note that the field of statistics already has a long history of evolution. The study of statistics dates at least to the...

Mathematics Subject Classification

62 (Statistics) 



The work by JS was partially supported by NSF grant DMS-1811363.



Copyright information

© Sociedad de Estadística e Investigación Operativa 2019

Authors and Affiliations

  1. School of Mathematical and Statistical Sciences, Arizona State University, Tempe, USA
