TEST

pp 1–4

Comments on: Data science, big data and statistics

  • Abigael C. Nachtsheim
  • John Stufken
Discussion

We would first like to thank the authors for writing this thought-provoking article on such an important topic. This piece explores the intersection of data science and statistics in a world increasingly concerned with the analysis of massive data sets. The authors consider seven areas in which the increased presence of big data may alter and expand traditional statistical approaches; they give an overview of the emerging field of data science; and they provide two examples of statistical analyses driven by big data. Finally, they provide some insight into the future of statistics, imagining it as one piece of the multi-faceted and evolving field of data science.

The authors stress the role that big data has played and will continue to play in the development of data science as a field of study, and in the ways that statistics as a discipline must adapt. We would like to note that the field of statistics already has a long history of evolution. The study of statistics dates at least to the...

Mathematics Subject Classification

62 (Statistics) 

Notes

Acknowledgements

The work by JS was partially supported by NSF grant DMS-1811363.

Copyright information

© Sociedad de Estadística e Investigación Operativa 2019

Authors and Affiliations

  1. School of Mathematical and Statistical Sciences, Arizona State University, Tempe, USA
