pp 1–4

Comments on: Data science, big data and statistics

  • Peter Bühlmann


We congratulate Pedro Galeano and Daniel Peña on their nice paper about the emerging theme of data science and the role of statistics.


Keywords

Big data, Causal inference, Data science, Heterogeneity, High-dimensional statistics, Robustness



Copyright information

© Sociedad de Estadística e Investigación Operativa 2019

Authors and Affiliations

  1. Seminar for Statistics, ETH Zürich, Zurich, Switzerland
