Comments on: Data science, big data and statistics
We congratulate Pedro Galeano and Daniel Peña on a magisterial, thorough and stimulating survey of the rapidly expanding and highly important field of the analysis of big data. Their lucid exposition is complemented by the analysis of two examples, which show the results that can be obtained and illustrate the difficulties that can arise. The first example (their Fig. 4) provides an intriguing network plot.
In the analysis of any set of data, large or small, it is most helpful to be able to create plots that are informative about the structure of the data and about any failings of the models that are fitted. In their Section 2.5, the authors survey methods for visualising data in high dimensions, many of which are, like scatter plots, model-free. They use quantiles as one way of reducing dimensionality. In Section 3, “The Emergence of Data Science”, the authors focus more on such methods as neural networks and those collected under the umbrella of machine learning, where the focus is...
Mathematics Subject Classification: 62-07
- Torti F, Corbellini A, Atkinson AC (2019) Monitoring robust regression, especially the forward search, in SAS IML studio (in preparation)