Statistical Principles

Zobel, Justin

doi:10.1007/978-1-4471-6639-9_15

Abstract

We use experiments and take observations to study the behaviour of a system, to test hypotheses, to investigate the effect of manipulations and optimizations, and, overall, to produce evidence for our arguments. The elementary material of evidence is measurement: the reduction of complex phenomena to numerical scores that can be recorded, compared, and analyzed.

Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.

Samuel S. Wilks

Statistics are no substitute for judgement.

Henry Clay

Notes

1.
In careful research published in 1648, Jan-Baptista van Helmont concluded that plants consist of water:
- That all plants immediately and substantially stem from the element water alone I have learnt from the following experiment. I took an earthen vessel in which I placed two hundred pounds of earth dried in an oven, and watered with rain water. I planted in it the stem of a willow tree weighing five pounds. Five years later it had developed into a tree weighing one hundred and sixty-nine pounds and about three ounces. Nothing but rain (and distilled water) had been added. The large vessel was placed in earth and covered by an iron lid with a tin-surface that was pierced with many holes (to allow the soil to breathe while preventing dust from adding to it –jz). I have not weighed the leaves that came off in the four autumn seasons. Finally I dried the earth in the vessel again and found the same two hundred pounds of it diminished by about two ounces. Hence one hundred and sixty-four pounds of wood, bark and roots had come up from water alone.
2.
The pattern of size variation was not well chosen, though. As is a common practice, they increased the size of the tables linearly, in this case from 1,000,000 records to 10,000,000 records, in increments of 1,000,000. However, they used this result to make claims about scaling, although only one (decimal) order of magnitude was present. The result would have been more impressive if they had increased the size geometrically, from say 10,000 records to 100,000,000 records, by a factor of 10 or \(\sqrt{10}\) at each step. A logarithmic graph of size versus time would have clearly demonstrated a trend.
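The contrast between the two size schedules can be sketched as follows. This is only an illustration: the function name `geometric_sizes` and its `steps_per_decade` parameter are hypothetical, and the endpoints are the ones suggested in the note rather than those of the original experiment.

```python
import math

def geometric_sizes(start, stop, steps_per_decade=2):
    """Sizes from start to stop, multiplied by 10**(1/steps_per_decade) at each step.

    steps_per_decade=1 gives a factor of 10; steps_per_decade=2 gives sqrt(10).
    """
    decades = math.log10(stop / start)
    n_steps = round(decades * steps_per_decade)
    return [round(start * 10 ** (i / steps_per_decade)) for i in range(n_steps + 1)]

# Linear schedule described in the note: 1,000,000 to 10,000,000 in steps of 1,000,000.
linear = list(range(1_000_000, 10_000_001, 1_000_000))

# Geometric schedule suggested in the note: 10,000 to 100,000,000 by sqrt(10).
geometric = geometric_sizes(10_000, 100_000_000, steps_per_decade=2)

for name, sizes in (("linear", linear), ("geometric", geometric)):
    span = math.log10(sizes[-1] / sizes[0])
    print(f"{name}: {len(sizes)} sizes spanning {span:.0f} decade(s)")
```

A similar number of runs covers four orders of magnitude instead of one, which is what makes a trend visible on a logarithmic axis.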
3.
The question of whether and when this protocol is correct or appropriate is beyond the scope of this book. The use of thresholds and particular statistical tests is a continuing topic of scientific debate, and methodologies continue to develop. What is clear is that some use of hypothesis testing is preferable to simple reporting of averages and claimed “improvements”.
4.
There are many instances when much smaller \(\alpha \) is appropriate. Determining which of (say) 1,000,000 genetic variations is significantly linked to a particular property (such as susceptibility to a certain disease) might require \(\alpha <10^{-10}\), or smaller. There is an extensive literature on estimation of \(\alpha \) in different contexts.
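One common and conservative way to arrive at a very small threshold is the Bonferroni correction, which divides the acceptable family-wise error rate by the number of tests. The note does not say which method it has in mind, so the following is a sketch under that assumption; the family-wise rate of 0.05 and the function name `bonferroni_alpha` are illustrative choices.

```python
def bonferroni_alpha(family_wise_alpha, n_tests):
    """Per-test threshold that keeps the chance of any false positive
    across n_tests at or below family_wise_alpha (Bonferroni bound)."""
    return family_wise_alpha / n_tests

n_variations = 1_000_000          # number of tests, as in the note
family_wise_alpha = 0.05          # assumed overall error rate (illustrative)

per_test_alpha = bonferroni_alpha(family_wise_alpha, n_variations)

# With no correction, this many true-null tests would be expected to
# pass the 0.05 threshold by chance alone.
expected_false_positives = family_wise_alpha * n_variations

print(f"per-test alpha: {per_test_alpha:.1e}")
print(f"expected chance 'discoveries' without correction: {expected_false_positives:.0f}")
```

A stricter family-wise rate, or a larger number of tests, pushes the per-test threshold lower still.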
5.
It is astonishing how many papers report work in which a slight effect is investigated with a small number of trials. Given that such investigations would generally fail even if the hypothesis were correct, it seems likely that many interesting research questions are unnecessarily discarded.
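The claim that such investigations would generally fail can be checked by simulation. The sketch below is a minimal illustration, not a recommended protocol: it assumes normally distributed measurements with known unit variance, a true difference of 0.2 standard deviations, and a two-sided z-test at \(\alpha = 0.05\). All of these values, and the function name `estimated_power`, are assumptions made for the example.

```python
import random
from statistics import NormalDist

def estimated_power(effect, n, alpha=0.05, runs=1000, seed=1):
    """Fraction of simulated experiments in which a real effect is detected.

    Each experiment draws n measurements per system (unit variance assumed)
    and applies a two-sided z-test to the difference of the two means.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = (2 / n) ** 0.5  # standard error of the difference of two means
    detected = 0
    for _ in range(runs):
        baseline = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        improved = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        if abs((improved - baseline) / std_err) > z_crit:
            detected += 1
    return detected / runs

power_small = estimated_power(effect=0.2, n=10)    # few trials
power_large = estimated_power(effect=0.2, n=400)   # many trials

print(f"n=10:  effect detected in {power_small:.0%} of experiments")
print(f"n=400: effect detected in {power_large:.0%} of experiments")
```

Under these assumptions the hypothesis is true in every simulated experiment, yet with few trials the test rarely finds it.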
6.
A typical guess of the likelihood of the better player winning the match is 90 or 95 %; in fact, the likelihood is close to certainty.
When I once suggested to students that they test the code by running it with a probability of 50 % of winning each point, several argued strongly that the program wouldn’t terminate—which is more or less equivalent to arguing that, when tossing coins, you can’t get some given number of heads in a row. They had confused the short-term variability (any number of consecutive throws of a head will come up eventually) with long-term averages. Such are the pitfalls of intuition.
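Both observations can be explored with a short simulation. The program discussed in the chapter is not reproduced here, so the following is a sketch under stated assumptions: simplified tennis rules (a game to four points, a set to six games, each won by a margin of two, no tie-breaks, best of five sets), no advantage to the server, and a point-winning probability of 60 % for the better player. The function names and the 60 % figure are hypothetical.

```python
import random

def play_unit(win_one, target):
    """Play until one side has at least `target` wins and leads by two.

    `win_one` is a function returning True when side A wins one sub-unit.
    Returns True if side A wins the unit.
    """
    a = b = 0
    while max(a, b) < target or abs(a - b) < 2:
        if win_one():
            a += 1
        else:
            b += 1
    return a > b

def play_match(p, rng, sets_to_win=3):
    """Simulate one match; p is the probability that player A wins a point."""
    point = lambda: rng.random() < p
    game = lambda: play_unit(point, 4)   # game: first to 4 points, win by 2
    one_set = lambda: play_unit(game, 6) # set: first to 6 games, win by 2
    a = b = 0
    while a < sets_to_win and b < sets_to_win:
        if one_set():
            a += 1
        else:
            b += 1
    return a > b

def match_win_rate(p, matches=1000, seed=7):
    """Fraction of simulated matches won by player A."""
    rng = random.Random(seed)
    return sum(play_match(p, rng) for _ in range(matches)) / matches

rate_better = match_win_rate(0.6)  # assumed edge for the better player
rate_even = match_win_rate(0.5)    # evenly matched players

print(f"p = 0.6: better player wins {rate_better:.1%} of matches")
print(f"p = 0.5: player A wins {rate_even:.1%} of matches")
```

The run with evenly matched players terminates: a two-point or two-game lead may take a while to appear in a given game or set, but it arrives eventually with probability one.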
7.
Problems of this kind, and their solutions, can be highly illuminating. In this case, we discovered that disk seek times were a major component of total costs, accounting for around half of all elapsed time. Had we been explicitly investigating the significance of seek costs, we might not have thought of this experiment.
In an experiment undertaken by colleagues of mine in the mid-2000s, they found that the time required for disk-based algorithms could vary by as much as 15 %, depending on whether the data was stored near the inside or the outside of the disk platter. As they noted in their paper, to do such experiments “you may need to become intimately acquainted with the behaviour of your disk drive”.

Author information

Authors and Affiliations

Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC, Australia
Justin Zobel

Corresponding author

Correspondence to Justin Zobel.

About this chapter

Cite this chapter

Zobel, J. (2014). Statistical Principles. In: Writing for Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-6639-9_15

DOI: https://doi.org/10.1007/978-1-4471-6639-9_15
Published: 10 February 2015
Publisher Name: Springer, London
Print ISBN: 978-1-4471-6638-2
Online ISBN: 978-1-4471-6639-9
eBook Packages: Computer Science, Computer Science (R0)
