Useful Probability Distributions
We will use probability as a tool to resolve practical questions about data. Here are some important example questions. We could ask: what process produced the data? For example, I observe a set of independent coin flips, and would now like to know the probability of observing a head when the coin is flipped. We could ask: what sort of data can we expect in the future? For example, what will be the outcome of the next election? Answering this requires collecting information about voters, preferences, and the like, then using it to build a model that predicts the outcome. We could ask: what labels should we attach to unlabelled data? For example, we might see a large number of credit card transactions, some known to be legitimate and others known to be fraudulent. We now see a new transaction: is it legitimate? We could ask: is an effect easily explained by chance variations, or is it real? For example, a medicine appears to help patients with a disease. Is there a real effect, or is it possible that the patients we tested the medicine on felt better by chance?
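To make the first question concrete, here is a minimal sketch of the coin-flip example. It assumes the flips are recorded as the strings "H" and "T", and it uses the observed fraction of heads as the estimate of the probability of a head; that choice of estimator is an assumption made for illustration, not something stated above. The function names and the simulated data are hypothetical.

```python
import random


def estimate_head_probability(flips):
    """Return the fraction of flips that came up heads ("H")."""
    if not flips:
        raise ValueError("need at least one flip")
    return sum(1 for f in flips if f == "H") / len(flips)


def simulate_flips(n, p_head, seed=0):
    """Simulate n independent flips of a coin with P(head) = p_head.

    This stands in for real observations; the seed only makes runs repeatable.
    """
    rng = random.Random(seed)
    return ["H" if rng.random() < p_head else "T" for _ in range(n)]


if __name__ == "__main__":
    # Pretend we do not know p_head and try to recover it from the data.
    flips = simulate_flips(1000, p_head=0.3)
    print(estimate_head_probability(flips))
```

With only a handful of flips the estimate can be far from the true value; it tends to settle down as the number of independent flips grows.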