Appendix A: Asymmetrical confidence intervals
Confidence intervals for Normally distributed estimators are symmetric and are constructed in the familiar way taught in statistics courses. By the delta method, continuous and differentiable functions of Normally distributed estimators are also approximately Normal in large samples, but the sample size might need to be very large. For example, a chi-square distribution is that of a sum of squares of standard Normals, and it approaches the Normal only as the number of degrees of freedom goes to infinity. Ratios of Normally distributed estimators are technically Cauchy because of the non-zero density at zero in the denominator. The precision must increase to the point that the denominator has no probability large enough to matter of being at zero. For example, male adult American height is distributed Normal with a mean of 68 inches and a variance of 4 square inches (a standard deviation of 2 inches), so the mean is 34 standard deviations from 0. For estimators, that degree of precision might be difficult to attain. The probability distribution of ratios of random variables can be a Cauchy with no finite mean or variance.
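For reference, the delta method mentioned above can be stated as follows. This is the standard textbook form rather than a formula taken from this paper, and the symbols \(\hat{\theta}_n\), \(\theta\), \(\sigma^2\), and \(g\) are notation assumed here for illustration.

```latex
% Delta method: assume g is differentiable at \theta with g'(\theta) \neq 0.
\sqrt{n}\,\bigl(\hat{\theta}_n - \theta\bigr) \xrightarrow{d} N\bigl(0, \sigma^2\bigr)
\quad\Longrightarrow\quad
\sqrt{n}\,\bigl(g(\hat{\theta}_n) - g(\theta)\bigr) \xrightarrow{d}
N\bigl(0, [g'(\theta)]^2 \sigma^2\bigr)

% Example: g(\theta) = e^{\theta} gives asymptotic variance e^{2\theta}\sigma^2.
```

The result is asymptotic only. In a finite sample the distribution of \(g(\hat{\theta}_n)\) can remain visibly skewed, which is the situation the rest of this appendix addresses.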
Products of estimators can arise in various ways, e.g. mediation models in psychology or autoregressive disturbances in macroeconomics, i.e. AR(1) as estimated by , for which see Hamilton [17, p. 226]. Exponentiation of logit or hazard model coefficients can also result in non-Normal distributions unless the sample size is very large. If the coefficient estimate is Normal, its exponential is lognormal, which approaches the Normal as the underlying variance becomes small, i.e. as the sample size becomes large.
If the confidence interval is asymmetric, it should be chosen to minimize the width while fixing the total probability at 95%, or whatever arbitrary value the researcher intends. For a unimodal density, the shortest such interval is the one at which the probability density function is equal at the two ends. The endpoints can be found numerically, but special computer programs are required. Symmetric non-linear confidence intervals are inefficiently wide. Table 4 shows asymmetric confidence intervals for the standard problem of exponentiating a Normally distributed estimated parameter. Compared with the symmetric intervals, they lie closer to zero. They are only slightly shorter if the standard error is small, but much shorter if the standard error is large.
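The minimum-width construction can be sketched numerically for the exponentiated-coefficient case. This is a minimal illustration, not the procedure used to produce Table 4. It assumes the estimate b is exactly Normal with standard error se, so that exp(b) is lognormal, and it uses a plain grid search over the lower tail probability. The function names and the example values b = 0 and se = 1 are hypothetical.

```python
from math import exp
from statistics import NormalDist

def equal_tailed_interval(b, se, level=0.95):
    """Exponentiate the usual symmetric interval b +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(b - z * se), exp(b + z * se)

def shortest_interval(b, se, level=0.95, steps=20000):
    """Grid-search the lower tail probability that minimizes the width.

    Every candidate interval holds probability `level`; only the split of
    the remaining probability between the two tails changes.
    """
    std = NormalDist()
    tail = 1.0 - level
    best = None
    for i in range(1, steps):
        lower_p = tail * i / steps  # probability left below the interval
        lo = exp(b + se * std.inv_cdf(lower_p))
        hi = exp(b + se * std.inv_cdf(lower_p + level))
        if best is None or hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best

# Hypothetical example: coefficient 0 with a large standard error of 1.
eq = equal_tailed_interval(0.0, 1.0)
sh = shortest_interval(0.0, 1.0)
print("equal-tailed:", eq, "width", eq[1] - eq[0])
print("shortest:    ", sh, "width", sh[1] - sh[0])
```

Under these assumptions the shortest interval moves toward zero at both ends and is narrower than the equal-tailed one. Rerunning with a small standard error such as 0.1 gives two intervals of nearly the same width, consistent with the pattern described for Table 4.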
When asymptotic distributions are asymmetric, confidence intervals do not correspond to p values in the usual calculation from bell curves.
Using resampling variances, the confidence intervals can be computed from the empirical distribution of estimates. Using Bayesian variances, the confidence intervals can be computed from the posterior probability distribution of the parameter.
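As an illustration of the resampling case, the sketch below computes a percentile interval from the empirical distribution of bootstrap estimates. It is one simple variant among several and is not claimed to be the method used in the paper. The data values, the estimator (the exponential of a sample mean), the number of replications, and the seed are all hypothetical.

```python
import random
from math import exp
from statistics import mean

def percentile_interval(data, estimator, level=0.95, reps=2000, seed=0):
    """Percentile interval from the empirical distribution of bootstrap estimates."""
    rng = random.Random(seed)
    n = len(data)
    # Re-estimate on `reps` samples of size n drawn with replacement.
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    )
    tail = (1.0 - level) / 2
    lo = estimates[int(tail * reps)]
    hi = estimates[min(reps - 1, int((1.0 - tail) * reps))]
    return lo, hi

# Hypothetical data on the log scale; the estimator exponentiates the mean.
log_values = [0.9, -0.3, 1.4, 0.2, 2.1, 0.7, -0.5, 1.1, 0.4, 1.8]

def exp_mean(sample):
    return exp(mean(sample))

point = exp_mean(log_values)
lo, hi = percentile_interval(log_values, exp_mean)
print("estimate:", point, "interval:", (lo, hi))
```

The interval endpoints are read directly from the sorted resampled estimates, so nothing forces them to be symmetric about the point estimate.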
Appendix B: Existent and hypothetical populations, or finite and infinite collectives
Policy analysis can proceed by collecting data on all charter schools or all environmental laws in a state or nation or all wars over a period of time. This can be construed as a population, but under that interpretation, there appears to be no uncertainty at all about the statistics or relationships in the data. Measurement error and omitted variables would continue to be explanations for standard errors, but neither of those is a satisfactory basis for evaluating policy analysis, as the first invalidates the data, and the second invalidates the model. There is a philosophical problem of justifying the application of inferential statistical formulae to an apparently complete census of a population.
If the sample really is a substantial subset of the population, the variances of sample means, and by implication of moments and maximum likelihood estimators, are subject to the finite population correction (Cochran [9, Sect. 2.6, pp. 24–25]). If the sample size is n as usual while the population size is N, the variance is reduced by the factor (N – n)/N when the population variance is defined with divisor N – 1, or equivalently by (N – n)/(N – 1) when it is defined with divisor N. In the latter form, the standard deviation is reduced by the factor \(\sqrt{(N - n)/(N - 1)}\) and the covariance by the factor (N – n)/(N – 1). This can be safely ignored if the sample is a small part of the population, i.e. n/N is small. Under the interpretation that the present population of people, places, or things is all that matters, however, the finite population correction factor should always be applied. With n = N the factor is zero, which would eliminate the variance of fixed effects for states in most policy studies. Ignoring the finite population correction factor would then be a standard mistake.
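The arithmetic of the correction can be sketched as follows. The function names are hypothetical, and the sketch assumes the (N – n)/(N – 1) form, i.e. a population standard deviation defined with divisor N. The example numbers are made up.

```python
from math import sqrt

def fpc_variance_factor(n, N):
    """Multiplier (N - n)/(N - 1) applied to the variance of a sample mean."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("require N >= 2 and 1 <= n <= N")
    return (N - n) / (N - 1)

def corrected_standard_error(sd, n, N):
    """Standard error of a sample mean with the finite population correction.

    `sd` is the population standard deviation (divisor N), an assumption here.
    """
    return sd / sqrt(n) * sqrt(fpc_variance_factor(n, N))

# A sample of 25 from a population of 100: a noticeable reduction.
print(corrected_standard_error(10.0, 25, 100))
# A census, n = N (for example all 50 states): the factor is zero.
print(fpc_variance_factor(50, 50))
# A very large, hypothetical population: the factor is close to one.
print(fpc_variance_factor(50, 10**6))
```

The last two calls show the two interpretations side by side: treating the observed units as the whole population drives the variance to zero, while treating them as draws from a very large collective leaves the usual formulas essentially unchanged.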
The philosophical interpretation that eliminates the problem in general is based on a distinction between existent and hypothetical populations (Kendall et al. (, Sect. 1.29, pp. 22–23)). The work by von Mises [42, pp. 98–99], in which the word “collective” refers to the more conventional “population”, makes a distinction between the idea that “the calculus of probability deals each time merely with one single collective, whose distribution is subjected to certain summations or integrations”, i.e. the “restriction to one single initial collective”, and the “admissible distribution functions, the nature of the sub-sets of the attribute space for which probabilities can be defined, etc”. That is, there is a large set of possible combinations of attributes of the states, people, places, or things studied, and that set is the collective which is studied. There could be a million different versions of New York State, only one of which is in fact observed. This makes n/N go to zero and justifies standard empirical inferential statistics.
Returning to Kendall et al. (), the sample can be less than the entire population because some people were not included, but could in principle have been, or because not every possible toss of a die has been recorded, which could not even in principle be done. Those other tosses have a hypothetical existence. The hypothetical population also applies to people—in Kendall et al. (, p. 23), to tsetse flies—as their condition and attributes could be different in a vast number of ways.
It must be emphasized that the distinction between existent and hypothetical populations is not merely a matter of ontological speculation—if it were we could safely ignore it—but one of practical importance when inferences are drawn about a population from a sample generated by it (Kendall et al. (, p. 23)).
Kendall et al. (, Sect. 9.4, p. 292) pursue the idea of a hypothetical population further. If empirical probability is “a limiting relative frequency”, then the limit concept requires that samples go to infinity in principle, and “we must be able to contemplate a series of replications under identical conditions and to specify the possible values that might be realized” (Kendall et al. (, p. 292)). Then “we must be willing that our observed value was randomly selected from the set of possibilities. At first sight, this is a rather baffling conception. However, such a structure is fundamental in a frequency-based theory of inference, and the approach is justified as being empirically useful” (p. 293).
The explanations of empirical probability by von Mises repeatedly refer to the “limiting value of relative frequencies” (pp. 12, 21, 82, 105, 110, 124, 226, 229). The alternative to the argument of infinite sequences of observations from a hypothetical population is a “finite collective” (pp. 82–83). “There is no doubt about the fact that the sequences of observations to which the theory of probability is applied in practice are all finite” (p. 82). That is, samples are finite, and all states might be observed once. Infinite sequences are not substituted for the finite sequences. Probability and the resulting statistics are calculated based on the finite sequences. Hempel similarly argued that “the results of a theory based on the notion of an infinite collective can be applied to finite sequences of observations” (von Mises [42, p. 85]).
In econometrics, the assumption is that there is a disturbance term in addition to the explanatory variables, so that given any set of explanatory variables, an infinite number of results (an infinite population or collective) is possible for any state, person, place, or thing. The mean, variance, and possibly the probability distribution could be estimated with enough data.