The sampling distribution of categorical data is determined by the observational procedure applied, not by assumptions about the statistical model that characterizes the population. The sampling distribution is said to be multinomial if the number of observations is fixed in advance (binomial if there are only two categories), product multinomial if certain subsamples have pre-specified sizes, and Poisson if it is not the sample size but rather the time period or geographic extent of sampling that is specified in advance. This chapter gives precise definitions of these sampling procedures and discusses their most important characteristics. Marginalization and conditioning are the transformations most frequently applied to categorical data when relationships among variables are studied, and their implications for the sampling distributions are also described. The relationship between the multinomial and the Poisson distributions is investigated in detail. Statistical modeling mostly concentrates on data obtained through one of the above sampling procedures, but most surveys of human populations apply sampling procedures in which individuals have unequal selection probabilities. These procedures are reviewed briefly.
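As a small illustration of the multinomial and Poisson relationship mentioned above, the following sketch checks numerically the standard result that independent Poisson counts, conditioned on their total, follow a multinomial distribution with cell probabilities proportional to the Poisson rates. The rates, the counts, and all function names are hypothetical choices made for this example, not notation taken from the chapter.

```python
from math import exp, factorial, prod

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def multinomial_pmf(counts, probs):
    """P(counts) for a multinomial with n = sum(counts) and cell probabilities probs."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    return coef * prod(p**c for p, c in zip(probs, counts))

def conditional_poisson_pmf(counts, rates):
    """P(counts | total) for independent Poisson cells with the given rates."""
    joint = prod(poisson_pmf(c, lam) for c, lam in zip(counts, rates))
    # The total of independent Poisson cells is Poisson with the summed rate.
    total = poisson_pmf(sum(counts), sum(rates))
    return joint / total

# Hypothetical example values (assumptions for illustration only).
rates = [2.0, 3.0, 5.0]
counts = [1, 2, 3]
probs = [lam / sum(rates) for lam in rates]

lhs = conditional_poisson_pmf(counts, rates)  # Poisson cells, given the total
rhs = multinomial_pmf(counts, probs)          # multinomial with fixed n
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, which is the sense in which fixing the total of a Poisson sample leads back to multinomial sampling.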