Large Scale Multinomial Inferences and Its Applications in Genome Wide Association Studies
Statistical analysis of multinomial counts with a large number K of categories and a small sample size n is challenging for both frequentist and Bayesian methods and requires thinking about statistical inference at a very fundamental level. Following the framework of the Dempster-Shafer theory of belief functions, a probabilistic inferential model is proposed for this “large K and small n” problem. Using a data-generating device, the inferential model produces a probability triplet (p, q, r) for an assertion conditional on the observed data. The probabilities p and q are for and against the truth of the assertion, whereas r = 1 − p − q is the remaining probability, called the probability of “don’t know”. The new inference method is applied in a genome-wide association study with very-high-dimensional count data to identify associations between genetic variants and the disease rheumatoid arthritis.
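The triplet (p, q, r) can be illustrated with a minimal sketch. This is not the paper's construction. It assumes, purely for illustration, that the quantity of interest is a scalar, that Monte Carlo draws of random sets are already available as closed intervals, and that the assertion is itself an interval. The function name `triplet` and its arguments are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def triplet(random_sets: List[Interval], assertion: Interval) -> Tuple[float, float, float]:
    """Estimate (p, q, r) for an interval assertion from Monte Carlo random sets.

    Assumption (not from the paper): each random set is a closed interval
    (lo, hi) with lo <= hi, and the assertion is a closed interval (a_lo, a_hi).
    """
    a_lo, a_hi = assertion
    n = len(random_sets)
    if n == 0:
        raise ValueError("need at least one random set")

    # Evidence "for": the random set lies entirely inside the assertion.
    n_for = sum(1 for lo, hi in random_sets if a_lo <= lo and hi <= a_hi)
    # Evidence "against": the random set lies entirely outside the assertion.
    n_against = sum(1 for lo, hi in random_sets if hi < a_lo or lo > a_hi)

    p = n_for / n
    q = n_against / n
    # "Don't know": random sets that straddle the boundary of the assertion.
    r = 1.0 - p - q
    return p, q, r


if __name__ == "__main__":
    draws = [(0.1, 0.2), (0.3, 0.6), (0.7, 0.9), (0.0, 0.4)]
    print(triplet(draws, (0.0, 0.5)))
```

In this toy input, two intervals fall inside the assertion, one falls outside, and one straddles its boundary, so the straddling draw is what contributes to the “don’t know” probability r.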
Keywords: Monte Carlo Sample, Multinomial Distribution, Belief Function, Probabilistic Inference, Inferential Model