Slicing and Dicing the Genome: A Statistical Physics Approach to Population Genetics
- Cite this article as:
- Maruvka, Y.E., Shnerb, N.M., Solomon, S. et al. J Stat Phys (2011) 142: 1302. doi:10.1007/s10955-010-0113-7
The inference of past demographic parameters from current genetic polymorphism is a fundamental problem in population genetics. The standard techniques utilize a reconstruction of the gene genealogy, a cumbersome process that may be applied only to small numbers of sequences. We present a method that compares the total number of haplotypes (distinct sequences) with the model prediction. By chopping the DNA sequence into pieces, we condense the immense information hidden in sequence space into a function for the number of haplotypes versus subsequence size. The details of this curve are robust to statistical fluctuations and are seen to reflect the process parameters. This procedure allows for a clear visualization of the quality of the fit and, crucially, the numerical complexity grows only linearly with the number of sequences. Our procedure is tested against both simulated data and empirical mtDNA data from China, and provides excellent fits in both cases.
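As a rough illustration of the haplotype-versus-subsequence-size curve described above, the following minimal Python sketch counts distinct subsequences in a set of aligned sequences. The abstract does not specify how the sequence is chopped, so the choice of contiguous, non-overlapping windows averaged per window size is an assumption made here, and the function name `haplotype_curve` and the toy data are hypothetical. Comparing the resulting curve with a model prediction is not shown.

```python
from statistics import mean


def haplotype_curve(sequences, sizes):
    """Mean number of distinct subsequences (haplotypes) per window size.

    Assumption (not from the paper): the alignment is cut into contiguous,
    non-overlapping windows of each requested size, and the haplotype
    counts are averaged over those windows.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")

    curve = {}
    for size in sizes:
        if not 0 < size <= length:
            raise ValueError(f"window size {size} is out of range")
        # One set per window: building it costs time linear in the
        # number of sequences.
        counts = [
            len({seq[start:start + size] for seq in sequences})
            for start in range(0, length - size + 1, size)
        ]
        curve[size] = mean(counts)
    return curve


# Hypothetical toy alignment of four sequences, four sites each.
toy = ["AAAA", "AAAT", "ATAT", "ATAT"]
curve = haplotype_curve(toy, [1, 2, 4])
for size, count in curve.items():
    print(size, count)
```

Each window's count is obtained by inserting every sequence into a set once, which is consistent with the linear growth in the number of sequences mentioned in the abstract, although the published implementation may differ.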
Keywords: Galton-Watson theory, Haplotype statistics, Population genetics