Statistics in Biosciences, Volume 5, Issue 1, pp 3-25

Single Nucleotide Polymorphism (SNP) Detection and Genotype Calling from Massively Parallel Sequencing (MPS) Data

  • Yun Li (email author), affiliated with the Department of Genetics, the Department of Biostatistics, and the Department of Computer Science, University of North Carolina
  • Wei Chen, affiliated with the Division of Pediatric Pulmonary Medicine, Allergy and Immunology, Department of Pediatrics, Children’s Hospital of Pittsburgh of UPMC, University of Pittsburgh School of Medicine
  • Eric Yi Liu, affiliated with the Department of Computer Science, University of North Carolina
  • Yi-Hui Zhou, affiliated with the Department of Biostatistics, University of North Carolina


Massively parallel sequencing (MPS), since its debut in 2005, has transformed the field of genomic studies. These new sequencing technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They have also begun to deliver on their promise to explain some of the missing heritability from genome-wide association studies (GWAS) of complex traits. We anticipate a rapidly growing number of MPS-based studies for a diverse range of applications in the near future. One crucial and nearly inevitable step is to detect SNPs and call genotypes at the detected polymorphic sites from the sequencing data. Here, we review statistical methods that have been proposed in the past five years for this purpose. In addition, we discuss emerging issues and future directions related to SNP detection and genotype calling from MPS data.


Massively parallel sequencing; Next-generation sequencing; SNP detection; Genotype calling; Linkage disequilibrium (LD)