Computationally efficient map construction in the presence of segregation distortion
Key message
We present a novel and highly computationally efficient estimator for map construction in the presence of segregation distortion. For multi-parental designs, this estimator outperforms methods that do not account for segregation distortion, at no extra computational cost.
Abstract
Inclusion of genetic markers exhibiting segregation distortion in a linkage map can result in biased estimates of genetic distance and distortion of map positions. Removal of distorted markers is hence a typical filtering criterion; however, this may result in exclusion of biologically interesting regions of the genome such as introgressions and translocations. Estimation of additional parameters characterizing the distortion is computationally slow, as it relies on the expectation-maximization (EM) algorithm or on higher-dimensional numerical optimisation. We propose a robust M-estimator (RM) capable of handling tens of thousands of distorted markers from a single linkage group. We show via simulation that for multi-parental designs the RM estimator can perform much better than uncorrected estimation, at no extra computational cost. We then apply the RM estimator to chromosome 2B in wheat in a multi-parent population segregating for the Sr36 introgression, a known transmission distorter. The resulting map contains over 700 markers, and is consistent with maps constructed from crosses which do not exhibit segregation distortion.
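The filtering step mentioned in the abstract, discarding markers that show segregation distortion, is commonly implemented as a goodness-of-fit test on allele counts. The sketch below is illustrative only and is not taken from the paper. It assumes a biallelic marker in a population where both alleles are expected in a 1:1 ratio (for example a backcross or doubled-haploid population); the function name `distortion_p_value`, the toy counts and the threshold `alpha` are hypothetical.

```python
import math


def distortion_p_value(n_a: int, n_b: int) -> float:
    """Chi-square goodness-of-fit p-value against a 1:1 allele ratio (1 d.f.)."""
    n = n_a + n_b
    if n == 0:
        raise ValueError("no genotyped lines")
    expected = n / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so no external library is needed.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Toy allele counts per marker (assumed data, not from the paper).
counts = {"m1": (98, 102), "m2": (140, 60)}
alpha = 0.01  # assumed significance threshold

# Markers that a conventional filter would discard as distorted.
flagged = [name for name, (a, b) in counts.items()
           if distortion_p_value(a, b) < alpha]
print(flagged)
```

A filter of this kind would drop `m2` from the map entirely, which is the loss of potentially interesting regions that the abstract argues against.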
Keywords
Segregation Distortion, Recombination Fraction, Distorted Marker, Biallelic Marker, Extra Computational Cost
List of symbols
- \(M_1, M_2, M_3\)
Genetic markers
- \(M_s\)
Segregation distortion locus (SDL)
- \(r_{12}, r_{23}, r_{13}\)
Recombination fractions between markers \(M_1, M_2\) and \(M_3\)
- \(r_{1s}, r_{s3}\)
Recombination fractions between \(M_s\) and markers \(M_{1}, M_{3}\)
- \(g_y(t)\)
Expected proportion of allele \(y\) at \(M_2\)
- \(n_{xyz}\)
Number of lines with multilocus genotype \(x, y, z\) at markers \(M_1, M_2, M_3\)
- \(n_{x.z}\)
Number of lines with multilocus genotype \(x, z\) at markers \(M_1, M_3\)
- \(\mathbb P_d\)
Distorted probability model
- \(\mathbb P_{d,f}\)
Distorted probability model for MAGIC8 population, assuming funnel \(f\)
- \(\mathbb P_{u,f}\)
Undistorted probability model for MAGIC8 population, assuming funnel \(f\)
- \(\mathbb P_u\)
Undistorted probability model
- \(p_{f}\)
Proportion of lines from funnel \(f\) in a MAGIC8 population
- \(p_{xyzf}\)
Proportion of lines from funnel \(f\) having multi-locus genotype \(x, y, z\) at markers \(M_1, M_2, M_3\)
- \(p_{.y.f}\)
Proportion of lines from funnel \(f\) having genotype \(y\) at marker \(M_2\)
- \(\hat{p}_{.y.}\)
Empirical proportion of lines having genotype \(y\) at \(M_2\)
- \(\hat{p}_{x.z}\)
Empirical proportion of lines having multi-locus genotype \(x, z\) at \(M_1, M_3\)
- \(\hat{p}_{xyz}\)
Empirical proportion of lines having multi-locus genotype \(x, y, z\) at \(M_1, M_2, M_3\)
- \(p_{xyz} (r_{12}, r_{23} )\)
Probability of multi-locus genotype \(x, y, z\) at markers \(M_1\), \(M_2\), \(M_3\), with given recombination fractions and no distortion
- \(p_{.y.}\)
Proportion of lines having genotype \(y\) at \(M_2\)
- \(G\)
Number of underlying genotypes at each marker
- \(F\)
Set of all funnels for the MAGIC8 population
- \(|F|\)
Number of funnels for the MAGIC8 population
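To make the dot notation in the list above concrete, the following sketch computes the empirical proportions \(\hat{p}_{xyz}\), \(\hat{p}_{x.z}\) and \(\hat{p}_{.y.}\) from genotype counts, where a dot means the corresponding marker has been summed over. It is a minimal illustration rather than the paper's implementation: the toy data, the coding of genotypes as integers with \(G = 2\), and the helper name `empirical_proportions` are assumptions.

```python
from collections import Counter


def empirical_proportions(genotypes):
    """Return (p_xyz, p_x_z, p_y) for a list of (x, y, z) genotype tuples.

    Each tuple holds one line's genotypes at markers M1, M2, M3.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no lines")
    # n_xyz: lines with genotype (x, y, z) at M1, M2, M3.
    n_xyz = Counter(genotypes)
    # n_x.z: lines with genotype (x, z) at M1, M3, summing over M2.
    n_x_z = Counter((x, z) for x, _, z in genotypes)
    # Counts of genotype y at M2, summing over M1 and M3.
    n_y = Counter(y for _, y, _ in genotypes)
    p_xyz = {g: c / n for g, c in n_xyz.items()}
    p_x_z = {g: c / n for g, c in n_x_z.items()}
    p_y = {g: c / n for g, c in n_y.items()}
    return p_xyz, p_x_z, p_y


# Toy population of 10 lines (assumed data).
lines = [(0, 0, 0)] * 4 + [(1, 1, 1)] * 3 + [(0, 1, 1)] * 2 + [(0, 0, 1)]
p_xyz, p_x_z, p_y = empirical_proportions(lines)
print(p_xyz[(0, 0, 0)], p_x_z[(0, 1)], p_y[1])
```

Computing the marginals from counts rather than by summing the three-locus proportions avoids accumulating floating-point error when there are many genotype classes.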
Notes
Acknowledgments
Dr Huang is the recipient of an Australian Research Council Discovery Early Career Researcher Award (project number DE120101127).
Conflict of interest
The authors declare that they have no conflict of interest.