Fine human genetic map based on UK10K data set

  • Original Investigation
Human Genetics

Abstract

Recombination is a major force shaping genetic diversity. Accurate estimation of the recombination rate is important and can, in theory, be improved by increasing the sample size. However, traditional population genetics methods are so computationally demanding that estimating recombination rates from large samples is nearly impossible. In this study, we used a refined machine learning approach to estimate the recombination rate of the human genome from the UK10K human genomic data set of 7,562 genomic sequences and from three of its subsets containing 200, 400 and 2,000 genomic sequences. The estimation was performed under the human Out-of-Africa demographic model. We not only obtained an accurate human genetic map, but also found that the fluctuation of the estimated recombination rate along the human genome decreases as the sample size increases. The recombination rate heterogeneity estimated from the full UK10K data set is lower than that estimated from its subsets. Our results demonstrate how sample size affects the estimated recombination rate and show that analyzing a larger number of genomes yields a more precise estimate. The accurate genetic map based on the UK10K data set is also expected to benefit other research in human biology.

References

  • Altemose N, Noor N, Bitoun E, Tumian A, Imbeault M, Chapman JR, Aricescu AR, Myers SR (2017) A map of human PRDM9 binding provides evidence for novel behaviors of PRDM9 and other zinc-finger proteins in meiosis. Elife 6:e28383

  • Altshuler DM, Durbin RM, Abecasis GR, Bentley DR, Chakravarti A, Clark AG, Donnelly P, Eichler EE, Flicek P, Gabriel SB, Gibbs RA, Green ED, Hurles ME, Knoppers BM, Korbel JO, Lander ES, Lee C, Lehrach H, Mardis ER, Marth GT, McVean GA, Nickerson DA, Schmidt JP, Sherry ST, Wang J, Wilson RK, Gibbs RA, Dinh H, Kovar C, Lee S, Lewis L, Muzny D, Reid J, Wang M, Wang J, Fang XD, Guo XS, Jian M, Jiang H, Jin X, Li GQ, Li JX, Li YR, Li Z, Liu X, Lu Y, Ma XD, Su Z, Tai SS, Tang MF, Wang B, Wang GB, Wu HL, Wu RH, Yin Y, Zhang WW, Zhao J, Zhao MR, Zheng XL, Zhou Y, Lander ES, Altshuler DM, Gabriel SB, Gupta N, Flicek P, Clarke L, Leinonen R, Smith RE, Zheng-Bradley X, Bentley DR, Grocock R, Humphray S, James T, Kingsbury Z, Lehrach H, Sudbrak R, Albrecht MW, Amstislavskiy VS, Borodina TA, Lienhard M, Mertes F, Sultan M, Timmermann B, Yaspo ML, Sherry ST, McVean GA, Mardis ER, Wilson RK, Fulton L, Fulton R, Weinstock GM, Durbin RM, Balasubramaniam S, Burton J, Danecek P, Keane TM, Kolb-Kokocinski A, McCarthy S, Stalker J, Quail M et al (2012) An integrated map of genetic variation from 1092 human genomes. Nature 491:56–65

  • Arbeithuber B, Betancourt AJ, Ebner T, Tiemann-Boege I (2015) Crossovers are associated with mutation and biased gene conversion at recombination hotspots. Proc Natl Acad Sci USA 112:2109–2114

  • Ardlie KG, Kruglyak L, Seielstad M (2002) Patterns of linkage disequilibrium in the human genome. Nat Rev Genet 3:299–309

  • Auton A, McVean G (2007) Recombination rate estimation in the presence of hotspots. Genome Res 17:1219–1227

  • Bell AD, Mello CJ, Nemesh J, Brumbaugh SA, Wysoker A, McCarroll SA (2020) Insights into variation in meiosis from 31,228 human sperm genomes. Nature 583:259–264

  • Buhlmann P (2006) Boosting for high-dimensional linear models. Ann Stat 34:559–583

  • Coop G, Przeworski M (2007) An evolutionary view of human recombination. Nat Rev Genet 8:23–34

  • Cullen M, Perfetto SP, Klitz W, Nelson G, Carrington M (2002) High-resolution patterns of meiotic recombination across the human major histocompatibility complex. Am J Hum Genet 71:759–776

  • Dapper AL, Payseur BA (2018) Effects of demographic history on the detection of recombination hotspots from linkage disequilibrium. Mol Biol Evol 35:335–353

  • Fearnhead P, Donnelly P (2001) Estimating recombination rates from population genetic data. Genetics 159:1299–1318

  • Flagel L, Brandvain Y, Schrider DR (2019) The unreasonable effectiveness of convolutional neural networks in population genetic inference. Mol Biol Evol 36:220–238

  • Francioli LC, Polak PP, Koren A, Menelaou A, Chun S, Renkens I, van Duijn CM, Swertz M, Wijmenga C, van Ommen G, Slagboom PE, Boomsma DI, Ye K, Guryev V, Arndt PF, Kloosterman WP, de Bakker PIW, Sunyaev SR, Genome of the Netherlands Consortium (2015) Genome-wide patterns and properties of de novo mutations in humans. Nat Genet 47:822–826

  • Fu YX (2006) Exact coalescent for the Wright-Fisher model. Theor Popul Biol 69:385–394

  • Gao F, Ming C, Hu WJ, Li HP (2016) New software for the fast estimation of population recombination rates (FastEPRR) in the genomic era. G3 6:1563–1571

  • Gartner K, Futschik A (2016) Improved versions of common estimators of the recombination rate. J Comput Biol 23:756–768

  • Gravel S, Henn BM, Gutenkunst RN, Indap AR, Marth GT, Clark AG, Yu FL, Gibbs RA, Bustamante CD, 1000 Genomes Project (2011) Demographic history and rare allele sharing among human populations. Proc Natl Acad Sci USA 108:11983–11988

  • Halldorsson BV, Palsson G, Stefansson OA, Jonsson H, Hardarson MT, Eggertsson HP, Gunnarsson B, Oddsson A, Halldorsson GH, Zink F, Gudjonsson SA, Frigge ML, Thorleifsson G, Sigurdsson A, Stacey SN, Sulem P, Masson G, Helgason A, Gudbjartsson DF, Thorsteinsdottir U, Stefansson K (2019) Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363:eaau1043

  • Hassan S, Surakka I, Taskinen MR, Salomaa V, Palotie A, Wessman M, Tukiainen T, Pirinen M, Palta P, Ripatti S (2021) High-resolution population-specific recombination rates and their effect on phasing and genotype imputation. Eur J Hum Genet 29:615–624

  • Hernandez RD, Kelley JL, Elyashiv E, Melton SC, Auton A, McVean G, Sella G, Przeworski M, 1000 Genomes Project (2011) Classic selective sweeps were rare in recent human evolution. Science 331:920–924

  • Hill WG, Robertson A (1968) Linkage disequilibrium in finite populations. Theor Appl Genet 38:226–231

  • Hothorn T, Buehlmann P, Kneib T, Schmid M, Hofner B (2018) mboost: Model-Based Boosting, R package version 2.9–1, https://CRAN.R-project.org/package=mboost.

  • Hu WJ, Hao ZQ, Du PY, Di Vincenzo F, Manzi G, Pan YH, Li H (2021) Genomic inference of a human super bottleneck in Mid-Pleistocene transition. bioRxiv. https://doi.org/10.1101/2021.05.16.444351

  • Hudson RR (2001) Two-locus sampling distributions and their application. Genetics 159:1805–1817

  • Hudson RR (2002) Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18:337–338

  • Hussin JG, Hodgkinson A, Idaghdour Y, Grenier JC, Goulet JP, Gbeha E, Hip-Ki E, Awadalla P (2015) Recombination affects accumulation of damaging and disease-associated mutations in human populations. Nat Genet 47:400–404

  • Jeffreys AJ, Kauppi L, Neumann R (2001) Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet 29:217–222

  • Kamm JA, Spence JP, Chan J, Song YS (2016) Two-locus likelihoods under variable population size and fine-scale recombination rate estimation. Genetics 203:1381–1399

  • Keinan A, Reich D (2010) Human population differentiation is strongly correlated with local recombination rate. PLoS Genet 6:e1000886

  • Kingman JFC (1982) On the genealogy of large populations. J Appl Probab 19:27–43

  • Kong A, Thorleifsson G, Gudbjartsson DF, Masson G, Sigurdsson A, Jonasdottir A, Walters GB, Jonasdottir A, Gylfason A, Kristinsson KT, Gudjonsson SA, Frigge ML, Helgason A, Thorsteinsdottir U, Stefansson K (2010) Fine-scale recombination rate differences between sexes, populations and individuals. Nature 467:1099–1103

  • Kong A, Thorleifsson G, Frigge ML, Masson G, Gudbjartsson DF, Villemoes R, Magnusdottir E, Olafsdottir SB, Thorsteinsdottir U, Stefansson K (2014) Common and low-frequency variants associated with genome-wide recombination rate. Nat Genet 46:11–16

  • Li H, Stephan W (2005) Maximum-likelihood methods for detecting recent positive selection and localizing the selected site in the genome. Genetics 171:377–384

  • Li N, Stephens M (2003) Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165:2213–2233

  • Lin K, Li H, Schlotterer C, Futschik A (2011) Distinguishing positive selection from neutral evolution: Boosting the performance of summary statistics. Genetics 187:229–244

  • Lin K, Futschik A, Li H (2013) A fast estimate for the population recombination rate based on regression. Genetics 194:473–484

  • McVean G, Awadalla P, Fearnhead P (2002) A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231–1241

  • McVean GAT, Myers SR, Hunt S, Deloukas P, Bentley DR, Donnelly P (2004) The fine-scale structure of recombination rate variation in the human genome. Science 304:581–584

  • Miretti MM, Walsh EC, Ke XY, Delgado M, Griffiths M, Hunt S, Morrison J, Whittaker P, Lander ES, Cardon LR, Bentley DR, Rioux JD, Beck S, Deloukas P (2005) A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms. Am J Hum Genet 76:634–646

  • Myers S, Bottolo L, Freeman C, McVean G, Donnelly P (2005) A fine-scale map of recombination rates and hotspots across the human genome. Science 310:321–324

  • O’Connell J, Gurdasani D, Delaneau O, Pirastu N, Ulivi S, Cocca M, Traglia M, Huang J, Huffman JE, Rudan I, McQuillan R, Fraser RM, Campbell H, Polasek O, Asiki G, Ekoru K, Hayward C, Wright AF, Vitart V, Navarro P, Zagury JF, Wilson JF, Toniolo D, Gasparini P, Soranzo N, Sandhu MS, Marchini J (2014) A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet 10:e1004234

  • Ohta T, Kimura M (1971) Linkage disequilibrium between two segregating nucleotide sites under the steady flux of mutations in a finite population. Genetics 68:571–580

  • Pavlidis P, Jensen JD, Stephan W (2010) Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations. Genetics 185:907–922

  • Payseur BA, Rieseberg LH (2016) A genomic perspective on hybridization and speciation. Mol Ecol 25:2337–2360

  • Price AL, Tandon A, Patterson N, Barnes KC, Rafaels N, Ruczinski I, Beaty TH, Mathias R, Reich D, Myers S (2009) Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet 5:e1000519

  • R Core Team (2019) R: A language and environment for statistical computing.

  • Sall T, Nilsson NO (1994) The robustness of recombination frequency estimates in intercrosses with dominant markers. Genetics 137:589–596

  • Schiffels S, Durbin R (2014) Inferring human population size and separation history from multiple genome sequences. Nat Genet 46:919–925

  • Schrider DR, Kern AD (2018) Supervised machine learning for population genetics: a new paradigm. Trends Genet 34:301–312

  • Schumer M, Xu CL, Powell DL, Durvasula A, Skov L, Holland C, Blazier JC, Sankararaman S, Andolfatto P, Rosenthal GG, Przeworski M (2018) Natural selection interacts with recombination to shape the evolution of hybrid genomes. Science 360:656–659

  • Spence JP, Song YS (2019) Inference and analysis of population-specific fine-scale recombination maps across 26 diverse human populations. Sci Adv 5:eaaw9206

  • Stapley J, Feulner PGD, Johnston SE, Santure AW, Smadja CM (2017) Recombination: the good, the bad and the variable. Philos Trans R Soc Lond B Biol Sci 372:20170279

  • Stevison LS, Woerner AE, Kidd JM, Kelley JL, Veeramah KR, McManus KF, Bustamante CD, Hammer MF, Wall JD (2016) The time scale of recombination rate evolution in Great Apes. Mol Biol Evol 33:928–945

  • Terhorst J, Kamm JA, Song YS (2017) Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat Genet 49:303–309

  • van Eeden G, Uren C, Moller M, Henn BM (2021) Inferring recombination patterns in African populations. Hum Mol Genet 30:R11–R16

  • Wall JD (2000) A comparison of estimators of the population recombination rate. Mol Biol Evol 17:156–163

  • Walter K, Min JL, Huang J, Crooks L, Memari Y, McCarthy S, Perry JRB, Xu C, Futema M, Lawson D, Iotchkova V, Schiffels S, Hendricks AE, Danecek P, Li R, Floyd J, Wain LV, Barroso I, Humphries SE, Hurles ME, Zeggini E, Barrett JC, Plagnol V, Richards JB, Greenwood CMT, Timpson NJ, Durbin R, Soranzo N, Bala S, Clapham P, Coates G, Cox T, Daly A, Danecek P, Du Y, Durbin R, Edkins S, Ellis P, Flicek P, Guo X, Guo X, Huang L, Jackson DK, Joyce C, Keane T, Kolb-Kokocinski A, Langford C, Li Y, Liang J, Lin H, Liu R, Maslen J, McCarthy S, Muddyman D, Quail MA, Stalker J, Sun J, Tian J, Wang G, Wang J, Wang Y, Wong K, Zhang P, Barroso I, Birney E, Boustred C, Chen L, Clement G, Cocca M, Danecek P, Smith GD, Day INM, Day-Williams A, Down T, Dunham I, Durbin R, Evans DM, Gaunt TR, Geihs M, Greenwood CMT, Hart D, Hendricks AE, Howie B, Huang J, Hubbard T, Hysi P, Iotchkova V, Jamshidi Y, Karczewski KJ, Kemp JP, Lachance G, Lawson D, Lek M, Lopes M, MacArthur DG, Marchini J, Mangino M, Mathieson I, McCarthy S, Memari Y et al (2015) The UK10K project identifies rare variants in health and disease. Nature 526:82–89

  • Wang GD, Larson G, Kidd JM, vonHoldt BM, Ostrander EA, Zhang YP (2019) Dog10K: the international consortium of canine genome sequencing. Natl Sci Rev 6:611–613

  • Webb AJ, Berg IL, Jeffreys A (2008) Sperm cross-over activity in regions of the human genome showing extreme breakdown of marker association. Proc Natl Acad Sci USA 105:10471–10476

  • Wegmann D, Kessner DE, Veeramah KR, Mathias RA, Nicolae DL, Yanek LR, Sun YV, Torgerson DG, Rafaels N, Mosley T, Becker LC, Ruczinski I, Beaty TH, Kardia SLR, Meyers DA, Barnes KC, Becker DM, Freimer NB, Novembre J (2011) Recombination rates in admixed individuals identified by ancestry-based inference. Nat Genet 43:847–853

  • Weiss KM, Clark AG (2002) Linkage disequilibrium and the mapping of complex human traits. Trends Genet 18:19–24

  • Wirtz J, Wiehe T (2019) The evolving Moran genealogy. Theor Popul Biol 130:94–105

  • Wu RG, Li HX, Peng D, Li R, Zhang YM, Hao B, Huang EW, Zheng CH, Sun HY (2019) Revisiting the potential power of human leukocyte antigen (HLA) genes on relationship testing by massively parallel sequencing-based HLA typing in an extended family. J Hum Genet 64:29–38

  • Yu DL, Dong LL, Yan FQ, Mu HL, Tang BX, Yang X, Zeng T, Zhou Q, Gao F, Wang ZH, Hao ZQ, Kang HE, Zheng Y, Huang HW, Wei YZ, Pan W, Xu YC, Zhu JW, Zhao SL, Wang CR, Wang PY, Dai L, Li MS, Lan L, Wang YW, Chen H, Li YX, Fu YX, Shao Z, Bao YM, Zhao FQ, Chen LN, Zhang GQ, Zhao WM, Li HP (2019) eGPS 1.0: Comprehensive software for multi-omic and evolutionary analyses. Natl Sci Rev 6:867–869

Acknowledgements

We thank the UK10K Project Consortium for sharing the data.

Funding

This work was supported by grants from the National Natural Science Foundation of China (nos. 31100273, 31172073, 91131010), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDB38030100), the National Key Research and Development Project (No. 2020YFC0847000) and funding from the Shanghai Institute of Nutrition and Health (No. JBGSRWBD-SINH-2021-10).

Author information

Contributions

ZH, PD, YHP, and HL conceived and designed the research; ZH and HL wrote the code; ZH and PD analyzed the data; ZH, PD, YHP, and HL wrote the paper.

Corresponding authors

Correspondence to Yi-Hsuan Pan or Haipeng Li.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Availability of data and material

The datasets used in this study are available from the UK10K Project Consortium (Walter et al. 2015) (https://www.uk10k.org/). The genetic maps of the OMNI data set built by LDhat (Auton and McVean 2007) were downloaded from the 1000 Genomes Project (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20130507_omni_recombination_rates).

Code availability

FastEPRR 2.0 is written in R and integrated on the eGPS cloud (Yu et al. 2019) (http://www.egps-software.net). The desktop version and the genetic maps established in this study are freely available on the institute website (https://www.picb.ac.cn/evolgen/).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 382 KB)

About this article

Cite this article

Hao, Z., Du, P., Pan, YH. et al. Fine human genetic map based on UK10K data set. Hum Genet 141, 273–281 (2022). https://doi.org/10.1007/s00439-021-02415-8

  • DOI: https://doi.org/10.1007/s00439-021-02415-8
