A Compositional Approach to Allele Sharing Analysis

  • I. Galván-Femenía
  • J. Graffelman
  • C. Barceló-i-Vidal
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 187)

Abstract

Relatedness is of great interest in population-based genetic association studies, which search for genetic factors related to disease. Many statistical methods used in these studies (such as standard regression models, t-tests, and logistic regression) assume that the observations (individuals) are independent, and they can fail when this assumption does not hold. Allele sharing is a powerful data analysis technique for analyzing the degree of dependence in diploid species. Two individuals can share 0, 1, or 2 alleles at any genetic marker. This sharing may be assessed for alleles identical by state (IBS) or identical by descent (IBD). Starting from IBS alleles, it is possible to detect the type of relationship between a pair of individuals by using graphical methods. A typical allele sharing analysis plots the fraction of loci sharing 2 IBS alleles against the fraction of loci sharing 0 IBS alleles. Compositional data analysis can be applied to allele sharing analysis because the proportions of loci sharing 0, 1, or 2 IBS alleles (denoted by \(p_0\), \(p_1\), and \(p_2\)) form a 3-part composition. This chapter provides a graphical method to detect family relationships by plotting the isometric log-ratio transformation of \(p_0\), \(p_1\), and \(p_2\). The probabilities of sharing 0, 1, or 2 IBD alleles (denoted by \(k_0, k_1, k_2\)), termed Cotterman’s coefficients, depend in turn on the type of relatedness: monozygotic twins, full siblings, parent-offspring, avuncular, first cousins, etc. These coefficients can be estimated by maximum likelihood methods in order to infer the type of family relationship of a pair of individuals. The estimated vector \({\hat{\varvec{k}}}=(\hat{k}_0, \hat{k}_1,\hat{k}_2)\) for each pair of individuals then forms a 3-part composition and can be plotted in a ternary diagram to identify the degree of relatedness.
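
As a minimal sketch of the IBS part of this analysis, the Python code below counts the alleles shared identical by state at each locus, forms the composition \((p_0, p_1, p_2)\) for one pair of individuals, and maps it to two isometric log-ratio coordinates. The function names, the encoding of a genotype as a pair of allele labels, and the particular orthonormal basis used for the transformation are assumptions made for illustration; the chapter may use a different basis, and this is not the authors' R package.

```python
import math


def ibs_count(g1, g2):
    """Number of alleles (0, 1 or 2) shared identical by state at one locus.

    Each genotype is an unordered pair of allele labels, e.g. ("A", "G").
    """
    a, b = g1
    c, d = g2
    # Try both ways of pairing the alleles and keep the better match.
    direct = (a == c) + (b == d)
    crossed = (a == d) + (b == c)
    return max(direct, crossed)


def ibs_proportions(ind1, ind2):
    """Proportions (p0, p1, p2) of loci sharing 0, 1 or 2 IBS alleles."""
    counts = [0, 0, 0]
    for g1, g2 in zip(ind1, ind2):
        counts[ibs_count(g1, g2)] += 1
    n = sum(counts)
    return tuple(c / n for c in counts)


def ilr(p):
    """Isometric log-ratio coordinates of a 3-part composition.

    Assumed basis (one common choice, not necessarily the chapter's):
        z1 = ln(p0 / p1) / sqrt(2)
        z2 = ln(p0 * p1 / p2**2) / sqrt(6)
    """
    p0, p1, p2 = p
    if min(p) <= 0:
        # Log-ratios are undefined for zero parts; no zero replacement
        # is attempted in this sketch.
        raise ValueError("ilr is undefined when a part is zero")
    z1 = math.log(p0 / p1) / math.sqrt(2)
    z2 = math.log(p0 * p1 / p2 ** 2) / math.sqrt(6)
    return z1, z2


# Hypothetical genotypes of two individuals at four loci.
ind1 = [("A", "A"), ("A", "G"), ("C", "T"), ("G", "G")]
ind2 = [("A", "G"), ("G", "A"), ("C", "C"), ("A", "A")]

p = ibs_proportions(ind1, ind2)
z = ilr(p)
```

Plotting the pairs \((z_1, z_2)\) for all pairs of individuals would give a scatterplot of the kind described above. Note that some relationships produce structural zeros (a parent and offspring always share at least one IBS allele, so \(p_0 = 0\)), which this sketch rejects rather than handles.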
An R package has been developed for the study of genetic relatedness based on genetic markers such as microsatellites and single nucleotide polymorphisms from human populations, and is used for the computations and graphics of this contribution.
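
For the IBD part, the following Python sketch lists the textbook Cotterman coefficients \((k_0, k_1, k_2)\) for some standard relationships between non-inbred individuals and converts a composition to Cartesian coordinates for a ternary diagram. The vertex placement, the function names, the example estimate `k_hat`, and the nearest-reference rule at the end are illustrative assumptions, not the method of the chapter or of its R package.

```python
import math

# Expected (k0, k1, k2) for non-inbred pairs.
COTTERMAN = {
    "unrelated": (1.0, 0.0, 0.0),
    "first cousins": (0.75, 0.25, 0.0),
    "avuncular": (0.5, 0.5, 0.0),
    "full siblings": (0.25, 0.5, 0.25),
    "parent-offspring": (0.0, 1.0, 0.0),
    "monozygotic twins": (0.0, 0.0, 1.0),
}


def ternary_xy(k):
    """Cartesian coordinates of a 3-part composition in a ternary diagram.

    Assumed plotting convention: the k0 vertex is at (0, 0), the k1 vertex
    at (1, 0) and the k2 vertex at (0.5, sqrt(3)/2).
    """
    total = sum(k)
    k0, k1, k2 = (v / total for v in k)  # close the composition to sum 1
    x = k1 + 0.5 * k2
    y = math.sqrt(3) / 2 * k2
    return x, y


def nearest_relationship(k_hat):
    """Label of the reference point closest to k_hat (Euclidean distance).

    A crude illustration of reading a ternary diagram, not a formal
    classification rule.
    """
    return min(COTTERMAN, key=lambda name: math.dist(k_hat, COTTERMAN[name]))


# Hypothetical estimate for one pair of individuals.
k_hat = (0.27, 0.49, 0.24)
label = nearest_relationship(k_hat)
xy = ternary_xy(k_hat)
```

With the hypothetical estimate above, the closest reference point is the one for full siblings. In practice, estimates scatter around these reference points because of sampling variation in the markers, so a visual inspection of the diagram is more informative than a hard assignment.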

Keywords

  • Allele sharing
  • Identical by state
  • Identical by descent
  • Cotterman’s coefficients
  • Ternary diagram
  • Isometric log-ratio transformation

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • I. Galván-Femenía (1)
  • J. Graffelman (2)
  • C. Barceló-i-Vidal (1)
  1. Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain
  2. Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Spain
