CoDaWork 2015: Compositional Data Analysis pp 75-84

# An Application of the Isometric Log-Ratio Transformation in Relatedness Research

Conference paper

DOI: 10.1007/978-3-319-44811-4_6

Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 187)
Cite this paper as:
Graffelman J., Galván-Femenía I. (2016) An Application of the Isometric Log-Ratio Transformation in Relatedness Research. In: Martín-Fernández J., Thió-Henestrosa S. (eds) Compositional Data Analysis. CoDaWork 2015. Springer Proceedings in Mathematics & Statistics, vol 187. Springer, Cham

## Abstract

Genetic marker data contain information on the degree of relatedness of a pair of individuals. Relatedness investigations are usually based on the extent to which alleles of a pair of individuals match over a set of markers for which their genotype has been determined. A distinction is usually drawn between alleles that are identical by state (IBS) and alleles that are identical by descent (IBD). Since any pair of individuals can only share 0, 1, or 2 alleles IBS or IBD for any marker, 3-way compositions can be computed that consist of the fractions of markers sharing 0, 1, or 2 alleles IBS (or IBD) for each pair. For any given standard relationship (e.g., parent–offspring, sister–brother, etc.) the probabilities $$k_0, k_1$$ and $$k_2$$ of sharing 0, 1 or 2 IBD alleles are easily deduced and are usually referred to as Cotterman’s coefficients. Marker data can be used to estimate these coefficients by maximum likelihood. This maximization problem has the 2-simplex as its domain. If there is no inbreeding, then the maximum must occur in a subset of the 2-simplex, and the maximization problem is subject to an additional nonlinear constraint ($$k_1^2 \ge 4 k_0 k_2$$). Special optimization routines that respect all constraints of the problem are then needed. A reparametrization of the likelihood in terms of isometric log-ratio (ilr) coordinates greatly simplifies the maximization problem. In isometric log-ratio coordinates the domain turns out to be rectangular, and maximization can be carried out by standard general-purpose maximization routines. We illustrate this point with some examples using data from the HapMap project.
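
To make the rectangular-domain claim concrete, the following Python sketch uses one possible ilr basis; the basis is an assumption chosen for illustration and is not necessarily the one used in the paper. With $$z_1 = \frac{1}{\sqrt{2}} \ln(k_0/k_2)$$ and $$z_2 = \frac{1}{\sqrt{6}} \ln\left(k_0 k_2 / k_1^2\right)$$, the constraint $$k_1^2 \ge 4 k_0 k_2$$ is equivalent to $$z_2 \le -\ln 4/\sqrt{6}$$, while $$z_1$$ stays unrestricted. The function names `ilr`, `ilr_inv` and `satisfies_constraint` are hypothetical helpers, and the transformation is only defined for strictly positive coefficients.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT6 = math.sqrt(6.0)

# Upper bound on z2 implied by k1^2 >= 4*k0*k2 under the basis assumed here.
Z2_MAX = -math.log(4.0) / SQRT6


def ilr(k0, k1, k2):
    """Map strictly positive (k0, k1, k2) to ilr coordinates (z1, z2).

    Assumed basis (illustrative choice, not taken from the paper):
    z1 = ln(k0/k2)/sqrt(2), z2 = ln(k0*k2/k1^2)/sqrt(6).
    """
    z1 = math.log(k0 / k2) / SQRT2
    z2 = math.log(k0 * k2 / k1 ** 2) / SQRT6
    return z1, z2


def ilr_inv(z1, z2):
    """Map ilr coordinates back to a composition (k0, k1, k2) summing to 1."""
    # Centered log-ratio coordinates; they sum to zero by construction.
    c0 = z1 / SQRT2 + z2 / SQRT6
    c1 = -2.0 * z2 / SQRT6
    c2 = -z1 / SQRT2 + z2 / SQRT6
    e = [math.exp(c) for c in (c0, c1, c2)]
    total = sum(e)  # closure: rescale to unit sum
    return tuple(x / total for x in e)


def satisfies_constraint(z1, z2):
    """True if the point corresponds to k1^2 >= 4*k0*k2 (z1 is unrestricted)."""
    return z2 <= Z2_MAX


# Full siblings, (k0, k1, k2) = (1/4, 1/2, 1/4), lie on the constraint boundary.
z_sibs = ilr(0.25, 0.5, 0.25)

# Any point with z2 below the bound maps back to an admissible composition.
k_inside = ilr_inv(0.3, -1.0)
```

Under this assumed basis, the admissible region is the box $$z_1 \in (-\infty, \infty)$$, $$z_2 \in (-\infty, -\ln 4/\sqrt{6}]$$, so a likelihood written as a function of $$(z_1, z_2)$$ could in principle be passed to any optimizer that supports simple bound constraints, with `ilr_inv` applied inside the objective.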

### Keywords

Genetic marker, Identity-by-state, Identity-by-descent, Hardy–Weinberg equilibrium, Composition, Closure, Ternary plot, Isometric log-ratio transformation