Journal of Agricultural, Biological, and Environmental Statistics, 13:177

Collision probabilities for AFLP bands, with an application to simple measures of genetic similarity

  • Gerrit Gort (Wageningen University)
  • Wim J. M. Koopman (Biosystematics Group, National Herbarium Nederland, Wageningen University branch)
  • Alfred Stein (Department of Earth Observation Science, ITC)
  • Fred A. van Eeuwijk (Wageningen University)

Abstract

AFLP is a frequently used DNA fingerprinting technique that is popular in the plant sciences. A problem encountered in the interpretation and comparison of individual plant profiles, consisting of band presence-absence patterns, is that multiple DNA fragments of the same length can be generated that eventually show up as single bands on a gel. The phenomenon of two or more fragments coinciding in a band within an individual profile is a type of homoplasy that we call collision. Homoplasy biases estimates of genetic similarity. In this study, we show how to calculate collision probabilities for bands as a function of band length, given the fragment count, the band count, or band lengths. We also determine probabilities of higher-order collisions, and estimate the total number of collisions for a profile. Since short fragments occur more often, short bands are more likely to contain collisions. For a typical plant genome and AFLP procedure, the collision probability for the shortest band is 25 times larger than for the longest. In a profile with 100 bands, a quarter of the bands may contain collisions, concentrated at the shorter band lengths. All calculations require a careful estimate of the monotonically decreasing fragment length distribution. Modifications of Dice and Jaccard coefficients are proposed. The principles are illustrated with data from a phylogenetic study in lettuce.
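
As a rough illustration of the quantities named in the abstract, the following minimal sketch computes a collision probability for a band given the fragment count, together with the standard (unmodified) Dice and Jaccard coefficients. It is not the authors' method: it assumes that fragment lengths are drawn independently from a known length distribution, so that the number of fragments at a given length is binomial. The linearly decreasing length distribution, the length range of 50 to 500, the fragment count of 100, and all function names are hypothetical choices made only for illustration.

```python
def collision_given_band(n_fragments, p_len):
    """P(2 or more fragments at a length | at least 1 fragment there).

    Assumption (not from the article): the fragment count at this length
    is Binomial(n_fragments, p_len).
    """
    p_none = (1 - p_len) ** n_fragments
    p_one = n_fragments * p_len * (1 - p_len) ** (n_fragments - 1)
    return (1 - p_none - p_one) / (1 - p_none)


def dice(a, b):
    """Standard Dice coefficient for two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    return 2 * shared / (sum(a) + sum(b))


def jaccard(a, b):
    """Standard Jaccard coefficient for two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either


# Hypothetical, monotonically decreasing fragment length distribution.
lengths = list(range(50, 501))
weights = [600 - length for length in lengths]
total = sum(weights)
probs = [w / total for w in weights]

n_fragments = 100  # hypothetical fragment count
p_short = collision_given_band(n_fragments, probs[0])
p_long = collision_given_band(n_fragments, probs[-1])
print(f"collision probability, shortest band: {p_short:.3f}")
print(f"collision probability, longest band:  {p_long:.3f}")
```

Under these assumptions the shortest band has the higher collision probability, in line with the qualitative statement above; the specific values depend entirely on the assumed length distribution and should not be read as the article's results.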

Key Words

Dice; Fragment length distribution; Jaccard; Occupancy distribution; Saddlepoint approximation; Size homoplasy