Bulletin of Mathematical Biology, Volume 67, Issue 6, pp 1303–1313

The statistical analysis of multivariate serological frequency data

  • Richard A. Reyment

Abstract

Data in the form of frequencies are common in genetics, for example in serology. Examples are provided by the AB0 group, the Rhesus group, and also DNA data. Tables of frequencies are analysed using the available methods of multivariate analysis, usually with three principal aims: to seek meaningful relationships between the components of a data set, to examine relationships between the populations from which the data were obtained, and to reduce dimensionality. The last aim is usually realized by means of bivariate scatter diagrams of scores computed from a multivariate analysis. Tables of frequencies cannot safely be analysed by standard multivariate procedures because they represent compositions and are therefore embedded in simplex space, a subspace of full space. Appropriate procedures for simplex space are compared and contrasted with simple standard methods of multivariate analysis (“raw” principal component analysis). The study shows that the differences between a log-ratio model and a simple logarithmic transformation of proportions may not be very great, particularly as regards graphical ordinations, but important discrepancies do occur. The divergences between logarithmically based analyses and analyses of raw data are, however, great. Published data on Rhesus alleles observed for Italian populations are used to exemplify the subject.
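
The three treatments contrasted in the abstract (raw proportions, a simple logarithmic transformation, and a log-ratio transformation) can be illustrated with a minimal sketch. It assumes that the log-ratio model is the centred log-ratio (clr) of Aitchison, which the abstract does not state explicitly. The counts and the population names `pop_A`, `pop_B`, `pop_C` are invented for illustration and are not the Italian Rhesus data; the helper names `closure`, `log_transform`, `clr` and `euclid` are likewise hypothetical. All parts must be strictly positive for the logarithms to exist.

```python
import math

def closure(row):
    """Rescale a row of positive counts so that its parts sum to 1."""
    total = sum(row)
    return [v / total for v in row]

def log_transform(p):
    """Simple logarithmic transformation of each part."""
    return [math.log(v) for v in p]

def clr(p):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = log_transform(p)
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def euclid(a, b):
    """Euclidean distance between two equal-length coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented allele counts per population (assumption: illustration only,
# NOT the published Rhesus data used in the article).
counts = {
    "pop_A": [120, 45, 30, 5],
    "pop_B": [98, 60, 25, 17],
    "pop_C": [140, 30, 22, 8],
}

# Closing the counts places each population in the simplex.
props = {name: closure(row) for name, row in counts.items()}

# Distance between two populations under each treatment.
a, b = props["pop_A"], props["pop_B"]
raw_d = euclid(a, b)
log_d = euclid(log_transform(a), log_transform(b))
clr_d = euclid(clr(a), clr(b))

print("raw proportions :", raw_d)
print("log proportions :", log_d)
print("clr coordinates :", clr_d)
```

The clr differs from the plain logarithm only by subtracting each row's mean log, so the clr distance can never exceed the log distance. This is one way to see why ordinations from the two logarithmic treatments can look alike while both differ markedly from an analysis of the raw proportions. A principal component analysis would then be run on whichever coordinates are chosen.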

Copyright information

© Society for Mathematical Biology 2005

Authors and Affiliations

  • Richard A. Reyment, Paleozoologiska Avdelningen, Naturhistoriska Riksmuseet, Stockholm, Sweden
