Spectral analysis of phylogenetic data

Journal of Classification

Abstract

The spectral analysis of sequence and distance data is a new approach to phylogenetic analysis. For two-state character sequences, the character values at a given site split the set of taxa into two subsets, a bipartition of the taxa set. The vector which counts the relative numbers of each of these bipartitions over all sites is called a sequence spectrum. Applying a transformation called a Hadamard conjugation, the sequence spectrum is transformed to the conjugate spectrum. This conjugation corrects for unobserved changes in the data, independently of the choice of phylogenetic tree. For any given phylogenetic tree with edge weights (probabilities of state change), we define a corresponding tree spectrum. The selection of a weighted phylogenetic tree from the given sequence data is made by matching the conjugate spectrum with a tree spectrum. We develop an optimality selection procedure using a least squares best fit to find the phylogenetic tree whose tree spectrum most closely matches the conjugate spectrum. An inferred sequence spectrum can be derived from the selected tree spectrum using the inverse Hadamard conjugation to allow a comparison with the original sequence spectrum.
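The first steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The abstract does not state the formulas, so the sketch assumes the commonly quoted form of the conjugation, γ = H⁻¹ ln(H s) with inverse s = H⁻¹ exp(H γ), where H is a Sylvester-type Hadamard matrix. The function names, the choice of the last taxon as the reference for indexing bipartitions, and the toy sequences are all assumptions made for the example.

```python
import math

def sequence_spectrum(sequences):
    """Relative frequency of each bipartition over all sites.

    sequences: equal-length strings over a two-letter alphabet, one per taxon.
    Assumed convention: the last taxon is the reference, and a site's
    bipartition is indexed by the bitmask of taxa whose state differs from it.
    """
    n = len(sequences)
    length = len(sequences[0])
    spectrum = [0.0] * (2 ** (n - 1))
    reference = sequences[-1]
    for site in range(length):
        index = 0
        for taxon in range(n - 1):
            if sequences[taxon][site] != reference[site]:
                index |= 1 << taxon
        spectrum[index] += 1.0 / length
    return spectrum

def hadamard(vec):
    """Multiply by the (unnormalised) Sylvester Hadamard matrix."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out

def conjugate(s):
    """Assumed form of the conjugation: gamma = H^-1 ln(H s)."""
    # math.log raises ValueError if any entry of H s is not positive.
    rho = [math.log(x) for x in hadamard(s)]
    return [x / len(s) for x in hadamard(rho)]

def inverse_conjugate(gamma):
    """Assumed inverse: s = H^-1 exp(H gamma)."""
    r = [math.exp(x) for x in hadamard(gamma)]
    return [x / len(gamma) for x in hadamard(r)]

# Toy data (invented): three taxa, ten sites, seven of them constant.
toy = ["0000000101", "0000000011", "0000000000"]
s = sequence_spectrum(toy)
gamma = conjugate(s)
s_back = inverse_conjugate(gamma)
```

In this toy case each non-trivial bipartition is observed at a frequency of 0.1, while the corresponding conjugate entries come out at about 0.128, which illustrates the correction for unobserved changes. Applying `inverse_conjugate` to `gamma` recovers the original spectrum up to rounding.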

A possible adaptation for the analysis of four-state character sequences with unequal frequencies is considered. A corresponding spectral analysis for distance data is also introduced. These analyses are illustrated with biological examples for both distance and sequence data. Spectral analysis using the Fast Hadamard transform allows optimal trees to be found for at least 20 taxa and perhaps for up to 30 taxa.
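The taxon limits quoted above can be related to the size of the spectrum with some simple arithmetic. The sketch below assumes that a spectrum for n taxa has 2^(n-1) entries (one per bipartition) and that a fast Hadamard transform of length N uses N log2 N additions and subtractions, compared with N^2 operations for a dense matrix product. The function names are hypothetical, and the abstract itself does not state that these counts are what sets the limit.

```python
def spectrum_length(n_taxa):
    """Number of bipartitions of n taxa, counting the trivial one."""
    return 2 ** (n_taxa - 1)

def fast_transform_operations(n_taxa):
    """Additions/subtractions in a fast Hadamard transform: N * log2(N)."""
    n_entries = spectrum_length(n_taxa)
    return n_entries * (n_taxa - 1)

def dense_multiply_operations(n_taxa):
    """Multiply-adds for an explicit N x N matrix-vector product."""
    return spectrum_length(n_taxa) ** 2

for n in (10, 20, 30):
    print(n, spectrum_length(n),
          fast_transform_operations(n),
          dense_multiply_operations(n))
```

For 20 taxa the spectrum has 524,288 entries and the fast transform needs roughly ten million operations. For 30 taxa the spectrum has 536,870,912 entries, which at 8 bytes per entry is 4 GiB for a single vector.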

The development presented here is self-contained, although some mathematical proofs available elsewhere have been omitted. The analysis of sequence data is based on methods reported earlier, but the terminology and the application to distance data are new.

Cite this article

Hendy, M.D., Penny, D. Spectral analysis of phylogenetic data. Journal of Classification 10, 5–24 (1993). https://doi.org/10.1007/BF02638451
