Generation of hierarchically correlated multivariate symbolic sequences

With an application to the assessment of bootstrap confidence in phylogenetic analysis

The European Physical Journal B

Abstract

We introduce a method to generate multivariate series of symbols from a finite alphabet with a given hierarchical structure of similarities based on the Hamming distance. The target hierarchical structure of similarities is arbitrary; it can be, for instance, the one obtained by applying a hierarchical clustering method to an empirical matrix of similarities. The method is based on a generating mechanism that does not rely on a mutation rate, a quantity widely used in phylogenetic analysis. We use the proposed simulation method to investigate the relationship between the bootstrap value associated with a node of a phylogeny and the probability of finding that node in the true phylogeny. The results are compared with those obtained in the literature under an evolutionary model with a constant per-symbol mutation rate. We observe that the relationship between the bootstrap value of a node and the probability that the corresponding clade is correct is sensitive both to the length of the data series and to the length of the branch connecting the node to its closest ancestor in the phylogenetic tree, whereas it is only slightly affected by the topology of the true phylogeny and by the absolute value of the similarity.
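To make the two basic ingredients named in the abstract concrete, the following minimal Python sketch computes a Hamming-based similarity matrix for a set of symbolic sequences and draws one bootstrap replicate by resampling positions with replacement. It does not reproduce the generating mechanism proposed in the article, which the abstract does not specify. The function names, the toy sequences, and the definition of similarity as the fraction of matching positions are assumptions made for illustration only.

```python
import random


def hamming_similarity(a, b):
    """Fraction of positions at which two equal-length sequences agree
    (assumed here to be one minus the normalized Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(seqs):
    """Pairwise Hamming-based similarities for a list of sequences."""
    n = len(seqs)
    return [[hamming_similarity(seqs[i], seqs[j]) for j in range(n)]
            for i in range(n)]


def bootstrap_replicate(seqs, rng):
    """Resample positions (columns) with replacement.
    The same sampled columns are used for every sequence."""
    length = len(seqs[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(s[c] for c in cols) for s in seqs]


# Hypothetical toy data: three sequences over the alphabet {A, C, G, T}.
seqs = ["ACGTACGT", "ACGTACGA", "TTGTACCA"]
S = similarity_matrix(seqs)

rng = random.Random(0)  # fixed seed so the replicate is reproducible
replicate = bootstrap_replicate(seqs, rng)
S_boot = similarity_matrix(replicate)
```

In a bootstrap analysis of this kind, a hierarchical clustering would be applied to each replicate matrix such as `S_boot`, and the bootstrap value of a node would be the fraction of replicates in which that node's clade reappears.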



Author information

Corresponding author

Correspondence to M. Tumminello.

About this article

Cite this article

Tumminello, M., Lillo, F. & Mantegna, R.N. Generation of hierarchically correlated multivariate symbolic sequences. Eur. Phys. J. B 65, 333–340 (2008). https://doi.org/10.1140/epjb/e2008-00225-7

  • DOI: https://doi.org/10.1140/epjb/e2008-00225-7
