When Being “Most Likely” Is Not Enough: Examining the Performance of Three Uses of the Parametric Bootstrap in Phylogenetics

Antezana, Marcos

doi:10.1007/s00239-002-2394-1

Published: February 2003

Volume 56, pages 198–222 (2003)

Journal of Molecular Evolution

Abstract

I show that three parametric-bootstrap (PB) applications that have been proposed for phylogenetic analysis can be misleading as currently implemented. First, I show that simulating a topology estimated from preliminary data, in order to determine the sequence length that should allow the best tree obtained from more extensive data to be correct with a desired probability, delivers an accurate estimate of this length only in topological situations in which most preliminary trees are expected to be both correct and statistically significant, i.e., when no further analysis would be needed. Otherwise, one obtains strong underestimates of the length, or similarly biased values for incorrect trees. Second, I show that PB-based topology tests that use as null hypothesis the most likely tree congruent with a pre-specified topological relationship alternative to the unconstrained most likely tree, and that simulate this tree for P value estimation, produce excessive type I error (from 50% to 600% and higher) when they are applied to null data generated by star-shaped or dichotomous four-taxon topologies. Simulating the most likely star topology for P value estimation results instead in correct type I error rates, even when the null data are generated by a dichotomous topology. This is a strong indication that the star topology is the correct default null hypothesis for phylogenies. Third, I show that PB-estimated confidence intervals (CIs) for the length of a tree branch are generally accurate, although in some situations they can be strongly over- or underestimated relative to the “true” CI. Attempts to identify a biased CI through a further round of simulations were unsuccessful. Tracing the origin and propagation of parameter-estimate error through the CI estimation exercise showed that the sparseness of site patterns that are crucial to the estimation of pivotal parameters can allow homoplasy to bias these estimates and, ultimately, the PB-based CI estimation.
In conclusion, I stress that statistical techniques that simulate models estimated from limited data need to be carefully calibrated, and I defend the point that pattern-sparseness assessment will be the next frontier in the statistical analysis of phylogenies, an effort that will require taking advantage of the merits of black-box maximum-likelihood approaches and of insights from intuitive, site-pattern-oriented approaches like parsimony.
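
The third application above, a PB-estimated confidence interval for a branch length, can be illustrated with a minimal sketch. It assumes a far simpler setting than the four-taxon maximum-likelihood analyses the abstract describes: two aligned sequences evolving under the Jukes-Cantor (JC69) model, where the number of differing sites is binomial and the distance estimate has a closed form. The function names, the percentile interval, and the example counts are illustrative assumptions, not the article's code or data.

```python
import math
import random


def jc69_distance(p):
    """JC69 distance (expected substitutions per site) from the
    observed proportion p of differing sites."""
    if p >= 0.75:
        return float("inf")  # saturation: the estimate is undefined/infinite
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_p_diff(d):
    """Probability that a site differs between two sequences
    separated by JC69 distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def parametric_bootstrap_ci(n_diff, n_sites, n_reps=2000, alpha=0.05, seed=0):
    """Percentile parametric-bootstrap CI for the JC69 distance.

    Steps: (1) estimate the distance from the data, (2) simulate
    replicate data sets from the fitted model, (3) re-estimate the
    distance on each replicate, (4) read off the percentiles.
    """
    rng = random.Random(seed)
    d_hat = jc69_distance(n_diff / n_sites)
    p_hat = jc69_p_diff(d_hat)  # per-site difference probability under the fitted model

    replicates = []
    for _ in range(n_reps):
        # Under JC69 each site differs independently with probability p_hat,
        # so simulating a sequence pair reduces to counting Bernoulli draws.
        k = sum(rng.random() < p_hat for _ in range(n_sites))
        replicates.append(jc69_distance(k / n_sites))

    replicates.sort()
    lower = replicates[int((alpha / 2) * n_reps)]
    upper = replicates[min(n_reps - 1, int((1 - alpha / 2) * n_reps))]
    return d_hat, lower, upper


if __name__ == "__main__":
    # Hypothetical data: 30 differing sites out of 300.
    d_hat, lower, upper = parametric_bootstrap_ci(30, 300)
    print(f"d_hat={d_hat:.4f}  95% PB CI=({lower:.4f}, {upper:.4f})")
```

Note that the replicates are simulated from the estimated distance rather than the true one, so any error in that estimate propagates into the interval. This is the kind of dependence on estimates from limited data that the abstract cautions about, although the sketch does not model site-pattern sparseness or homoplasy on a tree.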

Author information

Authors and Affiliations

University of Chicago, 1101 E. 57th Street, Chicago, IL 60637-1573, USA
Marcos Antezana

About this article

Cite this article

Antezana, M. When Being “Most Likely” Is Not Enough: Examining the Performance of Three Uses of the Parametric Bootstrap in Phylogenetics. J Mol Evol 56, 198–222 (2003). https://doi.org/10.1007/s00239-002-2394-1

Received: 03 December 2001
Accepted: 17 September 2002
Issue Date: February 2003
DOI: https://doi.org/10.1007/s00239-002-2394-1

Keywords: Parametric bootstrap, Resampling bootstrap, Topology test, Phylogenetics, Star tree, Dichotomous tree, Four-taxon case, Site-pattern, Conservativeness, Critical length, Confidence interval, Null hypothesis, Hypothesis testing, Discreteness, Sparseness, Homoplasy, p-value, Type I error, Power
