Proc Natl Acad Sci U S A. 1996 Nov 12;93(23):13429-34.
doi: 10.1073/pnas.93.23.13429.

Bootstrap confidence levels for phylogenetic trees

B Efron et al. Proc Natl Acad Sci U S A.

Abstract

Evolutionary trees are often estimated from DNA or RNA sequence data. How much confidence should we have in the estimated trees? In 1985, Felsenstein [Felsenstein, J. (1985) Evolution 39, 783-791] suggested the use of the bootstrap to answer this question. Felsenstein's method, which in concept is a straightforward application of the bootstrap, is widely used, but has been criticized as biased in the genetics literature. This paper concerns the use of the bootstrap in the tree problem. We show that Felsenstein's method is not biased, but that it can be corrected to better agree with standard ideas of confidence levels and hypothesis testing. These corrections can be made by using the more elaborate bootstrap method presented here, at the expense of considerably more computation.

Figures

Figure 1
Part of the data matrix of aligned nucleotide sequences for the malaria parasite Plasmodium. Shown are the first 20 columns of the 11 × 221 matrix x of polytypic sites used in most of the analyses below. The final analysis of the last section also uses the data from 1399 monotypic sites.
Figure 2
Phylogenetic tree based on the malaria data matrix; species are numbered as in Fig. 1. The numbers at the branches are confidence values based on Felsenstein’s bootstrap method. B = 200 bootstrap replications.
Figure 3
Schematic diagram of tree estimation; the triangle represents the space of all possible ˜π vectors in the multinomial probability model; regions ℛ1, ℛ2, … correspond to the different possible trees. In the case shown, ˜π and ˜π̂ lie in the same region, so the true tree TREE equals the estimated tree, but ˜π̂* lies in a region where the bootstrap-estimated tree does not have the 9-10 clade.
Figure 4
Two cases of the simple normal model; in both we observe μ̂ = (4.5, 0) ∈ ℛ1, and wish to assign a confidence value to μ ∈ ℛ1. Case I, ℛ2 is the region {μ1 ≤ 3}. Case II, ℛ2 is the region {∥μ∥ < 3}. The dashed circles indicate bootstrap sampling μ̂* ∼ N2(μ̂, I).
Figure 5
Confidence levels of the two cases in Fig. 4; μ̂0 = (3, 0) is the closest point to μ̂ = (4.5, 0) on the boundary separating ℛ1 from ℛ2; bootstrap vector μ̂** ∼ N2(μ̂0, I). The confidence level α̂ is the probability that μ̂** is closer than μ̂ to the boundary.
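The two-case normal model in Figs. 4 and 5 can be checked numerically. The following is a minimal Monte Carlo sketch, not the authors' code. It assumes NumPy, and the function name `normal_model_levels`, the draw count, and the seed are illustrative choices. It also assumes that "closer to the boundary" means a smaller signed distance, so that points falling inside ℛ2 count as closer; that reading is an interpretation of the caption. The first quantity is the plain bootstrap proportion of μ̂* ∼ N2(μ̂, I) landing in ℛ1; the second is the level α̂ described in the caption of Fig. 5.

```python
import numpy as np


def normal_model_levels(n_draws=200_000, seed=0):
    """Monte Carlo sketch of the two cases in Figs. 4 and 5 (illustrative only)."""
    rng = np.random.default_rng(seed)
    mu_hat = np.array([4.5, 0.0])   # observed point, inside R1 in both cases
    mu_hat0 = np.array([3.0, 0.0])  # closest boundary point to mu_hat
    d_obs = 1.5                     # distance from mu_hat to the boundary

    # First-level bootstrap: mu* ~ N2(mu_hat, I); proportion landing in R1.
    star = mu_hat + rng.standard_normal((n_draws, 2))
    proportion_in_r1 = {
        "I": float(np.mean(star[:, 0] > 3.0)),                       # R2 = {mu1 <= 3}
        "II": float(np.mean(np.linalg.norm(star, axis=1) >= 3.0)),   # R2 = {||mu|| < 3}
    }

    # Second-level bootstrap: mu** ~ N2(mu_hat0, I); probability that mu** is
    # closer to the boundary than mu_hat (assumed: signed distance into R1).
    star2 = mu_hat0 + rng.standard_normal((n_draws, 2))
    alpha_hat = {
        "I": float(np.mean(star2[:, 0] - 3.0 < d_obs)),
        "II": float(np.mean(np.linalg.norm(star2, axis=1) - 3.0 < d_obs)),
    }
    return proportion_in_r1, alpha_hat


if __name__ == "__main__":
    proportion, alpha = normal_model_levels()
    print("bootstrap proportion in R1:", proportion)
    print("alpha-hat:", alpha)
```

In Case I both quantities equal Φ(1.5) ≈ 0.933 analytically, so the two notions agree when the boundary is flat. In Case II the boundary is curved: the bootstrap proportion rises above the Case I value while α̂ falls below it, which illustrates the kind of discrepancy the corrected method is meant to address.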

References

    1. Efron B, Tibshirani R. An Introduction to the Bootstrap. London: Chapman & Hall; 1993.
    2. Felsenstein J. Evolution. 1985;39:783–791.
    3. Hillis D, Bull J. Syst Biol. 1993;42:182–192.
    4. Felsenstein J, Kishino H. Syst Biol. 1993;42:193–200.
    5. Newton M A. Biometrika. 1996;83:315–328.