Comparative Study
Heredity (Edinb). 2012 Jul;109(1):50-6.
doi: 10.1038/hdy.2012.12. Epub 2012 Mar 21.

On the comparison of population-level estimates of haplotype and nucleotide diversity: a case study using the gene cox1 in animals


W P Goodall-Copestake et al. Heredity (Edinb). 2012 Jul.

Abstract

Estimates of genetic diversity represent a valuable resource for biodiversity assessments and are increasingly used to guide conservation and management programs. The most commonly reported estimates of DNA sequence diversity in animal populations are haplotype diversity (h) and nucleotide diversity (π) for the mitochondrial gene cytochrome c oxidase subunit I (cox1). However, several issues relevant to the comparison of h and π within and between studies remain to be assessed. We used population-level cox1 data from peer-reviewed publications to quantify the extent to which data sets can be re-assembled, to provide a standardized summary of h and π estimates, to explore the relationship between these metrics and to assess their sensitivity to under-sampling. Only 19 out of 42 selected publications had archived data that could be unambiguously re-assembled; this comprised 127 population-level data sets (n ≥ 15) from 23 animal species. Estimates of h and π were calculated using a 456-base region of cox1 that was common to all the data sets (median h=0.70130, median π=0.00356). Non-linear regression methods and Bayesian information criterion analysis revealed that the most parsimonious model describing the relationship between the estimates of h and π was π=0.0081h². Deviations from this model can be used to detect outliers due to biological processes or methodological issues. Subsampling analyses indicated that samples of n>5 were sufficient to discriminate extremes of high from low population-level cox1 diversity, but samples of n ≥ 25 are recommended for greater accuracy.
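As an illustration of the two metrics named in the abstract, the following is a minimal Python sketch of the standard Nei estimators of haplotype diversity (h) and nucleotide diversity (π) for a set of aligned, equal-length sequences. It is an assumption that these textbook formulas correspond to what the authors computed; the function names and the input format (a list of strings) are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # one entry per distinct haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length
```

This sketch treats every character mismatch as a difference, so gaps or ambiguity codes would need to be handled before use on real alignments.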

Figures

Figure 1
Functional breakdown of the data acquisition procedure. The upper pie chart represents the number of candidate population-level data sets found in publications that matched our search criteria and the lower pie chart the number of data sets re-assembled. Numbers in brackets after the data set descriptions follow: number of population-level data sets, number of corresponding publications.
Figure 2
Variance in estimates of h and π among population-level samples within species (n=23) calculated from a homologous 456-base region of the cox1 gene. Median values for the total sample set (n=127) are shown as dashed vertical lines. Species order follows the taxonomic grouping in Table 1.
Figure 3
Nucleotide–haplotype diversity relationship based on population-level estimates (n=127) derived from a homologous 456-base region of cox1. Fitted model π=0.0081h² shown as a thick dark line within two thin lines that represent 95% confidence intervals. The 75th and 95th percentiles of the square root transformed residual data are shown as dashed and dotted lines, respectively. Population-level estimates with the largest deviations are numbered to identify the species: 1, C. nucula; 2, E. superba; 3, M. squamiger; 4, P. zostericola; 5, R. denticulata.
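The use of the fitted model to flag deviating estimates can be sketched as follows. Only the coefficient 0.0081 and the quadratic form come from the text. The exact residual transform is not given here, so taking the square root of the absolute residual, and using a nearest-rank percentile as the cutoff, are assumptions made for illustration. All function names are hypothetical.

```python
import math

COEFF = 0.0081  # coefficient of the fitted model reported in the abstract


def predicted_pi(h):
    """Nucleotide diversity expected under the model pi = 0.0081 * h^2."""
    return COEFF * h ** 2


def sqrt_abs_residual(h, pi):
    """Square root of the absolute deviation from the model (assumed transform)."""
    return math.sqrt(abs(pi - predicted_pi(h)))


def flag_outliers(estimates, percentile=95.0):
    """Return one boolean per (h, pi) pair: True if its score exceeds the cutoff.

    The cutoff is the nearest-rank percentile of the scores themselves.
    """
    scores = [sqrt_abs_residual(h, pi) for h, pi in estimates]
    ranked = sorted(scores)
    rank = max(1, math.ceil(percentile / 100 * len(ranked)))
    cutoff = ranked[rank - 1]
    return [score > cutoff for score in scores]
```

Because the cutoff is derived from the same set of estimates, this only ranks deviations relative to one another; it does not test whether a deviation is statistically significant.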
Figure 4
Sample size–cox1 diversity estimate relationships for h and π under low-diversity (Bombus ardens; black cross series) and high-diversity (Euphausia superba; gray plus series) scenarios. Each series of estimates from n=2 to n=49 comprises 100 random subsamples taken from the total sample size of 50. High–low bars indicate s.d. output from DNASP for n=50.
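The subsampling design described in this caption (100 random subsamples at each size from n=2 to n=49, drawn from a total sample of 50) can be sketched in Python as below. Sampling without replacement and the fixed random seed are assumptions, and the sketch covers h only; the function names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, stdev


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def subsample_h(seqs, sizes, replicates=100, seed=0):
    """For each subsample size, return (mean, s.d.) of h across random subsamples."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    summary = {}
    for n in sizes:
        # rng.sample draws n sequences without replacement (an assumption here).
        values = [
            haplotype_diversity(rng.sample(seqs, n)) for _ in range(replicates)
        ]
        summary[n] = (mean(values), stdev(values))
    return summary
```

For a list `seqs` of 50 aligned sequences, `subsample_h(seqs, range(2, 50))` would reproduce the layout of the series in the figure: one mean and spread per sample size, which shows how quickly the estimate stabilises as n grows.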

