An Unbiased Estimator of Gene Diversity with Improved Variance for Samples Containing Related and Inbred Individuals of any Ploidy
- PMID: 28040781
- PMCID: PMC5295611
- DOI: 10.1534/g3.116.037168
Abstract
Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, [Formula: see text], relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error among them. Further, we empirically assess the performance of [Formula: see text] on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of [Formula: see text] leads to improved estimates of the population differentiation statistic, [Formula: see text], which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data.
Keywords: expected heterozygosity; identity state; inbreeding; locus-specific branch length; relatedness.
Copyright © 2017 Harris and DeGiorgio.
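For orientation, the "original unbiased estimator" that the abstract says accounts only for sample size can be sketched as below. This is a minimal illustration of the classical sample-size correction, n/(n-1) times one minus the sum of squared sample allele proportions, for a single locus. It is not the estimator introduced in this paper and does not reflect the BestHet script, whose interface is not shown here. The function name `gene_diversity` and the flat list input are assumptions made for this example.

```python
from collections import Counter


def gene_diversity(alleles):
    """Sample-size-corrected gene diversity at one locus.

    `alleles` is a flat list of allele copies pooled over all sampled
    individuals (hypothetical input format chosen for this sketch).
    The correction assumes the copies are independent draws, which is
    the assumption that fails for related or inbred individuals.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    # Sum of squared sample allele proportions (sample homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # Multiply by n / (n - 1) to correct for finite sample size.
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Two diploid individuals, four allele copies in total.
    print(gene_diversity(["A", "A", "B", "B"]))
```

Because the correction uses only the number of allele copies, duplicated copies contributed by relatives are counted as if they were independent, which is the source of the downward bias the abstract describes.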