How array design creates SNP ascertainment bias
- PMID: 33784304
- PMCID: PMC8009414
- DOI: 10.1371/journal.pone.0245178
Abstract
Single nucleotide polymorphisms (SNPs) genotyped with arrays have become a widely used marker type in population genetic analyses over the last 10 years. However, compared to whole genome re-sequencing data, arrays are known to lack a substantial proportion of globally rare variants and tend to be biased towards variants present in the populations involved in developing the respective array. This affects population genetic estimators and is known as SNP ascertainment bias. We investigated factors contributing to ascertainment bias in array development by redesigning the Axiom™ Genome-Wide Chicken Array in silico and evaluating changes in allele frequency spectra and heterozygosity estimates in a stepwise manner. The redesign showed a sequential reduction of rare alleles during the development process. This was mainly caused by identifying SNPs in a limited set of populations and by selecting common SNPs within populations when aiming for equidistant spacing. These effects were less severe with a larger discovery panel. Additionally, expected heterozygosity was, in general, massively overestimated for the ascertained SNP sets. For the original array, this overestimation was 24% higher for populations involved in the discovery process than for populations not involved. The same was observed after the SNP discovery step in the redesign. However, an unequal contribution of populations during SNP selection can mask this effect, while also adding uncertainty. Finally, we make suggestions for the design of specialized arrays for large-scale projects where whole genome re-sequencing is still too expensive.
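The link between a small discovery panel, the loss of rare alleles, and inflated expected heterozygosity can be illustrated with a toy simulation. The sketch below is not the authors' in silico redesign pipeline. It assumes a single population, independent loci, true allele frequencies drawn from a U-shaped Beta(0.2, 0.2) distribution, and a hypothetical discovery rule that keeps a locus only if it is polymorphic in a random panel of haplotypes. Expected heterozygosity per locus is taken as 2p(1 - p).

```python
import random


def expected_heterozygosity(freqs):
    """Mean expected heterozygosity, 2p(1-p), across loci."""
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)


def discover_snps(freqs, panel_haplotypes, rng):
    """Keep loci that are polymorphic in a random discovery panel.

    Hypothetical discovery rule: draw `panel_haplotypes` alleles per locus
    and keep the locus only if both alleles are observed in the panel.
    """
    kept = []
    for p in freqs:
        count = sum(rng.random() < p for _ in range(panel_haplotypes))
        if 0 < count < panel_haplotypes:
            kept.append(p)
    return kept


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Assumed U-shaped spectrum: most variants are rare (p near 0 or 1).
true_freqs = [rng.betavariate(0.2, 0.2) for _ in range(20000)]

h_all = expected_heterozygosity(true_freqs)
h_small = expected_heterozygosity(discover_snps(true_freqs, 4, rng))
h_large = expected_heterozygosity(discover_snps(true_freqs, 40, rng))

print(f"all loci:                {h_all:.3f}")
print(f"panel of 4 haplotypes:   {h_small:.3f}")
print(f"panel of 40 haplotypes:  {h_large:.3f}")
```

Under these assumptions, both ascertained sets give a higher mean heterozygosity than the full set of loci, and the larger panel recovers more rare variants and so shows less inflation, in line with the direction of the effects described in the abstract.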
Conflict of interest statement
The authors have declared that no competing interests exist.
Similar articles
- How imputation can mitigate SNP ascertainment Bias. BMC Genomics. 2021 May 12;22(1):340. doi: 10.1186/s12864-021-07663-6. PMID: 33980139.
- SNP ascertainment bias in population genetic analyses: why it is important, and how to correct it. Bioessays. 2013 Sep;35(9):780-6. doi: 10.1002/bies.201300014. Epub 2013 Jul 9. PMID: 23836388. Review.
- Efficiency of different strategies to mitigate ascertainment bias when using SNP panels in diversity studies. BMC Genomics. 2018 Jan 5;19(1):22. doi: 10.1186/s12864-017-4416-9. PMID: 29304727.
- Effects of single nucleotide polymorphism ascertainment on population structure inferences. G3 (Bethesda). 2021 Sep 6;11(9):jkab128. doi: 10.1093/g3journal/jkab128. PMID: 33871576.
- Population genetic analysis of ascertained SNP data. Hum Genomics. 2004 Mar;1(3):218-24. doi: 10.1186/1479-7364-1-3-218. PMID: 15588481. Review.
Cited by
- How imputation can mitigate SNP ascertainment Bias. BMC Genomics. 2021 May 12;22(1):340. doi: 10.1186/s12864-021-07663-6. PMID: 33980139.
- Low-coverage whole genome sequencing for a highly selective cohort of severe COVID-19 patients. GigaByte. 2024 Jun 20;2024:gigabyte127. doi: 10.46471/gigabyte.127. eCollection 2024. PMID: 38948510.
- Unveiling the predominance of Saccharum spontaneum alleles for resistance to orange rust in sugarcane using genome-wide association. Theor Appl Genet. 2024 Mar 13;137(4):81. doi: 10.1007/s00122-024-04583-3. PMID: 38478168.
- DArTseq genotyping facilitates identification of Aegilops biuncialis chromatin introgressed into bread wheat Mv9kr1. Plant Mol Biol. 2024 Nov 7;114(6):122. doi: 10.1007/s11103-024-01520-2. PMID: 39508930.
- A framework for research into continental ancestry groups of the UK Biobank. Hum Genomics. 2022 Jan 29;16(1):3. doi: 10.1186/s40246-022-00380-5. PMID: 35093177.