Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure
- PMID: 35211719
- PMCID: PMC8921734
- DOI: 10.1093/bib/bbac043
Abstract
Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories. From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data.
Keywords: GWAS; SNP arrays; bioinformatic methods; genomic structures; software; structural variants.
© The Author(s) 2022. Published by Oxford University Press.