Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2010 Apr 7:11:171.
doi: 10.1186/1471-2105-11-171.

Assignment of chromosomal locations for unassigned SNPs/scaffolds based on pair-wise linkage disequilibrium estimates

Affiliations

Assignment of chromosomal locations for unassigned SNPs/scaffolds based on pair-wise linkage disequilibrium estimates

Mehar S Khatkar et al. BMC Bioinformatics. .

Abstract

Background: Recent developments of high-density SNP chips across a number of species require accurate genetic maps. Despite rapid advances in genome sequence assembly and availability of a number of tools for creating genetic maps, the exact genome location for a number of SNPs from these SNP chips still remains unknown. We have developed a locus ordering procedure based on linkage disequilibrium (LODE) which provides estimation of the chromosomal positions of unaligned SNPs and scaffolds. It also provides an alternative means for verification of genetic maps. We exemplified LODE in cattle.

Results: The utility of the LODE procedure was demonstrated using data from 1,943 bulls genotyped for 73,569 SNPs across three different SNP chips. First, the utility of the procedure was tested by analysing the masked positions of 1,500 randomly-chosen SNPs with known locations (50 from each chromosome), representing three classes of minor allele frequencies (MAF), namely >0.05, 0.01<MAF < or = 0.05 and 0.001<MAF < or = 0.01. The efficiency (percentage of masked SNPs that could be assigned a location) was 96.7%, 30.6% and 2.0%; with an accuracy (the percentage of SNPs assigned correctly) of 99.9%, 98.9% and 33.3% in the three classes of MAF, respectively. The average precision for placement of the SNPs was 914, 3,137 and 6,853 kb, respectively. Secondly, 4,688 of 5,314 SNPs unpositioned in the Btau4.0 assembly were positioned using the LODE procedure. Based on these results, the positions of 485 unordered scaffolds were determined. The procedure was also used to validate the genome positions of 53,068 SNPs placed on Btau4.0 bovine assembly, resulting in identification of problem areas in the assembly. Finally, the accuracy of the LODE procedure was independently validated by comparative mapping on the hg18 human assembly.

Conclusion: The LODE procedure described in this study is an efficient and accurate method for positioning SNPs (MAF>0.05), for validating and checking the quality of a genome assembly, and offers a means for positioning of unordered scaffolds containing SNPs. The LODE procedure will be helpful in refining genome sequence assemblies, especially those being created from next-generation sequencing where high-throughput SNP discovery and genotyping platforms are integrated components of genome analysis.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Comparison of SNP positions on human assembly obtained with BLAST of flanking sequence of SNPs vs the positions obtained by 'LiftOver' of bovine LODE positions. The comparisons for positions are shown in a single figure for the 230 SNPs combined across all chromosomes.
Figure 2
Figure 2
Comparison of repositioned locations of SNP by LODE with original location on Btau4.0. X-chromosome is labelled as 30.
Figure 3
Figure 3
The comparison of chromosomal assignments of 1776 SNPs repositioned by LODE procedure with original positions on Btau4.0 on chromosome 5. Two potential problem regions (10-11 Mb and 90-120 Mb) on Btau4.0 on this chromosome can be noted.

Similar articles

Cited by

References

    1. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res. 2008. pp. D25–30. - PMC - PubMed
    1. Genome 10K Community of Scientists. Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species. J Hered. 2009;100(6):659–674. doi: 10.1093/jhered/esp086. - DOI - PMC - PubMed
    1. Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S. Genome-Wide Survey of SNP Variation Uncovers the Genetic Structure of Cattle Breeds. Science. 2009;324(5926):528–532. doi: 10.1126/science.1167936. - DOI - PMC - PubMed
    1. Ramos AM, Crooijmans RP, Affara NA, Amaral AJ, Archibald AL, Beever JE, Bendixen C, Churcher C, Clark R, Dehais P. Design of a high density SNP genotyping assay in the pig using SNPs identified and characterized by next generation sequencing technology. PLoS ONE. 2009;4(8):e6524. doi: 10.1371/journal.pone.0006524. - DOI - PMC - PubMed
    1. Elsik CG, Tellam RL, Worley KC, Gibbs RA, Muzny DM, Weinstock GM, Adelson DL, Eichler EE, Elnitski L, Guigo R. The Genome Sequence of Taurine Cattle: A Window to Ruminant Biology and Evolution. Science. 2009;324(5926):522–528. doi: 10.1126/science.1169588. - DOI - PMC - PubMed

Publication types

LinkOut - more resources