PLoS One. 2010 Jan 13;5(1):e8219.
doi: 10.1371/journal.pone.0008219.

Rapid genomic characterization of the genus Vitis

Sean Myles et al. PLoS One.

Abstract

Next-generation sequencing technologies promise to dramatically accelerate the use of genetic information for crop improvement by facilitating the genetic mapping of agriculturally important phenotypes. The first step in optimizing the design of genetic mapping studies involves large-scale polymorphism discovery and a subsequent genome-wide assessment of the population structure and pattern of linkage disequilibrium (LD) in the species of interest. In the present study, we provide such an assessment for the grapevine (genus Vitis), the world's most economically important fruit crop. Reduced representation libraries (RRLs) from 17 grape DNA samples (10 cultivated V. vinifera and 7 wild Vitis species) were sequenced with sequencing-by-synthesis technology. We developed heuristic approaches for SNP calling, identified hundreds of thousands of SNPs and validated a subset of these SNPs on a 9K genotyping array. We demonstrate that the 9K SNP array provides sufficient resolution to distinguish among V. vinifera cultivars, between V. vinifera and wild Vitis species, and even among diverse wild Vitis species. We show that there is substantial sharing of polymorphism between V. vinifera and wild Vitis species and find that genetic relationships among V. vinifera cultivars agree well with their proposed geographic origins using principal components analysis (PCA). Levels of LD in the domesticated grapevine are low even at short ranges, but LD persists above background levels to 3 kb. While genotyping arrays are useful for assessing population structure and the decay of LD across large numbers of samples, we suggest that whole-genome sequencing will become the genotyping method of choice for genome-wide genetic mapping studies in high-diversity plant species. This study demonstrates that we can move quickly towards genome-wide studies of crop species using next-generation sequencing. 
Our study sets the stage for future work in other high-diversity crop species and significantly enhances the genetic resources available to the grapevine genetics community.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Alignment results of Illumina GA reads to the grapevine reference genome.
The total number of reads generated for each sample is found in the box to the right. The upper bars in the barplot indicate the proportion of reads belonging to each of the categories in the legend. Reads aligning with 0 to 2 mismatches were included for SNP discovery. Reads mapped repetitively and reads with no match were discarded. The lower bars (dark grey) show the proportion of reads beginning with the HpaII tag. The wild Vitis samples are shown in italics.
Figure 2. HpaII digestion results in an enrichment of genomic regions with high read coverage.
Panel A presents two overlapping fragment size distributions. The size distribution of fragments from an in silico HpaII digestion is shown in blue, and the size distribution of the in silico digested fragments to which reads were successfully mapped is shown in orange. Panel B compares the observed amount of the genome sequenced at each level of coverage to the expectation at random. The random expectation was generated assuming that coverage follows a Poisson distribution (see Methods for details). The inset in gray demonstrates that the observed coverage begins to exceed the random expectation at 8x coverage. SNPs were called from positions with ≥10x coverage.
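The random expectation in Panel B can be sketched as follows. This is a minimal illustration, assuming reads land uniformly at random so that per-site depth follows a Poisson distribution whose mean is the average depth. The function names and any numbers passed to them are hypothetical and are not taken from the study's Methods.

```python
import math

def poisson_pmf(k: int, mean_depth: float) -> float:
    """Probability that a site is covered exactly k times under Poisson(mean_depth)."""
    return math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)

def expected_sites_at_depth(k: int, genome_size: int, mean_depth: float) -> float:
    """Expected number of genome positions sequenced at exactly k-fold coverage."""
    return genome_size * poisson_pmf(k, mean_depth)

def expected_fraction_at_least(k_min: int, mean_depth: float) -> float:
    """Expected fraction of positions with coverage >= k_min (upper Poisson tail)."""
    return 1.0 - sum(poisson_pmf(k, mean_depth) for k in range(k_min))
```

Comparing `expected_sites_at_depth` with the observed count at each depth would reproduce the kind of contrast the panel describes: an excess of observed sites at high depth suggests that the digestion concentrates reads in a subset of the genome.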
Figure 3. Quality score (Q) and genotypic contingency test thresholds eliminate read position bias during SNP calling.
The 470K SNP set is enriched with SNPs identified from the ends of reads. Panel A demonstrates that this read position bias can be eliminated by applying a Q score threshold. Panel B demonstrates that the genotypic contingency test also improves SNP calling.
Figure 4. Segregation of SNPs in the 71K SNP set within and between V. vinifera and wild Vitis species.
The proportion of SNPs polymorphic within V. vinifera is 68.5%. The proportion segregating within wild Vitis species is 53.1%. A substantial proportion (24.3%) of SNPs shows evidence of segregation within both V. vinifera and the wild Vitis species. Only 2.7% of SNPs appeared fixed between V. vinifera and wild Vitis.
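The four percentages sum to 100% once the first two are read as totals that include the shared class. The short sketch below works through that inclusion-exclusion arithmetic using only the numbers in the caption; the variable names are illustrative.

```python
# Percentages from the Figure 4 caption.
within_vinifera = 68.5  # polymorphic within V. vinifera (read as including shared SNPs)
within_wild = 53.1      # segregating within wild Vitis (read as including shared SNPs)
shared = 24.3           # segregating within both groups
fixed_between = 2.7     # fixed between V. vinifera and wild Vitis

# Inclusion-exclusion: subtract the shared class to get the exclusive classes.
vinifera_only = within_vinifera - shared
wild_only = within_wild - shared

total = vinifera_only + wild_only + shared + fixed_between
print(round(vinifera_only, 1), round(wild_only, 1), round(total, 1))
# prints: 44.2 28.8 100.0
```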
Figure 5. LD decay in the grape.
Panel A shows the observed LD decay compared to background LD across 40 kb. LD was calculated as the median r² in bins of 1,000 comparisons. The background LD is the median r² from 20,000 comparisons between SNPs on different chromosomes. Panel B shows the −log10 p-values from comparing the distribution of observed r² values within each bin to the distribution of background r² values generated from comparisons between SNPs on different chromosomes using a Mann-Whitney U test (see Methods for details).
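The binned LD summary in Panel A can be sketched as below. The sketch assumes genotypes coded as 0/1/2 allele counts and uses the squared Pearson correlation between genotype vectors as r², a common approximation that is not necessarily the estimator used in the study. `genotype_r2` and `binned_median_r2` are hypothetical helper names.

```python
from statistics import median

def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # monomorphic SNP in this sample set: r2 is undefined
    return cov * cov / (var_a * var_b)

def binned_median_r2(pairs, bin_size=1000):
    """Summarize (distance_bp, r2) pairs as (median distance, median r2) per bin.

    Pairs are sorted by distance and cut into consecutive bins of
    bin_size comparisons; the last bin may hold fewer.
    """
    ordered = sorted(pairs)
    bins = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        bins.append((median(d for d, _ in chunk), median(r for _, r in chunk)))
    return bins
```

The background level would be the median of `genotype_r2` over SNP pairs drawn from different chromosomes. The per-bin comparison in Panel B could then be run with a Mann-Whitney U implementation such as `scipy.stats.mannwhitneyu`.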
Figure 6. Principal components analysis (PCA) plots from grapevine SNP data.
The first two PCs are shown with the proportion of the variance explained by each PC in parentheses. Panel A shows a PCA plot generated from 14,325 SNPs called from the Illumina GA without regard to segregation pattern. Panel B shows a PCA plot from the Vitis9KSNP array data, whose SNPs were chosen purposely to distinguish among V. vinifera cultivars.
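How a PC and its proportion of variance explained are obtained from a genotype matrix can be sketched as below. The sketch assumes a samples-by-SNPs matrix of 0/1/2 genotypes and, to stay dependency-free, uses power iteration to recover only the first PC. This is a stand-in for illustration and not the software used in the study; the function names are hypothetical.

```python
import math

def center_columns(matrix):
    """Subtract each SNP's (column's) mean genotype across samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def first_pc(matrix, iterations=500):
    """Return (sample scores on PC1, proportion of variance explained by PC1).

    Runs power iteration on the sample-by-sample Gram matrix of the
    centered genotypes. Assumes the data are not all identical and that
    the start vector is not orthogonal to the leading eigenvector.
    """
    x = center_columns(matrix)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    total_variance = sum(gram[i][i] for i in range(n))  # trace = sum of eigenvalues
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient of the unit vector v gives the leading eigenvalue.
    eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n))
                     for i in range(n))
    scores = [math.sqrt(eigenvalue) * c for c in v]
    return scores, eigenvalue / total_variance
```

Plotting the scores for the first two PCs, with each PC's proportion of variance on the axis label, gives a plot of the kind shown in the figure. A second PC would require deflating the Gram matrix or a full eigendecomposition.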
