A high quality draft consensus sequence of the genome of a heterozygous grapevine variety

Riccardo Velasco et al. PLoS One.

Abstract

Background: Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has the potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented.

Principal findings: We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, and this made it possible to reveal a relatively recent Vitis-specific large-scale duplication event concerning at least 10 chromosomes (duplication not reported before).
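The proportions implied by these figures can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers quoted in the abstract; the variable names are illustrative, and the SNP density assumes that all mapped SNPs lie on the anchored sequence and ignores gaps.

```python
# Values as reported in the abstract (sizes in Mb, counts as integers).
genome_size_mb = 504.6
assembled_mb = 477.1
anchored_mb = 435.1
predicted_genes = 29_585
genes_on_lgs_fraction = 0.961
mapped_snps = 1_751_176


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of the estimated genome covered by the assembly.
assembled_pct = percent(assembled_mb, genome_size_mb)

# Share of the assembled sequence anchored to the 19 linkage groups.
anchored_pct = percent(anchored_mb, assembled_mb)

# Approximate number of predicted genes assigned to linkage groups.
genes_on_lgs = round(predicted_genes * genes_on_lgs_fraction)

# Rough mean density of mapped SNPs per kb of anchored sequence
# (ASSUMPTION: mapped SNPs all fall on anchored sequence; gaps ignored).
snps_per_kb = round(mapped_snps / (anchored_mb * 1000), 2)

print(assembled_pct, anchored_pct, genes_on_lgs, snps_per_kb)
```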

Conclusions: Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. V. vinifera genomic metacontigs anchored to the LGs.
V. vinifera genomic metacontigs (yellow bars) positioned along LG 4 of the Syrah x Pinot Noir genetic map. The map was assembled according to Cartwright et al. On the left are marker names and positions, in centimorgans, from Troggio et al. (http://genomics.research.iasma.it). Most metacontigs were anchored to the map using markers with unique sequence locations: SSRs, BAC-end sequences and SNP-based markers derived from either ESTs or assembled sequences of the two haplotypes of the Pinot Noir genome. Metacontigs without bridge markers were anchored based on their association with other metacontigs (details in Materials and Methods). The approximate size in kb of each metacontig is indicated on the right. Gaps separating metacontigs are of undefined size.
Figure 2. Comparison of four plant genomes based on gene homology.
All genes were compared with each other in all-vs-all similarity searches using BLAST. Genes predicted for poplar, Arabidopsis and rice are from www.genome.jgi-psf.org, www.arabidopsis.org and www.tigr.org, respectively. Grape gene estimates were carried out on 58,611 assembled contigs. Genes of similar length whose protein-level alignments showed over 60% similarity, scored with the BLOSUM62 matrix, were considered homologous. The frequencies of sequences shared among species are reported on the right.
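The homology criterion described in this caption can be sketched as a simple filter over pairwise hits. This is an illustration only, not the authors' pipeline: the `Hit` record, the gene identifiers, and the length-ratio cutoff of 0.8 are assumptions, since the caption does not define what counts as "similar length". Only the 60% similarity threshold comes from the caption.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One pairwise protein alignment (hypothetical record layout)."""
    query: str
    subject: str
    query_len: int         # protein length of the query gene
    subject_len: int       # protein length of the subject gene
    similarity_pct: float  # percent similar positions under BLOSUM62


MIN_SIMILARITY_PCT = 60.0  # threshold stated in the caption
MIN_LENGTH_RATIO = 0.8     # ASSUMPTION: the caption only says "similar length"


def is_homologous(hit):
    """Apply the similarity threshold and the assumed length-ratio cutoff."""
    shorter, longer = sorted((hit.query_len, hit.subject_len))
    similar_length = shorter / longer >= MIN_LENGTH_RATIO
    return similar_length and hit.similarity_pct > MIN_SIMILARITY_PCT


# Toy hits with invented identifiers and values.
hits = [
    Hit("grape_g1", "poplar_g7", 410, 395, 72.5),  # passes both checks
    Hit("grape_g2", "rice_g3", 410, 150, 81.0),    # fails the length check
    Hit("grape_g3", "arab_g9", 300, 310, 48.0),    # fails the similarity check
]

homologs = [(h.query, h.subject) for h in hits if is_homologous(h)]
print(homologs)
```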
Figure 3. Chromosomal organization of disease resistance genes of V. vinifera.
A) Phylogenetic analysis of NBS-LRR protein sequences of V. vinifera present in Pinot Noir. The phylogeny of these genes is based on a distance-matrix neighbour-joining analysis (Clustal X; bootstrap of 1000) after alignment of sequences by TCoffee (version 5.05, [125]). The phylogenetic clades, in general, correspond to the classification based on protein domains (however, see text and Table S3). B) Genes assigned to LGs are represented by dots. Their gene number is specified in LG-specific insets and in Table S3. NBS clades (see A above) contain mainly genes of the following classes: (1) TIR-NBS-LRR in blue; (2) CC-NBS-LRRa in green; (3) CC-NBS-LRRb in yellow; (4) NBS-LRR in cyan; (5) CC-NBS-LRR in red. Other resistance genes, belonging to the NBS and TIR-NBS groups, are represented by open and filled dots, respectively. Resistance-related genes other than NBS genes are shown in black. The size of each LG is given in Mb (on the right), whereas markers of the genetic map (see http://genomics.research.iasma.it) are shown on the left, together with the interval in cM between the two closest markers in each gene cluster.
Figure 4. V. vinifera pathways for phenolic and terpenoid biosynthesis.
A) V. vinifera general pathway for phenolics biosynthesis leading to stilbenes (C6-C2-C6) and flavonoids (C6-C3-C6). For each enzyme, the gene copy number is reported in brackets. Genes were identified by similarity search using BLAST where the references were the sequences of phenolic biosynthetic genes previously characterized in grape and in other plant species. Putative homologues and gene copy numbers were determined by comparing aligned amino acid sequences based on a threshold of 80% similarity between the grape sequences, and 60% similarity between grape and other species. For the large StSy, PAL and F3'5'H families, phylogenetic analysis was performed with the MEGA4 package after aligning with ClustalW. The following enzymes involved in the pathway are shown: PAL, phenylalanine ammonia-lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; CHS, chalcone synthase; StSy, stilbene synthase; RSGT, resveratrol glucosyltransferase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3'H, flavonoid 3′-hydroxylase; F3'5'H, flavonoid 3′,5′-hydroxylase; DFR, dihydroflavonol-4-reductase; FLS, flavonol synthase; LDOX, leucoanthocyanidin dioxygenase; LAR, leucoanthocyanidin reductase; ANR, anthocyanidin reductase; UFGT, UDP-glucose:flavonoid 3-O-glucosyltransferase; OMT, O-methyltransferase; ACCase, acetyl CoA carboxylase. PA refers to proanthocyanidins. Enzymatic steps that have not been experimentally confirmed are marked with an asterisk (*). B) Steps of the plastidic isoprenoid pathway and monoterpenoid biosynthesis. For each enzyme, the gene copy number is reported in brackets. Gene annotation was performed as described in Materials and Methods.
Abbreviations: G3P, glyceraldehyde 3-phosphate; DXP, 1-deoxy-D-xylulose-5-phosphate; MEP, 2-C-methyl-D-erythritol 4-phosphate; CDP-ME, 4-diphosphocytidyl-2C-methyl-D-erythritol; CDP-MEP, 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate; ME-cPP, 2C-methyl-D-erythritol 2,4-cyclodiphosphate; HMBPP, 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate; IPP, isopentenyl pyrophosphate; DMAPP, dimethylallyl pyrophosphate. The enzymes in the pathway are indicated in blue: DXS, 1-deoxy-D-xylulose 5-phosphate synthase; DXR, 1-deoxy-D-xylulose 5-phosphate reductase; ISPD, 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase; ISPE, 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; ISPF, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; ISPG, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase and ISPH, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate reductase (ISPG and ISPH are probably the same enzyme and directly convert ME-cPP into IPP and DMAPP); ISPA, geranyltransferase; IDI, isopentenyl diphosphate delta-isomerase; PT, prenyltransferase; LIMS, limonene synthase; LIS, linalool synthase; GES, geraniol synthase; TES, α-terpineol synthase; TPS-CIN, myrcene/(E)-beta-ocimene synthase; CYTP450, cytochrome P450 hydroxylase; ALDH, aldehyde dehydrogenase; NADPDH, NADP dehydrogenase. Enzymatic steps that have not been experimentally confirmed are marked with an asterisk (*).
Figure 5. Distribution of transcription factors along the 19 V. vinifera LGs.
A) Distribution of 1,617 transcription factors along the 19 V. vinifera LGs inferred from the positions of anchored metacontigs. The different colours of the histograms correspond to the different TF classes. B) Distribution of transcription factor clusters over the grape genome. TF organization in LGs 2, 5, 6, 7, 8, 10, 13, 16 and 17 is presented. For each LG, markers of the genetic map developed by Troggio et al. (see also http://genomics.research.iasma.it) are reported on the left, together with the interval in cM between the two closest markers for each TF cluster. TF types are reported on the right.
Figure 6. Scatter plot of the distribution of V. vinifera transcription factors.
For each of the 60 families (1983 genes) of V. vinifera TFs (X-axis; log base 2 transformed), the number of family members has been plotted against the corresponding number reported for three other genomes: A) A. thaliana (http://arabtfdb.bio.uni-potsdam.de/v1.1), B) P. trichocarpa (http://poplartfdb.bio.uni-potsdam.de/v2.0) and C) O. sativa (http://ricetfdb.bio.uni-potsdam.de/v2.i). The degree of correlation among TF gene numbers is indicated by the Pearson correlation value (r). Each scatter plot shows the TF families that were statistically over- or under-represented in pairwise comparisons (χ² tests were applied to untransformed data; p = 0.05).
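The two statistics named in this caption can be sketched as follows: Pearson's r on log2-transformed family sizes, and a χ² test on the untransformed counts. The family counts below are invented toy values, not data from the paper, and the 2×2 table layout (one family versus all other TFs in each genome) is an assumption, since the caption does not say how the χ² tests were set up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


CHI2_CRITICAL_DF1_P05 = 3.841  # critical value for 1 degree of freedom, p = 0.05

# Toy TF family sizes (INVENTED values for illustration only).
grape = {"MYB": 200, "bHLH": 120, "WRKY": 60, "NAC": 70}
other = {"MYB": 190, "bHLH": 150, "WRKY": 72, "NAC": 110}
families = sorted(grape)

# Correlation is computed on log2-transformed counts, as in the caption.
r = pearson_r(
    [math.log2(grape[f]) for f in families],
    [math.log2(other[f]) for f in families],
)

# The chi-squared test uses untransformed counts.
# ASSUMPTION: each family is tested against the rest of the TFs in both genomes.
grape_total, other_total = sum(grape.values()), sum(other.values())
flagged = [
    f
    for f in families
    if chi2_2x2(grape[f], grape_total - grape[f], other[f], other_total - other[f])
    > CHI2_CRITICAL_DF1_P05
]

print(round(r, 3), flagged)
```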
Figure 7. Features of the Pinot Noir heterozygous genome.
A) Comparison of contrasting haplotypes (a and b) co-mapping at two almost contiguous regions in metacontig 32,921 of chromosome 1. Above: the 188 kb region; below: the 215 kb region. I from contig groups 1030-H15, 1079-G03, 2068-K04, 1034-C17 and II 2010-J07, 2044-L11, 1030-N10. In the genetic map the two regions are positioned at 60.1 cM: see preliminary experiment in Text S1. TE elements are labeled as follows: c: Copia; g: Gypsy/gypsy; a: Gypsy/athila; d: hAT/Dart; k: Karma; h: hAT; m: Mutator. B) SNP profiles of the 19 LGs of V. vinifera. Left and right of the figure correspond respectively to top and bottom of the LGs of Troggio et al. The SNP values reported do not consider gaps in and among metacontigs. C) SNPs in exons and non-coding DNA and percentage of anchored genes tagged with SNPs. In parts B to E of this figure, gene prediction and annotation and the exon-intron boundaries were based on the methods described in Solovyev et al.; Korf et al.; Majoros et al.; Altschul et al.; Huang and Madan. D) Relative age of grape duplicated genes estimated from the number of synonymous substitutions per synonymous site (KS values). The peak between 0.6 and 1.2 KS supports a relatively large-scale duplication event. Paralogous genes were identified as in Li et al. and KS distributions were calculated as in Maere et al. E) The same as in D for genes present in duplicated chromosome segments.
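Panel D rests on a histogram of KS values, in which a duplication event shows up as an excess of paralog pairs within a KS window. The sketch below illustrates only that binning step. The KS values are invented, the 0.1 bin width is an assumption, and the estimation of KS itself (done in the paper as in Maere et al.) is not reproduced here.

```python
import math
from collections import Counter


def ks_histogram(ks_values, bin_width=0.1):
    """Count KS values per bin; keys are bin indices (bin i covers [i*w, (i+1)*w))."""
    # round() guards against floating-point artefacts such as 0.6 / 0.1 == 5.999...
    return Counter(math.floor(round(ks / bin_width, 9)) for ks in ks_values)


def fraction_in_window(ks_values, low=0.6, high=1.2):
    """Share of paralog pairs whose KS lies in [low, high)."""
    inside = sum(1 for ks in ks_values if low <= ks < high)
    return inside / len(ks_values)


# Toy KS values for paralog pairs (INVENTED for illustration only).
toy_ks = [0.05, 0.12, 0.65, 0.7, 0.82, 0.9, 1.1, 1.5, 2.3]

hist = ks_histogram(toy_ks)
peak_share = fraction_in_window(toy_ks)

for bin_index in sorted(hist):
    print(f"{bin_index * 0.1:.1f}-{(bin_index + 1) * 0.1:.1f}: {hist[bin_index]}")
print(f"share in 0.6-1.2 window: {peak_share:.2f}")
```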
Figure 8. Scenario of angiosperm genome evolution.
Alternative scenario to the one proposed by Jaillon et al. to explain angiosperm genome evolution. Our analyses seem to suggest that there has been a large-scale duplication event, likely a hybridization event, in the Vitis lineage, rather than before the split of Vitis and other dicots. See text for details.
