2011 Aug 7;43(9):913-8.
doi: 10.1038/ng.889.

The genome of the extremophile crucifer Thellungiella parvula

Maheshi Dassanayake et al. Nat Genet.

Abstract

Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Using next-generation sequencing exclusively, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudochromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically informed, testable hypotheses for the evolution of environmental stress tolerance.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1
Macrosynteny between T. parvula contigs and A. thaliana chromosomes. Comparison of the 20 largest T. parvula contigs, c1–c20 (a) and the 40 next largest contigs, c21–c60 (b) with A. thaliana chromosomes. A. thaliana chromosomes 1–5 are depicted as red, green, yellow, purple and blue, respectively, with the centromeric regions indicated by black bands. T. parvula contigs are represented by gray blocks. Regions containing more than 75% similarity over a minimum of 2,000 bp with maximum gap allowance of 1,000 bp are connected with lines of colors matching those used for coloring the A. thaliana chromosomes. Ticks in each chromosome or contig block mark 1-Mb intervals. The distributions of protein coding regions and repetitive sequences are shown in the outer circles, with the percentage of protein coding genes, DNA transposons and retrotransposons shown in blue, yellow and orange, respectively, with a window size of 0.1 Mb. In the T. parvula contigs, predicted protein coding genes without BLASTn hits (E value < 0.0001) against the A. thaliana cDNA database are shown in green.
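The outer-circle tracks in Figure 1 report feature percentages in 0.1 Mb windows. A minimal sketch of how such a per-window percentage could be computed is given below. The function name `window_coverage`, the 0-based half-open interval convention, and the example coordinates are illustrative assumptions, not the authors' pipeline.

```python
def window_coverage(seq_len, intervals, window=100_000):
    """Percent of bases in each fixed-size window covered by any interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open
    (an assumed convention). Overlapping intervals are merged first so
    that no base is counted twice.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    n_windows = (seq_len + window - 1) // window
    covered = [0] * n_windows
    for start, end in merged:
        pos, end = max(start, 0), min(end, seq_len)
        while pos < end:
            w = pos // window
            chunk_end = min((w + 1) * window, end)  # stop at the window border
            covered[w] += chunk_end - pos
            pos = chunk_end

    percentages = []
    for w in range(n_windows):
        # The last window may be shorter than the nominal window size.
        w_len = min((w + 1) * window, seq_len) - w * window
        percentages.append(100.0 * covered[w] / w_len)
    return percentages


# Hypothetical 250 kb contig with three annotated features (made-up coordinates).
example = window_coverage(250_000, [(0, 50_000), (40_000, 60_000), (90_000, 110_000)])
print(example)
```

Merging before counting matters when gene and repeat annotations overlap; without it a window could report more than 100% coverage.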
Figure 2
Prediction and annotation of ORFs in the T. parvula draft genome. (a) Length distribution of predicted T. parvula ORFs. (b) Comparison of T. parvula predicted ORFs with A. thaliana cDNAs showing the highest BLASTn hit score. The ratio of T. parvula ORF length to A. thaliana cDNA length is given as a percentage. In both a and b, the vertical axes and numbers above the bars are counts. Comparison of GO ‘biological processes’ (c) and GO ‘molecular function’ categories (d) between A. thaliana cDNAs (At) and T. parvula predicted ORFs (Tp). The GO categories are as defined in TAIR GOslim (see URLs). Categories with significant differences calculated using a χ² test, as described in the Online Methods, are indicated as *P < 0.05 or **P < 0.01. In c, the GOslim categories ‘other metabolic processes’ (GO:0008152), ‘other physiological processes’ (GO:0007582) and ‘other biological processes’ (GO:0008150) are not shown. The complete list of cDNA and ORF numbers in each of the GO categories and their associated P values are listed in Supplementary Table 8.
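The per-category comparison in panels c and d can be illustrated with a 2×2 χ² test: genes in a GO category versus genes outside it, for each species. The sketch below assumes a Pearson χ² test without continuity correction and one degree of freedom; the exact procedure is in the Online Methods, which are not reproduced here, so this is one plausible reading. The function names and all counts are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for the table [[a, b], [c, d]].

    Assumed layout: a, b = genes in / not in the category for one species;
    c, d = the same for the other species. No continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def significance_mark(p):
    """Map a P value to the asterisks used in the figure."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Made-up counts: 20 of 100 genes in the category versus 40 of 100.
chi2, p = chi2_2x2(20, 80, 40, 60)
print(round(chi2, 3), significance_mark(p))
```

Testing many GO categories this way raises a multiple-comparison question; whether and how the authors corrected for it is not stated in the caption.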
Figure 3
Comparison of local tandem duplication (T.D.) events in the A. thaliana genome and the T. parvula draft genome. (a) Examples of tandem duplications. Examples shown are for the chromosome and contig regions containing HKT1, CBL10 and MYB47. (b) A Venn diagram showing shared and specific tandem duplication events in T. parvula and A. thaliana. We defined a tandem duplication event as the presence of more than one gene with the same annotation in one location or more than one gene in one location separated by not more than one other gene with a different annotation. The numbers of genes involved in the duplication events are given in parentheses. Tandem duplications of genes with the same annotations in both species are counted as shared events. Comparison of the GO ‘biological processes’ (c) and ‘molecular function’ categories (d) between T. parvula ORFs and A. thaliana cDNAs for genes showing tandem duplications. The radial axes are the percentages of cDNA or ORFs in each GO category compared to the number of total tandem duplicated cDNA or ORFs. Categories showing significant differences are marked as *P < 0.05 or **P < 0.01. The number of tandem duplicated cDNAs or ORFs in each GO category and P values are listed in Supplementary Table 8. The complete list of tandem duplicated cDNAs and ORFs is presented as Supplementary Table 9.
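The tandem duplication definition in panel b can be expressed as a scan over genes in chromosomal order. The sketch below is one reading of that definition: genes with the same annotation are grouped into one event when they are adjacent or separated by at most one intervening gene. The function name, the input format, and the gene identifiers are hypothetical, and the example gene order is made up.

```python
def tandem_duplications(genes, max_spacers=1):
    """Group same-annotation genes that lie next to each other.

    genes: list of (gene_id, annotation) tuples in chromosomal order.
    max_spacers: how many intervening genes are tolerated between two
    members of one event (1 follows the caption's wording as read here).
    Returns a list of (annotation, [gene_ids]) with at least two genes each.
    """
    last_seen = {}  # annotation -> (index of most recent gene, its cluster)
    clusters = []
    for i, (gene_id, annotation) in enumerate(genes):
        previous = last_seen.get(annotation)
        if previous is not None and i - previous[0] - 1 <= max_spacers:
            cluster = previous[1]  # close enough: extend the open event
            cluster.append(gene_id)
        else:
            cluster = [gene_id]  # too far away: start a new candidate event
            clusters.append((annotation, cluster))
        last_seen[annotation] = (i, cluster)
    return [(annotation, ids) for annotation, ids in clusters if len(ids) > 1]


# Hypothetical gene order; the annotations HKT1, CBL10 and MYB47 are borrowed
# from the figure, the layout is invented for illustration.
example_genes = [
    ("g1", "HKT1"), ("g2", "HKT1"), ("g3", "X"),
    ("g4", "CBL10"), ("g5", "Y"), ("g6", "CBL10"),
    ("g7", "MYB47"), ("g8", "A"), ("g9", "B"), ("g10", "MYB47"),
]
print(tandem_duplications(example_genes))
```

In this example the two MYB47 copies are separated by two other genes, so they are not counted as one event under the assumed threshold.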
Figure 4
Assembly of the seven chromosomes of T. parvula. (a) Outline of the ancestral karyotype segments determined by comparative chromosome painting techniques, in A. thaliana chromosomes. The ancestral karyotype segments, denoted A to X, are drawn to scale based on the A. thaliana genome sequence. (b) T. parvula contigs aligned to the Eutremeae (n = 7) karyotype schema, and the ORFs defining the borders of the ancestral karyotype segments. A. thaliana locus IDs showing the highest homology with each ORF are given in parentheses. Shown are T. parvula contigs covering the ancestral karyotype segments. The complete chromosome assignment of the 40 largest contigs, including the contigs covering the centromeric regions, is presented in Supplementary Table 10. (c) Circos plot presenting the assembly of seven chromosomes. The 40 largest T. parvula contigs are shown. The links and histograms in the outer circles showing the distribution of protein coding genes and repetitive sequences were generated as in Figure 1. The ancestral karyotype segments in the A. thaliana chromosomes and T. parvula contigs and the links connecting them are depicted with colors as in a and b.
