G3 (Bethesda). 2022 Mar 4;12(3):jkab433.
doi: 10.1093/g3journal/jkab433.

Chromosome-level genome assembly of the fully mycoheterotrophic orchid Gastrodia elata

Eun-Kyung Bae et al. G3 (Bethesda).

Abstract

Gastrodia elata, an obligate mycoheterotrophic orchid, depends entirely on mycorrhizal fungi for carbon and mineral nutrients throughout its life cycle. Although full mycoheterotrophy occurs most often in the Orchidaceae family, no chromosome-level reference genome from this group has been assembled to date. Here, we report a high-quality chromosome-level genome assembly of G. elata, generated using Illumina and PacBio sequencing together with the Hi-C technique. The assembled genome was 1045 Mb in size, with an N50 of 50.6 Mb and 488 scaffolds. A total of 935 complete matches (64.9%) to the 1440 embryophyte Benchmarking Universal Single-Copy Orthologs were identified in this genome assembly. Hi-C scaffolding of the assembled genome resulted in 18 pseudochromosomes, 1008 Mb in size and containing 96.5% of the scaffolds. A total of 18,844 protein-coding sequences (CDSs) were predicted in the G. elata genome, of which 15,619 CDSs (82.89%) were functionally annotated. In addition, 74.92% of the assembled genome was found to be composed of transposable elements. Phylogenetic analysis indicated a significant contraction of genes involved in various biosynthetic processes and cellular components and an expansion of genes for novel metabolic processes and mycorrhizal association. This result suggests an evolutionary adaptation of G. elata to a mycoheterotrophic lifestyle. In summary, the genomic resources generated in this study provide a valuable reference genome for investigating the molecular mechanisms underlying G. elata biological functions. Furthermore, the complete G. elata genome will greatly improve our understanding of the genetics of Orchidaceae and the evolution of mycoheterotrophy in this family.
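
The assembly statistics quoted in the abstract (N50 and the BUSCO and annotation percentages) follow from simple arithmetic. The sketch below is illustrative only: the function names `n50` and `percent` are hypothetical, the toy scaffold lengths are invented for the example, and only the counts 935/1440 and 15,619/18,844 are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical, not from the G. elata assembly).
    print(n50([50, 40, 30, 20, 10]))
    # Counts reported in the abstract.
    print(percent(935, 1440))     # complete BUSCO matches
    print(percent(15619, 18844))  # functionally annotated CDSs
```

With the counts from the abstract, the two percentages come out at about 64.93 and 82.89, in line with the reported 64.9% and 82.89%.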

Keywords: Gastrodia elata; Orchidaceae; genome assembly; mycoheterotrophic; pseudochromosome.

Figures

Figure 1
Photograph of Gastrodia elata. The white arrows indicate the mature tuber and scape.
Figure 2
Genome-wide Hi-C interaction heatmap of G. elata. The 18 assembled scaffolds are ordered by length. The x- and y-axes give the mapping positions of the first and second reads of each read pair, respectively, grouped into bins. The color of each square indicates the number of read pairs within that bin. Gray vertical and white horizontal lines mark the borders between scaffolds. The off-diagonal pattern in pseudochromosomes 10 and 11 may reflect the Rabl configuration of chromatin (Tiang et al. 2012).
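
The binning described in this caption can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `contact_matrix`, the genome-wide coordinate convention, and the toy read pairs are all assumptions made for the example.

```python
def contact_matrix(read_pairs, bin_size, genome_length):
    """Count Hi-C read pairs per (first-read bin, second-read bin) cell.

    read_pairs    : iterable of (pos1, pos2) genome-wide coordinates
                    (assumed: 0-based, scaffolds concatenated in order).
    bin_size      : width of each bin in bases.
    genome_length : total length of the concatenated scaffolds.
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in read_pairs:
        # Row index = bin of the first read, column = bin of the second read.
        matrix[pos1 // bin_size][pos2 // bin_size] += 1
    return matrix


if __name__ == "__main__":
    # Three hypothetical read pairs on a 30-base toy genome, 10-base bins.
    pairs = [(5, 12), (7, 14), (25, 3)]
    for row in contact_matrix(pairs, bin_size=10, genome_length=30):
        print(row)
```

Each cell of the returned matrix corresponds to one square of the heatmap, and its count would determine the color of that square.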
Figure 3
(A) Overview of the G. elata genome. The pseudochromosomes are ordered clockwise from longest to shortest. From outside to inside, the tracks show gene density, repeat density, LTR/Gypsy, GC content, and GC skew in 1 Mb intervals across the 18 chromosomes. (B) Kimura distance-based copy divergence analysis of TEs in the G. elata genome. The y-axis shows the percentage of the genome occupied by each repeat type, and the x-axis shows the corresponding Kimura substitution level (CpG adjusted; K-value from 0 to 50). The color chart below the x-axis indicates the repeat types.
Figure 4
Distribution of transcript and gene lengths in G. elata, three other Orchidaceae species (A. shenzhenica, D. catenatum, and P. equestris), and E. guineensis.
Figure 5
(A) Phylogenetic analysis of G. elata among 15 plants and gene family gain-and-loss analysis, showing the number of gained (+) and lost (−) gene families. (B) The number of genes in the top 10 GO terms of expanded gene families (Supplementary Table S7) and (C) contracted gene families (Supplementary Table S8) in the G. elata genome. The green, blue, and orange bars represent the three major GO categories: biological process, molecular function (MF), and cellular component (CE), respectively. Terms that appear in both expanded and contracted gene families are underlined.
Figure 6
(A) Bar graph of the number of protein-coding genes in the 15 plant species, including G. elata, showing the distribution of genes in G. elata and the other 14 species by orthogroup type. Single-copy orthologs are common orthologs with one copy in all species. Multi-copy orthologs are common orthologs with multiple copies in all species. The number of genes in species-specific orthogroups represents genes unique to a single species. Other orthologs are genes from families shared by 2–14 species. (B) Venn diagram of orthologous gene families in G. elata and three other species (A. shenzhenica, D. catenatum, and P. equestris) in the Orchidaceae family.
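
The four orthogroup categories in panel (A) can be expressed as a small classification rule. The sketch below is one possible reading of the caption, not the authors' code: the function name `classify_orthogroup` and the input layout (a mapping from species to gene count per orthogroup) are hypothetical, and "multi-copy" is assumed to mean an orthogroup present in every species in which at least one species has more than one copy.

```python
def classify_orthogroup(counts, all_species):
    """Assign an orthogroup to one of the four categories in the caption.

    counts      : dict mapping species name -> number of genes in this
                  orthogroup (missing species count as zero).
    all_species : list of every species in the comparison.
    """
    present = [s for s in all_species if counts.get(s, 0) > 0]
    if not present:
        raise ValueError("orthogroup contains no genes")
    if len(present) == len(all_species):
        if all(counts[s] == 1 for s in all_species):
            return "single-copy"
        return "multi-copy"  # assumption: shared by all, not 1:1 everywhere
    if len(present) == 1:
        return "species-specific"
    return "other"  # shared by 2 to (n - 1) species


if __name__ == "__main__":
    # Three toy species instead of the 15 in the figure.
    species = ["sp1", "sp2", "sp3"]
    groups = {
        "OG1": {"sp1": 1, "sp2": 1, "sp3": 1},
        "OG2": {"sp1": 2, "sp2": 1, "sp3": 4},
        "OG3": {"sp1": 3},
        "OG4": {"sp1": 1, "sp3": 2},
    }
    for name, counts in groups.items():
        print(name, classify_orthogroup(counts, species))
```

With 15 species, the "other" branch covers orthogroups shared by 2 to 14 species, matching the caption. Genes not assigned to any orthogroup are outside the scope of this sketch.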
