Plants (Basel). 2022 Aug 6;11(15):2062.
doi: 10.3390/plants11152062.

Annotation of Siberian Larch (Larix sibirica Ledeb.) Nuclear Genome-One of the Most Cold-Resistant Tree Species in the Only Deciduous GENUS in Pinaceae

Eugenia I Bondar et al. Plants (Basel). 2022.

Abstract

The recent release of the nuclear, chloroplast and mitochondrial genome assemblies of Siberian larch (Larix sibirica Ledeb.) greatly contributed to the development of genomic resources for the larch genus. Siberian larch is one of the most cold-resistant tree species in the only deciduous genus of Pinaceae; it undergoes seasonal senescence and yields valuable rot-resistant timber widely used in construction. Here, we present an extensive repeatome analysis and the first annotation of the draft nuclear Siberian larch genome assembly. About 66% of the larch genome consists of highly repetitive elements (REs), with a likely wave of retrotransposon insertions into the larch genome estimated to have occurred 4-5 MYA. In total, 39,370 gene models were predicted, with 87% of them having homology to Arabidopsis-annotated proteins and 78% having at least one GO term assignment. The current state of the genome annotations allows gymnosperm and angiosperm species to be compared for relative gene abundance in different functional categories. Comparative analysis of functional gene categories across different angiosperm and gymnosperm species finds that the Siberian larch genome has an overabundance of genes associated with programmed cell death (PCD), autophagy, and stress hormone biosynthesis and regulatory pathways; these genes may play important roles in seasonal senescence and the stress response to extreme cold in larch. Despite being incomplete, the draft assemblies and annotations of the conifer genomes are at a point of development where they now represent a valuable source for further genomic, genetic and population studies.

Keywords: RNA-seq; Siberian larch; angiosperms; annotation; conifer; deciduous; genome; gymnosperms; microsatellites; repeats; seasonal senescence; transcriptome; transposons.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
(A) Relative size of the repetitive sequence content of the Siberian larch genome annotated using RepeatMasker and a combined library comprising the RepeatModeler-derived library classified with TEclass and the RepBase, MIPS, CPRD and PIER v1.0 libraries; (B) microsatellite (SSR) density (number of microsatellite loci with di-, tri-, tetra-, penta-, hexa-, hepta- and octanucleotide motifs per 1 Mbp) for several conifer and angiosperm species found using the GMATo program (Larch: Larix sibirica; Pab: Picea abies; PG: Picea glauca; Pita: Pinus taeda; Popul: Populus trichocarpa; TAIR: Arabidopsis thaliana; Zea: Zea mays); (C) box plots of the number of all microsatellite loci found in all species listed in (B) using the GMATo and TRF programs.
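The density measure in panel (B), loci per 1 Mbp, can be illustrated with a minimal Python sketch. This is not GMATo or TRF: the regex-based finder, the `min_repeats=5` threshold, and the function names are assumptions made only for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re

def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' or 'TT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))

def find_ssrs(seq: str, min_len: int = 2, max_len: int = 8, min_repeats: int = 5):
    """Return (start, end, motif) for perfect SSRs with di- to octanucleotide motifs.

    min_repeats is a hypothetical threshold, not the setting used in the paper.
    """
    seq = seq.upper()
    loci = []
    for k in range(min_len, max_len + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            if is_primitive(m.group(1)):
                loci.append((m.start(), m.end(), m.group(1)))
    return sorted(loci)

def ssr_density_per_mbp(n_loci: int, seq_len_bp: int) -> float:
    """Number of SSR loci per 1 Mbp of sequence."""
    return n_loci / (seq_len_bp / 1_000_000)

# Toy sequence: one (AT)6 locus, one (AAG)5 locus, and a homopolymer run that is ignored.
toy = "GGC" + "AT" * 6 + "CCGT" + "AAG" * 5 + "T" * 10
toy_loci = find_ssrs(toy)
toy_density = ssr_density_per_mbp(len(toy_loci), len(toy))
```

Normalizing by sequence length is what makes the counts comparable between a multi-gigabase conifer assembly and a much smaller genome such as Arabidopsis thaliana.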
Figure 2
(A) Structure of Copia-like and Gypsy-like LTR retrotransposons; (B) estimated insertion time of the LTR-RT elements in the genomes of nine gymnosperm species; estimated insertion time in the genome of Siberian larch for (C) LTR-RTs, (D) the Copia and Gypsy superfamilies and (E) TGCA/non-TGCA LTRs. The x-axis is in millions of years ago (MYA).
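The caption does not say how insertion times were computed. A widely used approach, shown here only as a sketch under that assumption, dates an LTR-RT from the divergence K between its two LTRs, which are identical at insertion: T = K / (2r), where r is the substitution rate per site per year. The Jukes-Cantor correction and the rate value below are placeholders, not values taken from the paper.

```python
import math

def p_distance(ltr5: str, ltr3: str) -> float:
    """Fraction of mismatched sites between two aligned LTRs, skipping gap columns."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper()) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K for an observed mismatch fraction p."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def insertion_time_years(p: float, rate_per_site_per_year: float) -> float:
    """T = K / (2r): both LTRs accumulate substitutions independently after insertion."""
    return jukes_cantor(p) / (2.0 * rate_per_site_per_year)

# Hypothetical inputs: 2% observed LTR divergence and an assumed substitution rate.
ASSUMED_RATE = 2.2e-9  # substitutions per site per year; placeholder value
age_years = insertion_time_years(0.02, ASSUMED_RATE)
age_mya = age_years / 1e6
```

Because T scales inversely with r, the choice of substitution rate shifts every estimate proportionally, so ages from different studies are comparable only when the same rate is used.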
Figure 3
(A) Proportion of coding and intronic sequence in each gene model in the Siberian larch genome according to the MAKER2 annotation; (B) top 10% of the longest introns across 11 plant species.
Figure 4
Cumulative number (A) and proportion (B) of genes aligned to the Arabidopsis protein set with qcovhsp above a given coverage threshold. Gymnosperm species are represented by dashed lines, angiosperms by dotted lines and Siberian larch by the solid blue line.
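The curves described in this caption can be reproduced from a list of per-gene coverage values. The sketch below assumes one best-hit qcovhsp value per aligned gene, counts a gene when its coverage is at or above the threshold, and by default divides by the number of aligned genes; all three choices are assumptions, since the caption does not specify them.

```python
def cumulative_above(coverages, thresholds, total_genes=None):
    """For each threshold, return (threshold, count, proportion) of genes
    whose query coverage is at or above that threshold.

    total_genes: denominator for the proportion; defaults to len(coverages).
    """
    total = total_genes if total_genes is not None else len(coverages)
    if total <= 0:
        raise ValueError("total number of genes must be positive")
    rows = []
    for t in thresholds:
        count = sum(c >= t for c in coverages)
        rows.append((t, count, count / total))
    return rows

# Hypothetical coverage values (percent) for four genes.
example_rows = cumulative_above([95, 80, 55, 20], thresholds=[0, 50, 90])
```

Passing the full predicted gene count as `total_genes` instead would make the proportion reflect genes with no hit at all.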
Figure 5
Functional annotation of Siberian larch genes: (A) proportion of predicted larch genes in three functional categories (BP: biological process; MF: molecular function; CC: cellular component); (B) percentage of larch proteins in different functional categories mapped to the Arabidopsis non-redundant protein set with BLASTP match parameters of e ≤ 10^-5, pident > 20 and qcovhsp > 20.
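The three BLASTP cutoffs in panel (B) amount to a simple row filter. The sketch below assumes the hits were written in BLAST+ tabular format with the columns `qseqid sseqid pident evalue qcovhsp` in that order; the column layout, the `Hit` type, and the example identifiers are illustrative assumptions, not the authors' pipeline.

```python
import csv
import io
from typing import Iterator, NamedTuple

class Hit(NamedTuple):
    qseqid: str
    sseqid: str
    pident: float   # percent identity
    evalue: float
    qcovhsp: float  # query coverage per HSP, percent

def parse_hits(handle) -> Iterator[Hit]:
    """Parse tab-separated hits; assumes columns qseqid sseqid pident evalue qcovhsp."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        q, s, pident, evalue, qcov = row[:5]
        yield Hit(q, s, float(pident), float(evalue), float(qcov))

def passes(hit: Hit, max_evalue: float = 1e-5,
           min_pident: float = 20.0, min_qcovhsp: float = 20.0) -> bool:
    """Apply the caption's cutoffs: e <= 1e-5, pident > 20 and qcovhsp > 20."""
    return (hit.evalue <= max_evalue
            and hit.pident > min_pident
            and hit.qcovhsp > min_qcovhsp)

# Hypothetical hits; identifiers are made up for illustration.
example = io.StringIO(
    "gene1\tAT1G01010.1\t45.2\t1e-30\t88\n"
    "gene2\tAT2G02020.1\t18.0\t1e-10\t60\n"   # fails pident
    "gene3\tAT3G03030.1\t35.0\t0.001\t70\n"   # fails e-value
    "gene4\tAT4G04040.1\t52.0\t3e-06\t15\n"   # fails qcovhsp
)
kept = [h.qseqid for h in parse_hits(example) if passes(h)]
```

The identity and coverage comparisons are strict while the e-value comparison is inclusive, mirroring the inequalities as written in the caption.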
Figure 6
Percentage of genes annotated with GO terms related to cell-wall maintenance. Angiosperm species are represented by solid black columns, gymnosperms by transparent columns and Siberian larch by a yellow column. Box plots show the difference in gene number between two groups, evergreen (transparent) and deciduous (gray). Angiosperms: BP, Betula pendula; FS, Fagus sylvatica; PR, Populus trichocarpa; QR, Quercus robur; VV, Vitis vinifera. Gymnosperms: PM, Pseudotsuga menziesii; PT, Pinus taeda; PL, Pinus lambertiana; PG, Picea glauca; PA, Picea abies; LS, Larix sibirica.
Figure 7
Percentage of genes annotated with GO terms related to programmed cell death (PCD) and autophagy. Deciduous angiosperm species are represented by solid black columns, evergreen gymnosperms by transparent columns and Siberian larch by a yellow column. Box plots show the difference in gene number between two groups, evergreen (transparent) and deciduous (gray). Angiosperms: BP, Betula pendula; FS, Fagus sylvatica; PR, Populus trichocarpa; QR, Quercus robur; VV, Vitis vinifera. Gymnosperms: PM, Pseudotsuga menziesii; PT, Pinus taeda; PL, Pinus lambertiana; PG, Picea glauca; PA, Picea abies; LS, Larix sibirica.
Figure 8
Percentage of genes annotated with GO terms related to hormone signaling and response. Deciduous angiosperm species are represented by solid black columns, evergreen gymnosperms by transparent columns and Siberian larch by a yellow column. Box plots show the difference in gene number between two groups, evergreen (transparent) and deciduous (gray). Angiosperms: BP, Betula pendula; FS, Fagus sylvatica; PR, Populus trichocarpa; QR, Quercus robur; VV, Vitis vinifera. Gymnosperms: PM, Pseudotsuga menziesii; PT, Pinus taeda; PL, Pinus lambertiana; PG, Picea glauca; PA, Picea abies; LS, Larix sibirica.
Figure 9
Genome annotation workflow.
