Nat Plants. 2022 Sep;8(9):1038-1051.
doi: 10.1038/s41477-022-01226-7. Epub 2022 Sep 1.

Dynamic genome evolution in a model fern

D Blaine Marchant et al. Nat Plants. 2022 Sep.

Abstract

The large size and complexity of most fern genomes have hampered efforts to elucidate fundamental aspects of fern biology and land plant evolution through genome-enabled research. Here we present a chromosomal genome assembly and associated methylome, transcriptome and metabolome analyses for the model fern species Ceratopteris richardii. The assembly reveals a history of remarkably dynamic genome evolution including rapid changes in genome content and structure following the most recent whole-genome duplication approximately 60 million years ago. These changes include massive gene loss, rampant tandem duplications and multiple horizontal gene transfers from bacteria, contributing to the diversification of defence-related gene families. The insertion of transposable elements into introns has led to the large size of the Ceratopteris genome and to exceptionally long genes relative to other plants. Gene family analyses indicate that genes directing seed development were co-opted from those controlling the development of fern sporangia, providing insights into seed plant evolution. Our findings and annotated genome assembly extend the utility of Ceratopteris as a model for investigating and teaching plant biology.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Ceratopteris richardii life cycle and genome assembly characteristics.
a, Life cycle of Ceratopteris with tissues sampled for RNA-seq in bold. Images are not to scale. b, Genome assembly of Ceratopteris with: (A) chromosomes; (B) gene density in a 3-Mb sliding window, maximum value of 139; (C) mRNA expression density in a 3-Mb sliding window, maximum value of 170; (D) long terminal repeat retrotransposon density in a 3-Mb sliding window, orange and blue bands represent Ty3 and Ty1 LTRs, respectively, maximum value of 970; and (E) intragenomic syntenic regions of ten or more genes. Green horizontal lines represent the 5th percentile, red horizontal lines represent the 95th percentile. c, Genome composition of Ceratopteris. LTR, long terminal repeat; TE, transposable element. d, Intron and exon lengths from a green alga (n = 166,499 exons; 147,269 introns), liverwort (n = 137,019 exons; 112,345 introns), moss (n = 587,902 exons; 501,233 introns), lycophyte (n = 197,720 exons; 162,895 introns), two water ferns (Azolla: n = 127,875 exons; 107,674 introns; Salvinia: n = 122,980 exons; 103,200 introns), Ceratopteris (n = 437,785 exons; 400,928 introns), gymnosperm (n = 268,745 exons; 187,344 introns), basal angiosperm (n = 111,241 exons; 83,928 introns), monocot (n = 196,916 exons; 152,273 introns) and eudicot (n = 313,952 exons; 237,746 introns). The widest part of the violin plot represents the highest point density, whereas the top and bottom are the maximum and minimum data, respectively. Box plots are in the middle of violin plots, the top and bottom lines represent 25th and 75th percentiles, the centre line is the median and whiskers are the full data range. e, Correlation between fragments per kilobase million (FPKM) and gene length at genome-wide level of Ceratopteris.
Fig. 2. Evidence of polyploidy in the evolutionary history of Ceratopteris.
a, Ks distributions of all paralogous genes (left) and tandemly duplicated paralogous genes (right). b, Placement of WGD events in the evolutionary history of Ceratopteris based on phylogenomic analyses. c, Proportion and length of syntenic regions in Ceratopteris (yellow) relative to the average of 27 flowering plant species (lilac). Inset shows the standard deviation of syntenic block length distribution relative to peak Ks value (WGD) for Ceratopteris (yellow), Azolla (blue), Salvinia (green), Ginkgo (red) and 27 flowering plant species (lilac).
Fig. 3. Whole-genome methylation profiling of Ceratopteris.
a, CG, CHG and CHH methylation level in all genes (whole gene sequence) (n = 36,857), exons (n = 159,326), introns (n = 122,469), untranslated region (UTR) (n = 32,307) and randomly selected repeats (n = 1,000). b, Introns were classified into three groups based on length: shorter than 1 kb (n = 101,510); 1–10 kb (n = 32,547); and longer than 10 kb (n = 25,269). c, CG, CHG and CHH methylation level for introns of different lengths. d, Methylation level of nine CHH contexts in repeats. e, Metaplots show distribution of DNA methylation level across gene and repeat body (right). For genes, methylation level was calculated for C-contexts in exons (left) and in whole gene sequences, including exons and introns (middle). TSS, transcription start site. TTS, transcription termination site.
Fig. 4. Transcriptome profiling and evolution of gene families for plant reproduction and architecture.
a, Venn diagram of gametophyte- and sporophyte-specific genes and associated expression heatmaps with log2(transcripts per million). b, Venn diagram of meiotic- and non-meiotic-specific genes and associated expression heatmaps with log2(transcripts per million). c, Phylogeny of FT genes and their expression in Ceratopteris. The MFT-like gene clade is highlighted in grey. d, Phylogeny of Type II MADS-box genes across green plants. MIKC*-group genes are highlighted in yellow, MIKCC-group genes are highlighted in light blue. For c and d, angiosperm genes are red, gymnosperm genes are purple, fern genes are blue, lycophyte genes are orange, bryophyte genes are green and algal genes are brown. TPM, transcripts per million.
Fig. 5. HGTs and medicinal compounds in Ceratopteris.
a, Phylogeny of aerolysin-like genes suggests an HGT from bacteria to early vascular plants and a second HGT specifically to ferns, followed by tandem duplications on chromosome 9 in Ceratopteris (highlighted). Fern genes are blue, lycophyte genes are orange, fish are purple, bacteria are cyan, fungi are brown, archaea are magenta, chlorophytic algae are dark green and red algae are rust. b, Phylogeny of PAD genes in Ceratopteris and leptosporangiate ferns suggests an HGT from bacteria to ferns followed by rampant tandem duplication across chromosome 11 (highlighted). Polypodiales are blue, Schizaeales are red, Salviniales are light green, Gleicheniales are cyan, Cyatheales are orange and bacteria are pink. c, Expression of Ceratopteris PAD genes (CrPADS) across tissues/life stages. d, Metabolic profile of previously identified medicinal compounds in Ceratopteris. CADGS, casuarine 6-alpha-d-glucoside; CGS, casuarine 3-glucoside; 5-CQA, cis-5-caffeoylquinic acid; 3,5-DCQA, 3,5-di-O-caffeoylquinic acid; 3,4-DICQA, 4,5-di-O-caffeoylquinic acid; eDHKO, ent-7alpha,12beta-dihydroxy-16-kauren-19,6beta-olide; eHKOC, ent-17-hydroxy-15-kauren-19-oic acid; FNBG, flavanone_7-O-beta-d-glucoside; KAFRP, kaempferol 3-arabinofuranoside 7-rhamnofuranoside; KFRX, kaempferol 3-rhamnoside 7-xyloside; luteolin, 2,4′,5,7-tetrahydroxyflavanone; 7-OLRP, 7-O-alpha-l-rhamnopyranoside. Data are presented as means ± s.e.m. (n = 6).
Extended Data Fig. 1. MAPS; observed results and null and positive simulations.
Percentage of subtrees that contain a gene duplication shared by descendant species at each node, results from observed data (red line), 100 resampled sets of null simulations (black lines), and positive simulations (gray lines). The red oval corresponds to the putative ancient WGD.
Extended Data Fig. 2. Self-synteny analysis of Ceratopteris by chromosome (gene rank order).
Syntenic blocks (≥10 genes) are identified in red.
Extended Data Fig. 3. Synteny analysis of Ceratopteris chromosomes and Salvinia cucullata scaffolds.
Syntenic anchor BLAST hits are black points, nearby non-anchor BLAST hits are blue points, and non-syntenic reciprocal best score BLAST hits are orange points.
Extended Data Fig. 4. Horizontal gene transfers (HGTs) and their evolution among ferns.
A, Phylogeny of Tma12 with protein conservation analysis of the chitin-binding domain. B, Expression heatmap in tissues of Ceratopteris. C, HiC anchoring of Chromosome 9 with the region where the aerolysin-like genes are located outlined in black, confirming their presence in the Ceratopteris genome assembly. D, Expression patterns of the aerolysin-like genes in Ceratopteris.
Extended Data Fig. 5. Transcriptome-based phylogeny of aerolysin-like genes.
Two horizontal gene transfers are evident, one fern-specific and one in early land plants. Aerolysin-like genes that were tandemly duplicated on Chromosome 9 are highlighted. Fern genes are blue, lycophyte genes are orange, bryophyte genes are light green, fish genes are purple, bacteria genes are cyan, fungal genes are brown, archaea genes are magenta, streptophytic algae genes are black, chlorophytic algae genes are dark green, and red algae genes are rust.
Extended Data Fig. 6. Analysis of Ceratopteris genes and metabolites for applications in environment and medicine.
The metabolites identified in all six biological replicates were used in this study. A, Overview of the metabolomics dataset of Ceratopteris based on HMDB (Human Metabolome Database, https://hmdb.ca). All compounds were divided into 11 clades according to the HMDB superclass. The compound numbers in each superclass were separated as positive and negative metabolites. B, KEGG enrichment of putative compounds related to human diseases. C, The Flavonoid Biosynthesis pathway in Ceratopteris. 4CL, 4-coumarate:CoA ligase; CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3’H, flavonoid 3’-hydroxylase; FLS, flavonol synthase; OMT1, O-methyltransferase; UGT, UDP-dependent glycosyltransferase; DFR, dihydroflavonol 4-reductase; LDOX/ANS, leucoanthocyanidin dioxygenase/anthocyanidin synthase; AAT, anthocyanin acyltransferase.
