Plant J. 2021 Apr;106(1):275-293.
doi: 10.1111/tpj.15161. Epub 2021 Feb 8.

Aethionema arabicum genome annotation using PacBio full-length transcripts provides a valuable resource for seed dormancy and Brassicaceae evolution research

Noe Fernandez-Pozo et al. Plant J. 2021 Apr.

Abstract

Aethionema arabicum is an important model plant for Brassicaceae trait evolution, particularly of seed (development, regulation, germination, dormancy) and fruit (development, dehiscence mechanisms) characters. Its genome assembly was recently improved, but the gene annotation was not updated. Here, we improved the Ae. arabicum gene annotation using 294 RNA-seq libraries and 136 307 full-length PacBio Iso-seq transcripts, increasing BUSCO completeness by 11.6% and adding 5606 genes. Analysis of orthologs showed a lower number of genes in Ae. arabicum than in other Brassicaceae, which could be partially explained by loss of homeologs derived from the At-α polyploidization event and by a lower occurrence of tandem duplications after divergence of Aethionema from the other Brassicaceae. Benchmarking of MADS-box genes identified orthologs of FUL and AGL79 not found in previous versions. Analysis of full-length transcripts related to ABA-mediated seed dormancy revealed a conserved isoform of PIF6-β and antisense transcripts in ABI3, ABI4 and DOG1, along with other cases of alternative splicing that differ between the Turkey and Cyprus ecotypes. The presented data enable mining of alternative splicing and support the formulation of numerous hypotheses for research in evolution and functional genomics. Annotation data and sequences are available at the Ae. arabicum DB (https://plantcode.online.uni-marburg.de/aetar_db).

Keywords: Aethionema arabicum; Brassicaceae evolution; Iso-seq; alternative splicing; genome annotation; seed germination; transcription factors.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Gene annotation workflow based on MAKER and supported by evidence from multiple sources. Transcripts from long and short reads were used as evidence for gene expression. Swiss‐Prot Embryophyta proteins and predicted proteins free of errors from gene annotation v3.0 were used to support the gene prediction. SNAP and Augustus were used as ab initio predictors. Red arrows show analysis steps for short reads, blue dashed arrows for long reads. Red X’s represent discarded reads because they were mapped to organelle or rRNA sequences.
Figure 2
Comparison of repetitive element complements across Brassicaceae. All genome assemblies were annotated using the REPET approach described in the Experimental Procedures. Bars indicate the cumulative coverage in Mbp of different classes of repeats for each genome assembly.
Figure 3
Example of a gene model incorrectly merged that was subsequently split into two genes. (a) Final gene model after split. (b) Gene model incorrectly predicted by MAKER. (c) Long‐read RNA‐seq evidence based on PacBio full‐length transcripts. (d) Short‐read RNA‐seq based on Illumina Scallop assembly. (e) Protein evidence based on Embryophyta proteins in Swiss‐Prot. (f) SNAP ab initio prediction. (g) Augustus ab initio prediction.
Figure 4
Example of two gene models incorrectly predicted that were subsequently merged into a single gene. (a) Final gene model after merging two genes. (b) Gene models incorrectly predicted by MAKER. (c) Long‐read RNA‐seq evidence based on PacBio full‐length transcripts. (d) Short‐read RNA‐seq based on Illumina Scallop assembly. (e) Protein evidence based on Embryophyta proteins in Swiss‐Prot. (f) SNAP ab initio prediction. (g) Augustus ab initio prediction.
Figure 5
The v3.1 gene model Aa31LG9G14150 joins nine genes that had formatting or sequence errors in v3.0. The protein track shows the v3.1 gene is supported by the protein evidence.
Figure 6
The incorrectly annotated gene model Aa30LG10G286 from v3.0 was subsequently split into four genes in v3.1. The protein track shows that the v3.1 genes are supported by protein evidence.
Figure 7
Venn diagram of orthogroups of Brassicales species. Orthologs and paralogs of Ae. arabicum, A. thaliana, C. rubella, E. salsugineum and C. papaya were identified using OrthoFinder. Numbers in the Venn diagram show OG counts for the intersection of every group of species. Total gene numbers are displayed in parentheses under the species name. The numbers of species‐specific genes are shown in parentheses under the number of exclusive OGs.
Figure 8
OrthoFinder genes in orthogroups and ortholog relationships for Ae. arabicum genes. Left: percentage of genes in OGs (blue), in species‐specific OGs (light gray) and unassigned (dark gray). Right: the relationship between orthologs of Brassicales species and Ae. arabicum.
