BMC Genomics. 2019 Jun 6;20(1):456.
doi: 10.1186/s12864-019-5832-9.

A survey of transcriptome complexity using PacBio single-molecule real-time analysis combined with Illumina RNA sequencing for a better understanding of ricinoleic acid biosynthesis in Ricinus communis

Lijun Wang et al. BMC Genomics. 2019.

Abstract

Background: Ricinus communis is a highly economically valuable oil crop plant from the spurge family, Euphorbiaceae. However, the available reference genomes are incomplete and to date studies on ricinoleic acid biosynthesis at the transcriptional level are limited.

Results: In this study, we combined PacBio single-molecule long-read isoform sequencing with Illumina RNA sequencing to identify alternative splicing (AS) events, novel isoforms, fusion genes, long non-coding RNAs (lncRNAs) and alternative polyadenylation (APA) sites, in order to unveil the transcriptomic complexity of castor beans and identify critical genes related to ricinoleic acid biosynthesis. We identified 11,285 AS variants distributed in 21,448 novel genes and detected 520 fusion genes, 320 lncRNAs and 9511 APA sites. Furthermore, 6067, 5983 and 4058 differentially expressed genes (DEGs) between developing beans of the R. communis lines 349 and 1115, which differ greatly in oil content, were identified at 7, 14 and 21 days after flowering (DAF), respectively. Specifically, 14, 18 and 11 DEGs were annotated as encoding key enzymes related to ricinoleic acid biosynthesis, reflecting the higher castor oil content of line 1115 compared with line 349. Quantitative real-time RT-PCR further validated fifteen of these DEGs at the three time points.

Conclusion: Our results significantly improve the existing gene models of R. communis, and a putative model of key genes was built to show the differences between strains 349 and 1115, illustrating the molecular mechanism of castor oil biosynthesis. A multi-transcriptome database and candidate genes are provided to help further improve the level of ricinoleic acid in transgenic crops.

Keywords: Full-length transcriptome; Illumina RNA sequencing; Key enzymes; Ricinoleic acid biosynthesis; Ricinus communis.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Library construction of PacBio SMRT sequencing and isoform comparison between the Ricinus communis genome and full-length transcriptome. a Quality inspection of reads of inserts (ROI) in three libraries (1–2 k, 2–3 k and 3–6 k). b Quality inspection of full-length non-chimeric (FLNC) reads in three libraries (1–2 k, 2–3 k and 3–6 k). c Isoform length comparison between the reference genome and PacBio long-read data. d Comparison of isoform sequences between the Ricinus communis genome and full-length transcriptome
Fig. 2
CIRCOS visualization of gene and transcript density, comparing PacBio SMRT sequences with the Ricinus communis reference genome for the 20 longest scaffolds. a Schematic of the twenty longest scaffolds. b Heat map of gene density distribution of PacBio SMRT sequences. Gene density was calculated in a 1-Mb sliding window at 20 kb intervals. c Heat map of transcript density distribution of PacBio SMRT sequences. Gene density was calculated in a 1-Mb sliding window at 20 kb intervals. d Heat map of gene density distribution of the reference genome. Gene density was calculated in a 1-Mb sliding window at 20 kb intervals. e Heat map of transcript density distribution of the reference genome. Gene density was calculated in a 1-Mb sliding window at 20 kb intervals
Fig. 3
Identification of lncRNAs, alternative splicing events, isoform numbers and alternative polyadenylation (APA) based on transcriptome technologies. a Number and categories of alternative splicing events based on the PacBio platform. b Number and categories of isoforms based on the PacBio platform. c Number and categories of APA based on the PacBio platform. d Number of long non-coding RNAs analyzed by CNCI, CPC, PFAM and CPAT based on the PacBio platform
Fig. 4
CIRCOS visualization of lncRNA density and linkage of fusion transcripts for the 20 longest scaffolds. a Schematic of the twenty longest scaffolds. b LncRNA density, in 1 Mb bins on each chromosome. c Linkage of fusion transcripts: red, intra-chromosomal; green, inter-chromosomal
Fig. 5
Venn diagrams of differentially expressed genes (DEGs) between R. communis strains 349 and 1115 at three time points. a Number of up-regulated DEGs among 7 DAF, 14 DAF and 21 DAF (349 vs. 1115). b Number of down-regulated DEGs among 7 DAF, 14 DAF and 21 DAF. c Number of up-regulated genes among 7 DAF vs. 14 DAF, 7 DAF vs. 21 DAF and 14 DAF vs. 21 DAF (349). d Number of down-regulated genes among 7 DAF vs. 14 DAF, 7 DAF vs. 21 DAF and 14 DAF vs. 21 DAF (349). e Number of up-regulated genes among 7 DAF vs. 14 DAF, 7 DAF vs. 21 DAF and 14 DAF vs. 21 DAF (1115). f Number of down-regulated genes among 7 DAF vs. 14 DAF, 7 DAF vs. 21 DAF and 14 DAF vs. 21 DAF (1115)
Fig. 6
qRT-PCR validation of several DEGs related to the key enzymes for ricinoleic acid biosynthesis at three time points. a The RNA-Seq log2 values (expression ratios of 349-RPKM/1115-RPKM) and the qRT-PCR log2 values (expression ratios of 349/1115) of five important DEGs between strains 349 and 1115 at 7 DAF. b The RNA-Seq log2 values (expression ratios of 349-RPKM/1115-RPKM) and the qRT-PCR log2 values (expression ratios of 349/1115) of six important DEGs between strains 349 and 1115 at 14 DAF. c The RNA-Seq log2 values (expression ratios of 349-RPKM/1115-RPKM) and the qRT-PCR log2 values (expression ratios of 349/1115) of five important DEGs between strains 349 and 1115 at 21 DAF. d Correlation analysis of the DEG expression ratios obtained from the qRT-PCR and RNA-seq data of 16 DEGs (p-value < 0.05)
Fig. 7
Illustration of DEGs encoding the key enzymes of the ricinoleic acid biosynthetic pathway
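
The Fig. 2 caption states that gene density was calculated in a 1-Mb sliding window at 20 kb intervals. The sketch below shows one plausible way to compute such a density track for a single scaffold. It is an illustration only, not the authors' pipeline: the function name `sliding_window_density`, the choice to count a feature by its start coordinate, and the truncation of the last window at the scaffold end are all assumptions.

```python
from bisect import bisect_left


def sliding_window_density(starts, scaffold_length, window=1_000_000, step=20_000):
    """Count features per sliding window on one scaffold.

    starts          -- feature start coordinates (0-based); assumption: a
                       feature is assigned to a window by its start only
    scaffold_length -- scaffold length in bp
    window, step    -- window size and step in bp (1 Mb and 20 kb, matching
                       the values given in the Fig. 2 caption)

    Returns a list of (window_start, window_end, count) tuples, where each
    window covers the half-open interval [window_start, window_end).
    """
    ordered = sorted(starts)
    densities = []
    w = 0
    while True:
        # Assumption: the final window is truncated at the scaffold end.
        end = min(w + window, scaffold_length)
        # Binary search gives the number of starts in [w, end).
        count = bisect_left(ordered, end) - bisect_left(ordered, w)
        densities.append((w, end, count))
        if w + window >= scaffold_length:
            break
        w += step
    return densities


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    toy_starts = [10_000, 30_000, 1_010_000]
    for row in sliding_window_density(toy_starts, 1_040_000):
        print(row)
```

Because consecutive windows overlap by 980 kb, neighbouring values change slowly, which is what gives a heat-map track its smooth appearance. Non-overlapping bins, such as the 1 Mb bins described for Fig. 4b, correspond to setting `step` equal to `window`.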
