BMC Genomics. 2020 Feb 12;21(1):153.
doi: 10.1186/s12864-019-6432-4.

Pan-tissue transcriptome analysis of long noncoding RNAs in the American beaver Castor canadensis

Amita Kashyap et al. BMC Genomics.

Abstract

Background: Long noncoding RNAs (lncRNAs) have roles in gene regulation, epigenetics, and molecular scaffolding and it is hypothesized that they underlie some mammalian evolutionary adaptations. However, for many mammalian species, the absence of a genome assembly precludes the comprehensive identification of lncRNAs. The genome of the American beaver (Castor canadensis) has recently been sequenced, setting the stage for the systematic identification of beaver lncRNAs and the characterization of their expression in various tissues. The objective of this study was to discover and profile polyadenylated lncRNAs in the beaver using high-throughput short-read sequencing of RNA from sixteen beaver tissues and to annotate the resulting lncRNAs based on their potential for orthology with known lncRNAs in other species.

Results: Using de novo transcriptome assembly, we found 9528 potential lncRNA contigs and 187 high-confidence lncRNA contigs. Of the high-confidence lncRNA contigs, 147 have no known orthologs (and thus are putative novel lncRNAs) and 40 have mammalian orthologs. The novel lncRNAs mapped to the Oregon State University (OSU) reference beaver genome with greater than 90% sequence identity. While the novel lncRNAs were on average shorter than their annotated counterparts, they were similar to the annotated lncRNAs in terms of the relationships between contig length and minimum free energy (MFE) and between coverage and contig length. We identified beaver orthologs of known lncRNAs such as XIST, MEG3, TINCR, and NIPBL-DT. We profiled the expression of the 187 high-confidence lncRNAs across 16 beaver tissues (whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovary, placenta, castor gland, tail, toe-webbing, and tongue) and identified both tissue-specific and ubiquitous lncRNAs.
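The figure captions report expression as log2(1 + RPKM). As a minimal sketch of that transform, and of one simple way to separate tissue-specific from ubiquitous contigs, the following may help. The RPKM formula is the standard definition; the detection threshold of 1 RPKM, the "at most two tissues" rule, and all function names are illustrative assumptions, not the authors' stated method.

```python
import math
from typing import Dict


def rpkm(read_count: int, contig_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (contig_length_bp * total_mapped_reads)


def log_expression(rpkm_value: float) -> float:
    """log2(1 + RPKM); the +1 keeps zero expression at zero."""
    return math.log2(1.0 + rpkm_value)


def classify_expression(rpkm_by_tissue: Dict[str, float],
                        expressed_threshold: float = 1.0,  # assumed cutoff
                        max_specific_tissues: int = 2      # assumed cutoff
                        ) -> str:
    """Label a contig by how many tissues express it above the threshold."""
    expressed = [t for t, v in rpkm_by_tissue.items() if v >= expressed_threshold]
    if not expressed:
        return "not detected"
    if len(expressed) == len(rpkm_by_tissue):
        return "ubiquitous"
    if len(expressed) <= max_specific_tissues:
        return "tissue-specific"
    return "intermediate"


# Hypothetical values for illustration only, not data from the study.
example = {"spleen": 12.0, "ovary": 8.5, "liver": 0.1, "brain": 0.0}
print(classify_expression(example))
```

A continuous specificity score would avoid hard cutoffs, but the thresholded version is easier to read against a heatmap.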

Conclusions: To our knowledge, this is the first report of systematic identification of lncRNAs and their expression atlas in beaver. LncRNAs, both novel and those with known orthologs, are expressed in each of the beaver tissues that we analyzed. For some beaver lncRNAs with known orthologs, the tissue-specific expression patterns were phylogenetically conserved. The lncRNA sequence data files and raw sequence files are available via the web supplement and the NCBI Sequence Read Archive, respectively.

Keywords: Beaver; Castor canadensis; Expression atlas; Long noncoding RNA; Transcriptome; lncRNA.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Noncoding transcript contigs’ model-based structural stability is inversely correlated with length. Marks indicate lncRNA contigs that have no known orthologs (“novel”; a) and that have known noncoding orthologs (“known”, b). The outlier in (b) is labeled by its known ortholog, XIST
Fig. 2
The lncRNA contigs with known orthologs are longer than the novel lncRNA contigs. Density distributions of contig lengths for the 147 novel noncoding transcript contigs (“novel”) and the 40 noncoding transcript contigs that are orthologous to known noncoding transcripts (“known”)
Fig. 3
In the pan-tissue transcriptome assembly, known lncRNA contigs had overall higher coverage levels than novel lncRNA contigs. Density distributions of contig coverage depths for the 147 novel noncoding transcript contigs (“novel”) and the 40 noncoding transcript contigs that are orthologous to known noncoding transcripts (“known”). For both sets of noncoding transcript contigs, average depth of coverage in the assembly was not significantly correlated with contig length (Fig. 5)
Fig. 4
Tissue-specific expression of novel lncRNAs in the American beaver. Heatmap rows correspond to the 147 contigs and columns correspond to the 16 tissues that were profiled. Cells are colored by log2(1 + RPKM) expression level. Rows and columns are separately ordered by hierarchical agglomerative clustering and cut-based sub-dendrograms are colored (arbitrary color assignment to sub-clusters) as a guide for visualization. Rows are labeled with abbreviated contig names, e.g., contig4731.1 instead of Ccan_OSU1_lncRNA_contig4731.1
Fig. 5
Contig average depth of read coverage in the assembly is not correlated with contig length. Marks indicate contigs that do not have orthologs (a, 147 contigs) or that are orthologous to known noncoding transcripts (b, 40 contigs). The outlier in (b) is labeled by its known ortholog, XIST
Fig. 6
Tissue-specific expression of beaver lncRNAs that are orthologous to known noncoding transcripts. Heatmap rows correspond to the 40 contigs and columns correspond to the 16 tissues that were profiled. Cells are colored by log2(1 + RPKM) expression level. Rows and columns are separately ordered by hierarchical agglomerative clustering and cut-based sub-dendrograms are colored (arbitrary color assignment to sub-clusters) as a guide for visualization. Rows are labeled with abbreviated contig names, e.g., contig29838.1 instead of Ccan_OSU1_lncRNA_contig29838.1
Fig. 7
Predicted minimum-free energy secondary structures of the putative beaver MEG3 lncRNA Ccan_OSU1_lncRNA_contig11359.1 (a) and the homologous sequence of human MEG3 (b). False color indicates base pairing probability (see colormap in panel a)
Fig. 8
Predicted minimum-free energy secondary structure of the novel spleen- and ovary-specific lncRNA Ccan_OSU1_lncRNA_contig44966.1, showing relatively high pairing probabilities. False color indicates base pairing probability (see colormap)
Fig. 9
Overview of the computational pipeline for identifying beaver lncRNAs. Transcript contigs from the consensus transcriptome (“Merged Transcriptome” in the figure) were sequentially filtered using (1) Basic Local Alignment Search Tool for nucleotide sequence (BLASTn) against the NCBI nucleotide database to eliminate probable orthologs of protein-coding genes, known lncRNAs, and other non-lncRNA transcript types; (2) CPAT to detect and eliminate contigs with protein-coding ORFs or nucleotide hexamer usage patterns that are consistent with protein-coding genes; (3) HMMscan against the Pfam database to identify matches to protein domain motifs; and (4) BLASTn alignment against the OSU draft beaver genome assembly to eliminate contigs that overlapped with scaffold regions that were annotated (by MAKER) as protein-coding genes. Contigs discovered by the annotation pipeline that are orthologs of known lncRNAs are shown in purple, and novel noncoding contigs identified by the annotation pipeline are shown in green
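The sequential filtering described in the Fig. 9 caption can be sketched as a chain of elimination steps over per-contig annotations. This is only an illustration of the control flow: the `Contig` fields, the hit-type labels, and the choice to apply the coding filters to every contig before labeling it "known" or "novel" are assumptions made here, and the sketch does not run BLASTn, CPAT, HMMscan, or MAKER; it assumes their results have already been summarized as simple flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contig:
    name: str
    # Hypothetical summary of the best BLASTn hit against the nucleotide
    # database: "protein_coding", "lncRNA", "other", or None for no hit.
    nt_hit_type: Optional[str]
    cpat_coding: bool          # CPAT flags the contig as likely coding
    pfam_domain_hit: bool      # HMMscan finds a Pfam protein domain match
    overlaps_maker_gene: bool  # overlaps a MAKER protein-coding annotation


def classify_contig(c: Contig) -> str:
    """Apply the four filters in order; return 'discarded', 'known', or 'novel'."""
    # (1) BLASTn: drop probable protein-coding or other non-lncRNA orthologs.
    if c.nt_hit_type in ("protein_coding", "other"):
        return "discarded"
    # (2) CPAT: drop contigs with coding potential.
    if c.cpat_coding:
        return "discarded"
    # (3) HMMscan/Pfam: drop contigs matching protein domain motifs
    #     (assumed here to be an elimination step).
    if c.pfam_domain_hit:
        return "discarded"
    # (4) Genome alignment: drop contigs overlapping annotated coding genes.
    if c.overlaps_maker_gene:
        return "discarded"
    # Survivors with a known lncRNA ortholog are "known"; the rest are "novel".
    return "known" if c.nt_hit_type == "lncRNA" else "novel"


# Hypothetical contigs for illustration only.
contigs = [
    Contig("example_contig_A", None, False, False, False),
    Contig("example_contig_B", "lncRNA", False, False, False),
    Contig("example_contig_C", "protein_coding", False, False, False),
    Contig("example_contig_D", None, False, True, False),
]
labels = {c.name: classify_contig(c) for c in contigs}
print(labels)
```

Ordering the cheap lookups first and returning at the first failed filter mirrors the "sequentially filtered" wording of the caption; each step only sees contigs that survived the previous one.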
