Front Genet. 2019 Oct 4;10:915.
doi: 10.3389/fgene.2019.00915. eCollection 2019.

Long Read Single-Molecule Real-Time Sequencing Elucidates Transcriptome-Wide Heterogeneity and Complexity in Esophageal Squamous Cells

Yin-Wei Cheng et al. Front Genet. 2019.

Abstract

Esophageal squamous cell carcinoma is a leading cause of cancer death. Mapping transcriptional landscapes, including isoforms, fusion transcripts, and long noncoding RNAs, has played a central role in understanding the regulatory mechanisms underlying malignant processes. However, canonical methods such as short-read RNA-seq have difficulty resolving entire polyadenylated RNA molecules. Here, we combined single-molecule real-time sequencing with RNA-seq to generate high-quality long reads and to survey the transcriptional program in esophageal squamous cells. Compared with a recent annotation of the human transcriptome (Ensembl 38 release 91), the single-molecule real-time data identified many unannotated transcripts, novel isoforms of known genes, and an expanded repository of long intergenic noncoding RNAs (lincRNAs). By integrating an annotated lincRNA catalog, 1,521 esophageal-cancer-specific lincRNAs were defined from single-molecule real-time reads. Kyoto Encyclopedia of Genes and Genomes enrichment analysis indicated that these lincRNAs and their target genes are involved in a variety of cancer signaling pathways. Isoform usage analysis revealed shifted alternative splicing patterns, which could be recapitulated in clinical samples or were supported by previous studies. Using rigorous search criteria, we also detected multiple transcript fusions that are not documented in current gene fusion databases or readily identified from RNA-seq reads. Two novel fusion transcripts were verified by real-time PCR and Sanger sequencing. Overall, our long-read single-molecule sequencing substantially expands the current understanding of the full-length transcriptome in esophageal cells and provides novel insights into transcriptional diversity during oncogenic transformation.

Keywords: alternative splicing; esophageal squamous cell carcinoma; heterogeneity; lincRNA; long reads sequencing; transcript fusion; transcriptome.


Figures

Figure 1
The analysis pipeline for transcriptional landscapes in esophageal cells.
Figure 2
Single-molecule real-time (SMRT) sequencing identifies novel genes or isoforms of known genes. (A) Left panel: bar chart illustrating the percentages of novel isoform of known genes (purple), isoform of novel genes (green), and novel isoform of novel genes (brown). Right panel: average counts of full-length nonchimeric reads (FLNCs) in each esophageal cell line. (B) Distribution of distances from the transcription start site (TSS) of each full-length transcript to the closest epigenetic marks and Cap Analysis of Gene Expression (CAGE) tags. (C) The numbers of novel transcripts shared among different cell lines. (D) Unannotated transcripts were searched against several peptide or protein databases for each esophageal cell line.
Figure 3
Examples of isoforms of known genes. Single-molecule real-time (SMRT) transcripts detected in each cell line for (A) VIL2 and (B) TPM1. (C) Novel isoforms of VIL2 gene in KYSE510 cells. (D) Novel isoforms of TPM1 gene in KYSE510 cells. Blue: Known transcript annotations; red: known isoforms identified from SMRT data; black: novel isoforms from SMRT data. The number of FLNCs detected is shown in brackets.
Figure 4
Shifted alternative splicing pattern in esophageal cells. (A) Percentage of splicing events in each esophageal cell line. SE, skipped exon; MXE, mutually exclusive exon; A5, alternative 5′ splice site; A3, alternative 3′ splice site; AF, alternative first exon; AL, alternative last exon; and RI, retained intron. (B) Differentially spliced genes between normal-like and malignant esophageal cells are significantly enriched in three Gene Ontology (GO) terms.
Figure 5
Ring finger and CCCH-type domains 1–aldo-keto reductase family 1 member B10 (RC3H1-AKR1B10) is a differentially expressed transcript fusion in esophageal cells. (A) Schematic of the RC3H1-AKR1B10 chimeric RNA in esophageal cells. Fusion transcripts are predicted to retain intact functional regions from both parental genes. Zn: zinc finger; ROQ: Roquin domain; HN and HC: N- and C-terminal nucleotide-binding sites of the Roquin domain; Aldo_ket_red: aldo/keto reductase domain. (B) Representative RT-PCR reactions demonstrating the differentially expressed fusion in five esophageal cell lines.

