Front Genet. 2021 May 20;12:664260.
doi: 10.3389/fgene.2021.664260. eCollection 2021.

Large-Scale Multiplexing Permits Full-Length Transcriptome Annotation of 32 Bovine Tissues From a Single Nanopore Flow Cell

Michelle M Halstead et al. Front Genet.

Abstract

A comprehensive annotation of transcript isoforms in domesticated species is lacking. Especially considering that transcriptome complexity and splicing patterns are not well-conserved between species, this presents a substantial obstacle to genomic selection programs that seek to improve production, disease resistance, and reproduction. Recent advances in long-read sequencing technology have made it possible to directly extrapolate the structure of full-length transcripts without the need for transcript reconstruction. In this study, we demonstrate the power of long-read sequencing for transcriptome annotation by coupling Oxford Nanopore Technology (ONT) with large-scale multiplexing of 93 samples, comprising 32 tissues collected from adult male and female Hereford cattle. More than 30 million uniquely mapping full-length reads were obtained from a single ONT flow cell, and used to identify and characterize the expression dynamics of 99,044 transcript isoforms at 31,824 loci. Of these predicted transcripts, 21% exactly matched a reference transcript, and 61% were novel isoforms of reference genes, substantially increasing the ratio of transcript variants per gene, and suggesting that the complexity of the bovine transcriptome is comparable to that in humans. Over 7,000 transcript isoforms were extremely tissue-specific, and 61% of these were attributed to testis, which exhibited the most complex transcriptome of all interrogated tissues. Despite profiling over 30 tissues, transcription was only detected at about 60% of reference loci. Consequently, additional studies will be necessary to continue characterizing the bovine transcriptome in additional cell types, developmental stages, and physiological conditions. However, by here demonstrating the power of ONT sequencing coupled with large-scale multiplexing, the task of exhaustively annotating the bovine transcriptome - or any mammalian transcriptome - appears significantly more feasible.

Keywords: alternative splicing; annotation; cattle; full-length transcript; long-read sequencing; tissue-specific; transcriptome.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Preliminary analysis of transcriptomes. (A) Principal components analysis of VST-normalized gene counts. (B) Hierarchical clustering of samples based on top 5,000 genes with highest variance in VST counts.
FIGURE 2
Predicted transcripts capture transcriptome complexity. (A) Comparison of predicted isoforms to Ensembl and NCBI gene annotations. (B) Frequency of alternative splicing events in predicted multi-exon transcript isoforms. (C) Predicted isoforms at the RSPH9 locus, which is thought to code for a component of motile cilia and flagella. In humans, multiple splicing is known to produce transcript variants, but only one transcript had been annotated in cattle, according to both the Ensembl and NCBI annotations. (D) Based on the predicted transcript set, number of expressed loci and ratio of expressed transcripts per locus, averaged per tissue.
FIGURE 3
Identification of tissue-specific isoforms. (A) Density plot of the tissue-specificity index (TSI) identified for each predicted transcript, based on average transcripts per million (TPM) in each tissue. (B) Density plot of TSI for predicted transcripts with low (average TPM < 1), moderate (1 ≤ average TPM < 10), or high expression (average TPM ≥ 10). (C) Number of tissue-specific transcripts (TSI ≥ 0.8) attributed to each tissue, categorized as known or novel isoforms, novel loci, or potential artifacts. (D) The annotated transcript at the CRYM locus was expressed across a range of tissues, whereas novel isoforms were either testis- or brain-specific. (E) Functional enrichment of genes corresponding to tissue-specific isoforms in brain cortex, kidney, liver, muscle, and testis. Top five most significant gene ontology terms reported (Benjamini corrected p-value < 0.05).
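The thresholds quoted in this caption can be illustrated with a minimal sketch. The caption does not give the TSI formula, so the code below assumes one common definition: the fraction of a transcript's summed expression that comes from its highest-expressing tissue. The function names, tissue labels, and TPM values are hypothetical; only the cutoffs (TSI ≥ 0.8; average TPM of 1 and 10) come from the caption.

```python
from statistics import mean

# Cutoff quoted in the Figure 3 caption for calling a transcript tissue-specific.
TSI_CUTOFF = 0.8


def tissue_specificity_index(avg_tpm_by_tissue):
    """Assumed TSI definition: max per-tissue TPM divided by summed TPM.

    The caption does not state the formula; this is one common choice.
    """
    values = list(avg_tpm_by_tissue.values())
    total = sum(values)
    if total == 0:
        return 0.0  # unexpressed transcript: no specificity to report
    return max(values) / total


def expression_class(avg_tpm_by_tissue):
    """Bin a transcript by average TPM using the caption's thresholds."""
    avg = mean(list(avg_tpm_by_tissue.values()))
    if avg < 1:
        return "low"
    if avg < 10:
        return "moderate"
    return "high"


def is_tissue_specific(avg_tpm_by_tissue):
    """Apply the TSI >= 0.8 cutoff from the caption."""
    return tissue_specificity_index(avg_tpm_by_tissue) >= TSI_CUTOFF


# Hypothetical per-tissue average TPM values for two transcripts.
testis_biased = {"testis": 90.0, "brain_cortex": 5.0, "liver": 5.0}
broadly_expressed = {"testis": 4.0, "brain_cortex": 3.0, "liver": 3.0}

for name, profile in [("testis_biased", testis_biased),
                      ("broadly_expressed", broadly_expressed)]:
    print(name,
          round(tissue_specificity_index(profile), 2),
          expression_class(profile),
          is_tissue_specific(profile))
```

Under this assumed definition, a transcript expressed evenly across n tissues scores 1/n, and a transcript detected in a single tissue scores 1.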
FIGURE 4
Characterization of predicted transcripts at novel loci. (A) The ten most represented KEGG pathways and GO terms (separated into Cellular Component, Molecular Function, and Biological Process terms) among transcripts at novel loci that corresponded to a UniProt identifier. (B) Coding potential of predicted transcripts. (C) Novel non-coding antisense transcript at the CEP63 locus. (D) Highly expressed section of chromosome 16. RepeatMasker track shows repetitive elements, which were depleted in the highly expressed region (highlighted in yellow).
