Nat Biotechnol. 2021 Dec;39(12):1517-1520.
doi: 10.1038/s41587-021-00965-w. Epub 2021 Jul 1.

Nanopore sequencing of single-cell transcriptomes with scCOLOR-seq

Martin Philpott et al. Nat Biotechnol. 2021 Dec.

Abstract

Here we describe single-cell corrected long-read sequencing (scCOLOR-seq), which enables error correction of barcode and unique molecular identifier oligonucleotide sequences and permits standalone cDNA nanopore sequencing of single cells. Barcodes and unique molecular identifiers are synthesized using dimeric nucleotide building blocks that allow error detection. We illustrate the use of the method for evaluating barcode assignment accuracy, differential isoform usage in myeloma cell lines, and fusion transcript detection in a sarcoma cell line.
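As a rough illustration of how dimeric building blocks can expose errors, the sketch below assumes each barcode or UMI position is synthesized as a homodimer (the same base twice, e.g. `AA`, `CC`), so a block whose two bases disagree signals an error at that position. The function names are hypothetical and this is not the authors' pipeline; it only handles substitutions, since insertions and deletions would shift the block frame.

```python
def split_dimers(seq):
    """Split a sequence into consecutive 2-base blocks."""
    if len(seq) % 2 != 0:
        raise ValueError("sequence length must be even for dimer blocks")
    return [seq[i:i + 2] for i in range(0, len(seq), 2)]


def check_dimer_sequence(seq):
    """Collapse homodimer blocks to single bases and flag mismatched blocks.

    Returns (collapsed, error_blocks): the collapsed sequence, with 'N' at
    blocks whose two bases disagree, and the indices of those blocks.
    """
    collapsed = []
    error_blocks = []
    for idx, block in enumerate(split_dimers(seq)):
        if block[0] == block[1]:
            collapsed.append(block[0])
        else:
            # The two copies disagree, so at least one base was misread.
            collapsed.append("N")
            error_blocks.append(idx)
    return "".join(collapsed), error_blocks


clean = check_dimer_sequence("AACCGGTT")
noisy = check_dimer_sequence("AACTGGTT")
```

Here `clean` has no flagged blocks, while `noisy` flags the second block, where `CT` is not a valid homodimer.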

Conflict of interest statement

A.T. is a full-time employee and shareholder of Bristol Myers Squibb and visiting professor, Weatherall Institute of Molecular Medicine, University of Oxford. M.P., U.O. and A.P.C. are inventors on patents filed by Oxford University Innovations for single-cell technologies and are founders of Caeruleus Genomics. T.B. Jr. is a director and shareholder of ATDBio. T.B. Sr. is a director of ATDBio. The other authors declare no competing interests.

Figures

Fig. 1. Developing a strategy to error-correct barcode and UMI sequences from droplet-based sequencing.
a, Schematic bead and oligonucleotide structure using dimer blocks of nucleotides for Buc-seq. b, Cell barcode-assignment strategy. c, UMI deduplication strategy. d, Simulated data showing the number of barcodes recovered with increasing simulated sequencing error rates. e,f, Simulated data showing the difference and coefficient of variation between the deduplicated UMIs and the ground truth. Correction of the UMI counts was performed using a basic directional network-based approach after accounting for sequencing errors within homodimeric blocks of nucleotides.
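The directional network-based deduplication mentioned in the caption can be sketched as follows. This is a minimal version assuming the commonly used rule that a UMI absorbs a neighbour within Hamming distance 1 when its count is at least twice the neighbour's count minus one. The threshold, the function names and the example counts are assumptions for illustration, not details taken from the paper.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def directional_dedup(umi_counts, max_dist=1):
    """Group UMIs so that each group represents one original molecule.

    umi_counts maps UMI sequence -> read count. Returns a list of groups,
    each a list of UMIs with the presumed true UMI first.
    """
    # Visit UMIs from most to least abundant (ties broken alphabetically).
    umis = sorted(umi_counts, key=lambda u: (-umi_counts[u], u))
    assigned = set()
    groups = []
    for seed in umis:
        if seed in assigned:
            continue
        assigned.add(seed)
        group = [seed]
        stack = [seed]
        while stack:
            parent = stack.pop()
            for child in umis:
                if child in assigned or len(child) != len(parent):
                    continue
                close = hamming(parent, child) <= max_dist
                # Directional rule: the parent must be clearly more abundant.
                dominant = umi_counts[parent] >= 2 * umi_counts[child] - 1
                if close and dominant:
                    assigned.add(child)
                    group.append(child)
                    stack.append(child)
        groups.append(group)
    return groups


# Hypothetical counts for UMIs observed at one gene in one cell.
counts = {"ACGT": 10, "ACGA": 2, "ACCA": 1, "TTTT": 5}
groups = directional_dedup(counts)
n_molecules = len(groups)
```

In this toy input, `ACGA` and `ACCA` are chained onto `ACGT` as likely sequencing errors, while `TTTT` remains a separate molecule.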
Fig. 2. scCOLOR-seq identifies transcript isoform diversity and fusion transcripts in cancer cell line models.
a,b, Human HEK293T and mouse 3T3 cells were mixed at a 1:1 ratio and approximately 1,200 cells were taken for encapsulation and cDNA synthesis followed by nanopore sequencing. a, A Barnyard plot showing the expression of mouse and human UMIs before quality filtering using an edit distance of 6. b, A UMAP plot of data after quality filtering showing the clustering of human, mouse or mixed human and mouse cells after barcode correction using an edit distance of 6. Insets: bar plots show the specificity of UMIs aligning to either the human or mouse UMAP cluster. c–h, NCI-H929, DF15 and JJN3 myeloma cell lines were mixed at a 1:1:1 ratio and approximately 1,200 cells were taken for cDNA synthesis and sequenced using a PromethION flow cell. c,d, UMAP plot of gene expression (c) and transcript isoform expression (d). e, Principal CD74 (also known as HLA-DR) splice variants showing all protein-coding transcripts. f–h, UMAP plot showing the isoform expression of detected CD74 transcripts ENST00000377775.7 (f), ENST00000353334.10 (g) and ENST00000009530.12 (h). i, A UMAP plot of total fusion transcripts in Ewing’s cells mapped as a percentage of the total RNA of the cell. j, A UMAP plot showing the expression of the EWS-FLI fusion transcript. k, A schematic showing the structure of the EWSR1 and FLI1 genes. The EWS-FLI fusion transcript consists of the 5′ end of the EWSR1 gene and the 3′ end of the FLI1 gene. Arrowheads denote known fusion events and the most common type-1 fusion transcript is shown. l, A circular representation of the fusion transcripts identified between FLI1 and EWSR1.
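Barcode correction "using an edit distance of 6" can be pictured as assigning each observed barcode to its nearest reference barcode, provided the Levenshtein distance does not exceed the threshold. The sketch below is one plausible reading under that assumption. The whitelist, the tie-rejection rule and the function names are hypothetical and are not described in the source.

```python
def levenshtein(a, b):
    """Edit distance allowing substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def assign_barcode(read_barcode, whitelist, max_dist=6):
    """Return the closest whitelist barcode, or None if ambiguous or too far."""
    scored = sorted((levenshtein(read_barcode, w), w) for w in whitelist)
    if not scored or scored[0][0] > max_dist:
        return None
    # Reject reads that are equally close to two different barcodes.
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None
    return scored[0][1]


# Hypothetical reference barcodes, for illustration only.
whitelist = ["AACCGGTT", "TTGGCCAA"]
hit = assign_barcode("AACCGGTA", whitelist, max_dist=2)
miss = assign_barcode("AAAAAAAA", ["CCCCCCCC"], max_dist=2)
```

A looser threshold recovers more reads but raises the chance of assigning a read to the wrong cell, which is the trade-off the Barnyard plot in panel a is used to assess.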

