Life Sci Alliance. 2024 May 22;7(8):e202302481.
doi: 10.26508/lsa.202302481. Print 2024 Aug.

A multiomic characterization of the leukemia cell line REH using short- and long-read sequencing

Mariya Lysenkova Wiklander et al. Life Sci Alliance.

Abstract

The B-cell acute lymphoblastic leukemia (ALL) cell line REH, with the t(12;21) ETV6::RUNX1 translocation, is known to have a complex karyotype defined by a series of large-scale chromosomal rearrangements. Taken from a 15-yr-old at relapse, the cell line offers a practical model for the study of pediatric B-ALL. In recent years, short- and long-read DNA and RNA sequencing have emerged as a complement to karyotyping techniques in the resolution of structural variants in an oncological context. Here, we explore the integration of long-read PacBio and Oxford Nanopore whole-genome sequencing, IsoSeq RNA sequencing, and short-read Illumina sequencing to create a detailed genomic and transcriptomic characterization of the REH cell line. Whole-genome sequencing clarified the molecular traits of disrupted ALL-associated genes including CDKN2A, PAX5, BTG1, VPREB1, and TBL1XR1, as well as the glucocorticoid receptor NR3C1. Meanwhile, transcriptome sequencing identified seven fusion genes within the genomic breakpoints. Together, our extensive whole-genome investigation makes high-quality open-source data available to the leukemia genomics community.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Graphical abstract
Figure 1. Overview of the REH cell line.
(A) G-banded karyotyping used to verify the REH karyotype provided by the cell line vendor. Arrows mark the chromosomes with visible aberrations, reflecting the major features of the stemline described in the DSMZ karyotype: 46(44-47)<2n>X, -X, +16, del(3)(p22), t(4;12;21;16)(q32;p13;q22;q24.3)-inv(12)(p13q22), t(5;12)(q31-q32;p12), der(16)t(16;21)(q24.3;q22)—sideline with inv(5)der(5)(p15q31),+18. G-banded karyotyping showed that the cells in the present study did not contain the sideline. (B) Read length distributions of the long-read whole-genome sequencing datasets. (C) Variant allele frequencies of the single-nucleotide variants called by DeepVariant in the Illumina whole-genome sequencing data. The allele fractions of these single-nucleotide variants, relative to the reference alleles, are binomially distributed, with 1.0 indicating homozygous variants and a mean of 0.5 indicating heterozygous variants.
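The binomial expectation described for panel C can be illustrated with a small sketch. This is not the DeepVariant genotyping model; it only compares the likelihood of an observed alternate-read count under a heterozygous model (expected allele fraction 0.5) and a homozygous model (expected fraction close to 1.0). The function names and the 1% error rate are assumptions made for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def allele_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def likely_zygosity(alt_reads: int, depth: int, error_rate: float = 0.01) -> str:
    """Pick the more likely of two simple models for one variant site.

    error_rate is an assumed per-read error, not a value from the paper.
    """
    het = binom_pmf(alt_reads, depth, 0.5)               # heterozygous model
    hom = binom_pmf(alt_reads, depth, 1.0 - error_rate)  # homozygous model
    return "heterozygous" if het >= hom else "homozygous"


print(allele_fraction(15, 30), likely_zygosity(15, 30))
print(allele_fraction(30, 30), likely_zygosity(30, 30))
```

At low depth the two models are harder to separate, which is one reason a spread around 0.5 is expected rather than a sharp peak.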
Figure S1. Read length distribution of the REH IsoSeq dataset.
Figure S2. The variant allele frequencies of the REH single-nucleotide variants called by DeepVariant.
(A, B) The variant allele frequencies in (A) the ONT ultralong dataset, which has an error rate of 6.5%, and (B) the PacBio dataset, with an average depth of coverage of 15x.
Figure S3. Distribution of allele fractions of the REH structural variants in the three-way consensus callset.
The three-way consensus callset contains the SVs called in all three whole-genome sequencing datasets (Illumina, PacBio, and ONT). The allele fractions of these SVs are binomially distributed, with 1.0 indicating homozygous variants and a mean of 0.5 indicating heterozygous variants.
Figure 2. Structural variants detected in PacBio, ONT, and Illumina whole-genome sequencing data.
(A) Chromosomal heatmap of the long-read consensus callset, showing the total number of structural variant (SV) calls at each locus that were detected in both PacBio and ONT data. (B) Venn diagram showing the number of SV calls found to overlap in each combination of callsets. (C) Strip plots showing SVs from each callset, stratified by size, with the bottom two strips in each section visualizing the long-read consensus callset and the three-way consensus callset, respectively.
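The idea of a consensus callset, keeping only SV calls that recur across datasets, can be sketched as follows. This section does not describe how the callsets were actually merged, so the matching rule here (same chromosome, same SV type, start positions within 500 bp) is an assumption, as are the `SV` type, the function names, and all toy coordinates other than the chr3 deletion start taken from Figure S5.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    pos: int      # start breakpoint
    svtype: str   # e.g. "DEL", "INV"


def matches(a: SV, b: SV, tolerance: int = 500) -> bool:
    """Two calls match if chromosome and type agree and starts are close."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.pos - b.pos) <= tolerance
    )


def consensus(callsets: list[list[SV]], tolerance: int = 500) -> list[SV]:
    """Return calls from the first callset that have a match in every other one."""
    first, *rest = callsets
    return [
        sv
        for sv in first
        if all(any(matches(sv, other, tolerance) for other in cs) for cs in rest)
    ]


# Toy callsets; breakpoint offsets between technologies are invented.
illumina = [SV("chr3", 177050707, "DEL"), SV("chr1", 1000000, "DEL")]
pacbio = [SV("chr3", 177050710, "DEL")]
ont = [SV("chr3", 177050650, "DEL"), SV("chr1", 1000100, "INV")]

three_way = consensus([illumina, pacbio, ont])
print(three_way)
```

Passing only the two long-read callsets to `consensus` would give the analogue of a long-read consensus callset under the same assumed matching rule.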
Figure S4. Chromosomal heatmaps of the different REH SV callsets, showing the total number of SV calls at each locus.
(A) SVs detected in the Illumina dataset by TIDDIT. (B) SVs detected in the PacBio dataset by Sniffles. (C) SVs detected in the ONT dataset by Sniffles. (D) Chromosomal heatmap of the three-way consensus callset, showing the SVs that were detected in all three datasets.
Figure 3. Large-scale structural variants confirmed in REH.
The innermost circle shows inversions and interchromosomal translocations. The gray band depicts deletions >100 kb. The rainbow bands show the depth of coverage in PacBio, ONT, and Illumina datasets, respectively, followed by chromosome number. The outermost band indicates the GRCh38 reference cytoband with genes disrupted listed on the outer edge of the plot.
Figure S5. TBL1XR1 deletion in REH.
The breakpoints of the 146-kb del(3)(q26.32q26.32) at chr3:177050707 and chr3:177196318, supported by 5 ONT reads, 12 PacBio reads, and 25 Illumina reads.
Figure S6. NR3C1/ARHGAP26 deletion in REH.
The breakpoints of the 205-kb del(5)(q31.3q31.3) at chr5:143197445 and chr5:143402107, supported by 8 ONT reads, 7 PacBio reads, and 19 Illumina reads.
Figure S7. BTG1 deletion in REH.
The breakpoints of the 260-kb del(12)(q21.33q21.33) at chr12:91884416 and chr12:92144292, supported by 8 ONT reads, 12 PacBio reads, and 13 Illumina reads.
Figure S8. NFATC1 deletion in REH.
The breakpoints of the 132-kb del(18)(q23q23) at chr18:79384761 and chr18:79516951, supported by 7 ONT reads, 3 PacBio reads, and 10 Illumina reads.
Figure S9. VPREB1 deletion in REH.
The breakpoints of the 214-kb del(22)(q11.22q11.22) at chr22:22031472 and chr22:22245538, supported by 9 ONT reads, 6 PacBio reads, and 29 Illumina reads.
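The deletion sizes quoted in Figures S5 to S9 follow directly from the breakpoint coordinates. A minimal sketch of that arithmetic is shown below; the dictionary layout and function name are illustrative assumptions, and the size is taken simply as the difference between the two reported breakpoints, rounded to the nearest kilobase.

```python
# Breakpoint coordinates as reported in the captions of Figures S5 to S9.
DELETIONS = {
    "TBL1XR1": ("chr3", 177050707, 177196318),
    "NR3C1/ARHGAP26": ("chr5", 143197445, 143402107),
    "BTG1": ("chr12", 91884416, 92144292),
    "NFATC1": ("chr18", 79384761, 79516951),
    "VPREB1": ("chr22", 22031472, 22245538),
}


def deletion_size_kb(start: int, end: int) -> int:
    """Distance between two breakpoints, rounded to the nearest kb."""
    if end < start:
        raise ValueError("end breakpoint must not precede start breakpoint")
    return round((end - start) / 1000)


for gene, (chrom, start, end) in DELETIONS.items():
    print(f"{gene}: {chrom}:{start}-{end} spans {deletion_size_kb(start, end)} kb")
```

The computed values (146, 205, 260, 132, and 214 kb) agree with the sizes stated in the captions.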
Figure S10. Using ONT ultralong reads to call and confirm structural variants in REH.
(A) Interchromosomal breakends detected in ONT, PacBio, and Illumina short reads. The false-positive rate for the called SVs was 85.3% in ONT, 96.7% in PacBio, and 98.6% in the Illumina data. (B, C) Visualization of the (B) three-way breakpoint t(12;21;16) and (C) novel rearrangement der(1)inv(2)(p11.2p11.2)ins(1;2)(q21.1;p11.2) as seen in ONT ultralong reads using Ribbon.
Figure 4. REH fusion gene breakpoints, visualized by the Arriba module of the nf-core/rnafusion pipeline.
(A) The more highly expressed of the two splice variants of ETV6::RUNX1, resulting from t(12;21). (B) The most highly expressed of the five splice variants of RUNX1::PRDM7, resulting from t(16;21). (C, D, E, F, G) Fusion gene breakpoints in (C) PHAX::AC007450.2 and (D) LRP6::SLC27A6, both from t(5;12), (E) BTG1::LINC02404/AC090049.1, from del(12)(q21.33q21.33), (F) NR3C1::ARHGAP26, from del(5)(q31.3q31.3), and (G) TRAF3IP2::REV3L, from del(6)(q21q21).
Figure 5. Aberrant chromosomes of the REH cell line.
Highlighted are the retained protein domains for ETV6::RUNX1 and RUNX1::PRDM7, resulting from the two REH in-frame fusion genes. Rearrangements that could not be phased to a specific homolog have been rendered on an arbitrary homolog of their respective chromosomes.
Figure S11. NR3C1 nonsense mutation p.Gln528Ter in REH.
The mutation, at position chr5:143300653, is supported by 6 ONT reads, 12 PacBio reads, and 15 Illumina reads.
Figure S12. PAX5 frameshift mutation p.A322Rfs*19 in REH.
The mutation, at position chr9:36882052, is supported by 12 PacBio reads and 17 Illumina reads, and misidentified as a deletion in 14 ONT reads.

