Genome Res. 2018 Aug;28(8):1126-1135.
doi: 10.1101/gr.231100.117. Epub 2018 Jun 28.

Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line

Maria Nattestad et al. Genome Res. 2018 Aug.

Abstract

The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single-molecule sequencing from Pacific Biosciences and developed one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discovered a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution.

Figures

Figure 1.
Variants found in SK-BR-3 with PacBio long-read sequencing. (A) Circos (Krzywinski et al. 2009) plot showing long-range (larger than 10 kbp or inter-chromosomal) variants found by Sniffles from split-read alignments, with read coverage shown in the outer track. (B) Variant size histogram of deletions and insertions from size 50 bp up to 1 kbp found by long-read (Sniffles) and short-read (SURVIVOR 2-caller consensus) variant calling, showing similar size distributions for insertions and deletions from long reads but not for short reads, where insertions are greatly underrepresented. (C) Sniffles variant counts by type for variants above 1 kbp in size, including translocations and inverted duplications.
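The "long-range" criterion in panel A (larger than 10 kbp or inter-chromosomal) can be written as a small filter. This is only a sketch of that definition, not the Sniffles implementation: the `Variant` record, the `is_long_range` helper, and the coordinates are hypothetical, and the strict "greater than" comparison is an assumption about how the threshold is applied.

```python
from dataclasses import dataclass

# Threshold taken from the caption: "larger than 10 kbp".
LONG_RANGE_MIN_BP = 10_000


@dataclass(frozen=True)
class Variant:
    """Hypothetical two-breakpoint record; not a Sniffles data structure."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def is_long_range(v: Variant) -> bool:
    """True if the variant is inter-chromosomal or spans more than 10 kbp."""
    if v.chrom1 != v.chrom2:
        return True
    return abs(v.pos2 - v.pos1) > LONG_RANGE_MIN_BP


# Illustrative coordinates only; these are not calls from the paper.
calls = [
    Variant("chr17", 1_000_000, "chr8", 5_000_000),   # inter-chromosomal
    Variant("chr17", 1_000_000, "chr17", 1_000_300),  # 300 bp, short-range
    Variant("chr8", 2_000_000, "chr8", 2_050_000),    # 50 kbp, long-range
]
long_range = [v for v in calls if is_long_range(v)]
```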
Figure 2.
Comparing results of mapping and variant calling between PacBio and Illumina paired-end sequencing. (A) Venn diagram showing the intersection of structural variants between the Sniffles call set versus the SURVIVOR 2-caller consensus, with counts indicated. (B) Percentage of variant calls in each area of Venn diagram in A that have matching CNV calls within 50 kbp (the smallest segment allowed in segmentation), where a CNV is a difference in copy number (long-read sequencing) between segments of at least 28×, the diploid average. (C) Venn diagram showing the intersection of long-range variants between the Sniffles call set versus the SURVIVOR 2-caller consensus. Validation rates are shown as percentages below the counts for each category, and extrapolated overall validation rates are shown for Sniffles and SURVIVOR.
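The matching rule in panel B can be sketched as follows, under one reading of the caption: a CNV boundary is the junction between two adjacent segments whose coverage differs by at least 28×, and a variant breakpoint "matches" if such a boundary lies within 50 kbp on the same chromosome. The `Segment` record, both helper functions, the inclusive window, and the example coverages are assumptions for illustration, not the authors' code or data.

```python
from dataclasses import dataclass

MATCH_WINDOW_BP = 50_000   # "within 50 kbp" from the caption
MIN_COVERAGE_STEP = 28.0   # "at least 28x, the diploid average"


@dataclass(frozen=True)
class Segment:
    """Hypothetical copy-number segment with mean long-read coverage."""
    chrom: str
    start: int
    end: int
    coverage: float


def cnv_boundaries(segments):
    """Return (chrom, position) junctions where adjacent segments on the
    same chromosome differ in coverage by at least MIN_COVERAGE_STEP."""
    ordered = sorted(segments, key=lambda s: (s.chrom, s.start))
    boundaries = []
    for left, right in zip(ordered, ordered[1:]):
        same_chrom = left.chrom == right.chrom
        step = abs(right.coverage - left.coverage)
        if same_chrom and step >= MIN_COVERAGE_STEP:
            boundaries.append((left.chrom, right.start))
    return boundaries


def has_matching_cnv(chrom, pos, boundaries):
    """True if any CNV boundary on `chrom` lies within the match window."""
    return any(
        c == chrom and abs(p - pos) <= MATCH_WINDOW_BP
        for c, p in boundaries
    )


# Illustrative segments only; coverages are made up.
segments = [
    Segment("chr17", 0, 1_000_000, 28.0),
    Segment("chr17", 1_000_000, 2_000_000, 84.0),
    Segment("chr17", 2_000_000, 3_000_000, 90.0),
]
boundaries = cnv_boundaries(segments)
```

With these made-up segments, only the junction at position 1,000,000 qualifies, because the second junction has a coverage step of 6×, below the 28× threshold.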
Figure 3.
Reconstruction of the copy number amplification of the ERBB2 oncogene. (A) Copy number and translocations for the amplified region on Chr 17 that includes ERBB2 showing the relations to Chr 8. Note Chr 8 has extensive rearrangements shown by the green intra-chromosomal arcs. (B) Sequence of events that best explains the copy number and translocations found in this region. Segment 1 (orange) first translocated into Chr 8, followed by the segment 2 (yellow) translocating to a different place on Chr 8. Then, the segment 3 (green) was duplicated from segment 2 by an inversion of the piece between variants D and E along with a 1.5-Mb piece of Chr 8 that was attached at variant E, all of which then attached at variant C. The whole green segment including the 1.5 Mb of Chr 8 then underwent an inverted duplication at variant D. The purple segment could have come from the orange, yellow, or green sequences since it only shares breakpoint A. Additionally, there is a deletion of 10,305 bp between breakpoints D and E.
Figure 4.
The KLHDC2-SNTB1 gene fusion in SK-BR-3 occurs through a series of three variants and is directly observed to link the two genes in several individual SMRT-seq reads (A), one of which is shown in detail in B.

References

Arthur JG, Chen X, Zhou B, Urban AE, Wong WH. 2018. Detection of complex structural variation from paired-end sequencing data. bioRxiv doi: 10.1101/200170.
Asmann YW, Hossain A, Necela BM, Middha S, Kalari KR, Sun Z, Chai HS, Williamson DW, Radisky D, Schroth GP, et al. 2011. A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines. Nucleic Acids Res 39: e100.
Chaisson MJP, Huddleston J, Dennis MY, Sudmant PH, Malig M, Hormozdiari F, Antonacci F, Surti U, Sandstrom R, Boitano M, et al. 2015. Resolving the complexity of the human genome using single-molecule sequencing. Nature 517: 608–611.
Chen K, Navin NE, Wang Y, Schmidt HK, Wallis JW, Niu B, Fan X, Zhao H, McLellan MD, Hoadley KA, et al. 2013. BreakTrans: uncovering the genomic architecture of gene fusions. Genome Biol 14: R87.
Chen X, Schulz-Trieglaff O, Shaw R, Barnes B, Schlesinger F, Källberg M, Cox AJ, Kruglyak S, Saunders CT. 2016. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32: 1220–1222.