nFuse: discovery of complex genomic rearrangements in cancer using high-throughput sequencing
- PMID: 22745232
- PMCID: PMC3483554
- DOI: 10.1101/gr.136572.111
Abstract
Complex genomic rearrangements (CGRs) are emerging as a new feature of cancer genomes. CGRs are characterized by multiple genomic breakpoints and thus have the potential to simultaneously affect multiple genes, fusing some genes and interrupting other genes. Analysis of high-throughput whole-genome shotgun sequencing (WGSS) is beginning to facilitate the discovery and characterization of CGRs, but further development of computational methods is required. We have developed an algorithmic method for identifying CGRs in WGSS data based on shortest alternating paths in breakpoint graphs. Aiming for a method with the highest possible sensitivity, we use breakpoint graphs built from all WGSS data, including sequences with ambiguous genomic origin. Since the majority of cell function is encoded by the transcriptome, we target our search to find CGRs that underlie fusion transcripts predicted from matched high-throughput cDNA sequencing (RNA-seq). We have applied our method, nFuse, to the discovery of CGRs in publicly available data from the well-studied breast cancer cell line HCC1954 and primary prostate tumor sample 963. We first establish the sensitivity and specificity of the nFuse breakpoint prediction and scoring method using breakpoints previously discovered in HCC1954. We then validate five out of six CGRs in HCC1954 and two out of two CGRs in 963. We show examples of gene fusions that would be difficult to discover using methods that do not account for the existence of CGRs, including one important event that was missed in a previous study of the HCC1954 genome. Finally, we illustrate how CGRs may be used to infer the gene expression history of a tumor.
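The idea of a shortest alternating path in a breakpoint graph can be illustrated with a minimal sketch. This is not the nFuse implementation. It assumes a toy graph whose vertices are break-ends, whose edges are either tumor breakpoints ("bp") or reference-genome adjacencies ("ref"), and whose weights are a hypothetical cost (here, zero for breakpoints and genomic distance for adjacencies). The function name, edge format, and scoring are all illustrative assumptions. The search is Dijkstra's algorithm over (vertex, kind of last edge) states, so that two edges of the same kind are never used consecutively.

```python
import heapq

def shortest_alternating_path(edges, source, target):
    """Cheapest path from source to target whose consecutive edges differ in kind.

    edges: iterable of (u, v, weight, kind) for an undirected graph, with
    non-negative weights and kind in {"bp", "ref"} (hypothetical labels).
    Returns (cost, [vertices]) or None if no alternating path exists.
    """
    adj = {}
    for u, v, w, kind in edges:
        adj.setdefault(u, []).append((v, w, kind))
        adj.setdefault(v, []).append((u, w, kind))

    # A search state is (vertex, kind of the edge used to reach it).
    # The empty string marks the start, where either kind may be taken first.
    best = {(source, ""): 0}
    prev = {}
    heap = [(0, source, "")]
    while heap:
        cost, node, last = heapq.heappop(heap)
        if cost > best.get((node, last), float("inf")):
            continue  # stale heap entry
        if node == target:
            path, state = [node], (node, last)
            while state in prev:
                state = prev[state]
                path.append(state[0])
            return cost, path[::-1]
        for nxt, w, kind in adj.get(node, []):
            if kind == last:
                continue  # enforce alternation between edge kinds
            state, new_cost = (nxt, kind), cost + w
            if new_cost < best.get(state, float("inf")):
                best[state] = new_cost
                prev[state] = (node, last)
                heapq.heappush(heap, (new_cost, nxt, kind))
    return None

# Toy example: break-ends a1 and c1 are linked through an intermediate
# segment b1..b2 (500 bp apart on the reference), i.e. two breakpoints.
toy_edges = [
    ("a1", "b1", 0, "bp"),
    ("b1", "b2", 500, "ref"),
    ("b2", "c1", 0, "bp"),
    ("a1", "x", 0, "bp"),   # cheaper route, but it would use two
    ("x", "c1", 0, "bp"),   # breakpoint edges in a row
]
result = shortest_alternating_path(toy_edges, "a1", "c1")
print(result)  # (500, ['a1', 'b1', 'b2', 'c1'])
```

In this toy case the zero-cost route through `x` is rejected because it chains two breakpoint edges, so the search returns the alternating route through the intermediate segment, which is the kind of multi-breakpoint structure a CGR-aware method is meant to recover.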
Similar articles
- Comrad: detection of expressed rearrangements by integrated analysis of RNA-Seq and low coverage genome sequence data. Bioinformatics. 2011 Jun 1;27(11):1481-8. doi: 10.1093/bioinformatics/btr184. Epub 2011 Apr 9. PMID: 21478487
- Identification of complex genomic rearrangements in cancers using CouGaR. Genome Res. 2017 Jan;27(1):107-117. doi: 10.1101/gr.211201.116. Epub 2016 Nov 14. PMID: 27986820
- Breakpoint profiling of 64 cancer genomes reveals numerous complex rearrangements spawned by homology-independent mechanisms. Genome Res. 2013 May;23(5):762-76. doi: 10.1101/gr.143677.112. Epub 2013 Feb 14. PMID: 23410887
- Using high-throughput sequencing transcriptome data for INDEL detection: challenges for cancer drug discovery. Expert Opin Drug Discov. 2016;11(3):257-68. doi: 10.1517/17460441.2016.1143813. Epub 2016 Feb 6. PMID: 26787005. Review.
- Complex genomic rearrangements: an underestimated cause of rare diseases. Trends Genet. 2022 Nov;38(11):1134-1146. doi: 10.1016/j.tig.2022.06.003. Epub 2022 Jul 9. PMID: 35820967. Review.
Cited by
- THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol. 2013 Jul 29;14(7):R80. doi: 10.1186/gb-2013-14-7-r80. PMID: 23895164
- Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Genome Med. 2014 Jan 30;6(1):5. doi: 10.1186/gm524. eCollection 2014. PMID: 24479672. Review.
- ChiTaH: a fast and accurate tool for identifying known human chimeric sequences from high-throughput sequencing data. NAR Genom Bioinform. 2021 Nov 26;3(4):lqab112. doi: 10.1093/nargab/lqab112. eCollection 2021 Dec. PMID: 34859212
- Structural variation in the sequencing era. Nat Rev Genet. 2020 Mar;21(3):171-189. doi: 10.1038/s41576-019-0180-9. Epub 2019 Nov 15. PMID: 31729472. Review.
- Comparative assessment of methods for the fusion transcripts detection from RNA-Seq data. Sci Rep. 2016 Feb 10;6:21597. doi: 10.1038/srep21597. PMID: 26862001