BMC Bioinformatics. 2017 Apr 13;18(1):215.
doi: 10.1186/s12859-017-1626-8.

Segmentum: a tool for copy number analysis of cancer genomes

Ebrahim Afyounian et al. BMC Bioinformatics.

Abstract

Background: Somatic alterations, including loss of heterozygosity, can affect the expression of oncogenes and tumor suppressor genes. Whole genome sequencing enables detailed characterization of such aberrations. However, due to the limitations of current high throughput sequencing technologies, this task remains challenging. Hence, accurate and reliable detection of such events is crucial for the identification of cancer-related alterations.

Results: We introduce a new tool called Segmentum for determining somatic copy numbers using whole genome sequencing data from paired tumor/normal samples. In our approach, read depth and B-allele fraction signals are smoothed, and double sliding windows are used to detect breakpoints, which makes our approach fast and straightforward. Because breakpoint detection is performed simultaneously at different scales, it allows accurate detection, as suggested by the evaluation results from simulated and real data. We applied Segmentum to paired tumor/normal whole genome sequencing samples from 38 patients with low-grade glioma from the TCGA dataset and were able to confirm the recurrence of copy-neutral loss of heterozygosity in chromosome 17p in low-grade astrocytoma characterized by IDH1/2 mutation and lack of 1p/19q co-deletion, which was previously reported using SNP array data.

Conclusions: Segmentum is an accurate, user-friendly tool for somatic copy number analysis of tumor samples. We demonstrate that this tool is suitable for the analysis of large cohorts, such as the TCGA dataset.

Keywords: Cancer; Loss of heterozygosity; Segmentation; Somatic copy number analysis; Whole-genome sequencing.
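
The double sliding window idea described in the Results can be illustrated with a minimal sketch. This is not Segmentum's implementation: the function names, the window sizes, the threshold, and the rule for combining scales (taking the maximum score over all window sizes) are assumptions made for illustration only. At each position, the mean of the signal in a window to the left is compared with the mean in a window to the right, and positions where the difference peaks above a threshold are reported as candidate breakpoints.

```python
from itertools import accumulate


def double_window_scores(signal, window):
    """Score each position by |mean(right window) - mean(left window)|.

    The left window covers signal[i - window:i] and the right window covers
    signal[i:i + window]. Positions too close to either end keep a score of 0.
    """
    csum = [0.0] + list(accumulate(float(v) for v in signal))
    n = len(signal)
    scores = [0.0] * n
    for i in range(window, n - window + 1):
        left = (csum[i] - csum[i - window]) / window
        right = (csum[i + window] - csum[i]) / window
        scores[i] = abs(right - left)
    return scores


def detect_breakpoints(signal, windows=(5, 10, 20), threshold=0.5):
    """Report local maxima of the multi-scale score that reach the threshold.

    Assumption: scales are combined by taking the maximum score over all
    window sizes; the window sizes and threshold are hypothetical values.
    """
    n = len(signal)
    per_scale = [double_window_scores(signal, w) for w in windows]
    combined = [max(column) for column in zip(*per_scale)]
    radius = min(windows)
    breakpoints = []
    for i, score in enumerate(combined):
        if score < threshold:
            continue
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Keep only positions that are the maximum of their neighbourhood.
        if score == max(combined[lo:hi]):
            breakpoints.append(i)
    return breakpoints


# Toy, noise-free signal: the level changes at positions 30 and 60.
toy_signal = [0.0] * 30 + [1.0] * 30 + [0.0] * 40
found = detect_breakpoints(toy_signal)
print(found)
```

On real read depth or B-allele fraction data the signal is noisy, so the threshold and window sizes would need tuning, and plateaus in the score could yield adjacent duplicate calls that this sketch does not merge.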

Figures

Fig. 1
Segmentum pipeline. Normal and tumor RDs are used to calculate RD log-ratios. RD log-ratios are then corrected for biases. BAF data are simultaneously mirrored and smoothed. Using RD log-ratios and BAF, the genome is segmented with a double sliding window method. Segmentation results are used to identify cnLOH regions in the genome (see the following sections for more details on each step)
Fig. 2
Segmentation accuracy of Segmentum for simulated data with different degrees of normal contamination. Estimated precision, recall, and F-measure values for simulated data at different normal contamination levels (Additional file 1, Derivation of the precision, recall, and F-measure of the simulated data)
Fig. 3
Comparison of the SCNA results with different tools and the SNP array (ground truth). Venn diagram values (averaged over ten TCGA LGG samples) represent the percentage of overlap among the SCNA calls
Fig. 4
Pairwise JSI scores averaged over ten TCGA LGG samples. JSI scores range between 0 and 1, where 0 means no similarity and 1 represents identical results between two tools
Fig. 5
Pairwise JSI scores (Segmentum vs. SNP array as ground truth) for different subsamples. JSI scores range between 0 and 1, where 0 means no similarity and 1 represents identical results between two tools
Fig. 6
SCNA landscape in grade II and III gliomas. WHO-grade, histological class, and molecular subtype classification are shown by color as indicated. The thirty-eight samples are divided into 4 distinct subtypes based on the occurrence of a mutation in IDH1/2, co-deletion of chromosomes 1p and 19q and the presence of 17p cnLOH. Deletions and amplifications are visualized by boxes with different shades of blue and red, respectively. White regions are either normal or cnLOH regions. The bar charts below each box represent the mirrored and smoothed BAF values. Large mirrored and smoothed BAF values (close to 0.5) point to heterozygous SNP allelic imbalance. In the second subtype (from the top), at chromosome 17p, recurring cnLOH is apparent where the bar charts point to large mirrored and smoothed BAF values, though no deletion or amplification is detected at that region (Additional file 1: Table S5 for TCGA LGG sample barcode names)
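
The signals referred to in the captions above can be sketched as follows. This is an illustrative sketch only, under stated assumptions: the RD log-ratio is taken as base 2 (the base is not given here), mirroring is taken as the absolute distance of a BAF value from 0.5 (consistent with large mirrored values being close to 0.5), and the function names and the thresholds `lr_tol` and `baf_min` are hypothetical, not Segmentum's parameters. A segment with a near-zero log-ratio but a large mean mirrored BAF is labelled as a candidate cnLOH region.

```python
import math
from statistics import mean


def rd_log_ratio(tumor_rd, normal_rd):
    """Log-ratio of tumor to normal read depth (base 2 is an assumption)."""
    return math.log2(tumor_rd / normal_rd)


def mirrored_baf(baf_values):
    """Mirror BAF values as their distance from 0.5 (assumed definition).

    Balanced heterozygous SNPs give values near 0; allelic imbalance gives
    values approaching 0.5.
    """
    return [abs(b - 0.5) for b in baf_values]


def classify_segment(log_ratio, baf_values, lr_tol=0.1, baf_min=0.2):
    """Label one segment using hypothetical thresholds lr_tol and baf_min."""
    if log_ratio > lr_tol:
        return "amplification"
    if log_ratio < -lr_tol:
        return "deletion"
    # Copy number looks unchanged: check for allelic imbalance.
    if mean(mirrored_baf(baf_values)) >= baf_min:
        return "cnLOH"
    return "normal"


# Toy segments: heterozygous SNP BAFs in the tumor sample.
print(classify_segment(0.0, [0.05, 0.95, 0.10, 0.90]))   # imbalance, no copy change
print(classify_segment(0.02, [0.48, 0.52, 0.50]))        # balanced, no copy change
print(classify_segment(rd_log_ratio(20, 40), [0.1, 0.9]))  # halved read depth
```

In practice, normal contamination pulls mirrored BAF values toward 0 and log-ratios toward 0, so fixed thresholds like these would have to be adapted to tumor purity.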
