BMC Genomics. 2022 Aug 17;23(1):599.
doi: 10.1186/s12864-022-08681-8.

Low-cost and clinically applicable copy number profiling using repeat DNA

Sam Abujudeh et al. BMC Genomics. 2022.

Abstract

Background: Somatic copy number alterations (SCNAs) are an important class of genomic alteration in cancer. They are frequently observed in cancer samples, with studies showing that, on average, SCNAs affect 34% of a cancer cell's genome. Furthermore, SCNAs have been shown to be major drivers of tumour development and have been associated with response to therapy and prognosis. Large-scale cancer genome studies suggest that tumours are driven by either SCNAs or single-nucleotide variants (SNVs). Despite the frequency of SCNAs and their clinical relevance, the use of genomic assays in the clinic is biased towards targeted gene panels, which identify SNVs but provide limited scope to detect SCNAs throughout the genome. There is a need for a comparably low-cost and simple method for high-resolution SCNA profiling.

Results: We present conliga, a fully probabilistic method that infers SCNA profiles from a low-cost, simple, and clinically relevant assay (FAST-SeqS). When applied to 11 high-purity oesophageal adenocarcinoma samples, we obtain good agreement (Spearman's rank correlation coefficient, rs = 0.94) between conliga's inferred SCNA profiles using FAST-SeqS data (approximately £14 per sample) and those inferred by ASCAT using high-coverage WGS (the gold standard). We find that conliga outperforms CNVkit (rs = 0.89), also applied to FAST-SeqS data, and is comparable to QDNAseq (rs = 0.96) applied to low-coverage WGS, which is approximately four-fold more expensive, more laborious and less clinically relevant. By performing an in silico dilution series experiment, we find that conliga is particularly suited to detecting SCNAs in low tumour purity samples. At two million reads per sample, conliga is able to detect SCNAs in all nine samples at 3% tumour purity and as low as 0.5% purity in one sample. Crucially, we show that conliga's hidden state information can be used to decide whether a sample is abnormal or normal, whereas CNVkit and QDNAseq cannot provide this critical information.
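To make the idea of an in silico dilution series concrete, the following is a minimal Python sketch of one way to mix tumour and normal reads at a chosen proportion. It is not the authors' procedure: the function name `dilute_counts`, the per-locus count dictionaries, and the choice to resample reads with replacement are all assumptions made for illustration, and the tumour sample is treated as if it were 100% pure.

```python
import random
from collections import Counter


def dilute_counts(tumour_counts, normal_counts, tumour_fraction, total_reads, seed=0):
    """Simulate per-locus read counts for a tumour/normal mixture.

    Hypothetical helper: draws reads (with replacement) from the tumour and
    normal per-locus counts so that `tumour_fraction` of `total_reads` come
    from the tumour sample and the remainder from the normal sample.
    """
    if not 0.0 <= tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_tumour = round(total_reads * tumour_fraction)
    n_normal = total_reads - n_tumour

    mixed = Counter()
    for counts, n_reads in ((tumour_counts, n_tumour), (normal_counts, n_normal)):
        if n_reads == 0:
            continue
        loci = list(counts)
        weights = [counts[locus] for locus in loci]
        # Each simulated read lands on a locus with probability
        # proportional to that locus's observed read count.
        mixed.update(rng.choices(loci, weights=weights, k=n_reads))
    return dict(mixed)


# Toy example with made-up counts for three loci.
tumour = {"locus_1": 300, "locus_2": 100, "locus_3": 100}
normal = {"locus_1": 100, "locus_2": 100, "locus_3": 100}
diluted = dilute_counts(tumour, normal, tumour_fraction=0.03, total_reads=10_000)
print(sum(diluted.values()))
```

Repeating the call over a range of `tumour_fraction` values, including 0 as a negative control, gives a series of simulated samples on which a caller's sensitivity can be compared.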

Conclusions: We show that conliga provides high-resolution SCNA profiles using a convenient, low-cost assay. We believe conliga makes FAST-SeqS a more clinically valuable assay as well as a useful research tool, enabling inexpensive and fast copy number profiling of pre-malignant and cancer samples.

Keywords: Barrett’s oesophagus; Bayesian nonparametrics; Cancer; Copy number profiling; FAST-SeqS; MCMC; Oesophageal adenocarcinoma; Probabilistic model; Somatic copy number alterations; Sticky HDP-HMM; Tumour purity.

Conflict of interest statement

RCF is named on patents related to Cytosponge and related assays that have been licensed to Covidien (now Medtronic), and she is a shareholder in Cyted Ltd.

Figures

Fig. 1
Comparison of the conliga method with ASCAT, QDNAseq and CNVkit. a Total copy number (TCN) profile determined by ASCAT using high-coverage (HC) WGS data for sample OAC2, showing all copy number segments. b Relative copy number (RCN) profile determined by QDNAseq using low-coverage (LC) WGS data for sample OAC2, showing all 15 Kbp bins. c TCN profile determined by ASCAT from HC WGS data, d RCN profile determined by CNVkit, e RCN profile determined by conliga, all (c-e) showing copy number calls at the intersection of ASCAT’s called regions and FAST-SeqS loci for sample OAC2. f Comparison of log2 RCN calls from 11 samples between QDNAseq and ASCAT (top), CNVkit and ASCAT (middle) and conliga and ASCAT (bottom). rs represents the Spearman’s rank correlation coefficient. All RCN calls at the intersection of ASCAT’s called regions, QDNAseq 15 Kbp bins and FAST-SeqS loci in all 11 OAC samples are shown as points. g Distribution of differences between ASCAT RCN calls and QDNAseq RCN estimates for 11 OAC samples (top), ASCAT RCN calls and CNVkit RCN estimates for 11 OAC samples (middle) and ASCAT RCN calls and conliga RCN estimates for 11 OAC samples (bottom). h Comparison of performance at gene-level resolution between ASCAT and QDNAseq for 36 selected genes (top), ASCAT and CNVkit (middle) and ASCAT and conliga (bottom). The values represent the weighted mean of RCN calls at each gene for each of the 11 OAC samples (Methods)
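For readers unfamiliar with the rs statistic reported in panel f, the sketch below computes Spearman's rank correlation coefficient in plain Python as the Pearson correlation of average ranks. The function names and the toy log2 RCN values are illustrative assumptions; the paper does not state which implementation was used.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up log2 RCN calls from two methods at the same five loci.
reference_calls = [-0.9, -0.1, 0.0, 0.6, 1.4]
test_calls = [-0.7, 0.0, -0.05, 0.5, 1.1]
print(spearman(reference_calls, test_calls))
```

Because the statistic depends only on ranks, it rewards methods that order loci by copy number in the same way as the reference, even if their RCN estimates are on a compressed or shifted scale.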
Fig. 2
Comparing the performance of SCNA detection in low tumour purity samples and determining the limit of detection. a Left column: RCN calls by conliga, showing a selection of chromosomes, at different dilutions of sample OAC3, compared to the ASCAT RCN profile of the undiluted sample (top left). Discrete copy number states are coloured with a gradient (light green to purple), highlighting regions with differing SCNAs. Middle column: RCN calls by CNVkit at different dilutions of sample OAC3, compared to the ASCAT RCN profile (top middle). Right column: RCN calls by QDNAseq at different dilutions of sample OAC3, compared to the ASCAT RCN profile (top right). Calls by CNVkit and QDNAseq are not coloured because these methods do not provide hidden state information. Purity levels are indicated as a percentage on the right-hand side. 0% purity profiles, for which all regions of the genome should have equal RCN, are highlighted in a red box. b The number of copy number states detected by conliga (left) in each of eight OAC samples at differing purity levels. The limit of detection is the lowest purity level at which more than one copy number state is detected. The red dashed line marks one inferred hidden state, meaning that no copy number changes are inferred. The number of unique RCN calls observed within each sample for CNVkit (middle) and QDNAseq (right). Note the differing y-axis ranges and that the number of unique RCN calls is always greater than one for CNVkit and QDNAseq
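The limit-of-detection rule described in panel b can be written as a short Python sketch. The function name, the input mapping from purity to the number of inferred copy number states, and the decision to skip the 0% purity control are assumptions for illustration; the numbers in the example are invented, not taken from the paper.

```python
def limit_of_detection(states_by_purity):
    """Return the lowest non-zero purity with more than one inferred state.

    `states_by_purity` maps tumour purity (as a fraction) to the number of
    copy number states inferred at that purity. Returns None if a single
    state was inferred at every non-zero purity, i.e. no SCNA was detected.
    The 0% purity control is excluded here by assumption.
    """
    detected = [
        purity
        for purity, n_states in states_by_purity.items()
        if purity > 0 and n_states > 1
    ]
    return min(detected) if detected else None


# Invented example: one state up to 0.5% purity, several states from 1%.
example = {0.0: 1, 0.005: 1, 0.01: 2, 0.03: 3, 0.1: 4}
print(limit_of_detection(example))  # prints 0.01
```

This rule only works for a caller that can report a single state for a normal sample, which is why the caption notes that the number of unique RCN calls from CNVkit and QDNAseq never drops to one.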
