[Preprint]. 2024 Aug 10:2024.08.09.607342.
doi: 10.1101/2024.08.09.607342.

Somatic mutation phasing and haplotype extension using linked-reads in multiple myeloma

Steven M Foltz et al. bioRxiv.

Abstract

Somatic mutation phasing informs our understanding of cancer-related events, like driver mutations. We generated linked-read whole genome sequencing data for 23 samples across disease stages from 14 multiple myeloma (MM) patients and systematically assigned somatic mutations to haplotypes using linked-reads. Here, we report the reconstructed cancer haplotypes and phase blocks from several MM samples and show how phase block length can be extended by integrating samples from the same individual. We also uncover phasing information in genes frequently mutated in MM, including DIS3, HIST1H1E, KRAS, NRAS, and TP53, phasing 79.4% of 20,705 high-confidence somatic mutations. In some cases, this enabled us to interpret clonal evolution models at higher resolution using pairs of phased somatic mutations. For example, our analysis of one patient suggested that two NRAS hotspot mutations occurred on the same haplotype but were independent events in different subclones. Given sufficient tumor purity and data quality, our framework illustrates how haplotype-aware analysis of somatic mutations in cancer can be beneficial for some cancer cases.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Linked-read data generation and analysis pipeline.
a. The 10X Genomics Chromium platform tags large DNA molecules with barcodes such that reads originating from the same molecule have the same barcode. The Long Ranger pipeline aligns reads and phases variants. b. SomaticHaplotype builds upon Long Ranger output with several modules, including phaseblock, summarize, somatic, extend, and ancestry. c. Our cohort comprises 14 multiple myeloma patients across several disease stages for a total of 23 tumor samples. d. Quality control measures for our tumor and normal samples plus 1000 Genomes samples NA12878 (+) and NA19240 (x). Violin plots defined as: center line, median; violin limits, minimum and maximum values; points, every observation. Molecule Length (mean, Kb): length-weighted mean input DNA length in kilobases. Linked-Reads per Molecule (N50): N50 of read-pairs per input DNA molecule. Phase Block Length (N50, Mb): N50 length of phase blocks in megabases.
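
The summary statistics named in panel d can be made concrete with a short sketch. It assumes the conventional definition of N50 (the length at which items at least that long account for half of the total) and reads "length-weighted mean" as the sum of squared lengths divided by the sum of lengths. The function names are illustrative and are not taken from Long Ranger or SomaticHaplotype.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths (conventional definition, assumed here)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is the N50.
        if running >= half_total:
            return length


def length_weighted_mean(lengths):
    """Mean in which each molecule is weighted by its own length (assumed reading)."""
    total = sum(lengths)
    if total == 0:
        raise ValueError("lengths must sum to a positive value")
    return sum(length * length for length in lengths) / total


# Hypothetical phase block lengths in megabases, not data from the study.
print(n50([2, 3, 4, 5, 6]))
print(length_weighted_mean([1, 3]))
```

Because both statistics weight long items more heavily, a few very long phase blocks or molecules raise them much more than they would raise a plain mean.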
Figure 2. Phasing somatic mutations to haplotypes.
a. Overview of methods used to phase somatic mutations. b. Number of somatic mutations phased using two phasing methods (H1 = phased to haplotype 1; H2 = phased to haplotype 2; NC = not enough coverage for phasing; NP = not phased). c. Phasing somatic mutations commonly observed in multiple myeloma. d. Distribution of somatic mutations per phase block and the proportion of mutations phased.
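
As an illustration of how the four labels in panel b could arise, the sketch below assigns a somatic mutation by a simple vote over the haplotypes of the barcodes that carry its alternate allele. This is not the preprint's phasing method. The function name, the input format, and both thresholds are assumptions chosen for the example.

```python
from collections import Counter


def phase_mutation(barcode_haplotypes, min_barcodes=2, min_fraction=0.9):
    """Label a somatic mutation as H1, H2, NC, or NP.

    barcode_haplotypes: one entry per barcode supporting the ALT allele,
    each "H1", "H2", or None when the barcode itself is unphased.
    min_barcodes and min_fraction are hypothetical thresholds.
    """
    counts = Counter(h for h in barcode_haplotypes if h in ("H1", "H2"))
    total = counts["H1"] + counts["H2"]
    if total < min_barcodes:
        return "NC"  # not enough phased barcode coverage
    for haplotype in ("H1", "H2"):
        if counts[haplotype] / total >= min_fraction:
            return haplotype
    return "NP"  # covered, but barcodes disagree too much to phase


print(phase_mutation(["H1", "H1", "H1"]))
print(phase_mutation(["H1", "H2", "H1", "H2"]))
```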
Figure 3. Tumor evolution models derived from mutation pairs.
a. Number of overlapping barcodes by distance between somatic mutations. b. Proportion of somatic mutation pairs in close proximity sharing barcodes and mutations. c. Patterns of mutation pairs observed on barcodes (REF = reference allele; ALT = alternate allele). A dark green square indicates that a barcode with that pattern of two alleles was observed. Combinations of patterns can be interpreted as evidence of sequential (e.g. 1101, 1011) or distinct (e.g. 1110) mutations. d. NRAS mutation pair observed in 27522 (P) and evolution model (NC = no coverage). e. Interpretation of evolution model observed from NRAS mutation pair in 27522 (P). f. ACTG1 mutation pair observed in 27522 (Rel) and evolution model. g. Interpretation of evolution model observed from ACTG1 mutation pair in 27522 (Rel).
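
The pattern codes in panel c can be read with a small sketch. It assumes each four-character code records whether barcodes were observed carrying REF/REF, REF/ALT, ALT/REF, and ALT/ALT at the two sites, in that order. The caption does not state the order, so this is an assumption. Under this reading, swapping the two middle positions does not change the result. The function name and the "ambiguous" label are illustrative.

```python
def interpret_pattern(pattern):
    """Interpret a 4-character presence/absence code for a mutation pair.

    Assumed position order: REF/REF, REF/ALT, ALT/REF, ALT/ALT.
    """
    if len(pattern) != 4 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be four characters of 0 or 1")
    _ref_ref, ref_alt, alt_ref, alt_alt = (char == "1" for char in pattern)
    # Both mutations seen together plus exactly one seen alone:
    # the second mutation arose in a cell already carrying the first.
    if alt_alt and ref_alt != alt_ref:
        return "sequential"
    # Each mutation seen alone but never together: separate subclones.
    if ref_alt and alt_ref and not alt_alt:
        return "distinct"
    return "ambiguous"


for code in ("1101", "1011", "1110"):
    print(code, interpret_pattern(code))
```

The three example codes from the caption come out as sequential, sequential, and distinct, matching the interpretation given there.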
Figure 4. Extension of phase blocks using additional sample information.
a. Model for phase block extension using overlap between target and reference phase blocks. b. Data-driven example of phase block overlap between samples. c. Number of phased variants needed for switch/no switch recommendation. d. Length of phase block overlap needed for switch/no switch recommendation. e. Phase block groups extended by overlap with another sample. f. Distribution of phase block lengths before and after extension. Violin plots defined as: center line, median; violin limits, minimum and maximum values; individual points not shown. g. Use of identity-by-descent segments as overlap between phase blocks.
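
The switch/no switch recommendation in panels a, c, and d can be sketched as a comparison of haplotype labels at variants phased in both the target and the reference phase block. This is a minimal sketch under stated assumptions, not the SomaticHaplotype extend module: the function name, the dictionary input format, and both thresholds are hypothetical.

```python
def recommend_switch(target, reference, min_shared=2, min_agreement=0.9):
    """Recommend whether the target block's haplotype labels should be swapped.

    target, reference: dicts mapping a variant identifier to "H1" or "H2".
    min_shared and min_agreement are hypothetical thresholds.
    """
    shared = target.keys() & reference.keys()
    if len(shared) < min_shared:
        return "undetermined"  # too few overlapping phased variants
    same = sum(target[variant] == reference[variant] for variant in shared)
    agreement = same / len(shared)
    if agreement >= min_agreement:
        return "no switch"  # labels already match the reference
    if 1 - agreement >= min_agreement:
        return "switch"  # labels are consistently inverted
    return "undetermined"


# Hypothetical variant identifiers, not data from the study.
target_block = {"v1": "H1", "v2": "H2", "v3": "H1"}
reference_block = {"v1": "H2", "v2": "H1", "v3": "H2"}
print(recommend_switch(target_block, reference_block))
```

Once each target block has a recommendation against a shared reference block, neighboring target blocks can be placed in a consistent orientation and joined, which is the sense in which overlap with another sample extends phase block length.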
