Review
Trends Genet. 2018 Jul;34(7):545-557.
doi: 10.1016/j.tig.2018.04.003. Epub 2018 May 3.

Detecting Somatic Mutations in Normal Cells

Yanmei Dou et al. Trends Genet. 2018 Jul.

Abstract

Somatic mutations have been studied extensively in the context of cancer. Recent studies have demonstrated that high-throughput sequencing data can be used to detect somatic mutations in non-tumor cells. Analysis of such mutations allows us to better understand the mutational processes in normal cells, explore cell lineages in development, and examine potential associations with age-related disease. We describe here approaches for characterizing somatic mutations in normal and non-tumor disease tissues. We discuss several experimental designs and common pitfalls in somatic mutation detection, as well as more recent developments such as phasing and linked-read technology. With the dramatically increasing numbers of samples undergoing genome sequencing, bioinformatic analysis will enable the characterization of somatic mutations and their impact on non-cancer tissues.

Keywords: cell lineage; linked reads; mosaicism; phasing; single-nucleotide variants.

Figures

Figure 1. Detecting somatic mosaicism in the genome through various sequencing strategies
Somatic mutations arise during development and propagate to a sub-population of cells (blue: 50% of cells; red: 25%; yellow: 12.5%). With bulk sequencing, the variant allele fraction (VAF) of these somatic mutations is expected to be approximately half of the sub-population frequency, because each mutation is heterozygous. Lower-frequency somatic mutations require higher sequencing depth to maintain detection sensitivity. With single-cell sequencing, somatic mutations can be detected as heterozygous variants that occur in a subset of cells. The ability to detect variants depends on uniformity of coverage and allelic balance in genome amplification, as well as on picking cells that contain the variants. Clonal expansion followed by bulk sequencing does not suffer from the problems associated with single-cell sequencing, but artifactual mutations that occur early during expansion (green) can be difficult to distinguish from mutations in the original cell.
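The relationship between sub-population frequency, expected allele fraction, and sequencing depth can be illustrated with a minimal sketch. It assumes a heterozygous mutation in a diploid region, read counts that follow a simple binomial model, and no sequencing error; the function names and the minimum-read threshold are hypothetical and are not taken from the review.

```python
from math import comb


def expected_vaf(cell_fraction: float) -> float:
    """Expected bulk allele fraction of a heterozygous mutation
    carried by `cell_fraction` of diploid cells (assumption: no copy-number change)."""
    return cell_fraction / 2.0


def detection_probability(vaf: float, depth: int, min_alt_reads: int) -> float:
    """Probability of seeing at least `min_alt_reads` variant reads at a site
    covered by `depth` reads, under a Binomial(depth, vaf) model.
    Sequencing error and mapping bias are ignored in this sketch."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Sub-population frequencies from the figure: 50%, 25%, 12.5% of cells.
    for fraction in (0.5, 0.25, 0.125):
        vaf = expected_vaf(fraction)
        for depth in (30, 100, 300):
            p = detection_probability(vaf, depth, min_alt_reads=3)
            print(f"cells={fraction:.3f} vaf={vaf:.4f} depth={depth} P(detect)={p:.3f}")
```

Under this model, the detection probability for a fixed read threshold falls as the allele fraction falls and rises with depth, which is the trade-off the caption describes.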
Figure 2. Different strategies for detecting and filtering somatic mutations
Somatic variant callers for tumor tissue often require a matched normal tissue from the same individual, which is not always available. For somatic mosaicism in a non-tumor tissue, a matched ‘normal’ may not exist at all, as mutations of interest may be shared across tissues. When matched normal tissue is unavailable, germline variants as well as some artifacts can be removed by querying public variation databases or by constructing a ‘panel of normals’ from sequencing data of unrelated individuals. Additional filters can be applied to further remove artifacts.
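The database and panel-of-normals filtering described above can be sketched as a simple set exclusion. This is an illustrative sketch only: the `Variant` record, the function name, and the example coordinates are hypothetical, and real pipelines typically also apply allele-frequency thresholds and the additional artifact filters the caption mentions.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record used only for this sketch."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_candidates(
    candidates: Iterable[Variant],
    known_germline: Iterable[Variant],
    panel_of_normals: Iterable[Variant],
) -> List[Variant]:
    """Drop candidates that appear in a public variation database
    (likely germline) or in a panel of normals built from unrelated
    individuals (likely germline or recurrent artifact)."""
    excluded = set(known_germline) | set(panel_of_normals)
    return [v for v in candidates if v not in excluded]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    calls = [
        Variant("chr1", 1000, "A", "G"),
        Variant("chr2", 2000, "C", "T"),
        Variant("chr3", 3000, "G", "A"),
    ]
    database = [Variant("chr1", 1000, "A", "G")]
    panel = [Variant("chr3", 3000, "G", "A")]
    print(filter_candidates(calls, database, panel))
```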
Figure 3. Overview of read-based mosaic phasing scenarios
Read-based phasing can help identify true somatic mosaic mutations by examining the relationship between germline heterozygous variants and putative somatic mutations. However, some patterns of false-positives can confound this method. (A) If a real mosaic SNV (red star) arises near a heterozygous SNP, it will always be found in conjunction with one of the two SNP alleles (green) and will never appear on reads with the other allele (blue). This generates three haplotypes in bulk sequencing (HM for mosaic haplotype, in addition to HA and HB). (B) Similarly, a true mosaic CNV will phase with one allele of a nearby heterozygous SNP, resulting in three haplotypes. (C) Segmental duplications can cause a germline variant (orange) occurring on one duplicated segment to phase to a nearby heterozygous SNP occurring on both segments as if it were somatic, resulting in a false-positive identification.
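The haplotype-counting logic of panel (A) can be sketched as follows. The sketch assumes each read spanning both sites has already been reduced to a pair of labels (SNP allele "A" or "B", candidate site "ref" or "alt"); the function name, the labels, and the `min_reads` threshold are hypothetical. As panel (C) shows, a three-haplotype pattern can also arise from segmental duplications, so a "mosaic_like" result is supporting evidence rather than proof.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_by_phasing(
    read_pairs: Iterable[Tuple[str, str]], min_reads: int = 2
) -> str:
    """Classify a candidate site from reads that also cover a nearby
    heterozygous SNP. A haplotype counts as observed only if at least
    `min_reads` reads support it (a crude guard against sequencing error)."""
    counts = Counter(read_pairs)
    haplotypes = {hap for hap, n in counts.items() if n >= min_reads}
    alt_carriers = {snp for snp, site in haplotypes if site == "alt"}

    if not alt_carriers:
        return "no_variant"
    if len(alt_carriers) == 2:
        # Variant seen with both SNP alleles: inconsistent with a single
        # mutation event on one haplotype.
        return "artifact_like"
    (carrier,) = alt_carriers
    if (carrier, "ref") in haplotypes:
        # The carrier haplotype is seen both with and without the variant,
        # i.e. a mosaic haplotype in addition to the two germline ones.
        return "mosaic_like"
    # Variant is always present on the carrier haplotype.
    return "germline_like"


if __name__ == "__main__":
    reads = [("A", "alt")] * 5 + [("A", "ref")] * 10 + [("B", "ref")] * 15
    print(classify_by_phasing(reads))
```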
Figure 4. False positive mosaic calls can arise from multiple sources
Clockwise from top left: inverted repeats, homopolymers, and some specific nucleotide contexts are common locations of sequencing error. Uneven read coverage can cause false positive calls of mosaic CNVs. Platform-specific errors from targeted sequencing methods may result in underestimated VAF for germline variants; barcode swapping can lead to the spread of false positive signals in multiplexed samples. Germline mutations with low VAF due to read sampling bias can be misclassified as somatic. DNA damage can induce artifactual single-base substitutions during sample handling and library preparation. PCR errors are also common and will propagate in subsequent PCR steps. Misalignment, especially within repetitive regions of the genome, contributes to a large proportion of false positive calls. Cross-individual contamination may lead to false positives.
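One of the pitfalls above, a germline heterozygous variant that shows a low VAF purely through read sampling, can be examined with a one-sided binomial tail. This is a hedged sketch, not a method from the review: it assumes an unbiased germline heterozygous site yields variant reads with probability 0.5, and the function name is hypothetical. Reference bias and amplification bias, which shift that probability in practice, are ignored.

```python
from math import comb


def germline_tail_probability(alt_reads: int, depth: int) -> float:
    """Probability of observing `alt_reads` or fewer variant reads out of
    `depth` if the site were a germline heterozygous variant
    (assumption: Binomial(depth, 0.5), no allelic bias)."""
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    return sum(comb(depth, k) for k in range(alt_reads + 1)) / 2**depth


if __name__ == "__main__":
    # At low depth a germline variant can plausibly look 'mosaic';
    # at higher depth the same allele fraction becomes much less likely.
    for alt, depth in ((3, 10), (9, 30), (30, 100)):
        p = germline_tail_probability(alt, depth)
        print(f"alt={alt} depth={depth} P(<=alt | germline het)={p:.2e}")
```

A small tail probability argues against a sampling explanation, but it does not by itself exclude the other artifact sources listed in the caption.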
