Review
Environ Mol Mutagen. 2020 Jan;61(1):135-151.
doi: 10.1002/em.22342. Epub 2019 Nov 11.

Next-Generation Genotoxicology: Using Modern Sequencing Technologies to Assess Somatic Mutagenesis and Cancer Risk

Jesse J Salk et al. Environ Mol Mutagen. 2020 Jan.

Abstract

Mutations have a profound effect on human health, particularly through an increased risk of carcinogenesis and genetic disease. The strong correlation between mutagenesis and carcinogenesis has been a driving force behind genotoxicity research for more than 50 years. The stochastic and infrequent nature of mutagenesis makes it challenging to observe and to study. Indeed, decades have been spent developing increasingly sophisticated assays and methods to study these low-frequency genetic errors, in hopes of better predicting which chemicals may be carcinogens, understanding their mode of action, and informing guidelines to prevent undue human exposure. While effective, widely used genetic selection-based technologies have a number of limitations that have hampered major advancements in the field of genotoxicity. Emerging new tools, in the form of enhanced next-generation sequencing platforms and methods, are changing this paradigm. In this review, we discuss rapidly evolving sequencing tools and technologies, such as error-corrected sequencing and single cell analysis, which we anticipate will fundamentally reshape the field. In addition, we consider a variety of emerging applications for these new technologies, including the detection of DNA adducts, inference of mutational processes based on genomic site and local sequence contexts, and evaluation of genome engineering fidelity, as well as other cutting-edge challenges for the next 50 years of environmental and molecular mutagenesis research. Environ. Mol. Mutagen. 61:135-151, 2020. © 2019 The Authors. Environmental and Molecular Mutagenesis published by Wiley Periodicals, Inc. on behalf of Environmental Mutagen Society.

Keywords: in vivo mutation; cancer risk assessment; chemical carcinogenesis; consensus sequencing; error-corrected NGS; single molecule sequencing; single-cell sequencing.

Conflict of interest statement

J.J.S. is an employee and equity holder at TwinStrand Biosciences. S.R.K. is a paid consultant and equity holder at TwinStrand Biosciences and a paid consultant for Wilcox & Savage, PC.

Figures

Figure 1
The genesis of cancer. Cancer exists on a continuum. Mutations arise as a result of repair and replication errors due to endogenous processes and environmental factors. These mutations are the substrate for neoplastic clonal evolution: those that confer a proliferative or survival advantage upon the host cell will be naturally selected. Carcinogens promote tumorigenesis by increasing the rate of mutation or by enhancing net‐positive selection. Given the often impractically long lag‐time between a carcinogenic insult and overt tumor formation, technologies that are able to sensitively detect DNA damage, mutation induction, and clonal outgrowths are essential tools in a genetic toxicologist's armamentarium.
Figure 2
Analog vs. digital DNA sequencing. A common need in genetic toxicology is to identify mutations in cell populations. The appropriateness of the sequencing technology depends on mutational clonality. (A) Clonal mutations are those present in all or most cells in a tissue (gray), whereas subclonal mutations (colors) are present in only a subset. (B) When DNA is extracted from a tissue, a mutation's clonality is reflected in the isolated molecules that are then (C) prepared for sequencing. (D) With traditional Sanger sequencing, all molecules from the same genomic region are genotyped together en masse in a capillary system, which produces an analog output (electropherogram tracing) that is the average of many different DNA molecules. (E) Generally only substantially clonal mutations can be reliably detected. (F) In contrast, next‐generation sequencing operates by massively parallel sequencing of millions of individual molecules digitally. On the widely used Illumina sequencing‐by‐synthesis platform, this is accomplished by flowing fluorescently labeled nucleotides across a surface coated with small biochemically generated colonies of individual molecules (clusters), and recording the sequence of colors of each cluster through multiple cycles of addition. (G) The resulting output is not a single sequence, but millions of individual ones that reflect both clonal and subclonal mutations down to approximately 1% abundance.
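The "digital" readout described in panels F and G can be illustrated with a minimal sketch: each read contributes one discrete observation at a site, so a subclonal variant shows up as a fraction of reads. The function names, the simulated read counts, and the fixed 1% cutoff below are illustrative assumptions drawn from the caption's "approximately 1% abundance"; they are not a real variant caller.

```python
from collections import Counter

def variant_fractions(bases_at_site, reference_base):
    """Fraction of reads supporting each non-reference base at one position."""
    counts = Counter(bases_at_site)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: n / depth for base, n in counts.items() if base != reference_base}

def detectable(fractions, min_fraction=0.01):
    """Keep variants at or above an assumed ~1% detection floor for standard NGS."""
    return {base: f for base, f in fractions.items() if f >= min_fraction}

# Simulated pileup (hypothetical data): 10,000 reads covering one site.
reads_at_site = ["A"] * 9700 + ["G"] * 295 + ["T"] * 5
fractions = variant_fractions(reads_at_site, "A")
calls = detectable(fractions)  # the G subclone passes; the rare T does not
```

In this toy pileup the rare T observations fall below the floor and cannot be told apart from sequencing error, which is the limitation that the consensus-based methods in Figure 3 are meant to address.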
Figure 3
Techniques for error‐corrected DNA sequencing (ecNGS). The highest accuracy NGS methods rely on sequencing‐by‐consensus, whereby data from multiple sequence reads derived from an original molecule are combined to reduce the impact of sequencing or sample preparation errors in each read. (A) The SafeSeqS approach uses random molecular barcodes applied to PCR primers to uniquely tag PCR amplicons, which are then further amplified and sequenced. Variation within the sequence of reads with identical tags can be discounted as technical artifacts (X's). Some errors that occur during the first extension cycle may escape correction (triangles). (B) Duplex Sequencing relies on ligation to apply molecular barcodes to both strands of original double‐stranded molecules. These are used alone or in combination with fragmentation points to uniquely label both strands such that derivative sequence reads from each strand can be directly related back to their founder strand and compared to those from its complement. The method is significantly more accurate than single‐stranded consensus‐making methods but is more sequencing‐intensive. (C) 2D sequencing on nanopore platforms uses physical linkage of the two strands of an original duplex, which are then sequenced together without the need for amplification. The method is fast and simple, but nanopore platforms have lower accuracy and throughput than more widely used sequencing‐by‐synthesis platforms. (D) Circular Consensus Sequencing on the PacBio single‐molecule platform similarly links the two strands of an original double‐stranded molecule with hairpins to allow multiple sequencing passes across both original strands. As with 2D, lower raw platform accuracy and throughput are drawbacks, but very long reads can be obtained.
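The consensus logic in panels A and B can be sketched in a few lines. This is a toy illustration under simplifying assumptions (equal-length, pre-aligned reads; bottom-strand reads already converted to top-strand orientation); the function names and the thresholds `min_agreement` and `min_family_size` are hypothetical and are not taken from the published SafeSeqS or Duplex Sequencing pipelines.

```python
from collections import Counter, defaultdict

def consensus(reads, min_agreement=0.7):
    """Per-position majority vote; emit 'N' where agreement is too low."""
    out = []
    for column in zip(*reads):
        base, n = Counter(column).most_common(1)[0]
        out.append(base if n / len(column) >= min_agreement else "N")
    return "".join(out)

def single_strand_consensus(tagged_reads, min_family_size=3):
    """Group reads by molecular barcode, then collapse each family (panel A idea)."""
    families = defaultdict(list)
    for tag, seq in tagged_reads:
        families[tag].append(seq)
    return {tag: consensus(seqs)
            for tag, seqs in families.items()
            if len(seqs) >= min_family_size}

def duplex_consensus(top, bottom):
    """Keep a base only where both strand consensuses agree (panel B idea)."""
    return "".join(a if a == b and a != "N" else "N" for a, b in zip(top, bottom))

# Hypothetical reads: one family of four with a single read-level error,
# and one family too small to form a consensus.
tagged = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),
          ("GGC", "ACGT")]
sscs = single_strand_consensus(tagged)
# An early-cycle error copied into every read of one strand survives that
# strand's consensus but is masked when compared against its complement.
duplex = duplex_consensus("ACGT", "ACAT")
```

The second call shows why comparing complementary strands adds accuracy: an error that is uniform within one strand's family is invisible to single-strand consensus, but it disagrees with the partner strand and is masked.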
Figure 4
Approaches for assessing mutational signatures. Mutational spectra, particularly polynucleotide mutational signatures, provide important mechanistic insights into mutational processes. Most of what we know about these patterns has come from natural or artificial means of single cell cloning. (A) Exome or whole‐genome sequencing of tumor populations reflects the somatic processes operative in the founding cell of the most recent clonal sweep. (B) Single cells can be cloned from cultured populations exposed to known or suspected mutagens to assess their mutational signatures. (C) The clonal variants present in individuals that were not present in their parents reflect the state of mutational processes during gametogenesis or early embryogenesis. (D) Sequencing of cloned cells or molecules from certain selection‐based mutagenicity assays can be used similarly, although the patterns may be distorted by the selection system itself. (E) With ecNGS, it is now possible to obtain mutational spectra by directly sequencing DNA from any tissue of any organism.
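As a rough illustration of how a mutational spectrum is tabulated from a list of observed substitutions, the sketch below bins each mutation by its flanking bases. It assumes the widely used trinucleotide convention, in which the mutated base is reported as a pyrimidine (C or T) and purine-reference mutations are reverse-complemented; the caption itself only says "polynucleotide", so the context width, the function names, and the example mutations are assumptions made for illustration.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mutation_channel(context, alt):
    """Label a substitution by trinucleotide context, e.g. 'A[C>T]G'.

    context: three reference bases centred on the mutated base.
    alt: the mutant base. Purine references are flipped to the other strand.
    """
    if context[1] in "AG":
        context, alt = revcomp(context), revcomp(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"

def spectrum(mutations):
    """Count mutations per context channel."""
    return Counter(mutation_channel(context, alt) for context, alt in mutations)

# Hypothetical calls: the first two are the same event seen on opposite strands.
observed = [("ACG", "T"), ("CGT", "A"), ("TCA", "A")]
counts = spectrum(observed)
```

Normalizing these counts by how often each trinucleotide occurs in the sequenced region would be a natural next step before comparing spectra across samples, but that is omitted here.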
Figure 5
Canary‐in‐a‐coal‐mine: a century later. A hundred years ago, at the suggestion of John Scott Haldane, caged canaries were routinely brought into British coal mines as an early warning sign of human‐relevant toxic gases. Although their routine use ceased in the 1980s, the broader concept of using sentinel species to infer the presence of environmental hazards remains highly germane in modern genetic toxicology. Had it been possible to collect and analyze a DNA sample from one of Haldane's birds using modern ecNGS techniques, it is quite likely that the mutagenic signature of benzo[a]pyrene could have been identified and used to inform efforts to mitigate the environmental cancer risk. Other naturally present sentinel organisms, including humans themselves, can be similarly used.
