Nat Rev Genet. 2017 Dec;18(12):749-760.
doi: 10.1038/nrg.2017.59. Epub 2017 Aug 30.

Beyond editing to writing large genomes

Raj Chari et al. Nat Rev Genet. 2017 Dec.

Abstract

Recent exponential advances in genome sequencing and engineering technologies have enabled an unprecedented level of interrogation into the impact of DNA variation (genotype) on cellular function (phenotype). Furthermore, these advances have also prompted realistic discussion of writing and radically re-writing complex genomes. In this Perspective, we detail the motivation for large-scale engineering, discuss the progress made from such projects in bacteria and yeast and describe how various genome-engineering technologies will contribute to this effort. Finally, we describe the features of an ideal platform and provide a roadmap to facilitate the efficient writing of large genomes.

Conflict of interest statement

Competing interests statement

G.M.C. is a co-founder of Editas Medicine and eGenesis Bio and serves advisory roles in several companies involved in genome editing and engineering. A detailed listing of G.M.C.’s Tech Transfer, Advisory Roles, and Funding Sources can be obtained from http://arep.med.harvard.edu/gmc/tech.html. R.C. declares no competing interests.

Figures

Figure 1. Timeline of gene targeting
The timeline depicts progress in gene targeting rates through recombination (red) and the number of edited sites in a single genome (blue). Recombination rates with dashed lines denote ranges observed in the published literature. In the mid to late 1980s, it was observed that in the presence of exogenous DNA, homologous recombination (HR) could occur at a very low frequency (0.01%). Subsequently, it was shown that inducing a double-strand break (DSB) could dramatically improve the rate of HR. Different modalities have therefore been discovered and employed to create DSBs at higher frequencies, from meganucleases (such as I-SceI) to zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) to the current CRISPR-associated nucleases. In addition, approaches that do not use DSB-stimulated HR, such as Cre recombinase or phiC31 integrase, have also shown tremendous promise for DNA replacement, but are restricted by the limited range of sequences that can be targeted. As a corollary, in combination with the progress in DNA synthesis technologies, our improved ability to edit DNA has permitted increases in the total number of modifications made in a single genome. In the past, one or two edits were typically made in a single genome. Ambitious projects such as RC57 and Sc2.0 have aimed to make up to four or even five orders of magnitude more edits in a single genome.
Figure 2. Potential applications of large-scale genome engineering of different organisms
For applications such as bioenergy, chemicals, proteins, and vaccines, the goal is to increase production of these entities by engineering metabolic pathways in particular organisms. Similarly, engineering of the metabolic pathways in these organisms could also be used for environmental remediation, from maintaining clean water supplies to removing pollutants and CO2 from the air. For applications such as food supply, in addition to making plants and livestock larger, large-scale engineering of regulatory variants would also aim to make these sources grow more robustly (that is, tolerate different environmental conditions) and be less susceptible to disease, minimizing loss. Finally, for mammalian genome engineering, the main applications would be deciphering complex genotype–phenotype relationships, modelling disease, and producing cells of interest. Subsequently, these engineered cells could be built into tissues and organs ex vivo using approaches such as 3D printing or, if the engineering was performed in pigs, organs could be harvested for xenotransplantation.
Figure 3. DNA-editing nanomachines
a | DNA-based recognition. Multiplex automated genome engineering (MAGE) utilizes machinery from the phage λ, namely the single-strand binding protein beta and an exonuclease (Exo). Designer oligonucleotides, along with these two proteins, can enable the incorporation of desired changes at multiple genomic locations. Conjugative assembly genome engineering (CAGE) enables the hierarchical assembly of genomes edited in parallel. For every pair of edited genomes (strains), one strain is designated the donor and the other the recipient, so that DNA transfer is engineered to proceed in a single direction. This cycle can be repeated until the result is a single strain with all desired changes. The Argonaute (Ago) system utilizes a 24-nucleotide guide DNA sequence along with the Ago protein to direct a double-strand break (DSB) in a site-specific manner. b | Protein-based recognition. Zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and meganucleases utilize staggered-end-forming DSBs to facilitate DNA editing. For each site that is being targeted, a new custom protein is required. Tyrosine (Tyr) and serine (Ser) site-specific recombinases (SSRs) mediate editing with (Ser) or without (Tyr) double-strand breaks. A pair of short inverted repeats (IRs), flanking both the target site and the donor DNA molecule, is required for DNA to be exchanged. Currently, the repertoire of different recombinase recognition sites is quite limited. c | RNA-based recognition. Group II introns encode both a self-catalyzing RNA moiety and a reverse transcriptase, which together enable site-specific disruption within a sequence of interest. The CRISPR–Cas9 system has demonstrated the most promise amongst current genome engineering technologies. A single-guide RNA (sgRNA) specifies the location at which the Cas9 protein makes a blunt-end-forming DSB; this simplicity enables simultaneous targeting of multiple sites by including multiple sgRNAs. Protein crystal structures were generated using Chimera, and Protein Data Bank (PDB) accession numbers are provided underneath each structure.
Figure 4. Two main approaches to large-scale genome engineering in bacteria and yeast
a | Complete de novo synthesis. One must first computationally design oligonucleotide sequences that comprise the desired product while ensuring the highest compatibility with the experimental strategy used for assembly. For example, thermodynamic properties of the sequence such as GC% and secondary structure formation are important parameters to consider. Subsequently, using in vitro PCR-based approaches, oligonucleotides (~150–200 bases in length) can be assembled into fragments 1–2 kbp in size. Finally, using either bacteria or yeast as a chassis, these fragments can be assembled into segments on the order of megabase pairs (Mbp). b | Extensive editing of an existing genome scaffold. Depending on the size of the DNA segment being edited, the segment is computationally divided into pieces that each require a roughly similar amount of editing. Each piece can then be edited independently in bacteria or yeast and, upon completion, the pieces can be re-assembled into a single final segment and placed into the final recipient cell.
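The computational division described in panel b can be illustrated with a minimal sketch. It assumes that "level of editing" is measured simply as the number of planned edits per piece and that pieces are half-open coordinate intervals; the function names `split_by_edit_load` and `edits_per_piece` and the example coordinates are hypothetical and do not come from the article.

```python
def split_by_edit_load(edit_positions, segment_length, n_pieces):
    """Divide [0, segment_length) into n_pieces half-open intervals that
    each contain a similar number of planned edits (illustrative only)."""
    positions = sorted(set(edit_positions))
    total = len(positions)
    if n_pieces < 1 or total < n_pieces:
        raise ValueError("need at least one distinct edit per piece")
    boundaries = [0]
    for k in range(1, n_pieces):
        idx = (k * total) // n_pieces          # first edit that belongs to piece k
        left, right = positions[idx - 1], positions[idx]
        # Cut between the two neighbouring edits so that left < cut <= right.
        boundaries.append((left + right + 1) // 2)
    boundaries.append(segment_length)
    return list(zip(boundaries[:-1], boundaries[1:]))


def edits_per_piece(edit_positions, pieces):
    """Count how many edits fall inside each (start, end) piece."""
    return [sum(start <= p < end for p in set(edit_positions))
            for start, end in pieces]


# Hypothetical example: six edits clustered unevenly along a 10 kbp segment.
edits = [100, 200, 300, 5000, 5100, 9000]
pieces = split_by_edit_load(edits, segment_length=10_000, n_pieces=3)
counts = edits_per_piece(edits, pieces)
```

Here the pieces differ widely in length (250 bp, 4,800 bp and 4,950 bp) but each carries two edits, which is the balancing criterion the legend describes. A real design would probably also weigh piece length and assembly constraints, which this sketch ignores.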
Figure 5. A roadmap to building large genomes using a combination of de novo synthesis and genome editing
Here, a hierarchical strategy is employed starting from short oligonucleotides (150 to 200 bases in length), which can be assembled into short DNA fragments ~3 kbp in size. Short DNA fragments would then be assembled into larger segments between 50 kbp and 1 Mbp in size. Segments ~4 Mbp in size can be built in bacteria or yeast and subsequently transferred into human cells using conjugation. These can then be built into segments up to ~100 Mbp in size and be transferred into recipient cells by microcell-mediated chromosomal transfer (MMCT). At each stage where a segment is built, genome-editing tools such as multiplex automated genome engineering (MAGE), CRISPR, homologous recombination (HR) and site-specific recombinases (SSRs) can be utilized to make further edits beyond what was specified in the initial set of oligonucleotides.
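To give a feel for the scale of this hierarchy, the following sketch counts how many units each tier would need for a given target size. It is a back-of-the-envelope calculation under loud assumptions: the tier sizes are taken from the upper end of the ranges in the legend, overlaps between adjacent pieces (which real assembly methods require) are ignored, and the tier names and functions are hypothetical labels chosen for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


# Assumed unit size (in bases) per tier, using the upper ends of the
# ranges quoted in the legend. Overlaps between pieces are ignored.
TIERS = [
    ("oligonucleotide", 200),
    ("fragment", 3_000),
    ("segment", 1_000_000),
    ("bacterial_or_yeast_build", 4_000_000),
    ("mmct_segment", 100_000_000),
]


def units_needed(target_bp, tiers=TIERS):
    """Number of units of each tier required to cover target_bp."""
    return {name: ceil_div(target_bp, size) for name, size in tiers}


def fan_in(tiers=TIERS):
    """How many units of one tier are joined to make one unit of the next."""
    return [(lower[0], upper[0], ceil_div(upper[1], lower[1]))
            for lower, upper in zip(tiers, tiers[1:])]


# One ~100 Mbp segment, the largest size mentioned in the legend.
needed = units_needed(100_000_000)
joins = fan_in()
```

Under these assumptions a single 100 Mbp segment corresponds to 500,000 oligonucleotides, 33,334 fragments, 100 one-megabase segments and 25 four-megabase builds. Accounting for overlaps and for the shorter 150-base oligonucleotides would raise the oligonucleotide count further.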

References

    1. Kosuri S, Church GM. Large-scale de novo DNA synthesis: technologies and applications. Nat Methods. 2014;11:499–507.
    2. Reuter JA, Spacek DV, Snyder MP. High-throughput sequencing technologies. Mol Cell. 2015;58:586–597.
    3. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016;17:333–351.
    4. Smithies O, Gregg RG, Boggs SS, Koralewski MA, Kucherlapati RS. Insertion of DNA sequences into the human chromosomal beta-globin locus by homologous recombination. Nature. 1985;317:230–234.
    5. DeWitt MA, et al. Selection-free genome editing of the sickle mutation in human adult hematopoietic stem/progenitor cells. Sci Transl Med. 2016;8:360ra134.
