2019 Nov 26;116(48):24206-24213.
doi: 10.1073/pnas.1905990116. Epub 2019 Nov 12.

Definitive demonstration by synthesis of genome annotation completeness

Paul R Jaschke et al. Proc Natl Acad Sci U S A.

Abstract

We develop a method for completing the genetics of natural living systems by which the absence of expected future discoveries can be established. We demonstrate the method using bacteriophage øX174, the first DNA genome to be sequenced. As with many well-studied natural organisms, closely related genome sequences are available: 23 Bullavirinae genomes related to øX174. Using bioinformatic tools, we first identified 315 potential open reading frames (ORFs) within the genome, including the 11 established essential genes and 82 highly conserved ORFs that have no known gene products or assigned functions. Using genome-scale design and synthesis, we made a mutant genome in which all 11 essential genes are simultaneously disrupted, leaving intact only the 82 conserved but cryptic ORFs. The resulting genome is not viable. Cell-free gene expression followed by mass spectrometry revealed only a single peptide expressed from both the cryptic ORF and wild-type genomes, suggesting a potential new gene. A second synthetic genome in which 71 conserved cryptic ORFs were simultaneously disrupted is viable but with ∼50% reduced fitness relative to the wild type. However, rather than finding any new genes, repeated evolutionary adaptation revealed a single point mutation that modulates expression of gene H, a known essential gene, and fully suppresses the fitness defect. Taken together, we conclude that the annotation of currently functional ORFs for the øX174 genome is formally complete. More broadly, we show that sequencing and bioinformatics followed by synthesis-enabled reverse genomics, proteomics, and evolutionary adaptation can definitively establish the sufficiency and completeness of natural genome annotations.

Keywords: cleanomics; gene discovery; reverse genomics; synthetic biology; synthetic genomics.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Designing genomes encoding only essential or only cryptic ORFs. (A) Contemporary genetic map of øX174. Lettered boxes represent the 11 established protein-coding ORFs. (B) The øX174 genome with 315 ORFs, each 60 bp or longer and starting with an ATG, GTG, or TTG codon (various colors), plus the 11 established protein-coding ORFs (black). (C) Filtered annotation of the øX174 genome: 11 previously identified protein-coding ORFs (black) and 82 cryptic ORFs of unknown protein-coding status (green). (D) The cryptX174 genome design, with the 11 established ORFs disrupted (gray) and the 82 cryptic ORFs of unknown protein-coding status left intact (green). (E) The kleenX174 genome design, showing the 71 disrupted cryptic ORFs (gray), the 11 previously identified protein-coding ORFs (black), and the remaining 11 cryptic ORFs of unknown protein-coding status (green).
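The ORF criteria stated in panel B (60 bp or longer, starting with ATG, GTG, or TTG) can be illustrated with a short scan. This is a minimal sketch and not the authors' pipeline: it assumes a linear sequence, scans both strands, counts the stop codon toward the length, and reports every qualifying start codon, including nested starts that share a stop. The function name `find_orfs` and the toy sequence are hypothetical, and the sketch ignores genome circularity and the conservation filtering used to arrive at the 82 cryptic ORFs.

```python
# Sketch of a naive ORF scan using the criteria from the Fig. 1B caption.
# Assumptions (not from the source): linear sequence, both strands scanned,
# ORF length includes the stop codon, nested starts are reported separately.

START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LENGTH_BP = 60

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_length=MIN_LENGTH_BP):
    """Return (strand, start, end, orf) tuples.

    Coordinates are 0-based, end-exclusive, and relative to the strand
    that was scanned (the reverse complement for the "-" strand).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - 2):
            if s[i:i + 3] not in START_CODONS:
                continue
            # Walk codon by codon in the same frame until the first stop.
            for j in range(i + 3, len(s) - 2, 3):
                if s[j:j + 3] in STOP_CODONS:
                    if j + 3 - i >= min_length:
                        orfs.append((strand, i, j + 3, s[i:j + 3]))
                    break
    return orfs


# Toy 66-bp sequence (hypothetical): one start, 20 alanine codons, one stop.
toy = "ATG" + "GCT" * 20 + "TAA"
toy_orfs = find_orfs(toy)
```

Raising `min_length` above 66 on the toy sequence returns an empty list, which shows how the length threshold alone prunes candidates before any conservation filter is applied.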
Fig. 2.
A “clean” øX174 genome in which 71 cryptic ORFs are simultaneously disrupted is viable but has reduced fitness. (A) Linear depiction of the kleenX174 genome showing the locations and modes of cryptic ORF disruption; see Dataset S1 for detailed information. (B) Plaques of wild-type and kleenX174 phage on 85-mm-diameter plates. (C) Plaque diameter of wild-type/kleenX174 chimeras, arranged from largest average plaque size to smallest. Vertical bars represent 1 SD from n = 50 plaque measurements. Each chimeric phage consists of 5 genome segments chosen from a mixture of wild-type genome segments (green) and kleenX174 modified genome segments (red). (Inset) The boundaries of the 5 genome segments, the protein-coding ORFs found in each segment, and the total number of nucleotide differences between the wild-type øX174 and kleenX174 genome sequences in each segment.
Fig. 3.
Evolutionary adaptation of kleenX174 to a high growth rate results in a reversion (2939C > T) and increased gene H expression. (A) The mutation 2939C > T, observed in 2 independent adaptation experiments, both restores the putative start codon of cryptic ORF 36 (GCG > GTG) and silently changes the third codon of gene H (GGC > GGT, Gly > Gly). (B) Evolved kleenX174(2939C > T) grows faster than ancestral kleenX174 in liquid culture (population doublings per hour). (C) The kleenX174(2939C > T) mutation also recovers plaque size. (D) RNA structure predictions of gene H sequence variants from wild-type øX174, kleenX174, and mutant kleenX174(2939C > T). NUPACK lowest-energy RNA structures were generated from an 83-nt window surrounding the gene H start codon. (E) Protein H production from kleenX174(2939C > T) is increased compared to the ancestral kleenX174 genome. Protein H was produced from synthetic dsDNA templates (Datasets S2–S4) containing either the wild-type øX174, kleenX174, or kleenX174(2939C > T) gene H sequence plus 20 bp of upstream sequence identical to the genome background, flanked by T7 promoter (PT7) and terminator (T7 Term) sequences. PURExpress reactions run with 0.8 nM template were separated by SDS/PAGE, followed by fluorescence detection of BODIPY-FL-tagged lysine incorporated into proteins produced during the transcription/translation reaction. Error bars represent 1 SD (n = 3).
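The two codon-level effects described in panel A can be checked against the standard genetic code. The following is a minimal sketch that uses only the codons quoted in the caption; the helper names `is_start_codon` and `classify_substitution` are hypothetical, and the start-codon set is the one given in the Fig. 1 caption (ATG, GTG, TTG).

```python
# Sketch: verify that GGC > GGT is synonymous and that GCG > GTG creates a
# start codon. Uses the standard genetic code in TCAG order.

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "TTG"}


def is_start_codon(codon):
    """True if the codon is one of the start codons used in the ORF scan."""
    return codon.upper() in START_CODONS


def classify_substitution(old_codon, new_codon):
    """Label a codon change as 'silent', 'nonsense', or 'missense'."""
    old_aa = CODON_TABLE[old_codon.upper()]
    new_aa = CODON_TABLE[new_codon.upper()]
    if old_aa == new_aa:
        return "silent"
    if new_aa == "*":
        return "nonsense"
    return "missense"


# Gene H, third codon: GGC > GGT (both encode glycine).
gene_h_effect = classify_substitution("GGC", "GGT")

# Cryptic ORF 36, putative start codon: GCG > GTG.
orf36_start_before = is_start_codon("GCG")
orf36_start_after = is_start_codon("GTG")
```

Because the change is synonymous in gene H, the protein sequence is unchanged, which is consistent with the caption attributing the fitness recovery to altered gene H expression rather than to an altered protein.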
