ACS Synth Biol. 2023 Apr 21;12(4):1154-1163.
doi: 10.1021/acssynbio.2c00648. Epub 2023 Mar 22.

A Genetic Programming Approach to Engineering MRI Reporter Genes

Alexander R Bricco et al. ACS Synth Biol. 2023.

Abstract

Here we develop a method for protein optimization using a computational approach known as "genetic programming". We developed an algorithm called Protein Optimization Engineering Tool (POET). Starting from a small library of literature values, this tool allowed us to develop proteins that produce four times more MRI contrast than the previous state of the art. Interestingly, many of the peptides produced using POET differ dramatically in sequence and chemical environment from existing CEST-producing peptides, and challenge prior understanding of how those peptides function. While existing algorithms for protein engineering rely on divergent evolution, POET relies on convergent evolution and consequently allows discovery of peptides with completely different sequences that perform the same function with equal or even better efficiency. Thus, this novel approach can be expanded beyond developing imaging agents and used widely in protein engineering.

Keywords: CEST MRI; genetic programming; protein engineering.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
The principles of POET. (a) Illustration of conventional directed evolution, where in each evolution cycle one mutant exhibits better fitness and is therefore used as the template for the next generation. (b) Often, in directed evolution, the protein fitness reaches a local maximum, and consequently all the mutants exhibit lower fitness (empty arrows). In this case it is impossible to predict which mutant should be used as a template to achieve improved fitness (the route from 1 to 5). (c) In POET, the route from 1 to 5 is determined not by stepwise mutagenesis that adheres to the parental protein, but by generating libraries of peptides that broadly cover the search space. Each generation helps to shape a set of rules that determine the next set of peptides. In this way the whole search space of the fitness landscape is covered, minimizing the probability of missing the absolute maximum.
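The contrast drawn in panels (b) and (c) can be illustrated with a toy example. The sketch below is not the POET algorithm: the one-dimensional landscape, its fitness values, and the function names are all hypothetical, chosen only so that position 1 is a local maximum and position 5 is the global one. Evaluating every position is only feasible in a toy like this; per the caption, POET instead shapes rules from each generation to pick the next library.

```python
# Toy 1-D fitness landscape. Values are hypothetical: position 1 is a
# local maximum and position 5 is the global maximum.
LANDSCAPE = [1.0, 5.0, 3.0, 2.0, 4.0, 9.0, 6.0]


def stepwise_search(start):
    """Stepwise walk: move to the best neighbour only if it is fitter."""
    position = start
    while True:
        neighbours = [
            p for p in (position - 1, position + 1) if 0 <= p < len(LANDSCAPE)
        ]
        best = max(neighbours, key=lambda p: LANDSCAPE[p])
        if LANDSCAPE[best] <= LANDSCAPE[position]:
            return position  # every single-step mutant is worse: stuck here
        position = best


def broad_search():
    """Library-style search: evaluate candidates spread over the whole space."""
    return max(range(len(LANDSCAPE)), key=lambda p: LANDSCAPE[p])


stuck_at = stepwise_search(start=1)
found = broad_search()
print(stuck_at, found)
```

Starting the stepwise walk at position 1 leaves it trapped there, because both neighbours are less fit, while the broad search is not tied to any single parent.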
Figure 2
Schematic illustration of POET. (a–d) Paradigm and workflow.
Figure 3
Improvement of CESTides by POET. (a) MTR and z-spectra from the best and the worst peptide in generation 7. (b) The MTRasym is normalized against the contrast generated by K12 in the same experiment to provide a consistent comparison across experiments and plotted with respect to the generations.
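The normalization described in panel (b) can be sketched as follows. This assumes the standard definition of the asymmetry ratio, MTRasym = (S(-Δω) - S(+Δω)) / S0, evaluated at a single saturation offset; the function names and the signal values are hypothetical and are not taken from the paper.

```python
def mtr_asym(s_neg, s_pos, s0):
    """Asymmetry ratio (S(-dw) - S(+dw)) / S0 at one saturation offset dw."""
    if s0 == 0:
        raise ValueError("S0 must be non-zero")
    return (s_neg - s_pos) / s0


def contrast_relative_to_k12(peptide_mtr, k12_mtr):
    """Express a peptide's MTRasym as a multiple of K12 measured in the same experiment."""
    if k12_mtr == 0:
        raise ValueError("K12 reference contrast must be non-zero")
    return peptide_mtr / k12_mtr


# Hypothetical signal intensities from one experiment (arbitrary units).
k12 = mtr_asym(s_neg=95.0, s_pos=90.0, s0=100.0)
peptide = mtr_asym(s_neg=95.0, s_pos=80.0, s0=100.0)
print(contrast_relative_to_k12(peptide, k12))
```

Dividing by the K12 value from the same experiment removes experiment-to-experiment scaling, which is what makes contrast comparable across generations.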
Figure 4
Structure of four representative distinct peptides. (a) K12: KKKKKKKKKKKK; Theoretical pI/Mw: 11.04/1556.10. (b) A peptide from generation 2 that has a neutral pI yet generates higher contrast than K12: NSSNHSNNMPCQ; Theoretical pI/Mw: 6.73/1332.38. (c) A peptide from generation 5 that generates contrast approximately 4 times larger than K12: KMWDWEQKKKWI; Theoretical pI/Mw: 9.53/1706.04. (d) A peptide from generation 7 that generates contrast twice that of K12 but has an acidic pI: ICLKSQPICGID.
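The theoretical Mw values quoted in this caption can be checked with a short calculation: sum the average residue masses and add one water molecule for the free termini. The mass table below is an assumption (commonly tabulated average residue masses in Da) and is not given in the source; the pI is not computed here because it depends on the chosen pKa set.

```python
# Average residue masses in Da. Assumed standard tabulated values,
# not taken from the source document.
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01524  # one H2O for the free N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight of an unmodified linear peptide, in Da."""
    try:
        residue_total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    return residue_total + WATER_MASS


for peptide in ("KKKKKKKKKKKK", "NSSNHSNNMPCQ", "KMWDWEQKKKWI"):
    print(peptide, round(molecular_weight(peptide), 2))
```

With this table, the three peptides for which the caption lists an Mw come out at the same values to two decimals.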
Figure 5
Sensitivity of evolved CEST peptides. (a) Contrast for each sample from the dilution experiment. Lines are from linear regressions, and each has an R² greater than 0.95. (b) Probability maps show the p-values from a t-test to determine if any contrast perceived is statistically significant. (c) CEST maps show the MTRasym values for each pixel.
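The regression check in panel (a) can be sketched as an ordinary least-squares line with its coefficient of determination. This is a minimal sketch: the concentration and contrast values are hypothetical stand-ins for a dilution series, and the function name is illustrative rather than taken from the paper.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical dilution series: concentration versus measured contrast.
concentration = [0.5, 1.0, 2.0, 4.0]
contrast = [1.1, 2.0, 4.1, 7.9]
slope, intercept, r_squared = linear_fit(concentration, contrast)
print(slope, intercept, r_squared)
```

An R² close to 1 indicates that contrast scales linearly with concentration over the tested range, which is the criterion the caption reports.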
Figure 6
Grantham distance between discovered CESTides. (a) Intergenerational distance, where peptides are compared to those in their own generation and all prior ones. (b) Intragenerational distance, where peptides are only compared to those in the same generation. The peptides discovered using POET are shown as blue circles (mean ± 95% confidence interval (CI)), and randomly generated simulated peptides are shown as red squares (mean ± 95% CI). Each data set is fitted with a trendline: an exponential decay curve in (a) and a straight line in (b).
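The two comparisons in this caption can be sketched as averages over pairwise peptide distances. The sketch makes several assumptions: peptides are of equal length and compared position by position without alignment, the intergenerational average is one possible reading of the caption, and the residue-level distance is a pluggable function. The default used here is a 0/1 mismatch placeholder, not the Grantham matrix, which would have to be supplied from a published table. The 4-mer sequences are hypothetical.

```python
from itertools import combinations


def mismatch_stand_in(a, b):
    """Placeholder residue distance (0 if identical, else 1); NOT Grantham."""
    return 0 if a == b else 1


def peptide_distance(p, q, residue_distance=mismatch_stand_in):
    """Sum of residue distances over aligned positions of two equal-length peptides."""
    if len(p) != len(q):
        raise ValueError("peptides must have equal length")
    return sum(residue_distance(a, b) for a, b in zip(p, q))


def _mean(values):
    if not values:
        raise ValueError("no peptide pairs to compare")
    return sum(values) / len(values)


def intragenerational(generations, index, residue_distance=mismatch_stand_in):
    """Mean distance over all pairs within one generation."""
    current = generations[index]
    return _mean([
        peptide_distance(p, q, residue_distance)
        for p, q in combinations(current, 2)
    ])


def intergenerational(generations, index, residue_distance=mismatch_stand_in):
    """Mean distance of a generation's peptides to each other and to all prior ones."""
    current = generations[index]
    prior = [p for gen in generations[:index] for p in gen]
    distances = [
        peptide_distance(p, q, residue_distance)
        for p, q in combinations(current, 2)
    ]
    distances += [
        peptide_distance(p, q, residue_distance)
        for p in current for q in prior
    ]
    return _mean(distances)


# Hypothetical 4-mers standing in for two generations of peptides.
generations = [["KKKK", "KKKR"], ["KKRR", "RRRR"]]
print(intragenerational(generations, 1), intergenerational(generations, 1))
```

Passing a function backed by a real Grantham table as `residue_distance` would turn the placeholder counts into physicochemical distances without changing the rest of the code.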
