Brief Bioinform. 2024 Mar 27;25(3):bbae118.
doi: 10.1093/bib/bbae118.

Enhancing cryo-EM structure prediction with DeepTracer and AlphaFold2 integration

Jason Chen et al. Brief Bioinform. 2024.

Abstract

Understanding protein structures is invaluable in various biomedical applications, such as vaccine development. Building protein structure models from experimental electron density maps is a time-consuming and labor-intensive task. To address this challenge, machine learning approaches have been proposed to automate the process. Currently, the majority of the experimental maps in the database lack atomic-resolution features, making it challenging for machine learning-based methods to precisely determine protein structures from cryogenic electron microscopy (cryo-EM) density maps. On the other hand, protein structure prediction methods, such as AlphaFold2, leverage evolutionary information from protein sequences and have recently achieved groundbreaking accuracy. However, these methods often require manual refinement, which is labor intensive and time consuming. In this study, we present DeepTracer-Refine, an automated method that refines AlphaFold-predicted structures by aligning them to DeepTracer's modeled structure. Our method was evaluated on 39 multi-domain proteins, and it improved the average residue coverage from 78.2% to 90.0% and the average local Distance Difference Test (lDDT) score from 0.67 to 0.71. We also compared DeepTracer-Refine with Phenix's AlphaFold refinement and demonstrated that our method not only performs better when the initial AlphaFold model is less precise but also surpasses Phenix in run-time performance.

Keywords: AlphaFold; DeepTracer; cryo-EM; protein docking; protein structure; refinement.

Figures

Figure 1
An illustration of the two approaches in protein structure prediction. (A) Map-to-model detects the residues from a cryo-EM electron density map and connects them into a 3D structure through backbone tracing. (B) Sequence-to-model uses a protein sequence as input to fold the 1D sequence into a 3D structure.
Figure 2
Current challenges of map-to-model methods. (A) Even high-resolution cryo-EM maps lack the atomic-level detail needed for accurate side-chain identification. Two residues, phenylalanine and tyrosine, are placed in cryo-EM density map EMD-20621, and their side chains cannot be distinguished in the map. The balls represent atoms, and the circled atoms in the top half of the residues are the side-chain atoms. (B) Missing residues cause false connections. In the close-up section from DeepTracer's prediction, the balls represent residues; DeepTracer missed four residues (enclosed by the oval), causing the backbone to be connected incorrectly.
Figure 3
The design of the DeepTracer-Refine pipeline. We use AlphaFold's pLDDT score to detect and rank locations at which to split the structure into compact domains. We first process the pLDDT scores and calculate an empirical score for each possible low-confidence location. We then rank the locations by empirical score; a higher score suggests a poorer AlphaFold prediction. In this example of rotavirus VP6 [28], we split the AlphaFold model at residue 148 and aligned each domain separately to DeepTracer's prediction. The merged structure is improved: the conformation of the right-side structure more closely resembles the solved structure.
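The split-site ranking described in the Figure 3 caption can be illustrated with a minimal sketch. The paper's empirical score is not given in this text, so the score below (how far a smoothed pLDDT dip falls below a cutoff) is a hypothetical stand-in. The function names, the window of 5, the cutoff of 70 and the minimum domain length of 10 are all assumptions made for illustration, not DeepTracer-Refine's actual values.

```python
from statistics import mean


def smooth(plddt, window=5):
    """Moving average of per-residue pLDDT scores (window size is assumed)."""
    half = window // 2
    return [mean(plddt[max(0, i - half): i + half + 1]) for i in range(len(plddt))]


def rank_split_sites(plddt, cutoff=70.0, window=5, min_domain=10):
    """Return (residue_number, score) pairs, highest score first.

    Hypothetical score: depth of a local minimum of the smoothed pLDDT
    below `cutoff`. A higher score suggests a poorer prediction there.
    """
    s = smooth(plddt, window)
    sites = []
    # Skip the chain ends so that both resulting domains keep min_domain residues.
    for i in range(min_domain, len(s) - min_domain):
        is_local_min = s[i] <= s[i - 1] and s[i] <= s[i + 1]
        if is_local_min and s[i] < cutoff:
            sites.append((i + 1, cutoff - s[i]))  # 1-based residue numbering
    return sorted(sites, key=lambda site: site[1], reverse=True)


def split_domains(n_residues, site):
    """Split residues 1..n_residues into two ranges at the chosen site."""
    return [(1, site), (site + 1, n_residues)]


# Toy chain: confident everywhere except a five-residue low-confidence linker.
toy_plddt = [90] * 20 + [40] * 5 + [90] * 20
ranked = rank_split_sites(toy_plddt)
best_site = ranked[0][0]
domains = split_domains(len(toy_plddt), best_site)
```

In this toy input the only candidate is the middle of the low-confidence linker, so the chain is split into two ranges that could then be aligned separately, as the caption describes for the split at residue 148.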
Figure 4
DeepTracer-Refine improvements in residue coverage (left y-axis, 0–100%) and lDDT score (right y-axis, 0.00–1.00). For each entry, the left two bars are residue coverage and the right two bars are lDDT. The lighter shades represent the initial AlphaFold results and the darker shades the DeepTracer-Refine results.
Figure 5
Examples of structural improvements after DeepTracer-Refine. In both examples, the solved and DeepTracer structures do not have complete residue coverage. The AlphaFold and DeepTracer-Refine structures contain the entire sequence, but their backbones are not accurate enough for every residue to be matched, so their residue coverage is incomplete as well. (A) The original DeepTracer prediction is accurate in residue coverage and predicted sequence. DeepTracer-Refine split the AlphaFold model at residue 218 and aligned each half correctly. (B) An example in which the original DeepTracer prediction is less accurate in sequence prediction. DeepTracer-Refine made the encircled region more similar to the solved structure.
Figure 6
The top-left and top-right graphs compare the results of DeepTracer-Refine and Phenix-Refine with AlphaFold's initial predicted structures. DeepTracer-Refine is more effective when the initial AlphaFold prediction is less accurate (residue coverage <80% and lDDT <0.65). The bottom graph compares run-time performance on a logarithmic scale. The average run-time of DeepTracer-Refine is ~8 min, while that of Phenix-Refine is 3 h 35 min.
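As a quick check on the run-time figures in the Figure 6 caption, the short calculation below converts both averages to minutes and compares them. Only the two averages come from the caption; treating "~8 min" as exactly 8 is an assumption, so the ratio is approximate.

```python
import math

deeptracer_min = 8          # "~8 min" from the caption, taken as exactly 8 here
phenix_min = 3 * 60 + 35    # 3 h 35 min expressed in minutes

speedup = phenix_min / deeptracer_min
# Separation of the two averages on a log10 axis, in decades.
log_gap = math.log10(speedup)

print(f"Phenix-Refine takes about {speedup:.0f} times as long "
      f"({log_gap:.2f} decades on a log10 axis)")
```

Under that assumption, the average Phenix-Refine run is roughly 27 times longer, which is why the caption's bottom graph uses a logarithmic scale.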
Figure 7
An example of an AlphaFold structure that worsened after DeepTracer-Refine. The cause is the missing backbone in the DeepTracer prediction (highlighted by the circle), which prevented the compact domains from being properly aligned.
Figure 8
An example of DeepTracer-Refine’s result with and without the distance-checking strategy. DeepTracer-Refine aligned the compact domain of residues 1–41 (top right, highlighted by the circle) to a helical structure that resembles it; however, that structure is too far away to be considered a possible alignment.
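A distance check of the kind the Figure 8 caption describes might look like the sketch below. The source does not state how DeepTracer-Refine measures or limits the distance, so both the criterion (displacement of the domain centroid) and the 15 Å threshold are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def centroid(coords):
    """Mean position of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))


def accept_alignment(original, aligned, max_shift=15.0):
    """Accept an alignment only if the domain centroid moved at most max_shift.

    `original` and `aligned` hold the same atoms before and after alignment.
    max_shift (in angstroms) is an assumed threshold, not the paper's value.
    """
    shift = math.dist(centroid(original), centroid(aligned))
    return shift <= max_shift


# Toy domain of two atoms, moved by 3 angstroms and by 50 angstroms along x.
domain = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
moved_near = [(x + 3.0, y, z) for x, y, z in domain]
moved_far = [(x + 50.0, y, z) for x, y, z in domain]

near_ok = accept_alignment(domain, moved_near)
far_ok = accept_alignment(domain, moved_far)
```

With a filter like this, a structurally similar but distant match, such as the helix in the caption, would be rejected while a nearby placement is kept.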
