Deep diversification of an AAV capsid protein by machine learning
- PMID: 33574611
- DOI: 10.1038/s41587-020-00793-4
Abstract
Modern experimental technologies can assay large numbers of biological sequences, but engineered protein libraries rarely exceed the sequence diversity of natural protein families. Machine learning (ML) models trained directly on experimental data without biophysical modeling provide one route to accessing the full potential diversity of engineered proteins. Here we apply deep learning to design highly diverse adeno-associated virus 2 (AAV2) capsid protein variants that remain viable for packaging of a DNA payload. Focusing on a 28-amino acid segment, we generated 201,426 variants of the AAV2 wild-type (WT) sequence yielding 110,689 viable engineered capsids, 57,348 of which surpass the average diversity of natural AAV serotype sequences, with 12-29 mutations across this region. Even when trained on limited data, deep neural network models accurately predict capsid viability across diverse variants. This approach unlocks vast areas of functional but previously unreachable sequence space, with many potential applications for the generation of improved viral vectors and protein therapeutics.
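The counts in the abstract imply a few proportions that are not stated outright. The short sketch below derives them from the three reported figures only; the variable names are illustrative assumptions, not terminology from the paper, and it is not known from the abstract whether the authors report these ratios in this form.

```python
# Counts quoted in the abstract (variable names are illustrative assumptions).
generated = 201_426        # variants of the AAV2 WT 28-amino acid segment
viable = 110_689           # variants that yielded viable capsids
beyond_natural = 57_348    # viable variants exceeding average natural serotype diversity

# Simple ratios derived from the reported counts.
viable_fraction = viable / generated
beyond_fraction_of_viable = beyond_natural / viable
beyond_fraction_of_generated = beyond_natural / generated

print(f"viable / generated:         {viable_fraction:.1%}")
print(f"beyond natural / viable:    {beyond_fraction_of_viable:.1%}")
print(f"beyond natural / generated: {beyond_fraction_of_generated:.1%}")
```

On these figures, roughly 55% of the generated variants were viable, and about half of the viable ones exceeded the average diversity of natural AAV serotypes.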