This is a preprint.
PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
- PMID: 39764410
- PMCID: PMC11703324
Abstract
We present PepTune, a multi-objective discrete diffusion model for simultaneous generation and optimization of therapeutic peptide SMILES. Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures valid peptide structures with a novel bond-dependent masking schedule and invalid loss function. To guide the diffusion process, we introduce Monte Carlo Tree Guidance (MCTG), an inference-time multi-objective guidance algorithm that balances exploration and exploitation to iteratively refine Pareto-optimal sequences. MCTG integrates classifier-based rewards with search-tree expansion, overcoming gradient estimation challenges and data sparsity. Using PepTune, we generate diverse, chemically modified peptides simultaneously optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling for various disease-relevant targets. Overall, our results demonstrate that MCTG for masked discrete diffusion is a powerful and modular approach for multi-objective sequence design in discrete state spaces.
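The notion of Pareto-optimal sequences used in the abstract can be illustrated with a minimal sketch. This is not the authors' MCTG implementation: the function names `dominates` and `pareto_front`, the score tuples, and the convention that higher scores are better on every objective are all assumptions made for illustration only.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector `a` Pareto-dominates `b`.

    Assumption: every objective is oriented so that higher is better.
    `a` dominates `b` when it is at least as good on all objectives
    and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated score vectors."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical per-candidate scores (e.g. affinity, permeability, solubility),
# invented purely for illustration.
candidates = [
    (0.9, 0.2, 0.5),
    (0.6, 0.6, 0.6),
    (0.5, 0.5, 0.5),  # dominated by the second candidate
    (0.9, 0.1, 0.5),  # dominated by the first candidate
]
front = pareto_front(candidates)
print(front)  # indices of candidates that no other candidate dominates
```

In a guided search of the kind the abstract describes, a filter like this would presumably be applied to the property scores of candidate sequences, so that only candidates not beaten on every objective are retained for further refinement.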
Conflict of interest statement
Competing Interests: P.C. is a co-founder of Gameto, Inc. and UbiquiTx, Inc. and advises companies involved in peptide therapeutics development. P.C., S.T., and Y.Z. have filed and are currently filing patent applications related to this work. P.C.’s interests are reviewed and managed by Duke University in accordance with its conflict-of-interest policies.