PLoS Comput Biol. 2015 Apr 7;11(4):e1004074.
doi: 10.1371/journal.pcbi.1004074. eCollection 2015 Apr.

Machine learning assisted design of highly active peptides for drug discovery

Sébastien Giguère et al. PLoS Comput Biol. 2015.

Abstract

The discovery of peptides possessing high biological activity is very challenging because the space of possible peptides is enormous and only a minority of them have the desired properties. To lower the cost and reduce the time needed to obtain promising peptides, machine learning approaches can greatly assist in the process and even partly replace expensive laboratory experiments by learning a predictor from existing data or from a smaller amount of newly generated data. Unfortunately, once the model is learned, selecting the peptides with the greatest predicted bioactivity often requires a prohibitive amount of computational time. For this combinatorial problem, heuristics and stochastic optimization methods are not guaranteed to find adequate solutions. We focused on recent advances in kernel methods and machine learning to learn a predictive model with proven success. For this type of model, we propose an efficient algorithm based on graph theory that is guaranteed to find the peptides for which the model predicts maximal bioactivity. We also present a second algorithm capable of sorting the peptides of maximal bioactivity. Extensive analyses demonstrate how these algorithms can be part of an iterative combinatorial chemistry procedure to speed up the discovery and validation of peptide leads. Moreover, the proposed approach does not require known ligands for the target protein, since it can leverage recent multi-target machine learning predictors in which ligands for similar targets serve as initial training data. Finally, we validated the proposed approach in vitro with the discovery of new cationic antimicrobial peptides. Source code is freely available at http://graal.ift.ulaval.ca/peptide-design/.
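
To make the idea of an exact graph-based search concrete, here is a minimal sketch, not the authors' implementation. It assumes, hypothetically, that the predicted bioactivity of a peptide decomposes into a sum of position-specific k-mer weights; the names `best_peptide`, `brute_force`, and `toy_weight` are invented for illustration and do not come from the paper or its source code. Under that assumption, the best peptide is a longest path in a layered graph of k-mers, which dynamic programming finds without enumerating every sequence.

```python
from itertools import product


def best_peptide(alphabet, k, length, weight):
    """Return (score, peptide) maximizing sum_i weight(i, peptide[i:i+k]).

    Assumption: the model score is additive over k-mer positions.
    """
    n = length - k + 1  # number of k-mer positions, i.e. graph layers
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # best[v] = (score, spelled string) of the best path ending at k-mer v
    best = {v: (weight(0, v), v) for v in kmers}
    for i in range(1, n):
        nxt = {}
        for v in kmers:
            # A predecessor of v is any k-mer whose last k-1 letters
            # equal the first k-1 letters of v.
            score, prefix = max(best[a + v[:-1]] for a in alphabet)
            nxt[v] = (score + weight(i, v), prefix + v[-1])
        best = nxt
    return max(best.values())


def brute_force(alphabet, k, length, weight):
    """Exhaustive reference search, feasible only for tiny inputs."""
    candidates = []
    for letters in product(alphabet, repeat=length):
        s = "".join(letters)
        score = sum(weight(i, s[i:i + k]) for i in range(length - k + 1))
        candidates.append((score, s))
    return max(candidates)


def toy_weight(position, kmer):
    """Hypothetical integer weight: reward K, more so at later positions."""
    return (position + 1) * kmer.count("K") - 2 * kmer.count("A")


# Both searches agree on the optimum for this toy scoring function.
print(best_peptide("AKL", 2, 5, toy_weight))
print(brute_force("AKL", 2, 5, toy_weight))
```

In this sketch the dynamic program visits each of the n layers once and examines every k-mer and its predecessors, whereas the exhaustive search grows exponentially with peptide length. Whether a real predictor admits such a decomposition depends on the model; the additive form here is purely an assumption of the example.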

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Illustration of the 3-partite graph G^{h_y} with k = 3 and a two-letter alphabet. In this graph, every source-sink path represents a peptide of size 5 (l = n + k − 1) based on the alphabet {A, B}.
Figure 2. Iterative process for the design of peptide ligands.
Figure 3. The 100,000 peptides with highest antimicrobial activity found by the K-longest path algorithm.
Figure 4. Correlation coefficient of h_random predictions on the CAMPs data while varying R, the number of random peptides used as training set.
Figure 5. CAMP bioactivity motifs. Top motif: the best 1,000 peptides obtained from the oracle. Middle motif: the best 1,000 peptides obtained from h_random. Bottom motif: the best 1,000 out of 1,000,000 random peptides.
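
The relation l = n + k − 1 in the Figure 1 caption can be checked with a small sketch. It assumes a layout that the caption suggests but does not spell out: each of the n layers contains every k-mer over the alphabet, an edge joins two k-mers in consecutive layers when they overlap on k − 1 letters, and the source connects to every k-mer of the first layer. The function names `successors` and `spell_paths` are hypothetical.

```python
from itertools import product


def successors(kmer, alphabet):
    """k-mers in the next layer that overlap `kmer` on its last k-1 letters."""
    return [kmer[1:] + a for a in alphabet]


def spell_paths(alphabet, k, n):
    """Yield the peptide spelled by every source-sink path through n layers."""
    def walk(layer, kmer, spelled):
        if layer == n - 1:  # last layer reached: the path ends at the sink
            yield spelled
            return
        for nxt in successors(kmer, alphabet):
            # Each edge contributes exactly one new letter to the peptide.
            yield from walk(layer + 1, nxt, spelled + nxt[-1])

    # Assumed: the source is connected to every k-mer of the first layer.
    for start in ("".join(p) for p in product(alphabet, repeat=k)):
        yield from walk(0, start, start)


peptides = list(spell_paths("AB", k=3, n=3))
print(len(peptides))                # 32 paths, one per peptide of size 5
print({len(p) for p in peptides})   # {5}, matching l = n + k - 1 = 3 + 3 - 1
```

The first k-mer contributes k letters and each of the remaining n − 1 layers adds one more, which gives l = k + (n − 1). With the alphabet {A, B} this yields 2^5 = 32 distinct peptides, one per path.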
