In silico Peptide Design: Comparison of Similarity-based and Machine Learning-based Scoring

Published on: April 3, 2024

Rational and reliable in silico design of peptides with high affinity for a specific target protein has been an intense field of research. Natural peptides are built from only 20 amino acids, so one might imagine that in silico sequence generation would be a trivial task for machine learning. However, just as the prediction of protein-protein interactions (PPIs) is non-trivial, the prediction of protein-peptide interactions is also difficult, and a reasonably applicable predictor is not yet available.

Here is an interesting paper that compared three sequence-based PPI predictors to assess their potential use for in silico peptide design.¹⁾ General PPI prediction algorithms evaluate the tendency of proteins of interest to interact and express it as a score. The authors applied these algorithms to protein-peptide interactions with a view to using them for in silico peptide design. This research is related to PepMetics®, which can mimic the sequence of a particular peptide to inhibit or modulate the target PPI. Sequence-based in silico design of peptides for a target protein would open up wide possibilities for PepMetics®.

The authors compared one similarity-based scoring method, SPRINT²⁾, and two machine learning-based ones, PIPR³⁾ and D-SCRIPT⁴⁾. The evaluation used one-to-all curves to visualize the interaction landscape, since such a curve gives the predicted relative interaction scores of each peptide against all proteins in the target organism or organ. One-to-all curves have previously been applied to PPI evaluation and miRNA target prediction.⁵⁾,⁶⁾ In general, a peptide whose target protein is outranked by many off-target proteins on the one-to-all curve will not be a drug candidate.
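To make the idea of a one-to-all curve concrete, here is a minimal Python sketch. It assumes that some predictor has already produced one interaction score per candidate protein for a given peptide. The function names, protein labels, and scores are hypothetical and for illustration only; they are not taken from SPRINT, PIPR, D-SCRIPT, or the paper.

```python
from typing import Dict, List, Tuple


def one_to_all_curve(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort every candidate protein by its predicted score, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def target_rank(scores: Dict[str, float], target: str) -> int:
    """Return the 1-based rank of the intended target.

    Every protein with a strictly higher score counts as outranking the target.
    """
    target_score = scores[target]
    return 1 + sum(1 for score in scores.values() if score > target_score)


# Made-up scores for one peptide against four proteins (illustration only).
example_scores = {"TARGET": 0.82, "OFF_A": 0.91, "OFF_B": 0.40, "OFF_C": 0.15}

curve = one_to_all_curve(example_scores)
rank = target_rank(example_scores, "TARGET")
print(curve)
print("target rank:", rank)
```

In this toy case one off-target protein scores higher than the intended target, so the target sits at rank 2. On a real proteome-wide curve, the further down the target sits, the less attractive the peptide is as a candidate.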

FDA-approved peptide drugs with a length of 20 residues or more were used to compare the three sequence-based predictors. The assessment was conducted in an optimistic scenario and a pessimistic one. In the optimistic scenario, the endogenous equivalents of these peptides were included in the data available to the models. In the pessimistic scenario, they were excluded on the basis of BLAST-searched sequence similarity, so that the models would not already know the answer to the test.

As a result, SPRINT outperformed the two machine learning-based models. The authors counted the FDA-approved peptides that fell in the top 1%. SPRINT correctly predicted 6 compounds in the pessimistic scenario, whereas PIPR and D-SCRIPT predicted no therapeutic peptide. PIPR may still have room for improvement, because it predicted two peptides in the optimistic scenario.
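A top-1% count of this kind can be sketched in a few lines of Python. This sketch assumes that a "hit" means the known target of a peptide ranks within the top 1% of all scored proteins; that reading, the rounding of the cutoff, the function names, and all numbers below are assumptions for illustration and do not reproduce the paper's data or exact procedure.

```python
import math
from typing import Dict


def in_top_fraction(rank: int, n_proteins: int, fraction: float = 0.01) -> bool:
    """True if a 1-based rank falls within the top `fraction` of n_proteins."""
    # Assumed convention: round the cutoff up and keep at least one position.
    cutoff = max(1, math.ceil(n_proteins * fraction))
    return rank <= cutoff


def count_hits(target_ranks: Dict[str, int], n_proteins: int,
               fraction: float = 0.01) -> int:
    """Count peptides whose known target lands in the top fraction."""
    return sum(
        in_top_fraction(rank, n_proteins, fraction)
        for rank in target_ranks.values()
    )


# Hypothetical target ranks for four peptides on a 20,000-protein curve.
example_ranks = {"pep1": 5, "pep2": 150, "pep3": 250, "pep4": 3000}
hits = count_hits(example_ranks, n_proteins=20_000)
print("peptides with target in top 1%:", hits)
```

With 20,000 proteins the top 1% covers roughly the first 200 positions, so only the first two hypothetical peptides count as hits.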

Even so, SPRINT, a similarity-based scoring method, is not the complete answer for in silico design of peptides for PPIs. The authors also demonstrated de novo design of peptides and evaluated their quality. The results indicated that application to screening is difficult. This is due to the sensitivity of SPRINT to small changes in the training data, although its success rate was better than that of the other two methods.

Similarity-based PPI predictors that use only the amino acid sequence show, at least for now, better results in predicting protein-peptide interactions. There are still limitations in de novo design of suitable peptides, but such prediction is worth trying for the rational design of peptides for PPI perturbation.

It would be desirable for an algorithm specific to protein-peptide interactions to be developed for peptide-based PPI modulation. PepMetics® matches this concept, and we would be eager to conduct trials like this research to achieve faster target-small molecule prediction.

1) https://doi.org/10.1038/s41598-022-13227-9
2) https://doi.org/10.1186/s12859-017-1871-x
3) https://doi.org/10.1093/bioinformatics/btz328
4) https://doi.org/10.1101/2021.01.22.427866
5) https://doi.org/10.1038/s41598-018-30044-1
6) https://doi.org/10.1038/s41598-020-68251-4

