Protein Sci. 2023 Feb;32(2):e4566.
doi: 10.1002/pro.4566.

ScrepYard: An online resource for disulfide-stabilized tandem repeat peptides

Junyu Liu et al. Protein Sci. 2023 Feb.

Abstract

Receptor avidity through multivalency is a highly sought-after property of ligands. While readily available in nature in the form of bivalent antibodies, this property remains challenging to engineer in synthetic molecules. The discovery of several bivalent venom peptides containing two homologous and independently folded domains (in a tandem repeat arrangement) has provided a unique opportunity to better understand the underpinning design of multivalency in multimeric biomolecules, as well as how naturally occurring multivalent ligands can be identified. In previous work, we classified these molecules as a larger class termed secreted cysteine-rich repeat-proteins (SCREPs). Here, we present an online resource, ScrepYard, designed to assist researchers in identifying SCREP sequences of interest and to aid in characterizing this emerging class of biomolecules. Analysis of sequences within the ScrepYard reveals that two-domain tandem repeats constitute the most abundant SCREP domain architecture, while the interdomain "linker" regions connecting the functional domains are found to be abundant in amino acids with short or polar sidechains and contain an unusually high abundance of proline residues. Finally, we demonstrate the utility of ScrepYard as a virtual screening tool for discovery of putatively multivalent peptides, by using it as a resource to identify a previously uncharacterized serine protease inhibitor and confirm its predicted activity using an enzyme assay.

Keywords: SCREPs; bioactive; bivalent; disulfide-rich; multivalent; peptide; secreted proteins; tandem-repeat.

Conflict of interest statement

The authors have no conflicts of interest to declare.

Figures

FIGURE 1
Flowchart outlining the construction of the ScrepYard database. The process can be divided into three stages: (1) SCREP datamining (green), (2) SCREP architecture annotation (purple), and (3) the compiling and upload of SCREP data to ScrepYard (orange). Key processes for each step are shown on the right.
FIGURE 2
Distribution of SCREP architectures and linker analysis of two‐domain SCREPs. (a) The inner circle demonstrates the two major clusters of SCREPs; “InterProScan‐identified” (SCREPs with predicted domain types) and “unknown architectures” (SCREPs with unknown domain types). The outer circle demonstrates the different architecture types; unknown architectures (50.7%), pure domain repeats (PDR) (37.6%), and combinatorial domain repeats (CDR) (11.6%), dividing the PDRs and CDRs into a distribution based on the length of repeating domains. (b) The frequency distribution of linker lengths within the TR1‐TR2 dataset. Most linkers are <20 AAs in length (85.57%), with the remaining linkers (14.43%) extending between 20 and 100 AAs. (c) A heat map of amino acid composition for all known two‐domain SCREPs with linker lengths between 1–20 AAs. AAs are sorted left‐to‐right in order of decreasing side‐chain hydrophobicity (Monera et al., 1995). (d) A grouped heatmap displaying the abundance of domain specific linker lengths in bacteria, fungi, plants, and metazoans. Each kingdom contains the four highest occurring domain types, with the frequency of each linker length displayed. The coloring indicates the relative level of abundance for each domain type.
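The linker statistics summarized in panels (b) and (c) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 20-residue cutoff argument, and the example linker sequences are all assumptions made for illustration, and the sequences are invented rather than taken from ScrepYard.

```python
from collections import Counter


def fraction_short_linkers(linkers, cutoff=20):
    """Return the fraction of linkers shorter than `cutoff` residues."""
    if not linkers:
        raise ValueError("no linker sequences supplied")
    short = sum(1 for seq in linkers if len(seq) < cutoff)
    return short / len(linkers)


def amino_acid_composition(linkers):
    """Return the pooled relative frequency of each residue across all linkers."""
    counts = Counter("".join(linkers))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("linker sequences are empty")
    return {aa: n / total for aa, n in counts.items()}


# Hypothetical linker sequences (one-letter codes), NOT real ScrepYard entries.
example_linkers = ["GSPPG", "PTSGNP", "GS", "APSTPGSGNPSEPTPSGTAPKS"]

short_fraction = fraction_short_linkers(example_linkers)
composition = amino_acid_composition(example_linkers)

print(f"Fraction of linkers < 20 AAs: {short_fraction:.2f}")
for aa, freq in sorted(composition.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {freq:.3f}")
```

Applied to a real set of extracted interdomain linkers, the first function would give a figure comparable to the percentage reported in panel (b), and the second would give the per-residue values that a heat map like panel (c) visualizes.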
FIGURE 3
Sequence based identification, NMR confirmation of structural order and trypsin inhibition assay of d‐Gs1a (A0A098LW49). (a) Alignment between Kalicludine‐3 with each domain of d‐Gs1a. Conserved residues between Kalicludine‐3 and d‐Gs1a are highlighted in red, while cysteines are highlighted in yellow. (b) 1D 1H‐NMR spectrum of d‐Gs1a demonstrating well resolved and dispersed signal within the NH region, a characteristic feature of a well‐defined globular fold. (c) Trypsin assay in the presence of d‐Gs1a (0.1 μM and 0.25 μM) demonstrating inhibition of digestion of a trypsin substrate which fluoresces upon enzymatic cleavage (increased absorbance correlates with enzyme activity). All trypsin assays were performed in triplicate with 0.5 μM trypsin.
