Proteins. 2022 Apr;90(4):959-972.
doi: 10.1002/prot.26288. Epub 2021 Dec 13.

Scaling-up a fragment-based protein-protein interaction method using a human reference interaction set

Stephanie Schaefer-Ramadan et al. Proteins. 2022 Apr.

Abstract

Protein-protein interactions (PPIs) are essential to understanding numerous aspects of protein function. Here, we significantly scaled up and modified the analysis of the recently developed all-vs-all sequencing (AVA-Seq) approach using a gold-standard human protein interaction set (hsPRS-v2) containing 98 proteins. Binary interaction analyses recovered 20 of 47 (43%) binary PPIs from this positive reference set (PRS), comparing favorably with other methods. However, the 20× increase in the interaction search space for AVA-Seq analysis in this manuscript required numerous changes to the method, which will be needed for future use in genome-wide interaction studies. We show that standard sequencing analysis methods must be modified to account for the possible recovery of thousands of positives among millions of tested interactions in a single sequencing run. The PRS data were used to optimize data scaling, auto-activator removal, the ranking of interaction features (such as orientation and unique fragment pairs), and statistical cutoffs. Using these modifications to the method, AVA-Seq recovered >500 known and novel PPIs, including interactions between wild-type fragments of tumor protein p53 and minichromosome maintenance complex proteins 2 and 5 (MCM2 and MCM5) that could be of interest in human disease.
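
The recovery rate quoted above follows directly from the two counts in the abstract. A minimal Python check, in which the function name `recovery_rate` is illustrative and not taken from the paper:

```python
def recovery_rate(recovered: int, expected: int) -> float:
    """Return the fraction of expected reference interactions that were recovered."""
    if expected <= 0:
        raise ValueError("expected must be a positive count")
    if not 0 <= recovered <= expected:
        raise ValueError("recovered must lie between 0 and expected")
    return recovered / expected


# Counts from the abstract: 20 of 47 binary PRS interactions were recovered.
rate = recovery_rate(20, 47)
print(f"{rate:.1%}")  # 42.6%, which the abstract rounds to 43%
```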

Conflict of interest statement

The authors declare there are no competing interests.

Figures

FIGURE 1
Method schematic. (A) PRS Batch 1 (39 proteins) and Batch 2 (41 proteins) were treated as separate experiments and processed in parallel (Supplementary Table 1). First, the proteins were pooled, sheared, size selected, and ligated into pBORF‐AD and pBORF‐DBD. After selecting the open reading frame (ORF), fragments were amplified, “stitched” together using overlap extension PCR, and ligated into pAVA for screening. For each PRS batch, two separate screenings (A and B) were conducted, and the data generated were pooled during analysis. (B) Data analysis for Batches 1 and 2 was performed identically but separately, since the protein pools are unique. A graphical representation of the criteria used for analysis is shown, along with several recovered PPIs. For each batch, the expected binary interactions were determined (Table 1 and Supplementary Table 2), and a cumulative table of all‐vs‐all interactions (Batches 1 and 2) was populated (Supplementary Table 3). Batches 1 and 2 included an additional nine RRS proteins as controls. Different FDR and logFC requirements were applied at different steps of the data analysis process. These steps are color‐coded, with blue marking the least stringent and orange the most stringent criteria used to define an interaction. DBD, DNA‐binding domains; PCR, polymerase chain reaction; PRS, positive reference set; RRS, random reference set
FIGURE 2
Heat maps of gene coverage. (A) Positive reference set (PRS) Batch 1 (39 × 39 proteins). (B) PRS Batch 2 (41 × 41 proteins). Color scale indicates percent gene coverage in a specific orientation (AD or DBD associated), with 1 being 100% coverage of the protein interaction space and 0 representing 0% coverage. Random reference set (RRS) proteins are not included. AD, Activation domains; DBD, DNA‐binding domains
FIGURE 3
Influence of protein length versus interaction on the PRS protein pairs. This study utilizes 47 pairs of proteins known to interact (a subset of the hsPRS‐v2 library). This figure characterizes these well‐studied positive reference interactions in the context of the AVA‐Seq method. (A) Individual protein length in amino acids of proteins used in this study categorized into expected interaction not detected (blue; mean 245.7; n = 27) or expected interaction detected (red; mean 434.7; n = 20; t = 4.524, df = 45). p value < .0001 indicated. (B) The minimum number of relative fragment starting points divided by protein length in amino acids versus expected interaction not detected (blue; mean 0.03088; n = 27) or expected interaction detected (red; mean 0.1211; n = 20; t = 5.689; df = 45). p value < .0001 indicated. (C) The number of protein fragments per protein length (in amino acids) plotted against the minimum protein length in the expected interacting pair. Blue dots represent expected interaction not detected, and red dots represent expected interaction detected. AVA‐Seq, All‐vs‐all sequencing; PRS, positive reference set
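
The degrees of freedom reported for panels A and B are consistent with an unpaired two-sample t test using pooled variance, for which df = n1 + n2 − 2. That choice of test is an inference from the reported numbers; the caption does not name the test. A minimal Python check, with an illustrative function name:

```python
def pooled_t_test_df(n1: int, n2: int) -> int:
    """Degrees of freedom for an unpaired two-sample t test with pooled variance."""
    if n1 < 1 or n2 < 1:
        raise ValueError("each group needs at least one observation")
    if n1 + n2 < 3:
        raise ValueError("at least three observations are needed in total")
    return n1 + n2 - 2


# Group sizes from the caption: 27 pairs not detected, 20 pairs detected.
# Assumption: a pooled-variance (Student's) t test was used.
df = pooled_t_test_df(27, 20)
print(df)  # 45, matching the df reported in panels A and B
```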
FIGURE 4
Selectivity of fragment interaction. Panels A and B illustrate the selectivity of the interacting fragments between HGS and NF2 genes. The blue traces (A and B) represent the number of screened fragments (left y‐axis) versus fragment start point, while the red traces (A and B) represent interacting fragments (right y‐axis) versus fragment start point. The gray shaded regions in A and B highlight the expected interaction region of HGS with NF2 from the literature. Panels C and D illustrate the fragment pairings between HGS and NF2 along with logFC and FDR, respectively. (E) The average fragment distance in amino acids (aa) plotted against the average protein length. Protein fragments used in this plot came from proteins that had at least two interacting fragment start points with at least one other interacting partner. The average distance between interacting start points was then computed. (F) Paired t test for data in the panel (t = 10.84; df = 40). FC, Fold change; FDR, false discovery rate
FIGURE 5
Overlay of selected protein fragments with MCM3. Top: One representative trace (shown in black) is shown with reference to the left y‐axis. The location of interacting fragments and abundance from selected proteins (colored lines) are shown with reference to the right y‐axis. BstX1 restriction sites are indicated with a black arrow (residues 165, 302, 335, and 399). MCM3 phosphorylation sites include S112, S160, T198, S292, T383, S535, S672, T674, S711, T722, and S728 (iupred.elte.hu). Bottom: IUPred score is shown for the primary sequence of MCM3. A score closer to 1.0 indicates a region of high disorder, and a score closer to 0.0 indicates less disorder. For simplicity, the x‐axis for the bottom graph uses the same amino acid numbering as the top graph
