Sci Rep. 2025 Aug 18;15(1):30169.
doi: 10.1038/s41598-025-16353-2.

Exploring weighting schemes for the discovery of informative generalized between pathway models to uncover pathways in genetic interaction networks

Kevin M Yu et al. Sci Rep.

Abstract

A large and rich collection of epistasis data has been collected in S. cerevisiae. When these data come from double knockouts, they have a natural representation as a signed, weighted graph, where the weight on an edge is computed from how far the double-deletion mutant deviates from its expected sickness or health, given its constituent single-deletion mutants. Mani et al. empirically studied different probabilistic null models (minimum, multiplicative, and logarithmic) for setting edge weights, with the goal of determining the best weighting scheme for detecting the presence or absence of an epistatic effect in an individual double knockout in isolation. In contrast, approaches such as the LocalCut algorithm of Leiserson et al. look at the entire network and search for graph-theoretic structure indicative of compensatory pathways. The effect of different edge weighting schemes on the biological pathways returned by algorithms such as LocalCut has not previously been studied. We compare the generalized Between Pathway Models (gBPMs) produced by LocalCut under multiple ways of calculating edge weights, and analyze the resulting collections of putative redundant pathways. We recover some known pathways, find some interesting new pathways, and give broad recommendations for how to set the parameters of LocalCut to produce the most biologically relevant gene sets.
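
To make the three null models concrete, the sketch below computes an edge weight as observed double-mutant fitness minus the fitness expected under each model. The formulas are the forms commonly attributed to Mani et al. and are an assumption here, since the abstract does not state them; the log form in particular should be checked against that paper. The function names and the convention that wild-type fitness equals 1 are illustrative, not taken from the authors' code.

```python
import math


def expected_fitness(wx: float, wy: float, model: str) -> float:
    """Expected double-mutant fitness from single-mutant fitnesses wx, wy.

    Assumed forms of the null models (wild-type fitness taken as 1):
      min  : min(wx, wy)
      mult : wx * wy
      log  : log2((2**wx - 1) * (2**wy - 1) + 1)
    """
    if model == "min":
        return min(wx, wy)
    if model == "mult":
        return wx * wy
    if model == "log":
        return math.log2((2 ** wx - 1) * (2 ** wy - 1) + 1)
    raise ValueError(f"unknown null model: {model}")


def interaction_weight(wx: float, wy: float, wxy: float, model: str) -> float:
    """Signed edge weight: observed minus expected double-mutant fitness.

    Negative values mean the double mutant is sicker than expected,
    positive values mean it is healthier than expected.
    """
    return wxy - expected_fitness(wx, wy, model)


if __name__ == "__main__":
    # Hypothetical fitness values, for illustration only.
    wx, wy, wxy = 0.8, 0.5, 0.4
    for model in ("min", "mult", "log"):
        print(model, round(interaction_weight(wx, wy, wxy, model), 4))
```

With these hypothetical values the same double knockout is scored as a negative interaction under min and as neutral under mult, which is the kind of disagreement between weighting schemes that can change the structure a network-level method sees.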

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
(A) Example BPM graph motif in the original Kelley and Ideker (unweighted) setting. Here nodes a-e form compensatory pathway 1 and f-h form compensatory pathway 2; completing either pathway 1 or pathway 2 is necessary for the cell to be viable. Solid lines are physical interaction edges, and dashed lines are synthetic lethality edges. Note that pathway 1 also has internal redundancy (only one of paralogs b and c is necessary to complete pathway 1), so interactions between b or c and the nodes of pathway 2 are not lethal. We have colored the true synthetic lethality relation from g to e red to represent that this pair was perhaps not tested, so that edge is missing from our data. (B) In the weighted setting, we consider genetic interaction edges only. Edges have both positive (purple) and negative (black) genetic interaction weights, with line strength indicating the relative magnitude of the edge weight. Within each pathway we have a near clique of low-weight positive interaction edges. The red edge indicates that the edge from e to c is erroneously reported as missing, because it was not tested or fell below the noise threshold. In this setting, the pattern of negative interaction edges shows that deleting a, d, or e together with a pathway 2 gene gives a synthetic growth defect (a negative weight of smaller magnitude than the lethality edges in A), but deleting b or c together with a pathway 2 gene gives growth similar to wild type.
Fig. 2
When the weights are squared, the number of gBPM modules increases as the consistency parameter c is reduced from 90 to 70. Min produces the fewest gBPM modules, and almost none when the weights are not squared. When the weights are not squared, there are more gBPM modules at [formula not shown] and 80 for both mult and log, but many fewer than result from squaring the weights when [formula not shown].
Fig. 3
The proportion of enriched modules is consistently improved by squaring the weights for each of the mult, min, and log weighting schemes. Discounting min (not squared), which produces at most 4 modules, min (squared) has the largest proportion of gBPMs with both pathways enriched for some known function (blue plus orange bars). On the other hand, mult (squared) with [formula not shown] has the smallest proportion of gBPMs with neither module enriched for a known function (red bar): over 90% of the modules returned by this method have at least one component pathway functionally enriched.
Fig. 4
Discounting min (not squared), which produces only 4 modules in total, only mult (squared) at [formula not shown] and min (squared) at [formula not shown] and [formula not shown] produce module collections from their gBPMs that are over 90% enriched for known function. All weights and c values we tested produced module collections that were at least 50% enriched for known function. Squaring the weights for mult and log at the same c value improves the proportion of enriched modules across the board.
Fig. 5
Average module size across all choices of weighting scheme, squaring, and c parameter lies between 4 and 12 genes. Unsurprisingly, [formula not shown] gives the smallest average module size for all weighting schemes (again ignoring min (not squared), which produces very few gBPMs in total).
Fig. 6
The gBPM component pathways produced by mult with squared weights and [formula not shown] correlate significantly more often with gene expression than random gene sets.
Fig. 7
The gBPM component pathways produced by mult with squared weights and [formula not shown] correlate significantly more often with gene expression than random gene sets.
Fig. 8
The gBPM component pathways produced by mult with unsquared weights and [formula not shown] look very close to random gene sets in terms of their coexpression correlation.
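
The comparison against random gene sets in Figs. 6-8 can be illustrated with a small permutation-style sketch. This page does not describe the test the authors used, so everything below is an assumption: the coexpression score (mean pairwise Pearson correlation), the empirical p-value, and all function and variable names are hypothetical. The `expression` argument is assumed to map each gene name to a list of expression measurements of equal length, and a module must contain at least two genes.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)


def mean_pairwise_correlation(genes, expression):
    """Average Pearson correlation over all gene pairs in the set."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(expression[a], expression[b]) for a, b in pairs) / len(pairs)


def empirical_p_value(module, expression, n_random=1000, seed=0):
    """Fraction of same-size random gene sets scoring at least as high.

    Uses the (hits + 1) / (n_random + 1) convention so the estimate
    is never exactly zero.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_correlation(module, expression)
    universe = sorted(expression)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(universe, len(module))
        if mean_pairwise_correlation(sample, expression) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)
```

Under this sketch, a collection of component pathways that "looks very close to random" would yield p-values spread roughly uniformly, while a coexpressed collection would yield many small ones.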

References

    1. Winzeler, E. A. et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285, 901–906 (1999).
    2. Tong, A. H. Y. et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294, 2364–2368 (2001).
    3. Kelley, R. & Ideker, T. Systematic interpretation of genetic interactions using protein networks. Nat. Biotechnol. 23, 561–566 (2005).
    4. Brady, A., Maxwell, K., Daniels, N. & Cowen, L. J. Fault tolerance in protein interaction networks: stable bipartite subgraphs and redundant pathways. PLoS ONE 4, e5364 (2009).
    5. Ulitsky, I. & Shamir, R. Pathway redundancy and protein essentiality revealed in the Saccharomyces cerevisiae interaction networks. Mol. Syst. Biol. 3, 104 (2007).
