Sci Rep. 2025 Aug 18;15(1):30169.

doi: 10.1038/s41598-025-16353-2.

Exploring weighting schemes for the discovery of informative generalized between pathway models to uncover pathways in genetic interaction networks

Kevin M Yu¹, Lenore J Cowen²

Affiliations

¹ Department of Computer Science, Tufts University, Medford, MA, 02155, USA.
² Department of Computer Science, Tufts University, Medford, MA, 02155, USA. cowen@cs.tufts.edu.

PMID: 40825844
PMCID: PMC12361521
DOI: 10.1038/s41598-025-16353-2

Kevin M Yu et al. Sci Rep. 2025.

Abstract

A large and rich collection of epistasis data has been gathered for S. cerevisiae. When these data come from double knockouts, they have a natural representation as a signed, weighted graph, where the weight on an edge is computed from how far the sickness or health of the double-deletion mutant deviates from what is expected given its constituent single-deletion mutants. Different probabilistic null models (minimum, multiplicative, and logarithmic) for setting edge weights were studied empirically by Mani et al., whose goal was to determine the best weighting scheme for detecting the presence or absence of an epistatic effect in an individual double knockout in isolation. In contrast, approaches such as the LocalCut algorithm of Leiserson et al. look at the entire network and search for graph-theoretic structure indicative of compensatory pathways. The effect of different edge weighting schemes on the biological pathways returned by algorithms such as LocalCut has not previously been studied. We compare the generalized Between Pathway Models (gBPMs) produced by LocalCut under several different ways of calculating edge weights and analyze the resulting collections of putative redundant pathways. We recover some known pathways, find some interesting new pathways, and give broad recommendations for how to set the parameters of LocalCut to produce the most biologically relevant gene sets.
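To make the three null models concrete, the sketch below computes an edge weight as the observed double-mutant fitness minus the fitness expected under each model. The formulas are the commonly stated forms of the minimum, multiplicative, and logarithmic definitions and are an assumption here; the abstract does not give the exact expressions used in the paper. The function names, the `square` option, and the choice to keep the sign when squaring are likewise illustrative assumptions, not the authors' implementation.

```python
import math

# Expected double-mutant fitness under each null model, given the two
# single-mutant fitness values wx and wy (wild type = 1.0).
# ASSUMPTION: these are the commonly stated forms of the definitions;
# the paper's exact formulas may differ.
def expected_min(wx, wy):
    return min(wx, wy)

def expected_mult(wx, wy):
    return wx * wy

def expected_log(wx, wy):
    # Requires (2**wx - 1) * (2**wy - 1) + 1 > 0, which holds for
    # non-negative fitness values.
    return math.log2((2 ** wx - 1) * (2 ** wy - 1) + 1)

NULL_MODELS = {"min": expected_min, "mult": expected_mult, "log": expected_log}

def edge_weight(wx, wy, wxy, model="mult", square=False):
    """Signed edge weight: observed minus expected double-mutant fitness.

    Negative values indicate a sicker-than-expected double mutant,
    positive values a healthier-than-expected one.
    """
    eps = wxy - NULL_MODELS[model](wx, wy)
    if square:
        # ASSUMPTION: squaring keeps the sign of the interaction so the
        # graph stays signed; this detail is not stated in the abstract.
        eps = math.copysign(eps * eps, eps)
    return eps

# Toy fitness values (hypothetical, not from the data set).
for name in NULL_MODELS:
    print(name, edge_weight(0.8, 0.5, 0.4, model=name))
```

With the same three fitness measurements, the models can disagree about whether an interaction is present at all: here the multiplicative model reports no deviation while the minimum model reports a negative one, which is why the choice of weighting scheme can change what a network-level algorithm finds.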

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
(A) Example BPM graph motif in the original (unweighted) setting of Kelley and Ideker. Here nodes a-e form compensatory pathway 1 and nodes f-h form compensatory pathway 2; completing either pathway 1 or pathway 2 is necessary for the cell to be viable. Solid lines are physical interaction edges, and dashed lines are synthetic lethality edges. Note that pathway 1 also has internal redundancy (only one of the paralogs b and c is necessary to complete pathway 1), so interactions between b or c and the nodes of pathway 2 are not lethal. We have colored the true synthetic lethality relation from g to e red to represent that this pair may not have been tested, so that edge is missing from our data. (B) In the weighted setting, we consider genetic interaction edges only. Edges have both positive (purple) and negative (black) genetic interaction weights, with the strength of the line indicating the relative magnitude of the edge weight. Within each pathway we have a near clique of low-weight positive interaction edges. The red edge indicates that the edge from e to c is erroneously reported as missing because it was not tested or fell below the noise threshold. In this setting, the pattern of negative interaction edges shows that deleting a, d, or e together with a pathway 2 gene gives a synthetic growth defect (a negative weight of smaller magnitude than the lethality edges in A), but deleting b or c together with a pathway 2 gene gives growth similar to wild type.

**Fig. 2**
When the weights are squared, the number of gBPM modules increases as the consistency parameter c is reduced from 90 to 70. Min produces the fewest gBPM modules, and almost none when the weights are not squared. When the weights are not squared, there is a larger number of gBPM modules with […] and 80 for both mult and log, but many fewer gBPM modules than result from squaring the weights when […].

**Fig. 3**
The proportion of enriched modules is consistently improved with squared weights for each of the mult, min, and log weighting schemes. Discounting min (not squared), which produces at most 4 modules, min (squared) has the largest proportion of gBPMs with both pathways enriched for some known function (blue plus orange bars). On the other hand, mult (squared) with […] has the smallest proportion of gBPMs with neither module enriched for a known function (red bar): over 90% of the modules returned by this method have at least one component pathway functionally enriched.

**Fig. 4**
Discounting min (not squared), which produces only 4 modules in total, only mult (squared) at […] and min (squared) at […] and […] produce module collections from their gBPMs that are over 90% enriched for known function. All weights and c values we tested produced module collections that were at least 50% enriched for known function. Squaring the weights for mult and log at the same c value improves the proportion of enriched modules across the board.

**Fig. 5**
The average module size across all choices of weighting scheme, squaring, and c parameter lies between 4 and 12 genes. Unsurprisingly, for all weighting schemes, […] gives the smallest average module size (again ignoring min (not squared), which produces very few gBPMs in total).

**Fig. 6**
The gBPM component pathways produced by mult with squared weights and […] correlate significantly more often with gene expression than random gene sets.

**Fig. 7**
The gBPM component pathways produced by mult with squared weights and […] correlate significantly more often with gene expression than random gene sets.

**Fig. 8**
The gBPM component pathways produced by mult with unsquared weights and […] look very close to random gene sets in terms of their coexpression correlation.
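The weighted motif described for Fig. 1B (mostly positive weights within each pathway, mostly negative weights between the two pathways) can be illustrated with a small scoring sketch. This is only a toy measure of how well a candidate pair of gene sets matches that sign pattern. It is not LocalCut's objective function, and it does not implement the consistency parameter c. The function name, the dictionary-based graph representation, and the toy weights are hypothetical.

```python
from itertools import combinations

def sign_pattern_score(weights, pathway1, pathway2):
    """Sum of within-pathway weights minus sum of between-pathway weights.

    `weights` maps frozenset({gene_u, gene_v}) to a signed interaction
    weight. Untested pairs are simply absent and contribute 0.
    A higher score means the candidate looks more like the Fig. 1B motif.
    """
    within = 0.0
    for group in (pathway1, pathway2):
        for u, v in combinations(sorted(group), 2):
            within += weights.get(frozenset((u, v)), 0.0)
    between = 0.0
    for u in pathway1:
        for v in pathway2:
            between += weights.get(frozenset((u, v)), 0.0)
    return within - between

# Toy signed graph (hypothetical values); the pair b-f was never tested.
toy_weights = {
    frozenset(("a", "b")): 0.2,
    frozenset(("f", "g")): 0.1,
    frozenset(("a", "f")): -0.8,
    frozenset(("a", "g")): -0.6,
}

good_split = sign_pattern_score(toy_weights, {"a", "b"}, {"f", "g"})
bad_split = sign_pattern_score(toy_weights, {"a", "f"}, {"b", "g"})
print(good_split, bad_split)
```

Treating untested pairs as zero means a missing edge neither rewards nor penalizes a candidate, which mirrors the caption's point that some true interactions are absent from the data.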

References

1. Winzeler, E. A. et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285, 901–906 (1999).
2. Tong, A. H. Y. et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294, 2364–2368 (2001).
3. Kelley, R. & Ideker, T. Systematic interpretation of genetic interactions using protein networks. Nat. Biotechnol. 23, 561–566 (2005).
4. Brady, A., Maxwell, K., Daniels, N. & Cowen, L. J. Fault tolerance in protein interaction networks: stable bipartite subgraphs and redundant pathways. PLoS ONE 4, e5364 (2009).
5. Ulitsky, I. & Shamir, R. Pathway redundancy and protein essentiality revealed in the Saccharomyces cerevisiae interaction networks. Mol. Syst. Biol. 3, 104 (2007).

Grants and funding

2149871/National Science Foundation
