BMC Syst Biol. 2017 Mar 14;11(Suppl 2):8.
doi: 10.1186/s12918-017-0388-2.

Prior knowledge guided active modules identification: an integrated multi-objective approach

Weiqi Chen et al. BMC Syst Biol. 2017.

Abstract

Background: An active module, defined as a region of a biological network that shows striking changes in molecular activity or phenotypic signatures, is important for revealing dynamic and process-specific information that is correlated with cellular or disease states.

Methods: A prior-information-guided active module identification approach is proposed to detect modules that are both active and enriched by prior knowledge. We formulate active module identification as a multi-objective optimisation problem with two conflicting objective functions: maximising the coverage of known biological pathways and maximising the activity of the module. The network is constructed from a protein-protein interaction database. A beta-uniform-mixture (BUM) model is used to estimate the distribution of p-values and to generate activity scores from microarray data. A multi-objective evolutionary algorithm is used to search for Pareto optimal solutions. We also incorporate a novel constraint based on algebraic connectivity to ensure the connectedness of the identified active modules.
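The connectedness constraint can be illustrated with a minimal sketch. It relies on a standard graph-theory fact: an undirected graph is connected exactly when the second-smallest eigenvalue of its Laplacian (the algebraic connectivity) is positive. The function names, the tolerance, and the toy adjacency matrices below are assumptions for illustration; the abstract does not say how the authors encode the constraint inside their algorithm.

```python
import numpy as np

def algebraic_connectivity(adjacency):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A.

    `adjacency` is a symmetric 0/1 matrix for an undirected module
    with at least two nodes (hypothetical input format).
    """
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return eigenvalues[1]

def is_connected(adjacency, tol=1e-9):
    """A module is connected iff its algebraic connectivity is positive."""
    a = np.asarray(adjacency)
    if a.shape[0] < 2:
        return True  # a single node is trivially connected
    return algebraic_connectivity(a) > tol

# Toy candidate modules: a three-node path and a graph with an isolated node
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
split = [[0, 1, 0],
         [1, 0, 0],
         [0, 0, 0]]

print(is_connected(path))   # True
print(is_connected(split))  # False
```

In a search procedure, a check of this kind could be used to reject or penalise candidate modules that break into several components.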

Results: Application of the proposed algorithm to a small yeast molecular network shows that it can identify modules with high activity and with more cross-talk nodes between related functional groups. The Pareto solutions generated by the algorithm offer different trade-offs between prior knowledge and novel information from the data. The approach is then applied to microarray data from diclofenac-treated yeast cells to build a network and identify modules that elucidate the molecular mechanisms of diclofenac toxicity and resistance. Gene ontology analysis is applied to the identified modules for biological interpretation.
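What a set of Pareto solutions with different trade-offs means can be shown with a small sketch. It filters candidate modules, each scored on two objectives to be maximised, down to the non-dominated ones. The score pairs are invented for illustration (read them as hypothetical activity and pathway-coverage values), and the code is only the dominance filter, not the evolutionary algorithm used in the paper.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a pair (objective_1, objective_2); both are maximised.
    A point q dominates p if q is at least as good on both objectives
    and strictly better on at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical (activity, pathway coverage) scores for five candidate modules
candidates = [(0.9, 0.1), (0.6, 0.5), (0.5, 0.4), (0.2, 0.8), (0.1, 0.7)]

print(pareto_front(candidates))
# [(0.9, 0.1), (0.6, 0.5), (0.2, 0.8)]
```

Each surviving point is a different compromise: the first favours activity alone, the last favours agreement with prior knowledge, and the middle one balances the two.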

Conclusions: Integrating knowledge of functional groups into active module identification is effective and gives flexible control over the balance between a purely data-driven method and guidance from prior information.

Keywords: Active module identification; Multi-objective evolutionary algorithm; Prior knowledge.

Figures

Fig. 1
BUM model estimation on p-values in network 1. Left: histogram of p-values with the fitted beta-uniform-mixture model distribution. The blue line indicates the uniformly distributed noise and the red line the signal, modelled as a beta distribution B(a,1). Right: Q-Q plot of the fitted distribution versus the empirical p-values for network 1
Fig. 2
Network 1 with active modules detected by jActiveModule. Each node denotes one gene. Node color is a continuous mapping of the p-value generated from differential expression analysis. Red indicates a significant change with a small p-value and green indicates no significant difference. The point at which the color changes between red and green is set to the threshold τ = 1.76×10⁻⁴, which is used as a parameter of the proposed algorithm. Nodes near this changing point are white. Modules identified by jActiveModule are highlighted with a black node border. Modules may overlap with each other
Fig. 3
Visualization of Module 1, with maximized active score S_A, detected by the proposed algorithm in network 1. Node color and border are set as in Fig. 2. The module contains the majority of the densely connected red nodes, indicating high activity. Compared with the small, separated modules identified by jActiveModule in Fig. 2, this module tends to connect small areas of red nodes by including linking nodes with white or light green color. Although these intermediate nodes show only modest changes in expression, they serve as bridges for cross-talk between functional groups, or as transcription factors that regulate other genes
Fig. 4
Visualization of Module 2, the knee point of the Pareto front with the optimal trade-off between S_A and R_A, detected by the proposed algorithm in network 1. Node color and border are set as in Fig. 2. Compared with Fig. 3, this module extends more broadly as R_A increases
Fig. 5
BUM model estimation on network 2. Histogram of p-values with fitted BUM model and a Q-Q plot of estimated and empirical distribution of p-values for network 2. As the network size increases, estimation becomes more accurate
Fig. 6
Visualization of Module 3 identified by the proposed algorithm in network 2. Each node represents a gene. Node color is set as in Fig. 2. The turning point between red and green is set to τ = 7.71×10⁻⁶. The three rectangular nodes with black borders are genes involved in drug export and appear consistently in all modules. YDR406W is an ATP-binding cassette multidrug transporter. YDR011W is an ATP-binding cassette transporter. YOR153W is also an ATP-binding cassette multidrug transporter. The three genes play an important role in yeast’s resistance to diclofenac
