Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2013 Apr 19;8(4):e61183.
doi: 10.1371/journal.pone.0061183. Print 2013.

The index-based subgraph matching algorithm (ISMA): fast subgraph enumeration in large networks using optimized search trees

Affiliations

The index-based subgraph matching algorithm (ISMA): fast subgraph enumeration in large networks using optimized search trees

Sofie Demeyer et al. PLoS One. .

Abstract

Subgraph matching algorithms are designed to find all instances of predefined subgraphs in a large graph or network and play an important role in the discovery and analysis of so-called network motifs, subgraph patterns which occur more often than expected by chance. We present the index-based subgraph matching algorithm (ISMA), a novel tree-based algorithm. ISMA realizes a speedup compared to existing algorithms by carefully selecting the order in which the nodes of a query subgraph are investigated. In order to achieve this, we developed a number of data structures and maximally exploited symmetry characteristics of the subgraph. We compared ISMA to a naive recursive tree-based algorithm and to a number of well-known subgraph matching algorithms. Our algorithm outperforms the other algorithms, especially on large networks and with large query subgraphs. An implementation of ISMA in Java is freely available at http://sourceforge.net/projects/isma/.

PubMed Disclaimer

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1
Figure 1. The motif adjacency Matrix.
In this article, motifs are denoted by a motif specification which can be deduced from its adjacency matrix as indicated by the red arrow.
Figure 2
Figure 2. Examples of motifs and their specifications.
Here nodes are denoted by their index. Look at for example the motif AAAB000B0A00BAA. Its motif specification can be deduced as follows: a directed link of type formula image from node 1 to node 2, a directed link of type formula image from node 1 to node 3, a directed link of type formula image from node 2 to node 3, a directed link of type formula image from node 1 to node 4, no link between node 2 and node 4, no link between node 3 and node 4, no link between node 1 and node 5, a directed link of type formula image from node 2 to node 5, no link between node 3 and node 5, a directed link of type formula image from node 4 to node 5, no link between node 1 and node 6, no link between node 2 and node 6, a directed link of type formula image from node 3 to node 6, a directed link of type formula image from node 4 to node 6, a directed link of type formula image from node 5 to node 6.
Figure 3
Figure 3. Example motif and network.
The motif (left) which is searched for in the example network (right). Links of type A or B are directed; links of type C are undirected.
Figure 4
Figure 4. Search tree.
The search tree of the ISMA algorithm (left) and the standard recursive algorithm (right) applied to the example network. A search tree indicates which network nodes have been mapped on the motif nodes.
Figure 5
Figure 5. Data Structures.
(a) The checklist. In the ISMA algorithm, the circles represent motif nodes (b) The motif iterator. (c) The priority queue map. (d) The priority object. It is assumed that the motif has k nodes.
Figure 6
Figure 6. Example of data structures.
We are looking for a 4-node motif. In the motif instance (a) network node 9 is mapped on motif node 2 and network node 5 is mapped on motif node 4. These motif nodes were added (in the correct order) to the chosen list of the checklist (b), while the other two motif nodes (1 and 3) remain in the rest set. The motif iterator (c) contains two iterators that are of importance, namely the ones for the motif nodes of the chosen list. These iterate over the possible network nodes that can be mapped on the motif nodes. The priorityqueuemap (d) only contains valuable priority queues for the motif nodes 1 and 3. Each priority queue contains a priorityobject for each network node that is already mapped in the instance.
Figure 7
Figure 7. Examples of reflection and cyclic rotation symmetries.
The motif on the left has a reflection symmetry between nodes 2 and 3. The motif on the right has a cyclic rotation symmetry between the three nodes.
Figure 8
Figure 8. Reflection symmetry.
Enumeration of all possibilities to map 5 network nodes on 2 reflection symmetric motif nodes. The squares represent motif nodes, the circles represent network nodes. Once a network node has been mapped on a motif node that is part of a reflection symmetry, it will never be mapped on one of the other nodes of the symmetry.
Figure 9
Figure 9. Cyclic rotation symmetry.
Enumeration of all possibilities to map 5 network nodes on 3 motif nodes that are part of a cyclic rotation. The squares represent motif nodes, the circles represent network nodes. Once a network node is mapped on the ‘first’ motif of the symmetry, it will never be mapped on the other nodes of the symmetry. For the other motif nodes of the symmetry, all possibilities still need to be explored.
Figure 10
Figure 10. Execution times.
Comparison of the execution times (in ms) of the RSMA and the ISMA algorithm for finding 3-node motifs in the PGS network.
Figure 11
Figure 11. Size search tree.
Comparison of the number of nodes in the search tree of the RSMA and the ISMA algorithm for finding 3-node motifs in the PGS network. The search tree reduction factor is defined as the size of the search tree of RSMA divided by the size of the search tree of ISMA.

Similar articles

Cited by

References

    1. Jasny B, Zahn L, Marshall E (2009) Connections. Science (New York, NY) 325: 405. - PubMed
    1. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, et al. (2002) Network motifs: simple building blocks of complex networks. Science 298: 824–827. - PubMed
    1. Alon U (2007) Network motifs: theory and experimental approaches. Nat Rev Genet 8: 450–461. - PubMed
    1. Kashtan N, Itzkovitz S, Milo R, Alon U (2004) Topological generalizations of network motifs. Phys Rev E 70: 031909. - PubMed
    1. Dobrin R, Beg QK, Barabási AL, Oltvai ZN (2004) Aggregation of topological motifs in the Es- cherichia coli transcription regulatory network. BMC Bioinformatics 5: 10. - PMC - PubMed

Publication types

LinkOut - more resources