Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2019 Jan 15;20(1):27.
doi: 10.1186/s12859-018-2483-9.

Optimising orbit counting of arbitrary order by equation selection

Affiliations

Optimising orbit counting of arbitrary order by equation selection

Ine Melckenbeeck et al. BMC Bioinformatics. .

Abstract

Background: Graphlets are useful for bioinformatics network analysis. Based on the structure of Hočevar and Demšar's ORCA algorithm, we have created an orbit counting algorithm, named Jesse. This algorithm, like ORCA, uses equations to count the orbits, but unlike ORCA it can count graphlets of any order. To do so, it generates the required internal structures and equations automatically. Many more redundant equations are generated, however, and Jesse's running time is highly dependent on which of these equations are used. Therefore, this paper aims to investigate which equations are most efficient, and which factors have an effect on this efficiency.

Results: With appropriate equation selection, Jesse's running time may be reduced by a factor of up to 2 in the best case, compared to using randomly selected equations. Which equations are most efficient depends on the density of the graph, but barely on the graph type. At low graph density, equations with terms in their right-hand side with few arguments are more efficient, whereas at high density, equations with terms with many arguments in the right-hand side are most efficient. At a density between 0.6 and 0.7, both types of equations are about equally efficient.

Conclusions: Our Jesse algorithm became up to a factor 2 more efficient, by automatically selecting the best equations based on graph density. It was adapted into a Cytoscape App that is freely available from the Cytoscape App Store to ease application by bioinformaticians.

Keywords: Cytoscape app; Equations; Graph theory; Graphlets; Optimisation; Orbits.

PubMed Disclaimer

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Fig. 1
All graphlets of 2-5 nodes with their original numbering. Within each graphlet, grayscales show the graphlets’ orbits. Orbit numbers are shown close to a node of that orbit. Figure adapted from [10]
Fig. 2
Fig. 2
All datapoints from all runs of the speed vs number of lookups test. The datapoints are in grayscale, according to the set of equations that was used to generate them. On the x axis: the number of lookups done, on the y axis: the running time in seconds. The datapoints were fitted with a power trend line of f(x)=7.72∗10−8x1.03
Fig. 3
Fig. 3
The running time of Jesse, counting 5-graphlets in ER, BA and GEO graphs of different orders and densities. Three different sets of equations were tested: randomly chosen equations, equations with RHS terms with fewer arguments and equations with RHS terms with more arguments. All experiments were repeated 20 times. The error bars show the standard deviation of all runs
Fig. 4
Fig. 4
The relative running time of Jesse, counting 5-graphlets in ER, BA and GEO graphs of different orders and densities. The speed of equations with RHS terms with more or fewer arguments relative to random equations is shown, as compared to the running time with random equations. All experiments were repeated 20 times. The error bars show the standard deviation of all runs
Fig. 5
Fig. 5
The Jesse interface in Cytoscape. a Search Options panel, with basic options (graphlet order, search network, only count graphlets for selected nodes). b File Options panel, used to select orbit and tree files for graphlets of order greater than 7. c File Generation panel, used to generate orbit and tree files for graphlets of order greater than 7. d Run button to start the algorithm. e Results panel, where the graphlet degrees of each node can be seen. Selecting a node in the network view will also select it in the results panel, and vice versa, as indicated by the arrow. Clicking on a column header will visualise the corresponding orbit. f Buttons to save the used equations and computed graphlet degrees

References

    1. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs: simple building blocks of complex networks. Science (New York, N.Y.) 2002;298(5594):824–7. doi: 10.1126/science.298.5594.824. - DOI - PubMed
    1. Wernicke S, Rasche F. FANMOD: a tool for fast network motif detection. Bioinformatics (Oxford, England) 2006;22(9):1152–3. doi: 10.1093/bioinformatics/btl038. - DOI - PubMed
    1. Lin W, Xiao X, Xie X, Li X-lL. Network motif discovery: A GPU approach. In: 2015 IEEE 31st International Conference on Data Engineering. IEEE: 2015. p. 831–42.
    1. Pržulj N, Corneil DG, Jurisica I. Modeling interactome: scale-free or geometric? Bioinformatics (Oxford, England) 2004;20(18):3508–15. doi: 10.1093/bioinformatics/bth436. - DOI - PubMed
    1. Melckenbeeck I, Audenaert P, Colle D, Pickavet M. Efficiently counting all orbits of graphlets of any order in a graph using autogenerated equations. Bioinformatics (Oxford, England) 2017;34(1372):758. - PubMed