CompositeSearch: A Generalized Network Approach for Composite Gene Families Detection
- PMID: 29092069
- PMCID: PMC5850286
- DOI: 10.1093/molbev/msx283
Abstract
Genes evolve by point mutations, but also by shuffling, fusion, and fission of genetic fragments. Similarity between two sequences can therefore result from common ancestry (homology), from partial sharing of component fragments, or from both. Disentangling these processes is especially challenging in large molecular data sets because of the computational time required. In this article, we present CompositeSearch, a memory-efficient, fast, and scalable method to detect composite gene families in large data sets (typically in the range of several million sequences). CompositeSearch generalizes the use of similarity networks to detect composite and component gene families with greater recall, accuracy, and precision than recent programs (FusedTriplets and MosaicFinder). Moreover, CompositeSearch provides user-friendly quality descriptions of the distribution and primary sequence conservation of these gene families, allowing critical biological analyses of these data.
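To make the network idea concrete, the following is a minimal sketch of triplet-style composite detection on a sequence similarity network. It is not the CompositeSearch algorithm itself. It assumes a common working definition: a gene is a candidate composite if it aligns to two other genes over essentially non-overlapping regions while those two genes show no similarity to each other. The function name `find_composites`, the hit tuple layout, and the `max_overlap` threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def find_composites(hits, max_overlap=20):
    """Flag candidate composite genes from pairwise similarity hits.

    hits: iterable of (query, subject, q_start, q_end) tuples, where
    q_start/q_end are the aligned coordinates on the query (assumed layout).
    Returns a dict mapping each candidate composite gene to the set of
    component pairs that support it.
    """
    similar = set()   # undirected edges of the similarity network
    by_query = {}     # gene -> list of (partner, start, end) on that gene
    for query, subject, q_start, q_end in hits:
        similar.add(frozenset((query, subject)))
        by_query.setdefault(query, []).append((subject, q_start, q_end))

    composites = {}
    for gene, aligned in by_query.items():
        for (a, a_start, a_end), (b, b_start, b_end) in combinations(aligned, 2):
            # Components must be distinct and must not be similar to each other.
            if a == b or frozenset((a, b)) in similar:
                continue
            # Length of the region of `gene` covered by both alignments
            # (negative when the two regions are disjoint).
            overlap = min(a_end, b_end) - max(a_start, b_start) + 1
            if overlap <= max_overlap:
                composites.setdefault(gene, set()).add(frozenset((a, b)))
    return composites


# Toy network: C aligns to A on residues 1-100 and to B on residues 120-250,
# and there is no A-B hit, so C is flagged as a candidate composite.
example_hits = [
    ("C", "A", 1, 100),
    ("C", "B", 120, 250),
    ("A", "C", 1, 100),
    ("B", "C", 1, 130),
]
example_result = find_composites(example_hits)
```

This sketch works on individual genes, whereas the abstract describes detection at the level of gene families; it also enumerates every pair of partners for each gene, which would not scale to the several million sequences mentioned above.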
Keywords: bioinformatics; evolution; molecular evolution; network analysis; protein sequence analysis.
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
