Nat Methods. 2025 Feb;22(2):269-272.
doi: 10.1038/s41592-024-02552-8. Epub 2025 Jan 3.

Orthology inference at scale with FastOMA

Sina Majidian et al. Nat Methods. 2025 Feb.

Abstract

The surge in genome data, with ongoing efforts aiming to sequence 1.5 million eukaryotes in a decade, could revolutionize genomics, revealing the origins, evolution and genetic innovations of biological processes. Yet traditional genomics methods scale poorly with such large datasets. To address this, FastOMA provides linear scalability for orthology inference, enabling the processing of thousands of eukaryotic genomes within a day. In benchmarks, FastOMA maintains the high accuracy and resolution of the well-established Orthologous Matrix (OMA) approach. FastOMA is available via GitHub at https://github.com/DessimozLab/FastOMA/.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. FastOMA algorithm overview.
Input proteomes are mapped to reference gene families using the OMAmer software, forming hierarchical orthologous groups (HOGs) at the root level (rootHOGs); see Methods. HOGs are then inferred using a ‘bottom-up’ approach, starting from the leaves of the species tree and moving towards the root. At each taxonomic level, HOGs from the child level are merged, resulting in HOGs at the current level. To decide which HOGs should be merged, sequences from the child HOGs are used to create a multiple sequence alignment (MSA), followed by gene tree inference to identify speciation and duplication events. Child HOGs are merged if their genes evolved through speciation (see Methods and Supplementary Information 1 for details). Credit: human silhouette, T. Michael Keesey (Public Domain Mark 1.0); chimpanzee silhouette, Jonathan Lawley (CC0 1.0 Universal); mouse silhouette, Soledad Miranda-Rottman (CC BY 3.0), PhyloPic.
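The bottom-up merging described in this caption can be illustrated with a minimal Python sketch. It is not the FastOMA implementation: the `Node` class, the `infer_hogs` function and the toy gene names are hypothetical, and the MSA and gene tree inference step is replaced by a placeholder predicate (`same_label`) that simply declares two HOGs related by speciation when their gene names share a suffix.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List

HOG = FrozenSet[str]  # a HOG is modeled here as a set of gene identifiers


@dataclass
class Node:
    """A species-tree node; only leaves (extant species) carry genes."""
    name: str
    children: List["Node"] = field(default_factory=list)
    genes: List[str] = field(default_factory=list)


def infer_hogs(node: Node, related_by_speciation: Callable[[HOG, HOG], bool]) -> List[HOG]:
    """Post-order traversal: return the HOGs defined at this taxonomic level."""
    if not node.children:
        # At a leaf, every gene starts as its own single-gene HOG.
        return [frozenset([g]) for g in node.genes]

    # Collect the HOGs already inferred at the child levels.
    child_hogs = [h for c in node.children
                  for h in infer_hogs(c, related_by_speciation)]

    # Merge child HOGs that the predicate links; others stay separate.
    merged: List[HOG] = []
    for hog in child_hogs:
        linked = [m for m in merged if related_by_speciation(m, hog)]
        for m in linked:
            merged.remove(m)
            hog = hog | m
        merged.append(hog)
    return merged


def same_label(a: HOG, b: HOG) -> bool:
    """Placeholder (assumption) for the MSA + gene-tree decision:
    HOGs are linked if any of their genes share the suffix after '_'."""
    labels = lambda hog: {g.split("_")[1] for g in hog}
    return bool(labels(a) & labels(b))


# Toy species tree: ((human, chimp), mouse)
human = Node("human", genes=["HUMAN_a", "HUMAN_b"])
chimp = Node("chimp", genes=["CHIMP_a", "CHIMP_b"])
mouse = Node("mouse", genes=["MOUSE_a"])
root = Node("root", children=[Node("Hominini", children=[human, chimp]), mouse])

root_hogs = infer_hogs(root, same_label)
for hog in root_hogs:
    print(sorted(hog))
```

In this toy case the genes with suffix `a` end up in one root-level HOG spanning all three species, while the `b` genes form a HOG restricted to human and chimpanzee. In the method itself, the merge decision comes from the inferred gene tree rather than from gene names.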
Fig. 2. FastOMA is not only fast but also accurate.
a, QfO benchmark: agreement with the SwissTree reference phylogeny, covering 19 manually curated gene trees. The error bars indicate 95% confidence intervals. FastOMA is compared with EnsemblCompara, Domainoid, OrthoMCL, OrthoInspector, SonicParanoid, PANTHER, OrthoFinder, Hieranoid and the OMA family, including OMA pairs, OMA groups and OMA GETHOGs (graph-based efficient technique for HOGs). b, QfO benchmarking of the generalized species discordance test on the Eukaryota clade, where the gene tree inferred from orthologous genes is compared with the reference species tree, considering up to 3,000 gene trees per method (see Supplementary Information 2.1 for details). c, A computation time comparison of FastOMA and state-of-the-art alternatives. d, The impact of species tree resolution on the complexity of the gene family evolutionary scenario (proxied by the number of gene losses over the gene family history). Each point represents a gene family (a rootHOG), whereby the size of a gene family corresponds to the number of genes in it (the figure is truncated to focus on the most relevant region; see Supplementary Fig. 24 for a version with all data, and see Methods for the implied losses calculation).
