Nucleic Acids Res. 2015 Jan;43(Database issue):D234-9.
doi: 10.1093/nar/gku1203. Epub 2014 Nov 27.

InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic

Erik L L Sonnhammer et al. Nucleic Acids Res. 2015 Jan.

Abstract

The InParanoid database (http://InParanoid.sbc.su.se) provides a user interface to orthologs inferred by the InParanoid algorithm. As there are now international efforts to curate and standardize complete proteomes, we have switched to using these resources rather than gathering and curating the proteomes ourselves. InParanoid release 8 is based on the 66 reference proteomes that the 'Quest for Orthologs' community has agreed on using, plus 207 additional proteomes from the UniProt complete proteomes, for a total of 273 species. These represent 246 eukaryotes, 20 bacteria and seven archaea. Compared to the previous release, this increases the number of species by 173% and the number of pairwise species comparisons by 650%. In turn, the number of ortholog groups has increased by 423%. We present the contents and usage of InParanoid 8, and a detailed analysis of how the proteome content has changed since the previous release.
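The species and pairwise-comparison percentages in the abstract can be checked against each other with a short calculation. The sketch below is illustrative only. It assumes the previous release held 100 species (a figure back-calculated from the stated 173% increase, not given in the abstract) and that one comparison is counted per unordered pair of distinct species. The names `percent_increase`, `SPECIES_V7` and `SPECIES_V8` are made up for this example.

```python
from math import comb


def percent_increase(old: int, new: int) -> float:
    """Relative growth from old to new, expressed in percent."""
    return 100.0 * (new - old) / old


# Counts stated in the abstract.
SPECIES_V8 = 66 + 207          # reference proteomes + additional UniProt proteomes
KINGDOM_TOTAL = 246 + 20 + 7   # eukaryotes + bacteria + archaea

# Assumption: the previous release size is inferred from "increases the
# number of species by 173%", i.e. 273 / (1 + 1.73).
SPECIES_V7 = round(SPECIES_V8 / 2.73)

# Assumption: one comparison per unordered pair of distinct species.
pairs_v7 = comb(SPECIES_V7, 2)
pairs_v8 = comb(SPECIES_V8, 2)

print(SPECIES_V8, KINGDOM_TOTAL)
print(round(percent_increase(SPECIES_V7, SPECIES_V8)))
print(pairs_v7, pairs_v8, round(percent_increase(pairs_v7, pairs_v8)))
```

Under these assumptions the number of comparisons grows roughly quadratically with the number of species, which is why a 173% increase in species corresponds to a much larger increase in pairwise comparisons.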

Figures

Figure 1.
Workflow for the parallel 2-pass BLAST procedure used for generating InParanoid 8. BLAST runs are launched for all pairs of proteomes, running both passes in parallel. When both passes are finished, their outputs are validated by checking for truncation or failure to complete. Intra-proteome matches are checked against the proteome sequences to ensure inclusion of all genes. Pass 1 pairs are combined with pass 2 results such that only pairs accepted in pass 1 are kept, but with alignments from pass 2. A failed validation leads either to a whole-proteome rerun (for failed or truncated results) or to individual serial pass 2 reruns (for pass 1 pairs lacking pass 2 results).
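The combination step described in this caption can be pictured as a filtered lookup. The following is a minimal sketch, not the InParanoid implementation: the function name `combine_passes`, the data layout (pairs as tuples, alignments as a dictionary) and the sample identifiers are all hypothetical, and BLAST itself is not invoked.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (query protein, hit protein); hypothetical layout


def combine_passes(
    pass1_pairs: List[Pair],
    pass2_alignments: Dict[Pair, str],
) -> Tuple[Dict[Pair, str], List[Pair]]:
    """Keep only pairs accepted in pass 1, using the pass 2 alignments.

    Returns the combined results plus the pass 1 pairs that lack a
    pass 2 result and would therefore need an individual pass 2 rerun.
    """
    combined: Dict[Pair, str] = {}
    needs_rerun: List[Pair] = []
    for pair in pass1_pairs:
        if pair in pass2_alignments:
            combined[pair] = pass2_alignments[pair]
        else:
            needs_rerun.append(pair)
    return combined, needs_rerun


# Toy data: identifiers and alignment strings are placeholders.
accepted = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]
alignments = {
    ("A1", "B1"): "aln_A1_B1",
    ("A3", "B3"): "aln_A3_B3",
    ("A9", "B9"): "aln_A9_B9",  # not accepted in pass 1, so it is dropped
}
combined, needs_rerun = combine_passes(accepted, alignments)
print(combined)
print(needs_rerun)
```

In this toy run, the pair present only in pass 2 is discarded, and the pass 1 pair without a pass 2 alignment is flagged for a rerun, mirroring the two outcomes the caption describes.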
Figure 2.
Example of online output when browsing InParanoid 8, showing the neighbor-joining tree and Pfam (20) domain architectures of the proteins in ortholog group 99 between human and soybean (Glycine max). All proteins have the same Pfam domain architecture—a Tubulin (green) and a Tubulin_C (red) domain. The tree indicates that these tubulin-alpha proteins have been duplicated many times independently in the two lineages since they diverged, giving rise to seven human and 10 soybean inparalogs. All human inparalogs are orthologous to all soybean inparalogs as they are all related via the inferred speciation event at the root of the tree.
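The all-against-all relationship stated in this caption implies a simple count of pairwise ortholog relationships within the group. As a small illustration (the protein labels below are placeholders, not real identifiers from the database):

```python
from itertools import product

# Placeholder labels for the seven human and 10 soybean inparalogs.
human = [f"human_inparalog_{i}" for i in range(1, 8)]
soybean = [f"soybean_inparalog_{i}" for i in range(1, 11)]

# Every human inparalog is orthologous to every soybean inparalog,
# so the ortholog relationships form the full Cartesian product.
ortholog_pairs = list(product(human, soybean))
print(len(ortholog_pairs))
```

Pairs within one species (human with human, soybean with soybean) are inparalogs rather than orthologs, so they are not part of this product.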
Figure 3.
Scatterplot of the number of inparalogs between species pairs in InParanoid 8 and InParanoid 7, for the species common to both releases. The number of inparalogs has generally not changed much, with some exceptions that are highlighted in color (orange for B. malayi, green for T. cruzi and blue for B. floridae).
Figure 4.
Distributions of the relative number of inparalogs (yellow), inparalogs per sequence (black) and sequences (blue), comparing InParanoid 8 to InParanoid 7.

References

    1. Fitch W.M. Distinguishing homologous from analogous proteins. Syst. Zoolog. 1970;19:99–113.
    2. Sonnhammer E.L., Koonin E.V. Orthology, paralogy and proposed classification for paralog subtypes. Trends Genet. 2002;18:619–620.
    3. Sonnhammer E.L., Gabaldon T., Sousa da Silva A.W., Martin M., Robinson-Rechavi M., Boeckmann B., Thomas P.D., Dessimoz C. Big data and other challenges in the quest for orthologs. Bioinformatics. 2014;30:2993–2998.
    4. Remm M., Storm C.E., Sonnhammer E.L. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 2001;314:1041–1052.
    5. Hulsen T., Huynen M.A., de Vlieg J., Groenen P.M. Benchmarking ortholog identification methods using functional genomics data. Genome Biol. 2006;7:R31.