Genome Res. 2003 Oct;13(10):2229-35.
doi: 10.1101/gr.1589103.

Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution

Dmitri M Krylov et al. Genome Res. 2003 Oct.

Abstract

Lineage-specific gene loss, to a large extent, accounts for the differences in gene repertoires between genomes, particularly among eukaryotes. We derived a parsimonious scenario of gene losses for eukaryotic orthologous groups (KOGs) from seven complete eukaryotic genomes. The scenario involves substantial gene loss in fungi, nematodes, and insects. Based on this evolutionary scenario and estimates of the divergence times between major eukaryotic phyla, we introduce a numerical measure, the propensity for gene loss (PGL). We explore the connection among the propensity of a gene to be lost in evolution (PGL value), protein sequence divergence, the effect of gene knockout on fitness, the number of protein-protein interactions, and expression level for the genes in KOGs. Significant correlations between PGL and each of these variables were detected. Genes that have a lower propensity to be lost in eukaryotic evolution accumulate fewer substitutions in their protein sequences, tend to be essential for organism viability, tend to be highly expressed, and have many interaction partners. The association of PGL with gene dispensability and interactivity is much stronger than its association with sequence evolution rate. Thus, the propensity of a gene to be lost during evolution seems to be a direct reflection of its biological importance.

Figures

Figure 1
The phylogeny of eukaryotes and PGL calculations. (A) Estimated divergence times, in millions of years ago (MYA), are shown for all internal nodes of the tree; the estimates are from Hedges et al. (2001). The number of lost genes according to the reconstructed parsimonious scenario is shown next to each branch. (B, C) Examples of PGL calculation. The presence and absence of a gene in each of the extant species are indicated by “+” and “-”, respectively. Red branches are those that retained the gene; blue branches are those to which a loss was mapped. (B) The loss of the gene in the branch leading to the common ancestor of yeasts and the microsporidian is shown by a blue dot because this branch formally has zero length.
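The red and blue branch bookkeeping described in panels B and C can be illustrated with a small sketch. The PGL definition is not given on this page, so the formula used here is an assumption for illustration only: the branch length on which a loss was mapped, divided by the total length of the branches on which the gene was either retained or lost. The function name `pgl` and the branch lengths (in millions of years) are hypothetical.

```python
def pgl(branches):
    """Assumed PGL ratio for one gene.

    branches: list of (length_myr, state) pairs, where state is
    "retained" (red branch) or "lost" (blue branch). Branches below
    a loss are simply not listed, since the gene is already absent.
    """
    lost = sum(length for length, state in branches if state == "lost")
    total = sum(length for length, state in branches
                if state in ("retained", "lost"))
    if total == 0:
        raise ValueError("no branch length available for this gene")
    return lost / total


# Hypothetical gene: retained on two branches, lost on one.
example = [(100, "retained"), (300, "lost"), (600, "retained")]
print(pgl(example))
```

Under this assumed ratio, a loss mapped to a zero-length branch adds nothing to the numerator; how such cases are actually treated is not stated here.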
Figure 2
Distribution of essential and nonessential yeast genes among PGL classes. Yeast proteins were binned into four classes according to the PGL values for the corresponding KOGs. The number of essential (E) and nonessential (N) genes in each class is indicated. If there were multiple yeast paralogs in a KOG, the KOG was counted as essential if at least one of the paralogs was essential.
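The counting rule in this caption can be sketched as follows. The class boundaries (`edges`), the KOG identifiers, and the function names are hypothetical; the caption states only that four PGL classes were used and that a KOG counts as essential if at least one yeast paralog is essential.

```python
from collections import Counter


def pgl_class(pgl, edges=(0.25, 0.5, 0.75)):
    """Return a class index 0..3; the three boundaries are placeholders."""
    return sum(pgl > edge for edge in edges)


def count_essentiality(kogs):
    """kogs maps a KOG id to (pgl, [is_essential for each yeast paralog]).

    Returns a Counter keyed by (class index, "E" or "N").
    """
    counts = Counter()
    for pgl, paralog_flags in kogs.values():
        # A KOG is essential if at least one of its paralogs is essential.
        label = "E" if any(paralog_flags) else "N"
        counts[(pgl_class(pgl), label)] += 1
    return counts


# Hypothetical input: four KOGs, two of them with two yeast paralogs.
kogs = {
    "KOG_A": (0.0, [True, False]),
    "KOG_B": (0.1, [False]),
    "KOG_C": (0.6, [False, False]),
    "KOG_D": (0.9, [True]),
}
for key, n in sorted(count_essentiality(kogs).items()):
    print(key, n)
```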
Figure 3
PGL and number of protein-protein interactions for yeast proteins. Yeast proteins were binned into four classes according to the PGL values for the corresponding KOGs. The average number of interactions was calculated for each class. For KOGs with multiple yeast paralogs, the sum of interactions for all paralogs was used, with the rationale that this is the natural integral measure of the interactivity of the proteins in the given KOG, under the assumption that all paralogs in a KOG have evolved via relatively recent, lineage-specific duplications.
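The aggregation described in this caption, summing interactions over the yeast paralogs of a KOG and then averaging per PGL class, can be sketched as follows. The class boundaries, KOG identifiers, and interaction counts are hypothetical; only the sum-then-average rule comes from the caption.

```python
from collections import defaultdict


def mean_interactions_by_class(kogs, edges=(0.25, 0.5, 0.75)):
    """kogs maps a KOG id to (pgl, [interaction count for each paralog]).

    Returns {class index: mean of per-KOG summed interactions}.
    """
    per_class = defaultdict(list)
    for pgl, per_paralog in kogs.values():
        cls = sum(pgl > edge for edge in edges)  # placeholder boundaries
        # Interactivity of a KOG is the sum over all of its yeast paralogs.
        per_class[cls].append(sum(per_paralog))
    return {cls: sum(vals) / len(vals)
            for cls, vals in sorted(per_class.items())}


# Hypothetical input: two KOGs in the lowest class, one in the highest.
kogs = {
    "KOG_A": (0.0, [12, 8]),
    "KOG_B": (0.1, [10]),
    "KOG_C": (0.9, [1, 2]),
}
print(mean_interactions_by_class(kogs))
```

Classes with no KOGs are omitted from the result rather than reported as zero, so an empty class is not mistaken for a class of non-interacting proteins.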
